(Kelompok 1) JHE Vol. 33 (January 2014)

Volume 33, January 2014 ISSN: 0167-6296
JOURNAL OF
HEALTH
ECONOMICS
Journal of Health Economics 33 (2014) 1–27
How does provider supply and regulation influence health care markets? Evidence from nurse practitioners and physician assistants☆
Kevin Stange ∗
Gerald R. Ford School of Public Policy, University of Michigan, 735 S. State Street #5236, Ann Arbor, MI 48109, United States
Article info

Article history:
Received 2 January 2013
Received in revised form 3 October 2013
Accepted 16 October 2013

JEL classification: I11, J44

Keywords: Health care workforce; Occupational licensing

Abstract

Nurse practitioners (NPs) and physician assistants (PAs) now outnumber family practice doctors in the United States and are the principal providers of primary care to many communities. Recent growth of these professions has occurred amidst considerable cross-state variation in their regulation, with some states permitting autonomous practice and others mandating extensive physician oversight. I find that expanded NP and PA supply has had minimal impact on the office-based healthcare market overall, but utilization has been modestly more responsive to supply increases in states permitting greater autonomy. Results suggest the importance of laws impacting the division of labor, not just its quantity.

© 2013 Elsevier B.V. All rights reserved.
1. Introduction

The Patient Protection and Affordable Care Act (ACA) of 2010 contains a number of provisions predicated on the belief that adequate availability of primary care providers is crucial if expanded insurance coverage is to translate into greater healthcare access. The ACA calls for a significant expansion of the National Health Service Corps (NHSC), more primary care residency positions, and increases in Medicare and Medicaid reimbursement for primary care services, among others.¹ These provisions of the ACA represent just the latest manifestation of public concern for the number, quality, and geographic distribution of healthcare providers in the United States. This concern stretches back more than a century, to when Flexner’s (1910) conclusion that the United States had an oversupply of poorly trained physicians resulted in a substantial contraction in the number of medical schools and new physicians at the start of the 20th century (Blumenthal, 2004). Subsequent policy attempts to influence the healthcare workforce have taken many forms, from funding for graduate medical education via Medicare to the establishment in 1972 of the NHSC and its recent expansions through the ACA and the American Recovery and Reinvestment Act of 2009.

A recent development in this old policy issue is the emergence of nurse practitioners (NPs) and physician assistants (PAs) as part of the solution.² Though around since the 1960s, only after experiencing rapid growth in the 1990s have these professions become sizable enough to provide a large-scale complement or alternative to physician care (Fig. 1). With more than 85,000 PAs and 150,000 NPs eligible to practice, their ranks now exceed the number of general and family practice MDs and are approaching the number of primary care physicians, estimated to be about 260,000.³ In many communities, physician assistants and nurse practitioners are already the principal providers of primary care and their ranks are projected to grow even further (Auerbach, 2012).

Fig. 1. Aggregate trends in health care providers, 1980–2008.
Source: Health Resources and Services Administration Area Resource File, National Sample Survey of Registered Nurses, American Academy of Physician Assistants.

Supply growth has occurred against the backdrop of considerable cross-state variation in what NPs and PAs are permitted to do, with some states permitting autonomous practice while others mandate extensive physician oversight and collaboration. In fact, one of the four key messages in a recent Institute of Medicine study was that “nurses should practice to the full extent of their education and training,” noting that a “variety of historical, regulatory, and policy barriers have limited nurses’ ability to generate widespread transformation” to the healthcare system (Institute of Medicine, 2011). Significant occupational restrictions thus may limit the extent to which expansions in the number of providers have translated into meaningful changes in healthcare outcomes. Though several states have broadened scope-of-practice laws and expanded prescriptive authority (innovations that should enable NPs and PAs to operate more independently from physicians), substantial restrictions on the substitutability of NP and PA care for physician care still remain in many states.

These workforce and regulatory changes have significantly altered how primary care is delivered in this country, but the consequences for health care markets have not yet been studied. Previous research on the effects of physician supply is mostly cross-sectional (limiting causal inference), has found mixed results, and may not inform the likely effects of NPs and PAs. To fill this gap, I exploit variation in NP and PA concentration and regulatory environment across areas and over time, made possible by a newly constructed panel dataset on the number of licensed NPs and PAs at the county level. I employ two complementary identification strategies to address the possible endogeneity of NP and PA supply. A county fixed effects approach exploits within-county variation in provider supply over time, while an instrumental variables approach exploits cross-sectional geographic variation in provider supply that is due to the historical location of educational infrastructure for training registered nurses and PAs.

My findings suggest that, on average, greater supply of NPs and PAs has had minimal impact on utilization, access, use of preventative health care services, or prices. However, primary care utilization is modestly more responsive to provider supply in states that grant NPs the greatest autonomy. I find no evidence that increases in provider supply decrease prices, even for the visits most likely to be affected by NPs and PAs: primary care visits in states with a favorable regulatory environment for NPs and PAs. My estimates are sufficiently precise to rule out fairly small changes in price and utilization. I can rule out increases in the likelihood of having at least one visit of 0.75 (1.12) percentage points associated with a large 50% increase in NP (PA) supply and an elasticity of 0.03 (0.08) on the intensive utilization margin. Results using the county fixed effects and 2SLS approaches are very similar. I also examine the direct effect of occupational regulation by exploiting changes in state-level prescribing laws over time. I find that expansions in prescriptive authority for NPs are associated with modest increases in utilization and expenditure, though no consistent pattern emerges for expansions to PA prescriptive authority. Neither change appears to consistently reduce visit prices.

This study is the first to quantify the effects of increased supply of non-physician clinicians on access, costs, and patterns of utilization for a broad population-based sample. Previous research has focused on very specific settings or populations or has not accounted for fixed differences between areas that may be correlated with regulations, provider supply and outcomes. Understanding the effects of one of the largest changes in the delivery of healthcare in the past few decades is a first-order question for health policy. This paper also represents one of the first analyses of the consequences of occupational regulation on output markets. How changes in occupational boundaries affect demand for and supply of services as well as prices and quality is not well understood. Findings about the impact of scope-of-practice regulations have implications for many other sectors, both within and outside of health care, that have seen a blurring of occupational boundaries and an increase in licensing. Dental hygienists, paralegals, and tax professionals now perform many duties historically performed by dentists, lawyers, and accountants. The occupational regulatory environment moderates these shifts in the division of labor, but has not been studied extensively.

The remainder of this paper proceeds as follows. The next section provides a brief background on NPs and PAs, summarizes related literature, and describes anticipated effects. Section 3 introduces the data, including the new dataset on county-level NP and PA supply and state-level regulations that was assembled for this project. Section 4 describes my empirical strategy. Results are presented in Sections 5 and 6 and Section 7 concludes.

☆ The dataset used in this paper was constructed in collaboration with Dr. Deborah Sampson of the Boston College School of Nursing. Helpful feedback was also provided by seminar participants at the RWJ Health Policy Scholars 2009 and 2010 Annual Meetings, the University of Michigan (Ford School of Public Policy, School of Public Health, Economics Department), the Upjohn Institute, the 2011 Association for Public Policy Analysis and Management Annual Meeting, the University of Chicago, and the 2012 American Society of Health Economists meeting. I am grateful for the excellent research assistance provided by Morgen Miller in particular, and also by Phil Kurdunowicz, Jennifer Hefner, Sheng-Hsiu Huang, and Irine Sorser. Funding from the University of Michigan RWJ HSSP small Grant Program and the Rackham Spring/Summer Research Grant Program is gratefully acknowledged. Lastly, I thank Christal Ramos and David Ashner of the AAPA and numerous state Boards of Nursing, Medicine, Licensing, and Health for providing data and responding to many inquiries and questions.
∗ Tel.: +1 415 215 6015. E-mail address: kstange@umich.edu
¹ American Association of Medical Colleges (2010) summarizes the workforce provisions of the ACA.
² A recent report by the Kaiser Commission on Medicaid and the Uninsured (2011), for instance, highlights the potential of NPs and PAs to address the primary care physician shortage.
³ These figures come from the American Academy of Physician Assistants, American Academy of Nurse Practitioners, and the author’s analysis of the Area Resource File.
http://dx.doi.org/10.1016/j.jhealeco.2013.10.009

2. Background

2.1. Nurse practitioners and physician assistants: background and recent changes

Nurse practitioners (NPs) and physician assistants (PAs) are health care professionals who perform many of the same tasks as physicians. Both professions emerged in the 1960s as a way for individuals with existing healthcare expertise to provide higher-level care more autonomously to underserved areas. NPs are registered nurses (RNs) who have received advanced training that permits them to diagnose patients, order and interpret tests, write prescriptions, and provide treatment for both acute and chronic illnesses. NPs have typically completed a two-year nurse practitioner master’s program, passed a national exam, and are licensed by state boards of nursing. NPs practice in settings similar to physicians: doctors’ offices, hospitals, outpatient clinics, community clinics, or their own practice (in some states). Physician assistants can perform any duties delegated to them by physicians, though in practice the range of activities performed by PAs is very similar to that of NPs. PAs have typically graduated from a two-year PA program (usually housed in a medical school), passed a national exam, and are licensed by state boards of medicine.

The level of physician supervision or collaboration required of NPs and PAs and their permitted tasks (referred to as “scope-of-practice” laws) is determined by state law and thus varies tremendously by state. The West and New England regions are thought to be the most favorable to non-physician clinicians, but there is variation within regions and across the two professions (US Health Resources and Service Administration, 2004). Individual state licensing laws regulating health professions have also been changing in many states to permit NPs and PAs to practice more independently (Fairman, 2008). The ability to write prescriptions is one important feature that has changed dramatically over the past two decades, as depicted in Fig. 2. Currently NPs and PAs can prescribe at least some controlled drugs in almost all states, up from 5 and 11, respectively, as recently as 1989.

Care provided by nurse practitioners and physician assistants is reimbursed by insurers in two ways (US Department of Health and Human Services, 2011). Reimbursement can be made directly through these providers’ own National Provider Identifier (NPI), often at a fraction of the physician reimbursement rate. For instance, Medicare reimburses direct-billed services provided by NPs and PAs at 85% of the physician rate, as do many private insurers and many state Medicaid programs (Chapman et al., 2010). Alternatively, if NP or PA care is provided as part of an episode of care provided by a physician, the services can be reimbursed at 100% through the physician’s NPI, which is referred to as reimbursement for NP or PA care provided “incident-to” physician care. In this case, the physician must both be on-site when the service is performed and must treat the patient on the patient’s first visit, though different payers and states interpret these requirements quite differently. Thus organizations where the higher rate is sufficient to offset the cost of additional physician oversight time may find it desirable to bill most NP- and PA-provided care through the physician’s NPI number.⁴

Historically NPs and PAs have been more likely than physicians to provide care for the underserved and to locate in rural areas (Larson et al., 2003; Grumbach et al., 2003; Everett et al., 2009). For states with provider supply data available (described in a later section), I estimate that the number of NPs per primary care physician increased from 0.25 in 1996 to 0.49 in 2008 and the number of PAs per primary care MD increased from 0.13 to 0.29, though there is considerable cross- and within-state variability in these trends.

⁴ There is no dataset that is able to document what fraction of NP- and PA-provided care is billed under the NPI of an NP/PA, but Skillman et al. (2012) estimate that only 76% of NPs in 2010 had an NPI. This would be an upper bound of the fraction of visits that are billed through the NP’s NPI (similar estimates for PAs are not available).

2.2. Expected effects of non-physician supply and regulation

An expansion of non-physician clinicians could impact the health care market both through prices and utilization. On the price side, more NPs and PAs may lower prices indirectly by injecting more competition into the market for primary care services (regardless of provider type). Economic theory predicts that an increase in the supply of a key input to production (labor) should lower output prices if markets are competitive. As imperfect substitutes for physicians, more NPs and PAs could also lower output prices directly by enhancing labor productivity through a more extensive division of labor.⁵ The efficient division of labor is determined, in part, by coordination costs between workers (Becker and Murphy, 1992), which may be low if NPs and PAs work collaboratively with physicians. However, the primary care market is unlikely to be competitive, so increased provider supply could be associated with no changes or even increases in price if greater NP and PA supply makes consumers’ provider search more difficult (Pauly and Satterthwaite, 1981) or if it shifts bargaining power from insurance companies to health care providers.

Utilization may also respond to greater provider presence through several channels, though the combined effect is theoretically ambiguous. Greater supply may increase utilization for people who previously went without care because they were not able to find a primary care provider. However, additional non-physician providers may partially “crowd-out” physicians if physician supply responds to the increased competition. The net effect on provider availability is likely to be positive, though the magnitude will depend on the extent to which the NPs and PAs increase the number of primary care providers rather than merely substitute for physicians. NPs or PAs may also make more referrals to specialists and physicians may substitute to performing more specialized procedures, both of which would increase the utilization of (more costly) specialist care. However, greater use of primary care and NPs’ greater focus on prevention may also reduce the need for some health services, thus reducing utilization.

Since these providers have different training than the physicians they substitute for, the growth of NPs and PAs may also impact quality of care (either real or perceived). Evidence suggests that patients treated by NPs have outcomes similar to those treated by physicians, but some critics still voice concern about non-physicians’ ability to detect rare or severe illnesses.⁶ Even if physicians and non-physicians provide care of equal clinical quality, perceived quality differences between provider types could also lead to changes in utilization as the mix of providers is altered. Furthermore, expanded NP supply may also increase rates of immunization, screening, and routine checkups, as the nursing model stresses prevention and health behaviors.

Theoretical work on occupational regulation generally concludes that stricter regulation increases prices, but has ambiguous effects on utilization due to offsetting effects via supply (regulation restricts supply, reducing quantity) and demand (regulation assures quality and motivates human capital investment, increasing quantity) (Leland, 1979; Shaked and Sutton, 1981; Shapiro, 1986). While this theoretical work focused on the strictness of occupational entry requirements, it is reasonable to apply the result to task regulation as well. States that permit NPs and PAs to perform more tasks independently of physicians should experience lower prices, but ambiguous effects on utilization.

Thus a loosening of scope-of-practice laws for NPs and PAs is expected to reinforce expansions in provider supply. I expect larger effects of non-physician supply on utilization and prices in states that permit NPs and PAs to practice more autonomously, as this allows production to be closer to the possibilities frontier. A similar logic implies that the effect of supply will be largest for the tasks (or types of visits) for which NPs, PAs, and physicians are most substitutable.

⁵ Scheffler et al. (1996) estimate that 70–80% of the work done by primary care physicians could be done by nurse practitioners.
⁶ See Mundinger et al. (2000) and Lenz et al. (2004) for the results from one randomized trial and Horrocks et al. (2002) and Laurant et al. (2004) for broader reviews.
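The expectation in Section 2.2 that supply effects are larger where regulation is more permissive can be summarized in a stylized interaction regression. The notation below is an illustrative assumption rather than the paper's own estimating equation (the empirical strategy is described in Section 4): $y_{ict}$ is an outcome such as utilization or price for person $i$ in county $c$ and year $t$, $S_{ct}$ is (log) NP or PA supply per capita, $R_{s}$ is a time-invariant index of the regulatory environment in state $s$, $X_{ict}$ collects individual and area controls, and $\delta_c$ and $\tau_t$ are county and year fixed effects.

```latex
y_{ict} = \beta_1 S_{ct}
        + \beta_2 \left( S_{ct} \times R_{s} \right)
        + X_{ict}'\gamma + \delta_c + \tau_t + \varepsilon_{ict}
```

In this sketch the main effect of $R_{s}$ is absorbed by the county fixed effects, because counties are nested within states and the index does not vary over time. The prediction that looser scope-of-practice laws reinforce expansions in supply corresponds to $\beta_2$ having the same sign as $\beta_1$.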
Fig. 2. States where NPs and PAs can prescribe controlled substances, 1996–2008.
Source: Author’s tabulations from The Nurse Practitioner, Annual Legislative Update (various years) and Abridge State Regulation of Physician Assistant Practice, distributed by the American Academy of Physician Assistants.

2.3. Previous research on the effects of provider supply

Previous research has documented the aggregate growth of nurse practitioners and physician assistants and discussed the importance for primary care delivery, but has not quantified the consequences.⁷ Previous analysis on provider supply has been cross-sectional and focused on physicians, finding fairly mixed evidence of effects on utilization, prices, and expenditure. Some have found that more primary care physicians are associated with more primary care visits, less emergency department use, greater use of preventive health measures, fewer hospitalizations for ambulatory-sensitive conditions (suggesting better primary care access) and lower mortality (Chang et al., 2011; Laditka et al., 2005; Guttman et al., 2010; Continelli et al., 2010). Other studies find no such relationship between provider concentration and self-reported measures of access (Grumbach et al., 1997). These cross-sectional studies might be subject to omitted variable bias (of unknown direction) if provider supply is correlated with unobserved demand factors.

On the expenditure side, Chang et al. (2011) find no consistent association between provider supply and Medicare spending, though Baicker and Chandra (2004a,b) do find that areas with more specialist rather than generalist MDs have higher health care expenditures. Chernew et al. (2009) find that a greater concentration of primary care physicians is not associated with lower spending growth, despite being correlated with lower spending levels at a point in time. An earlier line of research found a positive association between physician supply and prices, interpreting it either as evidence of physician-induced demand or diminished consumer information when physician supply increases (Pauly and Satterthwaite, 1981). No prior work provides direct estimates of the market-wide effects of NPs or PAs on healthcare markets.

⁷ See Cooper et al. (1998a,b), Hooker and McCaig (2001), Hooker and Berlin (2002), Druss et al. (2003), US GAO (2008), and Scheffler (2008) for descriptive work on the trends in NPs and PAs.

2.4. Previous research on occupational regulation

There is also relatively little research on the labor and output market effects of occupational restrictions.⁸ Extant research has focused on the consequence of stricter entry regulations for a single licensed profession, rarely looking at the effects of regulations delineating the division of labor between various licensed professions. I am not aware of any previous work that examines how occupational regulation moderates the growth of input supply to influence output markets.

Higher entry barriers have typically been associated with higher prices and lower quantity, though quality effects are mixed. For instance, Kleiner and Kudrle (2000) find that stricter licensing raises the price of dental services and earnings of dentists, but is not associated with better oral health. Schaumans and Verboven (2008) find that entry restrictions and regulated mark-ups for pharmacies result in a welfare loss for consumers by inflating prices and significantly reducing the number of pharmacies and physicians. Hotz and Xiao (2011) find that stricter child care regulations reduce supply of child care (particularly in low-income markets), but also increase the quality of services provided (particularly in higher income markets). Thus child care regulation creates a tradeoff between higher quality care for high income families and restricted supply in low income markets.

Research on laws regulating which functions a licensed profession can do is sparse and only a handful of studies exploit variation in laws over time to address the potential omitted variable bias in cross-sectional approaches.⁹ Dueker et al. (2005) find that greater prescriptive authority for advanced practice nurses (APNs) is associated with lower earnings for APNs and physicians, but higher wages of physician assistants. The present study is most closely related to Kleiner and Won Park (2010) and Kleiner et al. (2011), which examine the labor market impact of scope-of-practice regulations for dental hygienists and NPs, respectively. In the former, the authors find that laws that permit hygienists to operate independently of dentists increase hygienists’ wages and result in lower wages and employment growth for dentists. The latter study finds that wages of NPs increased and the price of well-child visits decreased when NPs were permitted to do more tasks in the mid-2000s.

These studies suggest that the growth of NP and PA independence should have both labor and output market consequences, but no previous study has explored the output market side extensively. Nor has the role of occupational regulations in moderating the effects of input supply been examined.¹⁰ Relative to Kleiner et al. (2011), the present study examines utilization and access, focuses on the interactive effects of provider supply and regulation, and looks at a much broader set of health care outcomes.

⁸ For an overview of the theoretical and empirical literature, see Kleiner (2000, 2006).

3. Data

3.1. New data on health care providers and regulations

A huge barrier to research on NPs and PAs has been a lack of data on the number of these providers at the sub-national level

obstetrician/gynecologist physicians as “primary care” and include only these providers in our measures of physician supply.

Occupational regulations in each state are characterized in two ways. First, I quantify the overall practice environment for NPs and PAs in the state at a single point in time (2000) using an index constructed by the Health Resources and Services Administration (HRSA, 2004). This index ranks states separately for NPs and PAs along three dimensions: (1) legal standing and requirement for physician oversight/collaboration on diagnosis and treatment; (2) prescriptive authority (type of drugs, requirements for MD oversight); (3) reimbursement policies (e.g. Medicaid reimbursement rates and requirements for private insurers). These three dimensions are then combined into a single index with a possible range from zero to one.¹² This time-invariant index is used to assess whether outcomes are more responsive to NP and PA supply in areas with more favorable environments for these providers. To be able to assess the direct effect of regulation on outcomes (rather than the indirect effect via supply responsiveness), as a second measure I also constructed indicators for whether NPs and PAs are permitted to write prescriptions for any controlled substances in a given state and year. Prescriptive authority is one aspect of provider regulation that has seen considerable change over the past two decades and is also cleanly measured across years in my sample period.

3.2. Outcomes

I study the health care experience of participants in thirteen waves of the Household Component of the Medical Expenditure Panel Survey (MEPS) from 1996 to 2008. The MEPS is a 2 year panel of households drawn from the National Health Interview Survey, which I treat as a repeated cross-section in each year. Characteristics of respondents’ county and state were merged onto the MEPS files using individuals’ state and county FIPS codes.¹³ Since historical data on NP and PA supply could only be constructed for some states and for some years, the final dataset has 293,100 person-year observations (compared to 404,400 for all state-years), though the analysis sample is slightly smaller due to missing values for some key covariates.

Basic characteristics of the sample are presented in Table 1, with more detailed summary statistics included in Appendix B. On average, individuals in the sample live in counties with 90 primary care physicians, 30 NPs, and 17 PAs per 100,000 population.
over time. To fill this gap, in collaboration with Deborah Sampson Seventy-nine (seventy-three) percent live in states that permit NPs
from Boston College School of Nursing, I assembled a new dataset (PAs) to write prescriptions for controlled substances. On average
containing the number of licensed nurse practitioners and physi- they make 2.8 office-based health care visits per year, with 62%
cian assistants at the county level annually for the years 1990–2008 having at least one. Approximately half of these visits are for pri-
using individual licensing records obtained from relevant state mary care whose total expenditure (from all payers) is $153 per
agencies. The years for which data is available varies across states, year (in 2010 dollars, including those with zero expenditure).14
so our county panel is unbalanced: NP and PA supply data is avail- Similar to the national trend, the NP and PA to population ratios
able for 23 states covering 52% of the U.S. population in 1996, but for my sample more than doubled from 1996 to 2008. Despite
increases to 35 states covering 80% in 2008.11 Data on the number these extreme changes in the health care workforce, there has been
of primary care physicians was obtained from the Area Resource
File. Throughout I refer to all general practice, family practice,
generalist pediatric, general internal medicine, and general
9 Using cross-sectional approaches, White (1978), Adams et al. (2003), and Sass and Nichols (1996) find mixed effects of scope-of-practice laws on wages and utilization of several different health professions.
10 Several studies suggest that a favorable practice environment is correlated with the supply of non-physician clinicians, including NPs and PAs. See Sekscenski et al. (1994), Cooper et al. (1998a), and US DHHS (2004) for cross-sectional evidence. In a longitudinal study, Kalist and Spurr (2004) found that more favorable state laws did encourage more people to enter advanced practice nursing in the 1990s.
11 Appendix A describes the data collection in more detail.
12 The weights for the three components differ slightly by provider type. NP index = 35% legal, 35% reimbursement, 30% prescribing. PA index = 35% legal, 25% reimbursement, 40% prescribing. The total indices range from 0.43 (South Carolina) to 0.94 (New Mexico) for NPs and 0.37 (Ohio) to 0.94 (North Carolina) for PAs. Appendix A describes these indices in more detail.
13 Geographic identifiers are not part of the public-use MEPS files, so this merging was performed by AHRQ and all subsequent analysis was conducted under security protocols at the Michigan Census Research Data Center. Reported sample sizes are rounded to the nearest hundred to conform to Census Research Data Center confidentiality protocols.
14 All dollar variables have been adjusted for inflation using the CPI-U and are in 2010 dollars.
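The component weights reported in footnote 12 can be illustrated with a short sketch. It assumes, for illustration only, that each of the three components is already scored on a zero-to-one scale; the component scores in `example` are hypothetical and are not taken from the paper or from the underlying US DHHS index.

```python
# Weights are those stated in footnote 12; everything else is illustrative.
WEIGHTS = {
    "NP": {"legal": 0.35, "reimbursement": 0.35, "prescribing": 0.30},
    "PA": {"legal": 0.35, "reimbursement": 0.25, "prescribing": 0.40},
}


def practice_index(provider, scores):
    """Weighted average of three component scores (assumed to lie in [0, 1])."""
    weights = WEIGHTS[provider]
    for name in weights:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"{name} score must lie in [0, 1]")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical component scores for one state (not real data).
example = {"legal": 0.8, "reimbursement": 0.6, "prescribing": 1.0}
np_index = practice_index("NP", example)
pa_index = practice_index("PA", example)
print(round(np_index, 2), round(pa_index, 2))
```

Because the weights differ by provider type, the same component scores give different NP and PA index values; here the PA index is higher because prescribing carries more weight for PAs.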
Table 1
Summary statistics.
Analysis sample Full dataset

n = 293,100 n = 404,400
Mean SD Mean SD
Panel A. Person sample

Provider supply and regulation
MD per population (×100,000) 89.619 44.211 88.879 46.38
NP per population (×100,000) 30.166 20.629 N/A
PA per population (×100,000) 17.082 10.979 N/A
NP practice index (2000) 0.744 0.134 0.738 0.129
PA practice index (2000) 0.772 0.107 0.737 0.141
NPs can prescribe controlled substances in state × year 0.790 0.408 0.724 0.447
PAs can prescribe controlled substances in state × year 0.733 0.442 0.674 0.469
Health care utilization and expenditure
Office-based visits > 0 0.623 0.485 0.631 0.483
Number of office-based visits 2.781 5.517 2.831 5.479
Number of primary care office-based visits 1.456 2.702 1.47 2.674
Number of non-primary care office-based visits 1.325 3.373 1.362 3.376
Total amount paid for primary care office-based visits 152.84 365.66 151.34 356.82
Have usual source of care 0.780 0.414 0.788 0.409
Flu shot in last 12 months 0.272 0.445 0.276 0.447
Blood pressure check in last 12 months 0.773 0.419 0.782 0.413
Pap smear in last 12 months 0.569 0.495 0.568 0.495
Breast exam in last 12 months 0.615 0.487 0.617 0.486
Cholesterol check in last 12 months 0.526 0.499 0.523 0.499

n = 803,200 n = 1,114,900
Mean SD Mean SD
Panel B. Office-based visits sample

No condition associated with visit 0.20 0.40 0.20 0.40
Predicted likelihood that visit is primary care 0.51 0.10 0.51 0.10
Visit category = checkup or well-child 0.29 0.46 0.29 0.45
Visit category = diagnosis/treatment 0.54 0.50 0.54 0.50
Visit category = emergency 0.01 0.09 0.01 0.09
Visit category = followup 0.12 0.33 0.12 0.33
Visit category = shots 0.04 0.20 0.04 0.20
Total amount paid for visit (all sources) 118.76 152.44 116.79 151.52
Total charges for visit 218.84 356.19 213.16 351.57
Note: Sample sizes rounded to nearest 100. Person analysis sample includes all individuals living in counties for which physician, nurse practitioner, and physician assistant
supply data is available in their survey year. Provider supply measures are calculated at the county level. Physician supply only includes non-federal office-based physicians
in family/general practice, general pediatrics, general internal medicine, and general ob/gyn. The office-based visits sample includes only visits for which provider was doctor,
registered nurse or nurse practitioner, or physician assistant. Also excludes visits categorized as mental health, maternity, eye exam, laser eye surgery, and other. Analysis
sample further restricted to visits by individuals living in counties for which physician, nurse practitioner, and physician assistant supply data is available in their survey
year.
* p < 0.1. ** p < 0.05. *** p < 0.01.
surprisingly little change in most measures of office-based health care utilization during this time period.
To assess the price impact of workforce and regulatory changes, I pool the MEPS office-based medical provider visits files from 1996 to 2008. The visits files contain a separate observation for each visit, call, or interaction with office-based health care providers by MEPS participants during the survey period. I restrict the sample to visits to physicians, nurses or nurse practitioners, or physician assistants and also exclude visits categorized as for mental health, maternity, eye exam, laser eye surgery, or other reasons.15 After these restrictions, my analysis sample includes 803,200 visits by individuals living in counties for which physician, nurse practitioner, and physician assistant supply data is available in the survey year.
One limitation of the MEPS visits data is that visits to NPs and PAs are often classified as physician visits due to question framing and misreporting, and physician specialty is only available since 2002.16 Therefore I do not look extensively at visit provider type as reported in MEPS, and instead focus on market-level effects across all provider types. In order to identify visits that are potentially most affected by NP and PA supply growth and regulation, for each visit I construct a measure of the predicted likelihood that the visit would be to a primary care provider, given observed individual and visit-level characteristics. The most common high-likelihood visits include flu shots, no-condition checkups, and visits for a sore throat.17
Twenty percent of visits are unrelated to a specific medical condition and relatively few include any specific treatment or service. The vast majority of visits are categorized as general check-ups, well-child exams, or for the diagnosis/treatment of a specific condition. I estimate that the average visit has a 51% likelihood of being to

15 Maternity and mental health visits are excluded, as these are rarely provided by NPs and PAs in the MEPS data.
16 As described by Morgan et al. (2007), the MEPS underreports care by NPs and PAs because only visits during which a physician is not ever seen are further categorized by type of non-physician (NPs or PAs) and because respondents tend to over-report the presence of physicians at visits.
17 Appendix A describes this procedure in more detail.
a primary care provider, which implies that, in expectation, slightly more than half of office-based visits are for primary care. I examine two measures of visit price: total charges and total amount paid by all sources. Since amount paid is largely dictated by reimbursement rates set by Medicare and other insurers, it may not take competitive pressures or provider availability into account, limiting observed price responsiveness; charges may be more responsive to market forces. On the other hand, charges are an imperfect measure of resource-allocating price since they are not fully paid.18 The average amount paid for a visit was $119 across all years, with visit charges approximately $100 more.

4. Empirical approach

4.1. Fixed effects specification

To estimate the causal effect of NP and PA supply and its interaction with the regulatory environment on health care access, utilization, and expenditure, I estimate the following regression model using OLS:

$$y_{ijt} = \beta_1 NP_{jt} + \beta_2 PA_{jt} + \beta_x X_{ijt} + \beta_z Z_{jt} + \delta_j + \delta_t + \varepsilon_{ijt} \qquad (1)$$

where $y_{ijt}$ is an outcome (number of visits, total expenditure, have usual source of care, etc.) for individual i in county j at time t.19 As measures of provider concentration ($NP_{jt}$ and $PA_{jt}$) I use the log of the number of NPs and PAs per 100,000 population in area j at time t.20 Fixed and time-varying factors at the individual level, such as income category, age, race, insurance type, and self-reported health status, are controlled for with $X_{ijt}$. To control for fixed unobserved determinants of outcomes across areas and over time that may be correlated with NP/PA concentration or practice indices, I also include county and year fixed effects $\delta_j$ and $\delta_t$. The vector $Z_{jt}$ controls for time-varying factors at the county level that may be correlated with both provider supply and outcomes. In this vector, most specifications control for the number of primary care physicians per capita, to account for the possible crowd-out of physicians by greater NP and PA presence. In practice, I find little evidence of crowd-out or crowd-in, so the results are insensitive to whether physician supply is included. My preferred specification also controls for state-specific linear time trends and a host of time-invariant county characteristics (measured at baseline) interacted with linear time trends. Some specifications also control for the predicted number of non-primary-care doctor visits made by an individual in the survey year. $\varepsilon_{ijt}$ is an error term that is assumed to be uncorrelated with all the right hand side variables.21
The key parameters of interest are $\beta_1$ and $\beta_2$, the change in outcome y associated with a one unit increase in NP or PA concentration, holding the included control variables constant. In order to quantify the effect of state regulation on the responsiveness of outcomes to supply, I let these parameters vary with the state practice index in state s in 2000. The coefficients on the index interaction terms represent differences in outcome response to increased provider supply between states that are fully supportive of NP and PA independent practice and those whose regulatory environment is completely restrictive.22
To estimate effects on the prices of basic health care services, I estimate a similar regression model using OLS:

$$y_{mijt} = \beta_1 NP_{jt} + \beta_2 PA_{jt} + \beta_Q Q_{mijt} + \beta_x X_{ijt} + \beta_z Z_{jt} + \delta_j + \delta_t + \varepsilon_{mijt} \qquad (2)$$

where $y_{mijt}$ is the log price of visit m made by individual i in county j at time t. In addition to the control variables used in (1), some specifications also include visit-specific characteristics, $Q_{mijt}$, such as indicators for specific treatments or services provided during the visit, fixed effects for conditions (if any) associated with the visit, or the predicted likelihood that a visit is primary care. $\varepsilon_{mijt}$ is an error term that is assumed to be uncorrelated with all the right hand side variables. In order to permit the price response to additional supply to vary between types of visits and with the state practice environment, I also interact provider supply with predicted likelihood of primary care, the state practice index, and with both simultaneously. If NPs and PAs have a greater (negative) price effect on visits that are the most substitutable for primary care physicians or in states permitting greater autonomy of NPs and PAs, then the coefficients on these interactions should be negative.
To examine the direct effect of occupational regulations on outcomes, rather than the indirect effect that operates through provider supply, I also estimate OLS regressions of the form:

$$y_{ijt} = \beta_1 NPLaw_{st} + \beta_2 PALaw_{st} + \beta_x X_{ijt} + \beta_z Z_{jt} + \delta_j + \delta_t + \varepsilon_{ijt} \qquad (3)$$

where most variables are defined as before, but now $NPLaw_{st}$ ($PALaw_{st}$) is an indicator variable that equals one if state s permits NPs (PAs) to prescribe any controlled substances in year t. Identification of the parameters of interest now comes from changes in laws within states over time. Since changes in laws may also correlate with provider supply growth, some specifications also include the log of the number of NPs and PAs per 100,000 population in area j at time t.
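As an illustration of the two-way fixed effects regression in Eq. (1), the following is a minimal sketch on simulated county-year data. It is not the estimation code used in the paper: the data, the true coefficients, and all variable names are hypothetical, the individual-level and county-level controls are omitted, and the fixed effects are handled with explicit dummy variables in NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
n_counties, n_years = 50, 13
county = np.repeat(np.arange(n_counties), n_years)
year = np.tile(np.arange(n_years), n_counties)

# Simulated county and year effects; the county effect is correlated with supply.
county_effect = rng.normal(size=n_counties)
year_effect = rng.normal(size=n_years)
log_np = 0.5 * county_effect[county] + rng.normal(size=county.size)
log_pa = -0.3 * county_effect[county] + rng.normal(size=county.size)

beta_true = np.array([0.2, -0.1])  # hypothetical beta_1 and beta_2
y = (beta_true[0] * log_np + beta_true[1] * log_pa
     + county_effect[county] + year_effect[year]
     + rng.normal(scale=0.1, size=county.size))


def one_hot(codes):
    """Dummy-variable matrix with one column per distinct code."""
    return (codes[:, None] == np.unique(codes)[None, :]).astype(float)


# County dummies absorb the intercept; one year dummy is dropped to avoid collinearity.
X_fe = np.column_stack([log_np, log_pa, one_hot(county), one_hot(year)[:, 1:]])
beta_fe = np.linalg.lstsq(X_fe, y, rcond=None)[0][:2]

# Pooled OLS without fixed effects, for comparison only.
X_pooled = np.column_stack([np.ones(county.size), log_np, log_pa])
beta_pooled = np.linalg.lstsq(X_pooled, y, rcond=None)[0][1:]

print("fixed effects:", beta_fe.round(3))
print("pooled OLS:   ", beta_pooled.round(3))
```

In this simulation the county effect is correlated with provider supply by construction, so the pooled estimates are contaminated by it while the fixed effects estimates are not. This mirrors the motivation for including $\delta_j$ and $\delta_t$ in the specification.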
4.2. Identification challenges with fixed effects specification

The first concern with the OLS approach described above is that changes in NP or PA supply or laws may be correlated with other determinants of health care outcomes, causing biased estimates of $\beta$. Table 2 identifies the observable factors that predict variation in provider supply across areas and over time. Cross-sectional variation in provider supply is much more highly correlated with observable county characteristics than provider growth. In fact, several of the strongest predictors of the level of provider density (number of MDs, HMO penetration, and industry mix) are not predictive of NP or PA supply growth. Nonetheless, provider growth

18 Since charges and amount paid for each visit obtained from the household survey of MEPS participants are often incomplete, MEPS also collects this information directly from participants’ medical providers, prioritizing information from providers whenever possible.
19 Medical expenditure and utilization is right skewed with a long right tail and a large mass at zero, which can cause simple OLS estimates to be misspecified and imprecise (Jones, 2000). I separately analyze the extensive and intensive margins of utilization and expenditure using OLS, taking the log of right-skewed outcomes and omitting persons with zero visits from the intensive margin analysis. Marginal effects and standard errors are nearly identical when binary outcomes are estimated using a logit model (instead of OLS) and results are qualitatively very similar if the intensive and extensive margins of utilization are estimated together using a Poisson or negative binomial count model.
20 This log specification imposes heterogeneity of treatment effect by treatment intensity. For instance, counties whose NP/pop ratio increases from 1 to 2 are assumed to have a larger change in the outcome than a county whose NP/pop ratio increases from 40 to 41.
21 To allow for the possibility of correlated errors among people sharing a similar regulatory regime, standard errors are clustered by state in all analysis. Clustering by county produces standard errors that are comparable.
22 An alternative interpretation of index interaction terms, which I cannot rule out, is response heterogeneity by baseline provider supply level, as areas with more favorable NP and PA practice environments also tend to have more of these providers (Sekscenski et al., 1994).
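The point made in footnote 20 about the log specification can be checked with a few lines of arithmetic. The sketch below simply compares the change in the log of the provider-to-population ratio for the two cases mentioned in the footnote; the function name is illustrative.

```python
import math


def log_change(before, after):
    """Change in the log of a provider-to-population ratio."""
    return math.log(after) - math.log(before)


low_base = log_change(1, 2)     # ratio rises from 1 to 2 per 100,000
high_base = log_change(40, 41)  # ratio rises from 40 to 41 per 100,000
ratio = low_base / high_base

print(round(low_base, 3), round(high_base, 4), round(ratio, 1))
```

With a common coefficient on log supply, the same one-provider increase is therefore assumed to move the outcome roughly 28 times as much in the low-density county as in the high-density county.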
Table 2
Time-invariant county characteristics that correlate with provider density and growth.
(1) log(NP per population) (2) log(PA per population)
Main effect Interaction with time Main effect Interaction with time
log(MDs per population) (in 1995) 0.310*** 0.003 0.309*** 0.000

(0.057) (0.005) (0.054) (0.005)
HMO penetration (1998) 0.725*** −0.001 −0.293* 0.018
(0.150) (0.013) (0.150) (0.014)
Log of population density (1992) −0.046** 0.003* −0.023 0.007***
(0.020) (0.002) (0.018) (0.002)
% persons in poverty (1989) 1.290* −0.198*** −1.476** 0.003
(0.751) (0.063) (0.725) (0.079)
Median household income ($1000, 1990) 0.018*** −0.001*** −0.004 −0.000
(0.005) (0.000) (0.006) (0.000)
Infant mortality rate (1988) 0.004 −0.003** 0.012 −0.000
(0.013) (0.001) (0.012) (0.001)
% workforce in health (1990) 4.431*** 0.083 3.999*** −0.039
(1.139) (0.099) (1.082) (0.114)
% workforce in manufacturing (1990) −1.553*** −0.019 −0.948*** 0.003
(0.325) (0.028) (0.304) (0.026)
Unemployment rate (1990) −0.019* −0.004*** 0.022* 0.000
(0.011) (0.001) (0.011) (0.001)
% White (1990) 1.141*** −0.045* 0.438* 0.055***
(0.264) (0.024) (0.258) (0.020)
% Hispanic (1990) −0.856*** 0.014 0.008 0.050**
(0.165) (0.015) (0.205) (0.022)
% high school or greater 0.375 −0.177*** 0.998** 0.032
(0.387) (0.034) (0.416) (0.043)
PAs can prescribe controlled drugs in state (in 1995) −0.228*** 0.021*** −0.029 0.016***
(0.047) (0.004) (0.049) (0.004)
NPs can prescribe controlled drugs in state (in 1995) 0.132*** −0.001 0.319*** −0.012***
(0.043) (0.003) (0.047) (0.004)
Constant 0.201 0.296*** 0.310 −0.049
(0.624) (0.055) (0.534) (0.044)
F-test for coefficients on above variables = 0 (excluding constant and linear time trend) 50.84 10.17 22.56 7.01
Observations 17,235 17,235
R-squared 0.537 0.418
Note: Robust standard errors in parentheses, clustered by county. Time is normalized to zero in 2002 so main effects can be interpreted as the average in the mid-point of
the sample period. Provider density ratios are per 100,000 population. Sample includes 1695 counties for 13 years (1996–2008), but data is not available for all years so the
panel is unbalanced. Observations are weighted by county population in all specifications.
*
Significance at the p < 10% level.
**
***
did occur differentially between areas with different demographics and economic circumstances. For this reason, the preferred specification includes separate linear trends for each state and linear trends that vary with these time-invariant county characteristics. These linear trends eliminate bias resulting from areas with, for example, high poverty rates having lower utilization growth and lower NP growth. It should be noted that time-invariant area characteristics (such as the high concentration of NPs and PAs in rural areas, which may have low prices) are not a source of bias when county fixed effects are included in the model, though this basic source of bias is present in most of the previous work discussed earlier. More problematic for my approach is if changes in provider supply or laws are correlated with unobservable, time-varying factors, such as latent health care demand. The fixed effects model addresses this source of bias insofar as the presence of observed medical conditions (which is controlled for) is associated with increased demand, but I am not able to rule out the contribution of changes in unobserved demand factors.
Legislative endogeneity is a specific form of time-varying omitted variable bias when looking at the direct effect of practice environment via Eq. (3). Fixed effects will absorb any fixed differences between states that correlate with the timing of adoption, such as the tendency of states with high demand for health care to pass prescribing laws earlier in the sample period. However, if prescribing laws are changed as a consequence of time-varying factors within states, such as increased political power of nurse groups or atypical health care needs, then the model may be picking up the consequences of these trending factors rather than changes in the laws themselves. I cannot rule this possibility out, but I will note that many of these legislative changes are the end result of a long series of political battles fought between groups over many years or decades (Iglehart, 2013) and so the precise timing of passage is likely driven by idiosyncratic political factors rather than specific health care needs.
Measurement error in provider supply is a second concern. Some research suggests that county may not be the best geographic level to measure the number of health care providers (Rosenthal et al., 2005). Classical measurement error will attenuate estimates toward zero. The main results are robust to using workforce supply and fixed effects at the Health Service Area level (an aggregation of counties in the same state) rather than county. Unfortunately the MEPS does not contain geographic information below the county level, so I am unable to explore more localized measures of provider availability. Error may also be present in the measure of physician density, as reporting errors and delays are common in the AMA Masterfile, the source of physician data in the Area Resource File (Staiger et al., 2009; Freed et al., 2006). Thus changes in physician density may not be adequately controlled for in the analysis.
Finally, specifications that interact provider supply with practice indices assume that these indices are uncorrelated with other determinants of the responsiveness of demand to provider supply. This assumption would be violated if states with the most
pent-up demand (which are likely to be highly responsive to provider supply) are more likely to grant autonomy to NPs and PAs. I am unable to test for pent-up demand, but this source of bias would cause me to overstate the effect of NP and PA autonomy on responsiveness to supply.

4.3. Instrumental variables specification

To address some of these concerns, I also exploit cross-sectional variation in $NP_{jt}$ and $PA_{jt}$ induced by proximity to historically relevant training infrastructure in a two stage least squares (2SLS) framework. I instrument for $NP_{jt}$ and $PA_{jt}$ using the number of bachelor’s RN programs in the county in 1963 and the number of PA programs in the county in 1975 per 100,000 current population as excluded instruments. Two conditions must hold for 2SLS to provide consistent estimates of $\beta_1$ and $\beta_2$. First, the excluded instruments must affect provider supply (the “relevance” condition), which is testable and discussed later. Second, provider supply must be the only channel through which the instruments affect (or are correlated with) the outcomes (the “exclusion” assumption). While not testable, I argue that this assumption is plausible in this setting. A bachelor’s RN degree is a prerequisite for NP training, though most RN training programs only granted diplomas in the early 1960s and subsequent demand for nurses was primarily met through associate degree programs. While the demand for healthcare may be correlated with the presence of any RN training program, there is little reason to believe that it should be correlated with the specific type of RN training program, given that graduates of all programs take the same licensure test and the same jobs upon graduation. The PA instrument is analogous to comparing counties that were the earliest to train PAs with other counties, since the first wave of PA programs was nationally certified in the early 1970s.23 The 2SLS specification exploits a completely different source of variation in provider supply than the fixed effects specification and also possibly eliminates attenuation bias caused by provider supply measurement error.

5. Impact of provider supply and interaction with regulation

5.1. Utilization

Table 3 examines the relationship between provider supply and utilization. I first examine total provider supply, before distinguishing by provider type. Though total provider supply is weakly positively correlated with the likelihood of having any office-based visits, this correlation is diminished somewhat (and loses statistical significance) once individual characteristics, fixed county characteristics, and linear time trends are controlled for in column (2). Separating physicians from non-physician providers (column 3) and separating all three types of providers (column 4) gives a qualitatively similar result: provider supply has minimal relationship with office-based health care utilization, either on the extensive or intensive margin. The point estimates from the preferred specification (4) suggest that a 10% increase in the NP to population ratio is associated with a 0.03 percentage point decrease in the fraction of individuals having at least one office-based provider visit. The precision of the estimates permits me to rule out positive effects greater than 0.75 percentage points associated with even a large 50% increase in NP density (i.e. moving from the sample average of 62.30 to 63.05%) or effects larger than 1.12 percentage points for a similar increase in PA density. On the intensive margin, the point estimates imply an elasticity of office-based visits with respect to provider supply of 0.001 for NPs and 0.03 for PAs, with the confidence interval ruling out elasticities greater than 0.032 and 0.076, respectively.24
Columns (5) and (6) examine the robustness to different sets of controls. Column (5) only includes individual person-level controls and county fixed effects. Results are similar in magnitude, precision, and fit to the model with extensive time trends, suggesting my null results are not driven by over-controlling for temporal variation.25 Column (6) does not control for physician supply. One channel through which changes to NP and PA supply could operate is through a displacement effect on physician supply. In fact, the point estimates on NP and PA supply are nearly identical with or without controlling for physician supply, so physician displacement does not appear to be important.
To address the possibility of omitted variable bias due to time-varying area characteristics and measurement error attenuation bias, column (7) presents two-stage least squares (2SLS) estimates that exploit cross-sectional variation in provider supply induced by proximity to the historical training infrastructure. Greater bachelor’s RN program density in the 1960s is associated with greater NP supply (but not PA supply) today, while the opposite is true for the density of PA schools in 1975.26 It is reassuring that cross-instrument effects are minimal (e.g. bachelor’s RN program density does not correlate with PA supply), which would not be the case if latent healthcare demand was correlated with both RN and PA school location and provider supply. For both measures of utilization, the 2SLS point estimates for NP supply are larger (and more positive) and less precise than the fixed effects estimates, though they are never significantly different from zero or from the fixed effects estimates. For PA supply, the 2SLS point estimates are negative and never significantly different from zero or from the fixed effects estimates. The point estimates imply that a 10% increase in the NP to population ratio is associated with a 0.4 percentage point increase in the fraction of individuals having at least one office-based provider visit (95% CI: −0.08 to +0.89 percentage points) and an intensive-margin elasticity of 0.051 (−0.061 to +0.163).
Critics of current state practice laws argue that a restrictive practice environment limits the ability of non-physician providers to practice to the fullest extent of their training, limiting the substitutability between physician and non-physician care. Tables 4 and 5 address this issue by examining the importance of regulatory environment to the relationship between utilization and provider supply. Positive point estimates on the supply-index interactions

23 The first program was started at Duke University in North Carolina in the 1960s as a means to integrate returning Navy corpsmen with medical experience into the civilian healthcare system.
24 Marginal effects implied by Poisson and Negative Binomial count models are similar, as are the elasticities implied by OLS estimates from a model using total number of visits (including zeros) as the outcome variable. These results are presented in Table B5 in Appendix B.
25 Additional specifications that introduce different sets of controls progressively for different ways of aggregating provider supply are presented in Table B6 in Appendix B. Results from these specifications are qualitatively similar, suggesting that the null finding is not driven by collinearity between provider supply and the extensive controls.
26 Table B7 in Appendix B presents the first stage relationships between provider supply and these instruments. The F-statistics on the excluded instruments are 8.8 and 11.5 for NP and PA supply, respectively. As the NP F-statistic is not above the rule-of-thumb value of 10, some caution is warranted. Controlling for physician supply weakens the relationship further and reduces the F-statistics such that 2SLS estimates may be biased due to weak instruments. However, physician supply may control for unobserved determinants of demand that happen to correlate with training infrastructure (making the exclusion assumption more plausible). Given that the fixed effect analysis showed no relationship between physician supply and outcomes, the preferred specification omits physician supply. Results are qualitatively similar when controls for contemporaneous physician supply and the historical presence of other types of RN programs are included (presented in Appendix B, Table B8).
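The logic of the 2SLS approach in Section 4.3 can be sketched on simulated data. This is a hedged illustration rather than the paper's estimation: the instruments, the unobserved demand term, and the coefficient values are all hypothetical, and controls and clustered standard errors are omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical instruments (stand-ins for historical RN and PA program density).
z = rng.normal(size=(n, 2))
u = rng.normal(size=n)  # unobserved demand, correlated with supply

# Each supply measure responds to its own instrument and to unobserved demand.
log_np = 0.8 * z[:, 0] + 0.5 * u + rng.normal(size=n)
log_pa = 0.8 * z[:, 1] - 0.5 * u + rng.normal(size=n)

beta_true = np.array([0.2, -0.1])
y = beta_true[0] * log_np + beta_true[1] * log_pa + u + rng.normal(size=n)


def ols(X, target):
    """Least-squares coefficients of target on the columns of X."""
    return np.linalg.lstsq(X, target, rcond=None)[0]


const = np.ones((n, 1))
X = np.column_stack([const, log_np, log_pa])
Z = np.column_stack([const, z])

# First stage: project the supply measures on the instruments.
X_hat = Z @ ols(Z, X)
# Second stage: regress the outcome on the fitted supply measures.
beta_2sls = ols(X_hat, y)[1:]
beta_ols = ols(X, y)[1:]

print("OLS: ", beta_ols.round(3))
print("2SLS:", beta_2sls.round(3))
```

In the simulation, OLS is biased because the unobserved demand term enters both supply and the outcome, while 2SLS recovers coefficients close to the true values, though with less precision. This is the same trade-off noted in the text when comparing column (7) with the fixed effects estimates.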
Table 3
OLS estimates of provider density on number of office-based visits.
No controls Full controls Individual controls, county FE Full controls 2SLS
(1) (2) (3) (4) (5) (6) (7)
Panel A. Dependent variable: have at least one office-based visit during year
log(MD + NP + PA per population) 0.029** 0.025
(0.013) (0.020)
log(NP + PA per population) 0.006
(0.011)
log(NP per population) −0.003 0.003 −0.004 0.041
(0.011) (0.012) (0.011) (0.025)
log(PA per population) 0.008 0.003 0.008 −0.009
(0.010) (0.011) (0.010) (0.030)
log(MD per population) 0.016 0.013 0.0002
(0.019) (0.020) (0.022)
N (rounded) 292,500 290,900 289,200 281,500 282,800 281,500 281,500
Adjusted R-squared 0.001 0.159 0.159 0.160 0.159 0.160 0.155
Panel B. Dept variable: log(number of office-based visits in year)
log(MD + NP + PA per population) 0.001 −0.027
(0.014) (0.026)
log(NP + PA per population) 0.022
(0.019)
log(NP per population) 0.001 −0.001 0.002 0.051
(0.016) (0.019) (0.016) (0.057)
log(PA per population) 0.031 0.026 0.029 −0.040
(0.023) (0.023) (0.023) (0.076)
log(MD per population) −0.023 −0.035** −0.027*
(0.016) (0.017) (0.014)
N (rounded) 182,200 181,100 180,100 175,100 176,000 175,100 175,100
Adjusted R-squared 0.0004 0.195 0.195 0.195 0.195 0.195 0.190
Note: Robust standard errors clustered by state in parentheses. Specification (1) includes year fixed effects only. Individual controls include male, age, age squared, dummies
for race/ethnicity, dummies for four income categories, dummies for public, private, or no insurance, and dummies for three self-reported health categories. Specifications
(2)-(4) and (6) also include state × time linear trends and time-invariant county characteristics interacted with linear time trends. Time-invariant county characteristics that
are interacted with time (linearly) include the fraction of persons in poverty (1989), infant mortality rate (1988), fraction of workforce in health (1990), fraction of workforce
in manufacturing (1990), unemployment rate (1990), fraction white (1990), fraction with high school education (1990), fraction Hispanic (1990), population density (1992),
and HMO penetration rate (1998). Excluded instruments in 2SLS estimates are the number of BA RN programs in 1963 in county per population and the number of PA
programs in 1975 in county per population. See text for further explanation.
* p < 0.1.
** p < 0.05.
*** p < 0.01.
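The bounds quoted in the text (0.75 and 1.12 percentage points on the extensive margin, elasticities of 0.03 and 0.08 on the intensive margin) can be approximately recovered from Table 3. The sketch below is an illustrative reconstruction rather than the author's code. It assumes that each bound is the upper limit of a 95% normal confidence interval, that the relevant estimates are the first NP and PA coefficients reported in each panel, and that a 50% increase in supply corresponds to a log change of ln(1.5). The function names are hypothetical.

```python
import math

Z_95 = 1.96  # two-sided 95% critical value of the standard normal (assumed)


def upper_bound(coef, se, z=Z_95):
    """Upper limit of the confidence interval for a coefficient."""
    return coef + z * se


def max_effect_pp(coef, se, pct_increase):
    """Largest effect, in percentage points, that the interval does not rule out
    for a given percentage increase in supply when the regressor is in logs."""
    return 100 * upper_bound(coef, se) * math.log(1 + pct_increase / 100)


# Table 3, Panel A (at least one office-based visit): coefficient, standard error
np_ext = max_effect_pp(-0.003, 0.011, 50)
pa_ext = max_effect_pp(0.008, 0.010, 50)

# Table 3, Panel B (log number of visits): the upper bound is itself an elasticity
np_int = upper_bound(0.001, 0.016)
pa_int = upper_bound(0.031, 0.023)

print(f"NP: {np_ext:.2f} pp on the extensive margin, elasticity {np_int:.2f}")
print(f"PA: {pa_ext:.2f} pp on the extensive margin, elasticity {pa_int:.2f}")
```

Under these assumptions the computed values round to the figures quoted in the text, which suggests, but does not confirm, that this is how the bounds were obtained.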
Table 4
OLS Estimates of interaction between provider density and regulatory environment index on utilization.
Log(number of office-based visits)
At least one office-based All visits Primary care visits Non-primary care
visit during year
(1) (2) (3) (4) (5) (6)
log(NP per population) −0.069 −0.217** 0.001 −0.255*** −0.017 −0.037
(0.069) (0.100) (0.017) (0.075) (0.020) (0.103)
log(PA per population) −0.088* −0.093 0.020 −0.038 0.034* −0.058
(0.046) (0.128) (0.018) (0.117) (0.019) (0.133)
log(NP per population) × NP index 0.089 0.295** 0.348*** 0.026
(0.104) (0.143) (0.099) (0.140)
log(PA per population) × PA index 0.132* 0.170 0.079 0.125
(0.069) (0.167) (0.149) (0.171)
log(MD per population) 0.014 −0.034* −0.027 −0.027 −0.011 −0.009
(0.020) (0.018) (0.020) (0.019) (0.016) (0.017)
Controls Full Full Full Full Full Full
N (rounded) 281,500 175,100 160,700 160,700 114,500 114,500
Adjusted R-squared 0.160 0.195 0.115 0.115 0.196 0.196
F-test for provider supply coefficient = 0
when practice index = 1 (100%)
Nurse practitioners (p-value) 0.601 0.115 0.001 0.800
Physician assistants (p-value) 0.089 0.099 0.267 0.127
Note: Robust standard errors clustered by state in parentheses. All specifications include year fixed effects, individual controls, state × time linear trends, and time-invariant
county characteristics interacted with linear time trends. Individual controls include male, age, age squared, dummies for four income categories, dummies for race/ethnicity,
dummies for public, private, or no insurance, and dummies for three self-reported health categories. Time-invariant county characteristics that are interacted with time
(linearly) include the fraction of persons in poverty (1989), infant mortality rate (1988), fraction of workforce in health (1990), fraction of workforce in manufacturing (1990),
unemployment rate (1990), fraction white (1990), fraction with high school education (1990), fraction Hispanic (1990), population density (1992), and HMO penetration
rate (1998). Number of primary and non-primary care visits was estimated by predicting whether each individual office visit was to a primary care provider based on broad
visit category, the individual characteristics listed above, and the medical condition (if any) associated with the visit. See text for further explanation.
* p < 0.1.
** p < 0.05.
*** p < 0.01.
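In an interaction specification of this kind, the elasticity of the outcome with respect to provider supply at a given value of the practice index is the main coefficient plus the interaction coefficient times the index. The sketch below applies that arithmetic to the Table 4 point estimates. It is an illustration only: it assumes the index runs from 0 (least favorable) to 1 (most favorable), it ignores sampling error, and the function name is hypothetical.

```python
def implied_elasticity(main, interaction, index):
    """Elasticity with respect to provider supply at a given practice index."""
    return main + interaction * index


# Table 4 point estimates: (main effect, interaction with practice index)
estimates = {
    "NP, all visits (column 2)": (-0.217, 0.295),
    "PA, all visits (column 2)": (-0.093, 0.170),
    "NP, primary care visits (column 4)": (-0.255, 0.348),
}

for label, (main, interaction) in estimates.items():
    restrictive = implied_elasticity(main, interaction, 0.0)
    favorable = implied_elasticity(main, interaction, 1.0)
    print(f"{label}: index=0 -> {restrictive:+.3f}, index=1 -> {favorable:+.3f}")
```

For all visits, both the NP and PA values at an index of 1 round to 0.08, matching the elasticity the text reports for the most favorable environments.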
Table 5
OLS estimates of provider density and interaction with components of regulatory index on utilization.
Component-specific index
Overall index Reimbursement Legal Prescribing
Panel A. log(total number of visits)
log(NP per population) −0.217** −0.021 −0.119 −0.135***
(0.100) (0.077) (0.075) (0.044)
log(PA per population) −0.093 0.172** −0.028 −0.030
(0.128) (0.083) (0.065) (0.038)
log(NP per population) × NP index 0.295** 0.026 0.171 0.196**
(0.143) (0.095) (0.113) (0.075)
log(PA per population) × PA index 0.170 −0.163* 0.078 0.095
(0.167) (0.093) (0.086) (0.060)
NPxHigh P-val 0.115 0.853 0.243 0.142
PAxHigh P-val 0.099 0.717 0.143 0.049
Panel B. log(number of primary care visits)
log(NP per population) −0.219** −0.007 −0.132* −0.155***
(0.084) (0.054) (0.075) (0.033)
log(PA per population) −0.033 0.200*** 0.083** −0.047*
(0.084) (0.047) (0.033) (0.027)
log(NP per population) × NP index 0.291** 0.001 0.180 0.217***
(0.129) (0.073) (0.119) (0.058)
log(PA per population) × PA index 0.055 −0.220*** −0.102** 0.082*
(0.109) (0.053) (0.046) (0.041)
NPxHigh P-val 0.138 0.885 0.346 0.068
PAxHigh P-val 0.466 0.061 0.339 0.110
Note: All specifications include year fixed effects, log(MD per population), individual controls, county fixed effects, state × time linear trends, and time-invariant county
characteristics interacted with linear time trends. Primary care specifications also include log(number of non-primary care visits). Robust standard errors clustered by
state in parentheses. Individual controls include male, age, age squared, dummies for race/ethnicity, dummies for four income categories, dummies for public, private, or
no insurance, and dummies for three self-reported health categories. Time-invariant county characteristics that are interacted with time (linearly) include the fraction of
persons in poverty (1989), infant mortality rate (1988), fraction of workforce in health (1990), fraction of workforce in manufacturing (1990), unemployment rate (1990),
fraction white (1990), fraction with high school education (1990), fraction Hispanic (1990), population density (1992), and HMO penetration rate (1998). Number of primary
and non-primary care visits was estimated by predicting whether each individual office visit was to a primary care provider based on broad visit category, the individual
characteristics listed above, and the medical condition (if any) associated with the visit. See text for further explanation.
* p < 0.1.
** p < 0.05.
*** p < 0.01.
do suggest that utilization is more (positively) responsive to provider supply in more NP and PA-friendly states. However, at the extensive margin (column (1)), neither interaction is significant at the 5% level and I cannot reject that the extensive margin response to supply is equal to zero even in the most favorable practice environments. On the intensive margin (column (2)), the estimates imply an elasticity of 0.08 for both NP and PA supply in states with the most favorable environments. Columns (3) and (4) examine the determinants of primary care visits. This variable is constructed by summing the predicted likelihood that each visit is a primary care visit across all visits made by each individual. The estimated total number of non-primary care visits, analyzed in columns (5) and (6), is constructed similarly. Across all areas, the average response of both types of visit to provider supply is minimal. However, the number of primary care visits is more responsive to NP supply in states that permit NPs greater autonomy than those with restrictive environments (column (4)). Since we may expect that additional providers have a greater impact for certain patient segments, I also examined utilization separately by type of insurance coverage. I find no evidence that provider supply is more important for the two groups most likely to face access problems (Medicaid recipients and the uninsured), though practice environment estimates are imprecise.27 For most of these subpopulations, the conditional correlation between provider supply and utilization is small and statistically insignificant, as are the coefficients on the practice environment interactions.

Table 5 presents estimates that separate the practice index into its three components: reimbursement policies, legal restrictions on practice, and prescriptive authority. These specifications replace the overall index (a weighted average of these three components) with the component-specific indices one-by-one. Positive coefficients on the interactions between NP supply and the prescriptive authority and legal standing indices suggest these are the components of the NP index (rather than reimbursement parity) that explain its importance to NP supply effects. The patterns for the components of the PA index are broadly consistent, though all these results should be interpreted cautiously, as estimates are imprecise.

Estimates suggest that provider concentration – whether NPs or PAs – has minimal impact on utilization (both extensive and intensive margin) once time-invariant area characteristics and linear time trends are controlled for. The estimates are sufficiently precise that I can rule out increases in the likelihood of having at least one visit of 0.75 (1.12) percentage points associated with a 50% increase in NP (PA) supply and an elasticity of 0.03 (0.08) on the intensive utilization margin. However, utilization does appear to be more responsive to NP supply changes in states that permit these non-physician clinicians greater autonomy, particularly in the realm of prescriptive authority.

27
See Table B9 in Appendix B for these results.

5.2. Prices

Theory predicts that an expansion of the supply and autonomy of NPs and PAs should reduce prices in the market for services for which they provide the greatest substitute for physician care. Table 6 reports estimates of Eq. (2) where log of visit price is the dependent variable. Visit prices and provider supply are very weakly positively correlated in the raw data (column 1). However, if NPs and PAs have expanded in areas with rising demand for care due to increased health needs, then this could create a
Table 6
OLS estimates of provider density and interaction with regulatory environment index on visit prices.
log(total charges)
log(total charges) log(total paid) Exclude Medicare Check-up: no Well child exam: 2SLS log(total charges) log(total paid)
and Medicaid condition no condition
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
log(NP per population) 0.009 0.036 −0.017 −0.042** 0.001 −0.019 0.005 −0.011 −0.092 −0.041
(0.020) (0.031) (0.030) (0.020) (0.028) (0.026) (0.056) (0.063) (0.179) (0.089)
log(PA per population) 0.007 0.004 −0.009 −0.009 0.004 −0.030 0.001 0.055 0.012 0.038
(0.022) (0.023) (0.016) (0.019) (0.025) (0.023) (0.063) (0.064) (0.077) (0.088)
log(NP per population) × predicted primary care 0.042* 0.042* 0.048* 0.022 0.035
(0.025) (0.022) (0.024) (0.041) (0.034)
K. Stange / Journal of Health Economics 33 (2014) 1–27

log(PA per population) × predicted primary care −0.038* −0.005 −0.047* −0.016 0.034
(0.022) (0.018) (0.023) (0.060) (0.058)
log(NP per population) × NP index 0.105 0.003
(0.223) (0.122)
log(PA per population) × PA index −0.030 −0.069
(0.096) (0.106)
log(NP per population) × predicted primary care × NP index 0.024 0.005
(0.036) (0.033)
log(PA per population) × predicted primary care × PA index −0.029 −0.044
(0.060) (0.058)
Predicted likelihood of primary care −0.513*** −0.548*** −0.495*** −0.561*** −0.541*** −0.499***
(0.015) (0.073) (0.061) (0.079) (0.070) (0.056)
Additional controls
Individual, MD density, procedures, county FE No Yes Yes Yes Yes Yes Yes See Yes Yes
State × time, county characteristics × time No No Yes Yes Yes Yes Yes Notes Yes Yes
F-test for provider supply coefficient = 0 when primary care = 1 (100%) and practice index = 1 (100%)
Nurse practitioners (p-value) 0.381 0.999 0.082 0.263 0.975
Physician assistants (p-value) 0.011 0.405 0.053 0.019 0.129
Adjusted R-squared 0.026 0.113 0.114 0.077 0.124 0.053 0.122 0.105 0.114 0.077
Rounded N 762,500 761,300 756,900 734,600 527,300 96,100 13,100 756,900 756,900 734,600
Note: All specifications include year fixed effects. Robust standard errors clustered by state in parentheses. Individual controls include male, age, age squared, dummies for four income categories, dummies for public, private,
or no insurance, and dummies for three self-reported health categories. Time-invariant county characteristics that are interacted with time (linearly) include the fraction of persons in poverty (1989), infant mortality rate
(1988), fraction of workforce in health (1990), fraction of workforce in manufacturing (1990), unemployment rate (1990), fraction white (1990), fraction with high school education (1990), fraction Hispanic (1990), and
HMO penetration rate (1998). Predicted likelihood of being a primary care visit was estimated by predicting whether each individual office visit was to a primary care provider based on broad visit category, the individual
characteristics listed above, and the medical condition (if any) associated with the visit. Excluded instruments in 2SLS estimates are the number of BA RN programs in 1963 in county per population and the number of PA
programs in 1975 in county per population. 2SLS specifications include year and state fixed effects, individual controls, and time-invariant county characteristics, the predicted likelihood that a visit is primary care, and
procedure dummies. See text for further explanation.
* p < 0.1.
** p < 0.05.
*** p < 0.01.
positive omitted variable bias between visit prices and NP or PA concentration. Column (2) controls for individual characteristics, indicators for 20 different treatments or procedures performed during the visit, and the estimated likelihood that the visit is to a primary care provider, based on person demographics, the type of visit, and associated conditions. The estimates suggest that primary care visits are predicted to cost 40% less than visits that can only be performed by specialists. This control has little effect on the estimated price elasticities, which remain small and insignificant.28 Since many visits are to specialist physicians, we may not expect there to be large price impacts of greater availability of nurse practitioners and physician assistants, who work largely in primary care. We would expect to see the largest price effects on visits for which NP and PA care is the most substitutable for physician care. Specification (3) explores this possibility by interacting NP and PA supply with the estimated likelihood that a given visit is primary care. Negative point estimates on these interactions would suggest that the prices respond more (negatively) to expanded provider supply for visits that are more likely to be to a primary care (rather than specialist) provider. This pattern is not seen in the data. Greater NP supply is associated with a positive price change for visits that are likely to be primary care, compared to an insignificant zero or negative change for non-primary care visits. Point estimates for PA supply are indeed negative and approaching statistical significance in some specifications, though still very small. The pattern is unchanged regardless of whether total charges or total amount paid (column 4) is used as the measure of price. Fig. 3 applies this approach even more flexibly. I estimate Eq. (2) separately for twenty quantiles of predicted probability of primary care. There is no obvious relationship between the estimated price elasticity and predicted likelihood of being a primary care visit. At all ranges of visit types, from general check-ups (high likelihood of being primary care) to cancer diagnosis (low likelihood), the estimated price elasticity bounces around zero. This is true both for NP and PA supply and regardless of how price is measured. Columns (5)–(8) probe the robustness of this null result. Column (5) excludes respondents with Medicare or Medicaid insurance coverage, whose prices may be less subject to market forces, as reimbursement rates are set by the Medicare and Medicaid programs. These results are indistinguishable from the preferred specification in column (3). Columns (6) and (7) examine two common types of visits expected to be easily performed by NPs and PAs: regular check-ups and well-child exams with no associated medical conditions. Again, provider supply has minimal association with price. Column (8) presents 2SLS estimates that exploit cross-sectional variation in provider supply induced by proximity to the historical training infrastructure. The 2SLS results are very consistent with the fixed effects estimates: provider supply has minimal effect on visit prices overall, though the 2SLS estimates are much less precise.29

The final two columns of Table 6 permit the price elasticity to vary with predicted likelihood of being primary care, state practice environment index, and their interaction. If there is to be any significant price effect, we may expect to find it among visits for which provider type is highly substitutable and state laws are the least restrictive. Even for this specific group of visits, the estimated price elasticity is wrong-signed or very small: +0.06 for NP supply and −0.06 for PA supply, though the latter is statistically significant. Overall, it appears that provider supply has minimal impact on visit price, even for services expected to be easily shifted from physician to non-physician care.

5.3. Expenditure on office-based visits

Table 7 examines the impact of NP and PA supply on health care expenditure for total office-based provider visits.30 Expenditure tends to be positively (though insignificantly) correlated with provider supply, even after the preferred set of controls (individual characteristics, physician supply, linear time trends, county fixed effects) is included. Across all individuals and areas, the point estimates imply an (insignificant) 0.032% increase in expenditure associated with a 1% increase in PA supply and an (insignificant) 0.003% increase associated with a similarly sized expansion of NP supply. Point estimates of expenditure elasticities are largest for NP supply and Medicaid recipients and PA supply and the uninsured. Though most of the practice index interactions are positive, none of the elasticities implied by the point estimates for the most NP- and PA-favorable states are significant at conventional levels.

5.4. Qualitative measures of access, preventive care, and health

Even if broad measures of utilization and expenditure are unresponsive to expanded NP and PA supply and scope-of-practice, it is possible that these changes alter individuals’ interaction with the health system or the nature of the care they receive. Table 8 presents OLS estimates of Eq. (1) with an indicator for several other health care and health outcomes as the dependent variables. Columns (1) and (2) examine whether the individual has a “usual source of care,” the one qualitative measure of access that was consistently assessed in the MEPS through the entire analysis period. Twenty-two percent of my sample does not have a usual source of care. When only year fixed effects are controlled for, a greater number of providers of either type is associated with an increased likelihood of having a usual source of care. However, this pattern seems to be driven by county and individual characteristics that differ across areas, since this relationship is greatly diminished with controls. Specification (2) controls for changes in population characteristics that may be correlated both with provider concentration and access, fixed county characteristics, and linear time trends by state and county characteristic. The point estimates are also small in magnitude: I can rule out an increase in the likelihood of having a usual source of care of 0.3 percentage points associated with a 10% increase in NP or PA supply.

Columns (3)–(8) examine the relationship between provider supply and several important preventive care outcomes. Greater availability of non-physician clinicians, particularly nurse practitioners, may expand the use of preventive care services both due to greater provider availability to perform low-value (e.g. poorly reimbursed) services and also because nurse practitioners’ training emphasizes prevention. Estimates suggest that a greater supply of non-physician clinicians is not associated with a greater likelihood of getting a flu shot, checking blood pressure or cholesterol, having a breast exam, or having a pap smear in the past 12 months. The final column examines whether respondents report being in very good or excellent health. The relationship between this measure of health and provider supply is positive, but very weak and insignificant.31

28
Including fixed effects for one of 600 clinical conditions (or none) associated with the visit instead of the predicted likelihood the visit is primary care produces similar results.

29
I also constructed 2SLS estimates separately by quintile of predicted likelihood of being made to a primary care provider. Even for visits that NPs and PAs would be expected to be the most substitutable for physician care, there is no evidence of price impacts of greater NP or PA supply. See Appendix B, Table B10.

30
Extensive margin effects are very similar to those reported in Table 3 for total visits.

31
In results not reported here, I also find that the coefficients on interactions between provider supply and NP and PA state practice indices are insignificant
Fig. 3. Estimated elasticity of price with respect to change in provider supply, by likelihood of visit being primary care.
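Because the price regressions use a logged dependent variable, a coefficient on a non-logged regressor translates into a percentage change through exp(coefficient) − 1. The sketch below is an illustrative check, not the author's code. It applies the conversion to the first coefficient on the predicted likelihood of primary care in Table 6 and to the prescriptive-authority coefficients in row (4) of Table 9, which underlie the 40%, 3% and 5% figures cited in Sections 5.2 and 6. The function name is hypothetical.

```python
import math


def log_points_to_percent(coef):
    """Percentage change in the outcome implied by a coefficient in a
    log-outcome regression, for a one-unit change in the regressor."""
    return 100 * (math.exp(coef) - 1)


# Table 6: coefficient on predicted likelihood of primary care (0 -> 1)
primary_care_discount = log_points_to_percent(-0.513)

# Table 9, row (4): NP and PA prescriptive authority, log(visits)
np_prescribe = log_points_to_percent(0.031)
pa_prescribe = log_points_to_percent(-0.053)

print(f"Primary care vs. specialist visit price: {primary_care_discount:.1f}%")
print(f"NP prescriptive authority, visits: {np_prescribe:+.1f}%")
print(f"PA prescriptive authority, visits: {pa_prescribe:+.1f}%")
```

For small coefficients the exact conversion and the coefficient itself nearly coincide, which is why the text can read 0.031 as roughly 3% more visits. For a coefficient as large as −0.513 the conversion matters, and it gives about 40% rather than 51%.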
Table 7
OLS estimates of provider density and interaction with regulatory environment index on total office-based visit expenditure.
Dependent variable = log(total amount paid)
All individuals Medicare Medicaid Private Uninsured
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
log(NP per population) 0.003 −0.287* −0.006 −0.443* 0.052 −0.554 0.010 0.074 −0.004 −0.869*
(0.026) (0.148) (0.046) (0.261) (0.092) (0.371) (0.033) (0.213) (0.055) (0.441)
log(PA per population) 0.032 −0.085 0.015 −0.217 −0.035 −0.507*** 0.047* −0.013 0.186** 0.607
(0.027) (0.128) (0.063) (0.272) (0.049) (0.180) (0.028) (0.174) (0.086) (0.383)
log(NP per population) × NP index 0.393* 0.591 0.826 −0.087 1.188*
(0.212) (0.378) (0.519) (0.272) (0.607)
log(PA per population) × PA index 0.160 0.316 0.648** 0.083 −0.580
(0.167) (0.342) (0.277) (0.221) (0.558)
log(MD per population) −0.069** −0.068** −0.037 −0.033 −0.082 −0.072 −0.059** −0.057** −0.208 −0.238
(0.026) (0.028) (0.043) (0.043) (0.062) (0.061) (0.024) (0.024) (0.148) (0.148)
Adjusted R-squared 0.173 0.173 0.096 0.096 0.196 0.196 0.149 0.149 0.108 0.108
Rounded N 171,300 171,300 30,300 30,300 41,100 41,100 108,700 108,700 14,000 14,000
F-test for provider supply coefficient = 0 when practice index = 1 (100%)
Nurse practitioners (p-value) 0.140 0.265 0.143 0.848 0.085
Physician assistants (p-value) 0.129 0.307 0.180 0.220 0.891
Note: All specifications include year fixed effects, individual characteristics, county fixed effects, linear time trends for each state, and linear time trends by time-invariant
county characteristics. Robust standard errors clustered by state in parentheses. Individual controls include male, age, age squared, dummies for four income categories,
dummies for public, private, or no insurance, and dummies for three self-reported health categories. Time-invariant county characteristics that are interacted with time
(linearly) include the fraction of persons in poverty (1989), infant mortality rate (1988), fraction of workforce in health (1990), fraction of workforce in manufacturing (1990),
unemployment rate (1990), fraction white (1990), fraction with high school education (1990), fraction Hispanic (1990), and HMO penetration rate (1998).
* p < 0.1.
** p < 0.05.
*** p < 0.01.
for all these outcomes. The direct and interactive effects are also small and insignificant for all insurance subgroups for the likelihood of having a usual source of care. Lastly, 2SLS estimates of these outcomes are all insignificant with mixed signs (some positive, some negative), though much less precise than the fixed effects estimates reported here.

6. Direct impact of regulation

This paper is primarily concerned with how the regulatory environment moderates the effect of increases in NP and PA supply on various outcomes. Though provider supply has a relatively weak association with utilization and access, I do find that provider supply is more positively correlated with utilization in states that permit NPs to be more substitutable for physicians. That is, there is some evidence that this form of occupational regulation weakly impacts the healthcare market by moderating the effects of provider supply. It is also possible that the regulatory environment has a direct impact on these same outcomes. The previous analysis controlled for the direct effect of states’ regulatory environment (at a point in time) through the inclusion of county fixed effects. In order to quantify the direct impact of the regulatory environment while still controlling for cross-sectional differences between areas that may be correlated with
Table 8
OLS Estimates of provider density and interaction with regulatory environment index on usual source of care, preventative outcomes, and health (linear probability model).
Had the following in the previous 12 months
Have usual source of Flu shot Blood pressure Cholesterol Pap smear Breast exam Average of 5 Health is very
care check check (women 18 + ) (women 18 + ) preventive good or
outcomes excellent
(1) (2) (3) (4) (5) (6) (7) (8) (9)
log(NP per population) 0.041** 0.007 −0.008 −0.010 0.009 0.018 0.020 0.000 0.008
(0.018) (0.014) (0.016) (0.009) (0.009) (0.012) (0.012) (0.008) (0.020)
log(PA per population) 0.015* 0.009 0.013 0.011 0.015 0.002 0.001 0.006 0.022
(0.009) (0.009) (0.010) (0.010) (0.015) (0.018) (0.019) (0.011) (0.014)
log(MD per population) 0.032 0.006 0.011 0.004 −0.010 0.012 0.010 0.008
(0.022) (0.013) (0.017) (0.015) (0.020) (0.039) (0.013) (0.010)
Controls None Full Full Full Full Full Full Full Full
N (rounded) 265,500 264,200 172,600 171,400 166,000 90,500 90,800 175,300 281,500
Adjusted R-squared 0.006 0.191 0.209 0.159 0.230 0.089 0.072 0.230 0.110
Note: All specifications include year fixed effects, individual characteristics, county fixed effects, linear time trends for each state, and linear time trends by time-invariant
county characteristics. Robust standard errors clustered by state in parentheses. Individual controls include male, age, age squared, dummies for four income categories,
dummies for race/ethnicity, dummies for public, private, or no insurance, and dummies for three self-reported health categories (except (9)). Time-invariant county charac-
teristics that are interacted with time (linearly) include the fraction of persons in poverty (1989), infant mortality rate (1988), fraction of workforce in health (1990), fraction
of workforce in manufacturing (1990), unemployment rate (1990), fraction white (1990), fraction with high school education (1990), fraction Hispanic (1990), population
density (1992), and HMO penetration rate (1998).
* p < 0.1.
** p < 0.05.
*** p < 0.01.
regulation and health care outcomes, I exploit changes in one component of regulation – prescriptive authority – within states over time.

Table 9 presents estimates of models that regress health care utilization, access, prices, and expenditure on time-varying indicators for whether NPs and PAs are permitted to write prescriptions for controlled substances, controlling for state fixed effects and individual characteristics. Since information about prescriptive authority is available for all states and years, these models use nearly the entire sample of individuals in the MEPS. The even rows additionally control separately for the log of number of NPs, PAs, and primary care physicians. Changes in provider supply may be part of the causal effect of changes in practice environment (Kalist and Spurr, 2004), so these specifications isolate any effects that operate through other channels (and could reduce bias if the correlation between provider supply and laws is spurious). When examining price of individual visits, these models also include controls for all procedures and treatments provided during the visit and indicators for one of 600 conditions associated with the visit (including none).

I find that granting NPs the ability to prescribe has a very modest impact on the intensive utilization margin: NP prescriptive authority is associated with 3% more visits conditional on having at least one. For PAs, the opposite is true: granting PAs the ability to prescribe is actually associated with 5% fewer visits conditional on having at least one, though this result is sensitive to whether supply is controlled for. Expansive NP prescriptive authority is positively associated with increases in the likelihood of having at least one visit, but this is statistically insignificant. NP prescriptive authority is modestly associated with greater visit charges, though this does not translate into greater prices paid. PA prescriptive authority is not associated with changes in visit prices by either measure. Thus, permitting NPs and PAs to do more also does not appear to create price pressure on office-based visits. Given the minimal price impact of the regulation, the patterns for expenditure follow those for utilization pretty closely. There is a positive, though modest, association between NP prescriptive authority and expenditure (both on the extensive and intensive margin). Together these results suggest that changes in NP and PA prescriptive authority – one key component of the overall regulatory environment – have only modest impact on the market for health care services.

7. Discussion and conclusion

This paper is the first to assess the output market effects of the enormous increase in supply of nurse practitioners and physician assistants, the interaction of this growth with occupational restrictions, and an expansion of these providers’ scope-of-practice. My findings suggest that, across all areas, greater supply of NPs and PAs has had minimal impact on utilization, access, preventative health services, and prices. However, primary care utilization is moderately responsive to NP provider supply in areas that grant non-physician clinicians the greatest autonomy to practice independently. I find no evidence that increases in provider supply decrease prices, even for visits most likely to be affected by NPs and PAs: primary care visits in states with a favorable regulatory environment for NPs and PAs. I also find that expansions in prescriptive authority for NPs are associated with modest increases in utilization and expenditure, though no consistent pattern emerges for expansions to PA prescriptive authority. Neither change appears to consistently reduce visit prices.

The results of this paper suggest that even considerable changes in the nature of who is providing health care can result in only modest changes in important outcomes such as access, overall utilization, prices, and expenditure. There is also suggestive evidence that occupational regulation may play some role in input substitutability and thus moderate the relationship between input availability and the aggregate supply of primary health care. An important implication is that licensing laws – which determine the division of labor and thus how labor inputs translate to services – may be as important as policies that expand supply directly. My results call for a reconsideration of the nature of federal healthcare workforce efforts, which have mostly focused on supply expansion rather than altering how existing labor is used.

Why a greater number of providers has not significantly altered the healthcare market remains an unanswered question. One possibility is that existing providers – physicians, NPs, and PAs – reduce their work hours in response to provider expansion, limiting the effective supply increase to less than the number of
Table 9
OLS Estimates of NP and PA prescriptive authority on various outcomes.
Columns: outcome; specification number; coefficient on NP prescribe; coefficient on PA prescribe; control for supply (Y/N); level of fixed effects; N (rounded).
Utilization (individuals)
Office-based provider visit > 0 (1) 0.005 0.004 N State 400,500
(0.006) (0.006)
(2) 0.013* 0.001 Y County 282,800
(0.007) (0.009)
log(office-based provider visits) (3) 0.019 −0.009 N State 272,600
(0.013) (0.013)
(4) 0.031** −0.053*** Y County 189,400
(0.013) (0.014)
Have usual source of care (5) −0.002 0.002 N State 371,100
(0.006) (0.005)
(6) −0.002 −0.010 Y County 265,500
(0.007) (0.007)
Visit prices (visits)
log amount paid (check-up visits) (7) 0.014 0.005 N State 313,100
(0.011) (0.010)
(8) −0.002 0.004 Y County 218,300
(0.013) (0.018)
log amount paid (diagnose/treat visits) (9) 0.017 −0.007 N State 573,100
(0.015) (0.013)
(10) −0.004 −0.005 Y County 395,100
(0.013) (0.011)
log total charges (check-up visits) (11) 0.035*** 0.010 N State 321,800
(0.013) (0.011)
(12) 0.029* −0.015 Y County 224,300
(0.016) (0.018)
log total charges (diagnose/treat visits) (13) 0.035* −0.005 N State 590,300
(0.019) (0.016)
(14) 0.005 −0.014 Y County 406,900
(0.020) (0.012)
Expenditures (individuals)
Office-based expenditure > 0 (15) 0.007 0.004 N State 400,500
(0.006) (0.005)
(16) 0.015** 0.000 Y County 282,800
(0.007) (0.008)
log(office-based expenditure) (17) 0.043** −0.010 N State 267,300
(0.021) (0.017)
(18) 0.027* −0.065*** Y County 185,700
(0.015) (0.014)
Note: Each row is a separate regression of the outcome on indicators for whether NPs and PAs were permitted to prescribe controlled substances in that state-year, controlling
for male, age, age squared, dummies for four income categories, dummies for public, private, or no insurance, dummies for three self-reported health categories, and either
state or county fixed effects. Even rows additionally control separately for the log of number of NPs, PAs, and primary care physicians. Models for visit-level prices also
include indicators for all procedures and treatments provided on the visit and indicators for one of 600 conditions associated with the visit. Robust standard errors clustered
by state in parentheses.
* p < 0.1. ** p < 0.05. *** p < 0.01.
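The percentage effects quoted in the text (3% more visits for NPs, 5% fewer for PAs) correspond to the log-outcome rows of Table 9. As a rough check, and assuming the usual reading of an indicator coefficient in a log-linear regression (an interpretation the original does not spell out), the sketch below converts the row (4) coefficients into approximate percent changes. The function name is illustrative.

```python
import math

def log_coef_to_percent(coef: float) -> float:
    """Percent change implied by an indicator coefficient in a log-outcome regression."""
    return (math.exp(coef) - 1.0) * 100.0

# Coefficients from Table 9, row (4): log(office-based provider visits), county fixed effects.
np_prescribe = 0.031
pa_prescribe = -0.053

np_pct = log_coef_to_percent(np_prescribe)
pa_pct = log_coef_to_percent(pa_prescribe)

print(f"NP prescriptive authority: {np_pct:+.1f}% visits")
print(f"PA prescriptive authority: {pa_pct:+.1f}% visits")
```

For coefficients this small the exact conversion and the raw coefficient times 100 are nearly identical, which is why the text can read them directly as percentages.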
providers would suggest. There is evidence that physicians reduce the number of hours spent on patient care in response to public health insurance expansions (Garthwaite, 2012), so it is reasonable to expect a similar response to a greater number of providers. Understanding how changes in provider supply and regulation alter work hours and earnings of existing providers is an important topic to be explored, albeit with different data.
A second possibility is that current reimbursement policies – which create incentives for physician involvement in services provided by NPs and PAs in order to bill at a higher rate – limit efficient substitution between providers, preventing cost (and price) reductions and utilization increases from materializing. While I do not observe a differential response in states with greater reimbursement parity between physicians, NPs, and PAs, the test is admittedly weak. Relatedly, a lack of direct billing by lower-cost NPs and PAs may combine with rigid insurance payment schemes (particularly by the Medicare and Medicaid programs) to make prices unresponsive to provider supply. Minimal price effects are seen when looking at visit charges (which are not directly dictated by insurance plans) and when excluding people with public insurance, but reimbursement-driven rigidity in price-setting is one plausible explanation for minimal price response.
A third possibility is that the number of providers may be less important than the organizational structure in which their services are delivered. Community health clinics (CHCs) have been shown to have substantial effects on healthcare access and health outcomes (Bailey and Goodman-Bacon, 2012), but isolated provider supply expansions absent the outreach and other services provided by CHCs may be less effective. A related possibility is that physicians and non-physicians still have very different views about the role of the latter in health care delivery, limiting gains from provider supply despite recent legislative changes. Donelan et al. (2013) find that MDs and NPs still possess very different views on hospital admitting privileges, equal pay, and quality of care provided by NPs and that 8 out of 10 NPs work in a practice with an MD. Scope-of-practice laws expand the frontier of what NPs and PAs can do, but do not require that practices take full advantage of this frontier. Finally, it is possible that patients’ interactions with
the healthcare system have been altered in ways that are not easily captured by overall measures of utilization and prices. For instance, greater NP and PA supply may facilitate the provision of team-based care and task specialization that improves the quality of and patients’ satisfaction with care without altering the overarching patterns of utilization. Changes in task specialization are one explanation proposed for the modest economic impacts observed for immigration (Peri and Sparber, 2009). All of these are fruitful areas for further exploration, with important implications for the design and implementation of healthcare workforce policy.

Appendix A. Data appendix

Nurse practitioner and physician assistant supply data

In collaboration with Deborah Sampson from Boston College School of Nursing, I assembled a new dataset containing the number of licensed nurse practitioners, physician assistants, and physicians (by specialty) at the county level annually for the years 1990–2008. This data was constructed from individual licensing records obtained from state Boards of Nursing, Medicine, Health, Commerce and other relevant state licensing agencies. The typical license record includes the provider’s name, mailing address (typically home), license number, license type, issue date, expiration date, and status. We aggregated these individual records to construct total counts of the number of active PA and NP licenses in each county in each year for as many years as possible.32 Data on the number of physicians (by specialty) was obtained from the Area Resource File.
Our aggregation currently makes three main assumptions. First, only licensees’ current (or most recent, if the license is expired) address is kept on file, so we have applied this address to all years of license activity.33 Second, licenses with out-of-state addresses are assumed not to be actively practicing in the state. Many providers are licensed in multiple states, though primarily practice in only one. Since address information was less complete for out-of-state licenses and there is more uncertainty about county of practice, we do not include out-of-state licenses in our county counts. This likely understates the number of providers, particularly for border counties and small states. This undercounting will not bias our estimates if it remains fixed over time since our analysis includes county fixed effects. Lastly, our measures reflect active licenses, not necessarily actively practicing practitioners. It is possible that providers will maintain an active license even if they are not actively practicing. If this pattern changes over time, our trends may over- or understate the true trends in provider supply.
We successfully collected at least some historical license data on both NPs and PAs from 35 states. We found that many states did not retain or would not provide records on inactive/expired licenses, or these licenses were missing key fields (e.g. address or issue date). Our sample is geographically diverse, with representation from most parts of the country. Our weakest coverage is in the upper mountain/plains states and the lower Mississippi River states. The years for which data is available vary across states, so our county panel is unbalanced: NP and PA supply data is available for 23 states covering 52% of the U.S. population in 1996, but increases to 35 states covering 80% in 2008.
Comparing our estimates of NP and PA supply to other sources is difficult because there are no definitive sources for NP supply nationally or subnationally. The implied NP to population ratio for our sample in 2008 (40 NPs per 100,000 people) is just slightly less than that implied by the National Sample Survey of Registered Nurses in 2008. In that survey (which is used to construct Fig. 1), there are 158,348 nurses who have received NP preparation, or 52 per 100,000 people in the U.S. in 2008. Of these, only 141,286 are employed in nursing (46 per 100,000 people) and 131,678 (43 per 100,000 people) are employed in nursing and have either a national certification or recognition as an NP from a State Board of Nursing (which is what they would need to be included in our sample). The implied PA to population ratio for our sample in 2007 (22 PAs per 100,000 people) is similar to that implied by the number of PAs actively practicing in the U.S. (68,000 or 22 per 100,000 people) as estimated by AAPA and reported by Morgan and Hooker (2010). So we interpret our NP and PA supply estimates to be comparable to national figures, but with the benefit of providing subnational estimates over time.

32 For several states we obtained number of active licensed providers by county over time directly from annual summary reports published by the states, rather than individual license records.
33 For instance, if a licensed NP lived in Washtenaw County (MI) from 1990 to 2002 and Wayne County (MI) from 2003 to present, they would be counted in the total for Wayne County for the entire 1990–present time period. This no-mobility assumption is more problematic for years further back in time or far from the license expiration date.

NP and PA practice index

An overall index of the professional practice environment for NPs and PAs in each state in 2000 was obtained from the Health Resources and Services Administration (HRSA, 2004). This index ranks states separately for NPs and PAs along three dimensions: (1) legal standing and requirement for physician oversight/collaboration on diagnosis and treatment; (2) prescriptive authority; and (3) reimbursement policies. These three dimensions are then combined into a single index for each profession with a possible range from zero to one. For each of the three indices, the legislation and policies of each state are scored along many specific criteria. For instance, the “legal” index (35% of total for NPs, 35% for PAs) includes components related to whether autonomous practice is possible, the required type of practice agreements with physicians, rules regulating review by physicians, and board oversight, among others. While the specific components and weights differ between NPs and PAs, collectively they all measure the extent of autonomy the two professions have from physician oversight and control. The prescriptive authority index (30% for NPs, 40% for PAs) includes measures of the type of drugs NPs and PAs can prescribe, the requirements for physician oversight, whether the NP or PA uses their own DEA number, and whether they sign the prescription or can sign for samples, among others. The reimbursement index (35% for NPs, 25% for PAs) includes points based on Medicaid reimbursement rates and requirements for private insurers to reimburse for NP or PA services. A detailed listing of the score of each state along every specific criterion can be found in HRSA (2004).

State laws on prescriptive authority

We also constructed indicators for whether nurse practitioners and physician assistants are permitted to write prescriptions (any, some controlled substances, levels V through II controlled substances) in a given state and year. Prescriptive authority was coded from various issues of the journal Nurse Practitioner and from Abridged State Regulation of Physician Assistant Practice, distributed by the American Academy of Physician Assistants.

Data on nursing and PA schools

Data on all current and closed PA schools and programs, including their location, opening and closing dates was obtained from the
Physician Assistant Education Association and the Accreditation Review Commission on Education for the Physician Assistant (ARC-PA). Information on the location of basic RN training programs in 1963, by type (diploma, Associates, Bachelors) was obtained from State Approved Schools of Professional Nursing, 1963 (National League for Nursing).

Predicting likelihood of primary care

For each visit in the MEPS office-based visits files I construct a measure of the predicted likelihood of seeing a primary care provider, given observed individual and visit-level characteristics. Specifically, I estimate the following probit model using data from 2002 to 2008:

$$\mathit{PrimaryCare}_{imjt} = \mathit{VisitCategory}_{imjt} + \mathit{Condition}_{imjt} + \beta_x X_{ijt} + \varepsilon_{imjt}$$

The outcome, PrimaryCare, is an indicator for whether the visit was to a family practice, general practice, or internal medicine physician, pediatrician, nurse or nurse practitioner, or physician assistant. VisitCategory is a set of dummy variables for each of five types of visits: general check-up or well-child visit, diagnosis or treatment, emergency, post-op follow-up visit, or shots (the baseline category). Condition is a set of 600 dummy variables for the condition associated with the visit (including none). Individual factors such as income category, age, education, and health risks are included in $X_{ijt}$. Estimates for this model are presented in the third column of Table B4 in Appendix B. The model is then used to predict the likelihood that each individual visit (in all years) would be to a primary care provider based on these characteristics. Importantly, this predicted value does not depend on provider supply. For instance, all “check-ups” not associated with any specific condition by 40-year-old men with the same income and insurance will have the same predicted likelihood of being primary care, regardless of the NP and PA supply in their county in the year of the visit. The most common high-likelihood visits include flu shots, no-condition checkups, and sore throats.

Appendix B. Additional tables

Tables B1–B10.
Table B1
Summary statistics, person sample.
Mean SD Mean SD

MD per population (×100,000) 89.619 44.211 88.879 46.38
NP per population (×100,000) 30.166 20.629 N/A
PA per population (×100,000) 17.082 10.979 N/A
NP practice index (2000) 0.744 0.134 0.738 0.129
PA practice index (2000) 0.772 0.107 0.737 0.141
NPs can prescribe controlled substances in state × year 0.790 0.408 0.724 0.447
PAs can prescribe controlled substances in state × year 0.733 0.442 0.674 0.469
Individual characteristics
Male 0.477 0.499 0.477 0.499
Age 33.682 22.261 34.038 22.358
Income category 1 (lowest) 0.258 0.437 0.253 0.435
Income category 2 0.171 0.376 0.167 0.373
Income category 3 0.294 0.456 0.297 0.457
Income category 4 (highest) 0.277 0.448 0.283 0.45
Have private insurance 0.587 0.492 0.603 0.489
Have public insurance 0.246 0.431 0.238 0.426
Have no insurance 0.167 0.373 0.158 0.365
Health very good 0.603 0.489 0.599 0.49
Health good 0.247 0.431 0.245 0.43
Health bad 0.117 0.321 0.121 0.326
Hispanic 0.316 0.465 0.258 0.437
Non-Hispanic white 0.48 0.500 0.530 0.499
Non-Hispanic black 0.145 0.352 0.158 0.365
Other race 0.059 0.235 0.054 0.227
Office-based visits > 0 0.623 0.485 0.631 0.483
Number of office-based visits 2.781 5.517 2.831 5.479
Primary care office-based visits > 0 0.572 0.495 0.575 0.494
Number of primary care office-based visits 1.456 2.702 1.47 2.674
Charges for primary care office-based visits > 0 0.571 0.495 0.573 0.495
Total charges for primary care office-based visits 274.24 1035.25 268.60 972.92
Amount paid for primary care office-based visits > 0 0.561 0.496 0.563 0.496
Total amount paid for primary care office-based visits 152.84 365.66 151.34 356.82
Have usual source of care 0.780 0.414 0.788 0.409
Flu shot in last 12 months 0.272 0.445 0.276 0.447
Blood pressure check in last 12 months 0.773 0.419 0.782 0.413
Pap smear in last 12 months 0.569 0.495 0.568 0.495
Breast exam in last 12 months 0.615 0.487 0.617 0.486
Cholesterol check in last 12 months 0.526 0.499 0.523 0.499

County characteristics
% in poverty (1989) 13.819 7.07 13.989 7.339
Median income (1989) 31,167 7830 30,551 8173
Infant mortality rate (1989) 9.98 2.364 10.165 2.449
% workforce in health industry (1990) 8.065 2.06 8.152 2.093
% workforce in manufacturing (1990) 16.971 7.351 17.586 7.662
Unemployment rate 5.793 2.88 5.864 2.784
% white 77.934 14.416 79.111 15.108
% education HS+ 74.525 9.218 74.147 9.518
% Hispanic ethnicity 15.194 17.574 12.266 16.542
HMO penetration (1998) 0.307 0.171 0.285 0.175
Population density (1992) 2082 6060 1893 5488
# PA schools in 1975/100,000 population 0.012 0.045 0.015 0.09
# BA RN schools in 1963/100,000 population 0.047 0.126 0.05 0.133
# AA RN schools in 1963/100,000 population 0.032 0.094 0.027 0.107
# Diploma RN schools in 1963/100,000 population 0.268 0.472 0.286 0.503
Sample size (rounded to 100) 293,100 404,400
Note: Analysis sample includes all individuals living in counties for which physician, nurse practitioner, and physician assistant supply data is available in their survey year.
Provider supply measures are calculated at the county level. Physician supply only includes non-federal office-based physicians in family/general practice, general pediatrics,
general internal medicine, and general ob/gyn.
* p < 0.1. ** p < 0.05. *** p < 0.01.
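The county-level supply measures summarized in Table B1 are built by aggregating individual license records, as described in Appendix A. The following is a minimal sketch of that aggregation under the stated assumptions (latest address applied to all years, out-of-state addresses dropped). The record layout, the toy records, and the county population are hypothetical, and treating a license as active from its issue year through its expiration year inclusive is a simplification made here, not a rule taken from the paper.

```python
from collections import Counter

# Hypothetical license records: (licensing_state, address_state, address_county,
# issue_year, expiration_year). Layout and values are illustrative only.
records = [
    ("MI", "MI", "Wayne", 1990, 2008),
    ("MI", "MI", "Wayne", 1995, 2001),
    ("MI", "MI", "Washtenaw", 1998, 2008),
    ("MI", "OH", "Lucas", 1992, 2008),  # out-of-state address: excluded from counts
]

def county_year_counts(records, first_year=1990, last_year=2008):
    """Count active licenses per (county, year), using the address on file for all years."""
    counts = Counter()
    for lic_state, addr_state, county, issued, expires in records:
        if addr_state != lic_state:
            continue  # assumed not to be actively practicing in the licensing state
        for year in range(max(issued, first_year), min(expires, last_year) + 1):
            counts[(county, year)] += 1
    return counts

def per_100k(count, population):
    """Providers per 100,000 residents."""
    return count / population * 100_000

counts = county_year_counts(records)
# Hypothetical county population, used only to show the scaling.
wayne_2000_rate = per_100k(counts[("Wayne", 2000)], population=2_000_000)
print(counts[("Wayne", 2000)], counts[("Wayne", 2005)], wayne_2000_rate)
```

Because every year of a license is assigned to the single address on file, a provider who moved between counties is counted in the later county throughout, which is the no-mobility limitation noted in the appendix.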
Table B2
Summary statistics, office-based visit sample.
Mean SD Mean SD
No condition associated with visit 0.20 0.40 0.20 0.40

Predicted likelihood that visit is primary care 0.51 0.10 0.51 0.10
Visit category = checkup or well-child 0.29 0.46 0.29 0.45
Visit category = diagnosis/treatment 0.54 0.50 0.54 0.50
Visit category = emergency 0.01 0.09 0.01 0.09
Visit category = followup 0.12 0.33 0.12 0.33
Visit category = shots 0.04 0.20 0.04 0.20
See doctor 0.92 0.27 0.92 0.27
Provider was RN/NP 0.07 0.25 0.07 0.26
Provider was PA 0.01 0.11 0.01 0.10
Chemotherapy 0.01 0.07 0.01 0.07
Drug treatment 0.00 0.06 0.00 0.06
IV Therapy 0.00 0.05 0.00 0.05
Kidney dialysis 0.02 0.12 0.02 0.12
Occupational therapy 0.00 0.04 0.00 0.04
Physical therapy 0.03 0.16 0.03 0.16
Psycho therapy 0.01 0.11 0.01 0.11
Radiation therapy 0.00 0.06 0.00 0.07
Received shot 0.02 0.13 0.02 0.14
Speech therapy 0.00 0.02 0.00 0.02
Anesthesia 0.00 0.07 0.00 0.07
EEG 0.00 0.04 0.00 0.04
EKG 0.02 0.15 0.02 0.15
Lab tests 0.25 0.44 0.25 0.44
Mammogram 0.01 0.08 0.01 0.07
MRI 0.01 0.09 0.01 0.09
Other services 0.12 0.33 0.12 0.32
Received vaccine 0.03 0.17 0.03 0.17
Sonogram 0.01 0.12 0.01 0.11
X-rays 0.06 0.23 0.06 0.23
Total amount paid for visit (all sources) 118.76 152.44 116.79 151.52
Total charges for visit 218.84 356.19 213.16 351.57
Sample size (rounded to 100) 803,200 1,114,900
Note: Only office-based visits for which provider was doctor, registered nurse or nurse practitioner, or physician assistant are included. Also excludes visits categorized as
mental health, maternity, eye exam, laser eye surgery, and other. Analysis sample further restricted to visits by individuals living in counties for which physician, nurse
practitioner, and physician assistant supply data is available in their survey year.
* p < 0.1. ** p < 0.05. *** p < 0.01.
Table B3
Summary statistics by year, person sample.
1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008

MD per population (×100,000) 81.04 92.08 89.383 87.693 87.774 89.922 88.167 90.479 91.52 90.322 89.127 90.298 92.137
NP per population (×100,000) 17.16 22.578 22.596 25.276 26.674 27.791 28.812 29.982 31.68 32.965 34.838 36.855 39.643
PA per population (×100,000) 9.326 11.979 12.2 12.863 13.74 14.88 15.916 17.074 18.247 19.121 20.459 21.957 23.909
NPs can prescribe controlled substances in state × year 0.328 0.760 0.624 0.629 0.699 0.747 0.756 0.747 0.894 0.901 0.902 0.936 0.928
PAs can prescribe controlled substances in state × year 0.638 0.488 0.678 0.655 0.617 0.718 0.71 0.712 0.749 0.749 0.744 0.936 0.928
Individual characteristics
Male 0.476 0.473 0.478 0.479 0.478 0.48 0.478 0.475 0.474 0.475 0.475 0.48 0.479
Age 33.16 33.691 33.103 33.352 33.581 33.972 33.567 33.013 33.388 33.606 34.243 34.683 33.912
Income category 1 (lowest) 0.257 0.254 0.245 0.222 0.21 0.214 0.239 0.281 0.289 0.284 0.283 0.257 0.277
Income category 2 0.162 0.159 0.167 0.158 0.166 0.17 0.173 0.177 0.176 0.176 0.172 0.171 0.173
Income category 3 0.294 0.308 0.308 0.316 0.312 0.317 0.307 0.285 0.274 0.277 0.276 0.292 0.288
Income category 4 (highest) 0.286 0.279 0.28 0.304 0.312 0.299 0.282 0.257 0.261 0.263 0.269 0.28 0.262

Have private insurance 0.647 0.643 0.606 0.655 0.646 0.64 0.604 0.554 0.555 0.545 0.541 0.55 0.545
Have public insurance 0.181 0.205 0.22 0.191 0.194 0.2 0.233 0.273 0.273 0.283 0.29 0.279 0.278
Have no insurance 0.172 0.152 0.174 0.155 0.159 0.16 0.163 0.173 0.172 0.172 0.169 0.171 0.176
Health very good 0.614 0.624 0.617 0.633 0.624 0.619 0.601 0.591 0.587 0.578 0.578 0.596 0.617
Health good 0.233 0.232 0.236 0.238 0.245 0.238 0.255 0.253 0.255 0.26 0.258 0.25 0.231
Health bad 0.118 0.115 0.119 0.099 0.098 0.112 0.113 0.122 0.12 0.125 0.128 0.12 0.111
Hispanic 0.303 0.264 0.337 0.337 0.325 0.3 0.309 0.33 0.325 0.326 0.317 0.302 0.328
Non-Hispanic white 0.55 0.562 0.5 0.504 0.502 0.527 0.494 0.462 0.468 0.455 0.453 0.458 0.394
Non-Hispanic black 0.104 0.132 0.127 0.123 0.138 0.133 0.138 0.143 0.141 0.152 0.165 0.163 0.185
Other race 0.044 0.041 0.036 0.036 0.035 0.04 0.059 0.065 0.066 0.066 0.065 0.077 0.093
Office-based visits > 0 0.622 0.624 0.611 0.613 0.621 0.637 0.635 0.63 0.621 0.62 0.624 0.62 0.612
Number of office-based visits 2.797 2.841 2.726 2.603 2.724 2.882 2.892 2.828 2.845 2.8 2.783 2.756 2.58
Primary care office-based visits > 0 0.571 0.571 0.56 0.557 0.564 0.584 0.584 0.58 0.572 0.573 0.574 0.57 0.563
Number of primary care office-based visits 1.499 1.509 1.445 1.373 1.413 1.514 1.508 1.495 1.469 1.461 1.439 1.41 1.376
Non-primary care office-based visits > 0 0.411 0.411 0.397 0.397 0.414 0.426 0.419 0.408 0.407 0.401 0.412 0.409 0.387
Number of non-primary care office-based visits 1.298 1.332 1.281 1.23 1.311 1.368 1.385 1.334 1.375 1.339 1.344 1.346 1.204
Charges for primary care office-based visits > 0 0.567 0.57 0.558 0.556 0.563 0.583 0.583 0.579 0.571 0.572 0.572 0.568 0.561
Total charges for primary care office-based visits 198.36 205.25 211.37 206.49 228.73 264.60 284.13 276.66 302.45 302.76 310.70 316.76 325.36
Amount paid for primary care office-based visits > 0 0.559 0.557 0.547 0.548 0.55 0.569 0.573 0.569 0.563 0.562 0.563 0.559 0.553
Total amount paid for primary care office-based visits 131.78 129.27 129.39 125.50 135.85 156.52 165.19 159.87 162.62 162.11 162.08 160.10 160.27
Have usual source of care 0.790 0.810 0.787 0.783 0.794 0.797 0.782 0.771 0.777 0.771 0.781 0.773 0.763
Flu shot in last 12 months 0.247 0.262 0.258 N/A 0.252 0.263 0.264 0.292 0.206 0.265 0.293 0.312 0.323
Blood pressure check in last 12 months 0.748 0.772 0.764 N/A 0.78 0.781 0.777 0.772 0.777 0.771 0.774 0.777 0.765
Pap smear in last 12 months 0.571 0.605 0.599 N/A 0.611 0.599 0.587 0.567 0.559 0.539 0.542 0.546 0.553
Breast exam in last 12 months 0.602 0.635 0.636 N/A 0.653 0.644 0.631 0.611 0.604 0.592 0.597 0.602 0.607
Cholesterol check in last 12 months 0.453 0.494 0.497 N/A 0.524 0.516 0.514 0.517 0.52 0.532 0.545 0.568 0.563
Sample size (rounded to 100) 12,300 18,400 15,700 16,200 17,000 24,700 30,000 26,300 27,000 26,900 27,400 24,800 26,500
Note: Analysis sample includes all individuals living in counties for which physician, nurse practitioner, and physician assistant supply data is available in their survey year. Provider supply measures are calculated at the
county level. Physician supply only includes non-federal office-based physicians in family/general practice, general pediatrics, general internal medicine, and general ob/gyn.
* p < 0.1. ** p < 0.05. *** p < 0.01.
Table B4
Determinants of whether a visit was to a primary care provider (Probit model).
Dependent variable: provider was primary care provider
(1) (2) (3)
Broad visit category (omitted = “shots”)

Check-up −0.33993*** −0.31285*** −0.32029***
(0.004) (0.004) (0.004)
Diagnose or treat −0.42432*** −0.38006*** −0.38165***
(0.003) (0.004) (0.004)
Emergency −0.27555*** −0.26050*** −0.20946***
(0.007) (0.007) (0.008)
Follow-up −0.47956*** −0.43636*** −0.41842***
(0.002) (0.003) (0.003)
Individual characteristic
Male 0.03145*** 0.03117***
(0.001) (0.001)
Age −0.01596*** −0.01476***
(0.000) (0.000)
Age squared 0.00012*** 0.00011***
(0.000) (0.000)
Poverty category 1 0.09324*** 0.10206***
(0.002) (0.002)
(0.002) (0.002)
(0.002) (0.002)
Private insurance −0.09580*** −0.09402***
(0.003) (0.003)
Public insurance −0.07239*** −0.06270***
(0.003) (0.003)
Health very good 0.01898*** 0.02247***
(0.002) (0.002)
Health good 0.00305* 0.0022
(0.002) (0.002)
Condition associated with visit
No condition 0.05408*** 0.20927
(0.002) (0.135)
Condition fixed effects No No Yes
Observations (rounded) 672,200 668,400 666,800
Pseudo-R2 0.029 0.107 0.198
Note: All specifications include year fixed effects. Robust standard errors clustered by state in parentheses. Sample includes only observations from 2002 to 2008, for which
specialty of physician seen is available. Primary care provider includes general and family practice physician, internal medicine physician, pediatrician, nurse or nurse
practitioner, and physician assistants. Reported coefficients are marginal effects.
* p < 0.1. ** p < 0.05. *** p < 0.01.
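The model in Table B4 is used to assign each visit a predicted likelihood of being primary care that depends only on visit and individual characteristics. A minimal sketch of that prediction step follows. The coefficient values are hypothetical probit index coefficients chosen for illustration (Table B4 reports marginal effects, which cannot be plugged into the index directly), and the field names are assumptions.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical probit index coefficients (NOT the marginal effects in Table B4).
COEF = {
    "const": 0.40,
    "checkup": -0.85,      # visit category dummy; "shots" is the omitted category
    "no_condition": 0.15,  # condition dummy
    "male": 0.08,
    "age": -0.01,
    "age_sq": 0.0001,
}

def predicted_primary_care(visit: dict) -> float:
    """Predicted probability that a visit is to a primary care provider."""
    index = COEF["const"]
    index += COEF["checkup"] * visit["checkup"]
    index += COEF["no_condition"] * visit["no_condition"]
    index += COEF["male"] * visit["male"]
    index += COEF["age"] * visit["age"] + COEF["age_sq"] * visit["age"] ** 2
    return normal_cdf(index)

# Two 40-year-old men with identical check-ups in counties with different NP supply.
# The supply field is carried along but never enters the prediction.
visit_a = {"checkup": 1, "no_condition": 1, "male": 1, "age": 40, "np_per_100k": 15.0}
visit_b = {"checkup": 1, "no_condition": 1, "male": 1, "age": 40, "np_per_100k": 45.0}

p_a = predicted_primary_care(visit_a)
p_b = predicted_primary_care(visit_b)
print(round(p_a, 3), p_a == p_b)
```

The two visits receive the same predicted value because provider supply is not an input to the prediction, which is the property the appendix emphasizes.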
Table B5
Estimates of provider density on number of office-based visits, alternative models.
Dependent variable: number of office-based provider visits
OLS Poisson Negative binomial
X>0 ln X E[X] X>0 ln X E[X] X>0 ln X E[X]

(1) (2) (3) (4) (5) (6) (7) (8) (9)
log(NP per population) −0.003 0.001 −0.130 −0.008 −0.036 −0.100 −0.009 −0.050 −0.143
(0.011) (0.016) (0.095) (0.007) (0.032) (0.090) (0.008) (0.040) (0.115)
log(PA per population) 0.008 0.031 0.152 0.012 0.054 0.149 0.009 0.048 0.138
(0.010) (0.023) (0.107) (0.008) (0.037) (0.102) (0.007) (0.036) (0.105)
N (rounded) 281,500 175,100 281,500 281,500 281,500
Adjusted R-squared 0.163 0.201 0.142
−Log likelihood −835,663 −559,007
Note: Robust standard errors clustered by state in parentheses. All models include the full controls described in previous tables. Columns (1)–(3) represent three different
regressions. Column (3) uses total number of office visits (including zero) as the dependent variable. Columns (4)–(6) depict different marginal effects for a single Poisson
count model and columns (7)–(9) represent different marginal effects for a single negative binomial count model. No point estimates are significantly different from zero at
conventional levels.
Table B6
OLS Estimates of provider density on number of office-based visits, alternative controls.
Dependent variables: columns (1)–(4) have at least one office-based visit in year; columns (5)–(8) log(number of office-based visits in year); columns (9)–(12) have usual source of care.
Within each group of four columns: no controls; individual controls, county FE; individual controls, county FE, log(MD); full controls.
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)

Panel A. NP and PA supply combined
log(NP + PA per population) 0.047*** 0.005 0.005 0.006 0.042** 0.019 0.021 0.022 0.047** −0.009 −0.007 −0.010
(0.015) (0.013) (0.012) (0.011) (0.017) (0.025) (0.026) (0.019) (0.020) (0.010) (0.010) (0.009)
log(MD per population) 0.0029 0.016 −0.0182 −0.023 0.0128 0.026
(0.022) (0.019) (0.013) (0.016) (0.022) (0.021)
N (rounded) 291,000 291,000 290,700 289,200 181,400 181,400 181,100 180,100 273,200 273,200 272,900 271,500
Adjusted R-squared 0.003 0.159 0.159 0.159 0.001 0.195 0.195 0.195 0.004 0.189 0.189 0.190
Panel B. NP and PA supply disaggregated
log(NP per population) 0.032** 0.003 0.003 −0.003 0.031 0.001 −0.001 0.001 0.041** 0.004 0.006 0.007
(0.014) (0.013) (0.012) (0.011) (0.019) (0.018) (0.019) (0.016) (0.018) (0.010) (0.011) (0.014)
log(PA per population) 0.023 0.003 0.003 0.008 0.021 0.025 0.026 0.031 0.015* 0.003 0.002 0.009
(0.015) (0.012) (0.011) (0.010) (0.020) (0.023) (0.023) (0.023) (0.009) (0.010) (0.009) (0.009)
log(MD per population) 0.0002 0.013 −0.027* −0.035** 0.019 0.032
(0.022) (0.020) (0.014) (0.017) (0.023) (0.022)
N (rounded) 282,800 282,800 282,800 281,500 176,000 176,000 176,000 175,100 265,500 265,500 265,500 264,200
Note: Robust standard errors clustered by state in parentheses. Specification (1) includes year fixed effects only. Individual controls include male, age, age squared, dummies for race/ethnicity, dummies for four income
categories, dummies for public, private, or no insurance, and dummies for three self-reported health categories. “Full controls” additionally includes state × time linear trends and time-invariant county characteristics
interacted with linear time trends. Time-invariant county characteristics that are interacted with time (linearly) include the fraction of persons in poverty (1989), infant mortality rate (1988), fraction of workforce in health
(1990), fraction of workforce in manufacturing (1990), unemployment rate (1990), fraction white (1990), fraction with high school education (1990), fraction Hispanic (1990), population density (1992), and HMO penetration
rate (1998).
* p < 0.1. ** p < 0.05. *** p < 0.01.
Table B7
Relationship between historical educational infrastructure and provider density (first stage).
Individual-level regressions Visit-level regressions
log(NP per population) log(PA per population) log(NP per population) log(PA per population)
(1) (2) (3) (1) (2) (3) (1) (2) (3) (1) (2) (3)
# BA RN schools in 1963/100,000 population 0.627*** 0.355** 0.362** 0.081 −0.135 −0.146 0.693*** 0.364** 0.377** 0.127 −0.106 −0.117
(0.211) (0.145) (0.145) (0.207) (0.198) (0.200) (0.209) (0.165) (0.163) (0.233) (0.226) (0.229)
# PA schools in 1975/100,000 population 0.384 −0.072 −0.083 1.242*** 0.976*** 0.967*** 0.266 −0.167 −0.176 1.294*** 1.037*** 1.021***
(0.342) (0.276) (0.277) (0.367) (0.322) (0.323) (0.341) (0.290) (0.288) (0.425) (0.392) (0.393)
# AA RN schools in 1963/100,000 population −0.294** 0.016 −0.287* −0.044
(0.140) (0.173) (0.147) (0.186)
Diploma RN schools in 1963/100,000 population −0.038 0.034 −0.050 0.027
(0.046) (0.050) (0.042) (0.049)
log(MD per population) 0.489*** 0.500*** 0.350*** 0.341*** 0.479*** 0.493*** 0.350*** 0.344***
(0.065) (0.065) (0.066) (0.067) (0.066) (0.066) (0.070) (0.071)
Individual controls Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
County controls Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
Predicted likelihood visit is primary care N/A N/A N/A N/A N/A N/A Yes Yes Yes Yes Yes Yes
Procedure dummies N/A N/A N/A N/A N/A N/A Yes Yes Yes Yes Yes Yes
F-test for excluded instrument 8.839 5.993 6.265 11.45 9.206 8.95 10.94 4.87 5.329 9.28 7.00 6.74
Rounded N 286,100 286,100 286,100 286,100 286,100 286,100 781,300 781,200 781,200 778,100 777,300 777,300
Note: All specifications include year and state fixed effects. Robust standard errors clustered by county in parentheses. Individual controls include male, age, age squared, dummies for four income categories, dummies for
race/ethnicity, dummies for public, private, or no insurance, and dummies for three self-reported health categories. Time-invariant county controls include the fraction of persons in poverty (1989), infant mortality rate
(1988), fraction of workforce in health (1990), fraction of workforce in manufacturing (1990), unemployment rate (1990), fraction white (1990), fraction with high school education (1990), fraction Hispanic (1990), population
density (1992), and HMO penetration rate (1998).
* p < 0.1. ** p < 0.05. *** p < 0.01.
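The "F-test for excluded instrument" row can be illustrated with a minimal sketch. It is run on simulated data and uses the classical (homoskedastic) F-statistic, whereas the table uses standard errors clustered by county, so it shows the idea of the test rather than reproducing the reported values. All names (`first_stage_f`, `Z`, `W`) and the simulated coefficients are hypothetical.

```python
import numpy as np

def ols_rss(y, X):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def first_stage_f(x_endog, Z_excl, W):
    """Classical F-statistic for H0: all coefficients on Z_excl are zero."""
    n = len(x_endog)
    full = np.column_stack([Z_excl, W])
    rss_restricted = ols_rss(x_endog, W)   # controls only
    rss_full = ols_rss(x_endog, full)      # controls + excluded instruments
    q = Z_excl.shape[1]                    # number of restrictions
    k = full.shape[1]                      # regressors in the full model
    return ((rss_restricted - rss_full) / q) / (rss_full / (n - k))

# Simulated data (assumption): two instruments, a constant, and one control.
rng = np.random.default_rng(0)
n = 5000
W = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = rng.normal(size=(n, 2))
x = 0.3 * Z[:, 0] + 0.1 * Z[:, 1] + 0.5 * W[:, 1] + rng.normal(size=n)

f_stat = first_stage_f(x, Z, W)
print("first-stage F:", round(f_stat, 1))
```

A common rule of thumb treats a first-stage F below about 10 as a sign of weak instruments, which is the benchmark against which the values in the table are usually read.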
Table B8
2SLS estimates of provider density on utilization, expenditure, and access.
Column outcomes: (1) Total visits > 0; (2) log(Total visits); (4) log(Primary care visits); (5) log(Non-primary care visits); (9) log(Total primary care amount paid); (10) Have usual source of care; (11) Flu shot; (12) Blood pressure check; (13) Pap smear; (14) Breast exam; (15) Chol. check
(1) (2) (4) (5) (9) (10) (11) (12) (13) (14) (15)
Panel A. Fixed effects estimates

log(NP per population) −0.003 0.001 −0.005 −0.017 −0.007 0.007 −0.008 −0.010 0.018 0.020 0.009
(0.011) (0.016) (0.027) (0.020) (0.009) (0.014) (0.016) (0.009) (0.012) (0.012) (0.009)

*
log(PA per population) 0.008 0.031 0.007 0.03369 0.009 0.009 0.013 0.011 0.002 0.001 0.015
(0.010) (0.023) (0.013) (0.019) (0.014) (0.009) (0.010) (0.010) (0.018) (0.019) (0.015)
Panel B. 2SLS Estimates: first and second stage do not control for log(MD per population)
log(NP per population) 0.041 0.051 0.046 0.050 0.049 −0.034 0.036 0.027 −0.011 0.004 −0.055
(0.025) (0.057) (0.048) (0.055) (0.076) (0.039) (0.034) (0.020) (0.039) (0.039) (0.045)
log(PA per population) −0.009 −0.040 −0.082 −0.026 −0.088 −0.006 −0.020 0.002 0.024 0.018 0.015
(0.030) (0.076) (0.072) (0.076) (0.085) (0.040) (0.031) (0.034) (0.049) (0.040) (0.042)
Panel C. 2SLS estimates: first and second stage control for log(MD per population)
log(NP per population) 0.066 0.089 0.040 0.096 0.012 −0.074 0.047 0.065 −0.012 0.006 −0.109
(0.053) (0.111) (0.093) (0.110) (0.139) (0.089) (0.065) (0.046) (0.077) (0.074) (0.093)
log(PA per population) 0.001 −0.022 −0.085 −0.004 −0.104 −0.023 −0.015 0.019 0.024 0.019 −0.008
(0.040) (0.096) (0.093) (0.094) (0.105) (0.056) (0.037) (0.049) (0.062) (0.048) (0.059)
Panel D. 2SLS estimates: first and second stage control for AA and diploma RN programs per population in 1963 and for log(MD per population)
log(NP per population) 0.066 0.091 0.038 0.100 0.007 −0.080 0.053 0.066 0.008 0.029 −0.103
(0.052) (0.112) (0.093) (0.111) (0.140) (0.091) (0.063) (0.046) (0.076) (0.072) (0.091)
log(PA per population) 0.003 −0.020 −0.085 −0.002 −0.111 −0.029 −0.013 0.021 0.029 0.026 −0.008
(0.041) (0.098) (0.095) (0.096) (0.109) (0.058) (0.038) (0.050) (0.066) (0.053) (0.060)
N (rounded) 281,500 175,100 160,700 114,500 157,500 264,200 172,600 171,400 90,500 90,800 166,000
Note: Excluded instruments in 2SLS estimates are the number of BA RN programs in 1963 in county per population and the number of PA programs in 1975 in county per population. Fixed effects estimates include year
and county fixed effects, log(MD per population), individual controls, county characteristics interacted with linear time trends, and state-specific linear time trends. 2SLS specifications include year and state fixed effects,
individual controls, and time-invariant county characteristics. Robust standard errors clustered by county in parentheses. Individual controls include male, age, age squared, dummies for four income categories, dummies for
race/ethnicity, dummies for public, private, or no insurance, and dummies for three self-reported health categories. Time-invariant county characteristics include the fraction of persons in poverty (1989), infant mortality rate
(1988), fraction of workforce in health (1990), fraction of workforce in manufacturing (1990), unemployment rate (1990), fraction white (1990), fraction with high school education (1990), fraction Hispanic (1990), population
density (1992), and HMO penetration rate (1998).
* p < 0.1. ** p < 0.05. *** p < 0.01.
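The contrast between the fixed effects and 2SLS panels can be illustrated with a minimal two-stage least squares sketch on simulated data. This is not the estimation code behind the table: it omits fixed effects, the listed controls, and clustered standard errors, and the function name, variable names, and true effect of 0.5 are all assumptions made for illustration.

```python
import numpy as np

def two_stage_least_squares(y, X_endog, Z_excl, W):
    """2SLS coefficients for [X_endog, W], instrumenting X_endog with Z_excl."""
    X = np.column_stack([X_endog, W])   # second-stage regressors
    Z = np.column_stack([Z_excl, W])    # excluded + included exogenous variables
    # First stage: project every column of X onto the instrument set.
    first_stage_coefs, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ first_stage_coefs
    # Second stage: regress y on the first-stage fitted values.
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta

# Simulated data (assumption): x is endogenous because it depends on u.
rng = np.random.default_rng(0)
n = 20_000
const = np.ones(n)
z = rng.normal(size=n)                      # excluded instrument
u = rng.normal(size=n)                      # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)
y = 0.5 * x + 2.0 * u + rng.normal(size=n)  # true effect of x is 0.5

beta_ols, *_ = np.linalg.lstsq(np.column_stack([x, const]), y, rcond=None)
beta_iv = two_stage_least_squares(y, x, z, const)
print("OLS slope:", round(beta_ols[0], 3), "2SLS slope:", round(beta_iv[0], 3))
```

In this simulation the OLS slope is pulled away from 0.5 by the confounder, while the 2SLS slope stays close to it. Standard errors taken directly from the second-stage regression would be incorrect, so the sketch reports point estimates only.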
Table B9
OLS estimates of provider density and interaction with regulatory environment index on utilization, by insurance type.
Have at least one office-based visit log(total office-based visits in year) log(primary care office-based visits in year)
Medicare Medicaid Private Uninsured Medicare Medicaid Private Uninsured Medicare Medicaid Private Uninsured
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
Panel A. No interactions with regulatory environment

log(NP per population) −0.015 −0.002 −0.007 −0.008 0.000 0.020 0.001 −0.028 0.022 0.001 −0.003 −0.077
(0.017) (0.031) (0.008) (0.015) (0.034) (0.073) (0.020) (0.070) (0.039) (0.068) (0.033) (0.054)

log(PA per population) 0.024* 0.018 0.009 −0.006 0.023 −0.007 0.035 0.090 0.005 0.009 0.000 0.026
(0.014) (0.019) (0.016) (0.027) (0.043) (0.037) (0.024) (0.072) (0.028) (0.029) (0.019) (0.040)
Panel B. Interactions with regulatory environment
log(NP per population) 0.044 −0.302 −0.091 0.067 −0.248 −0.456 −0.023 −0.225 −0.103 −0.476 −0.179 −0.237
(0.085) (0.189) (0.066) (0.075) (0.218) (0.314) (0.104) (0.293) (0.189) (0.288) (0.130) (0.262)
log(PA per population) −0.059 −0.032 −0.091 0.007 −0.135 −0.362** −0.037 0.433 0.089 −0.206 −0.017 0.121
(0.042) (0.075) (0.064) (0.121) (0.188) (0.163) (0.157) (0.340) (0.112) (0.163) (0.101) (0.190)
log(NP per population) × NP index −0.081 0.409 0.112 −0.104 0.336 0.649 0.032 0.267 0.170 0.647 0.239 0.223
(0.110) (0.263) (0.088) (0.100) (0.286) (0.395) (0.126) (0.458) (0.282) (0.387) (0.196) (0.341)
log(PA per population) × PA index 0.112 0.070 0.137 −0.019 0.214 0.488* 0.099 −0.476 −0.113 0.299 0.023 −0.131
(0.067) (0.095) (0.085) (0.156) (0.249) (0.243) (0.205) (0.494) (0.144) (0.210) (0.129) (0.260)
F-test for provider supply coefficient = 0 when practice index = 1 (100%)
Nurse practitioners (p-value) 0.226 0.199 0.415 0.203 0.266 0.070 0.775 0.820 0.499 0.146 0.406 0.886
Physician assistants (p-value) 0.064 0.184 0.102 0.801 0.310 0.141 0.262 0.801 0.596 0.096 0.864 0.905
N (rounded) 34,800 64,100 162,000 47,100 30,400 42,200 110,000 15,400 26,100 18,400 67,400 6,600
Note: All specifications include year fixed effects, county fixed effects, log(MD per population), individual controls, state × time linear trends, and time-invariant county characteristics interacted with linear time trends.
Primary care specifications also include log(number of non-primary care visits). Robust standard errors clustered by state in parentheses. Individual controls include male, age, age squared, dummies for four income categories,
dummies for race/ethnicity, dummies for public, private, or no insurance (when not collinear), and dummies for three self-reported health categories. Time-invariant county characteristics that are interacted with time
(linearly) include the fraction of persons in poverty (1989), infant mortality rate (1988), fraction of workforce in health (1990), fraction of workforce in manufacturing (1990), unemployment rate (1990), fraction white (1990),
fraction with high school education (1990), fraction Hispanic (1990), population density (1992), and HMO penetration rate (1998). Number of primary and non-primary care visits was estimated by predicting whether each
individual office visit was to a primary care provider based on broad visit category, the individual characteristics listed above, and the medical condition (if any) associated with the visit. See text for further explanation. No
point estimates are significantly different from zero at conventional levels.
* p < 0.1. ** p < 0.05. *** p < 0.01.
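The F-tests reported in Panel B can be read as tests on a linear combination of two coefficients. As a sketch, suppose the nurse practitioner terms enter as \(\beta_1 \log(\text{NP per population}) + \beta_3 \log(\text{NP per population}) \times \text{NP index}\); these symbols are introduced here for illustration and are not the paper's notation. The effect of provider supply when the index equals 1 is then \(\beta_1 + \beta_3\), and the test is:

```latex
\[
H_0:\ \beta_1 + \beta_3 = 0,
\qquad
F = \frac{(\hat\beta_1 + \hat\beta_3)^2}
         {\widehat{\operatorname{Var}}(\hat\beta_1)
          + \widehat{\operatorname{Var}}(\hat\beta_3)
          + 2\,\widehat{\operatorname{Cov}}(\hat\beta_1,\hat\beta_3)} .
\]
```

For example, in column (6) the implied nurse practitioner effect at index = 1 is −0.456 + 0.649 = 0.193. The covariance term is not reported in the table, so the p-values cannot be reproduced from the printed standard errors alone.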
Table B10
2SLS estimates of provider density on visit prices.
Log(total charges) Log(amount paid)
Quintile of predicted likelihood that visit is primary care Quintile of predicted likelihood that visit is primary care
All visits Lowest 2nd 3rd 4th Highest All visits Lowest 2nd 3rd 4th Highest
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
Panel A. Fixed effects estimates

log(NP per population) 0.036 −0.012

(0.031) (0.025)
log(PA per population) 0.004 0.010
(0.023) (0.017)
Panel B. 2SLS estimates: first and second stage do not control for log(MD per population)
log(NP per population) −0.011 0.037 −0.052 −0.021 −0.052 0.062 0.019 0.010 −0.012 0.033 −0.026 0.089
(0.063) (0.110) (0.143) (0.075) (0.071) (0.081) (0.051) (0.107) (0.103) (0.066) (0.066) (0.087)
log(PA per population) 0.055 0.011 0.211 0.064 0.061 −0.043 0.003 0.035 0.085 −0.027 0.018 −0.040
(0.064) (0.116) (0.177) (0.094) (0.072) (0.087) (0.055) (0.126) (0.112) (0.106) (0.063) (0.088)
Panel C. 2SLS estimates: first and second stage control for log(MD per population)
log(NP per population) 0.025 0.054 0.023 0.106 −0.043 0.086 0.048 0.048 0.009 0.172 −0.026 0.108
(0.127) (0.202) (0.269) (0.200) (0.129) (0.170) (0.102) (0.190) (0.180) (0.194) (0.116) (0.181)
log(PA per population) 0.071 0.020 0.250 0.127 0.064 −0.033 0.017 0.055 0.096 0.044 0.017 −0.033
(0.084) (0.153) (0.228) (0.162) (0.085) (0.114) (0.073) (0.166) (0.145) (0.182) (0.072) (0.113)
Panel D. 2SLS estimates: first and second stage control for AA and diploma RN programs per population in 1963 and for log(MD per population)
log(NP per population) 0.034 0.029 0.034 0.148 −0.027 0.096 0.057 0.059 0.030 0.197 −0.024 0.109
(0.127) (0.197) (0.260) (0.225) (0.132) (0.171) (0.103) (0.193) (0.171) (0.209) (0.118) (0.180)
log(PA per population) 0.078 −0.004 0.258 0.170 0.065 −0.034 0.020 0.051 0.108 0.068 0.012 −0.036
(0.087) (0.159) (0.232) (0.183) (0.084) (0.115) (0.075) (0.171) (0.148) (0.193) (0.072) (0.114)
N (rounded) 756,900 150,000 151,000 151,800 151,400 152,700 734,600 147,000 147,200 147,500 146,000 147,100
Note: Excluded instruments in 2SLS estimates are the number of BA RN programs in 1963 in county per population and the number of PA programs in 1975 in county per population. Fixed effects estimates include year and
county fixed effects, log(MD per population), individual controls, county characteristics interacted with linear time trends, and state-specific linear time trends. 2SLS specifications include year and state fixed effects, individual
controls, and time-invariant county characteristics. All specifications include the predicted likelihood that a visit is primary care and procedure dummies. Robust standard errors clustered by state in parentheses. Individual
controls include male, age, age squared, dummies for four income categories, dummies for race/ethnicity, dummies for public, private, or no insurance, and dummies for three self-reported health categories. Time-invariant
county characteristics include the fraction of persons in poverty (1989), infant mortality rate (1988), fraction of workforce in health (1990), fraction of workforce in manufacturing (1990), unemployment rate (1990), fraction
white (1990), fraction with high school education (1990), fraction Hispanic (1990), population density (1992), and HMO penetration rate (1998). Predicted likelihood of being a primary care visit was estimated by predicting
whether each individual office visit was to a primary care provider based on broad visit category, the individual characteristics listed above, and the medical condition (if any) associated with the visit. See text for further
explanation.
References Institute of Medicine, 2011. The Future of Nursing: Leading Change, Advancing
Health. The National Academies Press, Washington, DC.
Adams, A.F., Ekelund Jr., R.B., Jackson, J.D., 2003. Occupational licensing of a credence Jones, A.M., 2000. Health econometrics. In: Culyer, A., Newhouse, J. (Eds.), Handbook
good: the regulation of midwifery. Southern Economic Journal 69 (3), 659–675. of Health Economics. Elsevier, Amsterdam.
American Association of Medical Colleges, 2010. Health Care Reform and Kaiser Commission on Medicaid and the Uninsured, 2011. Improving Access to Adult
the Health Workforce: Workforce Provisions Included in the Patient Primary Care in Medicaid: Exploring the Potential Role of Nurse Practitioners
Protection and Affordable Care Act. http://www.aamc.org/download/ and Physician Assistants, Issue Paper. March 2011.
142078/data/health workforce provisions and health care reform.pdf Kalist, D., Spurr, S., 2004. The effect of state laws on the supply of advanced practice
(accessed 26.05.10). nurses. International Journal of Health Care Finance and Economics 4, 271–281.
Auerbach, D., 2012. Will the NP workforce grow in the future? New forecasts and Kleiner, M.M., 2000. Occupational licensing. Journal of Economic Perspectives 14
implications for healthcare delivery. Medical Care 50 (7), 606–610. (4), 189–202.
Baicker, K., Chandra, A., 2004a. The productivity of physician specialization: evi- Kleiner, M.M., 2006. Licensing Occupations: Enhancing Quality or Restricting Com-
dence from the Medicare program. American Economic Review. Papers and petition? W.E. Upjohn Institute for Employment Research, Kalamazoo, MI.
Proceedings 93 (2), 357–361. Kleiner, M., Kudrle, R., 2000. Does regulation affect economic outcomes? The case
Baicker, K., Chandra, A., 2004, April. Medicare spending, the physician workforce, of dentistry. Journal of Law and Economics 63 (2), 547–582.
and beneficiaries quality of care. Health Affairs, 184–197. Kleiner, M.M., Marier, A., Park, K.W., Wing, C., 2011. Relaxing Occupational Licensing
Bailey, M., Goodman-Bacon, A., 2012. The War on Poverty’s Experiment in Public Requirements: Analyzing Wages and Prices for a Medical Service (Unpublished
Medicine: Community Health Centers and the Mortality of Older Americans. working paper).
University of Michigan (Unpublished manuscript). Kleiner, M.M., Park, K.W., 2010. Battles Among Licensed Occupations: Analyzing
Becker, G.S., Murphy, K.M., 1992. The division of labor, coordination costs, and Government Regulations on Labor Market Outcomes for Dentists and Hygien-
knowledge. Quarterly Journal of Economics 107 (4), 1137–1160. ists. NBER Working Paper No. 16560.
Blumenthal, D., 2004. New steam from an old cauldron – the physician supply Laditka, J., Laditka, S., Probst, J., 2005. More may be better: evidence of a negative
debate. New England Journal of Medicine 350 (17), 1780–1787. relationship between physician supply and hospitalization for ambulatory care
Chapman, S., Wides, C., Spetz, J., 2010. Payment regulations for advanced practice sensitive conditions. Health Services Research 40 (4), 1148–1166.
nurses: implications for primary care. Policy, Politics & Nursing Practice. 11 (2), Larson, E.H., Palazzo, L., Berkowitz, B., Pirani, M.J., Hart, L.G., 2003. The contribution
89–98. of nurse practitioners and physician assistants to generalist care in Washington
Chang, C.-H., Stukel, T.A., Flood, A.B., Goodman, D.C., 2011. Primary care physician State. Health Service Research 38 (4), 1033–1050.
workforce and Medicare beneficiaries’ health outcomes. Journal of the American Laurant, M., Reeves, D., Hermens, R., Braspenning, J., Grol, R., Sibbald, B., 2004. Sub-
Medical Association 305 (20), 2096–2105. stitution of doctors by nurses in primary care. Cochrane Database of Systemic
Chernew, M.E., Sabik, L., Chandra, A., Newhouse, J.P., 2009. Would having more Reviews (4), Art. No.: CD001271.
primary care doctors cut health spending growth? Health Affairs 28 (5), Leland, H.E., 1979. Quacks, lemons, and licensing: a theory of minimum quality
1327–1335. standards. Journal of Political Economy 87 (6), 1328–1346.
Continelli, T., McGinnis, S., Holmes, T., 2010. The effect of local primary care physi- Lenz, et al., 2004. Primary care outcomes in patients treated by nurse practitioners
cian supply on the utilization of preventive health services in the United States. or physicians: two-year follow-up. Medical Care Research and Review 61 (3).
Health and Place 16 (2010), 942–951. Morgan, P.A., Strand, J., Ostbye, T., Albenese, M.A., 2007. Missing in action: care by
Cooper, R., Henderson, T., Dietrich, C., 1998a. Roles of nonphysician clinicians as physician assistants and nurse practitioners in national health surveys. Health
autonomous providers of care. Journal of the American Medical Association 280 Services Research 42 (5), 2022–2037.
(9), 795–802. Mundinger, et al., 2000. Primary care outcomes in patients treated by nurse
Cooper, R., Laud, P., Dietrich, C., 1998b. Current and projected workforce of non- practitioners or physicians: a randomized trial. Journal of American Medical
physician clinicians. Journal of the American Medical Association 280 (9), Association 283 (1), 2000.
788–794. Pauly, M., Satterthwaite, M., 1981. The pricing of primary care physicians services:
Donelan, K., Des Roches, C., Dittus, R., Buerhaus, P., 2013. Perspectives of physi- a test of the role of consumer information. Bell Journal of Economics 12 (2),
cians and nurse practitioners on primary care practice. New England Journal of 488–506.
Medicine 368 (20), 1898–1906. Peri, G., Sparber, C., 2009. Task specialization, immigration, and wages. American
Druss, B., Marcus, S., Olfson, M., Tanielian, T., Alan Pincus, H., 2003. Trends in care by Economic Journal: Applied Economics 1 (3), 135–169.
nonphysician clinicians in the United States. New England Journal of Medicine Rosenthal, M., Zaslavsky, A., Newhouse, J., 2005. The geographic distribution of
348 (2), 130–137. physicians revisited. Health Services Research 40 (6, Part I), 1931–1952.
Dueker, M., Jacox, A., Kalist, D., Spurr, S., 2005. The practice boundaries of advanced Sass, T.R., Nichols, M.W., 1996. Scope-of-practice regulation: physician control and
practice nurses: an economic and legal analysis. Journal of Regulatory Economics the wages of non-physician health care professionals. Journal of Regulatory
27 (3), 309–329. Economics 9, 61–81.
Everett, C., Schumacher, J., Wright, A., Smith, M., 2009. Physician assistants and nurse Schaumans, C., Verboven, F., 2008. Entry and regulation: evidence from health care
practitioners as usual source of care. Journal of Rural Health 25 (4), 407–414. professions. RAND Journal of Economics 39, 949–972.
Fairman, J., 2008. Making Room in the Clinic; Nurse Practitioners and the Evolution Scheffler, R.M., 2008. Is There a Doctor in the House? Market Signals and the Physi-
of Modern Health Care. Rutgers University Press, New Brunswick, NJ. cian Supply Cycle. Stanford University Press, Palo Alto.
Flexner, A., 1910. Medical Education in the United States and Canada: A Report to Scheffler, R.M., Waitzman, N.J., Hillman, J.M., 1996. The productivity of
the Carnegie Foundation for the Advancement of Teaching, Bulletin No. 4. The physician assistants and nurse practitioners and health workforce pol-
Carnegie Foundation for the Advancement of Teaching, New York City, pp. 346 icy in the era of managed health care. Journal of Allied Health 25 (3),
(Retrieved 27.06.11). 207–217.
Freed, G.L., Nahra, T.A., Wheeler, J.R., 2006. Counting physicians: inconsistencies Sekscenski, E.S., Sansom, S., Bazell, C., Salmon, M.E., Mullan, F., 1994. State practice
in a commonly used source for workforce analysis. Academic Medicine 81 (9), environments and the supply of physician assistants, nurse practitioners,
847–852. and certified nurse-midwives. New England Journal of Medicine 331 (19),
Garthwaite, Craig, L., 2012. The doctor might see you now: the supply side effects 1266–1271.
of public health insurance expansions. American Economic Journal: Economic Shaked, A., Sutton, J., 1981. The self-regulating profession. Review of Economic
Policy 4 (3), 190–215. Studies 48, 217–234.
Grumbach, K.L., Hart, G., Mertz, E., Coffman, J., Palazzo, L., 2003. Who is caring Shapiro, C., 1986. Investment, moral hazard and occupational licensing. Review of
for the underserved? A comparison of primary care physicians and nonphysi- Economic Studies 53, 843–862.
cian clinicians in California and Washington. Annals of Family Medicine 1 (2), Skillman, S., Kaplan, L., Fordyce, M., McMenamin, P., Doescher, M., 2012, February.
97–104. Understanding Advance Practice Registered Nurse Distribution in Urban and
Grumbach, K., Vranizan, K., Bindman, A., 1997. Physician supply and access to care Rural Areas of the United States Using National Provider Identifier Data. WWAMI
in urban communities. Health Affairs 16 (1), 71–86. Rural Health Research Center Final Report #137.
Guttman, A., Shipman, S., Lam, K., Goodman, D., Stukel, T., 2010. Primary care physi- Staiger, D., Auerbach, D., Buerhaus, P., 2009. Comparison of physician workforce
cian supply and children’s health care use, access, and outcomes: findings from estimates and supply projections. Journal of the American Medical Association
Canada. Pediatrics 125 (6), 1119–1126. 302 (15), 1674–1680.
Hooker, R., Berlin, L., 2002. Trends in the supply of physician assistants and nurse US Department of Health and Human Services, Health Resources and Service Admin-
practitioners in the United States. Health Affairs 21 (5), 174–181. istration, 2004, February. A Comparison of Changes in the Professional Practice
Hooker, R., McCaig, L., 2001. Use of Physician assistants and nurse practitioners in of Nurse Practitioners, Physician Assistants, and Certified Nurse Midwives: 1992
primary care, 1995–1999. Health Affairs 20 (4), 231–238. and 2000.
Horrocks, S., Anderson, E., Salisbury, C., 2002. Systematic review of whether nurse US Department of Health and Human Services, Medicare Learning Network, 2011,
practitioners working in primary care can provide equivalent care to doctors. September. Medicare Information for Advanced Practice Registered Nurses,
British Medical Journal 324 (6), 819–823. Anesthesiologist Assistants, and Physician Assistants.
Hotz, V.J., Xiao, M., 2011. The impact of regulations on the supply and quality of care US Government Accountability Office, 2008. Primary Care Professionals: Recent
in child care markets. American Economic Review 101 (5), 1775–1805. Supply Trends, Projections, and Valuation of Services.
Iglehart, J., 2013. Expanding the role of advanced nurse practitioners – risks and White, W., 1978. The impact of occupational licensure of clinical laboratory person-
rewards. New England Journal of Medicine 368 (20), 1935–1941. nel. Journal of Human Resources 13 (1), 91–102.

Health information exchange, system size and information silos

Amalia R. Miller a,∗, Catherine Tucker b,c,∗,1
a Economics Department, University of Virginia, Charlottesville, VA, United States
b MIT Sloan School of Management, MIT, Cambridge, MA, United States
c NBER, United States

Article history:
Received 30 March 2012
Received in revised form 9 September 2013
Available online 30 October 2013

JEL classification: I1, K2, L5, O3

Keywords: Healthcare IT; Technology policy; Network externalities

Abstract
There are many technology platforms that bring benefits only when users share data. In healthcare, this is a key policy issue, because of the potential cost savings and quality improvements from ‘big data’ in the form of sharing electronic patient data across medical providers. Indeed, one criterion used for federal subsidies for healthcare information technology is whether the software has the capability to share data. We find empirically that larger hospital systems are more likely to exchange electronic patient information internally, but are less likely to exchange patient information externally with other hospitals. This pattern is driven by instances where there may be a commercial cost to sharing data with other hospitals. Our results suggest that the common strategy of using ‘marquee’ large users to kick-start a platform technology has an important drawback of potentially creating information silos. This suggests that federal subsidies for health data technologies based on ‘meaningful use’ criteria, that are based simply on the capability to share data rather than actual sharing of data, may be misplaced.
© 2013 Elsevier B.V. All rights reserved.
∗ Corresponding authors. Tel.: +1 434 924 6750.
E-mail addresses: armiller@virginia.edu (A.R. Miller), cetucker@mit.edu (C. Tucker).
1 This research was supported by a grant from the NET Institute (www.NETinst.org).

1. Introduction
The need for information exchange in healthcare is pressing, due to growing evidence that exchanging and sharing patient data can potentially reduce mortality and even reduce costs (Bower, 2005; Walker et al., 2005; Miller and Tucker, 2011a; McCullough et al., 2011). The success of efforts to leverage ‘big data’ in healthcare, such as the ‘learning health’ system (Smith et al., 2012), will depend crucially on the willingness of providers to share their data (Goodby et al., 2010). However, it is unclear what the best steps are for policymakers to take to ensure that information exchange happens.
One commonly advocated strategy for kick-starting a platform for data exchange is to secure a large ‘marquee’ user to help attract other users to the platform. As described by Eisenmann et al. (2006), “the participation of ‘marquee users’ can be especially important for attracting participants.” Gowrisankaran and Stavins (2004) set out a foundational economic framework for understanding this. Due to marquee users’ scale, they can internalize some of the network effects inherent in the platform and in turn then attract more users to the platform. To see this, consider a network technology that connects multiple separate firms. Each firm will adopt a network technology based on whether it receives net benefits from being part of the network, but it will not internalize the positive effect that its adoption has for other firms in the network. If a subset of these firms merge, then adoption increases, because the newly merged firm is able to internalize the network benefits from adoption at different locations.
This paper asks how the size of a user that adopts an information exchange technology affects subsequent usage. We use data on the exchange of electronic health data within a local health area and investigate how the number of hospitals within a hospital’s system influences its likelihood of sharing data.
In this setting, larger hospital systems may be better able to internalize the high costs of ensuring compatibility with complex information exchange standards, making it cheaper for them to exchange data both internally and externally. Correspondingly, we find that hospitals with more hospitals in their system are indeed more likely to exchange electronic information internally. However, they are less likely to exchange electronic information externally with other nearby hospitals. This decision to exchange information externally does not seem to be driven by the systems’ age or manufacturer, nor by the number of other hospitals
A.R. Miller, C. Tucker / Journal of Health Economics 33 (2014) 28–42 29
they could potentially interact with. We argue that this contrast capacity to electronically exchange key clinical information.’4 This
between a willingness to share data internally and a lack of will- test would qualify even if it used fictional patient data.
ingness to share data externally reflects a tendency for larger However, our results suggest that compatibility or capability
hospital systems to create ‘information silos.’ An information silo alone will not be enough to ensure that electronic information is
is a data system that does not exchange data with other similar actually shared. To succeed in ensuring comprehensive meaningful
systems. use, the federal government will have to address the fact that larger
A potential explanation for larger hospital systems’ propensity hospital systems that may be producing the best health outputs
to create information silos is that they fear that by facilitating data may also be less willing to exchange information. This reluctance
outflow, they may lose patients. If the hospital allows data out- to share information may stem from the notion that records are the
flow, patients may seek more follow-up care in stand-alone or property of the hospital. As quoted in Knox (2009), Dr. Delbanco,
community hospitals, which may offer more convenience or lower a primary care specialist at Beth Israel Deaconess Medical Center
costs to patients whose insurance imposes substantial cost-sharing in Boston, states, “You can get it [the patient record] [...] But we do
(Melnick and Keeler, 2007). We offer three pieces of evidence, everything in the world to make sure you don’t get it.” The findings
based on estimating heterogeneous effects of system size on data of this paper suggest that this ethos may be echoed in the switch
exchange, that suggest that strategic motivations like these at least from paper to digital records. This means the digitization of health
partially drive our results. records may not make patient healthcare provider transitions as
First, we find a stronger negative relationship between hospital seamless as hoped for by policymakers. This is important as poli-
system size and external information exchange among hospitals cymakers set policy priorities for ‘stage 3 of meaningful use, the
that have insurance arrangements that make it easier for patients target date for which is currently 2016.
to leave their hospital system. Second, hospitals that pay their staff This adds to a broader literature which has questioned the wis-
more are less likely to share their data with hospitals outside their dom and likelihood of achieving a quick transition to digital health
system if they are part of a larger system. Third, specialty hospi- given larger general equilibrium issues (Christensen and Remler,
tals are less likely to share data outside their system if they are 2009; Murray et al., 2011). In particular, they highlight a potential
part of a larger system. The first result suggests that if patients are cost of speed, which is that in their haste to give incentives to adopt,
likely to seek treatment elsewhere, hospitals are less likely to share policymakers may inadvertently also be giving hospitals incentives
data. The latter two results suggest that if hospitals invest valuable to adopt systems that are incompatible with the ultimate aim of
resources in patient care, they may also be less likely to be willing widespread sharing of health information.
to share data. While not conclusive, these findings provide some
evidence that the creation of information silos that we observe is
linked to strategic concerns. 2. Conceptual framework
Policymakers and researchers have focused on questions of
encouraging compatibility and inter-operability at the IT vendor We study the decisions of hospitals to exchange patient infor-
level, but we show that users who have already adopted may also mation with other hospitals, inside and outside of their systems.
choose not to exchange information with others. This is important This section presents a conceptual framework for modeling these
because of recent policy emphasis on the diffusion of Electronic decisions and then illustrates the various ways in which they can
Medical Records (EMRs). The United States federal government has be affected by the hospital’s system size. This framework is used
provided $19 billion in financial incentives to healthcare providers to motivate our main empirical analysis and choices of control
under the 2009 HITECH Act to encourage them to adopt EMRs. variables.
Part of the motivation for government coordination is the belief Because data exchange is a classic network externality set-
that to reduce healthcare spending, it is not enough for health- ting, our framework allows for data exchange to generate positive
care providers to simply adopt the technologies. Providers need to externalities to other local hospitals. This is similar to models of
be able to share electronic patient data as well.2 To help coordi- information exchange technology adoption in other settings, such
nate this sharing of data, EMRs only qualify for aid if they fulfill as Gowrisankaran and Stavins (2004), who study the decision of
government criteria for ‘meaningful use.’3 banks to adopt electronic exchange capabilities. We differ from
Much of the policy literature has criticized the ‘meaningful use’ this prior literature along two key dimensions. First, we consider
criteria as setting too low a standard in terms of adoption of the decision to participate in data exchange separately from the
technology (Wolf et al., 2012). This reflects that the focus so far technology adoption decision that enables exchange. This allows
of the ‘meaningful use’ criteria has been on achieving technical us to account for the facts that hospitals can selectively choose to
inter-operability rather than actual sharing of data. For example, exchange data (only within their system, for example) and that
the pivotal ‘Core Measure 13’ states that to qualify, a hospital has hospitals with IT systems capable of exchange may not partici-
to have ‘Performed at least one test of certified EHR technology’s pate at all. Second, unlike Gowrisankaran and Stavins (2004) and
other papers that assume that consumers are permanently tied to
their providers (banks, in their case), our model explicitly considers
the potential competitive effects of data exchange that arise when
consumers can switch providers. In our setting, this means that we model hospitals as thinking about the impact of data exchange on their current customer base as well as how participating in data exchange may change their future customer base. These considerations are a crucial motivation for why we might expect system size to have differential effects on the decisions to exchange

2 The emphasis on data sharing is shared by industry leaders and consumer advocates (Clark, 2009). Jim Lott, Executive Vice President, Hospital Council of Southern California: “Looking for savings in hospitals that use EMRs is short-sighted. The real payday for use of EMRs will come with interoperability. Measurable savings will be realized as middleware is installed that will allow for the electronic transmission and translation of patient records across different proprietary systems between delivery networks.” Johnny Walker, Founder and past CEO of Patient Safety Institute: “EMRs don’t save money in standalone situations. However, EMRs will absolutely save significant money (and improve care and safety) when connected and sharing clinical information.”
3 For more historical and policy background on the ‘meaningful use’ criteria, see Blumenthal and Tavenner (2010), Jha (2010), Buntin et al. (2010) and Adler-Milstein and Jha (2012).
4 http://www.cms.gov/EHRIncentivePrograms/Downloads/13 Electronic Exchange of Clinical Information.pdf.
30 A.R. Miller, C. Tucker / Journal of Health Economics 33 (2014) 28–42
data internally within the hospital system versus externally outside private value. Data sharing can also have positive external effects
the system. by reducing operating costs for other local hospitals.
Last, we consider the effect of the adoption of external data
exchange on revenues. Quality improvements from data exchange
2.1. Hospital decisions to exchange data
that increase demand for hospital care, overall or at particular hos-
pitals engaged in exchange, will imply a positive effect of data
We formalize the key considerations of hospitals deciding
exchange on revenues. However, the overall effect of data exchange
whether or not to exchange patient information with other hos-
on revenues is complicated by the possibility that better informa-
pitals in a simple model. We assume that hospitals maximize an
tion flow itself can cause patients to choose different hospitals
objective function that includes net revenues,5 and patient care
for their care. If patient records are seen by hospitals as part of
quality (converted to dollar terms through the function $\alpha$, as a proxy
their property and containing proprietary business information,
for other non-financial missions of the hospital or its leaders, which
then there may be an additional cost from external data exchange
can include charity care or prestige). Each hospital faces a binary
that relates to sharing client records with competitors. Detailed
choice and will decide to exchange data only if its utility function
patient records that show interventions and health outcomes can
increases from doing so, i.e., if:
also reveal sensitive information about hospital practices and qual-
ity that hospitals prefer to keep internal. Without large financial
\[ \mathrm{Utility}_i = \alpha_i(\mathrm{Quality}_i) - \mathrm{Costs}_i + \mathrm{Revenue}_i \geq 0 \tag{1} \]
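The binary decision rule in Eq. (1) can be sketched in a few lines. This is only an illustration under stated assumptions: each term is read as the change caused by joining the exchange, the linear form of the function alpha is hypothetical, and all dollar figures and function names are invented for the example rather than taken from the paper.

```python
from typing import Callable


def net_utility(alpha: Callable[[float], float], quality_change: float,
                cost_change: float, revenue_change: float) -> float:
    """Left-hand side of Eq. (1): alpha(Quality) - Costs + Revenue."""
    return alpha(quality_change) - cost_change + revenue_change


def chooses_exchange(alpha: Callable[[float], float], quality_change: float,
                     cost_change: float, revenue_change: float) -> bool:
    """A hospital exchanges data only if its net utility is non-negative."""
    return net_utility(alpha, quality_change, cost_change, revenue_change) >= 0


def linear_alpha(quality: float) -> float:
    """Hypothetical conversion of quality units into dollars (an assumption)."""
    return 50_000.0 * quality


# Hypothetical hospital: the same quality gain and setup cost, two revenue scenarios.
print(chooses_exchange(linear_alpha, 2.0, 150_000.0, 20_000.0))  # False: 100000 - 150000 + 20000 < 0
print(chooses_exchange(linear_alpha, 2.0, 150_000.0, 60_000.0))  # True: 100000 - 150000 + 60000 >= 0
```

Holding quality and cost fixed, the two calls show how the revenue term alone can flip the decision, which is the channel the strategic argument in this section relies on.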
motivations for data sharing, these concerns may limit the willing-
In theory, information exchange can affect each of the three ness of hospitals to freely share their patient data and can lead to
terms in the utility function. The direction and magnitude of each data silos in healthcare.
effect can in turn vary with hospital characteristics. We discuss the Prestigiacomo (2012) illustrates these concerns when mak-
potential effects of data exchange and the key mediating factors in ing the case for the Carolina eHealth Alliance (CeHA), a health
more detail in this section. information exchange established in Charleston, S.C. in 2011. The
First, sharing information can improve the quality of hospital ‘competitive nature’ of the area’s hospital systems is reflected in
care, implying a positive effect on this component. This is particularly true for patients with chronic conditions who are seeing a
the fact that CeHA only covers emergency departments. It is also reflected “in the design of the CeHA interface itself. No data is
new specialist or in emergency situations with patients who are permanently stored in CeHA, or is able to be saved into an orga-
unable to communicate their medical history or allergies (Brailer, nization’s electronic health record (EHR).. . . CeHA auto-populates
2005). Lowering emergency room admissions due to better access data from patient registration, rather than operating via physi-
to patient histories can itself improve quality by reducing wait cian query. When a patient registers and chooses not to opt-out,
times for other patients when capacity is limited (Delia and Cantor, CeHA queries the edge servers of the participating organizations to
2009). Hospitals with objective functions that include quality of aggregate and consolidate key electronic portions of their medical
care ($\alpha_i > 0$) may invest in data exchange to achieve these quality records. A green checkmark in the interface indicates that CeHA has
improvements even if they do not improve net revenues. Quality clinical information on the patient. . . from the past 180 days. This
improvements will produce positive externalities to patients if hos- information appears in a temporary virtual record that the physi-
pitals are not able to capture (or directly value as much as patients cian has four hours to view, and afterwards is cleared upon patient
do, possibly because of asymmetric information about quality) the discharge.” CeHA founder Frank Clark explains the complex struc-
increase in patient welfare. There can also be spillover benefits to ture as accommodating opposition from participating hospitals to
other local hospitals if data sharing improves their quality as well. any permanent data storage from the exchange, saying “Given their
Second, we consider the effects of data exchange on operating competitive nature they didn’t want someone to be mining the
costs. Hospitals that participate in data exchange will incur initial data, or trying to lure the patient to another facility.”
setup and continued support costs associated with health IT sys-
tems and network use that enable exchange. Nevertheless, they
may still experience a net cost reduction if exchanging patient 2.2. System size and data exchange
information with other hospitals in the area allows hospitals to
avoid duplicative medical testing. This cost reduction will increase In this section we consider how alliances between hospitals
net revenues if the costs of testing are not fully reimbursed by insur- affect their willingness to exchange patient data and motivate the
ance, as is the case for patients whose insurance contract uses a central question of the paper: how system size affects the sharing
prospective payment scheme that pays a flat amount per diagnosis of such patient information.7 Once we expand our framework to
group, instead of fee-for-service. Hospitals may also have a finan- consider hospitals being part of systems, it becomes important to
cial incentive to reduce the frequency and treatment intensity of distinguish between two types of information exchange, namely,
emergency room visits, which often involve uncompensated care, internal exchange, with other hospitals in the same system, and
and where, in addition to Medicare and Medicaid, many private external exchange, outside the boundaries of the system.
insurers pay a fixed fee.6 In a case study of Memphis emergency The question of how system size affects network technology
departments, health information exchange was associated with use is related to more general questions of how user size affects
significant decreases in utilization and costs (Frisse et al., 2012). participation in information networks. In traditional theoretical
To the extent that some of the costs of duplicative care are borne models of network externalities (Katz and Shapiro, 1985; Farrell
by insurers, the social value of data sharing will be larger than the and Saloner, 1985; Economides, 1996), network participants are
assumed to be symmetrical in size and consequently the issue of
5 Maintaining non-negative flows of net revenues is still a concern for non-profit hospitals.
6 Doctors have suggested that situations such as one where a patient had seven computed tomography (CT) scans and five ultrasounds in 2007 in various hospital emergency rooms, could have been avoided with electronic health data exchange (Calcanis, 2005).
7 We follow Ho (2009), who studies networks in healthcare, and focus on hospital systems rather than hospital networks, because a hospital system is the closest analog to a profit-maximizing unit. As pointed out by Burgess et al. (2005), hospital networks tend to be driven by the behavior of hospital systems in any case.
network user size is not discussed.8 Later empirical papers, such standards, has documented that ISPs are less likely to choose com-
as Gowrisankaran and Stavins (2004), argue that user size itself patible systems in a symmetric firm setting. Chen et al. (2009) built
can be used to detect the presence of network externalities. Their a dynamic model that can explain why in the long run some firms
argument is that because larger customers with more internal make their technology compatible despite gaining market domi-
sub-units are more able to internalize network externalities, any nance. Similarly to the empirical findings in this paper, their model
relative increase in adoption propensity by such larger firms is emphasizes that there is a tension for a firm with many in-network
itself evidence of network externalities. customers. There is also a small and related literature in ICT that
Although our focus is on technology use rather than technol- addresses the issue of ‘inter-connection’ (Shy, 2001). This litera-
ogy adoption, that argument from the network effects literature ture emphasizes that while smaller telecommunication firms want
that larger users are more likely to internalize the network benefits from data exchange (in the form of either lower costs or higher
inter-connection, larger firms do not and instead prefer to merge. Mata et al. (1995), by contrast, argue that switching costs are not a
quality) applies equally in our setting to internal data exchange sustainable source of competitive advantage for any firm regardless
among hospitals within a system. This channel implies that larger of size. The setting we study is different, because we do not exam-
systems are more likely to exchange data internally but not exter- ine the behavior of vendors of EMR technology and their incentives
nally (because they have no advantage in internalizing benefits to distort standards to gain market power. Instead, we study hos-
outside their system). In fact, hospitals in larger systems may find pital end-users who deploy standards-based technology and get a
less value from external data exchange (in terms of cost savings or direct benefit (or not) from inter-connection.
improved quality) than hospitals in smaller systems because they
may plausibly be able to serve customers’ needs entirely within
3. Data
their own firm boundaries, and consequently see less network
benefit to acquiring customer information from other firms. As
3.1. Electronic exchange of patient information
a general rule, the within system benefits from exchange will be
larger and cross-system benefits smaller, if patients tend to get all
We use the Hospital Electronic Health Record Adoption
of their care within the same system (in the extreme case of a health
DatabaseTM from the American Hospital Association (AHA, released
maintenance organization (HMO), for example, or if the system is
in May 2009), which reports data from a 2007 survey of members
dominant in that local area).
of the American Hospital Association.9 This survey is funded by
Still, it is also possible that the costs of supporting data exchange
the Office of the National Coordinator for Health Information Tech-
are lower, on a per-transaction basis, for larger systems. That would
nology (ONC) and is intended to be the most comprehensive and
lead larger systems to invest more in the IT capacity for exchange
representative survey of the state of healthcare IT.
and, conditional on capacity, to engage in more network use overall,
This survey asked whether hospitals exchanged patient and
both internal and external.
clinical data with other hospitals in their system and externally
These first two channels, through costs and quality, suggest
outside of their system. Having data on actual sharing of health
a positive relationship between system size and internal data
data rather than technology adoption allows us to advance on pre-
exchange but a weaker or negative relationship for external data
vious research in this area, which has only had data on health care
exchange. When strategic considerations about patient hospital
information technology adoption.10
choice are included in the calculation, then larger systems are
Because the original 2007 survey was not sufficiently compre-
expected to be less likely to engage in external data exchange.
hensive, the American Hospital Association repeated the survey
Larger systems may be especially fearful that facilitating data
with different supplementary questions in 2008 and 2009. We
outflow will lead their customers to leave their firm and seek
use these additional survey waves to augment our dataset where
service from smaller and potentially cheaper alternative firms. In
there are missing observations (around 600 cases). However, our
the healthcare setting that we study, Melnick and Keeler (2007)
results are similar if we restrict our analysis to 2007. We are not
document that larger hospital systems have seen higher price
able to exploit these supplementary questions as a panel because
increases in recent years. Ho (2009) provides evidence that larger
three years of data is too short to measure effects. This is because
hospital systems exploit their bargaining power to negotiate better
of the two-year lead time for IT implementations (Miller and
prices with health insurers. This was confirmed in a recent study
Tucker, 2011a), and the antitrust scrutiny attendant on hospital
performed in Massachusetts by the state attorney general, which
system mergers and acquisitions, meaning that system size does
documented that larger hospital networks charge more even after
not change rapidly.
controlling for differences in difficulty of care provided (Coakley,
The survey did not ask hospitals to report with whom they
2010). Patients may therefore prefer to leave large hospital systems
exchanged their data. We use hospital referral regions (HRRs) as
to seek cheaper alternatives if they are responsive to deductibles
our definition of a local area within which patients plausibly might
and co-pays. This suggests that larger firms may see less value in
transfer between hospitals.11 There are 306 such regions within
allowing an outflow of patient records.
the US. We chose this as our underlying measure of other local
The question of how system size affects the decision to share information is new. Nevertheless, our results do relate to a literature that asks whether competition encourages or deters technology firms from adopting compatible standards for their technology (Farrell and Klemperer, 2007). Work on standards deployment, such as Augereau et al. (2006)’s paper on ISPs’ adoption of modem

8 More recently, Simcoe et al. (2009) find that small technology vendors are more likely to litigate after they disclose patents to a standards-setting organization. They suggest that this is because smaller firms are less likely to earn rents in complementary goods markets, and therefore defend their intellectual property more aggressively.
9 In earlier versions of this paper, we show the results are robust to controlling for potential survey-response bias. We have also checked that consistency in the answers to the questions over time did not vary with any of our key explanatory variables, including system size. This helps rule out distortions in responses based on strategic considerations and attempts to influence the development of the meaningful use criteria.
10 Tucker (2008) exploits network usage data to identify network externalities, but does not measure strategic decisions to interact or not over a network after adoption.
11 The Dartmouth Atlas of Health Care defines an HRR as a regional health care market for tertiary medical care, which contains at least one hospital that has performed major cardiovascular procedures and neurosurgery.
Table 1
Rates of data exchange by hospitals in the AHA technology survey.

                         Mean
External exchange        0.17
External patient         0.11
External clinical        0.16
Exchange Insurance       0.54
Not member RHIO          0.19
Internal exchange        0.68
Internal patient         0.66
Internal clinical        0.64
Observations             4060

Internal exchange dependent variables only applicable to 2573 hospitals that are part of a system.

Table 2
Summary statistics for characteristics of hospitals in the AHA technology survey.

                                        Mean      Std. Dev.
# Hospitals in system in HRR            1.48      2.87
# Hospitals outside system in HRR       28.4      21.6
Admissions (000)                        7.53      9.47
Proportion Medicare inpatients          45.7      22.2
Proportion Medicaid inpatients          18.1      16.4
No. doctors (000)                       0.023     0.085
PPO                                     0.64      0.48
HMO                                     0.56      0.50
Per capita payroll (000,000)            0.050     0.016
Independent practice association        0.11      0.31
Group practice association              0.020     0.14
Integrated salary model                 0.30      0.46
Non-profit hospital                     0.43      0.50
Speciality hospital                     0.39      0.49
Cerner system                           0.077     0.27
Eclipsys system                         0.028     0.17
Epic system                             0.044     0.20
GE system                               0.018     0.13
Mckesson system                         0.071     0.26
Meditech system                         0.17      0.37
Siemens system                          0.045     0.21
Other system                            0.0049    0.070
# Hospitals outside HRR in system       19.7      42.7
Standalone                              0.37      0.48
Observations                            4060

hospitals because it measures a broad but carefully defined geographical area from which patients might obtain care. We found similar results when we ran our regressions using the narrower definition of hospital service areas (HSAs), which are smaller and are based on the customary geographical reach of patients.

3.2. Further controls

We matched the information on patient data exchange with the
most recent rounds of the AHA hospital survey to obtain detailed
data on hospital characteristics for controls in our regressions. The
data provide information on a hospital’s system’s size, defined as
structure) by controlling for independent practice association,
the number of hospitals owned, leased, sponsored or contract-
group practice association, and integrated salary model. These vari-
managed by a central organization. Though we use system size as
ables can affect the degree of control that physicians exercise over
measured by the number of hospitals in our main specifications,
hospitals’ investments and data use. Finally, we control for hospital
we also obtain similar results if we weight the system size vari-
type, to allow for the possibility that nonprofit hospitals have
ables by number of beds.12 We observe 430 hospital systems in
different objective functions than for-profits and that specialty
our data. The average system contains six hospitals and operates
hospitals may have different motivations for sharing data than
in just under four regional markets. Among hospitals in our data
general hospitals. This was something that was documented by
that belong to multi-hospital systems, the average system size is
Adler-Milstein et al. (2011) in other research that used this dataset.
36 hospitals. In our full sample, hospitals have an average of 1.5
In addition to these key AHA controls, we also add information from
other hospitals from their system in the same HRR and 19.7 hos-
HIMSS about the IT vendor used by each hospital. This allows us to
pitals in their system located outside of the HRR. Tables 1 and 2
distinguish between variation in data exchange outcomes driven
provide summary statistics for our dependent and explanatory
by differences in hospitals’ ability to exchange data and by dif-
measures.
ferences in hospitals’ willingness to exchange data, conditional on
In addition to our main explanatory variables for system size
capacity.
and local hospital competition, our empirical models also include
In all our analysis, the unit of observation is a hospital rather
variables that could affect the propensity of hospitals to exchange
than a hospital system. This is motivated by the lack of unifor-
information, either internally or externally. Most of these variables
mity in the systems in our data - some hospitals share information
are from the AHA. Hospital size could predict data exchange
externally and some do not. This may stem from organizational
because it is related to patient flows and larger hospitals are more
structures surrounding IT purchases that predate the idea of a net-
willing to absorb the fixed costs associated with establishing data
worked IT system.13 Therefore, if hospital systems do exchange
exchanges. The insurance status of patients can also affect data
data externally at all, there is diversity in individual hospital
exchange among providers by affecting the mobility of patients,
exchanging behavior.
within or across systems, or by affecting the financial status of hos-
pitals (through differences in reimbursement rates). We separately
control for the proportions of inpatient days covered by Medicare and Medicaid because of differences in reimbursement levels and patient populations in the two public programs. We also control for hospital per-capita payroll costs, as these may be related to greater quality and investment overall at the hospital, including IT. Furthermore, we account for the fact that hospitals vary not only in their average payroll costs but also in their relationships with physicians (affecting both compensation and governance

4. Analysis and results

4.1. Exchange within a system

To evaluate the relationship between hospital system size and the decision to exchange electronic data, we use our cross-sectional data to estimate a static model. For a hospital that has completed
12 Because the AHA panel data on system membership is sometimes noisy from year to year (Madison, 2004), we also cross-checked the systemid variable that we base our results on with the systemid variable from the 1996 to 2006 AHA surveys to weed out any inconsistencies.
13 This is consistent with data from HIMSS surveys of both hospital and hospital systems about the nature of the approval process for IT purchases. In only about 11% of hospital systems and 20% of hospitals were purchasing decisions taken at the Board level. In the remainder of cases, the decision was taken at a far lower level, most frequently that of the hospital’s chief information technology officer.
Table 3
Larger hospital systems are more likely to exchange information internally.
Marginal effects, with standard errors in parentheses.

                                      (1)         (2)         (3)         (4)         (5)         (6)
# Hospitals in system in HRR          0.017***    0.020***    0.021***    0.022***    0.021***    0.019***
                                      (0.010)     (0.010)     (0.011)     (0.011)     (0.011)     (0.010)
# Hospitals outside system in HRR     −0.001***   −0.001      −0.001**    −0.001**    −0.001**    −0.001***
                                      (0.001)     (0.001)     (0.001)     (0.001)     (0.001)     (0.001)
Admissions (000)                                              0.009***    0.009***    0.007***    0.007***
                                                              (0.004)     (0.004)     (0.004)     (0.004)
Proportion Medicare inpatients                                −0.003***   −0.003***   −0.003***   −0.003***
                                                              (0.001)     (0.002)     (0.002)     (0.002)
Proportion Medicaid inpatients                                −0.005***   −0.005***   −0.004***   −0.004***
                                                              (0.002)     (0.002)     (0.002)     (0.002)
No. doctors (000)                                             0.301       0.302       0.283       0.250
                                                              (0.633)     (0.613)     (0.600)     (0.561)
PPO                                                           −0.080**    −0.088**    −0.084**    −0.110***
                                                              (0.102)     (0.102)     (0.102)     (0.098)
HMO                                                           0.112***    0.121***    0.113***    0.139***
                                                              (0.099)     (0.099)     (0.099)     (0.094)
Per capita payroll (000,000)                                  1.622*      1.588*      1.518*      1.786*
                                                              (2.704)     (2.698)     (2.579)     (2.687)
Independent practice association                              −0.018      −0.026      −0.019      −0.027
                                                              (0.099)     (0.100)     (0.101)     (0.099)
Group practice association                                    0.023       0.023       0.028       0.050
                                                              (0.229)     (0.232)     (0.236)     (0.229)
Integrated salary model                                       −0.015      −0.014      −0.015      −0.020
                                                              (0.066)     (0.067)     (0.067)     (0.064)
Non-profit hospital                                           0.086***    0.084***    0.066***    0.062***
                                                              (0.072)     (0.073)     (0.073)     (0.069)
Speciality hospital                                           0.057**     0.056**     0.044*      0.041*
                                                              (0.068)     (0.068)     (0.069)     (0.066)
Cerner system                                                                         0.167***    0.160***
                                                                                      (0.124)     (0.122)
Eclipsys system                                                                       −0.017      −0.002
                                                                                      (0.193)     (0.193)
Epic system                                                                           0.229***    0.228***
                                                                                      (0.177)     (0.175)
GE system                                                                             0.068       0.086
                                                                                      (0.219)     (0.217)
Mckesson system                                                                       0.066       0.074*
                                                                                      (0.127)     (0.126)
Meditech system                                                                       −0.037      −0.032
                                                                                      (0.092)     (0.090)
Siemens system                                                                        0.149**     0.149***
                                                                                      (0.170)     (0.163)
Other system                                                                          0.280**     0.306**
                                                                                      (0.413)     (0.409)
State fixed effects                   No          Yes         Yes         Yes         Yes         No
Installation year controls            No          No          No          Yes         Yes         Yes
Observations                          2573        2571        2571        2571        2571        2573
Log-likelihood                        −1589.61    −1539.01    −1440.10    −1426.39    −1403.49    −1435.67

Probit estimates. Dependent variable is whether the hospital exchanges electronic data internally within its system. Robust Standard Errors.
* p < 0.10. ** p < 0.05. *** p < 0.01.
the survey, the decision to exchange information electronically internally is specified as:

\[ \mathrm{Prob}(\mathrm{ExchangeInternal}_{ij} = 1 \mid \mathrm{SystemSize}_{ij}, X_{ij}) = \Phi(\mathrm{SystemSize}_{ij}, X_{ij}, \theta) \tag{2} \]

and $\mathrm{ExchangeInternal}_{ij} = 1$ if hospital $i$ in HRR $j$ exchanges information internally.

$\mathrm{SystemSize}_{ij}$, our key variable of interest, captures the number of hospitals within that system in that HRR. $X_{ij}$ is a vector of hospital characteristics that affect the propensity to exchange information, $\theta$ is a vector of unknown parameters, and $\Phi$ is the cumulative distribution function of the standard normal distribution.

Table 3 displays the results of our initial specification. Since only hospitals in a system can answer in the affirmative to this question, we restrict our attention to the 2,571 hospitals who are part of a system in our data.

Column (1) is a Probit regression for whether or not that hospital exchanges data with other hospitals in its system. The positive and significant marginal effect of the number of local in-system hospitals suggests that the likelihood of exchanging data within the system increases with system size. The decision does not appear to be positively related to the presence of other hospitals in the local area. This finding is in alignment with a traditional approach to network effects which suggests that larger coordinated firms are better able to internalize network externalities and consequently more likely to share information.14

14 To explore the possibility that this positive relationship between system size and internal exchange is driven from a mechanical effect coming from hospitals in
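To make the probit specification in Eq. (2) concrete, a minimal sketch follows. It assumes the common linear-index form, in which the arguments of the normal CDF enter as an intercept plus a coefficient times system size plus a weighted sum of controls. The paper writes the function generically, so this functional form, every coefficient value, and the function names are illustrative assumptions, not the authors' estimates. The last function shows how a marginal effect of the kind reported in Table 3 is obtained from a probit index.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def linear_index(intercept, beta_size, system_size, gammas, controls):
    """Assumed linear index: intercept + beta_size * system_size + sum(gamma_k * x_k)."""
    return intercept + beta_size * system_size + sum(
        g * x for g, x in zip(gammas, controls)
    )


def exchange_probability(index):
    """Probit probability of exchange: the standard normal CDF at the index."""
    return STD_NORMAL.cdf(index)


def marginal_effect(index, beta):
    """Change in probability per unit of a regressor: normal density times beta."""
    return STD_NORMAL.pdf(index) * beta


# Hypothetical inputs chosen only for illustration (not estimates from the paper).
idx = linear_index(intercept=0.2, beta_size=0.06, system_size=3,
                   gammas=[0.01], controls=[7.5])
print(round(exchange_probability(idx), 3))
print(round(marginal_effect(idx, 0.06), 4))
```

Because the density term depends on the index, a probit marginal effect varies across hospitals, which is why tables of this kind report it at a reference point or as an average.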
In columns (2)–(5), we show that the results remain robust For this decision, we similarly estimate a separate equation
when we add controls for the state the hospital is located in, hospi- where:
tal characteristics, the age of the technology, and the manufacturer
of the system. We add these controls to address concerns that the
explanatory variable of system size is driven by external factors that also determine whether the hospital shares data with others.

\[ \mathrm{Prob}(\mathrm{ExchangeExternal}_{ij} = 1 \mid \mathrm{SystemSize}_{ij}, X_{ij}) = \Phi(\mathrm{SystemSize}_{ij}, X_{ij}, \theta) \tag{3} \]
The state fixed effects in columns (2)–(5) account for cross-state
differences in system size and the propensity to exchange data. In and ExchangeExternalij = 1 if hospital i in HRR j exchanges informa-
particular, as discussed in Miller and Tucker (2009, 2011b, 2012), tion externally. The controls remain the same as before.
state-level regulation of privacy, information security, and medical Table 4 reports the incremental results as we build up to our
malpractice can affect the adoption of EMRs and therefore poten- final specification. Column (1) of Table 4 is a Probit regression for
tially the use of EMRs to exchange information. They might also at whether the hospital exchanges information externally. Here, the
the same time lead hospital systems to grow or shrink. By including sign on the size of the local hospital system is strikingly different
the full set of state fixed effects in these models, we can abstract from the sign in Table 3. Larger hospital systems are less likely to
away from the impact of cross-sectional variation in such state exchange information externally.
regulations on hospitals’ exchanging decisions. Importantly, the decision to exchange information externally
There is also the possibility that our estimates are capturing does not appear to be positively affected by the number of potential
something else about the hospital that determines its decision external partners. The coefficient for this is negative, small, and
to share data – for example its organizational or financial struc- generally insignificant. As we discussed in Section 2, the finding
ture. In column (3), we add additional controls for such hospital that larger hospital systems are less likely to exchange information
characteristics. Many of these controls are insignificant. Gener- externally could be the result of some HRRs being dominated by a
ally, hospitals that see many Medicaid and Medicare patients single large system. That system would expect to receive little net
are less likely to exchange information within their systems. inflow of patients from exchanging patient information externally.
This could be because the information for such patients is cen- However, if concerns over the potential of the local HRR to produce
trally reported to the government and consequently there is less patient inflows were dominant, then we would expect hospitals to
need for a hospital-level information exchanging system.15 Non-profit hospitals and specialty hospitals are more likely to exchange data internally. Controlling for the technology installation year
be more likely to exchange data externally when there are more external hospitals. However, this is not what we find.
Column (2) of Table 4 adds state fixed effects and column (3) adds
in column (4) and vendor in column (5) leaves the main esti- hospital characteristics to control for observable differences in hos-
mates unchanged. Finally, column (6) reports estimates from the pitals’ underlying propensity to exchange information. Generally,
model with the full set of controls but omits the state fixed the ability to share data externally appears to increase in proxies
effects. for hospital size such as beds or number of doctors. It also rises
Across all specifications, the estimated marginal effects imply in the proportion of Medicare inpatients, which very speculatively
that each additional hospital in a system increases the chances may reflect the benefits to sharing data under fixed-fee payment
that individual hospitals in that system exchange data internally by systems. The indicator for having an integrated salary model with
about 2 percentage points. This is a moderate effect size in relation physicians has a positive effect on external exchange, but the con-
to the mean rate of internal exchange in our sample of 68% (Table 1). trols for other organizational forms, such as independent practice
A standard deviation increase in system size (of about 3 hospitals; association (IPA), are not significant.16
Table 2) would increase internal exchange by 6 percentage points The influence of per-capita payroll is of particular interest, since
or about 8.8 percent. it affects the decision to exchange inside a system and outside a
system in different ways. Table 3 shows that hospitals with high
per-capita payrolls are more likely to exchange information within
their system. However, hospitals with high per-capita payrolls are
4.2. External exchange of data
less likely to exchange information outside their system. If a general
lack of financial resources were driving the decisions to exchange
However, of crucial importance for firms that own information-
we see in the data, we would expect that hospitals that have the
sharing platforms and policymakers who are relying on large
financial ability to offer high salaries would consistently be more
hospital system users to kick-start the network (or to provide a
likely to exchange information. A possible interpretation of this
foundation for ‘big data’ application in healthcare) is whether a
result is that hospitals that pay their staff well want to ensure that
network user exchanges information externally.
they capitalize on the positive spillovers of, for example, attract-
ing more competent technicians to operate expensive diagnostic
testing technologies. Therefore, such hospitals are less willing for
patients to take their data from such tests that require expensive
larger systems having more potential opportunities for exchange, we also estimated
manpower away from their hospital and to other hospitals. We
the models in Table 3 using ‘any exchange’ (including either internal or external)
as the dependent variables. The effect of system size is positive and significant in
explore this in more detail in later regressions.
those models as well, where all hospitals in the same HRR have the same number A possible explanation for the negative relationship between
of potential exchange partners, which provides some additional support for the system size and external data exchange is that it simply reflects
interpretation that larger systems are better able to internalize network benefits technological incapacity. It is possible, for example, that hospitals
from data exchange.
15 in larger systems adopted Electronic Medical Record technology
The HHS Section 484.20 interim final rule from 1999 requires electronic repor-
ting of data from the Outcome and Assessment Information Set (OASIS) as a earlier. This means that the systems that they chose are less
condition of participation in the Medicare or Medicaid systems. Hospitals had the able to exchange information with other hospitals than newer
option of purchasing data collection software that can be used to support other clin-
ical or operational needs such as the ones that we study in this research, but they
could also use a HCFA-sponsored OASIS data entry system (that is, Home Assess-
16
ment Validation and Entry, or “HAVEN”) at no charge. The use of such a system, This may reflect the unusual profile of hospitals that retained their IPA arrange-
however, might limit the exchange of data within a system. ments through 2007 (Ciliberto, 2006).
Table 4
Larger hospital systems are less likely to exchange information externally.
All hospitals System hospital

(1) (2) (3) (4) (5) (6) (7)
Mrg Eff./S.E. Mrg Eff./S.E. Mrg Eff./S.E. Mrg Eff./S.E. Mrg Eff./S.E. Mrg Eff./S.E. Mrg Eff./S.E.
# Hospitals in system in HRR −0.005** −0.007*** −0.007*** −0.007*** −0.007*** −0.005* −0.007***
(0.009) (0.009) (0.010) (0.010) (0.010) (0.009) (0.011)
# Hospitals outside system in HRR −0.001* −0.000 −0.000 −0.000 −0.000 −0.000 −0.000
(0.001) (0.001) (0.001) (0.001) (0.001) (0.001) (0.002)
Admissions (000) 0.002** 0.002** 0.002** 0.000 0.002**
(0.003) (0.003) (0.003) (0.003) (0.003)
Proportion Medicare inpatients 0.001** 0.001** 0.001** 0.001** 0.000
(0.001) (0.001) (0.001) (0.001) (0.002)
Proportion Medicaid inpatients 0.001 0.001 0.001* 0.001** 0.000
(0.002) (0.002) (0.002) (0.002) (0.002)
No. doctors (000) 0.165** 0.157** 0.144* 0.177** 0.150*
(0.296) (0.313) (0.316) (0.310) (0.351)
PPO −0.037* −0.042** −0.041** −0.029 −0.025
(0.080) (0.081) (0.081) (0.077) (0.115)
HMO 0.012 0.014 0.015 −0.004 0.013
(0.078) (0.079) (0.079) (0.075) (0.112)
Per capita payroll (000,000) −1.252*** −1.329*** −1.267*** −0.916** −1.431***
(1.915) (1.907) (1.904) (1.647) (2.340)
Independent practice association −0.010 −0.008 −0.007 −0.010 −0.008
(0.088) (0.089) (0.089) (0.086) (0.118)
Group practice association −0.012 −0.014 −0.016 −0.015 −0.018
(0.198) (0.195) (0.194) (0.188) (0.245)
Integrated salary model 0.026* 0.024* 0.024* 0.044*** 0.013
(0.055) (0.055) (0.056) (0.053) (0.074)
Non-profit hospital −0.000 −0.001 −0.000 0.008 0.009
(0.062) (0.063) (0.063) (0.058) (0.083)
Specialty hospital −0.004 −0.002 −0.002 −0.004 0.002
(0.061) (0.061) (0.062) (0.058) (0.081)
Cerner system −0.040 −0.046* −0.045
(0.106) (0.103) (0.133)
Eclipsys system −0.006 −0.012 0.020
(0.156) (0.155) (0.207)
Epic system 0.033 0.038 0.042
(0.125) (0.120) (0.153)
GE system −0.092** −0.085* −0.079*
(0.189) (0.193) (0.200)
McKesson system −0.020 −0.011 0.004
(0.100) (0.100) (0.135)
Meditech system −0.035* −0.032* 0.001
(0.079) (0.077) (0.109)
Siemens system −0.015 −0.029 −0.020
(0.129) (0.122) (0.179)
Other system −0.075 −0.098 0.037
(0.383) (0.388) (0.418)
State fixed effects No Yes Yes Yes Yes No Yes
Installation year controls No No No Yes Yes Yes Yes
Observations 4060 4060 4060 4060 4060 4060 2561

Log-likelihood −1860.19 −1781.43 −1764.15 −1746.78 −1741.23 −1813.83 −1034.11
Probit estimates. Dependent variable is whether the hospital exchanges electronic data externally outside its system. Robust standard errors.
* p < 0.10. ** p < 0.05. *** p < 0.01.
systems which are built around the most current data interchange standards.17 It could also be that they chose to buy their system from a vendor that makes interoperability harder. Early Meditech systems, for example, were built around the MAGIC operating system, meaning that they need special auxiliary customized add-ons to be able to exchange data with other non-MAGIC EMR systems. The decision to purchase from a less-interoperable vendor is bound up with the decision to exchange information, but it is possible that the hospital purchased from this vendor before such interoperability concerns were as important as they are today. To control for such concerns, in column (4) of Table 4 we include fixed effects for the year the EMR system was installed. In column (5), we also include vendor fixed effects for the largest EMR vendors. In both cases, the results remain robust. Often, the vendor that the hospital bought the system from does not appear to be a particularly significant factor in whether or not the hospital exchanges information. This suggests that policy needs to focus not just on ensuring interoperability at the vendor level, but also on encouraging hospitals to purchase systems that they actually use to exchange data.
Column (6) shows that the estimates with the full set of controls are robust to omitting the state fixed effects.
Finally, in column (7) of Table 4, we show that the result holds when we restrict attention to hospitals that are part of hospital systems. This is precisely the same sample as we used for the analysis in Table 3.
17 These standards were largely only formalized, by bodies like CCHIT, in 2006–2007.
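The tables in this section report marginal effects rather than raw Probit coefficients. As a rough illustration of the conversion, the sketch below evaluates a Probit marginal effect at the linear index that reproduces the 17% sample mean of external exchange. The coefficient value and the names `probit_marginal_effect` and `hypothetical_beta` are assumptions made for the example; the source does not state whether its marginal effects are evaluated at sample means or averaged over observations.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def probit_marginal_effect(beta_k: float, index: float) -> float:
    """Change in Pr(y = 1) per unit change in regressor k.

    For a Probit model Pr(y = 1 | x) = Phi(x'beta), the derivative with
    respect to x_k is phi(x'beta) * beta_k, where `index` is x'beta.
    """
    return STD_NORMAL.pdf(index) * beta_k


# Linear index at which the predicted probability equals the 17% sample
# mean of external exchange quoted in the text (Table 1).
index_at_mean = STD_NORMAL.inv_cdf(0.17)

# Hypothetical raw Probit coefficient on system size; NOT an estimate
# taken from the paper, chosen only to illustrate the conversion.
hypothetical_beta = -0.03

effect_at_mean = probit_marginal_effect(hypothetical_beta, index_at_mean)
print(f"marginal effect at the sample mean: {effect_at_mean:.4f}")
```

Because the normal density peaks at an index of zero, the same raw coefficient translates into a smaller marginal effect when the baseline probability is far from 50%, which is one reason reported marginal effects cannot be compared across outcomes with very different means without care.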
Table 5
Checking the robustness of our results to different dependent variables.
(1) (2) (3)

External patient External clinical Not member RHIO
Mrg Eff./S.E. Mrg Eff./S.E. Mrg Eff./S.E.
# Hospitals in system in HRR −0.006*** −0.005** 0.004*

(0.011) (0.010) (0.009)
# Hospitals outside system in HRR 0.000 −0.001* −0.000
(0.001) (0.001) (0.001)
Admissions (000) 0.002*** 0.002** 0.004***
(0.003) (0.003) (0.003)
Proportion Medicare inpatients 0.000 0.001** 0.001*
(0.002) (0.001) (0.001)
Proportion Medicaid inpatients 0.000 0.001* 0.001***
(0.002) (0.002) (0.002)
No. doctors (000) 0.063 0.152** −0.099
(0.300) (0.325) (0.339)
PPO −0.042*** −0.031* −0.041**
(0.091) (0.083) (0.078)
HMO 0.011 0.010 0.042**
(0.089) (0.081) (0.076)
Per capita payroll (000,000) −0.755** −0.909** 0.487
(2.168) (1.891) (1.465)
State fixed effects Yes Yes Yes
Hospital type controls Yes Yes Yes
Vendor controls Yes Yes Yes
Installation year controls Yes Yes Yes
Observations 4016 4060 4060

Log-likelihood −1296.40 −1632.28 −1807.81
Probit estimates. Robust standard errors. Dependent variable as described in column headers.
* p < 0.10. ** p < 0.05. *** p < 0.01.
The magnitude of the marginal effects indicates that each additional hospital in a system lowers the chance of external data exchange from hospitals in that system by 0.7 percentage points. This implies that a standard deviation increase in system size of about 3 hospitals (Table 2) would decrease external exchange by 2.1 percentage points, or by about 12 percent of the sample mean (of 17 percent; Table 1).
4.3. Robustness
We conduct multiple robustness checks for the novel finding in Table 4 that hospitals in larger systems are less likely to exchange data externally.
We first show the robustness of our result to alternative dependent variables. Table 5 displays the results. The survey asked separately about whether a hospital exchanged patient data (such as name, background and insurance details) and clinical data (such as medication lists, discharge summaries, and radiology reports). Columns (1) and (2) distinguish between decisions to exchange these different types of data. The pattern that hospitals are less likely to share externally if they are part of a large local system is replicated across these two types of data.
In column (3), we check whether our result holds for a different potential measure of external sharing of information, which is whether or not the hospital actively participates in a Regional Health Information Organization (RHIO). RHIOs develop databases and software architectures that ease the electronic exchange of patient-level clinical information between health-care providers. It appears that hospitals with a larger regional system presence are indeed more likely not to participate actively in an RHIO.
We then go on to check our results using alternative subsamples and conduct falsification tests in Table 6.
The first columns of Table 6 address the concern that, because HRRs vary in size, the number of hospitals in and out of a system in an HRR may not adequately capture market share or position. To address this, we split our sample into HRRs that have above and below the mean number of hospitals. Columns (1) and (2) show the results. We find that the decision to share information is negatively related to system size in both instances. This suggests that non-linearities introduced by differing HRR size are not driving our results.
Another concern is that a merger between two nearby hospitals that are already exchanging information will lead to both hospitals belonging to a larger system and to an increase in within-system exchanging and a decrease in external data exchanging, with no change in the real level of information exchange. As pointed out by Town et al. (2007), there has been considerable consolidation in the US over the past two decades. To check for this, we exclude observations of hospitals that had experienced mergers in the past 10 years in column (3) of Table 6. Column (4) also excludes hospitals that are not part of systems. The results remain similar. This suggests that the pattern we find is not a result of previous merger activity.
This robustness check is independently interesting because it illuminates arguments used in recent anti-trust cases. Hospital systems have argued that mergers will promote adoption of EMRs and consequently benefit patients and society at large.18 For example, in the Evanston Northwestern-Highland Park case, one of the claims that Evanston Northwestern made was that it had done much to improve the quality of medical care at Highland Park since the merger, including ‘investing millions of dollars in changes [like] new information systems and electronic medical records’
18 This is an example of the “efficiencies defense” commonly used in hospital merger cases (Gaynor and Vogt, 2000).
Table 6
Robustness and falsification checks.
Big HRR Small HRR No merger No merger: systems only Out of network Insurance
(1) (2) (3) (4) (5) (6)
# Hospitals in system in HRR −0.009*** −0.016*** −0.007** −0.007** −0.003

(0.015) (0.024) (0.013) (0.013) (0.008)
# Hospitals outside HRR in system −0.000
(0.001)
# Hospitals outside system in HRR −0.001 −0.001 −0.001* −0.001* −0.001* 0.001
(0.004) (0.008) (0.002) (0.002) (0.002) (0.001)
Admissions (000) 0.000 0.003*** 0.002*** 0.002*** 0.002** 0.011***
(0.006) (0.005) (0.004) (0.004) (0.003) (0.003)
Proportion Medicare inpatients 0.000 0.000 0.000 0.000 −0.000 0.000
(0.003) (0.002) (0.002) (0.002) (0.002) (0.001)
Proportion Medicaid inpatients 0.001 −0.000 0.000 0.000 −0.000 0.001
(0.004) (0.003) (0.003) (0.003) (0.002) (0.001)
No. doctors (000) 0.107 0.324** 0.124 0.124 0.143* −0.262**
(0.457) (0.569) (0.375) (0.375) (0.361) (0.292)
PPO 0.038 −0.077** −0.016 −0.016 −0.027 0.064**
(0.194) (0.156) (0.136) (0.136) (0.115) (0.069)
HMO −0.029 0.057 −0.005 −0.005 0.013 0.023
(0.190) (0.153) (0.131) (0.131) (0.112) (0.067)
Per capita payroll (000,000) −2.065** −0.777 −1.606*** −1.606*** −1.366*** 0.892
(4.149) (3.095) (2.670) (2.670) (2.333) (1.753)
State fixed effects Yes Yes Yes Yes Yes Yes
Hospital type controls Yes Yes Yes Yes Yes Yes
Vendor controls Yes Yes Yes Yes Yes No
Installation year controls Yes Yes Yes Yes Yes No
Observations 1036 1450 1908 1908 2561 4060

Log-likelihood −378.00 −596.95 −771.17 −771.17 −1036.89 −2596.01
Probit estimates. Dependent variable is whether the hospital exchanges electronic data externally outside its system. Robust standard errors.
* p < 0.10. ** p < 0.05. *** p < 0.01.
(Japsen, 2005). Our analysis indicates that while larger firms are indeed more likely to exchange information on an intra-firm basis, they are less likely to exchange information across an inter-firm network. This means that larger firms, while seemingly associated with higher adoption levels, are actually associated with lower network externalities for a technology in the specific sense of promoting information exchange.
In columns (5) and (6), we move on to two falsification checks. In column (5), we add a new variable that captures the number of hospitals outside the local HRR but within the same system. If we were capturing something about organizational capacity, for example that larger systems have organizational structures which mean they adopt technological innovations more slowly, we would expect this to have a similar negative and significant effect. However, column (5) shows that we do not find such an effect.
In column (6), we report results for a falsification check in which we estimate the effects of system size on the decision to exchange information with an insurance provider. If, again, there were unobserved technological capacity issues to do with having a large system size that were leading firms to not be able to exchange information externally, we would expect to see a similar result for this metric because it also captures external exchange of data. We do not know the details of the insurance system implementation from the AHA survey, so we cannot include system age or manufacturer as controls. However, the results in column (6) suggest that there is indeed no negative relationship between system size and the decision to exchange information with insurers.
4.4. Instrumenting for system size
Our estimates so far assume that system size is unrelated to other unobserved factors that also determine a hospital’s decision to share information externally or internally. This may be reasonable if system size was largely determined prior to the advent of electronic health information exchange. However, though we employ a battery of controls in case the decision to exchange data and system size are jointly determined by observed factors, there is still the potential for an unobserved source of bias or for system size itself to be endogenous. For example, it may be the case that like-minded hospitals formed a system specifically because it facilitated their efforts to exchange data with one another.
We address this concern by using system size for the hospital in 1994 as an instrumental variable for current system size. By using pre-internet era system size as our source of exogenous variation, our idea is that this strips out system amalgamation decisions which occurred in response to similar goals or technical abilities related to health IT or data exchange. This is similar to the identification strategy used in Miller and Tucker (2013), who use pre-internet capital-staff ratios to identify post-internet online sharing behavior.
The IV-Probit estimate in column (1) of Table 7 confirms the finding of a positive effect of system size on internal exchange. The IV-Probit estimates in column (2) confirm the negative effect of system size on external exchange using the full sample of hospitals, while the estimate in column (3) does the same for the sample restricted to hospitals that are members of systems. The marginal effect estimates share the same signs in the basic Probit and IV-Probit models, but the IV-Probit estimates are larger. The effect of an additional hospital in the system is estimated to increase internal exchange by 4.3 percentage points in the IV model (rather than 2.1 percentage points from the comparable Probit model in column (5) of Table 3) and to decrease external exchange by 4.2 percentage points (rather than 0.7 percentage points in the Probit model in column (5) of Table 4). These effect sizes are nontrivial relative to the mean rates of data exchange in our sample of 68% for internal and 17% for external exchange (see Table 1).
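The comparisons of effect sizes with sample means quoted in Sections 4.1 to 4.4 follow a simple piece of arithmetic, sketched below. The function name `relative_effect` is hypothetical, and the calculation treats the marginal effect as constant over the change in system size (a linear extrapolation, as the text's own figures appear to do). All numbers are those quoted in the text.

```python
def relative_effect(marginal_effect_pp: float, units: float, sample_mean_pct: float):
    """Return (change in percentage points, change as a percent of the sample mean).

    marginal_effect_pp: effect of one additional hospital, in percentage points
    units:              size of the change in system size (number of hospitals)
    sample_mean_pct:    sample mean of the exchange rate, in percent
    """
    change_pp = marginal_effect_pp * units
    return change_pp, 100.0 * change_pp / sample_mean_pct


# Probit, internal exchange: about 2 pp per hospital, 1 s.d. of about 3 hospitals, mean 68%
print(relative_effect(2.0, 3, 68.0))

# Probit, external exchange: -0.7 pp per hospital, 1 s.d. of about 3 hospitals, mean 17%
print(relative_effect(-0.7, 3, 17.0))

# IV-Probit, one additional hospital: +4.3 pp internal, -4.2 pp external
print(relative_effect(4.3, 1, 68.0))
print(relative_effect(-4.2, 1, 17.0))
```

Expressing each effect relative to its own sample mean makes clear why a given change in percentage points matters more for external exchange, whose baseline rate is much lower than that of internal exchange.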
Table 7
Instrumenting for system size.
                                       System only          All                  System only
                                       (1)                  (2)                  (3)
                                       Internal exchange    External exchange    External exchange
Top panel (dependent variable: data exchange, as shown in the column headers)
# Hospitals in system in HRR           0.0432**             −0.0413**            −0.0446*
                                       (0.0208)             (0.0176)             (0.0253)
# Hospitals outside system in HRR      −0.00182             −0.00148             −0.00340
                                       (0.00221)            (0.00163)            (0.00281)
Bottom panel (dependent variable: # Hospitals in system in HRR)
# Hospitals in system in 1994          0.932***             1.051***             0.932***
                                       (0.0258)             (0.0165)             (0.0258)
# Hospitals outside system in HRR      0.00316              −0.000963            0.00315
                                       (0.00303)            (0.00153)            (0.00303)
Hospital controls                      Yes                  Yes                  Yes
Installation year controls             Yes                  Yes                  Yes
Log-likelihood                         −3820.10             −6652.19             −3610.11
Marginal effects from IV-Probit models. Dependent variable in the top panel is internal exchange in column (1) and external exchange in columns (2) and (3). Dependent variable in the bottom panel is the number of hospitals in the system in the HRR. Robust standard errors.
* p < 0.10. ** p < 0.05. *** p < 0.01.
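The two-panel layout of Table 7 (outcome equation on top, first stage below) can be illustrated with a small simulation. This is a minimal sketch under strong simplifying assumptions, not the authors' IV-Probit procedure: it uses a linear outcome and the just-identified IV estimator (which coincides with two-stage least squares when there is one instrument and one endogenous regressor), and all data, parameter values, and names such as `size_1994` and `TRUE_EFFECT` are invented for illustration.

```python
import random
from statistics import fmean


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a, mean_b = fmean(a), fmean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Slope from a bivariate least-squares regression of y on x."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Just-identified IV estimate of the effect of x on y, using z as instrument."""
    return covariance(z, y) / covariance(z, x)


random.seed(0)
TRUE_EFFECT = -0.04  # hypothetical effect of system size on the exchange propensity
N = 20000

size_1994, size_now, exchange = [], [], []
for _ in range(N):
    z = random.gauss(3.0, 2.0)   # simulated historical system size (the instrument)
    u = random.gauss(0.0, 1.0)   # unobserved taste for data exchange
    # Current size depends on history AND on the unobserved taste (endogeneity).
    x = 0.9 * z + u + random.gauss(0.0, 1.0)
    # Simulated continuous exchange propensity (a linear stand-in for the Probit index).
    y = TRUE_EFFECT * x + 0.5 * u + random.gauss(0.0, 1.0)
    size_1994.append(z)
    size_now.append(x)
    exchange.append(y)

first_stage = ols_slope(size_1994, size_now)        # analogue of the bottom panel
naive = ols_slope(size_now, exchange)               # ignores endogeneity
instrumented = iv_slope(size_1994, size_now, exchange)

print(f"first stage: {first_stage:.3f}")
print(f"naive OLS:   {naive:.3f}")
print(f"IV estimate: {instrumented:.3f}")
```

In this simulated setting the naive estimate is pulled away from the true effect because current system size is correlated with the unobserved taste for exchange, while the instrumented estimate relies only on variation that comes from historical size. The direction of the bias here is an artifact of the invented parameters and says nothing about the direction of any bias in the paper's data.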
4.5. Are large hospital systems’ decisions not to share data strategic?
Given that the results of Table 6 appear to rule out an explanation based on technological capacity, we turn to exploring whether the decisions by large hospital systems to not share patient data reflect a strategic decision to prevent an outflow of patient data and, with it, patients (Table 7).
The ease with which a patient can leave a hospital system may depend on their insurance plan. Generally, a patient with a Preferred Provider Organization (PPO) insurance plan can seek a new provider at will. However, a patient with an HMO insurance plan must make a request for a new referral to their primary care provider. This means that patients with PPOs are more likely to leave the system than HMO patients. Therefore, the kind of insurance plans that a hospital accepts will influence the likelihood of
Table 8
External data exchange is more responsive to system size at PPO, high-wage, and specialty hospitals.
(1) (2) (3) (4) (5) (6)

PPO No PPO Above average wage Below average wage Specialty Not specialty
# Hospitals in system in HRR −0.007** −0.005 −0.013*** −0.004 −0.011** −0.005*

(0.013) (0.017) (0.018) (0.013) (0.021) (0.012)
# Hospitals outside system in HRR −0.000 −0.000 −0.001 −0.000 −0.000 −0.000
(0.002) (0.002) (0.002) (0.002) (0.002) (0.002)
Admissions (000) 0.002** −0.002 0.001 0.002* 0.004** 0.001
(0.004) (0.008) (0.007) (0.004) (0.006) (0.004)
Proportion Medicare inpatients 0.000 0.002*** 0.000 0.001** 0.001* 0.000
(0.002) (0.002) (0.002) (0.002) (0.002) (0.002)
Proportion Medicaid inpatients 0.001 0.000 0.000 0.001** 0.001 0.001
(0.003) (0.003) (0.002) (0.003) (0.003) (0.002)
No. doctors (000) 0.014 0.856*** 0.131 0.176* 0.106 0.120
(0.316) (0.828) (0.414) (0.420) (0.547) (0.340)
Per capita payroll (000,000) −0.796 −1.580** −1.548 −1.231* −0.686 −1.966***
(2.774) (2.846) (4.708) (3.017) (2.723) (2.835)
PPO −0.048* −0.026 −0.027 −0.059**
(0.118) (0.124) (0.147) (0.103)
HMO 0.008 0.017 0.019 0.019
(0.115) (0.120) (0.148) (0.098)
Installation year controls Yes Yes Yes Yes Yes Yes
Vendor controls Yes Yes Yes Yes Yes Yes
Observations 2591 1444 1826 2221 1564 2478

Log-likelihood −1071.50 −620.61 −794.63 −893.77 −634.19 −1061.89
Marginal effects from Probit models. Dependent variable is whether the hospital exchanges electronic data externally outside its system. Robust standard errors.
* p < 0.10. ** p < 0.05. *** p < 0.01.
Table 9
Hospitals are more likely to exchange data externally when more external hospitals do the same.
Probit IV-Probit IV-Probit (System)

(1) (2) (3)
External exchange External exchange External exchange
# Out-system hospitals exchanging externally in HRR 0.0316*** 0.141* 0.233**

(0.00901) (0.0779) (0.0957)
# Hospitals in system in HRR −0.0293*** −0.0318*** −0.0430***
(0.00989) (0.00968) (0.0110)
# Hospitals outside system in HRR −0.00545*** −0.0181** −0.0282***
(0.00182) (0.00908) (0.0105)
# Out-system hospitals exchanging externally in HRR
# Hospitals in system in HRR 0.0326** 0.0633***

(0.0154) (0.0169)
# Hospitals outside system in HRR 0.118*** 0.113***
(0.00218) (0.00263)
HRR 4yrEDI 2.841*** 1.950***
(0.529) (0.612)
HRR 4yrPhysDoc −2.702*** −1.944***
(0.630) (0.699)
HRR 8yrPhysDoc 1.622 −0.461
(1.514) (1.717)
HRR 12yrPhysDoc −3.860* −2.763
(2.151) (2.494)
Deploy year controls Yes Yes Yes
Hospital controls Yes Yes Yes

Log-likelihood −1735.03 −11235.49 −6854.27
Sargan test of over-identification 0.40 1.54

Sargan test of over-identification P-value 0.94 0.67
First stage R2 0.40 0.44
First stage F-test 15.53 15.04
First stage F-test p-value 0.00 0.00
Marginal effects from Probit and IV-Probit models. Dependent variable in the top panel is external exchange. Dependent variable in the bottom panel is the number of hospitals in the HRR outside of the focal hospital’s system that are exchanging data externally. Instrumental variables are HRR-level measures of technology adoption type and timing. Standard errors in parentheses.
* p < 0.10. ** p < 0.05. *** p < 0.01.
patients transferring from that hospital to another. Columns (1) and (2) of Table 8 present estimates by whether or not that hospital has a non-zero number of PPO contracts.19 The results suggest that PPO hospitals in larger systems are less likely to exchange data with outside hospitals than are hospitals that do not have PPO contracts. We caution that though the difference in size of point estimates is suggestive, the large standard error in column (2) means that these coefficients are not statistically different. It is also striking that we do not see this pattern for HMO providers since they are examples of healthcare systems where there is far less need for patient data portability, as patients have limited choice in providers. We repeat this estimation for the decision to share data internally within a system in Table A1 and find no such relationship.
One of the many motivations that hospitals may have to silo their patients’ records is to avoid competitors benefiting from the opinions of highly paid clinical or technical staff. Columns (3) and (4) of Table 8 explore this by presenting estimates where we allow the importance of hospital system size to vary by average salary paid to hospital staff. The results suggest that hospitals with highly-paid employees have larger coefficient estimates for the responsiveness of sharing to system size. We repeat this estimation for the decision to share data internally within a system in Table A1 and find no such relationship. This suggests that the decision to create information silos is related to the value of the inputs that a firm is paying for the creation of that data. The more valuable the inputs, the more reluctant firms are to share such data externally.
Along similar lines, we also stratified by whether or not a hospital had a ‘specialty’ such as cardiology or oncology. Columns (5) and (6) present the results. The point estimates suggest that specialty hospitals were more likely than non-specialty hospitals to not share information externally if they were part of a larger system. As with the results in columns (3) and (4), an explanation of this may be that hospitals find it commercially more important to stop the outflow of data outside their hospital system from specialty hospitals (where there are often expensive inputs) than from regular hospitals. This may be so that the hospital system can make sure that it is responsible for potentially profitable care such as routine follow-up CT scans for cancer patients.
5. Responses to the installed base
The findings in Section 4 suggest that larger hospital systems are more likely to exchange patient records internally with other hospitals in the same system but less likely to share data with hospitals outside of their systems. The policy implications of these findings for maximizing the exchange of patient data are not clear. On the one hand, when hospitals exchange information purely within their network, they are still exchanging information, which can improve care within the system. On the other hand, the decision not to exchange data externally can reduce the opportunities
19 We employ a binary indicator (as in Song (1995), for example) because the data do not contain information on the share of patients or revenues from PPO contracts.
for external data exchange for hospitals outside of the large hospital system. While the benefits from greater internal exchange may be captured within the system, the harm from less external exchange can produce negative spillovers for hospitals outside of the system and their patients. Such negative externalities may warrant policy intervention. To evaluate the empirical importance of this concern, we assess whether there is in fact evidence in our data of positive spillovers from one hospital’s electronic patient information exchange with external hospitals on the external exchange decisions of other local hospitals.20
In Table 9, we study how one hospital’s decision to exchange data externally affects the decisions of other local hospitals that are not members of the same system. In the first column, we estimate a simple Probit model, treating the exchanging decisions of other hospitals as exogenous and ignoring the potential reflection problem or correlated local unobservable factors that affect external exchange decisions for all hospitals (Manski, 1993). The main explanatory variable is a count of the number of other hospitals in the HRR but outside of the system that are exchanging information externally. The positive and significant relationship indicates that external sharing decisions are correlated between hospitals within HRRs, even after conditioning on the key observable factors.
The second column presents results from IV-Probit estimates of the individual hospital decision to exchange information externally, treating the number of out-system hospitals in their HRR that are exchanging information externally as endogenous. We instrument for external exchange by other out-system hospitals using the mean number of other hospitals in the local area with Electronic Data Interchange (EDI) and Physician Documentation systems that were old enough to have been made outdated by changes in various HL7 standards (more than 4 years old for EDI and more than 4, 8 or 12 years old for Physician Documentation).
Our assumption is that technology adoption decisions were made before the standards changed, and that the updated standards affect the number of other hospitals that exchange information, but do not otherwise affect the value of information exchange. The excluded variables reflect past choices made by neighboring hospital systems that have unintentionally rendered them less interoperable. Hospitals with older systems must bear an additional cost to make their systems comply with existing standards in order to be able to exchange data with other data systems and to participate in regional exchanges. Once they bear this cost, however, the value of the data they share should not directly be affected by the timing of their technology adoption.
This second column suggests that when other external hospitals are more willing to exchange information for exogenous reasons, this increases the propensity of a hospital in the same local area but in a different hospital system to also exchange externally. This implies that when larger hospital systems choose to not exchange information externally, they reduce the likelihood that other hospitals also exchange information externally. Since it is unlikely that the large hospital system internalizes the negative welfare externalities for these external hospitals or their patients, this means such strategic behavior reduces welfare for these external hospitals and their patients.
For both columns (2) and (3), we report the F-test statistic for the first stage of an identically specified two-stage least squares linear probability model. The high value for this F-test suggests that our instruments are strong predictors of the installed base. The estimated effects of older Physician Documentation systems are negative and significant for systems older than 4 or 12 years. However, conditional on the age of the Physician Documentation system, the presence of an older EDI system is positively related to data exchange, possibly because the costs of updating an older EDI system to comply with new standards were still lower than the costs of new adoption. We also report the Sargan test statistics for over-identification (which assumes the validity of at least one of our instruments), and this test suggests that we cannot reject the null hypothesis that the over-identifying restrictions are valid.
6. Implications
This research investigates motivations for the sharing of electronic health information by hospitals. We find that larger hospital systems are less likely to exchange information across a network and more likely to exchange information within their own network. Our findings suggest that commonly-advocated strategies for vendors who sell network products to kick-start their company may need modifying. Often, software and hardware firms are advised to secure initial marquee users to help firms overcome the chicken-and-egg problem inherent in network technology markets. However, our research suggests that when firms need to rely on the marquee user to establish system-wide network effects, the success of their strategies in later stages of the network’s development depends on whether marquee users are willing to use the network broadly. Therefore firms need to make sure, either contractually or technologically, that marquee users are obliged to share information across a network technology and not silo their data.21
The anticipated benefits from widespread health IT diffusion, in terms of cost savings and improved health outcomes, depend in large part on the electronic exchange of patient information. The results of this research suggest that adoption of EMR systems alone, even of systems with the capacity for data sharing, may not be sufficient to ensure that the full value from health IT is realized. This provides a potential rationale for public policy specifically aimed at promoting the electronic exchange of clinical information across firms and hospital system boundaries.
Currently the ‘Eligible Hospital and Critical Access Hospital Meaningful Use Core Measure 13’ states that to qualify, a hospital has to have ‘Performed at least one test of certified EHR technology’s capacity to electronically exchange key clinical information.’22 To qualify, hospitals can simply use information of a fictional patient. This measure reflects the current policy focus on technological interoperability as being the most important barrier to the exchange of healthcare information. However, our results suggest that policymakers should also consider how to improve incentives so that hospitals actually share commercially valuable patient data with each other. This is important as policymakers set policy priorities for ‘stage 3’ of meaningful use, the target date for
In the third column of the table, we repeat the instrumental
which is currently 2016.
variables estimation but only look at the subsample of hospitals
that are part of systems. The estimated effects are even larger in
this sample.
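The instrumental-variables logic used for the second and third columns can be illustrated with a minimal sketch. It is not the estimator used in this paper: it swaps the probit for a linear two-stage least squares fit on synthetic data, and every variable name, coefficient and distributional choice below is an assumption made only for illustration. The instrument stands in for the share of neighboring hospitals with outdated systems, which shifts how much neighbors exchange but is assumed not to affect the focal hospital's outcome directly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Synthetic data: all values are illustrative assumptions, not estimates from the paper.
local_shock = rng.normal(size=n)      # unobserved local factor that creates endogeneity
outdated_share = rng.normal(size=n)   # hypothetical instrument: neighbors with outdated systems
# Neighbors' external exchange falls with outdated systems and rises with the local shock.
neighbor_exchange = -1.0 * outdated_share + local_shock + rng.normal(size=n)
# Latent propensity of the focal hospital to exchange externally (assumed true effect: 0.5).
TRUE_EFFECT = 0.5
own_exchange = TRUE_EFFECT * neighbor_exchange + local_shock + rng.normal(size=n)


def ols(y, X):
    """Return least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


const = np.ones(n)

# Naive OLS: biased because local_shock drives both the regressor and the outcome.
beta_ols = ols(own_exchange, np.column_stack([const, neighbor_exchange]))[1]

# Stage 1: project the endogenous regressor on the instrument.
Z = np.column_stack([const, outdated_share])
fitted_exchange = Z @ ols(neighbor_exchange, Z)

# Stage 2: regress the outcome on the first-stage fitted values.
beta_iv = ols(own_exchange, np.column_stack([const, fitted_exchange]))[1]

print(f"OLS estimate:  {beta_ols:.3f}")
print(f"2SLS estimate: {beta_iv:.3f}")
```

Under this assumed data-generating process, the naive estimate is pulled upward by the common local shock, while the two-stage estimate should land close to the assumed value of 0.5. The exclusion restriction is imposed by construction here; in applied work it is an assumption that has to be argued, as the text does for the timing of technology adoption.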
20 These estimates may understate the positive externalities associated with external data exchange, as they will not capture the potential externalities from data sharing that involve researchers or healthcare providers outside of the local area, as would occur, for example, in a national ‘learning health’ system (Smith et al., 2012).
21 Though our study is based on the healthcare industry, our results also appear to apply to other industries such as the construction of platforms that enable customers to share reward points. Here, despite the benefits of such schemes for consumers, larger vendors (such as major airlines) refuse to allow consumers to transfer their reward points outside of their system.
22 http://www.cms.gov/EHRIncentivePrograms/Downloads/13 Electronic Exchange of Clinical Information.pdf.
Appendix A.

See Table A1.

Table A1
Repeating Table 8 for internal exchange.

                                    (1)             (2)             (3)             (4)             (5)             (6)
                                    PPO             No PPO          Above average   Below average   Speciality      Not speciality
                                                                    wage            wage
# Hospitals in system in HRR        0.024***        0.015**         0.023***        0.018***        0.026***        0.018***
                                    (0.013)         (0.018)         (0.016)         (0.014)         (0.020)         (0.013)
# Hospitals outside system in HRR   −0.001          −0.002*         −0.001          −0.001          −0.002***       −0.000
                                    (0.002)         (0.003)         (0.002)         (0.002)         (0.002)         (0.002)
Admissions (000)                    0.009***        0.006*          0.012***        0.006***        0.010***        0.006***
                                    (0.005)         (0.010)         (0.010)         (0.005)         (0.009)         (0.005)
Proportion Medicare inpatients      −0.001          −0.003***       −0.003***       −0.003***       −0.003***       −0.003***
                                    (0.003)         (0.003)         (0.002)         (0.002)         (0.002)         (0.002)
Proportion Medicaid inpatients      −0.002**        −0.004***       −0.003***       −0.005***       −0.005***       −0.005***
                                    (0.003)         (0.004)         (0.003)         (0.003)         (0.003)         (0.003)
No. doctors (000)                   −0.122          1.829***        0.811*          0.082           0.080           0.328
                                    (0.499)         (1.780)         (1.160)         (0.618)         (0.931)         (0.706)
Per capita payroll (000,000)        3.627***        1.241*          1.146           0.888           2.300*          1.031
                                    (3.291)         (2.218)         (5.989)         (2.251)         (3.441)         (2.140)
PPO                                                                 −0.046          −0.094**        −0.082          −0.092**
                                                                    (0.162)         (0.143)         (0.176)         (0.132)
HMO                                                                 0.062           0.127***        0.177***        0.080*
                                                                    (0.157)         (0.139)         (0.177)         (0.126)
Installation year controls          Yes             Yes             Yes             Yes             Yes             Yes
Vendor controls                     Yes             Yes             Yes             Yes             Yes             Yes
Observations                        1635            924             960             1582            975             1567
Log-likelihood                      −867.19         −483.16         −546.96         −801.69         −518.77         −830.23

Probit estimates. Dependent variable is whether the hospital exchanges electronic data internally within its system. Robust standard errors.
* p < 0.10. ** p < 0.05. *** p < 0.01.
References

Adler-Milstein, J., DesRoches, C.M., Jha, A.K., 2011. Health information exchange among US hospitals. American Journal of Managed Care (11), 761–768.
Adler-Milstein, J., Jha, A., 2012. Sharing clinical data electronically: a critical challenge for fixing the health care system. Journal of American Medical Association 307 (16), 1695–1696.
Augereau, A., Greenstein, S., Rysman, M., 2006. Coordination versus differentiation in a standards war: 56k modems. RAND Journal of Economics 37 (4), 887–909.
Blumenthal, D., Tavenner, M., 2010. The ‘meaningful use’ regulation for electronic health records. New England Journal of Medicine 363 (6), 501–504, PMID: 20647183.
Bower, A.G., 2005. The diffusion and value of healthcare information technology. RAND Monograph Report.
Brailer, D.J., 2005. Interoperability: the key to the future health care system. Health Affairs w5, 19–21.
Buntin, M.B., Jain, S.H., Blumenthal, D., 2010. Health information technology: laying the infrastructure for national health reform. Health Affairs 29 (6), 1214–1219.
Burgess, J.J., Carey, K., Young, G.J., 2005 March. The effect of network arrangements on hospital pricing behavior. Journal of Health Economics 24 (2), 391–405.
Calcanis, C., 2005. Milwaukee group develops EMR to reduce ER costs. Medical Informatics Insider November.
Chen, J., Doraszelski, U., Harrington, J.E., 2009. Avoiding market dominance: product compatibility in markets with network effects. RAND Journal of Economics 40 (3), 455–485.
Christensen, M.C., Remler, D., 2009. Information and communications technology in U.S. health care: Why is adoption so slow and is slower better? Journal of Health Politics, Policy and Law 34 (6), 1011–1034.
Ciliberto, F., 2006. Does organizational form affect investment decisions? Journal of Industrial Economics 54 (March (1)), 63–93.
Clark, C., 2009. Four health leaders weigh in on whether EMRs save money. Health Leaders Media November.
Coakley, M., 2010. Examination of health care cost trends and cost drivers: Pursuant to g.l. c. 118g, s6.5(b). Report for Annual Public Hearing, Office of MA Attorney General.
Delia, D., Cantor, J., 2009. Emergency department utilization and capacity. The Synthesis project. Research synthesis report (17).
Economides, N., 1996. The economics of networks. International Journal of Industrial Organization 14 (6), 673–699.
Eisenmann, T., Parker, G., Alstyne, M.W.V., 2006 October. Strategies for two-sided markets. Harvard Business Review.
Farrell, J., Klemperer, P., 2007. Coordination and lock-in: competition with switching costs and network effects. In: Handbook of Industrial Organization, Vol 3. North-Holland, pp. 1967–2072.
Farrell, J., Saloner, G., 1985. Standardization, compatibility, and innovation. RAND Journal of Economics 16, 70–83.
Frisse, M.E., Johnson, K.B., Nian, H., Davison, C.L., Gadd, C.S., Unertl, K.M., Turri, P.A., Chen, Q., 2012. The financial impact of health information exchange on emergency department care. Journal of the American Medical Informatics Association 19 (3), 328–333.
Gaynor, M., Vogt, W.B., 2000. Antitrust and competition in health care markets. In: Handbook of Health Economics. Elsevier, pp. 1405–1487.
Goodby, A.W., Olsen, L., McGinnis, M., 2010. Clinical data as the basic staple of health learning: creating and protecting a public good: workshop summary. In: IOM Roundtable on Evidence-Based Medicine (Series), Institute of Medicine, National Academies Press.
Gowrisankaran, G., Stavins, J., 2004. Network externalities and technology adoption: lessons from electronic payments. RAND Journal of Economics 35 (2), 260–276.
Ho, K., 2009. Insurer-provider networks in the medical care market. American Economic Review 99 (1), 393–430.
Japsen, B., 2005. Final Volleys Over Evanston Healthcare Merger. Chicago Tribune July.
Jha, A., 2010. Meaningful use of electronic health records: the road ahead. JAMA 304 (15), 1709–1710.
Katz, M.L., Shapiro, C., 1985. Network externalities, competition, and compatibility. American Economic Review 75 (3), 424–440.
Knox, R., 2009. Doctors don’t agree on letting patients see notes. NPR, September 21.
Madison, K., 2004. Multihospital system membership and patient treatments, expenditures, and outcomes. Health Services Research 39 (4p1), 749–770.
Manski, C.F., 1993. Identification of endogenous social effects: the reflection problem. Review of Economic Studies 60 (July (3)), 531–542.
Mata, F.J., Fuerst, W.L., Barney, J.B., 1995. Information technology and sustained competitive advantage: a resource-based analysis. MIS Quarterly 19 (4), 487–505.
McCullough, J., Parente, S., Town, R., 2011. The impact of health information technology on patient outcomes. Mimeo, Minnesota.
Melnick, G., Keeler, E., 2007. The effects of multi-hospital systems on hospital prices. Journal of Health Economics 26 (2), 400–413.
Miller, A.R., Tucker, C., 2009. Privacy protection and technology diffusion: the case of electronic medical records. Management Science 55 (7), 1077–1093.
Miller, A.R., Tucker, C., 2013. Active social media management: the case of health care. Information Systems Research 24 (1), 52–70.
Miller, A.R., Tucker, C.E., 2011a. Can health care information technology save babies? Journal of Political Economy 119 (April (2)), 289–324.
Miller, A.R., Tucker, C.E., 2011b. Encryption and the loss of patient data. Journal of Policy Analysis and Management 30 (3), 534–556.
Miller, A.R., Tucker, C.E., 2012 Forthcoming. Electronic discovery and the adoption of information technology. Journal of Law, Economics, and Organization.
Murray, E., Burns, J., May, C., Finch, T., O’Donnell, C., Wallace, P., Mair, F., 2011. Why is it difficult to implement e-health initiatives? A qualitative study. Implementation Science 6 (1), 6.
Prestigiacomo, J., 2012. Overcoming competition in private HIEs.
Shy, O., 2001. The Economics of Network Industries. Cambridge University Press.
Simcoe, T.S., Graham, S.J., Feldman, M., 2009. Competing on standards? Entrepreneurship, intellectual property and the platform paradox. Journal of Economics and Management Strategy 18 (fall (3)), 775–816.
Smith, M., Saunders, R., Stuckhardt, L., McGinnis, J.M. (Eds.), 2012. Best care at lower cost: the path to continuously learning health care in America. Committee on the Learning Health Care System in America; Institute of Medicine. National Academies Press.
Song, Y.I., 1995. Strategic alliances in the hospital industry: a fusion of institutional and resource dependence views. Academy of Management Journal, 271–275.
Town, R.J., Wholey, D., Feldman, R., Burns, L.R., 2007. Revisiting the relationship between managed care and hospital consolidation. Health Services Research 42 (1 Pt 1), 219–238.
Tucker, C., 2008. Identifying formal and informal influence in technology adoption with network externalities. Management Science 54 (12), 287–304.
Walker, J., Pan, E., Johnston, D., Adler-Milstein, J., Bates, D.W., Middleton, B., 2005. The value of health care information exchange and interoperability. Health Affairs, w5.10+.
Wolf, L., Harvell, J., Jha, A.K., 2012. Hospitals ineligible for federal meaningful-use incentives have dismally low rates of adoption of electronic health records. Health Affairs 31 (3), 505–513.

Removing financial barriers to organ and bone marrow donation: The effect of leave and tax legislation in the U.S.☆

Nicola Lacetera a, Mario Macis b,∗, Sarah S. Stith c

a University of Toronto, Canada
b Johns Hopkins University, United States
c University of New Mexico, United States

Many U.S. states have passed legislation providing leave to organ and bone marrow donors and/or tax benefits for live and deceased organ and bone marrow donations and to employers of donors. We exploit cross-state variation in the timing of such legislation to analyze its impact on organ donations by living and deceased persons, on measures of the quality of the transplants, and on the number of bone marrow donations. We find that these provisions do not have a significant impact on the quantity of organs donated. The leave laws, however, do have a positive impact on bone marrow donations, and the effect increases with the size of the population of beneficiaries and with the generosity of the legislative provisions. Our results suggest that this legislation works for moderately invasive procedures such as bone marrow donation, but these incentives may be too low for organ donation, which is riskier and more burdensome.

Article history:
Received 6 September 2012
Received in revised form 8 October 2013
Available online 28 October 2013

Keywords:
Organ donation
Bone marrow donation
Leave and tax legislation
Policy evaluation

☆ We thank seminar participants at the University of Michigan and at the 8th Annual International Industrial Organization Conference, as well as Thomas Buchmueller, Ricard Gil, Brian Krauth, Stephen Leider, Robert Slonim and Jeffrey Smith for their feedback. Christina Davis provided valuable editorial services. This work was supported in part by Health Resources and Services Administration contract 231-00-0115. The content is the responsibility of the authors alone and does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government. (http://optn.transplant.hrsa.gov/data/citing.asp). The data on kidney and liver transplants have been supplied by the United Network for Organ Sharing as the contractor for the Organ Procurement and Transplantation Network. The interpretation and reporting of these data are the responsibility of the authors and in no way should be seen as an official policy of or interpretation by the OPTN or the U.S. Government. (http://optn.transplant.hrsa.gov/shareddownloadables/data use agreement.pdf).
∗ Corresponding author. Johns Hopkins Carey Business School, 100 International Dr., Baltimore MD 21202 USA. Tel.: +1 410 234 9431; Fax: +1 410 234 9439.
E-mail addresses: nicola.lacetera@utoronto.ca (N. Lacetera), mmacis@jhu.edu (M. Macis), ssstith@unm.edu (S.S. Stith).

1. Introduction

In virtually every country, the demand for organs and bone marrow far exceeds supply, leaving many patients to spend years and even die waiting for donated organs or bone marrow. In the U.S., for instance, where more than 100,000 people are on the waiting list for an organ transplant, the median wait-time until death or transplantation was 276 days for a liver and 547 days for a kidney in 2005, and only slightly over 60% of individuals waitlisted ever received an organ. Approximately 300 individuals die each year because they cannot find a matching bone marrow donor.1 In addition to the implications for transplant candidates, a kidney transplant also saves at least $90,000 over the life of the individual relative to on-going dialysis treatment (Matas and Schnitzler, 2003).

1 These statistics are taken from the Organ Procurement and Transplantation Network (OPTN) for organs, and from Bergstrom et al. (2009) (based on National Marrow Donor Program (NMDP) data) for bone marrow.
2 Byrne and Thompson (2001) identify additional mechanisms that could lead to perverse effects of incentives, including time-inconsistency issues related to rewards for donor registration.

This supply shortage and the associated costs, as well as the loss in quality of life and even life itself, drive the ongoing debate as to whether donors should receive some form of compensation in order to increase organ and bone marrow donation. Concerns about exploitation of the poor and sick, adverse selection, motivational crowding out, and a general “repugnance” toward the commercialization of body parts have influenced the debate heavily (Frey, 1993; Roth, 2007; Titmuss, 1971).2 Some scholars and policymakers oppose any form of explicit reward for organ donors whereas others advocate direct monetary payments. Becker and Elias (2007), for instance, estimate that amounts of $15,000 and $38,000 would enable markets for kidneys and livers, respectively,
44 N. Lacetera et al. / Journal of Health Economics 33 (2014) 43–56
to clear.3 Other studies indicate that some form of non-cash, in-kind rewards could help reduce the supply shortage of human organs and tissue (Frey, 1993; Lacetera and Macis, 2010, 2013; Leider and Roth, 2010; Rodrigue et al., 2009; Roth et al., 2004). Singapore and Israel have instituted priority rules for organ donors to reward those with signed donor cards and their families (Biyar, 2011; Duenwald and Shipley, 2011).4 Israel also allows direct payments to donor families to “memorialize” post-cadaveric donation, as well as more direct forms of compensation such as an exemption from health insurance premiums for living donors (Levush, 2010; Satel, 2010). Several European countries make organ donation an opt-out rather than an opt-in decision; according to some observers, this has led to significantly higher organ donation rates (Abadie and Gay, 2006).

In this paper, we present a comprehensive study of the impact of legislative efforts in the U.S. to mitigate the organ and bone marrow supply shortage by removing disincentives to donation without offering direct compensation, primarily through work leave legislation and tax credits and deductions. Specifically, between 1989 and 2009, a number of U.S. states passed legislation that grants paid or unpaid leave to state government employees, paid and unpaid leave to private firm employees, tax deductions to individuals who donate their organs or bone marrow, and tax credits to employers for promoting donation.

We quantify the effects of these types of legislation on both organ and bone marrow donations. For organ donations, we focus on the two most commonly donated organs, livers and kidneys, which account for over 80% of all organs donated and almost 100% of donations from live donors. In addition, the gap between supply and demand is much larger for kidneys and livers relative to hearts, lungs, pancreas, and intestines.5 Even though the tax and leave legislation applies primarily to living donors, we assess whether cadaveric donations are affected as well. We do so for three main reasons: first, if donations from living and deceased donors follow a common time trend, the donations by deceased donors might be used as a “control” group. Second, these types of donations might actually be substitutes (Fernandez et al., 2013). Moreover, the introduction of the legislation might be accompanied by a spread of information on organ donation in general, which might induce more individuals to become registered organ donors. If so, legislation targeting one type of donors might have effects on the other type as well. We also study whether these laws affect organ quality, as measured by the state-level post-transplant graft survival rate. For bone marrow donations, we explore potential differential effects by donation method: aspiration or apheresis.6

A priori, if the incentives implied by the legislation were to have a positive impact on donation, we should expect these laws to influence bone marrow donors more than organ donors. Bone marrow donation has a much lower risk of complications and death than does organ donation, and is much less burdensome to the donor in terms of recovery time, pain, and suffering. Also, bone marrow regenerates and can be donated multiple times, whereas kidneys never re-grow. In the case of livers, which also regenerate, no cases of multiple donations are documented in the literature or the data and any prior hepatobiliary surgery complicates future transplant surgery should the donor ever need a liver transplant (Maddrey and Van Thiel, 1988). In other words, bone marrow donation is less costly for a donor; therefore, at the margin, moderate incentives should tip the trade-off toward deciding to donate in the case of bone marrow more than they do for organs. For similar reasons, we differentiate between livers and kidneys within organ donation and, within bone marrow, aspiration and apheresis. The risk of complications or death and the recovery period are greatest for liver donation and lowest for apheresis donations (Confer et al., 2003; Karanes et al., 2003; Muzaale et al., 2011; Segev et al., 2010).

Our empirical strategy exploits the fact that different states have introduced legislation at different points in time. We take advantage of the longitudinal nature of our dataset to allay potentially important selection, endogeneity, and omitted variable concerns. In our regression models, we include state fixed effects as well as state-specific time trends to ensure that we are controlling for omitted time-invariant factors and for selection into adopting the legislation based on the level and growth rate of the outcome variables (Ashenfelter and Card, 1985; Heckman and Hotz, 1989). To probe the validity of our identification strategy, we assess whether pre-existing trends in the demand for organs predict the adoption of legislation.

Our results indicate that the legislation had no overall effect on the number of organ donations. This result is robust across a variety of specifications and sub-samples, and holds also when we allow the legislation to affect outcomes with one- or two-year lags. In contrast, and consistent with our prior, we do find a positive effect of leave legislation on bone marrow donation. Specifically, we find a positive effect of leave legislation for state employees on bone marrow donations, provided that a sufficient share of a state’s labor force is state-employed (i.e., when the size of the population actually affected by the law is larger). Our estimated coefficients imply that leave legislation for state employees has a positive effect on bone marrow donations if state employees represent about 4% of the labor force, which is the case for 30 states, of which 18 passed laws granting leave to state employees. To probe our interpretation that the size of the incentive matters, we check whether the effect of the leave legislation increases with its generosity, and we find that the positive effect of the leave legislation for bone marrow donations is stronger when the leave is longer and when it is paid rather than unpaid.

3 The only country without an organ supply shortage is Iran, where the sale of kidneys has been legal since 1988. The Iranian government pays $1200 and provides health insurance for one year to cover surgery-related conditions. In addition, the vendor receives between about $2300 and $4500 either from the recipient or one of several designated charitable organizations (Hippen, 2008). “Gray” market prices for kidneys posted by websites offering to coordinate procurement and transplantation internationally (a.k.a. “transplant tourism”) range from $14,000 to $85,000 (Shimazono, 2007).
4 New U.S. policies do provide priority for prior living donors on the kidney transplant waitlist, effective 05/24/2012. http://optn.transplant.hrsa.gov/PoliciesandBylaws2/policies/pdfs/policy 172.pdf.
5 The Organ Procurement and Transplantation Network. http://optn.transplant.hrsa.gov/latestData/advancedData.asp. Accessed 07/04/2012.
6 In apheresis, a prospective donor undergoes five days of drug injections to stimulate the production of specialized blood cells, which are then filtered out of the donor’s blood over the course of several hours, much as in plasma donation. The alternative method of aspiration requires removal of actual bone marrow from the hip of the donor, a more painful and risky procedure than apheresis.

We have also considered the effects of the legislation on organ quality, as proxied by six-month and three-year post-transplant graft survival rates, to test if the legislation is causing shifts in the underlying quality distribution of organs used for transplantation even in the absence of changes in the overall number of transplants. This could happen if the legislation has opposing effects on different types of living donors. For example, the laws might lead to increased donation among the less intrinsically motivated donors and decreased donation among the more intrinsically motivated. If the latter types of donors are of higher “quality” (Titmuss, 1971), the incentives implied by the legislation may lead to a shift in the overall quality of organs donated. We find an overall positive effect for three-year survival from leave for state employees and tax credits for employers, although the estimated coefficients are only marginally statistically significant. We also found a negative effect on three-year survival of recipients of organs from living
donors and a positive effect on the same group of tax credits to employers. These results suggest that the laws may affect the distribution of organs donated or the distribution of organs used even in the absence of an overall effect on the quantity of organs, but there is no systematic evidence of adverse selection induced by the legislation.

This study contributes to a growing literature in economics on organ and bone marrow donation. Two papers have looked at the effects of a variety of traffic safety laws on cadaveric organ donation, such as motorcycle helmet laws, primary seat belt enforcement laws, and speed limits (Dickert-Conlin et al., 2012; Fernandez et al., 2013). Kessler and Roth (2012, 2013a) studied the effect of priority rules both theoretically and experimentally, and Lavee et al. (2012) found positive effects of the priority policy in Israel. Kessler and Roth (2013b) found that providing individuals with multiple opportunities to declare their organ donor registration status leads to more individuals joining registries. A number of studies (e.g., Roth et al., 2004, 2005a, 2005b) have analyzed the use of kidney exchanges, which cross-match incompatible donor and recipient pairs to create compatible donor-recipient pairs. Bagozzi et al. (2001) documented differences in bone marrow donation across different cultures, and Bergstrom et al. (2009, 2012) analyzed the optimal size and racial composition of bone marrow registries. Although the effects of non-cash legislated incentives on many types of pro-social behavior, including health-related activities (e.g., blood donation; see Lacetera et al., 2013a), have now been studied extensively, the literature has just begun to study the effects of such legislation on the much more “costly” pro-social behavior involved in organ and bone marrow donation.

The overall scarcity of organ and bone marrow donors and the difficulty in matching between donors and recipients make “natural experiments” such as the ones exploited in this paper particularly important for determining how well such legislation performs in solving the severe organ and bone marrow shortage problem existing in the U.S. A few closely related studies exist. Venkataramani et al. (2012) examine the effect of tax deductions for individuals on living organ donation. Bilgel (2011) and Wellington and Sayre (2011) study tax deductions for individuals and leave legislation for state employees, but consider only organs and not bone marrow (Bilgel) and only kidneys (Wellington and Sayre). In our paper, we present a comprehensive analysis including kidneys and livers, and all types of legislation (leave for state employees, leave for private employees, tax deductions to individuals and tax credits to employers for donation-promoting activities). Boulware et al. (2008) include all four types of legislation, but only consider kidney donations, and do not control for state-specific time-invariant factors, factors that vary over time but are common to all states, or pre-existing trends in state-level donation rates. Our paper employs a more stringent and complete empirical specification compared to these earlier papers. We also explore whether these laws affected the quality of organs donated. Perhaps most importantly, our paper is the first to examine the effects of these policies on bone marrow donations, for which theoretical considerations lead us to anticipate stronger effects. In fact, as in our study, these other papers document no effect of the laws examined on organ donation. In addition to being relevant for policy and health considerations, the positive effect we find on bone marrow donation supports an “incentive size” explanation for the zero result on organs, namely that the incentives may be too low for more “costly” donations but may work for less invasive procedures such as bone marrow donation. Our finding that the effect increases with the generosity of the leave legislation further corroborates this interpretation. This is consistent with the positive effect found by Lacetera and Macis (2013) of paid leave on an even less invasive procedure, blood donation.7 In all, our empirical identification strategy, as well as the focus on both organs and marrow and on both quantity and quality, allow us to make significant progress in understanding how the existing legislation affects the supply of organs and marrow, and more generally whether financial incentives might be a viable and effective option to increase supply.

In Section 2 we offer background information on the history of organ and bone marrow donation and the associated legislation. In Section 3 we describe the data, and in Section 4 we present and discuss our empirical strategy. We report the results in Section 5, and in Section 6 we discuss some implications of our results and conclude.

2. Organ and bone marrow donation and associated legislation

2.1. Background

In 1954, the first successful living donor kidney transplant was performed, followed by the first successful cadaveric donor kidney transplant in 1962. The first successful cadaveric liver transplant occurred in 1967 between identical twins. Bone marrow was first successfully transplanted in 1973. A living donor was not successfully used in liver transplantation until 1989.8

Long-term dialysis treatment became available in 1960, greatly extending the life expectancy of individuals with renal failure and with that, the demand for kidney transplants. The Food and Drug Administration’s approval of the immunosuppressant cyclosporine in 1983 transformed organ transplantation from a high-risk experimental procedure with almost certain organ rejection to a common treatment for organ failure. In the late 1990s, laparoscopic surgery greatly reduced the pain and recovery time for live kidney donors.9

7 Through the UNOS data request process rather than the OPTN’s online data tool, we were able to obtain transplant-level data, which we aggregated at the state level. With the exception of Boulware et al. (2008) the prior studies cited above use data aggregated at the donation service area level, not the state level. Although donation service areas usually align with state boundaries, 32 of the 58 donation service areas in the U.S. include counties from more than one state.
8 We focus here on the two human organs and one type of tissue included in this study: kidneys, livers, and bone marrow. Lungs, hearts, and intestines can be donated by living donors, but this occurs extremely rarely, so we drop these from our sample. (Deceased donors are required for heart transplants except for the case of domino transplants [i.e., a heart-lung recipient donates his or her heart to another recipient]).
9 http://www.organtransplants.org/understanding/history/index.html.
10 Prior to 1990, Medicare covered liver transplants on a case-by-case basis. In 1990, this coverage was expanded to all end-stage liver disease patients except those with Hepatitis B or liver cancer. In 1996, Hepatitis B became covered, and in 2001 Medicare began covering hepatocellular carcinoma, but other forms of cancer remain uncovered.

As science and technological progress developed ever more effective transplantation methods, policymakers sought, in a variety of ways, to facilitate the use of these new methods and influence the exchange of organs. Medicare has paid for dialysis since 1972, kidney transplants since 1978, and liver transplants since 1990.10 As for bone marrow, Medicare began coverage in 1978 and expanded it in 1985 and in 2010. Federal law increased the supply of deceased donor organs by extending the definition of “death” to include “brain death” in 1981. In 1984, the National Organ Transplant Act banned the sale of organs and bone marrow, and the Organ Procurement and Transplantation Network (OPTN) was established to promote organ donation, facilitate the allocation of organs, and serve as a central repository of organ donation-related data. OPTN’s bone marrow counterpart, the National Marrow Donation Program
(NMDP), was established two years later in 1986. In 1994, a federal law was passed to provide leave of absence for bone marrow (5 days) and organ (30 days) donation by federal employees. Federal law also started requiring hospitals to notify organ procurement organizations of all eligible deaths in 1998, so that all might have the opportunity to donate.

Although organs from deceased donors are still the main source for transplants, this supply is inherently limited by the death rate, the type of deaths, and the decomposition process. Only a small percentage of deaths yield viable organs, and although improvements in storage and transportation of organs have occurred, kidneys are only viable for up to 24 h and livers for up to 12 h without a living blood supply (Institute of Medicine, 1999). Despite over two million deaths in the U.S. in 2009, eligible deaths documented by UNOS totaled only 9827.¹¹ These deaths yielded organs used in 11,285 kidney transplants and 6098 liver transplants, supplemented with 6387 live kidney donations and 219 live liver donations. Meanwhile, 33,663 people joined the kidney waitlist and 10,706 joined the liver waitlist in 2009. Bone marrow donations must come from living donors exclusively, but individual donors can donate more than once, making the bone marrow supply less inherently limited than the supply of donor organs.

Donating an organ or bone marrow imposes financial costs, at least some pain, and immediate and future health risks. Even though payers of organ and bone marrow transplants also pay the costs of recovery, both types of donors face costs in terms of time away from work, travel, and lodging. Prohibited by law from paying direct compensation to donors, states have attempted to address the organ shortfall by offsetting the incidental costs associated with donation and protecting employees from employer retaliation for work absences related to donation. The health risk remains: 3.1 deaths per 10,000 donors for kidney donors and as high as 17 per 10,000 for liver donors (Muzaale et al., 2011; Segev et al., 2010). In addition, donors may experience non-fatal complications including pain, infection, and hemorrhaging.¹²

2.2. Leave and tax legislation

States have attempted to address the organ and bone marrow shortfalls through a variety of methods that diminish the financial barriers to donation: leave for state employees, leave for private employees, tax credits for employers, and tax deductions for individuals. In general, laws granting leave offer up to 30 days for organ donation and up to one week for bone marrow donation. Tax deductions for individuals cover non-medical donation-related expenses up to a maximum of $10,000. Tax credits for employers cover donation-promoting activities, including the provision of paid leave to donors. Table 1A (organs) and Table 1B (bone marrow) list the dates of passage for each type of legislation by state (details and legal references by state are listed in Table A1 in the online Appendix: Lacetera et al., 2013b). In 1989, Colorado passed the first relevant legislation providing state employees leave for the donation of an organ or bone marrow. Since then, most states have implemented legislation removing financial disincentives to donation through leave or tax legislation. For organ donation, thirty-one states offer leave for state employees, seven states offer leave for private employees, sixteen states give tax deductions to individuals and three states provide tax credits to employers. For bone marrow donations, thirty-three states offer leave for state employees, eleven states offer leave for private employees, fifteen states give tax incentives to individuals and four states give employers tax credits for donation-promoting activities (further details are in the online Appendix).

3. Data and descriptive statistics

3.1. Organs

Patient-level data on kidney and liver transplants come from the Organ Procurement and Transplantation Network (OPTN). From a total of 358,378 individual-level transplants, we obtained 1050 state-year level observations with 50 observations per year from 1988, when OPTN began collecting data, through 2008.¹³ Note that our data count transplants: we only count the organs actually used for transplantation, not organs recovered or donors consenting. However, because our main focus is on living donors, the number of organs recovered and the number of organs transplanted are the same.

Fig. 1A and B describe how the volume and composition of organ donations have changed over time. Kidneys are the most common organ transplanted, followed by livers. Together these account for most of the organs transplanted in the U.S. and over 99% of all living donor organs. Kidney, liver and bone marrow donations generally increased until the late 2000s. This pattern, however, is much stronger for kidneys vis-à-vis livers and bone marrow. An overall upward trend in donations by both cadaveric and living donors exists until the late 2000s. Cadaveric donations far exceed those from living donors and underscore how infrequently compatible living donors come forward, although every human is born with two kidneys and a liver that can lose up to 70% of its size and still re-grow.

The highly detailed organ data include many variables associated with the medical procedure and demographics of both the transplant recipient and the donor. Table 2 reports summary statistics. On average, cadaveric donations dominate living donations with kidneys being the more common organ donated relative to livers. Males donate more organs overall, but this is due to more males donating cadaveric organs, while females dominate living donation. A variety of explanations could exist: men are more likely to die in accidents and tend to die earlier. Women may have more flexible work lives and might simply be more altruistic regarding live donation or more cautious about opting in to cadaveric organ donation. For livers, almost 94% of transplant recipients receive a cadaveric donor organ, while for kidneys about 30% receive a living donor organ. These differences suggest that we should break down our main analyses by both gender and organ. We also have information on the donor-recipient relationship. Seventy-two percent of live donors are biologically related to the transplant recipient. This likely arises in part due to donor-recipient compatibility issues, which mean family members are more likely to match the recipient’s blood type and other match factors than a random person from the general population. Approximately 18% of living donors are not spouses and are biologically unrelated.

¹¹ This may be a lower bound since UNOS does not track information on all deaths. Gortmaker et al. (1996) estimated an eligible donor pool of 13,700 per year based on a study of 69 acute care hospitals in the U.S.
¹² www.transplantliving.org/livingdonation/outcomes/risks.aspx, accessed 03/26/12.
¹³ For the regressions that analyzed organ quality changes, non-reporting of covariates leads to a reduction in observations. In the quality regressions using six-month and three-year survival rates as the outcome variable, the panel is shortened to give all recipients sufficient follow-up time.
Table 1A
State laws for organ donors. This table reports the year of introduction of the different laws analyzed in this study in the states where at least one of these laws was enacted,
with reference to organ donations. The following states have none of the laws: AL, AZ, FL, KY, MI, MT, NE, NV, NH, NJ, SD, TN, VT, WY.
State Leave for state employees Leave for private employees Tax deduction for individuals* Tax credit for employers
Alaska 2008
Arkansas 2003 2005 2005 2005
California 2002
Colorado 1989
Connecticut 2007 2004
Delaware 2001
Georgia 2002 2004
Hawaii 2005
Idaho 2006 2006
Illinois 2002 2005
Indiana 2002
Iowa 2003 2005
Kansas 2001
Louisiana 2005
Maine 2002 2002
Maryland 2000
Massachusetts 2005
Minnesota 2006 2005
Mississippi 2004 2004 2006
Missouri 2001
New Mexico 2007 2005
New York 2001 2006
North Carolina
North Dakota 2005 2005
Ohio 2001 2007
Oklahoma 2002 2008
Oregon 1991
Pennsylvania 2006 2006
Rhode Island 2009
South Carolina 2002 2006
Texas 2003
Utah 2002 2005
Virginia 2001 2007
Washington 2002
West Virginia 2005
Wisconsin 2000 2004
*Idaho has an individual tax credit rather than tax deduction.
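The adoption years in Table 1A can be turned into state-year law indicators of the kind used in the regressions. The sketch below is illustrative only: it assumes a law counts as "in place" from its listed year onward (the coding of the adoption year itself is an assumption), and the names `law_indicator` and `leave_state_adoption` are hypothetical. Only Arkansas and Colorado are used for the state-employee leave column, with Alabama as an example of a state listed as having none of the laws.

```python
def law_indicator(adoption_year, year):
    """Return 1 if the law is in place in `year`, else 0.

    Assumption: a law is treated as in place from its adoption year onward;
    `adoption_year=None` means the state never adopted it.
    """
    return int(adoption_year is not None and year >= adoption_year)


# Leave for state employees, organ donation (years as listed in Table 1A).
leave_state_adoption = {"Arkansas": 2003, "Colorado": 1989, "Alabama": None}

years = range(1988, 2009)  # 1988 through 2008, the organ sample period

# One indicator per state-year cell, keyed by (state, year).
panel = {
    (state, year): law_indicator(adopted, year)
    for state, adopted in leave_state_adoption.items()
    for year in years
}
```

Keying the panel by `(state, year)` mirrors the state-year unit of observation; a state that never adopts contributes zeros in every year.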
3.2. Bone marrow

Bone marrow donation data were obtained from the National Marrow Donor Program (NMDP), the largest bone marrow registry in the U.S. The 14,463 transplants in our data do not include approximately 30% of the total donations recorded by NMDP. These excluded donations were made by members of the military and by donors for whom no state of residence was recorded. Table 3 contains descriptive statistics for bone marrow donations. Males and females donate at roughly equal rates. Apheresis is a less common type of donation, largely due to its later uptake. Our data document no apheresis donations prior to 1999. In 2009, 0.7 apheresis donations occurred along with 0.3 aspiration donations, per one million individuals.

3.3. Legislation

The legislative data are compiled from donor program websites (www.optn.org and www.ncls.org), state government websites, and searches of state laws via Nexis®. We categorize the leave incentives into leave for employees of the state government (“state employees”) and leave for private sector employees (“private employees”). Taxes fall into two categories: individual tax deductions of up to $10,000 or employer tax credits for donation-related expenses including promotional activities and paid leave for donation.¹⁴

4. Empirical strategy

Our empirical strategy exploits the fact that different states introduced legislation in different years; we use variation both across and within states over time to identify the effect of the legislation on a series of outcomes of interest. We estimate a linear model as follows:¹⁵

$$Y_{kt} = \mathrm{LEAVE}_{kt}\,\delta_{leave} + \mathrm{TAX}_{kt}\,\delta_{tax} + X_{kt}\beta + \mu_k + \tau_t + \gamma_k t + \varepsilon_{kt} \qquad (1)$$

In Eq. (1), $Y_{kt}$ is the outcome variable in state $k$ in year $t$, and we consider three main outcome variables: the number of organ donors standardized by one million population, the number of bone marrow donations per one million population, and post-organ transplant survival rates.¹⁶ $\mathrm{LEAVE}_{kt}$ and $\mathrm{TAX}_{kt}$ are indicators for

¹⁴ Virginia does not set a maximum and Idaho allows for a tax credit of up to $5,000.
¹⁵ In the online Appendix (Table A6 for organs and A9 for bone marrow) we also report the estimates from different specifications: a model where the outcome variable is the number of donations (count variable); a log-linear model where the outcome variables are entered in natural logarithm (plus 1); and a fixed effect Poisson model (considering the discrete, non-negative nature of our outcome variables). The results are very similar to those from the models reported here. Online Appendix Figures A1 and A2 offer comparisons of specifications.
¹⁶ We use the state of the donor and not of the transplant, i.e., our outcome is the number of donations that resulted in transplants per million individuals in a state regardless of whether any of those transplants occurred in that state. We do so because the legislation benefits residents of the state, but does not specify that an individual must donate in the state where he/she lives in order to enjoy the benefits. Using the state where the transplant took place would have been problematic because a given transplant center might perform transplants of organs originating in different states potentially with different legislation, which would have
Table 1B
State laws for bone marrow donors. This table reports the year of introduction of the different laws analyzed in this study in the states where at least one of these laws was
enacted, with reference to bone marrow donations. The following states have none of the laws: AL, AZ, FL, KY, ME, MI, MT, NV, NH, NJ, NC, SD, TN, VT, WY.
State Leave for state employees Leave for private employees Tax deduction for individualsᵃ Tax credit for employers
Alaska 2008
Arkansas 2003 2005 2005 2005
California 2002
Colorado 1989
Connecticut 2004 2004
Delaware 2001
Georgia 2002 2004
Hawaii 2005
Idaho 2006 2006
Illinois 2002 2005
Indiana 2002
Iowa 2003 2005
Kansas 2001
Louisiana 1992 1992 1992
Maryland 2000
Massachusetts 2005
Minnesota 1990 2004 2005
Mississippi 2004 2004 2006
Missouri 2001
Nebraska 1992 1992
New Mexico 2007 2005
New York 2001 2007 2006
North Dakota 2005 2005
Ohio 2001 2007
Oklahoma 2002 2008
Oregon 1991 2002 1991
Pennsylvania 2006 2006
Rhode Island 2009
South Carolina 2002 2002
Texas 2003
Utah 2002 2005
Virginia 2001 1997
Washington 2002
West Virginia 2005
Wisconsin 2000 2004
ᵃ Idaho has an individual tax credit rather than tax deduction.
Table 2
Descriptive statistics – Organ transplants per one million population. Data are from OPTN and cover the period from 1/1/1988 through 12/31/2008.
Variable State-year observations Mean Standard deviation Minimum Maximum
Total 1050 66.5 19.2 13.0 138.4

Live 1050 16.7 8.4 0.0 59.2
Cadaveric 1050 49.8 14.9 12.0 116.2
Male 1050 37.3 11.6 0.0 88.1
Female 1050 29.1 10.0 0.0 76.9
Livers 1050 16.0 6.1 0.0 45.1
Kidneys 1050 46.5 12.4 13.0 99.4
Live – male 1050 7.1 3.8 0.0 32.7
Dead – male 1050 30.3 10.1 0.0 78.1
Live – female 1050 9.6 5.1 0.0 36.1
Dead – female 1050 19.5 7.1 0.0 55.1
Live – Livers 1050 0.6 0.9 0.0 6.1
Dead – Livers 1050 15.4 5.9 0.0 44.4
Live – Kidneys 1050 16.1 7.9 0.0 59.2
Dead – Kidneys 1050 30.4 8.3 10.9 70.1
Table 3
Descriptive statistics – Bone marrow. The data for this table come from the NMDP and cover the period from 1/1/1987 through 12/31/2009.
Variable Observations Mean Standard deviation Minimum Maximum
Bone marrow transplants per 1 m population

Total 1150 3.6 3.3 0.0 21.3
Female 1150 1.6 1.6 0.0 13.7
Male 1150 2.0 2.0 0.0 15.2
Apheresis versus aspiration per 1 m population

Apheresis 1029 2.3 3.3 0.0 22.7
Aspiration 1029 3.7 2.9 0.0 33.0
contaminated the results. Moreover, not all states have a transplant center; for
example, Alaska and Idaho have laws encouraging organ donation but do not have a
transplant center. We thank an anonymous referee for prompting us to clarify this
important point.
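The aggregation described in this footnote, counting donations by the donor's state of residence rather than the state where the transplant took place, can be sketched as follows. This is a minimal illustration under stated assumptions: the record fields (`donor_state`, `transplant_state`, `year`), the function name, and all numbers are made up for the example and do not come from the OPTN or NMDP data.

```python
from collections import Counter


def donations_per_million(records, population):
    """Aggregate transplant-level records to donations per one million
    residents of the DONOR's state; the transplant state is ignored.

    `population` maps (state, year) to resident population. State-years
    with no recorded donations get a rate of 0.0.
    """
    counts = Counter((r["donor_state"], r["year"]) for r in records)
    return {
        key: 1_000_000 * counts.get(key, 0) / pop
        for key, pop in population.items()
    }


# Hypothetical records: two Alaska donors whose organs were transplanted
# in Washington, and one Washington donor.
records = [
    {"donor_state": "AK", "transplant_state": "WA", "year": 2008},
    {"donor_state": "AK", "transplant_state": "WA", "year": 2008},
    {"donor_state": "WA", "transplant_state": "WA", "year": 2008},
]
# Hypothetical populations, chosen only to keep the arithmetic simple.
population = {("AK", 2008): 500_000, ("WA", 2008): 5_000_000, ("ID", 2008): 1_000_000}

rates = donations_per_million(records, population)
```

Because the key is the donor's state, a state without a transplant center can still have a positive donation rate, which is the case the footnote highlights.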
Fig. 1. Donations by year. A: Total donations by type of donation. B: Total donations by type of donor.
whether state $k$ had leave legislation or tax legislation in place, respectively, in year $t$. More specifically, we include indicators for whether a state has leave provisions for state employees, leave provisions for private employees, tax deductions for individuals, and/or tax credits for employers. The vector of controls $X_{kt}$ includes state-level income per capita and the unemployment rate, which could affect the availability and accessibility of transplant surgery. In the organ quality regressions, we also include donor, match, and patient characteristics to control for the differences in the types of donors, matches, and patients across states. Year fixed effects ($\tau_t$) account for aggregate factors that might affect the outcome variables, including nation-wide policy changes as well as secular trends in attitudes toward organ and marrow donation. $\mu_k$ are state fixed effects, $\gamma_k t$ are state-specific time trends, and $\varepsilon_{kt}$ is an error term. The main coefficients of interest, $\delta_{leave}$ and $\delta_{tax}$, measure the within-state effect of passing a given type of legislation on the number of donations per million population, controlling for factors affecting all states in a given year, and state-specific fixed effects and time trends. In all regressions, standard errors are clustered at the state level to account for serial or other forms of correlation in the state law indicators (Bertrand et al., 2004).

Not all states have introduced such laws and different states have introduced the legislation at different times. This raises the question of whether adoption of these laws and the timing of adoption are correlated with our outcomes of interest. For instance, states with systematically higher levels of organ donations per capita might be more likely to introduce the legislation, perhaps due to greater familiarity with donation in the population or the transplant community’s outreach efforts. In that case, a positive coefficient on the tax indicators might simply reflect this underlying heterogeneity rather than an effect of the law. The opposite is also possible; in other words, states with lower levels of organ or bone marrow donations per capita may be more likely to adopt the legislation in response to a shortage of organs. That case would bias our coefficient estimates downward. The inclusion of state fixed effects mitigates the bias that would occur if states adopted legislation based on the level of the outcome variable. Further, we include state-specific time trends to account for the possibility that states with systematically lower or higher growth rates of the outcome variable might be more likely to adopt the legislation.

To probe our identification strategy, we checked whether the passage of the legislation in a given state-year correlates with the lagged (one year) cumulative number of waitlist candidates per one million population; the lagged yearly number of waitlist candidates per one million population; the lagged cumulative number of individuals who ever left the waitlist dead or too sick for a transplant and the lagged number of waitlist candidates who died or were too sick for transplant, per one million population. We use these variables to proxy for lagged values of cumulative demand, current demand, cumulative excess demand, and current excess demand. We regress a dummy variable equal to one if a law was passed in the current year on lagged values (from the prior year) of these
Table 4
Probability of legislation passing. This table reports linear probability regression estimates of the probability of a given law passing. The data include the full unbalanced
panel. State-level controls include unemployment rate and income per capita. All regressions include state and year fixed effects and state time trends. State-level clustered
standard errors are in parentheses. ***p < .01, **p < .05, * p < .1.
Outcome = probability of law passing
(1) (2) (3) (4)

Leave for state Leave for private Tax deductions for Tax credits for
employees employees individuals employers
Cumulative waitlist per 1 m population (prior year) −0.014 0.006 −0.003 −0.004
(0.027) (0.001) (0.019) (0.009)
Observations 1000 1000 1000 1000

R-squared 0.12 0.12 0.15 0.14
Cumulative waitlist per 1 m population minus cumulative 0.068 −0.031 0.086* 0.009
removals from waitlist per 1 m population since 1988 (prior (0.092) (0.035) (0.051) (0.030)
year)
Observations 1000 1000 1000 1000

R-squared 0.12 0.12 0.15 0.14
Yearly number of candidates dying on the waitlist or deteriorating −0.189 0.086 0.282 0.019
until too sick for transplant per 1 m population (prior year) (0.335) (0.148) (0.199) (0.065)
Observations 1000 1000 1000 1000

R-squared 0.12 0.12 0.15 0.14
Cumulative candidates dying on waitlist −0.163 0.028 −0.0252 −0.0053

or deteriorating until too sick for transplant per 1 m population (0.127) (0.020) (0.079) (0.034)
(prior year)
Observations 1000 1000 1000 1000

R-squared 0.12 0.12 0.15 0.14
waitlist variables, and report the results in Table 4. The estimated coefficients are generally small and not statistically significant, indicating that prior values of these variables did not generally have any discernible effect on whether a state passed a law. (Specifications where we used two- or three-year lags yield similar results.) These results, together with the inclusion of year and state effects and state-level time trends in the regressions, support the validity of our identification strategy.

5. Results

5.1. Organ donations

We first estimate model (1) using total organ transplants per one million individuals in a given state and year as the dependent variable in columns [1] through [5] of Table 5, and living and cadaveric donor transplants separately in columns [6] and [7]. Although the tax and leave legislation generally target living donors, we consider donations from both living and deceased donors for two main reasons. First, one could postulate that donations from living and deceased donors follow a common time trend, and precisely because the legislation targets living donations, the donations by deceased donors might be seen as a benchmark, or “control” group. The second reason to study the effects on both living and deceased donations is that these types of donations might actually be substitutes, in which case donations from deceased donors could not be used as a control because under this hypothesis they would be affected by the legislation. For example, the waitlist only applies to potential recipients of cadaveric organs. If more living organs are donated, these individuals will drop from the waitlist, which might enable the donor-recipient matching process to be more selective as to which cadaveric organs are used in transplantation.¹⁷ Additionally, the employer tax credits could (in some states) apply to expenses incurred in promoting cadaveric donation as well as living donation. For these reasons, it is informative to study the effect of the legislation on both types of donations separately.

The results shown in column [1], which do not control for state or year effects, indicate a positive correlation between the number of transplants and the existence of leave for state employees and leave for private sector employees; however, the coefficient estimates dramatically drop in magnitude and statistical significance when year and state fixed effects are introduced in the specification. In column [5] we also add state-specific linear time trends, and again all of the coefficients of interest are estimated to be small and not statistically significant. These results underscore the importance of accounting for state-level heterogeneity and aggregate time effects. Specifically, it appears that the positive estimated coefficients on the legislation indicators in column [1] are reflecting a general trend of increasing donations over time. Breaking down the analysis by live and cadaveric donations in columns [6] and [7] does not change the main results.¹⁸

¹⁷ Organs vary in quality, so without an increase in the number of transplants, an increase in the number of living donors should lead to a smaller group of individuals accessing the same pool of cadaveric donor organs, thus allowing only the highest quality organs to be selected. Donor factors correlated with lower survival outcomes after liver transplantation include donors over 40 years old, donation after cardiac death rather than brain death, partial rather than whole liver grafts, African-American race, shorter donors, and cerebrovascular causes of death (Feng et al., 2006). For kidneys, donor characteristics associated with poorer transplant outcomes include age, cerebrovascular causes of death, renal insufficiency (serum creatinine over 1.5 mg/dL) and a history of hypertension (Port et al., 2002).
¹⁸ We performed a number of additional analyses to probe the (null) results on organs, and report them in the online Appendix: (1) We considered the possibility that effects may differ for men and women due in part to men’s greater attachment to the workforce, the target of the tax and, especially, the leave legislation (Table A2); (2) We differentiated between kidneys and livers (Table A3); (3) We run the regressions on the number of donors who are biologically related, and on the number of donors who are not biologically related (both including and excluding spouses). The vast majority of organ donations occurs between biologically related individuals or between spouses, and one could imagine that leave and tax incentives might have a stronger impact on non-related potential donors (Tables A4 and A5). All of these additional analyses confirm our initial finding of no effect from the passage of the
Table 5
Organs – Baseline results. This table reports the regression estimates on the effects of the laws on organ transplants (per one million individuals). The data include the full
unbalanced panel. State-level controls include unemployment rate and income per capita. State-level clustered standard errors in parentheses. ***p < .01, **p < .05, *p < .1.
(1) (2) (3) (4) (5) (6) (7)

Type of donations All All All All All Cadaveric Live
Leave for state employees 12.05*** −0.16 6.23*** 0.02 −1.45 −1.71 0.26
(3.64) (2.76) (2.15) (1.80) (2.25) (1.94) (1.02)
Leave for private employees 9.2 1.84 9.76* 5.97 6.85 8.17 −1.31
(5.75) (5.52) (5.64) (4.58) (6.66) (6.76) (1.46)
Tax credits for employers 6.19 −3.04 1.54 −0.66 1.11 2.03 −0.92
(5.29) (4.29) (3.78) (5.29) (5.92) (6.61) (1.03)
Tax deductions for individuals 10.72** 2.14 −1.41 0.47 −0.33 1.49 −1.82
(4.84) (4.45) (4.23) (2.58) (3.79) (3.49) (1.34)
Year fixed effects X X X X X

State fixed effects X X X X X
State-year fixed effects X X X
Observations 1050 1050 1050 1050 1050 1050 1050

R-squared 0.19 0.48 0.57 0.71 0.75 0.64 0.84
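The contrast in Table 5 between column [1] and the columns that add state and year fixed effects can be illustrated with a within (two-way demeaning) estimator. The sketch below is not the authors' code. It uses a tiny noise-free balanced panel with invented numbers, a single law indicator, no controls, no state-specific trends, and no clustered standard errors; the helper names are hypothetical. It only shows why a pooled regression can overstate a law's effect when adoption coincides with a common upward trend.

```python
def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def two_way_demean(values, states, years):
    """Subtract state means and year means, add back the grand mean.

    In a balanced panel this removes additive state and year fixed effects.
    """
    grand = sum(values.values()) / len(values)
    state_mean = {s: sum(values[(s, t)] for t in years) / len(years) for s in states}
    year_mean = {t: sum(values[(s, t)] for s in states) / len(states) for t in years}
    return {
        (s, t): values[(s, t)] - state_mean[s] - year_mean[t] + grand
        for s in states
        for t in years
    }


# Invented panel: staggered adoption, state effects, a common time trend,
# and a true law effect of 2.0 donations per million.
states, years = ["A", "B", "C"], [0, 1, 2, 3]
adoption = {"A": 2, "B": None, "C": 1}
state_effect = {"A": 10.0, "B": 0.0, "C": 5.0}
TRUE_EFFECT = 2.0

law = {(s, t): float(adoption[s] is not None and t >= adoption[s])
       for s in states for t in years}
outcome = {(s, t): state_effect[s] + 3.0 * t + TRUE_EFFECT * law[(s, t)]
           for s in states for t in years}

keys = [(s, t) for s in states for t in years]

# Pooled regression: no state or year effects (analogous to column [1]).
pooled = ols_slope([law[k] for k in keys], [outcome[k] for k in keys])

# Within regression: state and year effects removed by demeaning.
law_w = two_way_demean(law, states, years)
out_w = two_way_demean(outcome, states, years)
within = ols_slope([law_w[k] for k in keys], [out_w[k] for k in keys])
```

In this constructed example the pooled slope is several times the true effect, because adopting states have higher baseline levels and adopt later in a rising series, while the within slope recovers the true effect.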
Table 6
Organs – Results controlling for and interacting with state employment rate. This table reports the regression estimates on the effects of the laws on organ transplants (per
one million individuals). The specifications here add variables with information on the share of state employees (overall and full time) over the whole labor force. The data
include the full unbalanced panel. State-level controls include unemployment rate and income per capita. All regressions include state and year fixed effects and state time
trends. State-level clustered standard errors in parentheses. ***p < .01, **p < .05, *p < .1.
(1) (2) (3) (4) (5) (6)

Type of donor All All Live Live Cadaveric Cadaveric
Leave for state employees −6.37 −7.64 1.85 1.62 −8.23 −9.26*
(5.61) (5.64) (3.51) (3.10) (4.75) (4.92)
Leave for private employees 7.09 7.29 −1.33 −1.23 8.42 8.52
(6.72) (6.73) (1.42) (1.43) (6.82) (6.86)
Tax credits for employers 1.22 1.26 −0.85 −0.82 2.07 2.08
(5.76) (5.52) (1.01) (1.02) (6.39) (6.15)
Tax deductions for individuals −0.53 −0.17 −1.78 −1.81 1.25 1.63
(3.72) (3.83) (1.30) (1.32) (3.40) (3.48)
State employees/labor force 1.52 0.21 1.31
(2.07) (0.92) (2.23)
Leave for state employees* (state employees/labor force) 1.13 −0.36 1.49
(1.48) (0.88) (1.18)
Full-time state employees/labor force 5.08 0.70 4.38
(3.06) (1.32) (3.46)
Leave for state employees*(full-time state employees/labor force) 1.97 −0.43 2.41
(2.04) (1.08) (1.70)
Observations 1050 1050 1050 1050 1050 1050
R-squared 0.75 0.75 0.84 0.84 0.64 0.64
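To read the interaction rows of Table 6, it can help to write out the implied marginal effect of the state-employee leave law. The notation below ($\delta_{state}$ for the coefficient on the law indicator, $\phi$ for the interaction coefficient, $s_{kt}$ for the state-employee share of the labor force) is introduced here for illustration only, and the calculation assumes the share is measured in percentage points, which the table does not state explicitly.

```latex
% Marginal effect of the state-employee leave law when it is interacted
% with the state-employee share of the labor force, s_{kt}:
\[
  \frac{\partial Y_{kt}}{\partial \mathrm{LEAVE}^{state}_{kt}}
    = \delta_{state} + \phi\, s_{kt}
\]
% Column (1) of Table 6 (all donors): \hat{\delta}_{state} = -6.37, \hat{\phi} = 1.13.
% At an illustrative share of s = 3 (assumed to be in percentage points):
\[
  -6.37 + 1.13 \times 3 = -2.98
\]
% The point estimate changes sign where \hat{\delta}_{state} + \hat{\phi}\, s = 0:
\[
  s^{*} = \frac{6.37}{1.13} \approx 5.64
\]
```

Since neither coefficient in column (1) is statistically significant, these figures describe point estimates only and should not be read as evidence of an effect at any particular share.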
In the case of leave for state employees, we also analyze the possibility that the population of state employees must be of a certain minimum size in order for the leave laws to have an effect on the number of donors. To this end, we run our main regressions controlling for the number of state employees, both total and full-time, normalized by the total labor force at the state-year level (to parallel the construction of the left-hand side variable), and interact this variable with the law indicator. We report the results in Table 6. The estimated interaction effects are again small in magnitude and not statistically significant, confirming our previous finding that the legislation did not significantly affect organ donations.

legislation. In analyses not reported, we performed our tests using a balanced panel that includes only five years before passage of any tax or leave legislation and five years after the passage of that first legislation (omitting the year of introduction of the legislation), to reduce the influence of the long lags, in most cases, before the passage of legislation. We also increased the frequency of the observations to months and quarters and manipulated the length of the balanced panel. Lastly, we also tested whether the generosity of leave provisions might be important. None of these other approaches significantly altered our results. In all cases, we obtained a null result.

Table 7
Generosity of leave for state employees – number of states.
Leave for state employees to donate    Organs    Bone marrow
2 days paid                            1         1
5 days paid                            2         11
7 days paid                            0         13
10 days paid                           1         1
20 days paid                           2         2
30 days paid                           23        3
Unpaid leave                           3         3
Total                                  32        34

5.2. Bone marrow donations

As described in Section 2.2, many states have passed legislation for bone marrow donors that is separate from that for organ donors, but the legislation is similar in spirit. The primary difference is that leave allowances tend to be shorter for bone marrow donors. Additionally, while the legislation regarding organs does not present much variation in the generosity of the provisions across states, considerable variation exists in the laws providing leave for state employees to donate bone marrow as shown in Table 7. Twelve states provide leave of less than one week and nineteen states provide leave of greater than or equal to one week. Three states
Table 8
Bone marrow – Baseline results. This table reports the regression estimates on the effects of the laws on bone marrow donations (per one million individuals). The data
include the full unbalanced panel. State-level clustered standard errors are in parentheses. *** p<.01, ** p<.05, * p<.1.
(1) (2) (3) (4) (5)
Leave – state 0.80* −0.72 −0.01 −0.82* 0.11

(0.44) (0.61) (0.46) (0.48) (0.41)
Leave – private −0.45 −0.96* −0.65 −0.82 0.24
(0.37) (0.57) (0.47) (0.53) (0.49)
Tax – employer −0.15 −0.03 0.13 0.12 0.79
(0.56) (0.81) (0.68) (0.55) (0.79)
Tax – individual 0.01 −0.99* −1.19* −0.88* −0.13
(0.35) (0.51) (0.48) (0.44) (0.42)
Year fixed effects X X X
State fixed effects X X X
State-year fixed effects X
Observations 1150 1150 1150 1150 1150
R-squared 0.19 0.43 0.53 0.66 0.79
provide unpaid leave. This cross-state variation allows us to probe observations in the data. The point estimates imply positive
whether the effect of the laws, if any, is stronger when the size of effects statistically significant at the 10% level for state
the benefit is bigger. employment percentages one standard deviation above the
The results are presented in Tables 8 through 10. Once again, in mean,19 and significant at the 5% level for state employment
the most conservative specification reported in Table 8 (including percentages two standard deviations higher than the mean, and
year and state fixed effects and state time trends) we find no evi- negative effects that are never statistically significant (not even
dence that leave or tax legislation had any impact on the number marginally) for state employment rates within the sample. In
of bone marrow donations (normalized by one million population). column [2], we use the rate of full-time employees (which account
In Table 9, we check whether the burden to the donor in terms of for 72% of state employees), and the regressions yields very similar
pain and suffering is important for these laws to have an effect. As results: the effect of leave legislation on bone marrow donations
described earlier, apheresis is less burdensome to the donor than is positive if full-time state employees are at least 2.9% of the
aspiration in terms of pain and suffering and potentially time away state labor force, which again holds for about half of state-year
from work. Although the generosity of the leave provisions does observations in the sample.
affect donation, the results from Table 9 indicate that the burden of Columns [3] through [6] explore the importance of the generos-
the procedure itself does not appear to impact the decision of the ity of leave. We split the leave for state employees variable into
marginal donor. three categories: leave of less than one week, leave of greater than
Once we control for the size of the state employees population one week, and no leave for state employees (the omitted category).
(normalized by the state labor force to parallel the way we con- The National Marrow Donor Program reports that marrow donors
structed the outcome variable), however, we find that the laws should “be able to return to work, school and any other activities
providing leave for state employees have a positive effect on the within one to seven days,” (NMDP, 2013). In other words, although
number of bone marrow donors per one million population, pro- a donor might not need a full seven days to recover, the donor
vided that the share of state employees – i.e., the beneficiaries of the might be more hesitant to donate in a state in which the maximum
legislation – is sufficiently large. In column [1] of Table 10, the inter- allowed leave is less than the higher end of the expected recov-
action term between the law providing leave for state employees ery period. As shown in columns [3] and [4], only leave of greater
and the rate of state employment has a positive, statistically sig- than or equal to one week has a positive impact on the number of
nificant coefficient. The point estimates of about −1.5 on leave for bone marrow donors per one million population. Again, the effect
state employees (marginally significant) and of 0.39 on the inter- is stronger for full-time employees.
action term (significant at the 5% level) imply that leave for state In columns [5] and [6] we test the importance of paid leave
employees has a positive effect if state employees represent about and further corroborate our results that incentive size matters for
4% of the labor force, which is true for roughly half of the state-year promoting donation. The coefficients on the interaction terms are
larger and more statistically significant than when we included
unpaid leave in our leave for state employees variables, but again, the positive effect only
exists for leave of greater than or equal to one week.

Table 9
Bone marrow – Donations by method. This table reports the regression estimates on the effects of the laws on bone marrow donations (per one million individuals), by method of donation. The data include the unbalanced panel limited to the state-years in which apheresis donations were recorded (i.e., post 1998). All regressions include state and year fixed effects and state time trends. State-level clustered standard errors are in parentheses. ***p < .01, **p < .05, *p < .1.
(1) (2)
Apheresis donations Aspiration donations
Leave for state employees −0.20 0.29
(0.42) (0.34)
Leave for private employees 0.42 −0.63**
(0.36) (0.29)
Tax credits for employers 1.27** 0.43
(0.59) (0.39)
Tax deductions for individuals −0.68 0.29
(0.67) (0.53)
Observations 547 547
R-squared 0.86 0.81

19 On average across all state-year observations, state employees represent 4.3% of the labor force (standard deviation = 1.5). The lowest rate observed in the sample is 2.3% and the maximum is 11.8%. Full-time state employees are 3.2% of the labor force on average (standard deviation = 1.2), the lowest rate is 1.6% and the highest 8.6%.

5.3. Effect on the quality of organs

Although we found no effect of these types of legislation on the quantity of organ donation, we
explore the possibility that these laws could have shifted the quality composition of organs
Table 10
Bone marrow – Results controlling for and interacting with state employment rate. This table reports the regression estimates on the effects of the laws on bone marrow
donations (per one million individuals). The specifications here add variables with information on the share of state employees (overall and full time) over the whole labor
force. The data include the full unbalanced panel. State-level controls include unemployment rate and income per capita. All regressions include state and year fixed effects
and state time trends. State-level clustered standard errors in parentheses. ***p < .01, **p < .05, *p < .1.
                                                                                (1)       (2)       (3)       (4)       (5)       (6)
Type of leave for state employees                                               Any       Any       Any       Any       Paid      Paid
Leave for state employees                                                       −1.54*    −1.44*
                                                                                (0.83)    (0.75)
Leave for state employees < one week                                                                0.96      1.41      0.96      1.44
                                                                                                    (2.27)    (2.53)    (2.28)    (2.55)
Leave for state employees >= one week                                                               −1.58**   −1.51**   −2.17***  −2.00***
                                                                                                    (0.79)    (0.71)    (0.72)    (0.65)
Leave for private employees                                                     0.18      0.10      0.18      0.17      0.19      0.18
                                                                                (0.53)    (0.52)    (0.49)    (0.48)    (0.53)    (0.53)
Tax credits for employers                                                       0.75      0.77      0.78      0.74      0.78      0.73
                                                                                (0.87)    (0.85)    (0.95)    (0.93)    (0.97)    (0.95)
Tax deductions for individuals                                                  −0.23     −0.18     −0.19     −0.18     −0.23     −0.20
                                                                                (0.42)    (0.40)    (0.39)    (0.39)    (0.39)    (0.39)
State employees/labor force                                                     −0.22               −0.23               −0.22
                                                                                (0.34)              (0.34)              (0.34)
Leave for state employees*(state employees/labor force)                         0.39**
                                                                                (0.18)
Full-time state employees/labor force                                                     −0.54               −0.56               −0.56
                                                                                          (0.45)              (0.44)              (0.44)
Leave for state employees*(full-time state employees/labor force)                         0.50**
                                                                                          (0.23)
Leave for state employees < one week*(state employees/labor force)                                  −0.36               −0.36
                                                                                                    (0.59)              (0.59)
Leave for state employees >= one week*(state employees/labor force)                                 0.41**              0.53***
                                                                                                    (0.18)              (0.15)
Leave for state employees < one week*(full-time state employees/labor force)                                  −0.71               −0.72
                                                                                                              (0.96)              (0.97)
Leave for state employees >= one week*(full-time state employees/labor force)                                 0.54**              0.67***
                                                                                                              (0.23)              (0.19)
Observations                                                                    1150      1150      1150      1150      1150      1150
R-squared                                                                       0.79      0.79      0.79      0.79      0.79      0.79
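
The break-even shares quoted in the text (about 4% for all state employees, 2.9% for full-time state employees) follow from the Table 10 point estimates: with a coefficient b1 on the leave dummy and b2 on its interaction with the employment share, the implied effect at share s is b1 + b2*s, which turns positive once s exceeds −b1/b2. The sketch below only reproduces that arithmetic from the reported coefficients in columns (1) and (2). The function and dictionary names are illustrative assumptions, not part of the original analysis, and the calculation ignores sampling uncertainty.

```python
# Point estimates copied from Table 10, columns (1) and (2).
# Shares are measured in percent of the state labor force.
LEAVE_COEF = {"all": -1.54, "full_time": -1.44}        # coefficient on the leave dummy
INTERACTION_COEF = {"all": 0.39, "full_time": 0.50}    # coefficient on leave x share


def marginal_effect(share_pct, kind="all"):
    """Implied effect of a state-employee leave law on donors per million
    when state employees make up `share_pct` percent of the labor force."""
    return LEAVE_COEF[kind] + INTERACTION_COEF[kind] * share_pct


def break_even_share(kind="all"):
    """Share (in percent) above which the implied effect becomes positive."""
    return -LEAVE_COEF[kind] / INTERACTION_COEF[kind]


if __name__ == "__main__":
    for kind in ("all", "full_time"):
        print(kind, "break-even share (%):", round(break_even_share(kind), 2))
    # Footnote 19 reports a mean share of 4.3 with standard deviation 1.5
    # and a sample minimum of 2.3.
    for share in (2.3, 4.3, 5.8, 7.3):
        print("share", share, "-> implied effect", round(marginal_effect(share), 2))
```

Evaluating the implied effect at the sample minimum, the mean, and one and two standard deviations above the mean (values taken from footnote 19) shows how the sign changes across the observed range of state employment shares.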
used for transplant.20 One way this could happen is if the legislation has opposing effects on
different types of living donors. For example, the laws might have led to increased donations
among less intrinsically motivated donors and decreased them among the more intrinsically
motivated, and the latter are donors of higher “quality,” on average (Titmuss, 1971).21 Another
possibility is that the laws are affecting living and deceased donations differently. Fernandez
et al. (2013) measure a substitution effect between live and cadaveric donors. A shift in the
distribution of donors between living and cadaveric could also lead to a shift in the quality
distribution if those two donor sources lead to systematically different survival outcomes. Both
medical and social factors could lead to differences in the outcomes yielded by these two donor
sources. Although living donors do tend to be older and, therefore, less likely to yield
high-quality organs, the timing of donation can be optimized with living donors. The timing is
important for two reasons: one, because organs rapidly deteriorate without a blood supply, and
two, because live donation ensures that the recipient and donor are in the best health possible
at the time of transplantation. With cadaveric donation the timing of the transplant is entirely
determined by the time of death of the donor. Regarding social factors, living donation may proxy
for a better social support network, which could improve longer term survival.22
We consider two measures of the quality of the organ transplanted: the total number of grafts
functioning for at least six months and for at least three years, each as a share of the total
number of transplants. Tables 11 and 12 show descriptive statistics for the quality outcome
variables. In all of our regressions, we include a range of match, donor, and recipient
characteristics that could affect survival.23 Obviously, longer time periods are associated with
higher death rates. Table 12 also shows that the mean survival rates for recipients of living
donations are higher than for recipients of cadaveric donor organs.
The regression results in Table 13 (from a linear probability model) show, overall, no effect of
these laws on the quality of the organs donated as measured by the six-month state-level survival
rate. For the three-year survival period, however, we find an overall (i.e., in column 4, where
we consider all types of donations) positive effect from leave for state employees and tax
credits for employers, although the estimated coefficients are only marginally statistically
significant. We also find a negative effect on three-year

20 Because survival data are not available for bone marrow transplants, we must limit the quality analysis to just organ donation.
21 Note that the opposite could also happen, with the more intrinsically motivated donors being lower-quality donors (Healy, 2006). If these donors have lower motivation to give in the presence of incentives, the quality of the resulting pool of organs will actually increase.
22 In a study of 289 transplant centers, the lack of a support person available to the transplant recipient was viewed as an absolute contraindication to transplantation by 6.5% and 2.6% of kidney and liver transplant centers, respectively. The relative contraindication percentages are 67.4% and 33.5% for kidneys and livers, respectively (Levenson and Olbrisch, 1993). Although we are unaware of any studies directly testing the effect of social networks on post-transplant survival among liver and kidney recipients, authors have documented its importance in heart transplantation (Bohachick et al., 2002) and long-term dialysis outcomes (Thong et al., 2007). A review of 122 studies across medical fields suggests that social support is important for patient adherence to medical treatment (DiMatteo, 2004). Conversations with personnel at the University of Michigan’s transplant center also confirm that a strong social support network is crucial for long-term post-transplant survival.
23 The controls used mirror those used by the Scientific Registry of Transplant Recipients to calculate transplant-center-level expected survival rates and include age, gender, race, live donor, pediatric recipient, and underlying diagnosis.
Table 11
Descriptive statistics – donor, match, and patient characteristics. The percentages and averages in this table are calculated for the whole sample. Subsample rates are used
in subsample analyses.
Incompatible recipient-donor blood types (%) 1050 15% 12% 0% 68%
Pediatric patients (<18 years) (%) 1050 8% 3% 0% 30%
Multi-organ recipients (%) 1050 2% 2% 0% 15%
With diabetes diagnosis (%) 1050 16% 5% 0% 38%
With non-cholestatic liver disease (%) 1050 55% 13% 0% 100%
Cold ischemia time 1050 13 3 6 26
Recipient age 1050 44 4 32 55
Days on waitlist 1050 393 130 55 880
Table 12
Descriptive statistics – quality outcome variables.
Survive six months – all donor types (%) 1050 86% 7% 54% 100%
Survive three years – all donor types (%) 1050 62% 24% 0% 100%
Survive six months – live donors (%) 1047 93% 7% 0% 100%
Survive six months – cadaveric donors (%) 1050 84% 7% 44% 100%
Survive six months – female donors (%) 1050 86% 8% 33% 100%
Survive six months – male donors (%) 1050 86% 7% 17% 100%
Survive six months – live female donors (%) 1042 92% 9% 0% 100%
Survive six months – cadaveric female donors (%) 1048 83% 10% 0% 100%
Survive six months – live male donors (%) 1035 93% 10% 0% 100%
Survive six months – cadaveric male donors (%) 1050 85% 8% 0% 100%
Survive six months – kidney donors (%) 1050 89% 6% 55% 100%
Survive three years – kidney donors (%) 1050 65% 25% 0% 100%
Survive six months – live kidney donors (%) 1047 93% 7% 0% 100%
Survive six months – cadaveric kidney donors (%) 1050 87% 7% 43% 100%
Survive six months – liver donors (%) 1046 78% 11% 0% 100%
Survive three years – liver donors (%) 1046 56% 24% 0% 100%
Survive six months – live liver donors (%) 573 80% 30% 0% 100%
Survive six months – cadaveric liver donors (%) 1046 78% 11% 0% 100%
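
The quality outcomes summarized in Table 12 are described in the text as the number of grafts functioning for at least six months (or three years) as a share of all transplants. A minimal sketch of that calculation for a single state-year follows. It assumes a hypothetical input, a list with the number of months each graft kept functioning, and it does not handle censored follow-up, which the registry data behind the paper would require.

```python
def survival_share(graft_months, threshold_months):
    """Percent of transplants whose graft functioned for at least
    `threshold_months` months (hypothetical per-graft input, no censoring)."""
    if not graft_months:
        raise ValueError("no transplants recorded for this state-year")
    surviving = sum(1 for months in graft_months if months >= threshold_months)
    return 100.0 * surviving / len(graft_months)


if __name__ == "__main__":
    # Made-up example: four transplants in one state-year.
    example = [2, 7, 40, 12]
    print("six-month survival (%):", survival_share(example, 6))
    print("three-year survival (%):", survival_share(example, 36))
```

Multiplying the share by 100, as the Table 13 caption describes, puts the regression coefficients on a percentage scale.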
Table 13
Tax and leave effects on quality – all organs. The data in this table include the full unbalanced panel. All survival rates are multiplied by 100 so the coefficients can be
read as percentage-point changes. All regressions include state and year fixed effects and state time trends. Case mix controls: recipient age, recipient gender, recipient race,
diagnosis = noncholestatic v. other (livers), diagnosis = diabetes v. other (kidneys), live donor, and pediatric recipient. State-level clustered standard errors in parentheses.
*** p<.01, ** p<.05, * p<.1.
(1) (2) (3) (4) (5) (6)
Survival time Six months Six months Six months Three years Three years Three years
Type of donor All Live Cadaveric All Live Cadaveric
Leave for state employees −0.03 0.37 −0.14 1.32* 2.29 0.88
(0.46) (0.77) (0.59) (0.73) (1.52) (0.85)
Leave for private employees −0.22 1.28 −0.12 −0.84 −4.34** 0.63
(0.91) (1.21) (1.47) (1.14) (1.77) (1.35)
Tax credits for employers 1.07 0.33 1.19 1.91* 8.22** 0.47
(1.23) (1.36) (1.75) (1.06) (3.71) (1.12)
Tax deductions for individuals −0.31 1.04 −0.33 2.57 1.09 2.31
(0.72) (0.93) (0.89) (1.92) (2.47) (1.99)
Observations 1050 573 1046 950 946 950
R-squared 0.74 0.58 0.69 0.89 0.74 0.84
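
The significance stars in Table 13 can be roughly cross-checked from each coefficient and its clustered standard error. The sketch below uses a two-sided normal approximation, which is an assumption: the paper does not state the reference distribution, and a t distribution with degrees of freedom tied to the number of state clusters would give slightly larger p-values. The function name is illustrative.

```python
from statistics import NormalDist


def significance_stars(coef, std_err):
    """Map a coefficient and standard error to the table's star convention
    (*** p<.01, ** p<.05, * p<.1), using a two-sided normal approximation."""
    z = abs(coef / std_err)
    p_value = 2.0 * (1.0 - NormalDist().cdf(z))
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""


if __name__ == "__main__":
    # (label, coefficient, standard error) taken from Table 13.
    rows = [
        ("Leave, state employees, col. 4", 1.32, 0.73),
        ("Leave, private employees, col. 5", -4.34, 1.77),
        ("Tax credits, employers, col. 5", 8.22, 3.71),
        ("Leave, state employees, col. 1", -0.03, 0.46),
    ]
    for label, coef, se in rows:
        print(f"{label}: {coef}{significance_stars(coef, se)}")
```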
survival of recipients of organs from living donors associated with leave for private employees,
and a positive effect on the same group from tax credits to employers (column 5).
In the online Appendix, the quality analysis is reported separately for kidneys and livers
(Tables A10a and A10b). We find some marginally significant positive effects on six-month
survival and a significant effect on three-year survival from tax credits for liver transplant
recipients.
Thus, the legislation might be inducing quality improvements in survival rates by changing the
composition of the donor pool even though it does not increase overall organ donations. Even
though no systematic pattern emerges, these results do not support the hypothesis that incentives
lead to a crowding out of high-quality donors, at least for the relatively small incentives
provided by the laws studied herein.

6. Discussion and Conclusions

Policymakers and scholars have long debated how to overcome the shortage of organs and bone
marrow in the U.S. In systems based exclusively on altruistic donors, the supply is insufficient
to cover the need, and legal rules and social norms prevent direct compensation. We analyzed tax
and leave laws, which allow donors of organs or bone marrow to be, at least financially, not
significantly worse off than before donating. Donating an organ or bone marrow is a costly
decision for the living donor, and one that may hinge on financial (and work-related)
considerations. Because of these costs, both financial and in terms of pain and suffering for the
donor, the tax- and leave-related incentives may be insufficient to significantly lessen the
discrepancy between supply and demand,
i.e., the efficacy of these laws is certainly not guaranteed. However, if an effect is to be
found, we anticipated that less costly types of donation, such as bone marrow, or more generous
incentives, such as longer leave, would be associated with higher statistical significance and
larger effects in our regression analysis.
Our results are consistent with this interpretation; we document no impact of the legislation on
the number of organ donations, and a positive impact of leave for state employees on bone marrow
donations when the population affected is sufficiently large. The positive effect on bone marrow
donations is larger for more generous provisions, such as longer and paid leave. We also find
some evidence suggesting that the legislation might affect the quality of the organs
transplanted, even though no systematic patterns emerged. More research is needed, but this
suggests that focusing only on changes in quantity may overlook shifts in the underlying quality
distribution of organs used for transplantation.
A few explanations exist for the lack of an effect of the legislation on the quantity of organs.
First, it is possible that not enough people are aware of the existence of the legislation. UNOS,
for example, does not mention these types of legislation in its summary of information for
prospective living organ donors (UNOS, 2013). The NMDP does, however, mention the existence of
laws providing leave to donors, which also could help explain the stronger effect of these types
of legislation on bone marrow donations (NMDP, 2013). Second, the results could be confounded by
the existence of grant programs, which already may be providing the same cost reimbursement as
the tax laws.24 Employer-specific paid leave programs could further be obscuring the effects of
legislation mandating leave for private employees.25 Third, a compositional effect might be
occurring, whereby some subsets of the population are positively motivated by these additional
incentives to donate (on top of their intrinsic motives) whereas others are “crowded out”
(because their self or social image may be tainted (Benabou and Tirole, 2006) or because they
consider the presence of material incentives repugnant (Benabou and Tirole, 2006; Roth, 2007)).
Fourth, the incentives put in place by these types of legislation might not be strong enough to
induce an individual, who is not otherwise sufficiently altruistically motivated, to endure the
pain, suffering, scarring, time away from work and leisure, and undocumented long-term donor
health effects implied by an organ donation. Some evidence also exists that donors occasionally
have difficulty obtaining life and health insurance post-donation (Rudow et al., 2006; Spital and
Jacobs, 2002).26 Untangling these explanations is of importance for policymakers interested in
increasing and enhancing the supply of organs for transplantation and for the many individuals on
the waitlist, who will die without a donor organ.
The positive effect of the legislation on bone marrow donations, and the fact that the effect was
increasing in the generosity of the legislation, lead us to favor the fourth explanation:
although tax breaks and leave provisions may be sufficient to induce, at the margin, individuals
to undergo a moderately invasive procedure such as a bone marrow donation, they may be too low
for the more “costly” organ donations. Similarly, there may be enough individuals at the margin
between being willing to donate bone marrow or not, but this may not be the case for organs. In
other words, and following the terminology of Gneezy and Rustichini (2000), the incentives
described here may be “large enough” for bone marrow donations, but not for organ donations. The
findings from Lacetera and Macis (2013) and Lacetera et al. (2011, 2012) of a positive effect of
leave legislation and $5–15 gift cards on an even less invasive procedure, blood donation, are in
line with our interpretation. If this interpretation is correct, then we would expect larger
incentives to have positive effects on bone marrow donations and potentially also on organ
donations. More systematic analyses from contexts where such stronger incentives are provided
would be needed to reach firmer conclusions, however. The recent decision on December 1, 2011 by
the 9th U.S. Circuit Court of Appeals that bone marrow apheresis can be compensated will provide
researchers with an opportunity to further our understanding of which policies are effective in
reducing the organ and bone marrow demand-supply imbalance.27

24 For the particularly financially constrained, organizations such as The National Living Donor Assistance Center (www.livingdonorassistance.org) provide grants to cover the costs of donation, which may leave the legislation with little room to have an impact.
25 The American Society of Transplantation publishes an incomplete list of the names of private companies offering their employees paid leave to donate (http://www.myast.org/sites/default/files/legacy pdfs/public policy/ELOD Participating Institutions.pdf).
26 Perhaps suggestive of the issues with insuring organ donors, The Living Organ Donor Protection Act, which would have ensured donors could not be denied coverage or charged surcharges by health insurers, died in Committee. http://www.govtrack.us/congress/bills/111/hr1558.
27 The apheresis technique did not exist at the time of the National Organ Transplant Act in 1984 (Korbling and Freireich, 2011). This highlights the importance of legislative evolution to match scientific innovation.

References

Abadie, Alberto, Gay, Sebastien, 2006. The impact of presumed consent legislation on cadaveric organ donation: a cross-country study. Journal of Health Economics 25, 599–620.
Ashenfelter, Orley, Card, David, 1985. Using the longitudinal structure of earnings to estimate the effect of training programs. Review of Economics and Statistics 67, 648–660.
Bagozzi, Richard P., Lee, Kam-Hon, Van Loo, M. Frances, 2001. Decisions to donate bone marrow: the role of attitudes and subjective norms across cultures. Psychology and Health 16, 29–56.
Becker, Gary S., Elias, Julio Jorge, 2007. Introducing incentives in the market for live and cadaveric organ donations. The Journal of Economic Perspectives 21, 3–24.
Benabou, Roland, Tirole, Jean, 2006. Incentives and prosocial behavior. American Economic Review 96, 1652–1678.
Bergstrom, Ted, Garratt, Rod, Sheehan-Connor, Damien, 2009. One chance in a million: altruism and the bone marrow registry. American Economic Review 99, 1309–1334.
Bergstrom, Ted, Garratt, Rod, Sheehan-Connor, Damien, 2012. Stem cell matching for patients of mixed race. The B.E. Journal of Economic Analysis & Policy, forthcoming.
Bertrand, Marianne, Duflo, Esther, Mullainathan, Sendhil, 2004. How much should we trust differences-in-differences estimates? Quarterly Journal of Economics 119, 249–275.
Bilgel, Firat, 2011. The law and economics of organ procurement. Erasmus School of Law, Rotterdam, Netherlands, doctoral thesis.
Biyar, Raphael, 2011. Guidelines of the Organ Transplant Steering Committee. Official Gazette, Miscellaneous Notices, February 20.
Bohachick, Patricia, Taylor, Melissa V., Sereika, Susan, Reeder, Sara, Anton, Bonnie B., 2002. Social support, personal control and psychosocial recovery following heart transplantation. Clinical Nursing Research 11, 34–51.
Boulware, L.E., Troll, M.U., Plantinga, L.C., Powe, N.R., 2008. The association of state and national legislation with living kidney donation rates in the United States: a national study. American Journal of Transplantation 8, 1451–1470.
Byrne, M.M., Thompson, P., 2001. A positive analysis of financial incentives for cadaveric organ donation. Journal of Health Economics 20, 69–83.
Confer, D.L., Leitman, S.F., Papadopoulos, E.B., Price, T.H., Stroncek, D.F., Robinett, P., Braem, B., Gandham, S., 2003. Serious complications following unrelated donor marrow collection: experiences of the National Marrow Donor Program®. Biology of Blood and Marrow Transplantation 10, 13–14.
Dickert-Conlin, Stacy, Elder, Todd, Moore, Brian, 2012. Donorcycles: motorcycle helmet laws and the supply of organ donors working paper. Journal of Law and Economics, forthcoming.
DiMatteo, Robin M., 2004. Social support and patient adherence to medical treatment: a meta-analysis. Health Psychology 23, 207–218.
Duenwald, Mary, Shipley, David, 2011. Better Incentives for Organ Donors Can Thwart Black Market, View. www.businessweek.com, November 2.
Feng, S., Goodrich, N.P., Bragg-Gresham, J.L., Dykstra, D.M., Punch, J.D., DebRoy, M.A., Greenstein, S.M., Merion, R.M., 2006. Characteristics associated with liver graft
failure: the concept of a donor risk index. American Journal of Transplantation 6, 783–790.
Fernandez, Jose M., Howard, David, Kroese, Lisa, 2013. The effect of cadaveric kidney donations on living kidney donations: an instrumental variables approach. Economic Inquiry 51 (3), 1696–1714.
Frey, Bruno S., 1993. Motivation as a limit to pricing. Journal of Economic Psychology 14, 635–664.
Gneezy, Uri, Rustichini, Aldo, 2000. Pay enough or don’t pay at all. The Quarterly Journal of Economics, 791–810.
Gortmaker, S.L., Beasley, C.L., Brigham, L.E., Franz, H.G., Garrison, R.N., Lucas, B.A., Patterson, R.H., Sobol, A.M., Grenvik, N.A., Evanisko, M.J., 1996. Organ donor potential and performance: size and nature of the organ donor shortfall. Critical Care Medicine 24, 432–439.
Healy, K., 2006. Last Best Gifts. Altruism and the Market for Human Blood and Organs. University of Chicago Press.
Heckman, James J., Hotz, V.J., 1989. Choosing among alternative nonexperimental methods for estimating the impact of social programs: the case of manpower training. Journal of the American Statistical Association 84, 862–874.
Hippen, Benjamin E., 2008. Organ sales and moral travails: lessons from the living kidney vendor program in Iran. Policy Analysis 614, 1–17.
Institute of Medicine. Committee on Organ Procurement and Transplantation Policy, Division of Health Sciences Policy, 1999. Assessing Current Policies and the Potential Impact of the DHHS Final Rule. National Academy Press, Washington, DC.
Karanes, Chatchada, Confer, Dennis, Walker, Tim, Askren, Andrea, Keller, Claire, 2003. Unrelated donor stem cell transplantation: the role of the National Marrow Donor Program. Oncology 17, 1–12.
Kessler, Judd B., Roth, Alvin E., 2012. Organ allocation policy and the decision to donate. American Economic Review 102 (5), 2018–2047.
Kessler, Judd B., Roth, Alvin E., 2013a. Organ Donation Loopholes and the Deterioration of Warm Glow Giving. Working Paper.
Kessler, Judd B., Roth, Alvin E., 2013b. Don’t Take ‘No’ For An Answer: An Experiment with Actual Organ Donor Registration. Working Paper.
Korbling, Martin, Freireich, Emil J., 2011. Twenty-five years of peripheral blood stem cell transplantation. Blood Journal 117, 6411–6416.
Lacetera, Nicola, Macis, Mario, 2010. Do all material incentives for pro-social activities backfire? The response to cash and non-cash incentives for blood donations. Journal of Economic Psychology 31, 738–748.
Lacetera, Nicola, Macis, Mario, 2013. Time for blood: the effect of paid leave legislation on altruistic behavior. Journal of Law, Economics and Organization, forthcoming.
Lacetera, Nicola, Macis, Mario, Slonim, Robert, 2011. Rewarding altruism? A natural field experiment. NBER Working Paper, 17636.
Lacetera, Nicola, Macis, Mario, Slonim, Robert, 2012. Will there be blood? Incentives and displacement effects in pro-social behavior. American Economic Journal: Economic Policy 4, 186–223.
Lacetera, Nicola, Macis, Mario, Slonim, Robert, 2013a. Economic rewards to motivate blood donations. Science 340 (6135), 927–928.
Lacetera, Nicola, Macis, Mario, Stith, Sarah, 2013b. Removing Financial Barriers to Organ and Bone Marrow Donation: The Effect of Leave and Tax Legislation in the U.S., Online Appendix (October 8, 2013). Available at SSRN: http://ssrn.com/abstract=2337645.
Lavee, J., Ashkenazi, T., Stoler, A., Cohen, J., Beyar, R., 2012. Preliminary marked increase in the national organ donation rate in Israel following implementation of a new organ transplantation law. American Journal of Transplantation 13, 780–785.
Leider, Stephen, Roth, Alvin, 2010. Kidneys for sale: who disapproves, and why? American Journal of Transplantation 10, 1221–1227.
Levenson, J.L., Olbrisch, M.E., 1993. Psychosocial evaluation of organ transplant candidates: a comparative survey of process, criteria, and outcomes in heart, liver and kidney transplantation. Psychosomatics 34, 314–323.
Maddrey, W.C., Van Thiel, D.H., 1988. Liver transplantation: an overview. Hepatology 8 (4), 948–959.
Matas, Arthur J., Schnitzler, Mark, 2003. Payment for living donor (vendor) kidneys: a cost-effectiveness analysis. American Journal of Transplantation 4, 216–221.
Muzaale, Abimereki D., Dagher, Nabil N., Montgomery, Robert A., Taranto, Sarah E., McBride, Maureen A., Segev, Dorry L., 2011. Estimates of early death, acute liver failure, and long-term mortality among live liver donors. Gastroenterology 142, 273–280.
Port, Friedrich K., Bragg-Gresham, Jennifer L., Metzger, Robert A., Dykstra, Dawn M., Gillespie, Brenda W., Young, Eric W., Delmonico, Francis L., Wynn, James J., Merion, Robert M., Wolfe, Robert A., Held, Philip J., 2002. Donor characteristics associated with reduced graft survival: an approach to expanding the pool of kidney donors. Transplantation 74, 1281–1286.
Rodrigue, J.R., Crist, K., Roberts, J.P., Freeman Jr., R.B., Merion, R.M., Reed, A.I., 2009. Stimulus for organ donation: a survey of the American Society of Transplant Surgeons membership. American Journal of Transplantation 9, 2172–2176.
Roth, Alvin E., Sonmez, Tayfun, Unver, M. Utku, 2004. Kidney exchange. The Quarterly Journal of Economics 19, 457–488.
Roth, Alvin E., Sonmez, Tayfun, Unver, M. Utku, 2005a. A kidney exchange clearinghouse in New England. American Economic Review, Papers and Proceedings 95, 376–380.
Roth, Alvin E., Sonmez, Tayfun, Unver, M. Utku, 2005b. Pairwise kidney exchange. Journal of Economic Theory 125, 151–188.
Roth, Alvin E., 2007. Repugnance as a constraint on markets. The Journal of Economic Perspectives 21, 37–58.
Rudow, Dianne LaPointe, O’Rourke, Marian, Musat, Claudia, Garvey, Cathy A., Gish-Panjada, Deidre, Alexander, Charles, 2006. Living Donor Insurability and its Impact on Donor Healthcare. World Transplant Congress. Abstract #1192, Concurrent Session 99: Ethics and Quality of Life.
Segev, Dorry L., Muzaale, Abimereki D., Caffo, Brian S., Mehta, Shruti H., Singer, Andrew L., Taranto, Sarah E., McBride, Maureen A., Montgomery, Robert A., 2010. Perioperative mortality and long-term survival following live kidney donation. The Journal of the American Medical Association 307, 1337–1447.
Shimazono, Yosuke, 2007. The state of the international organ trade: a provisional picture based on integration of available information. Bulletin of the World Health Organization 85, 955–962.
Spital, Aaron, Jacobs, Cheryl, 2002. Life insurance for kidney donors: another update. Transplantation 74, 972–973.
Thong, Melissa S., Kaptein, Adrian A., Krediet, Raymond T., Boeschoten, Elisabeth W., Dekker, Friedo W., 2007. Social support predicts survival in dialysis patients. Nephrology Dialysis Transplantation 22, 845–850.
Titmuss, Richard, 1971. The gift relationship: from human blood to social policy. Pantheon Books, New York.
Venkataramani, A.S., Martin, E.G., Vijayan, A., Wellen, J.R., 2012. The impact of tax policies on living organ donations in the United States. American Journal of Transplantation 12, 2133–2140.
Wellington, Alison J., Sayre, Edward A., 2011. An evaluation of financial incentive policies for organ donations in the United States. Contemporary Economic Policy 29, 1–13.

Further reading

Fusionspark Media, Inc. (www.organtransplants.org/understanding/history/index.html).
Levush, Ruth, 2010. Israel: encouragement of organ donation. Global Legal Monitor, accessed 12/14/2011, www.loc.gov/lawweb/servlet/lloc news?disp3 I205401781.
Living Donor Protection Act (http://www.govtrack.us/congress/bills/111/hr1558).
National Marrow Donor Program (www.nmdp.org).
National Marrow Donor Program (NMDP), 2013. You’re a Match: A Donor’s Guide to Donation, http://marrow.org/Registry Members/Donation/Now that you are a match %28PDF%29.aspx.
Organ Procurement and Transplantation Network (optn.transplant.hrsa.gov/data/).
Satel, Sally, 2010. Israel’s remarkable new steps to solve its organ shortage. Slate Magazine, http://www.slate.com/articles/health and science/medical examiner/2010/01/kidney mitzvah.html.
United Network for Organ Sharing (UNOS), 2013. Living Donation. Information You Need to Know, http://www.unos.org/docs/Living Donation.pdf.

The impact of patient cost-sharing on low-income populations: Evidence from Massachusetts☆

Amitabh Chandra a,d, Jonathan Gruber b,d, Robin McKnight c,d,∗
a Harvard Kennedy School, Harvard University, United States
b Department of Economics, MIT, United States
c Department of Economics, Wellesley College, United States
d NBER, United States
Article history:
Received 26 April 2012
Received in revised form 15 October 2013
Available online 1 November 2013

Keywords:
Health insurance
Cost sharing

Greater patient cost-sharing could help reduce the fiscal pressures associated with insurance expansion by reducing the scope for moral hazard. But it is possible that low-income recipients are unable to cut back on utilization wisely and that, as a result, higher cost-sharing will lead to worse health and higher downstream costs through increased use of inpatient and outpatient care. We use exogenous variation in the copayments faced by low-income enrollees in the Massachusetts Commonwealth Care program to study these effects. We estimate separate price elasticities of demand by type of service. Overall, we find price elasticities of about −0.16 for this low-income population, similar to elasticities calculated for higher-income populations in other settings. These elasticities are somewhat smaller for the chronically sick, especially for those with asthma, diabetes, and high cholesterol. These lower elasticities are attributable to lower responsiveness to prices across all categories of service, and to some statistically insignificant increases in inpatient care.
☆ We are grateful to Kaitlyn Kenney and the staff at the Massachusetts Health Connector for making these data available. We thank, without implicating, Dan Fetter, Ilyana Kuziemko, Joseph Newhouse, and seminar participants at the Columbia/CUNY/NYU Health Seminar for very helpful suggestions on earlier versions of this paper.
∗ Corresponding author at: Department of Economics, Wellesley College, 106 Central Street, Wellesley, Massachusetts, 02481, United States.
E-mail address: robin.mcknight@wellesley.edu (R. McKnight).

The recently enacted Patient Protection and Affordable Care Act (PPACA) includes the largest expansion of health insurance coverage to low-income populations in our nation’s history. The Federal government will spend over $1 trillion over the next decade to subsidize insurance for those below 400% of the Federal Poverty Line (FPL) (Congressional Budget Office, 2013). Roughly half that total will be through expansions of the Medicaid program, which will provide publicly financed health care for those below 133% of the poverty line at essentially zero patient cost. The other half will be in the form of subsidies to private insurance for those between 133 and 400 percent of the poverty line. These subsidies are of two types: the first type is premium subsidies, which offset the premium cost of insurance by limiting the percentage of income that low-income individuals must pay. The second type is cost-sharing subsidies, which offset to some extent the copayments, coinsurance and deductibles that these low-income populations face.

The motivation for the subsidies is twofold: to make the transfers in PPACA more progressive, and to protect low-income populations from sacrificing necessary medical care because of cost. The optimal level of such subsidies, therefore, depends critically on the way in which the medical care utilization of low-income groups responds to cost sharing, and how any change in utilization impacts their health. On one hand, greater patient cost-sharing could help reduce the fiscal pressures associated with insurance expansion by reducing the scope for moral hazard. But on the other hand, there has been speculation that low-income patients may be more price sensitive than other patients or that low-income patients may be more likely to experience adverse health consequences as a result of cost-sharing (Baicker and Goldman, 2011).

Differential effects on low-income patients could arise for a number of reasons. First, low-income patients may simply be more responsive because they face a tighter budget constraint; in this case, we would expect low-income patients to cut back on care with the lowest marginal benefit. Second, it is possible that lower income individuals are less able to evaluate the marginal benefit of their care than higher income individuals and, as a result, may have a higher propensity to cut back on high marginal benefit care. In their study of drug copayments in Medicaid, for example, Reeder and Nelson (1985) argued that, because education is positively correlated with income, low-income individuals may be less able to communicate with their physicians and, consequently, make less
58 A. Chandra et al. / Journal of Health Economics 33 (2014) 57–66
well-informed decisions. Goldman and Smith (2002) provide evidence that patients of lower socioeconomic status are less likely to adhere to treatment regimens for chronic illnesses and, as a result, experience worse health outcomes. Third, higher rates of chronic illness among low-income populations could imply differential effects of cost-sharing in low-income populations. In prior studies that have identified adverse consequences, these consequences have generally been concentrated among the chronically ill (Goldman et al., 2007).

These concerns highlight the importance of considering the possibility of cross-price effects in response to copayment increases: cost-sharing for one service could reduce utilization of complementary services or it could increase utilization of substitute services. The positive cross-price effects on substitute services are known in the empirical literature as “offset effects,” because decreases in one service are offset by increases in substitute services (McGuire, 2012). In particular, prior work has suggested that reduced primary care due to copayment increases can lead to higher downstream costs in inpatient or outpatient settings. Commentators such as Evans et al. (1993) have argued that user fees are a “policy zombie”, while Donaldson (2008) argued that it is “wrong, unfair, and ineffective to try to limit consumer and patient access through user fees, and also to dress up this process as actually enhancing access.”

In this paper, we use exogenous variation in the copayments faced by low-income enrollees in Massachusetts’ Commonwealth Care program. This state program was the model for PPACA, providing highly subsidized insurance for families below 300% of the FPL. Importantly, there was a substantial increase in the copayments paid by enrollees in this program in July 2008 that we exploit for identification. Using unique claims data on the Commonwealth Care population provided to us by the state, we are able to estimate the effect of greater cost-sharing on overall utilization. Because copayments increased nearly proportionately across most services, we are unable to separately estimate own- and cross-price elasticities for most services. However, we are able to estimate the overall price elasticity associated with across-the-board copayment increases. In addition, because copayments for inpatient admissions did not increase for most of our sample, we are able to provide evidence on hospitalization offsets.

In previous work (Chandra et al., 2010b), we had provided preliminary evidence on the price elasticity of demand in this population, using only the first six months of post-policy change data. This paper extends that analysis by incorporating additional post-policy change data, by using a superior estimation framework, and most importantly by considering a variety of questions not addressed in that earlier paper, such as the issue of population heterogeneity in demand elasticities and in offset effects. We are also able to perform a number of specification checks that provide assurance that our results are not driven by stockpiling (where patients avail themselves of medical care immediately before the policy change and do not use care in the immediate months after), or other confounders that may be timed with the policy change. As a result of a variety of improvements to our data and estimation, our current estimate for the overall price elasticity of demand in this population is on the low end of the range of elasticities that we reported in our earlier work. Overall, we find price elasticities of about −0.16 for this low-income population, which is similar to, but somewhat lower than, elasticities calculated for higher-income populations in other settings.

We also find lower price elasticities among individuals with chronic illness and with higher levels of prior spending, suggesting that copayments are less important in these subsamples. In addition, we find no evidence of offsetting increases in hospitalizations in response to the higher copayments, although there are some statistically insignificant impacts among the chronically ill population.

Our paper proceeds as follows. Section 1 reviews the prior literature on this topic. Section 2 provides background on the institutional setting and data. Section 3 presents our estimation strategy. Section 4 shows our results, while Section 5 presents a set of robustness checks of those findings. Section 6 concludes with a discussion of the implications of our findings.

1. Prior literature

Most analyses of price sensitivity of medical care still rely on the evidence from the RAND Health Insurance Experiment (HIE) of the mid-1970s (Newhouse, 1993). This seminal study found that, for the population as a whole, there was a significant but modest response of medical care utilization to the point-of-service cost of medical care. Notably, the reductions in care appeared to come across the board, both in categories of “effective” and “ineffective” medical care. There were, however, no “offset” effects in terms of reduced primary care leading to a demand for more hospital care; indeed, reduced primary care appeared to lower spending on hospital care. Most importantly, there was no evidence of a detrimental effect on the typical person in terms of worsened health. That is, for the average person in the HIE, there did not appear to be productive returns to marginal health care utilization in terms of improving health status.

Although the sample size was more limited, the HIE did consider heterogeneity by income and health. For the low-income subset of the population, the findings paralleled those for the larger population: a modest impact on health care spending, with no offset effects and no impact on health status. However, there was an important potential exception to this finding: for the chronically ill low-income population, there was a suggestion of a sizeable rise in blood pressure for those in the higher cost-sharing insurance plan (although the overall health results for this population were mixed, and the blood pressure result itself was not significant). But the HIE evidence is now over thirty years old, and changes in the practice of medicine (including greater reliance on managed care contracts, the availability of many new pharmaceutical and surgical interventions, the growth of imaging and diagnostic technology, and the development of the medical device industry) may imply a structural change in the elasticity of medical demand and the health impacts of any utilization reductions.

Besides the HIE, there has been relatively little research on the impacts of cost-sharing on low-income populations. Indeed, Baicker and Goldman (2011) write that “while there is a lot of speculation that the poor have more elastic demand, there is little evidence.” Much of the existing work focuses on the introduction of copayments for prescription drugs in Medicaid programs. Nelson et al. (1984) examined the introduction of a drug copayment in South Carolina’s Medicaid program and found that it was associated with a statistically significant decline in drug purchases, in a pre-post framework and in comparison with Tennessee’s Medicaid program. Follow-up work in Reeder and Nelson (1985) examines the change in drug utilization within South Carolina and notes a reduction in classes of drugs which, if not taken, “could result in a deterioration of health that could ultimately lead to the use of more expensive medical services.” Stuart and Zacker (1999) use cross-state variation in the use of drug copayments in Medicaid to examine the impact of copayments on Medicare dual eligibles. They find that individuals in states with copayments, ranging from $0.50 to $3.00, use 15 percent fewer prescriptions than individuals in states without copayments, with the impact resulting primarily from a reduction in drug use on the extensive margin.
Cunningham (2002) analyzes cross-state variation in Medicaid’s use of prescription drug cost-control methods, including copayments, and finds that cost-control efforts are associated with reductions in reported access to prescription drugs. Our research builds on this existing evidence, by exploiting plausibly exogenous variation affecting the low-income population that allows us to calculate a price elasticity of demand for this population.

The theoretical impact of cost-sharing on overall spending is complicated by cross-price effects (Baicker and Goldman, 2011). On one hand, cross-price effects could imply that a copayment for one service would reduce utilization of complementary services; an office visit copayment, for example, might reduce spending on labs and diagnostic testing. On the other hand, cross-price effects could imply that a copayment for one service would increase utilization of substitute services; reduced use of primary care can lead to worsened health, which might indirectly lead to increased utilization of emergency or inpatient services. Offset effects are a particular concern if psychological biases cause patients to overweight the immediate cost of a copayment relative to expected future health benefits (Baicker et al., 2012). In the potential presence of cross-price effects, it is important to evaluate empirically the impact of cost-sharing on overall spending and on indicators of worsening health, such as hospitalizations.

The possibility of offset effects is real: while the RAND Health Insurance Experiment did not find any evidence for offsets, more recent research has suggested that increased cost-sharing may be associated with offsetting increases in other forms of care. For example, Gaynor et al. (2007) find that increased spending on outpatient care offset 35% of the savings from reduced prescription drug use after drug copayment increases in employer-provided health insurance. Such offsets are often found to be concentrated among chronically ill populations. For example, in our own earlier work (Chandra et al., 2010a), we found evidence that increased prescription drug and office visit copayments caused offsetting increases in hospitalizations among elderly patients who had been diagnosed with chronic diseases such as diabetes, hypertension, and hyperlipidemia. Similarly, Trivedi et al. (2010) find that increased copayments for ambulatory care for elderly patients were associated with increased rates of hospitalizations; these effects were concentrated among enrollees of lower socioeconomic status and enrollees who had chronic illness or a history of myocardial infarction. In a paper that is especially relevant to our setting, Tamblyn et al. (2001) examined the effect of cost-sharing on prescription drugs in the poor and elderly and found an increase in the probability of hospitalization, nursing home admission, or mortality. The key role of cost-sharing in the PPACA suggests the importance of revisiting this topic in a low-income population.

2. Institutional setting and data

Our study examines the impact of increased patient copayments that were imposed on low-income adults in the Commonwealth of Massachusetts. As part of the insurance expansion that was signed into law on April 12, 2006, the state of Massachusetts provides completely subsidized coverage to legal residents earning up to 150% of the FPL and highly subsidized coverage to those earning between 150% and 300% of the FPL.¹ These subsidies are provided through a program known as Commonwealth Care (henceforth, CommCare) to persons between the ages of 19 and 64 who are not offered employer-provided health insurance, cannot be dependents on another person’s family plan, and do not qualify for other public health insurance programs such as Medicaid.

¹ In 2013, 150% of the FPL is $17,235 for an individual; $35,325 for a family of four. 300% of FPL is $34,470 for an individual; $70,650 for a family of four. These are national cutoffs (Federal Register, 2013).

Members are eligible for different plan types based on where their income is relative to the FPL, and each plan type offers a different level of patient cost-sharing. At the beginning of our sample period in July 2007, there were four different plans, each with the same benefits but with different levels of patient cost-sharing. None of the plans had deductibles, so all the cost-sharing took the form of copayments. Plan 1 was for all members with incomes below 100 percent of the FPL and it had the lowest level of cost-sharing ($0 for all services, except emergency room visits and prescription drugs). Plan 2 was for all members with incomes between 100 percent and 200 percent of the FPL and it had higher copayments for all services than did Plan 1. This plan structure generated a discontinuity in cost-sharing at 100 percent of the FPL. Members with incomes between 200 and 300 percent of the FPL had a choice of two plans, with different levels of cost-sharing and premiums. If they chose Plan 3, they had higher cost-sharing than Plan 2; if they chose Plan 4, they had the same cost-sharing as members of Plan 2, but they had to pay a higher premium than for Plan 3. Thus, there was a cost-sharing discontinuity at 200 percent of the FPL, but only for members who opted for Plan 3.

In July 2008, copayments were increased for Plans 2 and 3, generating changes in the discontinuities at 100 percent of the FPL and 200 percent of the FPL. In addition, Plan 4 was discontinued in July 2008 and its members were placed into Plan 3.² Consequently, Plan 4 enrollees experienced a substantial increase in patient cost-sharing as a result of the elimination of Plan 4. These changes in the discontinuities in copayments at 100 and 200 percent of the FPL that occurred in July 2008 are the key source of variation that is exploited in our empirical work below.

² Plan 4 was discontinued because the Connector Authority Board concluded that it was unnecessary and that the choice of plans for this income group was inducing selection.

CommCare charges no monthly premium for members with incomes below 150% of the FPL. For those with incomes above 150% of the FPL, there is a monthly premium, which increases with income. Currently, the monthly premium for the lowest cost plan is $40 for enrollees with income between 150 and 200 percent of the FPL, $78 for those between 200 and 250 percent of the FPL, and $118 for those between 250 and 300 percent of the FPL.

2.1. Data

For the purposes of this analysis, Massachusetts’ Commonwealth Health Insurance Connector Authority provided us with the universe of (de-identified) enrollment and claims data from July 2007 through June 2009. This sample period covers a full year before and a full year after the copayment change that we are studying. From the enrollment file, we can observe the CommCare plan in which each individual was enrolled in each month of the sample period. We also observe very basic demographic information, such as age and gender. For most individuals, we observe household income as a share of the FPL, although this information is missing for some individuals. As a result of this missing information, we must exclude 25% of member-months from our sample. Because income information is disproportionately missing for individuals who were enrolled for shorter periods of time, this restriction affected only 16% of the member-months in the continuously enrolled subsample that is the focus of our analysis. We also exclude 10,073 member-months that report income levels above 300% of FPL and 19,290 member-months that report ages below 19 or above 65, since these individuals should not
Table 1
Member characteristics.

                                            All members    Continuously enrolled
                                                           (no plan switching)
Age                                         40.3           40.9
Percent male                                47             47
Percent enrolled in Plan 1                  52             52
Percent with Charlson score of 1 or more    23             24
Percent with chronic disease                29             31
Average monthly expenditure
  Total                                     $384.6         $358.8
  Hospitalizations                          $86.2          $77.7
  ER                                        $43.2          $39.7
  Drugs                                     $44.5          $43.9
  Office visits                             $46.8          $44.5
  Outpatient                                $79.7          $75.4
  Lab                                       $58.2          $54.5
Average FPL                                 95.4%          96.1%
Rate of avoidable hospitalizations          0.06%          0.05%
Number of members                           247,565        122,456
Number of member-months                     2,842,493      1,865,777
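As a quick arithmetic check on Table 1, the sketch below converts the full-sample monthly averages into an annual figure and into shares of total spending. The variable names are illustrative only, and the shares are plain ratios of the table's entries; they need not coincide exactly with any rounded shares quoted elsewhere, which may be computed on a different basis.

```python
# Full-sample average monthly expenditure per member (USD), copied from Table 1.
TOTAL_MONTHLY = 384.6
COMPONENTS = {
    "Hospitalizations": 86.2,
    "ER": 43.2,
    "Drugs": 44.5,
    "Office visits": 46.8,
    "Outpatient": 79.7,
    "Lab": 58.2,
}

# Annualize the monthly total (twelve months of average spending).
annual_total = 12 * TOTAL_MONTHLY

# Share of total monthly spending accounted for by each listed category.
shares = {name: value / TOTAL_MONTHLY for name, value in COMPONENTS.items()}

# Spending in Table 1's total that is not attributed to the six listed categories.
unlisted = TOTAL_MONTHLY - sum(COMPONENTS.values())

if __name__ == "__main__":
    print(f"Annual spending per member: ${annual_total:,.0f}")
    for name, share in shares.items():
        print(f"{name:<18}{share:6.1%}")
    print(f"Unlisted categories: ${unlisted:.1f} per member-month")
```

The annualized total is about $4,615 per member, and the six listed categories sum to slightly less than the reported total, so a small residual falls outside them.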
be eligible for CommCare. After making these adjustments, our full sample includes 2,842,493 member-month observations and our continuously enrolled subsample includes 1,865,777 member-month observations.

Our claims data include the universe of medical claims for CommCare enrollees during our sample period. For each member in each month of our data, we use the claims data to calculate the sum of all spending for each type of medical care, including zeroes for members who have no claims in a particular month.³ For our regression analysis, we then collapse these member-month observations to averages for each percent of income as a share of the FPL in each of the 24 months of data. We allow two observations per month for income cells between 200 and 300 percent of FPL: one for individuals who were ever enrolled in Plan 4 and one for individuals who were never enrolled in Plan 4. As a result, the sample size for our main regressions is 9,644.

³ In our earlier work, we computed average monthly utilization for each individual over the months during which they were enrolled and then computed an average across all individual-years. In this version, we simply compute the average of monthly utilization across all member-month observations. In effect, we previously placed disproportionate weight on individuals who were enrolled for shorter periods of time; now the weight that we place on individuals is proportional to the amount of time that they are enrolled.

2.2. Descriptive statistics

In Table 1, we describe the characteristics of our study population and, in Table 2, we provide details of the copayment change. As Table 1 illustrates, the sample is poor; on average, members were at 95% of the FPL. They are also quite sick, with about 30% of the sample having some form of chronic disease (as measured by a diagnosis of diabetes, hypertension, hyperlipidemia, asthma, arthritis, affective disorders, or gastritis; these conditions are derived from Goldman et al. (2004) and also used in Chandra et al. (2010a)). Annual spending is approximately $4,600, much of which comes from hospitalizations and outpatient care.

In our data, we identify outpatient care as care that indicates “outpatient hospital” as the place of service on the claim; it includes both facility and professional claims for the outpatient hospital visit. Such care accounts for about 20% of total costs. Labs include any claims that indicate “Independent Laboratory” as the place of service, or that indicate that the service provided was “Laboratory” or “Radiology.” Labs account for approximately 15% of total costs. Office visits include professional claims that do not indicate that the place of service is outpatient hospital, independent laboratory, or emergency room; that do not occur on the same day as any inpatient hospital claims; and that do not have a service code of “Laboratory” or “Radiology.” They are, therefore, largely ambulatory visits for checkups and consultations and they account for roughly 15% of total costs.

In Table 1, we report means for two samples: the full sample with 247,565 members and a sample of 122,456 members who stayed continuously enrolled at the time of the copayment increase. Individuals in this second “continuously enrolled” sample were not necessarily enrolled for the full 24-month sample period, but did not change their enrollment status at the time of the copayment increase. Individuals who initially enrolled in Plan 4 but switched to Plan 3 in July 2008 are not excluded from the “continuously enrolled” sample, since it was not possible for them to remain in Plan 4 at that time.

We created this “continuously enrolled” sample because examining utilization changes in the full sample is subject to considerable bias if member income or enrollment is a function of health status. This is a real concern in the full sample: in our work on the individual mandate in Massachusetts, we demonstrated that the insurance market for Commonwealth Care was laced with adverse selection prior to the individual mandate binding on January 1, 2008 (Chandra et al., 2011). Consistent with adverse selection, we found that those who enrolled first in CommCare were sicker, with relatively healthy individuals joining the plan closer to the date of the mandate. This time-varying change in the composition of enrollees will affect full-sample estimates of the price elasticity of demand because utilization and health status are correlated in a time-varying manner. Moreover, there was a change in the premiums charged to enrollees in CommCare at the same time that copayments changed, which could impact the composition of enrollees. To mitigate these concerns, we examine the continuously enrolled sample, who were continuously enrolled just before and after the policy change, to ensure that our findings are not the result of differential entry and exit rates for different types of individuals around the time of the copayment change. Comparing the means in Table 1 reassures us: the two samples are similar in age, gender, and income; furthermore, utilization levels
Table 2
Copayment changes by percent of federal poverty line of member.

                              Plan 1            Plan 2            Plan 3            Plan 4
                              0–100% of FPL     100–200% of FPL   200–300% of FPL   200–300% of FPL
Copayments for hospital inpatient admission
  October 2006–June 2008      $0                $50               $250              $50
  July 2008–present           $0                $50               $250              $250
Copayments for emergency room visits
  October 2006–June 2008      $3                $50               $75               $50
  July 2008–present           $3                $50               $100              $100
Copayments for outpatient surgery
  October 2006–June 2008      $0                $50               $100              $50
  July 2008–present           $0                $50               $125              $125
Copayments for office visits (primary care/specialist)
  October 2006–June 2008      $0/$0             $5/$10            $10/$20           $5/$10
  July 2008–present           $0/$0             $10/$18           $15/$22           $15/$22
Copayment for prescription drugs (generics/formulary/non-formulary retail)
  October 2006–June 2008      $1/$3             $5/$10/$30        $10/$20/$40       $5/$10/$30
  July 2008–present           $1/$3             $10/$20/$40       $12.50/$25/$50    $12.50/$25/$50
Average copayment change (across all services)
  October 2006–June 2008      $1.11             $8.39             $16.83            $8.39
  July 2008–present           $1.11             $13.99            $21.30            $21.30
  Increase                    0 percent         51 percent        24 percent        93 percent

Note: Prior to July 2008, there were two plan types for members whose income was between 200 and 300 percent of the FPL (Plans 3 and 4). Plan 4 was discontinued in July 2008 and its members were enrolled in Plan 3. Throughout this period, the copayment for emergency room use is waived if the patient is admitted to the hospital. To calculate average copayments, we weighted copayments for each service (hospitalizations, ER use, prescription drugs) by its pre-reform utilization (pre-July 2008 utilization).
Source: The Massachusetts Health Insurance Connector Authority (2008).
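The "Increase" row of Table 2 can be reproduced from the two average-copayment rows above it. The sketch below uses only the table's own figures. It suggests that the reported percentages match the change in the natural log of the average copayment (times 100) rather than the simple percent change. That reading is an inference from the arithmetic, not something the table note states, and the function names are illustrative.

```python
import math

# Utilization-weighted average copayments (USD) from the bottom panel of Table 2,
# stored as (October 2006-June 2008, July 2008-present).
AVG_COPAY = {
    "Plan 1": (1.11, 1.11),
    "Plan 2": (8.39, 13.99),
    "Plan 3": (16.83, 21.30),
    "Plan 4": (8.39, 21.30),
}

def log_change(pre: float, post: float) -> float:
    """Change in ln(copayment); 0.51 corresponds to 51 log points."""
    return math.log(post / pre)

def simple_change(pre: float, post: float) -> float:
    """Ordinary proportional change, (post - pre) / pre."""
    return (post - pre) / pre

# For each plan: (log change, simple proportional change).
changes = {
    plan: (log_change(pre, post), simple_change(pre, post))
    for plan, (pre, post) in AVG_COPAY.items()
}

if __name__ == "__main__":
    for plan, (log_chg, pct_chg) in changes.items():
        print(f"{plan}: log change {100 * log_chg:5.1f}, simple change {100 * pct_chg:5.1f}%")
```

Under this reading, the log changes round to 0, 51, 24 and 93, matching the table, while the simple percent change for Plan 2 is closer to 67 percent. A log-change convention would also be consistent with a regressor defined as ln(Copay).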
are similar as are enrollments across the four plans that we study.4
Nonetheless, we acknowledge the possibility that our findings Generic Drug Copayments
might not generalize to the broader, low-income population. 10
In Table 2 we summarize the variation that we use to understand
the effect of patient cost-sharing on patient demand: we exploit 8
Dollars Per Prescription
a series of changes in the discontinuities in the cost-sharing pro-

visions of CommCare that occurred in July 2008. These changes, 6
which varied by type of plan, are reported in Table 2. Of these
changes, the most salient are the increases in copayments for 4
prescription drugs and office visits for members whose incomes
exceeded 100% of the FPL. Copayments for ER visits increased only
2
for those with incomes over 200% of FPL. Copayments for hospital
admissions were unchanged, except for the Plan 4 members who
0
had to switch to Plan 3. Note that there were no copayment changes
0 100 200 300
for members whose income was between 0 and 100 percent of the Income as a Percent of Federal Poverty Line
FPL.
Pre Post Pre, Plan IV Post, Plan IV
In order to estimate the price elasticity of demand, we devise a
simple measure of the overall change in copayments. Specifically,
we calculate the weighted average of the copayments for generic Fig. 1. Copayments for generic drugs.
drugs, formulary drugs, non-formulary drugs, office visits, ER visits,
and hospitalizations. The weights are average monthly utilization
of each type of care during the pre-period (July 2007 to June 2008)
frequency that those services are used i.e., more weight is assigned
for the entire sample. At the bottom of Table 2, we summarize the
to office visit and generic prescription drug copayments, because
(unadjusted) change in copayments that we will exploit: that is,
those are the most frequently used in the population overall.5
there is no change for Plan 1, a 51% increase for Plan 2, a 24% increase
In Fig. 1, we plot the average copayment for one category of
for Plan 3, and a 93% increase for Plan 4. This weighted average
utilization—generic drugs—by income category and by period (‘pre’
measure produces meaningful results in two situations: One such
denotes the period before July 2008 and ‘post’ denotes the period
situation is when copayment changes are proportional across all
after July 2008). This figure graphically illustrates the basis for our
services. Another is if individuals respond to copayment increases
formal regression analysis below.
across services in a manner that is proportional to the overall
4 We investigated the possibility that a disproportionate number of people switched plans because of the copayment increase. The percent of members who were new members of a plan in May, June, July and August of 2008 was 5.2, 5.2, 4.7 and 4.4 percent respectively (the policy change occurred in July 2008). Similarly, the percent of members who left the plan in May, June, July and August of 2008 was 5.2, 4.8, 4.7 and 5.7 percent respectively. In other words, we do not see excess churning at the time that the policy change occurred.

5 It is also possible that individuals respond more to the copayments that, proportionately, have affected them more in the past. To test sensitivity to this issue, we calculated an alternative copayment measure, with weights that are specific to an individual’s spending tercile in the base year. When we calculate the weighted average copayment in this way (and control for tercile in our regressions), our overall estimates were very similar to those reported in the paper, suggesting that this possibility, while plausible, was not enough to justify the alternative approach.
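The weighted average copayment described above can be illustrated with a minimal sketch. The service categories, dollar amounts, and usage weights below are hypothetical placeholders, not values from Table 2; only the weighting logic follows the text, and the function names are assumptions made for the illustration.

```python
def weighted_average_copay(copays, weights):
    """Average copayment across service categories, weighted by usage share."""
    total_weight = sum(weights[service] for service in copays)
    weighted_sum = sum(copays[service] * weights[service] for service in copays)
    return weighted_sum / total_weight


def percent_change(pre, post):
    """Percent change from the pre-period value to the post-period value."""
    return 100.0 * (post - pre) / pre


# Hypothetical usage weights: share of all encounters in the whole sample.
usage_weights = {"office_visit": 0.45, "generic_rx": 0.40, "er": 0.10, "hospital": 0.05}

# Hypothetical copayments (dollars) before and after a policy change.
copay_pre = {"office_visit": 5.0, "generic_rx": 5.0, "er": 50.0, "hospital": 50.0}
copay_post = {"office_visit": 10.0, "generic_rx": 10.0, "er": 50.0, "hospital": 50.0}

pre = weighted_average_copay(copay_pre, usage_weights)
post = weighted_average_copay(copay_post, usage_weights)
print(f"pre = {pre:.2f}, post = {post:.2f}, change = {percent_change(pre, post):.1f}%")

# If every copayment changes by the same proportion, the weighted average
# changes by that proportion too, whatever the weights are.
doubled = {service: 2.0 * amount for service, amount in copay_pre.items()}
print(percent_change(pre, weighted_average_copay(doubled, usage_weights)))
```

The last two lines show the first of the two situations named in the text: with proportional changes across all services, the choice of weights does not affect the measured percent change.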
Table 3
GLM estimates of the effect of copayment increases on healthcare spending (all patients).

Panel A: all patients

Column  Outcome                ln(Copay) elasticity (SE)  N
(1)     Total spending         −0.158 (0.064)             9644
(2)     Hospital spending      −0.115 (0.247)             9644
(3)     ER spending            −0.207 (0.152)             9644
(4)     Outpatient spending    −0.203 (0.122)             9644
(5)     Office visit spending  −0.147 (0.051)             9644
(6)     Rx spending            −0.131 (0.074)             9644
(7)     Lab spending           −0.282 (0.089)             9644

Note: Each number in the table is an elasticity that comes from a separate GLM regression. Dependent variable is Total Monthly Healthcare Expenditure per member per month. Standard errors are clustered on member, and explanatory variables include percent of FPL fixed-effects, and year*month indicators.
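Because each entry in Table 3 is an elasticity, it can be read as the percent change in spending for a given percent change in copayments. The sketch below converts the Table 3 point estimates into the change implied by a 10 percent copayment increase. It assumes a constant elasticity over that range, which is a simplification made for illustration; the function name is hypothetical.

```python
# Point estimates from Table 3 (elasticity of spending with respect to copayments).
TABLE3_ELASTICITIES = {
    "Total spending": -0.158,
    "Hospital spending": -0.115,
    "ER spending": -0.207,
    "Outpatient spending": -0.203,
    "Office visit spending": -0.147,
    "Rx spending": -0.131,
    "Lab spending": -0.282,
}


def pct_change_in_spending(elasticity, pct_copay_increase):
    """Percent change in spending implied by a constant elasticity.

    Assumption for illustration: spending is proportional to
    copayment ** elasticity over the range of the change.
    """
    ratio = (1.0 + pct_copay_increase / 100.0) ** elasticity
    return 100.0 * (ratio - 1.0)


implied = {
    outcome: pct_change_in_spending(elasticity, 10.0)
    for outcome, elasticity in TABLE3_ELASTICITIES.items()
}

for outcome, change in implied.items():
    print(f"{outcome:22s} {change:+.2f}%")
```

For total spending this gives a decline of roughly 1.5 percent, in line with the "1 to 2 percent" reading of the −0.158 estimate.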
Because a number of cost-sharing increases occurred at the same time, it is important not to attribute changes in a specific category of spending, such as the utilization of prescription drugs, solely to changes in the copayments for these drugs. Declines in the use of prescription drugs may, for example, partially reflect a reduction in the use of office visits. For this reason, we are most interested in the reduction in total expenses across all categories of care.

3. Estimation

Because the policy variation occurs at the level of plans, and plans are defined according to the percent of FPL of the member’s income, we estimate our regression models at the level of FPL × month, and cluster our standard errors at the level of FPL. The simplest framework for estimating the price elasticity of demand is the equation below:

\[ \ln \mathrm{Spending}_{jt} = \beta_0 + \beta_1 \ln \mathrm{Copayment}_{jt} + \mathrm{FPL}_j + \mathrm{Time}_t + \varepsilon_{jt} \tag{1} \]

Here, \(\mathrm{Spending}_{jt}\) is the average monthly spending for income cell j in time t, and \(\mathrm{Copayment}_{jt}\) is the weighted average copayment for this cell at a point in time. Income is defined as percent of the FPL so all our specifications can (and do) include fixed effects for each percent of FPL. Note that plan fixed effects are perfectly correlated with the percent of FPL fixed effects, and are therefore not included. We also include year × month fixed effects to remove any effects of time trends or seasonality.

Eq. (1) identifies the price (copayment) elasticity of healthcare as \(\beta_1\). In Eq. (1) variation in copayments is coming only from within-plan increases. Our approach is identical to identifying the effect of copayment changes using a regression-discontinuity design, where we identify off changes in copayments at 100 and 200 percent of the poverty line (see Fig. 1); later in this paper (Table 7), we restrict the analysis to utilization in the period immediately around the policy change and demonstrate that the results are unchanged.

We could estimate Eq. (1) with OLS, but this would make sense only in a world where all cells had non-zero spending. It is tempting to add a small amount of (arbitrary) spending, such as $1, to overcome the problem that ln(0) is not defined; indeed, this is the approach that we took in Chandra et al. (2010b). But this can be a problematic strategy, because ln(Spending) is quite sensitive to spending at the bottom of the spending distribution; converting ln($1) to ln($2) has the same effect as converting ln($1000) to ln($2000). To circumvent this problem, we follow the literature and estimate a generalized linear model (GLM) with a log link function (Buntin and Zaslavsky, 2004). In a GLM with a log link function, the conditional mean is modeled as:

\[ \ln E(\mathrm{Spending} \mid X) = \beta_0 + \beta_1 \ln \mathrm{Copayment}_{jt} + \mathrm{FPL}_j + \mathrm{Time}_t \tag{2} \]

This setup models both the mean and variance on the original scale of the dependent variable (here, dollars). We allow for a natural form of heteroscedasticity, where var(Spending|X) is allowed to depend on the mean level of conditional spending, E(Spending|X), which is a function of the covariates. The use of the GLM model avoids the problem with two-part models (which introduce selection on the dependent variable) and yields estimates that are easily interpretable as elasticities.6 In principle, Eq. (2) could also be estimated at the level of individuals and we could include individual fixed effects. Because the policy change occurs at the plan level, and individuals are assigned to plans based on their income, there is no additional benefit from this approach.

6 In contrast to this model, in OLS one is assuming that \(E(\ln \mathrm{Spending} \mid X) = \beta_0 + \beta_1 \ln \mathrm{Copayment}_{jt} + \mathrm{FPL}_j + \mathrm{Time}_t\), which is difficult to translate into easily interpretable statements about E(Spending). The solution is to use a ‘smearing’ estimator, where exponentiated predictions are multiplied by a smearing factor to calculate expected values on the raw (unlogged) scale (Manning et al., 1987).

4. Results

Table 3 reports our key results. In column 1, we report the overall elasticity, which is −0.158; a 10 percent increase in prices faced by patients would reduce utilization by 1 to 2 percent. This is comparable to the much older estimates of the RAND HIE, albeit somewhat lower.

With our data, we are able to estimate separate effects by type of service: hospital spending, ER spending, outpatient spending, office visit spending, prescription drug spending, and spending on laboratory services. We are also able to examine whether there were ‘offset’ effects; that is, did inpatient hospital visits increase, potentially because of excessive reductions in primary care utilization? The results in Table 3 suggest that elasticities for all services fall in a relatively narrow range, between −0.1 and −0.3. Perhaps most interestingly, we find no evidence of offset effects. Hospital utilization falls, rather than rising, as copayments are increased (but this effect is statistically insignificant). For Plans 1, 2 and 3, hospital copayments did not increase, so that any impact on hospital utilization represents a pure offset effect. But for patients in Plan 4, this could reflect the demand side impact of copayments rising from $50 to $250 for hospital care. Our results are similar, however, if we exclude Plan 4 (which is unsurprising since Plan 4 is 1 percent of the total sample), clarifying that we are estimating the offset effect only.

We next examine whether the overall elasticity varies by type of patient. We first examine whether price sensitivity varies with health status, since prior research has suggested that any offset effects are likely to be concentrated among those in poor health (Manning et al., 1987; Chandra et al., 2010a). We use two measures of chronic illness. One measure is an indicator for the presence of a diagnosis of hypertension, high cholesterol, diabetes, asthma, arthritis, affective disorders or gastritis, following definitions in Goldman et al. (2004). Thirty-one percent of our continuously enrolled sample meets this definition of chronic illness. The second measure is the Charlson Comorbidity Index, a weighted sum of indicators for certain health conditions, where the weights indicate the relative increase in one-year mortality risk associated with the condition, as described in Charlson et al. (1987). We report separate price elasticities for individuals with a Charlson index value
of 0 (i.e., those who have none of the conditions) and for those with Charlson index values of 1 or greater. Twenty-four percent of our continuously enrolled sample meets this definition of chronic illness. These chronically ill individuals have much higher spending levels than the full sample of continuously enrolled individuals; those with a chronic illness spend, on average, $563 per month and those with a Charlson score of 1 or greater spend, on average, $787 per month, as compared to $359 per month for the entire sample of continuously enrolled individuals.

We also examine whether price sensitivity is different in the “tail” of the spending distribution. To do this, we divide our sample into individuals with high and low levels of spending during a baseline period, the 2007–08 plan year. The “high spenders,” those who were in the top quartile of utilization during the base period, spend, on average, $968 per month, whereas those who were in the bottom three quartiles spend, on average, $158 per month.

The results in Table 4 show total spending elasticities for different subsamples of our data. There is a somewhat higher elasticity for those with a Charlson index score of 1 or more, as seen in column 3, although it is not statistically significantly different from the elasticity for those with a Charlson index score of 0. There is a substantially smaller elasticity for those with a chronic illness diagnosis, seen in column 5; this smaller elasticity appears to be driven by positive, but insignificant, relationships between copayments and overall spending among those with diabetes and high cholesterol, and a positive relationship that is statistically significant at the 10% level for asthma patients.7 The positive result for asthma patients, which is not significant at the 5% level, is shown in column 6. Similarly, the price elasticity is positive and insignificant for those who were in the top quartile of the spending distribution, as seen in column 7. Taken together, these results suggest that the impact of copayments is not uniform; there appears to be less price sensitivity among less healthy enrollees.

Table 4
GLM estimates of the effect of copayment increases on medical spending (by type of patient).

Column  Sample                   ln(Copay) elasticity (SE)  N
(1)     All patients             −0.158 (0.064)             9644
(2)     Charlson index = 0       −0.184 (0.068)             9619
(3)     Charlson index = 1+      −0.216 (0.100)             9590
(4)     No chronic disease       −0.274 (0.086)             9606
(5)     Any chronic disease      −0.057 (0.094)             9608
(6)     Asthma                   0.366 (0.215)              7893
(7)     Top spending quartile    0.131 (0.083)              9590
(8)     Excluding top quartile   −0.161 (0.076)             9627

Note: Each number in the table is an elasticity that comes from a separate GLM regression. Dependent variable is Total Monthly Healthcare Expenditure per member per month. Standard errors are clustered on member, and explanatory variables include percent of FPL fixed-effects, and year*month indicators.

7 High cholesterol and asthma diagnoses receive a score of 0 in the Charlson index, so the impacts of cost-sharing for these diagnoses explain the difference in our findings when we use the chronic disease indicator instead of the Charlson index.

There are two classes of explanations for this set of findings. One is that less healthy patients are less responsive to higher copayments, because the marginal benefit of care to them exceeds the higher out-of-pocket cost of the copayment. The other is the presence of offset effects amongst those who are chronically ill. Indeed, this is the group for whom offsets have most frequently been found in prior work (Chandra et al., 2010a; Goldman et al., 2007; Trivedi et al., 2010). Of course, both explanations could apply.

In Table 5, we explore these explanations by restricting the analysis to members with any chronic illness. We observe smaller elasticities for prescription drug and office visit spending among the chronically ill, consistent with the first explanation. In this sample, we do not see statistically significant evidence of hospital spending increases as a result of copayments increasing, although the point estimate is positive. Findings (not reported) are similar for the members in the top quartile of the baseline spending distribution, who have a statistically insignificant increase in hospital and outpatient spending. These findings are suggestive of offset effects but, given the imprecision of the estimates, are not conclusive. Regardless of the mechanism, the lower price elasticities among these subsamples suggest that, while copayments are effective on average at reducing total spending among the low-income population, there are chronically ill subsamples for whom copayments are less important.

The overall reduction in utilization reflects the fact that some members cut back on services completely (extensive margin), while others reduce their utilization while continuing to use some
Table 5
GLM estimates of the effect of copayment increases on spending (chronically ill members).

Column  Outcome                ln(Copay) elasticity (SE)  N
(1)     Total spending         −0.057 (0.094)             9608
(2)     Hospital spending      0.356 (0.407)              9608
(3)     ER spending            Insufficient data          9608
(4)     Outpatient spending    −0.194 (0.174)             9608
(5)     Office visit spending  −0.100 (0.060)             9608
(6)     Rx spending            −0.098 (0.073)             9608
(7)     Lab spending           −0.318 (0.101)             9608

Note: Each number in the table is an elasticity that comes from a separate GLM regression. Dependent variable is Total Monthly Healthcare Expenditure per member per month.
Table 6
OLS estimates of the effect of copayment increases on any spending (extensive margin results), (all patients).

Column  Outcome                    ln(Copay) (SE)    N     Average spending | spending > 0  Implied elasticity from extensive margin
(1)     Any spending               −0.052 (0.009)    9644  $746                             −0.11
(2)     Any hospital spending      −0.0003 (0.002)   9644  $5605                            –
(3)     Any ER spending            −0.001 (0.002)    9644  $1055                            –
(4)     Any outpatient spending    −0.001 (0.005)    9644  $697                             –
(5)     Any office visit spending  −0.019 (0.007)    9644  $206                             −0.088
(6)     Any Rx spending            −0.069 (0.010)    9644  $147                             −0.231
(7)     Any lab spending           −0.024 (0.005)    9944  $362                             −0.159

Note: Each number in the table is the percentage point change in the probability of having any medical spending. Each estimate comes from a separate OLS regression that is estimated at the percent of FPL × month level. Dependent variable is the fraction of the cell that had any utilization. Standard errors are clustered on member, and explanatory variables include percent of FPL fixed-effects, and year*month indicators. The implied elasticity from the extensive margin is the price elasticity of demand that would arise from behavior at the extensive margin. Empty cells are those where the effect at the extensive margin is not statistically different than zero.
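The "implied elasticity from extensive margin" entry for total spending can be reproduced with the paper's own formula: the extensive-margin coefficient times average spending among users, divided by average spending overall. The sketch below uses the Table 6 values together with the $358.8 overall average that the text attributes to Table 1; the function name is a hypothetical helper, not the authors' code.

```python
def implied_extensive_elasticity(coef_any_use, avg_spending_if_positive, avg_spending_overall):
    """Elasticity that would arise if the whole response came from members
    dropping to zero use, proxying their spending by the average among users."""
    return coef_any_use * avg_spending_if_positive / avg_spending_overall


# Total spending: coefficient and user average from Table 6, overall average from the text.
implied_total = implied_extensive_elasticity(
    coef_any_use=-0.052,
    avg_spending_if_positive=746.0,
    avg_spending_overall=358.8,
)

# Share of the overall Table 3 elasticity (-0.158) attributable to the extensive margin.
overall_elasticity = -0.158
extensive_share = implied_total / overall_elasticity

print(f"implied elasticity: {implied_total:.3f}")
print(f"share of overall elasticity: {extensive_share:.0%}")
```

The other categories in Table 6 would follow the same formula, but their overall average spending figures come from Table 1 and are not repeated in this section, so they are not computed here.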
care (intensive margin). These two effects are combined in the tables that we have reported so far. In Table 6, we report the importance of the extensive margin for all patients. Here, the dependent variable is the fraction of members who had any utilization of care, by type of care; this can be modeled by OLS. We see statistically insignificant, negative effects on the extensive margin of hospital care, emergency room care, and outpatient care. But a 10% increase in copayments results in a statistically significant 0.19 percentage point decline in the probability of making any physician office visits, a 0.69 percentage point decline in the use of any prescription drugs, and a 0.24 percentage point decline in the use of any laboratory services. These results are very consistent across samples.

We can ask how much of the overall decline in utilization is plausibly the consequence of patients reducing utilization at the extensive margin. To perform this calculation, we multiply the change in participation at the extensive margin by the spending among those who adjusted on the extensive margin, and then divide by the total spending. We proxy utilization for those who adjusted on the extensive margin by using the average utilization of patients who had positive utilization.8 For example, for total spending, we calculate the implied elasticity from behavior at the extensive margin as −0.052 × $746/$358.8 = −0.11 (here, $746 is the average spending of enrollees with nonzero spending, and $358.8 is average spending from Table 1). This is two-thirds of the total elasticity reported in Table 3, and suggests that about 70 percent of the decline in spending might be attributed to members dropping to zero utilization; the rest of the decline is a consequence of members using fewer services after copayments increase than before. Table 6 performs the same calculation for different categories of care (we ignore categories where behavior at the extensive margin was not statistically different than zero): in general, behavior at the extensive margin explains a substantial portion of the overall price elasticity. This finding is consistent with Stuart and Zacker (1999), who concluded, in their cross-sectional analysis of drug copayments in Medicaid, that the primary impact of drug copayments in Medicaid was a reduction in Medicaid recipients filling any prescriptions. It is a significant finding because it suggests that rather than all patients cutting back on some care, some patients cut back completely.

8 Note that if all members had positive spending pre-reform, and if (pre-reform) spending for the average member were the same as spending for a person who dropped to zero utilization, then this exercise would generate the same elasticity as what we have estimated above. Of course, we have no way of validating this assumption, so this calculation is purely illustrative of the potential importance of these extensive margin responses.

5. Robustness checks

In Table 7 we examine the robustness of our results to three alternative specifications. In the first specification, we ignore information from the window just before and just after the policy change (that is, we exclude June and July, 2008). We did this to examine the concern that patients may be stockpiling drugs, or seeking care, in anticipation of the policy change. Such behavior will cause us to overstate the price elasticity of demand because it accentuates the reduction in utilization after the copayment increase (the increase in pre-policy use is accompanied by a reduction in post-policy use). The first panel of Table 7 shows that this fear is unfounded: the elasticities reported here are quite similar to those reported in Table 3, where we use all the data available to us. The use of prescription drugs is particularly susceptible to the concern about stockpiling, but we do not find any support for this actually happening.

Next, we restrict the analysis to the period just before and just after the policy change (that is, we keep data from April through September of 2008). The motivation for this specification is that the US financial markets collapsed in early October of 2008 and there was a rapid rise in recession-induced unemployment in late 2008 and 2009. We wanted to be sure that the price elasticity of demand that we are estimating was not partially confounded by these trends in the business cycle. While it is true that we have included income fixed-effects and very flexible year × month controls, there may still be income-specific trends in the data. To allay this concern we constructed Panel B of Table 7. The results are similar to those in Table 3, so we are reassured that the business cycle is not an alternative explanation for our results. One downside of
Table 7
Specification and falsification checks: GLM estimates of the effect of copayment increases on spending (all patients).

Panel A: omit policy change window (exclude June/July 2008; n = 8840)
Panel B: restrict to policy change window (retain April to September 2008; n = 2412)
Panel C: falsify policy change by changing date (set policy change to January 2008 and keep observations through June 2008; n = 4820)

ln(Copay) elasticity (SE)
Column  Outcome                Panel A          Panel B            Panel C
(1)     Total spending         −0.196 (0.072)   −0.207 (0.088)     0.095 (0.096)
(2)     Hospital spending      −0.194 (0.279)   Insufficient data  0.380 (0.502)
(3)     ER spending            −0.331 (0.161)   −0.075 (0.229)     0.072 (0.214)
(4)     Outpatient spending    −0.231 (0.133)   −0.225 (0.140)     −0.024 (0.166)
(5)     Office visit spending  −0.117 (0.056)   −0.251 (0.064)     0.129 (0.082)
(6)     Rx spending            −0.138 (0.082)   −0.135 (0.062)     −0.059 (0.095)
(7)     Lab spending           −0.300 (0.098)   −0.297 (0.091)     0.047 (0.117)

Note: Each number in the table is an elasticity that comes from a separate GLM regression. Dependent variable is total monthly healthcare expenditure per member per month.
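As a quick reading aid for the falsification panel, the sketch below divides each Panel C estimate by its standard error and compares the ratio with a conventional critical value. The two-sided normal approximation and the 10% threshold are assumptions made for this illustration; the paper does not state which test statistic it uses.

```python
# Panel C of Table 7: (estimate, standard error) for each spending category.
PANEL_C = {
    "Total spending": (0.095, 0.096),
    "Hospital spending": (0.380, 0.502),
    "ER spending": (0.072, 0.214),
    "Outpatient spending": (-0.024, 0.166),
    "Office visit spending": (0.129, 0.082),
    "Rx spending": (-0.059, 0.095),
    "Lab spending": (0.047, 0.117),
}

# Assumption: two-sided test at the 10% level using the normal approximation.
CRITICAL_Z_10PCT = 1.645


def z_ratio(estimate, std_error):
    """Ratio of a point estimate to its standard error."""
    return estimate / std_error


placebo_z = {name: z_ratio(est, se) for name, (est, se) in PANEL_C.items()}
significant = [name for name, z in placebo_z.items() if abs(z) > CRITICAL_Z_10PCT]

for name, z in placebo_z.items():
    print(f"{name:22s} z = {z:+.2f}")
print("Significant at 10%:", significant or "none")
```

Under these assumptions none of the placebo estimates clears the threshold, which is the pattern a falsification test is meant to show.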
this specification is that the loss of data also meant that we were unable to estimate a separate price elasticity of demand for hospital care (there was not enough variation in 6 months of the data to jointly identify all the fixed-effects that we have included and the effect of the copayment change, when it comes to a relatively rare event like hospital care).

Finally, in Panel C of Table 7, we coded the policy change as having occurred on January 1, 2008 (6 months before it actually happened) to verify that we obtained null effects from this placebo regression. Note that the estimated coefficients are never significant and are much smaller in magnitude, thereby reassuring us that we have really picked up the effect of the policy change and not of a pre-existing differential trend.

6. Discussion and implications

In the past, public insurance for our lowest income citizens has featured little patient copayment. This reflects concerns both about affordability and the ability of the poor to effectively manage their care in the face of marginal costs; of particular concern is the notion that the poor, facing unaffordable copayments, may cut back on necessary as well as unnecessary care. As public insurance subsidies have expanded to citizens above the poverty line, however, fiscal expediency suggests the use of some patient copayments. Yet concerns remain that copayments in this population may do more harm than good.

We have investigated this issue in the Commonwealth Care program in Massachusetts, the nation’s largest expansion of publicly subsidized private insurance coverage and the model for the national Affordable Care Act. During our sample period there was a sizeable (in percentage terms) copayment change facing individuals with incomes between one and three times the poverty line. Using a unique set of claims data provided by the state, we are able to study the impacts of this copayment change on overall utilization and on utilization of specific medical services.

Our results largely confirm the conclusions of the RAND Health Insurance Experiment. We find that health care demand is somewhat sensitive to copayments, but that the elasticity is small (−0.16 on average). We find that those who are chronically ill, and especially those with diabetes, high cholesterol and asthma, have a lower price elasticity of demand. We also find no statistically significant evidence for “offset effects” that would indicate that reduced use of outpatient services led to increased demand for hospital services. Among the sickest patients in our sample, there are positive effects on inpatient care, but they are statistically insignificant, so we cannot draw strong conclusions from them. Our results suggest that ‘offset effects’ in poorer populations are possible, but require more evidence than we are able to provide.

These results are subject to a number of caveats. In particular, these copayment changes are large in percentage terms but small in absolute dollar terms. It is possible that responses could differ for larger copayment changes or changes in coinsurance rates, which would induce larger patient cost-sharing than the copayment changes examined here. Moreover, we have not estimated pure own-price effects here. The decline in prescription drug use, for example, could come from the increase in prescription drug copayments or from the increase in office copayments, which could cause physicians to prescribe fewer drugs. Our aggregate elasticity provides a measure of patient response to an overall rise in copayments, but it does not necessarily speak to the optimal design of insurance for this population. Future work could usefully address these limitations.

An important policy question is the extent to which these results apply to the cost sharing imposed by the Affordable Care Act. For those below 150% of poverty in the ACA, the cost sharing is comparable to that imposed on the Commonwealth Care population studied here. Once income rises above 150% of poverty, however, the cost sharing imposed by the ACA is much larger than that studied here, with enrollees likely facing deductibles of $500 or more as income rises. Future work on the actual impacts of ACA cost sharing changes could usefully corroborate our findings and assess whether they apply to the much higher cost sharing imposed by the ACA on those above 150% of the poverty line.

References

Baicker, K., Goldman, D., 2011. Patient cost-sharing and healthcare spending growth. Journal of Economic Perspectives 25 (2), 47–68.
Baicker, K., Mullainathan, S., Schwartzstein, J., 2012. Behavioral Hazard in Health Insurance, NBER working paper number 18468.
Buntin, Melinda Beeuwkes, Zaslavsky, Alan M., 2004. Too much ado about two-part models and transformation? Comparing methods of modeling Medicare expenditures. Journal of Health Economics 23 (3), 525–542.
Chandra, A., Gruber, J., McKnight, R., 2010a. Patient cost-sharing and hospitalization offsets in the elderly. American Economic Review 100 (1), 193–213.
Chandra, A., Gruber, J., McKnight, R., 2010b. Patient cost sharing in low income populations. American Economic Review 100 (2), 303–308.
Chandra, A., Gruber, J., McKnight, R., 2011. The importance of the individual mandate: evidence from Massachusetts. New England Journal of Medicine 364 (4), 293–295.
Charlson, M.E., Pompei, P., Ales, K.L., McKenzie, C.R., 1987. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. Journal of Chronic Diseases 40 (5), 373–383.
Congressional Budget Office, 2013. Effects on Health Insurance and the Federal Budget for the Insurance Coverage Provisions in the Affordable Care Act–May 2013 Baseline, Available from www.cbo.gov/publications/44190 on 03.06.13.
Cunningham, P.J., 2002. Prescription drug access: not just a Medicare problem. Center for Studying Health System Change 2002 (51), 1–4 (Issue Brief).
Donaldson, C., 2008. Top-ups for cancer drugs: can we kill the zombie for good? British Medical Journal 337, a578, doi:10.1136/bmj.a578.
Evans, R.G., Barer, M.L., Stoddart, G.L., Bhatia, V., 1993. Who are the zombie masters, and what do they want? Vancouver: Centre for Health Services and Policy Research. University of British Columbia.
Federal Register, 2013, January 24. U.S. Government Printing Office, Washington, DC.
Gaynor, M., Li, J., Vogt, W.B., 2007. Substitution, spending offsets, and prescription drug benefit design. Forum for Health Economics and Policy 10 (2).
Goldman, D., Joyce, G., Escarce, J., Pace, J., Solomon, M., Laouri, M., Landsman, P., Teutsch, S., 2004. Pharmacy benefits and the use of drugs by the chronically ill. Journal of the American Medical Association 291, 2344–2350.
Goldman, D., Joyce, G., Zheng, Y., 2007. Prescription drug cost sharing: associations with medication and medical utilization and spending and health. Journal of the American Medical Association 298 (1), 61–69.
Goldman, D., Smith, J.P., 2002. Can patient self-management help explain the SES health gradient? Proceedings of the National Academy of Sciences 99 (16), 10929–10934.
Massachusetts Health Insurance Connector Authority, 2008. Report to the Massachusetts Legislature: Implementation of the Health Care Reform Law 2006–2008, Chapter 58, available from http://www.mahealthconnector.org
Manning, W.G., Newhouse, J.P., Duan, N., Keeler, E.B., Leibowitz, A., 1987. Health insurance and the demand for medical care: evidence from a randomized experiment. The American Economic Review 77 (3), 251–277.
McGuire, T., 2012. Demand for health insurance. In: Pauly, M.V., McGuire, T.G., Barros, P.P. (Eds.), The Handbook of Health Economics. Elsevier, pp. 317–396.
Nelson Jr., A.A., Reeder, C.E., Dickson, M.W., 1984. The effect of a Medicaid drug copayment program on the utilization and cost of prescription services. Medical Care 22 (8), 724–736.
Newhouse, J., 1993. Free for All: Lessons from the RAND Health Insurance Experiment. Harvard University Press, Cambridge, MA.
Reeder, C.E., Nelson, A.A., 1985. The differential impact of copayment on drug use in a Medicaid population. Inquiry 22, 396–403.
Stuart, B., Zacker, C., 1999. Who bears the burden of Medicaid drug copayment policies? Health Affairs 18 (2), 201–212.
Tamblyn, R., et al., 2001. Adverse events associated with prescription drug cost-sharing among poor and elderly persons. Journal of the American Medical Association 285 (January (4)), 421–429.
Trivedi, A.N., Moloo, H., Mor, V., 2010. Increased ambulatory care copayments and hospitalizations among the elderly. New England Journal of Medicine 362 (4), 320–328.

Publication selection and the income elasticity of the value of a statistical life

Hristos Doucouliagos a, T.D. Stanley b, W. Kip Viscusi c,∗

a School of Accounting, Economics and Finance, Deakin University, Melbourne, Australia
b Department of Economics and Business, Hendrix College, Conway, AR, USA
c Vanderbilt University Law School, Nashville, TN, USA

Article history:
Received 4 December 2012
Received in revised form 16 October 2013

JEL classification: J17, J18, J31, C12, I10, D61

Keywords: Income elasticity; Value of statistical life; Meta-regression analysis; Selectivity bias

Estimates of the value of a statistical life (VSL) establish the price government agencies use to value fatality risks. Transferring these valuations to other populations often utilizes the income elasticity of the VSL, which typically draws on estimates from meta-analyses. Using a data set consisting of 101 estimates of the income elasticity of VSL from 14 previously reported meta-analyses, we find that after accounting for potential publication bias the income elasticity of value of a statistical life is clearly and robustly inelastic, with a value of approximately 0.25–0.63. There is also clear evidence of the importance of controlling for levels of risk, differential publication selection bias, and the greater income sensitivity of VSL from stated preference surveys.

© 2013 Elsevier B.V. All rights reserved.
1. Introduction

The value of a statistical life (VSL) is the most important parameter used in monetizing the value of health, safety, and environmental risks. For example, the largest benefit component of regulations under the Clean Air Act consists of the reduction of mortality risks valued by VSL (EPA, 1997). Regulatory policies to reduce mortality risks from medical devices and transportation safety improvements can have large benefits when evaluated at large VSLs. Thus, it is essential to get the value of a statistical life right in order to allocate public resources efficiently (Viscusi and Aldy, 2003). Ideally, these values should reflect the willingness to pay of the population that will benefit from a mortality risk reduction.

Although estimates of the VSL vary greatly, most research surveys report the VSL to be somewhere between $6 million and $10 million U.S. dollars. This range in the literature has been reflected in similar clustering of the VSL estimates used by government agencies since the mid-1990s.1 Meta-analyses of the VSL literature have played a prominent role in agencies’ selection of the VSL.

Policy analysts generally face a benefits transfer problem in that the population in any VSL study may not be representative of the population affected by the policy. While adjusting for population characteristics remains a controversial issue (U.S. EPA, 2010; Viscusi, 2011), agencies have incorporated income elasticity adjustments into their analyses.2 The U.S. EPA (2010) adjusts its VSL estimates over time to account for the effect of rising income levels on the pertinent VSL. The U.S. Department of Transportation (2011) policy guidance document adopted an income elasticity of

1 Viscusi (2009) provides the VSL amounts used to assess 40 major government regulations, all of which include estimates of the VSL in the $6–10 million range after 1996.

2 Adjustments for age are less common and generated controversy in the U.S. (Viscusi, 2009), but Canada and the World Bank have adopted age adjustments to the VSL (Hammitt and Robinson, 2011).

∗ Corresponding author. Tel.: +1 615 343 7715; fax: +1 615 322 5953.
E-mail addresses: douc@deakin.edu.au (H. Doucouliagos), Stanley@hendrix.edu (T.D. Stanley), kip.viscusi@vanderbilt.edu (W.K. Viscusi).
68 H. Doucouliagos et al. / Journal of Health Economics 33 (2014) 67–75
VSL of 0.55 based on the meta-analysis results in Viscusi and Aldy (2003). More recently, the U.S. Department of Transport guidelines (2013) now use an elasticity of 1.0, splitting the difference between the meta-analysis results in Viscusi and Aldy (2003) and the quantile estimates in Kniesner et al. (2010). Because most VSL estimates are for the U.S. and other developed countries, assessing the pertinent income elasticity is essential to estimating the pertinent VSL in other countries based on the existing literature (Hammitt and Robinson, 2011). Such modifications are especially sensitive to the magnitude of the income elasticity when applied to low-income countries or populations.
This paper integrates, explains, and corrects 101 meta-analytic estimates of the income elasticity of the VSL. Consistent with the VSL meta-analysis in Doucouliagos et al. (2012), we find that once the estimated VSL’s income elasticities are corrected for observed publication selection bias, the average reported estimate is greatly reduced. We find that the VSL is a normal good, but not a luxury good. Correcting for potential publication bias yields an overall income elasticity of approximately 0.6 or as low as 0.25 when further adjusted for common misspecification biases.

2. Calculating the value of a statistical life and its income elasticity

The VSL can be estimated in a variety of ways from the choices that workers and citizens make regarding an actual or hypothetical increased risk of death. Most estimates are produced by wage-risk studies based on the tradeoffs implied by the estimated regression coefficient of the risk of fatal injury from a hedonic wage model. In this primary research literature, the dependent variable is usually the worker’s log wage, and one of many independent variables is the fatality risk. Because wages comprise much of worker income, these hedonic regressions usually cannot be further employed to estimate how VSL varies with income.
Evans and Smith (2010) note that the income elasticity of VSL can be measured in one of four ways: meta-analyses of hedonic wage studies, stated preference studies, single country comparisons of VSL at different points in time, or cross-country comparisons of VSL estimates. An additional approach involves the use of quantile regressions (Evans and Schaur, 2010; Kniesner et al., 2010). Nevertheless, the most common way to estimate the sensitivity of VSL to income is to ascertain how the VSL varies across studies. Because researchers use different samples of workers, their average income will naturally vary from study to study. Thus, a meta-analysis, which collects all comparable estimates of VSL, can provide that broader perspective needed to estimate the income elasticity of the VSL (Viscusi, 2012).
We conducted a comprehensive search for prior meta-analyses that report the income elasticity of VSL or which report regression results that could be converted into an income elasticity. The search was conducted using various search engines, including Econlit, JSTOR, Proquest, ScienceDirect, Scopus, and Google Scholar. We also pursued references in prior meta-analyses. Our search strategy and subsequent meta-analysis follow the MAER-NET guidelines for meta-analysis of observational data (see Stanley et al., 2013). This search strategy revealed 14 meta-studies from which it was possible to derive or calculate the income elasticity of VSL.3 These 14 meta-studies jointly report a total of 101 estimates of the income elasticity, and its standard error, of the value of a statistical life (see Meta-Analysis References and Appendix 1). In some cases, the estimates are derived from the same meta-study using the same data but a different specification. In other cases, different original studies are used to generate estimates of the income elasticity.
While these meta-analyses all incorporate a broad set of VSL studies, there is still great variation among these meta-regression estimates of the income elasticity of VSL, which range from −0.26 to 4. For theoretical and practical reasons, researchers should expect there to be some heterogeneity in the income elasticity. But, as we will demonstrate, much of the variation across studies arises from differences in methodology and publication selection. It is important for policymakers to have a clearer, more narrow, estimate of the average income elasticity of VSL and how it might vary with the choices that researchers make.
Generally, VSL estimates can be calculated by a two-step process (Hammitt and Robinson, 2011). In the first step, an overall VSL is estimated from a hedonic wage model or a contingent valuation survey of what people are willing to pay (WTP) for a hypothetical risk reduction. After an overall VSL value is estimated, it can be adjusted for the particular circumstances to which the policy will apply. This adjustment can primarily be made for differences in income from the sample used to estimate overall VSL value and the incomes of those likely affected by the new policy. These adjustments are therefore very sensitive to the income elasticity of VSL, and this sensitivity is especially compounded when applied to low-income countries (Hammitt and Robinson, 2011).4 Because most VSL estimates come from developed nations with relatively high incomes, it is essential to have an accurate income elasticity estimate to extrapolate to low incomes.5 Among the 14 existing meta-analyses of VSL that we include in our study, 94% of the VSL estimates come from developed nations.
There are basic economic reasons to believe that the income elasticity will be greater than 1.0 and higher for low income workers (Kaplow, 2005; Viscusi, 2010; Hammitt and Robinson, 2011). When VSL is income inelastic and extrapolated from a high-income sample to a low-income country (or population sub-group), the analyst gets values of VSL that seem to be too high given the expected lifetime income and consumption choices of very low-income workers. In these cases, it is possible for the ratio of the VSL to a worker’s discounted expected stream of future income to be much greater than in developed countries. For this reason, some analysts arbitrarily use a value of one for VSL’s income elasticity, which gives more modest estimates of VSL for low-income groups.
The primary way to estimate the income elasticity of VSL, $\eta$, has been to use meta-regression analysis. As already noted, we have found 14 such meta-analyses containing 101 estimates of the income elasticity of the value of a statistical life. Because primary studies that report estimates of VSL also typically report the average incomes of the sample of workers surveyed, it is rather easy to estimate $\eta$ from a meta-regression of VSL estimates on average income. Generally, these meta-regression estimates are inelastic. For example, Viscusi and Aldy (2003) report income elasticities ranging between 0.50 and 0.60, while Doucouliagos et al. (2012) estimate VSL’s income elasticity to be only 0.2. Likewise, contingent valuation surveys also tend to give inelastic estimates of the income elasticity of VSL (Hammitt and Robinson, 2011).
An exception to these low estimates of the inelastic income elasticity for VSL is the value based on quantile regressions (Viscusi, 2010; Kniesner et al., 2010). Quantile regression analyses of large

3 Several VSL meta-studies could not be included as they did not include income in their meta-regressions.
4 An argument can be made that income elasticity might vary between groups or countries, e.g., some groups or countries can have greater taste for safety, regardless of income.
5 How one should undertake such an extrapolation to less developed countries will also depend on the extent to which the income elasticity varies for much lower income populations, which is beyond the scope of our study.
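The sensitivity of the income adjustment discussed in Section 2 can be illustrated with a short sketch. It assumes the constant-elasticity transfer form VSL_target = VSL_base × (Y_target / Y_base)^η, which is a common way to implement such adjustments but is not written out in the text above. The function name `transfer_vsl` and all dollar figures are hypothetical and chosen only for illustration.

```python
def transfer_vsl(vsl_base, income_base, income_target, elasticity):
    """Scale a base VSL to a target population.

    Assumes a constant income elasticity (an assumption of this sketch,
    not a formula taken from the paper).
    """
    if income_base <= 0 or income_target <= 0:
        raise ValueError("incomes must be positive")
    return vsl_base * (income_target / income_base) ** elasticity


# Hypothetical inputs: an $8 million base VSL and a target population
# whose income is one tenth of the base population's income.
base_vsl = 8_000_000.0
for eta in (0.25, 0.6, 1.0):
    adjusted = transfer_vsl(base_vsl, 50_000.0, 5_000.0, eta)
    print(f"elasticity {eta}: transferred VSL = {adjusted:,.0f}")
```

Under these assumptions, a lower elasticity leaves the transferred VSL much closer to the base value, which is why the choice between an elasticity near 0.6 and one of 1.0 matters most for low-income applications.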
national surveys such as the Panel Study of Income Dynamics yield separate estimates of the VSL obtained for different income groups along with the associated income elasticities. This procedure yields an average income elasticity of 1.44, but individual quantile estimates become larger (smaller) for lower (higher) incomes (Kniesner et al., 2010).
To maximize comparability of the income elasticity values included in our analysis, we include only the income elasticity estimates from meta-regressions. These studies provide the largest number of empirical estimates of income elasticity of VSL. Understanding what causes these estimates to vary from study to study is important in itself, and combining all of these estimates provides a better assessment of the overall average income elasticity and its variation.6
The one limitation of these meta-analysis estimates is that they are drawn from the average VSL and income estimates of the samples used. Hence, they may not be representative of the entire range of income (Viscusi, 2011). In contrast, estimates from quantile regressions use a broader range of incomes and can thereby estimate the distribution of the income elasticity. However, a quantile regression estimate is based on a single sample, which might contain some idiosyncratic feature or errors that might, as a result, produce an outlier (Viscusi, 2012).7

3. Meta-regression of the value of a statistical life

Meta-regression analysis is the regression analysis of all previously published or reported comparable regression estimates of a given parameter. It is a type of meta-analysis specifically developed for empirical economics (Stanley, 2001). Because the VSL is such a key policy parameter, there have been more meta-regression analyses of VSL, 14, than for any other economic research topic (see Meta-Analysis References). The advantage of meta-regression analysis (MRA) is that it minimizes random sampling (or estimation) error by averaging across many VSL estimates. Furthermore, MRA can objectively and statistically control for any idiosyncratic research approach, technique, or other dimensions that might be suspected of inducing bias. For example, some datasets do not contain sufficient information to control for all of the many worker characteristics that have been shown to be important in explaining workers’ wages. Omitting a relevant variable is well known to potentially create a bias in the regression coefficient of interest, in this case the coefficient on fatality risk. Experience has shown that controlling for omitted-variable bias is essential, and acknowledging these potential biases does much to explain the wide variation among reported empirical economic estimates (Stanley and Doucouliagos, 2012).
Generally, meta-analysts find that VSL is between $6 and $10 million (US dollars) and that it is income inelastic. However, 29% of the income elasticities are greater than one. Aside from Doucouliagos et al. (2012), none of these meta-studies corrected for publication (or reporting) bias, and this correction makes a huge difference, reducing VSL to less than $2 million in that study.

3.1. Publication bias

Publication bias is a form of selection bias whereby researchers choose which estimate to report based on statistical significance and having the ‘correct’ sign. “(S)tudies that find insignificant and wrong-signed values of compensating wage differentials have a more difficult time getting published” (Hwang et al., 1992, p. 855). Typically, for this literature, a log wage equation is estimated where the coefficient on the fatality risk variable is used to estimate VSL. If a researcher gets a negative coefficient, she is unlikely to report it, believing that a negative VSL must be a signal of some important estimation or misspecification error. Even insignificant, positive VSLs are likely to be under-reported on the belief that reviewers and editors might use statistical significance as one criterion for publication (Card and Krueger, 1995). Publication bias is somewhat of a misnomer. Most researchers have already incorporated the perceived professional importance of statistical significance and will be less likely to report negative and/or insignificant VSL estimates even in their unpublished reports and working papers.
Nor are reviewers and meta-analysts immune to further selection bias. Doucouliagos et al. (2012, p. 199) give two examples:

For example, Kluve and Schaffner (2008) note that some studies in their literature search found negative or statistically insignificant estimates and these were excluded from their meta-analysis. In their dataset, Bellavance et al. (2009) include studies that report both negative and positive VSL estimates, but only positive estimates are included in their meta-analysis.

In this paper, we document that publication selection (or reporting) bias is also found among the meta-regression estimates of VSL’s income elasticity. Correcting this selection bias transforms an average income elasticity that is not significantly less than one to an elasticity that is significantly inelastic with a value of approximately 0.6 or less.8

3.2. A graphical illustration of publication selection bias

The best way to illustrate, literally, publication selection or reporting bias is through a funnel graph. The funnel graph is a scatter diagram of an estimate’s precision (the inverse of the estimate’s standard error or 1/SE) and the magnitude of this estimate (horizontal axis). Funnel plots have been widely used to detect publication bias among clinical trials on medical treatments. As the name suggests, a funnel graph should resemble, roughly, an inverted funnel when there are no selection or reporting biases. The symmetry of a funnel graph is quintessential. Alternatively, when a significant number of values are missing from one side or the other (that is, when the funnel is skewed or asymmetric), this may indicate selection bias.
Fig. 1 presents the funnel plot for all 101 meta-analytic estimates of VSL’s income elasticity, $\hat{\eta}$. Clearly, this graph is skewed to the right, indicative of the suppression of negative and, perhaps, insignificant income elasticities. Also, there might be selection for elastic values (e.g., $\hat{\eta} > 1$). For theoretical reasons, some researchers may wish to find income elasticities greater than or equal to one, or at least not significantly less than one. In spite of how revealing you might find this graph, the inherently subjective nature of visual interpretation means that the graph alone cannot establish publication bias or determine the corrected income elasticity. Rather, objective statistical testing is required. Fortunately, a series of tests and estimates have been developed for exactly this purpose.

6 This type of meta-analysis of prior meta-analyses is known as meta-meta-analysis. See Stanley and Doucouliagos (2012) for details.
7 The problem lies not in meta-analysis as a technique but in the estimates available to apply to it. Thus, over time when more quantile regression studies become available, it will be possible to apply meta-analysis to the distribution of income elasticities from these quantile regressions.
8 Selection for income elasticities can occur either as a result of selection for positive and statistically significant VSL estimates or as a deliberate process of selecting both VSL and income elasticity estimates. As in the case of VSL estimates, however, the existence of bias does not mean that all authors have engaged in this process.
Fig. 1. Funnel plot of the income elasticities of VSL.

3.3. Meta-regression models of selective reporting

A simple meta-regression model of estimated effects and their standard errors can be used to model and test for publication (or reporting) bias:

$$\hat{\eta}_i = \beta_0 + \beta_1 SE_i + \varepsilon_i, \tag{1}$$

where $\hat{\eta}_i$ is an individual estimate of the income elasticity of VSL and $SE_i$ is its standard error, $\beta_1 SE_i$ models publication selection bias, and estimates of $\beta_0$ serve as corrections for publication bias. The essential idea is that studies with larger SEs (and generally smaller samples) will need to engage in more intensive selection to find a statistically positive $\hat{\eta}_i$. See for example Stanley (2005, 2008) and Stanley and Doucouliagos (2012). Note that as $SE_i \to 0$, $E(\text{effect}_i) \to \beta_0$. The funnel-asymmetry test (FAT), $H_0: \beta_1 = 0$, is a low power test for the presence of selection (Egger et al., 1997), while the precision-effect test (PET), $H_0: \beta_0 = 0$, indicates whether there is a genuine effect beyond publication selection (Stanley, 2008).
This FAT–PET–MRA, Eq. (1), is never estimated by OLS because empirical economics estimates always contain great heteroskedasticity, as directly evidenced by the wide variation found among the reported standard errors, $SE_i$. Weighted least squares, WLS, estimates of MRA model (1) can be obtained either by dividing MRA (1) by $SE_i$:

$$t_i = \beta_1 + \beta_0 \frac{1}{SE_i} + v_i \tag{2}$$

or by using a WLS statistical package on MRA (1) with $1/SE_i^2$ (the inverse variance) as the weights.
Column 1 of Table 1 reports our meta-meta-estimates for the weighted least squares version of MRA model (1). Note first that there is clear evidence of both reporting selection (reject $H_0: \beta_1 = 0$; t = 3.61; p < .001) and a genuinely positive overall income elasticity of VSL (reject $H_0: \beta_0 = 0$; t = 6.99; p < .001).9 Of course, there could be other reasons for the funnel asymmetry. Perhaps it is the result of heterogeneity in model specifications that is coincidentally correlated with SE or maybe it is some type of small-sample bias. To allow for heterogeneity in how the models were specified, we embed these selection terms into a multiple MRA that also controls for important model specification choices as well as differences across countries, measures, and data (see Section 5, below). In spite of these controls, evidence of selection or small-sample bias remains. Regardless of the source of the bias, bias is bias, and we seek to minimize its influence on this important policy parameter. Fortunately, accommodating and correcting either publication bias or small-sample bias is precisely what these MRA models of publication selection do.
The estimated intercept, $\hat{\beta}_0$, from the FAT–PET–MRA (1) is an estimate of the overall income elasticity corrected for publication bias. Note how it is only half as large as the simple reported average of these 101 elasticity estimates (0.90). Although the average value of this literature is not significantly less than one (t = −1.46; p > .05), $\hat{\beta}_0$ is much less than one (t = −8.55; p < .001).
The income inelasticity of VSL is also borne out by a very simple correction for publication bias, focusing on the most precisely estimated income elasticity values. The measure, which we designate as the Top 10, is calculated by discarding 90% of research with the lowest precision and averaging the remaining 10% most accurate estimates, those at the top of the funnel graph. The Top 10 value is 0.61. Although Top 10 was developed merely to highlight how publication selection creates a statistical paradox, simulations show that it provides a rather good corrected estimate, one that is typically better than $\hat{\beta}_0$ (Stanley et al., 2010). Regardless, correcting for publication selection (or reporting or small-sample) bias makes a difference in whether the VSL may be considered a luxury or a necessity, and this has important policy implications when, for example, extrapolating to low incomes.
Unfortunately, $\hat{\beta}_0$ is known to be biased downward when there is a true nonzero effect (Stanley, 2008; Stanley and Doucouliagos, 2013). This bias is due to using a linear approximation to represent a complex non-linear relation between $\hat{\eta}_i$ and $SE_i$ (for details see Stanley and Doucouliagos, 2013). In the place of MRA model (1), simulations show that the simple MRA that substitutes $SE_i^2$ for $SE_i$ offers an improved, less biased, corrected estimate (Stanley and Doucouliagos, 2012, 2013). That is, $\hat{\gamma}_0$ in Eq. (3) below,

$$\hat{\eta}_i = \gamma_0 + \gamma_2 SE_i^2 + \nu_i \tag{3}$$

typically provides a better corrected estimate. This estimator is sometimes called ‘PEESE’ from precision-effect estimate with standard error (Stanley and Doucouliagos, 2012, 2013). Table 1 columns 3 and 4 estimate MRA (3), above, for this meta-regression income elasticity data and find $\hat{\gamma}_0 = 0.62$, with a 95% confidence interval of 0.44–0.80. The value of PEESE is larger than $\hat{\beta}_0$, as expected, but still much less than one (t = −11.1; p < .001). Note how the PEESE is virtually identical with the simple Top 10, suggesting that 0.6 is, approximately, a robust overall estimate of the income elasticity of the value of a statistical life for this research literature. This figure lies within the upper range of the income elasticity derived by Viscusi and Aldy (2003).
Some have pointed out that theory requires VSL’s income elasticity to be at least one. For example, there is a relationship between the income elasticity of VSL and the coefficient of relative risk aversion (CRRA, Kaplow, 2005). Estimates of CRRA in conjunction with

9 When MRA model (1) is reported with cluster-robust standard errors, column 2 Table 1, evidence of publication bias is less clear. The funnel-asymmetry test (FAT) is widely known to have low power, and Egger et al. (1997) recommend using $\alpha = .10$. Using this larger significance level and cluster-robust standard errors confirms publication selection for statistically significant positive income elasticities. In any case, Stanley and Doucouliagos (2012, pp. 78–79) recommend correcting for potential publication selection bias regardless of the outcome of FAT.
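The three corrections described in Section 3.3 (the FAT–PET intercept from Eq. (1), the PEESE intercept from Eq. (3), and the Top 10 average) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the helper names `wls_line`, `fat_pet`, `peese`, and `top10` are hypothetical, the data are synthetic, and no standard errors or cluster corrections are computed. Weighting Eq. (1) by $1/SE_i^2$ gives the same coefficients as running OLS on Eq. (2).

```python
def wls_line(y, x, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b


def fat_pet(est, se):
    """Eq. (1) with inverse-variance weights; returns (beta0, beta1)."""
    w = [1.0 / s ** 2 for s in se]
    return wls_line(est, se, w)


def peese(est, se):
    """Eq. (3): same weights, SE squared as the regressor; returns (gamma0, gamma2)."""
    w = [1.0 / s ** 2 for s in se]
    return wls_line(est, [s ** 2 for s in se], w)


def top10(est, se):
    """Simple average of the 10% most precise estimates (smallest SE)."""
    ranked = sorted(zip(se, est))
    k = max(1, round(0.10 * len(ranked)))
    return sum(e for _, e in ranked[:k]) / k


# Synthetic illustration (not the paper's data): a true elasticity of 0.6,
# with reported estimates inflated in proportion to their standard errors.
se = [0.05 * k for k in range(1, 21)]
est = [0.6 + 1.5 * s for s in se]

beta0, beta1 = fat_pet(est, se)
gamma0, gamma2 = peese(est, se)
print("simple average:", sum(est) / len(est))
print("FAT-PET intercept and slope:", beta0, beta1)
print("PEESE intercept:", gamma0)
print("Top 10:", top10(est, se))
```

In this constructed example the selection term is exactly linear in the standard error, so the FAT–PET intercept recovers the true value while the simple average overstates it. With real data the relative performance of the three corrections depends on the form of selection, as the text notes.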
Table 1
WLS-MRA of publication selection. Dependent variable = $\hat{\eta}_i$.

| Variables | 1: WLS-MRA (1) | 2: Cluster-robust (1) | 3: WLS-MRA (3) | 4: Cluster-robust (3) | 5: WLS-MRA (4) | 6: Cluster-robust (4) |
|---|---|---|---|---|---|---|
| Intercept: $\hat{\beta}_0$ or $\hat{\gamma}_0$ (PET or PEESE) | 0.45 (6.99) | 0.45 (2.39) | 0.62 (18.1) | 0.62 (6.83) | 0.63 (13.3) | 0.63 (5.27) |
| $SE_i$: $\hat{\beta}_1$ (FAT) | 1.28 (3.61) | 1.28 (1.68) | – | – | −1.40 (−3.97) | −1.40 (−1.80) |
| $SE_i^2$: $\hat{\gamma}_2$ | – | – | 0.97 (2.76) | 0.97 (3.09) | – | – |
| $Elastic_i \cdot SE_i$: $\hat{\beta}_3$ | – | – | – | – | 2.83 (10.6) | 2.83 (3.93) |
| Average | 0.90 | 0.90 | 0.90 | 0.90 | 0.90 | 0.90 |
| Top 10 | 0.61 | 0.61 | 0.61 | 0.61 | 0.61 | 0.61 |
| Adjusted R2 | .11 | .11 | .06 | .06 | .58 | .58 |
| n | 101 | 101 | 101 | 101 | 101 | 101 |
Notes: Cells report coefficient estimates for Eq. (1) in columns 1 and 2, Eq. (3) in columns 3 and 4, and Eq. (4) in columns 5 and 6. All models are estimated using WLS. FAT is a test for publication selection bias. PET and PEESE provide estimates of the income elasticity of VSL corrected for selection bias. n is the number of observations. t-values are reported in parentheses. Columns 2, 4 and 6 report t-values in parentheses from cluster-robust standard errors, clustered at the study level.
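The tests against a unit elasticity quoted in Section 3.3 can be approximately recovered from Table 1, because each reported t-value is the coefficient divided by its standard error. The check below uses the rounded table entries, so it agrees with the text only up to rounding. The 1.96 critical value is a normal approximation assumed here, and the match with the cluster-robust column is an inference from the arithmetic rather than something the table notes state.

$$SE(\hat{\beta}_0) \approx \frac{0.45}{6.99} \approx 0.064, \qquad t_{H_0:\,\beta_0 = 1} \approx \frac{0.45 - 1}{0.064} \approx -8.5$$

$$SE(\hat{\gamma}_0) \approx \frac{0.62}{18.1} \approx 0.034, \qquad t_{H_0:\,\gamma_0 = 1} \approx \frac{0.62 - 1}{0.034} \approx -11.1$$

$$SE_{\text{cluster}}(\hat{\gamma}_0) \approx \frac{0.62}{6.83} \approx 0.091, \qquad 0.62 \pm 1.96 \times 0.091 \approx (0.44,\ 0.80)$$

The reported 95% confidence interval of 0.44–0.80 therefore appears to correspond to the cluster-robust standard error in column 4, while the t-values of −8.55 and −11.1 correspond to the WLS standard errors in columns 1 and 3.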
that model imply that VSL should be greater than 1.10,11 Others see elastic values to be necessary to extrapolate the VSL to low-income applications. Given such professional priors, it is possible that some of the reported elasticities are selected to be greater than one, or at least not statistically less than one. Therefore, it would be prudent to allow for differential publication selection to accommodate the selection of elastic values should this research contain such selection. The simplest way to allow for two types of reporting selection is to expand the publication bias term to:

$$\hat{\eta}_i = \beta_0 + \beta_1 SE_i + \beta_3\, Elastic_i \cdot SE_i + \mu_i, \tag{4}$$

where $Elastic_i$ is a dummy variable indicating whether an income elasticity is statistically significantly less than one ($Elastic_i = 0$) or not ($Elastic_i = 1$).12 The term $\beta_3\, Elastic_i \cdot SE_i$ in MRA model (4) allows for differential publication selection for elastic income elasticity, while $\beta_1 SE_i$ accounts for potential selection for statistically positive or negative income elasticity found among inelastic reported estimates.
Table 1, columns 5 and 6, give the MRA results for this differential publication bias model (4). These findings confirm the need to make this further accommodation of differential selective reporting. First, we easily reject $H_0: \beta_3 = 0$ (t = {10.6; 3.93}; p < .001). Second, the explanatory power increases greatly when differential publication selection is allowed, increasing the adjusted R2 from 11% to 58%. Allowing more or less publication (or reporting) selection in terms of whether a reported VSL elasticity is significantly less than one or not explains nearly one half of the observed variation among these 101 income elasticities. But then, a rise in explanatory power is not unexpected as $\beta_3\, Elastic_i \cdot SE_i$ is defined in a way that is related to large income elasticities. The negative estimated coefficient for $\hat{\beta}_1$ is likely an artifact of how we have reclassified the research as those with an income elasticity significantly less than one ($Elastic_i = 0$). Thus, the set of estimates that have $Elastic_i = 0$ are much less likely to be significantly positive. Our reclassification therefore creates its own selection bias in a negative direction for those estimates where $Elastic_i = 0$, because many significantly positive estimates will be selected out of this group. However, note that the sum of these two publication selection terms ($\hat{\beta}_1 + \hat{\beta}_3$) is quite close to what is reported in columns 1 and 2 of Table 1 for the overall publication selection effect. When there is no publication selection bias ($SE_i = 0$), this model of differential publication (or reporting) selection gives virtually an identical corrected estimate of VSL’s income elasticity as Top 10 and PEESE, 0.63.
Columns 2, 4, and 6 of Table 1 provide the first robustness check for our MRA results. Others will be discussed below. Because the typical meta-analysis of VSL reports multiple income elasticities (over 7, on average), there might be some dependence among estimates in the same study. This is especially true when there is a pattern of selective reporting as indicated by all of our tests. One acceptable way to accommodate potential dependence within studies is to calculate cluster-robust standard errors (see columns 2, 4 and 6 of Table 1). Although the t-values change in all cases, the overall implications about publication bias or the size of the corrected income elasticity do not. Another way to deal with potential dependence of having multiple estimates per study is to use fixed and random-effects unbalanced panel models, which we use in the multiple meta-regression models reported in Section 5, below.
Although these simple meta-regression results are very revealing, we must also be sure that they remain robust when other potential explanatory variables are considered. Perhaps what appears as publication selection bias is merely chance correlation with how VSL is measured or how the MRA model is specified? It is conventional practice in meta-regression analysis to embed these models of publication selection into larger explanatory MRAs (Stanley and Doucouliagos, 2012). Section 5 considers multivariate MRAs and further investigates the robustness of the above findings. First, however, we illustrate how meta-regression estimates of the income elasticity of VSL can be so different and how they might be selected for either statistical significance or, alternatively, to be elastic.

4. Estimating the income elasticity of the value of a statistical life

To illustrate the wide variation that one can obtain for VSL’s income elasticity, we use data from one of the 14 meta-analyses of VSL (Bellavance et al., 2009). Their meta-analysis contains 39 estimates of VSL from hedonic wage equations, which Doucouliagos et al. (2012) used to correct the VSL estimate for publication selection. It is important to note that obtaining a VSL income elasticity was not the objective of Doucouliagos et al. (2012). It was reported

10 Evans and Smith (2010) demonstrate that the theoretical links between these two variables are not as clear cut and that the level of the income elasticity of VSL could be below 1.0 under other reasonable assumptions.
11 If the reporting of estimates of relative risk aversion is affected by publication selection bias, then they will be artificially inflated. Hence, it is possible that some of the observed discrepancy between the income elasticity and the relative risk aversion coefficient could be explained by differences in selection bias. We see this as a fertile area of future research.
12 Elastic may also be defined to be equal to 1 when the reported estimate is greater than or equal to 1. This alternative definition of Elastic gives essentially the same MRA results.
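One way to read the differential selection model (4) is to write out the regression line it implies for each group of estimates. The lines below simply restate the point estimates from columns 5 and 6 of Table 1, with $\hat{\eta}_i$ denoting a reported income elasticity and $SE_i$ its standard error; they are not a new result.

$$Elastic_i = 0:\quad E(\hat{\eta}_i) \approx 0.63 - 1.40\, SE_i$$

$$Elastic_i = 1:\quad E(\hat{\eta}_i) \approx 0.63 + (-1.40 + 2.83)\, SE_i = 0.63 + 1.43\, SE_i$$

Both lines converge to the same corrected elasticity of 0.63 as $SE_i \to 0$. The combined slope of 1.43 for the second group is close to the single selection coefficient of 1.28 reported in columns 1 and 2.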
Table 2
Alternative MRA estimates of VSL’s income elasticity. Dependent variable = ln VSL.

| Variables | Column 1 | Column 2 | Column 3 | Column 4 | Column 5 |
|---|---|---|---|---|---|
| ln Income | −0.16 (−0.44) | −0.88 (−2.88) | 1.31 (5.32) | 0.62 (3.01) | 0.31 (1.81) |
| ln Year | 100.1 (3.14) | – | 152.4 (3.95) | – | – |
| ln SE-VSL | 0.82 (4.88) | 0.99 (5.63) | – | – | – |
| Compensation | −0.82 (−3.42) | −0.42 (−1.84) | −1.25 (−4.39) | −0.73 (−2.42) | – |
| Intercept | −754.9 (−3.10) | 10.6 (7.16) | −1156.1 (−3.92) | 8.35 (4.32) | 11.2 (6.85) |
| RESET† | 6.45 [p = .002] | 4.50 [p = .01] | 2.21 [p = .11] | 3.26 [p = .03] | 1.24 [p = .31] |

Notes: t-values are reported in parentheses.
† p-values are reported in brackets.
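In the log-log specifications of Table 2, the coefficient on ln Income is read directly as the income elasticity of VSL. The sketch below shows that reading on synthetic data generated with a known elasticity. It is a bare bivariate regression, comparable in form only to column 5; the helper `ols_line`, the income values, and the scale constant are all hypothetical, and none of the other moderators in Table 2 are modeled.

```python
import math


def ols_line(y, x):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


# Hypothetical study-level data generated with a true elasticity of 0.6
# and no noise, so the fit recovers it up to floating-point error.
incomes = [20_000.0, 30_000.0, 40_000.0, 55_000.0, 70_000.0]
vsls = [1_000.0 * inc ** 0.6 for inc in incomes]

intercept, elasticity = ols_line(
    [math.log(v) for v in vsls],
    [math.log(i) for i in incomes],
)
print("estimated income elasticity:", round(elasticity, 3))
```

With real meta-analytic data the estimate depends heavily on which moderators are included, which is the point the columns of Table 2 are meant to demonstrate.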
as merely an ancillary byproduct. The point to this qualification is variables. Therefore, how do we know which estimates to believe
that there was no intentional selection for either the size or sig- or which estimates come from correctly specified models of the
nificance of the income elasticity. No doubt, other meta-analysts value of a statistical life? Largely, we do not. Researchers should
also did not explicitly or knowingly select for positive or significant routinely use specification tests, but editors and reviewers do not
income elasticities. However, perhaps a few MRAs produced neg- require their usage; thus, they are rarely reported.14 Furthermore,
ative or insignificant income elasticities, and those meta-analysts used this as evidence that their MRA was misspecified. In any case, if a researcher chooses to do so, it would be easy to get virtually any value he wanted.

Table 2 explicitly models VSL to estimate VSL’s income elasticity. Because log-log models are the most straightforward and the most often used, we begin with the log-log version of the MRA model reported in Doucouliagos et al. (2012, Table 3 column 1). Indeed, 65% of the meta-analysis estimates of VSL’s income elasticity use a log-log MRA. Oddly, this MRA gives a negative, though insignificant, income elasticity when applied to the 39 estimates of VSL from Bellavance et al. (2009) (see column 1 Table 2). If taken seriously, this would imply that VSL does not depend on income and need not be adjusted when applied to low-income countries.

The next columns of Table 2 merely change the set of moderator variables included in the MRA to see how robust (or sensitive) the income elasticity estimate is to changes in exact model specification. Column 2 of Table 2 omits the time trend and yields a statistically significant negative income elasticity. If instead, the meta-analyst fails to allow for publication selection bias, which 97% of this literature fails to do, VSL’s income elasticity becomes significantly positive and larger than one (column 3 Table 2). If, on the other hand, the researcher omits both a time trend13 and publication selection with or without accounting for the presence of compensation insurance (Compensation), one obtains significantly positive but inelastic values (see columns 4 and 5 of Table 2). Without employing entirely unreasonable specifications, the researcher may easily get a significantly negative income elasticity, an elasticity that is not significantly different than zero, a significantly positive one that is not statistically less than one, or a significantly positive income elasticity that is clearly and statistically less than one. All of these widely dispersed estimates of VSL’s income sensitivity are compatible with the existing VSL research, any one of which might be obtained by an honest researcher.

Nor is this merely an academic exercise. The majority of reported meta-regression VSL income elasticity estimates throughout this literature omit all or most of these important explanatory

with many tests to choose from and their well-known low power, these tests are also likely to give a wide variety of results, easily confirming any a priori notion the researcher has about VSL. Nonetheless, we report the results from Ramsey’s RESET in Table 2. Recall that Ramsey’s RESET test is sensitive to omitted-variable, misspecification, and simultaneity biases, among others. Because of the low power of such tests, rejections of proper specification are to be believed more than a failure to find improper specification. Log-log models that use all of these variables or that do not contain a time trend are rejected as misspecified (p < .05). The exception is when all moderator variables are omitted from explaining VSL other than income (Column 5 Table 2), which is a common approach in this literature. However, we can be confident that this simple MRA model is misspecified because it omits several relevant explanatory variables: Compensation, time trend, and publication or reporting selection. In contrast, the model used by Doucouliagos et al. (2012, Table 3 column 1) that includes all of these variables when only income is a logarithm easily passes RESET (F(3,31) = 0.63; p > .05).

5. Modeling heterogeneity and bias among VSL’s income elasticity estimates

As clearly demonstrated in the above section, VSL’s income elasticity is highly sensitive to unavoidable modeling choices. Thus, like all other meta-analyses in economics, it is important to ensure that our findings about the overall magnitude of VSL’s income elasticity ($\eta$) are robust to potential heterogeneity and misspecification bias. Doucouliagos and Stanley (2009) and Stanley and Doucouliagos (2012) offer a very general MRA model that can accommodate any complexity of genuine heterogeneity, misspecification biases, selection biases, and study-effects, $S_s$:

$$\eta_{is} = \beta_0 + \sum_k \beta_k Z_{kis} + \beta_1 SE_{is} + \sum_j \delta_j SE_{is} K_{jis} + S_s + \varepsilon_{is}. \qquad (5)$$

where the Z-variables account for heterogeneity and filter out potential misspecification biases, and the K-variables allow for potential differential selection. In this application, we find that the K-variable, $Elastic_i \cdot SE_i$, from Eq. (4) is particularly important.

13 There is no theoretical reason for a time trend to exist in income elasticities. Theoretically, it is possible that over time VSL becomes more elastic, less elastic, or time invariant.
14 Viscusi and Aldy (2003) is an exception to this.
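As a rough illustration of the weighted-least-squares meta-regressions used throughout this section, the sketch below fits only the simplest special case of Eq. (5): an intercept plus the standard-error term, with weights 1/SE², and no Z-variables, K-variables, or study effects. The function name `wls_simple_mra` and the data are hypothetical and made up for illustration; this is not the authors' code.

```python
# Minimal sketch: estimate_i = b0 + b1 * SE_i + error_i, fitted by weighted
# least squares with weights 1 / SE_i**2. Data below are invented.

def wls_simple_mra(estimates, std_errors):
    """Return (b0, b1) from a WLS regression of estimates on their SEs."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    # Weighted means of the regressor (SE) and the response (estimate).
    mean_se = sum(w * se for w, se in zip(weights, std_errors)) / total_w
    mean_est = sum(w * e for w, e in zip(weights, estimates)) / total_w
    # Weighted cross-product and sum of squares give the slope.
    sxy = sum(w * (se - mean_se) * (e - mean_est)
              for w, se, e in zip(weights, std_errors, estimates))
    sxx = sum(w * (se - mean_se) ** 2 for w, se in zip(weights, std_errors))
    b1 = sxy / sxx
    b0 = mean_est - b1 * mean_se
    return b0, b1

# Hypothetical elasticities that rise with their standard errors.
ses = [0.05, 0.10, 0.20, 0.40]
ests = [0.3 + 1.5 * se for se in ses]
b0, b1 = wls_simple_mra(ests, ses)
print(round(b0, 3), round(b1, 3))
```

In this toy setup, b0 plays the role of the elasticity corrected for selection (the value as SE approaches zero) and b1 captures how strongly reported estimates co-move with their standard errors.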
Table 3
Moderator variables and inclusive MRA results.
Moderator variable | Definition | MRA coefficient (t-value)
Publication selection
SE | is the standard error of the reported estimated elasticity | 0.50 (1.14)
SE·Elastic | allows differential publication selection | 1.54 (5.83)
Model specification variables
Log Log | =1, if the MRA model is in a log–log form | 0.10 (0.678)
Log Lin | =1, if the MRA model is in a log-linear form | −0.44 (−1.53)
Linear | =1, if the MRA model is linear | −0.44 (−0.48)
Risk | =1, if the MRA model includes a measure of risk | −0.39 (−3.95)
Education | =1, if the MRA model includes an education variable | 0.22 (0.24)
Cluster robust | =1, if the MRA model reports cluster-robust standard errors | 0.08 (0.73)
Robust | =1, if the MRA model is estimated by a robust regression | −0.001 (−0.01)
Random | =1, if the MRA model is estimated by random-effects | −0.03 (−0.42)
OLS | =1, if the MRA model is estimated by OLS | 0.02 (0.20)
Data
Stated preference | =1, if all estimates come from stated-preferences studies | 1.25 (2.44)
Wage Risk | =1, if all estimates come from wage-risk studies | 0.15 (1.15)
Developed | =1, if all estimates come from developed countries | −0.03 (−0.92)
US | is the proportion of estimates that come from the US | 0.09 (0.23)
Europe | is the proportion of estimates that come from Western Europe | −6.58 (−2.69)
Canada | is the proportion of estimates that come from Canada | −1.45 (−1.89)
Australia/NZ | is the proportion of estimates that come from Australia or NZ | −6.52 (−1.63)
Japan | is the proportion of estimates that come from Japan | 7.08 (2.19)
Developing | is the proportion of estimates from developing countries | 1.03 (0.57)
Ave. year | is the average year of the data used | −0.008 (−0.20)
Intercept | | 16.64 (0.21)
However, we do not stop at a multivariate explanation of heterogeneity and selection. To allow for potential dependence of reported estimates in the same study we also estimate MRA model (5) with fixed and random study-specific effects in an unbalanced panel design. These effects are represented by $S_s$ in Eq. (5). Lastly, like our above MRAs, we accommodate heteroskedasticity by using weighted least squares.

Table 3 lists 21 moderator (or explanatory) variables and their estimated MRA coefficients when all are included in a single meta-regression model. The variables listed in Table 3 can be classified in three broad categories: publication selection, model specification or potential misspecification bias, and data.

Because there is substantial multicollinearity,15 we also report the general-to-specific (G-to-S) MRA results in Table 4. G-to-S begins by including all 21 moderator variables coded. The variable that has the largest p-value is removed, one at a time, until all remaining explanatory variables are statistically significant. Because of limited degrees of freedom and high multicollinearity, some simplification of MRA models is necessary. The general-to-specific approach is the least objectionable way to do so (Charemza and Deadman, 1997).

As we saw in Section 4, the income elasticity can be highly affected by the functional form of the MRA model used. For this literature, log-log models give significantly larger (more elastic) estimates (MRA coefficient = 0.20; t = 4.33; p < .001; see column 1 Table 4). Consistent with the simple MRA models of publication selection reported in Table 1, we again find differential publication selection bias (MRA coefficient Elastic·SE = 1.79; t = 6.84; p < .001). Other important effects are: whether the estimating MRA controls for risk levels (t = −5.52; p < .001), whether estimates are exclusively from stated-preference studies (t = 3.85; p < .05) or wage-risk studies (t = 2.54; p < .05), as well as regional effects (Japan and Europe).16

This multiple MRA model and its individual effects are robust (see Table 4). Because researchers typically report multiple estimates per study, it is necessary to control for the potential dependence of estimates within studies. Column 1 of Table 4 does so by computing cluster-robust standard errors, while columns 2 and 3 treat this research data as an unbalanced panel using either random study effects (RE-panel) or fixed study effects (FE-panel), respectively. Note how all effects are robust to different estimation methods, except that one effect becomes statistically insignificant when estimated by a fixed-effects panel. However, for our data, the Hausman test accepts the RE-panel MRA, column 2 (χ²(8) = 6.28; p > .05). In case a few ‘outlier’ estimates of VSL’s income elasticity might be driving our results, we also estimate this MRA model using robust regression and obtain virtually the same findings in column 4 of Table 4.

Of special note is the fact that differential publication selection (Elastic·SE), controlling for risk (Risk), and Stated Preference have robust and statistically significant effects even when all moderator variables are included (Table 3). Furthermore, when these three effects are accommodated, all potential systematic variation is explained (χ²(97) = 88.9; p > .05), and this is also true for the MRA model reported in Table 4 (χ²(93) = 71.1; p > .05).17 Lastly, our multiple MRA easily passes RESET (F(9,84) = 1.35; p > .05). Thus, we can have confidence that our meta-regression model provides an acceptable explanation of the systematic pattern found among reported income elasticities of the value of a statistical life.

15 Six variables have variance-inflation factors (VIF) larger than 100.
16 Although these regional coefficients would imply that VSL is income elastic in Japan and has a negative income elasticity in western Europe, such inferences cannot be regarded as reliable because they are based on extrapolations well beyond our MRA sample. In particular, the largest proportion of estimates that come from western Europe is 0.286 for any of our MRA VSL income elasticities with mean 0.053. For Japan, its largest proportion is 0.385 with mean 0.074.
17 This test is applied to the t-value WLS-MRA, Eq. (2), but one with the added moderator variables. When we have only sampling error and no excess heterogeneity, we would expect the t-values to have a variance of one (Higgins and Thompson, 2002; Stanley and Doucouliagos, 2012). This gives a chi-square test for whether or not the variance is one after the systematic variation from the moderator variables is factored out.
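The general-to-specific (G-to-S) procedure described in this section can be sketched as a short backward-elimination loop. This is only an outline under stated assumptions: `general_to_specific` and `fit_pvalues` are hypothetical names, the caller must supply a function that actually fits the meta-regression and returns one p-value per variable, and the p-values in the toy stand-in are invented. In a real application the p-values change after every refit.

```python
from typing import Callable, Dict, List

def general_to_specific(
    variables: List[str],
    fit_pvalues: Callable[[List[str]], Dict[str, float]],
    alpha: float = 0.05,
) -> List[str]:
    """Drop the variable with the largest p-value, refit, and repeat
    until every remaining variable is significant at level alpha."""
    kept = list(variables)
    while kept:
        pvals = fit_pvalues(kept)          # refit on the current variable set
        worst = max(kept, key=lambda name: pvals[name])
        if pvals[worst] <= alpha:
            break                          # all remaining variables significant
        kept.remove(worst)                 # remove one variable at a time
    return kept

# Toy stand-in for a real regression fit: fixed, made-up p-values.
FAKE_P = {"Elastic*SE": 0.001, "Risk": 0.001, "Education": 0.81, "OLS": 0.84}

def fake_fit(names: List[str]) -> Dict[str, float]:
    return {name: FAKE_P[name] for name in names}

selected = general_to_specific(list(FAKE_P), fake_fit)
print(selected)
```

Passing the fitting routine in as a callable keeps the elimination logic separate from the choice of estimator (cluster-robust WLS, panel, or robust regression).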
Table 4
Multiple WLS-MRA.
Moderator variables | 1: Cluster-robust | 2: RE-panel | 3: FE-panel | 4: Robust regression | 5: Cluster-robust with SE²
Elastic·SE | 1.79 (6.84)a | 1.69 (6.40)a | 1.57 (5.54)a | 1.78 (7.02)a | –
SE² | – | – | – | – | 1.21 (4.94)a
Log Log | 0.20 (4.33) | 0.19 (2.24) | 0.11 (0.84) | 0.17 (3.16) | 0.15 (1.26)
Risk | −0.37 (−5.52) | −0.43 (−6.11) | −0.47 (−6.16) | −0.40 (−5.72) | −0.46 (−3.88)
Stated Preference | 0.39 (3.85) | 0.44 (2.45) | 0.59 (2.00) | 0.44 (3.69) | 0.64 (3.77)
Wage Risk | 0.24 (2.54) | 0.30 (2.38) | 0.46 (2.05) | 0.26 (2.96) | 0.12 (0.73)
Japan | 0.84 (3.37) | 0.74 (2.71) | 0.80 (2.44) | 0.79 (2.90) | 1.00 (2.64)
Europe | −1.34 (−2.87) | −1.43 (−2.78) | −1.58 (−2.60) | −1.45 (−2.89) | −2.72 (−4.10)
Intercept | 0.40 (3.84) | 0.49 (3.31) | 0.52 (2.01) | 0.48 (4.85) | 0.66 (3.89)
n | 101 | 101 | 101 | 101 | 101
Adj. R² | 0.78 | – | – | – | 0.56
a t-values are reported in parentheses.
The dependent variable is the estimated income elasticity of VSL. n is the number of observations.
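The Log Log row of Table 4 allows a quick arithmetic check of how the predicted elasticity moves when LogLog is switched from zero to its sample mean. The sketch below uses only figures reported in this paper, plus one labeled assumption: the sample mean of LogLog is taken to be 0.65, read off the statement in Section 4 that 65% of the elasticity estimates come from log-log MRAs. The names in the code are illustrative.

```python
# Reported values taken from the text and Table 4.
LOGLOG_COEF = {"column 1": 0.20, "column 5": 0.15}  # Log Log row of Table 4
BASELINE = {"column 1": 0.25, "column 5": 0.37}     # predictions with LogLog = 0

# Assumption: sample mean of LogLog = 0.65 (65% of estimates use log-log MRAs).
LOGLOG_MEAN = 0.65

def with_loglog_at_mean(column):
    """Shift the baseline prediction by coefficient * sample mean of LogLog."""
    return BASELINE[column] + LOGLOG_COEF[column] * LOGLOG_MEAN

for column in BASELINE:
    print(column, round(with_loglog_at_mean(column), 2))
# column 1 0.38
# column 5 0.47
```

Under that assumption, the shifted values agree with the 0.38 and 0.47 reported in this section for the case where LogLog is set to its sample mean.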
As an additional robustness check, we replace the potentially endogenous Elastic·SE variable that we constructed from elasticity estimates not significantly less than one, with the exogenous SE², used by the PEESE-MRA, Eq. (3), in column 5 Table 4. Overall, the findings remain largely the same; however, meta-analyses that use only wage risk estimates, WageRisk, or a double log specification are no longer significantly different from meta-analyses that use a mix of estimates or other specifications.

But what does this MRA model say about the magnitude of VSL’s income elasticity? What values should be substituted into the MRA reported in Table 4? With a multiple MRA, one must make some allowance for the appropriate values of the moderator variables. Although reasonable researchers might have some differences of judgment regarding the appropriate values of this research’s explanatory variables, we think that most of these choices are uncontroversial, and we explore plausible variation in professional judgment below.

First, the differential publication term, Elastic·SE, must be driven to zero, because it represents a bias. Next, Risk should be set to one, because researchers should account for the risk level. Our meta-analysis clearly shows that risk level affects income elasticity estimates, and the failure to control for risk level, therefore, would seem to cause omitted-variable bias. From our exploration of the Bellavance et al. (2009) wage-risk data, RESET provides clear evidence that the log-log MRAs are misspecified (recall Section 4), implying that LogLog should be set to zero. This leaves only the values of Stated Preference, WageRisk, and the regional variables in question. For those, one might be purposefully ambivalent and substitute in their sample average values. Substituting these values into the MRA model reported in column 1 Table 4 gives an overall VSL income elasticity of 0.25 (CI: 0.17; 0.34) and is close to the value (0.20) reported by Doucouliagos et al. (2012, p. 202). If you believe that there is nothing wrong with using the log-log specification, we could instead substitute the sample mean in for LogLog, and VSL’s income elasticity increases to 0.38 (CI: 0.30; 0.47). On the other hand, if we use the multiple MRA that substitutes SE² for Elastic·SE, column 5 Table 4, the implied income elasticity is nearly the same, 0.37 (CI: 0.25; 0.49), or 0.47 if LogLog is set to its sample mean. To attempt to make these predictions as large as possible, one might consider only U.S. stated preference studies, 0.42 (CI: 0.22; 0.62).18 Even here, we obtain an inelastic estimate, but one that is consistent with both our simple publication-bias corrected estimates and the values preferred by Viscusi and Aldy (2003). To justify an income elasticity that is one or larger, the meta-analyst would need to purposefully choose a likely biased log-log estimate, consider only stated preference studies that contain mostly Japanese surveys and none from Europe. Defensible applications of our multiple MRA findings imply that the overall income elasticity is much less than one, and even less than what the simple MRA models of publication selection (or small-sample bias) suggest (recall Table 1). Thus, we have robust evidence that income elasticity of the value of a statistical life is less than one.

6. Conclusions

Meta-analysis has become a common method from which to derive VSL estimates. Recent evidence suggests that it is important to correct meta-analysis for publication selection bias. This results in significantly lower estimates of VSL. Another important parameter estimate is the income elasticity of VSL. Here too, meta-analyses have been widely used to derive estimates of this important policy parameter. Again, however, it is important to correct the evidence base for possible publication selection, small-sample, or other potential sources of bias. We show that correcting for potential, perhaps inadvertent, selection bias yields an income elasticity of VSL that is significantly less than 1, practically and statistically. Although all of our preferred estimates of the income elasticity fall between 0.25 and 0.63, most are in the upper range (0.5; 0.63). The Top 10 estimate of 0.61 and the PEESE estimate of 0.62 were remarkably similar. In any case, our findings are consistent with the estimates from Viscusi and Aldy (2003) that have been used by at least some U.S. agencies. Viscusi and Aldy (2003) derived their estimate by constructing a range of income elasticities using specifications from prior meta-analyses as well as their own specifications. We adopt a different methodology and yet find a similar magnitude, though wider range, of VSL’s income elasticity. A series of studies has questioned the prior findings of income inelastic VSL. However, a meta-meta-analysis of 101 estimates from 14 prior meta-studies of VSL’s income elasticity in developed countries provides robust results that VSL is indeed income inelastic.

18 Relying on stated preference studies exclusively, however, is problematic. VSL estimates from stated preference studies tend to be lower than those from revealed preference studies (such as those from hedonic wage regressions). Viscusi (2012, p. 3) notes that this arises in part because stated preference studies “address the hypothetical risks that people may not view as a real threat” whereas hedonic wage regressions report estimates based on actual observed behavior.

Appendix 1. Studies included in the meta-meta-analysis, chronological order
Authors | Location in the article | Income elasticity | Type of study | Number of studies
Liu et al. (1997) | p. 357 | 0.53 | Wage | 17
Day (1999) | p. 20 | 0.36–3.56 | Wage | 22
Miller (2000) | Tables 3 and 4 | 0.85–1.00 | Various | 13–68
Bowland and Beghin (2001) | p. 392 | 1.52–2.27 | Wage | 33
Leggett et al. (2001) | p. 108 | 0.86–1.32 | Various | 60
Dionne and Michaud (2002) | Table 2 | 1.08–1.72 | Wage | 38
Mrozek and Taylor (2002) | Table 3 | 0.31–0.50 | Wage | 33
de Blaeij et al. (2003) | Table 4 | −0.19–3.98 | Various | 30
Viscusi and Aldy (2003) | Tables 7 and 8 | 0.46–0.61 | Wage | 41–49
Kluve and Schaffner (2008) | Table 4 | −0.26 | Various | 37
Bellavance et al. (2009) | Tables 4, 5 and appendix 3 and 4 | 0.40–0.75 | Wage | 29–32
Lindhjem et al. (2010) | p. 33 and Table 4 | 0.10–1.26 | SP | 35
Lindhjem et al. (2011) | Tables 4–8 | 0.25–1.31 | SP | 7–80
Doucouliagos et al. (2012) | p. 202 | 0.20–0.38 | Wage | 32
All studies | | −0.26–3.98 | |
Notes: Wage = wage risk study; SP = stated preference study.
References

Bellavance, F., Dionne, G., Lebeau, M., 2009. The value of a statistical life: a meta-analysis with a mixed effects regression model. Journal of Health Economics 28, 444–464.
Card, D.E., Krueger, A.B., 1995. Time-series minimum-wage studies: a meta-analysis. American Economic Review 85, 238–243.
Charemza, W.W., Deadman, D.F., 1997. New Directions in Econometric Practice, 2nd ed. Russell Edward Elgar, Cheltenham.
Doucouliagos, H., Stanley, T.D., 2009. Publication selection bias in minimum-wage research? A meta-regression analysis. British Journal of Industrial Relations 47, 406–428.
Doucouliagos, H., Stanley, T.D., Giles, M., 2012. Are estimates of the value of a statistical life exaggerated? Journal of Health Economics 31, 197–206.
Egger, M., Smith, G.D., Schneider, M., Minder, C., 1997. Bias in meta-analysis detected by a simple, graphical test. British Medical Journal 315, 629–634.
Evans, M.F., Smith, V.K., 2010. Measuring how risk tradeoffs adjust with income. Journal of Risk and Uncertainty 40, 33–55.
Evans, M.F., Schaur, G., 2010. A quantile estimation approach to identify income and age variation in the value of a statistical life. Journal of Environmental Economics and Management 59, 260–270.
Hammitt, J.K., Robinson, L.A., 2011. The income elasticity of the value per statistical life: transferring estimates between high and low income populations. Journal of Benefit-Cost Analysis 2 (1), Article 1, http://www.bepress.com/jbca/vol2/iss1/1 (accessed 18.09.13).
Higgins, J.P.T., Thompson, S.G., 2002. Quantifying heterogeneity in a meta-analysis. Statistics in Medicine 21, 1539–1558.
Hwang, H.-S., Reed, W.R., Hubbard, C., 1992. Compensating wage differentials and unobserved productivity. Journal of Political Economy 100, 835–858.
Kaplow, L., 2005. The value of a statistical life and the coefficient of relative risk aversion. Journal of Risk and Uncertainty 31, 23–34.
Kluve, J., Schaffner, S., 2008. The value of life in Europe: a meta-analysis. Sozialer Fortschritt 10, 279–287.
Kniesner, T.J., Viscusi, W.K., Ziliak, J.P., 2010. Policy relevant heterogeneity in the value of statistical life: new evidence from panel data quantile regressions. Journal of Risk and Uncertainty 40, 15–31.
Stanley, T.D., 2001. Wheat from chaff: meta-analysis as quantitative literature review. Journal of Economic Perspectives 15, 131–150.
Stanley, T.D., 2005. Beyond publication bias. Journal of Economic Surveys 19, 309–345.
Stanley, T.D., 2008. Meta-regression methods for detecting and estimating empirical effects in the presence of publication bias. Oxford Bulletin of Economics and Statistics 70, 103–127.
Stanley, T.D., Doucouliagos, C.(H)., 2012. Meta-Regression Analysis in Economics and Business. Routledge, Oxford.
Stanley, T.D., Doucouliagos, C.(H)., 2013. Meta-regression approximations to reduce publication selection bias. Research Synthesis Methods, http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1759-2887/earlyview (accessed 18.09.13).
Stanley, T.D., Jarrell, S.B., Doucouliagos, C.(H)., 2010. Could it be better to discard 90% of the data? A statistical paradox. American Statistician 64, 70–77.
Stanley, T.D., Doucouliagos, H., Giles, M., Heckemeyer, J.H., Johnston, R.J., Laroche, P., et al., 2013. Meta-analysis of economics research reporting guidelines. Journal of Economic Surveys 27 (2), 390–394.
U.S. Dept. of Transportation, 2011. Office of the Assistant Secretary for Transportation Policy, Memorandum: Treatment of the Economic Value of Statistical Life in Departmental Analyses – 2011 Interim Adjustment, http://www.dot.gov/sites/dot.dev/files/docs/Value of Life Guidance 2011 Update 07-29-2011.pdf (accessed 18.09.13).
U.S. Dept. of Transportation, 2013. Office of the Secretary of Transportation, Memorandum to Secretarial Officers Modal Administrators from Polly Trottenberg, Under Secretary for Policy, and Robert S. Rivkin, General Counsel, Guidance on Treatment of the Economic Value of a Statistical Life in U.S. Department of Transportation Analyses, http://www.dot.gov/office-policy/transportation-policy/guidance-treatment-economic-value-statistical-life (accessed 10.09.13).
United States Environmental Protection Agency (EPA), 1997. The Benefits and Costs of the Clean Air Act, 1970 to 1990, EPA 410-R-97-002.
United States Environmental Protection Agency, 2010. Valuing Mortality Risk Reductions for Environmental Policy: A White Paper. Science Advisory Board – Environmental Economics Review Draft.
Viscusi, W.K., 2009. The devaluation of life. Regulation and Governance 3, 103–127.
Viscusi, W.K., 2010. The heterogeneity of the value of statistical life: introduction and overview. Journal of Risk and Uncertainty 40, 1–13.
Viscusi, W.K., 2011. Policy challenges of the heterogeneity of the value of statistical life. Foundations and Trends in Microeconomics 6 (2), 99–172.
Viscusi, W.K., 2012. What’s to know? Puzzles in the literature on the value of statistical life. Journal of Economic Surveys 26, 763–768.
Viscusi, W.K., Aldy, J.E., 2003. The value of a statistical life: a critical review of market estimates throughout the world. Journal of Risk and Uncertainty 27, 5–76.

Meta-Analysis References

Bellavance, F., Dionne, G., Lebeau, M., 2009. The value of a statistical life: a meta-analysis with a mixed effects regression model. Journal of Health Economics 28, 444–464.
Bowland, B.J., Beghin, J.C., 2001. Robust estimates of value of a statistical life for developing economies. Journal of Policy Modeling 23, 385–396.
Day, B., 1999. Meta-analysis of wage-risk estimates of the value of statistical life. Unpublished Final Report to the DG-XII, European Commission, Contract ENV4-CT96-0227. University College London.
de Blaeij, A., Florax, R.J.G.M., Rietveld, P., Verhoef, E., 2003. The value of statistical life in road safety: a meta-analysis. Accident Analysis and Prevention 35, 973–986.
Dionne, G., Michaud, P.C., 2002. Statistical analysis of value-of-life estimates using hedonic wage method. Working Paper 02-01. École des Hautes Études Commerciales, HEC, Montréal.
Doucouliagos, H., Stanley, T.D., Giles, M., 2012. Are estimates of the value of a statistical life exaggerated? Journal of Health Economics 31, 197–206.
Kluve, J., Schaffner, S., 2008. The value of life in Europe: a meta-analysis. Sozialer Fortschritt 10-11, 279–287.
Leggett, C.G., Neumann, J.E., Penumalli, P.R., 2001. Willingness to pay for reductions in fatal risk: a meta-analysis of the value of statistical life literature. In: Economic Valuation of Mortality Risk Reduction: Assessing the State of the Art for Policy Applications. US Environmental Protection Agency, National Center for Environmental Economics and National Center for Environmental Research.
Lindhjem, H., Navrud, S., Braathen, N.A., 2010. Valuing Lives saved from environmental, transport and health policies: a meta-analysis of stated preference studies. Working Party on National Environmental Policies, OECD.
Lindhjem, H., Navrud, S., Braathen, N.A., Biausque, V., 2011. Valuing mortality risk reductions from environmental, transport, and health policies: a global meta-analysis of stated preference studies. Risk Analysis 31 (9), 1381–1407.
Liu, J.T., Hammitt, J.K., Liu, J.L., 1997. Estimated hedonic wage function and value of life in a developing country. Economics Letters 57 (3), 353–358.
Miller, T.R., 2000. Variations between countries in values of statistical life. Journal of Transport Economics and Policy 34 (2), 169–188.
Mrozek, J.R., Taylor, L.O., 2002. What determines the value of life? A meta-analysis. Journal of Policy Analysis and Management 21 (2), 253–270.
Viscusi, W.K., Aldy, J.E., 2003. The value of a statistical life: a critical review of market estimates throughout the world. Journal of Risk and Uncertainty 27 (1), 5–76.

In utero exposure to the Korean War and its long-term effects on socioeconomic and health outcomes☆

Chulhee Lee ∗,1
Department of Economics, Seoul National University, Seoul, South Korea

Article history:
Received 8 October 2012
Received in revised form 24 October 2013
Accepted 8 November 2013

JEL classification:
I10
J24
N35

Keywords:
Fetal origins hypothesis
Maternal stress
Childhood health
Korean War
Socioeconomic outcome

Prenatal exposure to the disruptions caused by the Korean War (1950–1953) negatively affected the individual socioeconomic and health outcomes at older ages. The educational attainment, labor market performance, and other socioeconomic outcomes of the subjects of the 1951 birth cohort, who were in utero during the worst time of the war, were significantly lower in 1990 and in 2000. The results of difference-in-difference estimations suggest that the magnitude of the negative cohort effect is significantly larger for individuals who were more seriously traumatized by the war. Whereas the 1950 male birth cohort exhibited significantly higher disability and mortality rates at older age, the health outcomes of females are unaffected by the war. Different aspects of human capital (e.g., health and cognitive skills) were impaired by in utero exposure to the war, depending on the stage of pregnancy when the negative shocks were experienced.
© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Research across various disciplines suggests that in utero exposure to negative health shocks has strong and persistent effects on health and socioeconomic outcomes at older ages. This argument is widely known as the fetal origins hypothesis, which was developed and popularized by David J. Barker and his colleagues in the 1990s (Barker, 1992, 1994). Since then, a voluminous body of literature has been accumulated, providing a variety of evidence in favor of this thesis (Currie and Hyson, 1999; Chay and Greenstone, 2003; Behrman and Rosenzweig, 2004; Black et al., 2007; Currie and Moretti, 2007). In particular, an increasing number of studies offer semi-experimental evidence on the long-run consequences of exogenously generated shocks to fetal health. Such traumatic events include the 1918 Pandemic Influenza (Almond, 2006; Almond and Mazumder, 2005), the Dutch Famine (Neuzgebauer et al., 1999; Roseboom et al., 2001; Bleker et al., 2005), the Chinese Famine (St. Clair et al., 2005; Luo et al., 2006; Meng and Qian, 2009; Chen and Zhou, 2007; Almond et al., 2010), and the Chernobyl disaster (Almond et al., 2009).

The Korean War (1950–1953) offers a unique opportunity to examine the long-term effects of war-related disruptions such as arduous refugee experiences, suffering under the North Korean occupation, hunger, and direct exposure to combat. Although the war lasted for more than three years, the major war damage sustained by civilians was concentrated in the first nine months following the sudden invasion by North Korea (late June 1950 to late March 1951). At that time, the frontline rapidly and unexpectedly moved back and forth across South Korea (Halberstam, 2007; Chung, 2010; Yang, 2010). Furthermore, the severity of wartime experience considerably differs depending on the place of residence. For example, Central Region residents were hit particularly hard because they lived closer to North Korea and their area was invaded twice by enemy forces. These specific circumstances help identify the effects of the war on maternal and fetal health through a comparison of the adult outcomes across birth cohorts and places of birth.

☆ This study benefited from the criticisms and suggestions of Douglas Almond, Dora Costa, Young Jun Cho, Joe Ferrie, Daeil Kim, Tae Woo Kim, Eun Kyung Lee, Seung Kyu Shim, and the participants of the Korean Population Studies Association Annual Conference, Korean Economic Association Conference, ASSA Meetings, NBER Cohort Studies Meeting, and seminars at University of Chicago, Institut National Etudes Demographiques, Seoul National University, Seogang University, University of Tokyo, and UCLA. The author would also like to thank Adriana Lleras-Muney and two anonymous referees for their very helpful and detailed comments, and Eunmi Ko and Young Long Kim for their excellent research assistance. The author takes the responsibility for any remaining errors.
∗ Tel.: +82 2 880 6396.
E-mail address: chullee@snu.ac.kr
C. Lee / Journal of Health Economics 33 (2014) 76–93 77
Drawing from these features of the Korean War, this article explores how in utero exposure to war-related disruptions affects adult health and economic outcomes in South Korea. For this purpose, we examine how various measures of socioeconomic performance and health at older ages differ depending on the timing and the place of birth. Specifically, we investigate whether the individuals born in 1950 and 1951 (those who spent time in utero during the first nine months of the war) show discontinuous cohort effects, and whether the cohort effects are more distinct for those born in areas hit harder by the war. Several micro datasets, including micro samples of the 1990, 2000, and 2005 censuses, and Vital Statistics for death from 1991 to 2009, are used to construct variables on adult outcomes.

This study is one of the few attempts to understand the long-term socioeconomic consequences of disruptions directly caused by combat activities. Previous studies of this kind largely focused on the effects of famine, disease, and pollution. The present paper draws on evidence from South Korea that has not been investigated in the study of the long-term consequences of in utero circumstances. The relationship between early-life conditions and later socioeconomic outcomes can differ across periods and countries, depending on the extent of economic development, institutional features, and cultural characteristics. Thus, evidence from the Korean War in the current study can widen the scope of the literature.

The evidence in this paper strongly suggests that prenatal exposure to the disruptions caused by the Korean War negatively affected individual socioeconomic and health outcomes at older ages. Measures of the educational attainment and labor market performance of individuals born in 1951, who were in utero during the worst time of the war, were significantly lower in 1990 and 2000. The results of difference-in-difference estimations suggest that the magnitude of the negative 1951 cohort effect is significantly larger for individuals whose places of birth were more seriously devastated by the war. As for health outcomes, the 1950 male birth cohort is more likely to have functional limitations and to die at older age than predicted by long-term trends in health variables. If potential selections in pregnancy, birth, and survival are considered, the actual negative effects of the war may be even greater than what this study suggests.

2. Long-term consequences of in utero influences

Research across a range of disciplines establishes that early-life health and circumstances play an important role in determining health and economic conditions at older ages.1 A series of studies by physicians and epidemiologists links many of the degenerative conditions of old age to exposure to infectious disease, malnutrition, and other types of biomedical and socioeconomic stress in utero and during the early years of life. Recent research by economists suggests that early-life circumstances have strong and persistent influences on human capital accumulation and labor market performance. These effects are also possibly mediated by the deterioration in health and cognitive ability.

A group of researchers paid particular attention to the long-term consequences of in utero influences. In the 1990s, David J. Barker and his colleagues developed and popularized the argument widely known as the fetal origins hypothesis. The hypothesis argues that disruptions to the prenatal environment are related to various chronic health outcomes at older ages, including coronary heart disease and diabetes (Barker, 1992, 1994). The hypothesis further emphasizes initial health endowment (formed in utero), rather than postnatal conditions, as a health determinant in early ages.

A wide variety of evidence pertaining to the fetal origins hypothesis has been presented over the last two decades. Experimental studies using animals provide evidence that maternal malnutrition has a causal effect on the subsequent health of the offspring. A voluminous body of literature evaluates the health and socioeconomic consequences of low birth weight (LBW), a proxy of exposure to malnutrition, infection, or toxic substances while in utero. Most of these studies reveal significant negative effects of LBW on human capital accumulation, socioeconomic status, and health outcomes (Currie and Hyson, 1999; Behrman and Rosenzweig, 2004; Black et al., 2007; Currie and Moretti, 2007).2 A growing number of biomedical studies suggest that maternal stress during pregnancy may increase the likelihood of preterm birth, developmental delays, and behavioral abnormalities (Weinstock, 2001; Aizer et al., 2009).

A possible concern that confronts studies on size at birth and subsequent outcomes is that the positive correlation between measures of early life and adult health can be biased by omitted variables such as genetic factors or post-birth investments.3
The effects of prenatal exposure to the Korean War differ by gen- A new line of research addresses this potential problem by exploit-
der and birth cohort. The health outcomes of females are unaffected ing unique opportunities offered by natural experiments, in which
by the war. However, the health of males who were in utero dur- individuals of a particular background or cohort are randomly
ing the early stages of the war is significantly worse than predicted exposed to a type of disruption in utero. If the probability of
by the smooth cohort trends. This gender difference can be partly experiencing negative health shocks in utero is uncorrelated with
attributed to the stronger population selection among females born unobservable determinants of health, then the estimated effects of
in 1951. It appears that the Korean War influenced the health of the shock are not subject to omitted variable bias.
wives through adversely affecting the economic and health status Influenza pandemics were used as a natural experiment to test
of the husbands. The women married to the men born in 1950 were the fetal origins hypothesis.4 Historical famines were studied as
more likely to have a disability in 2005 than predicted by smooth
cohort trends, which is explained in part by the higher disability
rates of their husbands.
1 See Almond and Currie (2010, 2011) for comprehensive surveys of the literature.
The aspects of human capital that were significantly impaired 2
For example, Black et al. (2007) analyzed Norwegian twins and found that LBW
by wartime in utero influences differ between the 1950 and 1951 has long-run adverse effects on adult height, IQ, earnings, and education. Currie
birth cohorts. The 1950 cohort exhibits worse health outcomes, and Moretti (2007) stated that LBW among individuals born in California has mod-
while the major consequences of the war for the 1951 cohort are est but statistically significant negative effects on educational attainment and the
probability of living in a wealthy neighborhood.
lower educational attainment and labor market performance. This 3
Almond et al. (2005) compared the hospital costs, health at birth, and infant
difference by year of birth is attributed to the exposure of the two mortality rates between heavier and lighter infants of twins born in the United States
cohorts to negative health shocks at different stages of pregnancy. to reduce potential bias arising from omitted variables such as genetic factors. They
The subjects of the 1951 cohort were in utero during the first half observed a minimal effect of LBW on infant mortality and other health measures
of pregnancy, a critical period for human brain development. In from a sample of twins.
4
Almond (2006) found that the cohorts in utero during the pandemic dis-
contrast, the majority of the subjects of the 1950 cohort became played reduced educational attainment, increased rates of physical disability, lower
exposed to war-caused shocks after prenatal brain development income, lower socioeconomic status, and higher transfer payments compared with
was completed. other birth cohorts. Almond and Mazumder (2005) discovered that cohorts in utero
78 C. Lee / Journal of Health Economics 33 (2014) 76–93
natural experiments as well.5 The effects of prenatal exposure to pollution were also examined via natural experiments.6 The influences of maternal stress during pregnancy on offspring outcomes were investigated, exploiting the events where pregnant women were exposed to an exogenously driven shock.7
Second, how the effects of early-life conditions differ across periods and nations remains unclear. The strength of the effect of an adverse shock to health in utero depends on the patterns of postnatal investments in health and human capital accumulation. If postnatal interventions on the nutrition, medical care,
Unresolved issues continue to exist in spite of the growing and education of a child are effectively implemented, the pre-
body of evidence on the link between early-life circumstances natal damages can be rectified to a particular extent. Thus, the
and later economic and health outcomes. These issues motivate relationship between early-life circumstances and later socioeco-
the present research. First, how prenatal exposure to military nomic outcomes is likely to differ across periods and countries. The
conflict affects adult outcomes is considerably less known. Well- differences depend on income, quality of medical care, and educa-
documented studies report that measures of wartime stress such as tional system, as well as political and social structures. In addition,
exposure to combat and imprisonment in POW camp have persis- cultural and social norms matter in determining the patterns of
tent negative effects on the health and socioeconomic outcomes of investments in health and human capital during childhood.9 Except
veterans (Nefzger, 1970; Beebe, 1975; Sutker et al., 1991; Lee, 2005; for studies on China, evidence for this issue is drawn largely from
Costa and Kahn, 2008; Robson et al., 2009). A long-term impact of the US and from Europe. Thus, by focusing on Korea, this study can
war was found for civilians, too.8 However, these studies do not widen the scope of the literature.
provide clear evidence pertaining to the effects of prenatal exposure to war. Although a number of biomedical studies examined whether maternal stress from exposure to combat activities and from physical or emotional hardship can adversely affect

3. The Korean War (1950–1953)
fetal health (Geber, 1970; Jones and Tauscher, 1978; Huttunen and The Korean War began with the surprise invasion of South
Niskanen, 1978; Knipschild et al., 1981; Os and Selten, 1998; Selten Korea by North Korea on 25 June, 1950. The war lasted three
et al., 2003; Meijer, 2007; Rofe and Goldberg, 2011), they were years and ended when an armistice agreement was signed on 27
largely concerned with the short-term influences on birth or child- July, 1953. Vastly unprepared in terms of equipment and training,
hood outcomes, with a special focus on mental disorders such as the South Korean troops were repeatedly defeated and forced to
schizophrenia. Thus, this study can add to our understanding of retreat southward. The South Korean capital, Seoul, was captured
how the long-term effects of in utero circumstances differ based on by North Korean forces in only three days. By late August 1950 (only
the type of negative health shock. two months after the war broke out), North Korea had occupied
about 90% of the entire Korean Peninsula (Fig. 1). The remaining
uninvaded region was a small area around Busan City, located in
Southeast Korea.
during the pandemic exhibited impaired health outcomes relative to those born a
In the resulting battle at the Busan perimeter from August to
few months earlier or later 65–80 years after the event. Kelly (2009) found that peo- mid-September 1950, the South Korean and US forces managed to
ple who experienced prenatal exposure to the 1957 Asian flu in Britain exhibited stave off fierce North Korean attacks meant to capture the entire
diminished test scores. peninsula. On 15 September, the UN troops countered with a suc-
5 At middle age, a cohort in utero during the 1944–1945 Dutch Famine exhibited
cessful landing at Inchon and forced the North Korean Army in the
a broad spectrum of health problems such as poor self-reported health (Roseboom
et al., 2001), coronary heart disease morbidity (Roseboom et al., 2001; Bleker et al., south to retreat northward. The UN troops liberated Seoul on 27
2005), and adult antisocial personality (Neugebauer et al., 1999). Research on the September and continued northward to reach the Amnok (Yalu)
Chinese Famine (1959–1961) reported negative health and socioeconomic conse- River, which divides North Korea and China.
quences, including heightened risk of schizophrenia (St. Clair et al., 2005), obesity
In late November 1950, Chinese communist forces ambushed
among women (Luo et al., 2006), height reductions (Chen and Zhou, 2007), and
reduction in working hours of employees (Meng and Qian, 2009). Almond et al.
the UN troops, forcing the troops to fall back. Seoul was again lost
(2010) reported that higher famine intensity is associated with a greater risk of to the enemy on 4 January 1951. Refugees from both the south and
illiteracy, dropping out of the labor force, marrying late (for men), and marrying the north left their hometowns, following the retreating soldiers.
partners with less education (for women). Neelsen and Stratman (2011) claimed Regrouped and reinforced, the UN forces stabilized the battlefront
that exposure to the famine in Greece (1941–1942) in infancy lowered educational
and fought back. On 14 March 1951, the UN troops drove the North
attainment.
6
Chay and Greenstone (2003) used variations in air pollution across counties that Korean and Chinese armies out of Seoul. After exchanging offensive
were exogenously triggered by the implementation of the Clean Air Act of 1970 and actions around the 38th parallel between March and June, the war
the recession of the early 1980s. The decline in particulates significantly reduced reached a stalemate that lasted until the armistice of July 1953.
infant death rates. Almond et al. (2009) analyzed the effect of pollution stemming
The Korean War is the most brutal event that Korea has ever
from the Chernobyl disaster on the subjects of a Swedish cohort who were in utero
at the time of the disaster. The subjects exposed to radiation showed significantly
experienced in the modern era and one of the major international
reduced academic performance. wars fought after World War II. Statistical data on human losses
7
Eccleston (2011) found that cohorts exposed to the stress of the 11 September, caused by the war are unreliable and substantially vary based on
2001 terrorist attacks in utero have lesser weight, shorter gestation length, and sources. According to the official statistics of South Korea pub-
lower initial education attainments. Lauderdale (2006) reported that Arabic-named
lished in 1955, the number of civilian casualties during the three
women in California, who experienced a period of increased harassment, violence,
and workplace discrimination immediately following 11 September, 2001, have years of hostilities totaled 990,968, including 373,599 persons
increased risks of preterm birth and LBW. Currie and Rossin-Slater (2012) found killed, 229,625 injured, 84,532 abducted, and 303,212 missing in
that exposure to a hurricane during pregnancy increases the probability of labor South Korea alone (Chung, 2010). These estimates are likely under-
and delivery complications as well as abnormal conditions of the newborn. Camacho stated. Military casualties were also heavy. An estimated 36,940 US
(2008) reported that the intensity of random landmine explosion during a woman’s
pregnancy had a significant negative impact on child birth weight in Colombia.
8 Individuals who resided in war-torn countries in childhood during World War II experience greater educational and earning losses at older ages compared
9
to those from countries not involved in the war (Ichino and Winter-Ebmer, 2004). In some societies, for example, parents tend to invest more in children with
Kesternich et al. (2011) reported that exposure to the World War II, as measured by poorer health to achieve more equal outcomes among children (a case of com-
the number of years an individual was directly affected by the war, and to individual- pensatory investment). By comparison, parents in other societies invest in their
level shocks caused by the war, such as hunger periods, significantly predict old-age healthiest children to increase the expected return of investment (a case of rein-
health and socioeconomic outcomes. forcing investment).
Fig. 1. Movements of the Frontline during the Korean War: 1950–1953.
servicemen and 245,000–415,000 South Korean soldiers were government prevented mass starvation, but these measures were
killed while in service during the war. still insufficient to solve the severe food shortage.10
During the first several months of the war, numerous South
Koreans remained in the areas that were occupied by North Korean
forces. The exact number of civilians who failed to escape from 4. Data and methods
North Korean control remains unknown. An estimated 1 million
(or 50%) residents in Seoul are believed to have remained in the 4.1. Identifying the effect of war: timing of birth
city during the first three months of captivity (Yang, 2010, p. 576).
Many of the people under communist control suffered from atroc- We hypothesize that the various disruptions caused by the
ities such as massacres, tortures, imprisonment, and abduction. Korean War adversely affected maternal and fetal health. Mea-
Civilians who remained under North Korean control were also vic- suring the magnitude of war-related negative health shocks is
timized by heavy bombings by the UN air forces that attempted central to evaluating this hypothesis. However, data that can be
to destroy North Korean military facilities and supply lines (Kim, used to link adult health and socioeconomic outcomes to detailed
2008). wartime experiences are unavailable. The key strategy that we
Refugee experience was no less arduous. Many refugees trav- employed in measuring the effect of the war is to investigate
eled on foot for days, carrying as many of their belongings as how individual socioeconomic outcomes at old age (available
possible. People who were fortunate to get a train ride endured from various micro data) differ using the timing and the place of
overly crowded cabins. The small areas in the southeast that man- birth.11
aged to avoid occupation, especially the two large cities of Busan
and Daegu, were the main refugee destinations. Busan City, with
a population of about 400,000, became home to more than 1 mil- 10
See memoirs by Korean writers including Kim (2006) and Chun (2006).
lion people in only several months. Away from home, the refugees 11
This study closely follows the methods employed in major studies that exam-
endured chronic hunger and slept without proper shelter. Emer- ined the effect of the 1918 Pandemic Influenza by looking into the differences in
gency aid provided by the UN forces and ration distributions by the adult outcomes by birth group (Almond et al., 2005; Almond, 2006).
A comparison of two birth cohorts while keeping all other fac- where yi denotes the census outcome for individual i; It denotes
tors constant is infeasible.12 However, one feature of the Korean the dummy variable with a value of 1 for individuals born in year t,
War helps disentangle the cohort effect, that is, civilians were and 0 otherwise; YOB denotes the last two digits of the birth year;
directly exposed to major war disruptions for only a relatively brief and $\beta_t$ measures the departure of outcomes for the birth cohorts in
period. Although the war lasted more than three years, the major utero during the Korean War from the cubic cohort trend.
damage sustained by civilians happened in the first nine months
following the sudden invasion of North Korea (late June 1950 to late 4.2. Identifying the effect of war: place of birth
March 1951). During these months, the frontline troops rapidly and
unexpectedly moved back and forth across the region, and civil- The severity of wartime experiences differs by region of resi-
ians suffered through occupation or refugee experience (Fig. 1). The dence, which offers another opportunity to identify the effect of the
war reached a stalemate around the 38th parallel after the spring war on maternal and fetal health. Because of the manners the front-
of 1951, and the civilian life in the south of the battlefield more or line moved during the war, as explained in the preceding section,
less stabilized. Thus, a more reasonable assumption is that the fetal the extent of war-related disruptions experienced by a particular
health of individuals born in 1951 or late 1950 is more seriously locality was strongly influenced by the distance to the border (the
damaged by war-related disruptions than other birth cohorts. 38th parallel). That is, the closer to the 38th parallel a person was
Two more assumptions are required to add credence to the located, the more arduous his or her wartime experiences would
hypothesis of this study. First, following the fetal origins hypoth- be. For example, the residents of the Central Region, the northern
esis, a particular wartime event is assumed to cause a stronger provinces of South Korea, had had less time to respond to the sur-
negative health shock while an individual is in utero than in the prise attack of the North Korean forces compared with the residents
postnatal period. Nearly all Koreans born either prior to or shortly of the southern regions because of their proximity to North Korea.
after the Korean War should be negatively influenced by the war As a consequence, they likely suffered from the occupation of North
in one way or another. With all other factors being equal, if in utero Korean troops and the bombings by UN forces. Even if the residents
circumstances are more critical than, say, early childhood circum- of the Central Region managed to escape from the communist con-
stances, then the adult outcomes of the cohorts who were in utero trol, they would have had to travel a longer distance to reach the
from the summer of 1950 to the spring of 1951 should be worse Busan perimeter. Moreover, since a large part of the Central Region
than those of neighboring cohorts. was occupied by North Korea twice, many of the residents in the
Second, all other experiences, except the war effect, are assumed region were displaced twice.
to continuously or gradually change across the birth cohorts con- The establishment of the 38th parallel as the border between
sidered in the analysis (those born in 1945 through 1959). In other North and South Korea suggests that the geographic variations in
words, no other major events (epidemics, short-term economic war-caused damage sustained by civilians are perhaps exogenously
shocks, institutional changes, and so on) differentially affect the determined. The parallel was abruptly established as the boundary
1950–1951 birth cohorts and those adjacent to it. This premise that divides Korea into two spheres, which would be split by the US
appears to be a reasonable assumption. For example, the number and Soviet Union troops after the Second World War. The United
of students entering a college each year in Korea, a key institu- States proposed to the Soviet Union that it would take control of
tional factor that determines the educational attainment of each the northern half of Korea, a Japanese colony at the time, in an
birth cohort, gradually increased from 1960 to 2000, with no dis- attempt to prevent the further advance of the Red Army when the
criminating effect on a particular birth cohort (Lee, 2013, Appendix Japanese surrender in August 1945 became apparent. The 38th par-
Fig. A1). Similarly, although several disease epidemics occurred allel was chosen because it roughly divides the Korean peninsula
between 1945 and 1955, no evidence suggests that a particular in the middle. This arbitrary decision made the temporary bound-
birth cohort underwent a considerably worse disease environment ary to separate the same province, county, and even village into
(Chun, 2011). two. Initially, no one expected that it would become a permanent
Two types of evidence for the cohort effect are presented below. barrier separating Korea. The 38th parallel was not a formal border
First, the raw data on adult outcomes for each birth cohort are and people could freely cross it until the summer of 1948. The prox-
graphically displayed. Whether the 1950 or 1951 cohort reveals imity to North Korea and the characteristics of a particular locality
discontinuous cohort effects is determined. For measures of socioe- or residents are unlikely correlated, because the 38th parallel was
conomic outcomes drawn from the 2000 census (reporting the drawn suddenly and arbitrarily, and the Korean War broke out only
place of birth), the results for individuals born in the Central Region two years after it became a formal border.
are highlighted because they experienced particularly severe hard- For this study, the residence of the mother of a person who lived
ships during the war, as explained later in the paper. at the beginning of the war (summer of 1950) should be consid-
Second, the deviations of census outcomes from smooth cohort ered in constructing the location-specific measures of war-related
trends are systematically estimated for individuals born between disruptions. Unfortunately, information regarding maternal resi-
1945 and 1959. The estimation is carried out using the following dence is unavailable. However, the place of birth of Korean War
regression equation: cohorts is a reasonably effective proxy for the following reasons.
$$ y_i = \alpha + \sum_{t=1950}^{1951} \beta_t I_{it} + \gamma_1 \mathrm{YOB}_i + \gamma_2 \mathrm{YOB}_i^2 + \gamma_3 \mathrm{YOB}_i^3 + \varepsilon_i \qquad (1) $$

First, descriptive records suggest that most refugees returned home as soon as their hometowns were liberated by the UN forces. Second, even individuals who were born while in refuge regarded their permanent homes as their birthplaces. These claims are supported by the regional composition of birth-by-birth cohort (Lee,
2013, Appendix Fig. A2). If many people were born in the unoccu-
pied Busan perimeter and recorded it as their place of birth, the
percentage of people born in this area should have increased dur-
12
Given the linear dependence among period, age, and cohort, implementing ing the early stages of the war. However, the numbers remained
changes in cohort necessarily imposes alterations on either age or period. Given
that most socioeconomic outcomes only tend to change gradually across cohorts,
remarkably stable from 1949 to 1952.
identifying a cohort effect from other smooth effects is difficult. An example is the To study the magnitude of wartime stress experienced while in
effect of changes in secular income over time. utero, a difference-in-difference method is employed in examining
the variations across places of birth. The basic idea is that if the timing and duration of in utero exposure to wartime disruptions
cohort effect captures the direct influences of war-related disrup- difficult.
tions, then it should be stronger for those persons who underwent The analyses in this study regarding the effect of the war on
more traumatic wartime experiences. To verify this assumption, socioeconomic outcomes at older ages largely rely on the 1990 and
we use regression to estimate the following equation 2000 censuses because the individuals born during the Korean War
reached their prime ages (37–50 years old) from 1990 to 2000 in terms of socioeconomic status. Furthermore, the 1990 and 2000 censuses provide detailed geographic codes for the places of birth, which enables this paper to study how the effect of war differs by

$$ y_i = \alpha + \beta_1 I_i + \beta_2 (I_i \times S_i) + \beta_3 S_i + \gamma_1 \mathrm{YOB}_i + \gamma_2 \mathrm{YOB}_i^2 + \gamma_3 \mathrm{YOB}_i^3 + \delta X_i + \varepsilon_i \qquad (2) $$

where I denotes the dummy variable that identifies an individ- places of birth. The following measures of socioeconomic outcomes
ual as in utero between the summer of 1950 and the spring of are considered: (1) years of schooling; (2) probability of enter-
1951, S is a measure of wartime stress, and X represents other ing college; (3) probability of having primary school education or
location-specific factors of the given adult outcome. In addition lower; (4) probability of having professional employment in 2000;
to simply including a cubic cohort trend, more non-parametric (5) probability of having an unskilled job in 2000; (6) probability of
approaches to specifying cohort effects will be considered remaining never married by 2000; (7) years of the spouse’s school-
later. ing in 2000; and (8) the number of children in 2000.16 The quality
The analyses below employ the following location-specific mea- of occupation is considered only for 2000 because of data problem
sures of wartime stress. First, a dummy variable indicating whether of the 1990 census.17 Marital status, marriage selection, and fertil-
a person was born in the Central Region is used because of the ity are analyzed based on the 2000 census because the subjects of
higher degree of suffering that these citizens endured. Second, the youngest birth cohort considered in the study (the 1959 birth
dummy variables show if the place of birth belongs to either of the cohort) were too young to complete marriage or fertility by 1990.
following two categories: the localities that were occupied twice by Analyzing the number of children was limited to the females who
the communist forces or the remaining areas in the Central Region have ever married according to how it is reported in the censuses;
captured only once. Third, the distance between the place of birth for males, the number of children of the current wife is used in the
and Busan City is employed.13 A longer distance from the major analysis.
destination for refugees should be related to a higher probability Measures of adult health are obtained from the 2005 census,
of being occupied by the North Korean forces, a greater exposure which indicates whether a person has disability and functional
to UN bombings, and more arduous refugee experiences. Finally, limitation for all respondents.18 Specifically, the following health
the duration of North Korean control is considered.14 The longer outcomes are considered: (1) probability of having disability and
the duration under occupation, the more devastating the suffering (2) probability of suffering from limited daily activities. The micro
of the residents should be regardless of whether they managed to sample of the 2005 census does not indicate the place of birth.
escape. Thus, comparing the health outcomes across birth cohorts can-
not be carried out separately for each region of birth. Mortality at
middle and old age is also considered as a measure of health. The
4.3. Data and variables on adult outcomes micro data of the Vital Statistics for death provide the universe of
digitalized death registration records in Korea. These data are cur-
The following micro datasets are used: the 2% micro samples rently available for the years 1991–2010. Utilizing the sources, the
of the population censuses (referred to as the Census, hereafter) yearly death rate for each birth cohort is computed for males and
for 1990, 2000, and 2005, and the Vital Statistics for death from females.
1991 to 2009. Each of these sources has its own advantages and Table 1 presents the sample means of the socioeconomic and
drawbacks in terms of sample size and variables offered. The major health outcomes of all male and female individuals and those
advantages of the Census are its large sample size and its provision who were born in the Central Region from 1945 to 1959. The
of the places of birth, which allow for an accurate estimation of mean of schooling years in the pooled censuses of 1990 and
adult outcomes by birth cohort and place of birth.15 Conversely, 2000 for the males and females are, respectively, 11.8 and 9.9.
the Census has two major limitations. First, it offers only a restric- 22.3% of males entered college, whereas only 8.5% of females
tive set of variables on personal characteristics, especially health did. Individuals born in the Central Region were more edu-
outcomes. Second, it does not provide a report on the month or cated than the entire sample. In terms of labor-market outcomes,
quarter of birth, thereby making the identification of the exact 9.6% of males and 67.8% of females of the pooled sample were
not working. Of those who worked in 2000, 25.8% of males
and 9% of females held professional jobs, while 24.2% of males
13
The distance between the municipal office of each county (or district) and the and 17.5% of females comprised those in unskilled occupations.
city hall of Busan is measured. Natives of the Central Region were more likely to work and
14
The dates of occupation of a particular place by the North Korean forces and to be employed in professional occupations. In 2005, disability
its liberation by the UN troops are either drawn or estimated based on descriptive
histories of the Korean War and maps displaying changes in the battlefield over
rate is slightly higher for females than for males, whereas males
time. If the exact timing of an event cannot be determined based on the available were much more likely to die than females between 1991 and
sources, the frontline connecting two places is assumed, for which the date of the 2009.
event is known, moving with the same pace. The six-volume histories of the Korean
War compiled by a government agency (The War Memorial Society 1992) and the
Military Academy of the Korean Army (1998) were used as major sources.
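The interpolation rule described in footnote 14 can be illustrated with a minimal sketch. It assumes that the frontline moved between two reference places at a constant pace, so that the date for an intermediate place is proportional to its distance along the way. The function names, dates, and distances below are hypothetical and are not taken from the sources used in the study.

```python
from datetime import date, timedelta


def interpolate_event_date(date_a, date_b, dist_from_a_km, dist_a_to_b_km):
    """Estimate when the frontline reached a place lying between A and B,
    assuming it advanced from A to B at a constant pace (an assumption)."""
    if dist_a_to_b_km <= 0:
        raise ValueError("distance between reference places must be positive")
    if not 0 <= dist_from_a_km <= dist_a_to_b_km:
        raise ValueError("place must lie between the two reference places")
    share = dist_from_a_km / dist_a_to_b_km
    total_days = (date_b - date_a).days
    return date_a + timedelta(days=round(share * total_days))


def occupation_days(occupied_on, liberated_on):
    """Duration of occupation in days, given the two (estimated) dates."""
    return (liberated_on - occupied_on).days


# Hypothetical inputs: A reached on 1 July 1950, B on 21 July 1950,
# and the place of interest lies 30 km from A on a 120 km line from A to B.
estimated = interpolate_event_date(date(1950, 7, 1), date(1950, 7, 21), 30.0, 120.0)
days = occupation_days(estimated, date(1950, 9, 28))  # hypothetical liberation date
print(estimated, days)
```

With these illustrative inputs, the place is assigned an occupation date one quarter of the way through the 20-day interval, and the duration measure is the number of days until the assumed liberation date.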
15 16
The micro sample of the 1980 census also provides information on the place of The occupations include professionals, managers, officials, and technical experts
birth as well as several socioeconomic outcomes. However, some individuals who (2000 census occupation codes 0–293). The unskilled jobs include manual labor and
belonged to the post-war birth cohorts and who were compared with the Korean operatives (2000 census occupation codes 811–942).
17
War cohorts in this study were too young in 1980 to provide accurate measures Measures of the quality of occupation cannot be obtained from the 1990 census
of adult outcomes. For example, individuals in the 1959 birth cohort, who were because of serious errors in the occupation codes given in the public-use micro
included in the sample used in the analyses conducted in the balance of the study sample of the 1990 census.
18
in 1980, were only 21 years old, too young to complete school, and to start a job The 2000 census offers information on functional limitations only for individuals
career. aged 65 and older.
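To make the logic of Eq. (1) concrete, the following is a minimal sketch rather than the estimation code used in the study. It fits dummies for the 1950 and 1951 cohorts together with a cubic cohort trend to simulated cohort-level outcomes and checks that the built-in departures are recovered. All data are simulated, the helper names are hypothetical, and YOB is centred at 52 purely for numerical stability (an assumption; centring changes the intercept and trend coefficients but not the cohort dummies).

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


CENTER = 52  # hypothetical centring constant (midpoint of cohorts 45..59)


def design_row(yob):
    """Regressors of Eq. (1): constant, I_1950, I_1951, and a cubic in YOB."""
    x = yob - CENTER
    return [1.0, float(yob == 50), float(yob == 51),
            float(x), float(x ** 2), float(x ** 3)]


def simulated_outcome(yob):
    """Simulated cohort mean: smooth cubic trend plus assumed war-cohort departures."""
    x = yob - CENTER
    trend = 10.0 + 0.2 * x + 0.01 * x ** 2 - 0.001 * x ** 3
    return trend + {50: -0.3, 51: -0.5}.get(yob, 0.0)


cohorts = list(range(45, 60))  # birth years 1945 through 1959
coef = ols([design_row(c) for c in cohorts],
           [simulated_outcome(c) for c in cohorts])
beta_1950, beta_1951 = coef[1], coef[2]
print(round(beta_1950, 4), round(beta_1951, 4))
```

Because the simulated outcomes contain no noise, the two estimated coefficients reproduce the assumed departures of the 1950 and 1951 cohorts from the cubic trend; with census micro data, the same design would be applied at the individual level.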
Table 1
Sample means of socioeconomic and health outcomes of the 1945–1959 birth cohorts.

                                  All                 Born in the Central Region
                                  Male      Female    Male      Female
1990–2000 Censuses (pooled)
  Years of schooling              11.8      9.9       12.3      10.5
  % College or higher             22.3      8.5       26.3      10.4
  % Primary or lower              11.1      24.3      8.0       18.9
  % Not working                   9.6       67.8      8.9       67.8
2000 Census
  % Professional job              25.8      9.0       31.7      12.0
  % Unskilled job                 24.2      17.5      24.8      18.3
  % Never married                 2.2       1.7       2.5       2.0
  Spouse’s schooling (married)    9.6       10.3      10.1      10.9
  Fertility (married)             2.2       2.4       2.1       2.2
2005 Census
  % Disabled                      5.3       5.8       N/A       N/A
  % Activity limited              3.5       3.7       N/A       N/A
1991–2010 Vital statistics (death)
  1991–2010 Deaths per 100,000    627.3     218.7     N/A       N/A
  1991–2000 Deaths per 100,000    520.5     181.3     N/A       N/A
  2001–2010 Deaths per 100,000    734.1     256.1     N/A       N/A

Source: The 2% samples of the 1990, 2000 and 2005 censuses; the micro data on the Vital Statistics for death from 1991 to 2010.
Note: See text for the definition of the variables. Fertility for males refers to the total number of children born to the current wife of married men.
Fig. 2. Years of schooling by birth cohort (1944–1960), males and females: born in the Central Region.
Source: Pooled data of the 2% samples of the 1990 and 2000 censuses.
Fig. 3. % Professionals and unskilled workers among employed males in 2000, by birth cohort (1944–1960): born in the Central Region.
Source: The 2% sample of the 2000 census.
5. Long-term effect of the Korean War on adult socioeconomic outcomes

5.1. Deviations of socioeconomic outcomes from the smooth cohort trend

The educational attainment of individuals born in the Central Region from 1945 to 1959 was measured in terms of years of schooling (Fig. 2), college admission (Lee, 2013, Appendix Fig. A3), and completion of primary schooling or lower (Lee, 2013, Appendix Fig. A4), as obtained from the pooled sample of the 1990 and 2000 Censuses. These cohorts show a strong upward trend in educational attainment, with the 1950 and 1951 birth cohorts veering from this steady trend. The deviation of the 1951 cohort from the long-term increase in schooling is clear; subjects in the 1951 cohort received approximately two months less schooling than those born in 1949. They were less likely to enter college and more likely to achieve the lowest educational attainment.19

Fig. 3 presents two measures of success in the labor market for male subjects born in the Central Region from 1945 to 1959. The percentages of those professionally employed and those who were manual laborers were calculated from the 2000 census.20 The analysis is limited to males because of the low labor force participation of females belonging to the cohorts. As with schooling trends, the lower occupational achievement of the 1951 male birth cohort is evident. Male workers born in 1951 are much less likely to be employed in a professional job and more likely to hold an unskilled manual occupation than predicted by long-term trends.21

Adult outcomes on marriage and reproduction were also compared across birth cohorts, as shown in Figs. 4–6. Males born in 1951 are more likely to remain unmarried compared with the neighboring male birth cohorts. In contrast, the percentage of females who never married is higher for the 1952 and 1953 birth cohorts (Fig. 4). The spouses of men and women born in 1951 are more negatively selected in terms of educational attainment compared with the adjacent birth cohorts (Fig. 5). The lifetime fertility of females born in 1951 is higher than the predicted long-term cohort trend (Fig. 6).

Fig. 4. % Never married persons in 2000, by birth cohort (1944–1960), males and females: born in the Central Region.
Fig. 5. Spouse’s years of schooling in 2000 by birth cohort (1944–1960), female (husband’s) and male (wife’s): born in the Central Region.
Source: Currently married males and ever married females in the 2% sample of the 2000 census.
Fig. 6. Number of children in 2000, by birth cohort (1944–1960), males and females: born in the Central Region. Note: The total number of children of females who have ever married; and the number of children for the current wife for males.
Source: The 2% sample of the 2000 census.

Columns 1 and 2 of Table 2 provide the estimates of β_t in Eq. (1), indicating the magnitude of the outcome deviation of male subjects born in 1950 and 1951 from the cubic cohort trend. In testing statistical significance, standard errors were adjusted for clustering at the birth-year and region levels.22 The regression results confirm that educational attainment, employment rate, quality of occupation, and education of the spouse of the 1951 birth cohort are significantly lower than predicted by the trend, especially among subjects born in the Central Region.23 If born in the Central Region, the male subjects of the 1951 birth cohort are more likely to remain bachelors, and the percentage of those who never married is low for all males born in 1951. The fertility of the female spouse is higher for the 1951 male birth cohort than that predicted by the cubic cohort trend. For all measures, the 1951 cohort effect is statistically significant in the sample limited to people born in the Central Region (Table 2, Column 2). This effect remains statistically significant when the entire sample is used (Column 1) but at a smaller magnitude. However, the sign of the coefficient changes for the probability of being never married. By contrast, the cohort effect is insignificant for individuals born in the Southern Region, including the areas occupied by North Korea (not reported in the table).

Columns 3 and 4 of Table 2 present the regression results for females, which correspond well with the patterns of the cohort effect. Female subjects of the 1951 birth cohort who were born in the Central Region exhibit clear disadvantages in all three measures of educational attainment, the percentage of the unskilled among the employed, and spouse’s schooling (Panels A to C, F, and H of Column 4). Females born in 1951 had more children than the neighboring birth cohorts, as in the case of males. However, unlike males, this female cohort does not significantly differ from other cohorts in terms of employment status and the percentage of workers employed in professional occupations. This unremarkable effect on female labor-market outcomes is unsurprising, given the low labor-force participation of women from this cohort. No negative effect on female socioeconomic outcomes (not reported here) was observed among those born in the Southern Region in 1951.

A possible concern with the baseline regressions conducted above is that a cubic function of the year of birth may not accurately capture the actual smooth cohort trend. To address this concern, a more flexible cohort trend is applied to the cohort analysis, in which a quadratic cohort trend is allowed to change across the pre-war (1945–1949), wartime (1950–1954), and post-war (1955–1959) cohorts. The results are similar to those based on the cubic specification, especially for males born in 1951 (Lee, 2013, Appendix Table A1). For females, the negative effects of birth in 1950 or 1951 on education and occupation are more strongly revealed if the new specification is applied.

19 Although not presented here, digression of the 1951 birth cohort from the trend is also observed in the entire sample, albeit at a smaller magnitude. Educational attainment either stagnated (admission to college or higher for males and females, and primary school or lower for females) or improved at a rate slower than the predicted long-term trend (years of schooling for males and females, and primary school or lower for males) from the 1950 to the 1951 cohorts.
20 These measures of labor market outcomes are also computed for the entire sample, including non-participants in the labor market and the unemployed. The unconditional percentages have similar implications, suggesting that the results are not primarily driven by decisions on labor force participation.
21 The same measures constructed for the entire sample (also not presented in the paper) show similar handicaps for the 1951 birth cohort, although the magnitude of deviation from the trend is smaller compared to that in the Central Region.
22 All the coefficients of interest are jointly significant at the 1% level for all the regressions reported in this paper.
23 The negative effect of being born in 1951 on the quality of occupation is not fully explained by the disadvantage in educational attainment of the cohort. Although the education factor is controlled, the 1951 cohort effect remains statistically significant.

5.2. Cohort effect, wartime stress, and socioeconomic outcomes: difference-in-difference method

If the deviation of Census outcomes of the 1951 birth cohort reflects the influences of war-related disruptions experienced while in utero, then its magnitude should be larger for
Table 2
Deviation of the 1950 and 1951 birth cohorts’ adult socioeconomic outcomes from 1945 to 1959 trend.
Outcome Males Females
(1) (2) (3) (4)

South Korea Central Region South Korea Central Region
∂y/∂x S.E. ∂y/∂x S.E. ∂y/∂x S.E. ∂y/∂x S.E.
A. Years of schooling
Born in 1950 −0.0911 0.0357 −0.1703 0.0194 −0.1023 0.0194 0.0321 0.0423
Born in 1951 −0.1660 0.0354 −0.4323 0.0198 −0.1681 0.0198 −0.2883 0.0406
B. College or higher
Born in 1950 −0.0099 0.0029 −0.0113 0.0034 −0.0001 0.0015 0.0103 0.0046
Born in 1951 −0.0122 0.0031 −0.0281 0.0033 −0.0074 0.0017 −0.0173 0.0043
C. Primary or less
Born in 1950 0.0052 0.0021 0.0187 0.0026 0.0109 0.0025 −0.0026 0.0028
Born in 1951 0.0070 0.0020 0.0178 0.0027 0.0153 0.0022 0.0234 0.0025
D. Not working
Born in 1950 0.0079 0.0011 0.0056 0.0022 −0.0079 0.0013 −0.0001 0.0029
Born in 1951 0.0004 0.0010 0.0077 0.0019 −0.0032 0.0012 −0.0022 0.0024
E. Professional job
Born in 1950 −0.0034 0.0033 −0.0028 0.0054 −0.0055 0.0025 −0.0039 0.0060
Born in 1951 −0.0176 0.0027 −0.0485 0.0049 −0.0001 0.0023 −0.0068 0.0050
F. Unskilled job
Born in 1950 0.0122 0.0019 0.0084 0.0035 0.0123 0.0023 0.0349 0.0051
Born in 1951 0.0122 0.0018 0.0471 0.0034 0.0026 0.0021 0.0333 0.0048
G. Never married
Born in 1950 0.0007 0.0006 0.0007 0.0015 −0.0017 0.0011 −0.0087 0.0036
Born in 1951 −0.0012 0.0007 −0.0048 0.0015 −0.0027 0.0009 −0.0067 0.0034
H. Spouse’s schooling
Born in 1950 −0.0287 0.0203 −0.0596 0.0233 −0.0867 0.0329 −0.0961 0.0402
Born in 1951 −0.1684 0.0192 −0.4065 0.0188 −0.1764 0.0319 −0.2632 0.0364
I. Number of children
Born in 1950 0.0039 0.0059 −0.0024 0.0109 0.0525 0.0156 −0.0123 0.0303
Born in 1951 0.0123 0.0052 0.0564 0.0093 0.1096 0.0144 0.1696 0.0264
Source: The pooled sample of the 2% micro sample of the 1990 and 2000 population censuses.
Notes: (a) “South Korea” includes persons who were born in North Korea and those for whom the place of birth is not reported; (b) For panels E through I, the 2% sample of
the 2000 census is utilized, and for panels E and F, the sample is limited to persons who were employed at the time of census enumeration; (c) Polynomials of the year of
birth (YOB, YOB², and YOB³) are included in the regressions; (d) The number of children for males (Columns 1 and 2 of Panel I) is the total number of children born to the
wife of currently married men; (e) Standard errors are adjusted for clustering at the levels of year and region of birth; and (f) The estimated coefficient in bold number is
statistically significant at 10% level.
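The cohort-deviation regressions summarized in Table 2 can be illustrated with a minimal sketch. It assumes cohort-level means rather than the census micro data, uses made-up outcome values with planted deviations, and fits ordinary least squares with plain Python; the helper names (`solve`, `ols`, `design_row`) and the scaling of the birth year are illustrative choices, not the author's implementation, and clustered standard errors are omitted.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(yob):
    """Intercept, cubic cohort trend, and dummies for the 1950 and 1951 cohorts."""
    t = (yob - 1952) / 7.0  # centered and scaled for numerical stability
    return [1.0, t, t ** 2, t ** 3, float(yob == 1950), float(yob == 1951)]


# Hypothetical cohort means: a smooth cubic trend plus planted deviations.
PLANTED = {1950: -0.17, 1951: -0.43}  # illustrative values only
cohorts = list(range(1945, 1960))
rows = [design_row(yob) for yob in cohorts]
outcome = [
    11.0 + 1.5 * r[1] + 0.3 * r[2] - 0.2 * r[3] + PLANTED.get(yob, 0.0)
    for yob, r in zip(cohorts, rows)
]

coef = ols(rows, outcome)
dev_1950, dev_1951 = coef[4], coef[5]  # deviations from the cubic trend
print(round(dev_1950, 4), round(dev_1951, 4))
```

Because the synthetic data contain no noise, the two dummy coefficients recover the planted deviations; with real micro data the same dummies would be estimated with sampling error.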
Table 3
Summary of regressions for difference-in-difference model: interaction between measure of wartime stress and deviation of the 1951 birth cohort’s adult outcomes from
1945 to 1959 trend.
Outcome                  Measure of wartime stress
                         (1) Central Region    (2) Occupied twice    (3) Distance from Busan    (4) Duration of occupation
                         Coef.    S.E.         Coef.    S.E.         Coef.    S.E.              Coef.    S.E.
1. Males
A. Years of schooling −0.3212 0.1193 −0.5232 0.1828 −0.1852 0.0572 −0.1426 0.0533
B. College or higher −0.0236 0.0141 −0.0552 0.0205 −0.0147 0.0068 −0.0107 0.0066
C. Primary or less 0.0058 0.0095 0.0085 0.0130 0.0100 0.0040 0.0099 0.0041
D. Not working 0.0145 0.0081 0.0269 0.0122 0.0084 0.0039 0.0059 0.0034
E. Professional job −0.0540 0.0207 −0.0584 0.0317 −0.0155 0.0092 −0.0064 0.0105
F. Unskilled job 0.0539 0.0195 0.0783 0.0275 0.0282 0.0088 0.0289 0.0104
G. Never married 0.0044 0.0049 0.0083 0.0088 0.0016 0.0026 −0.0003 0.0027
H. Spouse’s schooling −0.3116 0.1580 −0.3475 0.2607 −0.1620 0.0750 −0.1730 0.0832
I. Number of children 0.0664 0.0327 0.0554 0.0477 0.0327 0.0148 0.0265 0.0156
2. Females
A. Years of schooling −0.2581 0.1330 −0.2402 0.1989 −0.1378 0.0753 −0.0991 0.0669
B. College or higher −0.0283 0.0096 −0.0303 0.0162 −0.0133 0.0062 −0.0092 0.0053
C. Primary or less 0.0063 0.0153 −0.0036 0.0218 0.0075 0.0074 0.0087 0.0070
D. Not working 0.0004 0.0152 0.0024 0.0192 −0.0051 0.0070 −0.0002 0.0066
E. Professional job −0.0229 0.0165 −0.0300 0.0244 −0.0072 0.0086 −0.0142 0.0097
F. Unskilled job 0.0309 0.0245 0.0116 0.0335 0.0260 0.0096 0.0294 0.0108
G. Never married −0.0050 0.0037 −0.0076 0.0052 −0.0040 0.0020 −0.0026 0.0023
H. Spouse’s schooling −0.1694 0.1776 −0.3092 0.2494 −0.0812 0.0866 −0.0795 0.0959
I. Number of children 0.0987 0.0620 0.1324 0.0990 0.0349 0.0296 0.0375 0.0316
Source: The pooled data of the 2% micro sample of the 1990 and 2000 population censuses.
Notes: (a) Summary of 72 regressions; (b) The sample is limited to those for whom the place of birth is reported; (c) For panels E through I, the 2% sample of the 2000 census
is utilized, and for panels E and F, the sample is limited to persons who were employed at the time of census enumeration; (d) A cubic cohort trend is included in each
regression but omitted from the table; (e) The number of children for males (Panel 1–I) is the total number of children born to the current wife of married men; (f) Standard
errors are adjusted for clustering at the levels of year and county of birth; and (g) The estimated coefficient in bold number is statistically significant at 10% level.
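Eq. (2) is not reproduced in this part of the text. One specification consistent with the description of the regressions summarized in Table 3 is sketched below; the coefficient labels and the exact functional form are assumptions made for illustration, with S_i denoting the measure of wartime stress as in Eq. (3).

```latex
y_i = \alpha
    + \beta_1\,\mathrm{BORN1951}_i
    + \beta_2\,(\mathrm{BORN1951}_i \times S_i)
    + \beta_3\,S_i
    + \gamma_1\,\mathrm{YOB}_i + \gamma_2\,\mathrm{YOB}_i^{2} + \gamma_3\,\mathrm{YOB}_i^{3}
    + \varepsilon_i
```

Under this reading, the coefficients reported in Table 3 correspond to \beta_2. For a binary measure such as the Central Region dummy, \beta_1 is the deviation of the 1951 cohort from the common cubic trend among those with S_i = 0, and \beta_1 + \beta_2 is the corresponding deviation among those with S_i = 1, so \beta_2 is the difference between the two deviations.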
individuals exposed to worse in-utero circumstances during the war. This hypothesis is consistent with the fact that the disadvantages of the 1951 birth cohort are particularly pronounced among individuals born in the Central Region. Further analyses employing more direct and detailed measures of wartime stress (provided in the remainder of this subsection) confirm that the 1951 cohort effect is indeed stronger for individuals with more arduous in-utero wartime experiences.

Fig. 7 compares the schooling years of males from the 1951 birth cohort against those of neighboring cohorts (either 1950 or 1952) using the distance between their place of birth and Busan, a measure of wartime stress. Individuals born near Busan (within 200 kilometers) and who were from the 1951 birth cohort acquired slightly more schooling than those from adjacent cohorts. The direction of this schooling gap is reversed for areas that are 200–300 km away from Busan. The disadvantages in schooling for the 1951 cohort versus other cohorts are particularly pronounced for individuals born far from Busan. The results obtained from female samples are similar to those from males, wherein the 1951 cohort effect increases with the severity of wartime stress measured by the distance from Busan. Analyses that use either an alternative measure of wartime stress (such as the duration of occupation) or a different adult outcome (such as the quality of occupation) yield remarkably similar implications.24

Fig. 7. Years of male schooling by distance from Busan (0–100, 100–200, 200–300, 300–400, and over 400 km), 1951 cohort versus 1950/1952 cohorts.
Source: The 2% sample of the 2000 census.

Regressions are conducted based on a difference-in-difference model, as displayed in Eq. (2). If the 1951 cohort effect increases with the magnitude of wartime stress, then the coefficient of interaction between the dummy variable for the 1951 birth (BORN1951) and the measure of wartime stress should be negative if the dependent variable is a desirable outcome. The cubic cohort trend (i.e., YOB, YOB², and YOB³), dummy variable for the 1951 birth, and measure of negative shocks caused by the war are controlled. Four measures of wartime stress are employed: (a) dummy variable for birth in the Central Region (Central Region); (b) dummy variables for birth in the localities occupied twice (Occupied twice) and birth in other places in the Central Region that were captured only once; (c) the distance between the place of birth and Busan, measured in 100 kilometers (Distance from Busan); and (d) the number of months under occupation by North Korean forces (Duration of occupation).

Table 3 reports the summary results of 72 difference-in-difference regressions conducted on the 9 measures of socioeconomic outcomes cited in this study. The results suggest that for a wide range of socioeconomic outcomes, the 1951 cohort effect is stronger for individuals who suffered more serious wartime stress while in utero. For males, the coefficient of the interaction between birth in 1951 and wartime stress is statistically significant in 26 out of 36 regressions. In all adult outcomes, except the probability of being never married (Panel 1-G), the coefficient for interaction is statistically significant in at least 2 specifications; in 7 out of 9 measures, it is significant in at least 3 specifications.25

The results among females are similar to those among males in terms of educational attainment. However, prenatal exposure to war-related disruptions had considerably weaker effects on the labor-market, marriage, and fertility outcomes of women than on those of men. Only the probability of employment in unskilled occupations in 2000 was significantly affected by the interaction between birth in 1951 and measure of wartime stress (Panel 2-F, Columns 3 and 4). The significantly lower and, presumably, more selective labor force participation of women born prior to 1960 is a possible explanation for the gender difference in labor-market outcomes.26

The measures of the intensity of the war likely capture the exogenously determined distance between the place of birth and the 38th parallel, but they might be correlated with selections of the treated group (i.e., the 1951 birth cohort born in the Central Region) as well.27 Accordingly, a number of control variables pertaining to the characteristics of the place of birth were added to the difference-in-difference regressions, including (a) the extent of urbanization, (b) educational attainment, (c) Province dummy, and (d) urbanization and education combined. To construct the variables for extent of urbanization, places of birth were classified as follows: (a) Seoul; (b) five metro cities, namely, Busan, Inchon, Daegu, Deajun, and Kwangju; (c) all other smaller cities; and (d) rural areas (omitted category). For variables on educational attainment of each province of birth, we computed the percentage of the adult population in each of the following education categories: (a) primary school or less (omitted category), (b) middle school, (c) high school, and (d) college or higher.

24 Lower schooling of the 1951 birth cohort is apparent among individuals born in areas under communist control for a prolonged period (Lee, 2013, Appendix Figure A5). The disadvantages experienced by the 1951 male cohort in terms of labor market performance (measured by the percentage of unskilled workers) are also larger among those exposed to more serious war-related disruptions, which are represented by the greater distance from Busan or the longer occupation by enemy forces (Lee, 2013, Appendix Figure A6). Similar results are obtained using percentage of males with professional employment as measure of occupational success (not reported in the paper).
25 The entire Southern Region is the omitted category in the regressions in which CENTRAL and CENOCCUP2 are used as measures of wartime stress (Columns 1 and 2 of Table 3). If Busan is chosen as the omitted category, the absolute magnitudes of the regression coefficients for the interaction terms generally increase. For example, the estimated coefficient of CENTRAL interacting with BORN1951 for male schooling changes from −0.3212 (1-A of Column 1) to −0.3793 (p-value = 0.0271). Female schooling changes from −0.2581 to −0.4266 (0.0466).
26 The possibility of the effect of in utero exposure to war-related disruptions on change across ages was also investigated. The pooled sample of the 1990 and 2000 censuses is used by allowing the change in regression coefficients over the decade. The results for the years of schooling reported in Appendix Table A2 of Lee (2013) suggest that the differences between the two census years are statistically insignificant.
27 Variables on parental characteristics should be added to the regressions to address this concern. However, census reports the information on parents only if they reside in the same household. As the survival of and co-residence with parents are selective, conducting analyses based on a sample of people living with their parents would be problematic.

Table 4 presents the regression results for selected adult measures on which the interaction term of the 1951 birth and measure of wartime stress has a relatively strong effect in the original regressions (Table 3). Only the coefficient of the interaction term is reported for 112 regressions (seven adult outcomes, four measures of wartime stress, and four models). Except for female years of schooling and male college education
Table 4
Robustness tests for difference-in-difference model: interaction between measure of wartime stress and deviation of the 1951 birth cohort’s adult outcomes from 1945 to
1959 trend.
Outcome and measure of wartime stress    Model 1 + urbanization    Model 2 + education    Model 3 + province dummy    Model 4 + urbanization and education
1. Male schooling
A. Central Region −0.2054 0.1049 −0.2536 0.1723 −0.2376 0.1701 −0.2491 0.1704
B. Occupied twice −0.3987 0.1366 −0.3758 0.2165 −0.3686 0.2097 −0.3610 0.2054
C. Distance from Busan −0.1144 0.0438 −0.1232 0.0626 −0.1221 0.0613 −0.1197 0.0610
D. Duration of occupation −0.0831 0.0435 −0.1424 0.0740 −0.1565 0.0722 −0.1300 0.0758
2. Female schooling
A. Central Region −0.1096 0.1026 −0.1671 0.1519 −0.1454 0.1477 −0.1598 0.1497
B. Occupied twice −0.0673 0.1371 −0.1496 0.2147 −0.1335 0.2001 −0.1506 0.2067
3. Male college
A. Central Region −0.0212 0.0132 0.0260 0.0217 −0.0236 0.0214 −0.0249 0.0213
B. Occupied twice −0.0488 0.0168 −0.0412 0.0279 −0.0393 0.0275 −0.0387 0.0267
D. Duration of occupation −0.0027 0.0060 0.0016 0.0105 −0.0007 0.0103 0.0032 0.0106
4. Female college
A. Central Region −0.0222 0.0078 −0.0353 0.0114 −0.0319 0.0111 −0.0334 0.0110
B. Occupied twice −0.0233 0.0120 −0.0409 0.0172 −0.0378 0.0163 −0.0389 0.0165
5. Male not working
A. Central Region 0.0156 0.0080 0.0144 0.0081 0.0154 0.0081 0.0148 0.0080
B. Occupied twice 0.0277 0.0122 0.0267 0.0122 0.0273 0.0122 0.0270 0.0122
C. Distance from Busan 0.0089 0.0039 0.0086 0.0040 0.0087 0.0039 0.0087 0.0039
D. Duration of occupation 0.0061 0.0033 0.0056 0.0034 0.0059 0.0034 0.0055 0.0033
6. Male unskilled
A. Central Region 0.0468 0.0192 0.0493 0.0190 0.0488 0.0188 0.0480 0.0189
B. Occupied twice 0.0714 0.0255 0.0737 0.0258 0.0728 0.0258 0.0716 0.0255
C. Distance from Busan 0.0246 0.0084 0.0250 0.0084 0.0262 0.0083 0.0247 0.0083
D. Duration of occupation 0.0252 0.0098 0.0266 0.0098 0.0278 0.0094 0.0259 0.0099
7. Male spouse schooling
A. Central Region −0.1906 0.1348 −0.3269 0.1704 −0.3267 0.1702 −0.2050 0.1521
B. Occupied twice −0.2385 0.1974 −0.3341 0.2654 −0.3637 0.2647 −0.2249 0.2229
Source: The 2% micro samples of the 1990 and 2000 population censuses.
Notes: (a) Summary of 112 regressions; (b) The sample is limited to those for whom the place of birth is reported; (c) For panels 6 and 7, the 2% sample of the 2000 census
is utilized, and the sample is limited to persons who were employed (panel 6) and the currently married (panel 7) at the time of census enumeration; (d) A cubic cohort
trend is included in each regression but omitted from the table; (e) Standard errors are adjusted for clustering at the levels of year and county of birth; and (f) The estimated
coefficient in bold number is statistically significant at 10% level.
(Panels 2 and 3), the results of difference-in-difference estimations do not change much when different specifications are applied. As far as the majority of male adult outcomes and a particular type of female educational outcome are concerned, it is likely that the 1951 cohort effect captures the long-term consequence of in-utero exposure to disruptions directly related to combat activities.

This study also examined whether the effect of interaction between the 1951 birth and the measure of war-caused damage is sensitive to the functional form of cohort trend by including birth year dummies (1945 is the omitted category) interacting with each measure of wartime stress in the regressions, as represented by the following equation:

$$y_i = \alpha + \sum_{t=1946}^{1959} \beta_{1,t} I_{i,t} + \sum_{t=1946}^{1959} \beta_{2,t} (I_{i,t} \times S_i) + \beta_3 S_i + \delta X_i + \varepsilon_i \tag{3}$$

Fig. 8. Estimated coefficients for birth year dummies interacted with measure of wartime stress (Central Region; Distance from Busan), by birth year (1944–1960): years of schooling for males.
Source: Regression results from the pooled sample of the 1990 and 2000 censuses.

Figs. 8 and 9 plot the coefficients for the interaction terms estimated from the regressions in which education is used as the dependent variable. The Central Region dummy and the distance from Busan are applied as measures of wartime stress (see Lee, 2013, Appendix Table A3 for regression results). The coefficient for the 1951 dummy for males, which interacts with the wartime stress measure, stands out as a clear downward outlier (Fig. 8). It is the only statistically significant coefficient at the 5% level (Lee, 2013, Appendix Table A3). A discontinuous drop in the magnitude of the coefficient between 1950 and 1951 is clearly observed in females, too. However, unlike males, the 1951 female birth cohort is less
clearly distinguished from other wartime birth cohorts born in 1952 or 1953.28

Fig. 9. Estimated coefficients for birth year dummies interacted with measure of wartime stress (Central Region; Distance from Busan), by birth year (1944–1960): years of schooling for females.
Source: Regression results from the pooled sample of the 1990 and 2000 censuses.

28 Quantile regressions are also conducted to determine how the effect of exposure to the war on the years of schooling differs across various quantiles of the schooling distribution. The distance from Busan is employed as the measure of wartime stress because the coefficient of the interaction of this variable and the 1951 birth dummy is significantly negative for both males and females (Table 3). Moreover, this measure is the most continuous among all variables on wartime stress used in the analysis. Variations in the years of schooling in Korea largely come from those among graduates of primary school (six years), middle school (nine years), high school (12 years), and college (16 years). Accordingly, the coefficient of the interaction term is estimated for the three quantiles corresponding to average middle school, high school, and college-dropouts as well as the three quantiles of schooling. The results reported in Appendix Table A4 of Lee (2013) suggest that the effect of in utero exposure to the war on male schooling is particularly strong at the quantiles corresponding to middle school and college dropouts. Female schooling is most strongly influenced by wartime stress along the borderline between high school and college graduates. These results are consistent with the outcomes of the difference-in-difference regressions (Table 3): for males, both the probabilities of primary school education and college entrance are significantly affected by the interaction term; for females, only the effect on the probability of college entrance is significant.

5.3. Health outcomes: disability and mortality

Figs. 10 and 11 present two health outcomes drawn from the 2005 census: the percentages of individuals with disability and individuals with limited daily activity. The charts suggest that the 1950 and 1953 male subjects exhibit upward deviations from the long-term trend in terms of disability and limitations in activity. Among females, conversely, no clear departure of the 1950 birth cohort from long-term trends in health outcomes appears, whereas the 1949 birth cohort shows particularly high rates of disability and limited activity. It also appears that females born in 1951 were somewhat more likely to be disabled than predicted by the smooth cohort trend.

Fig. 10. % Persons with disability in 2005 by birth cohort (1944–1960; series: male, female).
Fig. 11. % Males with limited activity in 2005 by birth cohort (1944–1960; series: male, female).
Source: The 2% sample of the 2005 census.

Regressions are conducted with a cubic cohort trend (model 1) and with a quadratic trend that is allowed to change across the pre-war (1945–1949), wartime (1950–1954), and post-war (1955–1959) cohorts (model 2). The results in Table 5 confirm that the 1950 male birth cohort is more likely to have a disability or limitations in daily activities than other birth cohorts. No similar statistically significant 1950 cohort effect is observed among females.

A crucial drawback of analyzing the 2005 census data is that the cohort effect cannot be estimated separately based on the subjects’ particular locality, thereby causing difficulty in assessing the actual effect of the war on fetal health. As in the case of socioeconomic outcomes, the 1950 cohort effect is considered stronger for those born in the Central Region.29

29 Their 2005 residence region is at best a highly inaccurate proxy of their region of birth given that geographic mobility since the Korean War was high. A comparison of the regions of birth and residence in 2000 based on the 2000 Census suggests that about 90% of individuals born in the Central Region continued to reside there in 2000. However, these original residents of the Central Region accounted for only 58% of the entire regional population because of the inflow of migrants from the Southern Region after the Korean War. Nevertheless, the graphs in Appendix Figures A7 and A8 given in Lee (2013) suggest that the negative 1950 cohort effect is likely to be greater for individuals born in the Central Region. For males, deviations of the 1950 cohort are more pronounced when the sample is limited to individuals living in the Central Region.

Adult mortality is another index of health. Fig. 12 shows the average number of deaths by year of birth per 100,000 persons between 1991 and 2010. The death rate for both males and females is visibly lower for the 1951 birth cohort. The males born in 1950 experience a higher risk of death than those in the neighboring cohorts. However, this 1950 cohort effect is not observed in females.

Regression analyses were conducted for the two sub-periods (1991–2000 and 2001–2010) and the entire two decades. The cubic cohort trend was controlled. Table 6 confirms the mortality patterns presented in Fig. 12. Two additional features can be drawn from the results. First, the adverse 1950 cohort effect among males becomes slightly weaker over time in terms of statistical
Table 5
Deviation of the 1950–51 birth cohorts’ health outcomes in 2005 from 1945 to 1959 trend.

                 Male                                      Female
                 (1) Disabled      (2) Activity limited    (3) Disabled      (4) Activity limited
A. Model 1
Born in 1950     0.0086 (0.0024)   0.0082 (0.0025)         0.0036 (0.0024)   0.0007 (0.0016)
Born in 1951     0.0037 (0.0021)   0.0020 (0.0022)         0.0064 (0.0020)   0.0011 (0.0014)
B. Model 2
Born in 1950     0.0116 (0.0035)   0.0182 (0.0022)         0.0004 (0.0046)   −0.0053 (0.0033)
Born in 1951     0.0038 (0.0021)   0.0065 (0.0012)         0.0057 (0.0021)   −0.0015 (0.0013)

Source: The 2% micro sample of the 2005 population census.
Notes: The sample includes 83,649 males and 85,994 females who were aged 46–60 in 2005. In model 1, polynomials of the year of birth (YOB, YOB², and YOB³) are included in the regressions, whereas quadratic cohort trend is allowed to change across the pre-war (1945–1949), war-time (1950–1954), and post-war (1955–1959) cohorts in model 2; (d) Standard errors are adjusted for clustering at the level of year of birth; and (e) The estimated coefficient in bold number is statistically significant at 10% level.
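The Model 1 specification described in the notes to Table 5 (a cubic polynomial in year of birth plus indicators for the 1950 and 1951 cohorts) can be sketched as follows. This is only a minimal illustration and not the estimation code behind the table: the function name `cohort_deviation`, the use of cohort-level rather than individual-level data, and the synthetic series are all assumptions, and the clustered standard errors are not computed.

```python
import numpy as np

def cohort_deviation(yob, outcome, flagged=(1950, 1951)):
    """Regress an outcome on a cubic trend in year of birth (YOB) plus
    one dummy per flagged cohort; return the dummy coefficients."""
    yob = np.asarray(yob, dtype=float)
    y = np.asarray(outcome, dtype=float)
    t = yob - yob.mean()  # centre YOB so the powers stay well conditioned
    columns = [np.ones_like(t), t, t**2, t**3]
    columns += [(yob == year).astype(float) for year in flagged]
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary least squares
    return {year: float(b) for year, b in zip(flagged, beta[4:])}

# Synthetic, noise-free cohort series (hypothetical values): a smooth cubic
# trend for the 1945-1959 cohorts with deviations added for 1950 and 1951.
years = np.arange(1945, 1960)
centred = years - years.mean()
rate = 0.05 + 0.002 * centred + 0.0001 * centred**2 - 0.00001 * centred**3
rate[years == 1950] += 0.0086
rate[years == 1951] += 0.0037

effects = cohort_deviation(years, rate)
print(effects)
```

Because the synthetic series contains no noise, the two dummy coefficients recover the deviations that were added by construction. With real micro data the same design matrix would be built at the individual level.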
Table 6
Deviation of the 1950–1951 birth cohorts’ death rate from 1945 to 1959 trend.

                 Male                                               Female
                 (1) Deaths per 100,000   (2) Log (Deaths per 100,000)   (3) Deaths per 100,000   (4) Log (Deaths per 100,000)
A. 1991–2010
Born in 1950     26.6213 (14.5082)        0.0354 (0.0133)                2.8231 (5.8860)          0.0136 (0.0159)
Born in 1951     −34.2432 (14.2522)       −0.0542 (0.0131)               −18.2210 (5.7822)        −0.0838 (0.0156)
B. 1991–2000
Born in 1950     23.8547 (13.4718)        0.0381 (0.0191)                8.8337 (5.7133)          0.0371 (0.0233)
Born in 1951     −25.8857 (13.2341)       −0.0501 (0.0187)               −9.8559 (5.6125)         −0.0623 (0.0229)
C. 2001–2010
Born in 1950     29.3878 (19.5679)        0.0327 (0.0180)                −3.1875 (7.8104)         −0.0099 (0.0215)
Born in 1951     −42.6007 (19.2226)       −0.0584 (0.0177)               −26.5862 (7.6726)        −0.1053 (0.0211)

Source: The year-by-cohort number of deaths is computed from the micro data on the Vital Statistics for death from 1991 to 2010. The year-by-cohort mid-year population is drawn from the Projected Population from 1991 to 2010.
Notes: (a) The polynomials of the year of birth (YOB, YOB², and YOB³) and the year dummy were included in the regressions, but omitted from the table; and (b) The estimated coefficient in bold number is statistically significant at 10% level.
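Coefficients from the log specification in Table 6 are in log points, so comparing them as percentage changes requires a small conversion. The sketch below applies the standard transformation exp(b) − 1 to the 1951 cohort coefficients in Panel C (−0.0584 for males and −0.1053 for females). The helper name `log_points_to_percent` is an assumption introduced for illustration.

```python
import math

def log_points_to_percent(coef):
    """Convert a coefficient on a log outcome into a percentage change."""
    return (math.exp(coef) - 1.0) * 100.0

# 1951 cohort, Panel C (2001-2010) of Table 6, log death-rate columns.
male_2001_2010 = log_points_to_percent(-0.0584)
female_2001_2010 = log_points_to_percent(-0.1053)

print(f"male:   {male_2001_2010:.1f}%")
print(f"female: {female_2001_2010:.1f}%")
```

Under this conversion the 1951 cohort's death rate in 2001–2010 lies roughly 6% below trend for males and roughly 10% below trend for females, which is the sense in which the advantage is larger among females even though the absolute deviation in deaths per 100,000 is larger among males.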
significance (Panels B and C of Columns 1 and 2). Second, the advantage of the 1951 birth cohort becomes stronger in the recent decade than in the 1990s in terms of magnitude and statistical significance. Finally, the 1951 cohort effect is more strongly revealed among females than among males, especially in old age (2001–2010), if the percentage change in the mortality rate is considered (Columns 2 and 4 of Table 6).
The results suggest that the 1950 male birth cohort is likely the major victim of the Korean War in terms of health outcomes in middle or old age. The effect of early exposure to the war on health seems weaker in females than in males. The particularly low mortality of the cohort born in 1951 could result from positive selections of the cohort who survived until age 40. This conjecture is consistent with the observation that the advantage of the 1951 cohort in mortality increases over time.

6. Discussion

6.1. Potential bias arising from population selections

Fig. 12. Average number of deaths per 100,000 by year of birth: 1991–2010 (series: Male, Female; horizontal axis: year of birth, 1946–1960).
Source: The Vital Statistics for death from 1991 to 2010.

Several types of population selections possibly influenced the composition of war-time birth cohorts, including (a) marriage and fertility behaviors; (b) deaths of pregnant women; (c) prenatal deaths; and (d) postnatal deaths. A natural concern in interpreting the results of this study is the pattern of selection of the Korean War birth cohorts that survived five decades after the conflict. If individuals who were in utero during the worst time of the war were positively selected in terms of characteristics correlated with later life health and economic outcomes, then maternal health can no longer be considered as explanation for the results given in the preceding section.
Marriage and fertility behaviors before June 1950 were not affected by the war. Thus, cohort subjects born prior to the spring of 1951 should not be selected in terms of conception. The probability of conception for cohort subjects born after March 1951 and the probability of prenatal death during conception by subjects born after June 1950 were deemed influenced by the war. No direct
evidence is available as to which subjects were more likely to be pregnant or have live birth. A possible source of selection in fertility behaviors is selective enlistment in the military (Brown, 2011). Documentations of military conscription during the Korean War do not show how men were selected into the armed forces. However, the process was obviously extremely chaotic, especially in the early stage of the war.30 Under these circumstances, men from particular socioeconomic backgrounds were unlikely to be recruited more than the others.
How fertility behaviors differed among socioeconomic classes during the war is likewise unclear. Brown (2011) suggested that higher-income US families concerned with providing higher quality education to their children during the First World War postponed reproduction until the war was over. However, given the fact that Korea circa 1950 was a poor agrarian society in which the fertility transition had not begun, women who were under a more favorable socioeconomic circumstance can be assumed to have better chances of conceiving than those in poorer conditions.31
The majority of civilians who died during the Korean War were victims of either combat activities or politically motivated murders (Chung, 2010). Considering the unexpected outbreak and highly chaotic progress of the war, visualizing that pregnant women from a particular background were more likely to be hit by bullets or bombs is difficult. Descriptive documents suggest that migrants from North Korea, as well as family members of South Korean policemen and soldiers, were more likely to escape to the southern region for fear of punishment by the North Korean forces (Chung, 2010). However, whether the people who failed to escape considerably differ in terms of socioeconomic status from refugees in the Busan perimeter is difficult to determine. Refugees had their own share of hardships. Although the possibility of selection cannot be precluded, survival during the Korean War should be more randomly determined compared with events such as famines or epidemics.

Fig. 13. Parents’ years of schooling by children’s year of birth (series: Father, Mother; horizontal axis: children’s year of birth, 1945–1958).
Source: The 2006 Korea Longitudinal Study of Aging.

Fig. 14. Population by sex and year of birth: 1960 and 2000 (series: 1960 Male, 2000 Male, 1960 Female, 2000 Female; horizontal axis: year of birth, 1948–1955).
Source: Korean Statistics Information System (KOSIS).
Direct evidence regarding the selections in conceptions and live birth is difficult to obtain, yet a comparison of socioeconomic characteristics of the parents across different birth cohorts would be useful in determining the magnitude and direction of potential selection bias. Fig. 13 provides the average years of schooling of the parents by the children’s year of birth, computed from the 2006 Korea Longitudinal Study of Aging.32 Beginning with the 1949 birth cohort, parental education increased across cohorts. The years of schooling for both the fathers and the mothers of children born in 1951 are higher than those predicted by long-term cohort trends. This suggests that the subjects of the 1951 birth cohort are positively selected in terms of family background.
Circumstantial evidence is available for positive selections in postnatal survival. Fig. 14 shows the size of each cohort born from 1948 to 1955, as enumerated by the censuses of 1960 and 2000. A larger share of the 1951 birth cohort died between 1960 and 2000 than of other birth cohorts. The size of the 1951 cohort was reduced by 15%, while the 1950 and 1952 cohorts decreased by 9% and 11%, respectively, during these four decades. The cause of the rapid decline in population of the 1951 birth cohort compared with other cohorts is unclear. While this may be attributed to the in utero circumstances faced by this cohort, further research is required to verify this claim. Regardless of the cause, cohort subjects with frailer health tended to die by 2000. Consequently, the 1951 birth cohort subjects who were still alive in 2000 are positively selected in terms of health or robustness compared with other cohorts with lower mortality rates since 1960. This conjecture is consistent with the finding that the old-age mortality of the 1951 birth cohort is significantly lower than that of the adjacent birth cohorts (Fig. 12 and Table 6).33
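The cohort-size comparison between the 1960 and 2000 censuses amounts to a simple attrition rate per birth cohort. The sketch below shows the calculation under loud assumptions: the census counts are hypothetical round numbers chosen only so that the rates match the 9%, 15%, and 11% reported in the text, and the helper `attrition_rate` is an illustrative name. The calculation treats the whole decline as deaths and therefore ignores migration and enumeration error.

```python
def attrition_rate(count_start, count_end):
    """Share of a cohort enumerated at the start that is missing at the end."""
    if count_start <= 0:
        raise ValueError("count_start must be positive")
    return 1.0 - count_end / count_start

# Hypothetical cohort sizes by year of birth (NOT the KOSIS figures).
census_1960 = {1950: 400_000, 1951: 300_000, 1952: 350_000}
census_2000 = {1950: 364_000, 1951: 255_000, 1952: 311_500}

rates = {
    year: attrition_rate(census_1960[year], census_2000[year])
    for year in census_1960
}
for year, rate in rates.items():
    print(year, f"{rate:.0%}")
```

Applying the same function separately to the male and female counts of each cohort would give the sex-specific rates discussed later in Section 6.2.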
In summary, available evidence suggests that the cohort offspring, whose fetal health was most seriously damaged by the Korean War, are positively selected in terms of conception and live birth. Therefore, the cohort effect estimated in Section 5 could be understated. At the least, the results of this study are unlikely driven by selection bias.

30 Immediately following the outbreak of the war, the South Korean military attempted to recruit more than 200,000 men on compulsory basis. However, these mobilization efforts were seriously hampered by rapid advancing of the North Korean force and by a paralyzed administrative infrastructure. A number of men were, in fact, conscripted by force on the street, bypassing formal enlistment procedures including medical examination, and being sent directly to training camps (Korean Military Manpower Administration 1985, pp. 264–286; Tucker, 2002, pp. 354–359).
31 Total fertility rate in South Korea was 5.99 in 1960, and the country is believed to have undergone its first fertility transition during the 1960s (Kim and Eun, 2002, pp. 84–87).
32 The sample size of each birth cohort ranges from 236 (1951) to 368 (1957). In the Census, parental education can be observed only when the person co-resides with his or her parents. Any evidence from the source cannot determine the direction of potential selections because parental survival and co-residence decisions are not random.
33 Being an orphan is another possible mechanism for the effects of in-utero exposure to the Korean War observed in this study. To see if this was the case, it is investigated whether the individuals born in either 1950 or 1951 were more likely to be without their parents in 1960 than the subjects of the neighboring birth cohorts, based on the one-percent micro sample of the 1960 census. More specifically, Eq. (1) in Section 3 is estimated where the probability of parental absence in the household is included as the dependent variable. The results suggest that the subjects of the 1950 and 1951 birth cohorts were no more likely to lose their parents by 1960 than that predicted by the cubic cohort trend.
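The argument in Section 6.1 that positive selection leads the estimated cohort effect to be understated can be illustrated with a toy calculation. Everything in the sketch is hypothetical: the latent "health scores", the 5-point damage to the exposed cohort, and the assumption that the frailest 15% of the exposed cohort and the frailest 10% of the reference cohort are lost before observation. It only shows the direction of the bias and is not an estimate for the Korean data.

```python
from statistics import mean

def survivors_mean(health_scores, frailest_share_removed):
    """Mean health among survivors after the frailest share has died."""
    ranked = sorted(health_scores)
    n_removed = round(len(ranked) * frailest_share_removed)
    return mean(ranked[n_removed:])

# Hypothetical latent health at birth for 100 persons per cohort.
reference = list(range(40, 140))        # unexposed cohort
exposed = [h - 5 for h in reference]    # exposed cohort: true damage of 5

true_gap = mean(exposed) - mean(reference)

# Stronger mortality selection in the exposed cohort (15% vs 10%).
observed_gap = survivors_mean(exposed, 0.15) - survivors_mean(reference, 0.10)

print("true gap:    ", true_gap)
print("observed gap:", observed_gap)
```

In this example the gap measured among survivors is smaller in absolute value than the true gap, because the heavier loss of frail members raises the average health of the exposed cohort's survivors.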
6.2. Differences by gender

The above results do not explain why the health outcomes of females were unaffected by the Korean War, whereas those of male cohort subjects born in 1950 were significantly worse than predicted by smooth cohort trends. Nor is it clear why the advantage of the 1951 birth cohort in mortality is more strongly revealed for females. Four possible explanations are examined, without implying that these are exhaustive or mutually exclusive.
First, the war-related shocks could have less severely impaired the fetal health of females than of males because of the biological differences between genders. In the prenatal period and early childhood, females are generally known to be more robust than males, as suggested by the lower rates of still births and infant mortality among females. However, previous studies examining other disastrous events such as famines and epidemics provide mixed results regarding gender differences in the relationship between in utero circumstances and adult health outcomes.34 A possible explanation for the distinct results of this study is that varied types of shocks to fetal health can affect the adult health outcomes of males and females in different ways.
Second, the gender differences in the health effects of the Korean War could have been generated by the gender bias in parental investments in children. This hypothesis is based on the following assumptions. First, the effects of the Korean War on health were mediated by different parental investments in children. Second, more parental investments were made in healthier children to increase investment return. Finally, parents made minor investments in the wellbeing of their daughters during the mid-20th century because of the strong preference for sons in Korea. Under these conditions, the disparity in in utero circumstances among girls would not produce any differences in their adult health because their parents did not spend much to improve their daughter’s health in the first place. However, testing this hypothesis is difficult because neither the mechanism by which the negative influences of the Korean War affected adult health nor the patterns of parental response to health shocks on children are known.
Given the evidence that parents prefer sons over daughters even in developed countries today (Dahl and Moretti, 2008; Lhila and Simon, 2008), such a bias is assumed to significantly influence parental investment in children in Korea circa 1950, although its magnitude is difficult to determine. Evidence pertaining to parental response to health shocks is even harder to obtain. A study by Conti et al. (2011), based on a sample of twins in China, suggests that parental investments in children’s health are compensatory, whereas their investments in children’s cognitive development are reinforcing. Although Korea and China share several common cultural features, similarities in the patterns of parental investments in children in Korea six decades ago and those of China today are unclear, considering these countries’ institutional and socioeconomic differences.
Third, only a small fraction of females born during the Korean War participated in the labor market and virtually all of them eventually married. Thus, the husband’s health and socioeconomic status possibly played a more important role in determining the health of older women. An underlying assumption is that the effect of war-driven negative health shocks on adult health was mediated by different economic statuses. Using a sample from the 2005 Census, this hypothesis is tested by selecting married women whose families were headed by the husband. The health outcomes (the probability of having any disability) of these women are compared in terms of their husbands’ birth years and their own birth years (Lee, 2013, Appendix Fig. A9).
The regressions on the health outcomes of married women employed smooth cohort trends and dummy variables of birth by the couples during wartime (Lee, 2013, Appendix Table A5). Women married to subjects from the 1950 male birth cohorts were more likely to have disability or limitations in daily activities in 2005 than predicted by smooth cohort trends, although the statistical significance of this result is sensitive to change in specification. In contrast, the married women’s own birth years had no significant effect on their health at old age. This finding implies that the previously cited gender differences in the effects of the Korean War on health are at least partly explained by the strong influence on female health by the husband’s economic and/or health status.35
Finally, a stronger population selection among females could have produced the gender differences in the effects of the Korean War on health outcomes. This hypothesis may partly explain the more pronounced lower mortality in females than in males from the 1951 birth cohort compared with that in the neighboring cohorts (Table 6). As noted above, the size of the 1951 birth cohort decreased more rapidly than that of neighboring cohorts from 1960 to 2000 (Fig. 14). Normally, the male-to-female ratio of a cohort is about 1.05 at the time of birth, which continuously declines with age because the rate of male deaths exceeds that of females. Fig. 14 shows that for the majority of birth cohorts, the male-to-female ratio substantially dropped between 1960 and 2000. However, the gender ratio for the 1951 cohort remained remarkably stable during the four decades. The estimated mortality rate between 1960 and 2000 for the 1951 cohort is 14.7% for females and 15.1% for males. These rates suggest that compared with the population decline in other birth cohorts, population selection in the 1951 cohort is much more pronounced among females than among males. As noted by previous studies (Stanner et al., 1997; Bozzoli et al., 2009), this stronger population selection should leave more robust females alive within the 1951 cohort. On the contrary, the 1950 birth cohort shows that females were more likely to survive than males by 2000 if they were alive in 1960.36

34 For example, the adverse effect of prenatal exposure to the 1918 pandemic influenza on adult disability in the United States is more strongly revealed for males than for females (Almond and Mazumder, 2005), whereas female disability is more heavily affected by early-life exposure to the 1959–1961 Chinese famine (Mu and Zhang, 2008). No clear consensus exists on gender difference in the association between birth weight and systolic blood pressure in later life (Lawlor et al., 2002).
35 The last two regressions reported in Appendix Table A5 (Columns 3 and 4) given in Lee (2013) include a dummy variable indicating whether the husband has any disability. The results suggest that the effect of the husband’s birth year on the wife’s disability can be mediated by the husband’s poor health. The husband’s disability strongly increases the likelihood of disability in the wife. Once the husband’s disability is controlled, the effect of being married to a man born in 1950 on the probability of having disability diminishes in magnitude by a third and becomes statistically insignificant. The wives of disabled men are likely to become handicapped themselves in the course of providing care. Alternatively, economic hardship caused by the breadwinner’s incapacity can adversely affect the health of the wife.
36 The question of gender differences in mortality during childhood (prior to 1960) is also considered. Male-to-female ratios of the 1950 and 1951 birth cohorts in 1960 (1.065 and 1.064, respectively) are close to the normal gender ratio at birth (1.05). Therefore, it is unlikely that considerably more girls than boys were killed among the 1950 and 1951 cohorts, resulting in a stronger population selection among females by 1960.

6.3. Differences by birth cohorts

Another puzzling finding is that different aspects of human capital for the 1950 and 1951 cohorts were significantly damaged by their in utero exposure to the Korean War. Considering the Census outcomes, the 1950 cohort mostly exhibits health-related negative consequences from the war, whereas the 1951
cohort only shows lower socioeconomic outcomes than neighboring cohorts.
This result can be attributed to the difference in pregnancy stages during which the two birth cohorts were exposed to negative health shocks. Individuals born at the end of 1950 had already spent four months in utero by the time the war broke out in late June; thus, the majority of this cohort survived the first half of pregnancy before the war. In particular, the 1950 cohort in utero during the first three months of the Korean War spent only the final trimester of pregnancy during what is considered as the most chaotic period of the war. Conversely, the majority of the 1951 cohort subjects were exposed to the war during the early stages of pregnancy.
The effects of prenatal exposure to negative health shocks differ according to the stage of pregnancy at which the shocks were received (Barker, 1994). The human brain develops between the 8th and 25th weeks of pregnancy. Thus, the 1951 birth cohort was more likely to be exposed to war-related disruptions as a fetus during these weeks than the 1950 cohort. This study considers socioeconomic outcomes, which are closely related to cognitive skills including educational attainment and quality of occupation. Therefore, with all factors considered, the education and labor market outcomes of the 1951 cohort were significantly worse than predicted by long-term trends because in utero development of the subjects’ brains was adversely affected by the war. In contrast, the subjects of the 1950 birth cohort were exposed to the war in utero after their brains were fully formed. This conjecture is also consistent with the result that the probability of entering college is more strongly affected by in utero exposure to the war than the probability of attending at least primary school (see Panels 1-B, 1-C, 2-B, and 2-C of Table 3) because college demands cognitive ability more than middle or high school.

7. Summary and Implications

The Korean War (June 1950–July 1953) likely damaged individual maternal and fetal health, especially in the first nine months of the war following the surprise invasion by North Korea. The frontline dramatically moved back and forth across the region, forcing civilians to suffer from severe war-related disruptions. Motivated by the growing literature on the fetal origins hypothesis, this article investigates how in utero exposure to such arduous wartime experiences (refugee experience, occupation, hunger, and combat) affected socioeconomic and health outcomes at older ages.
The evidence offered in this paper supports the fetal origins hypothesis. A number of socioeconomic outcomes from the 1990 and 2000 censuses were significantly lower for the 1951 birth cohort. For instance, males and females of this cohort received significantly less schooling than predicted by the smooth cohort trends. Controlling for cubic cohort trends in these measures shows that males born in 1951 are less likely to have professional employment, as well as more likely to be a manual laborer around the age of 50 (in 2000) than other male birth cohorts.
The disadvantages in socioeconomic outcomes experienced by the 1951 birth cohort are particularly pronounced for those born in the Central Region of South Korea. This result can be attributed to the extensive collateral damage experienced by the residents of this region when the war broke out: a larger percentage of residents endured North Korean occupation; refugees from this region traveled longer distances; and the region fell under communist control twice. By exploiting these geographic variations in wartime disruptions, a difference-in-difference model was estimated using several location-specific measures of wartime stress, including the distance from Busan and the duration of occupation by North Korean forces. Regression results suggest that the magnitude of the negative cohort effect is significantly larger for individuals more seriously traumatized by the war.
Likewise, cohorts who were in utero at the early stages of the war show poorer health outcomes at older ages. The 1950 male birth cohorts were more likely to be disabled and suffer from limitations in daily activities in 2005. Males born in 1950 experienced a significantly higher mortality rate from 1991 to 2010. These negative effects of in utero exposure to the Korean War on health are likely stronger for individuals born in the Central Region, as in the case of socioeconomic outcomes. The estimated negative effects of the war are also likely underestimated when potential selections in pregnancy, birth, and survival are considered.
The effects of prenatal exposure to the Korean War differ by gender. The health outcomes of females were unaffected by the war, whereas those of male cohorts in utero during the early stages of the war were significantly worse than predicted by smooth cohort trends. This gender difference can be explained in part by the influence of the husband’s economic and health status on female health in Korea. Women married to the 1950 male subjects were more likely to have disability or limitation in daily activities in 2005 than predicted by smooth cohort trends. In contrast, these married women’s own birth years had no significant effect on their health at old age. Stronger population selection among females born in 1951 can be a factor; the mortality rate from 1960 to 2000 was particularly high for the 1951 female cohort, making the female survivors among the 1951 cohort more robust on average than those in neighboring cohorts. Other possible explanations are (a) biological differences in health by gender and (b) gender bias in parental investment in children’s health in Korea, a factor that cannot be directly tested.
The type of human capital significantly impaired by war-time in utero influences differs between the 1950 and 1951 birth cohorts. The 1950 cohort exhibits worse health outcomes while the 1951 cohort shows lower educational attainment and labor market performance. A possible explanation for the difference by year of birth is that the two cohorts were exposed to negative health shocks during different stages of pregnancy. For example, the majority of the 1951 cohort subjects experienced the worst time of the war during the first half of pregnancy, a critical period for the development of the human brain. On the other hand, most of the 1950 birth cohort subjects had already passed the 25th week in pregnancy with complete prenatal brain development when the war broke out. The weaker effect of the war on the health of the 1951 cohort, as noted above, may be attributed to the considerably stronger population selection experienced by this cohort.
The results of this paper provide new evidence pertaining to the fetal origins hypothesis. A rare account of the long-run consequences of direct exposure to a war in utero is offered. Thus far, previous studies have focused on how famines, infections, and exposure to toxic substances in utero affect adult outcomes. Several observed impacts of the Korean War may partly be explained by the effects of malnutrition and infection while in utero. However, the extent of malnutrition during the Korean War should be considerably milder than that experienced during the Dutch or Chinese famines. Despite the vivid memories of hunger among many survivors, finding a case of civilian death caused by starvation proved difficult. In addition, there were no records found on major epidemics of infectious diseases during the war. Although further research is required to determine the main mechanisms by which the war damaged fetal health, the following wartime experiences may have played important roles: suffering from atrocities under the North Korean occupation, physical and mental
hardships faced while taking refuge, and direct exposure to bloody combat.37
This study suggests that military conflict can lead to long-term negative health and economic consequences by adversely affecting maternal and fetal health.38 This finding is more evident in civilians directly exposed to combat, as in the case of the Korean War, which implies that the conventionally estimated costs of historical wars may be understated. In general, only physical damages and direct human losses such as wartime deaths and wounds are counted in assessing the economic costs of a war (Goldin and Lewis, 1975). Several studies consider the longer-term effects on the impaired health of individuals, especially those who fought in the war (Lee, 2005, 2008; Edwards, 2010). If the long-term adverse effect on the health and socioeconomic outcomes of the next generation is also considered, the actual total costs of modern military conflict, which tend to expose civilians to various disruptions, should be substantially larger than suggested by conventional estimates.
The accurate estimation of the long-term economic costs of in utero exposure to the Korean War is beyond the scope of this study. However, the following exercise enables us to assess its rough magnitude. The rate of return to schooling in Korea in 1980 is estimated at 11.1% (Psacharopoulos, 1989). The result of this study (Panel 2-A of Table 2) suggests that prenatal exposure to the war diminished the schooling of the male Central Region natives born in 1951 by 0.43 years. These two estimates suggest that the 1951 male birth cohorts born in the Central Region lost 4.8% of their lifetime earnings (11.1% times 0.43) because of the negative effect of the war on their education. Similarly, the subjects of the 1950 male birth cohort and of the 1951 female birth cohort born in the Central Region lost 1.9% and 3.2% of their lifetime earnings, respectively, for the same reason. If these figures are applied to roughly estimate the lifetime earnings of these birth cohorts, the aggregate earning losses resulting from lowered education are estimated at about 2.0 trillion won in 2010 (about 1.8 billion US dollars in 2010 value).39 By contrast, the entire physical cost of the Korean War is estimated at 11.8 trillion won in 2010 (Lee, 2001). The influences of the war on health and labor-market outcomes are not considered in this exercise. Thus, the long-term costs of prenatal exposure to war-caused disruptions may be substantial.

37 This paper proposes implications for future changes in health outcomes at old age in Korea. Maternal and fetal health in South Korea likely improved significantly after recovering from destruction caused by the war. Thus, assuming all other factors are equal, the post-war generations are likely to be healthier at old age than previous cohorts who experienced the Japanese colonial occupation, the Second World War, or the Korean War in utero.

References

Aizer, A., Stroud, L., Buka, S., 2009. Maternal stress and child well-being: evidence from siblings. In: Brown University Working Paper.
Almond, D., 2006. Is the 1918 influenza pandemic over? Long-term effects of in utero influenza exposure in the post-1940 population. Journal of Political Economy 114 (4), 672–712.
Almond, D., Chay, K.Y., Lee, D.S., 2005. The cost of low birth weight. Quarterly Journal of Economics 120 (3), 1031–1084.
Almond, D., Currie, J., 2010. Human capital development before age five. In: NBER Working Paper No. 15827.
Almond, D., Currie, J., 2011. Killing me softly: the fetal origins hypothesis. Journal of Economic Perspectives 25 (3), 153–172.
Almond, D., Edlund, L., Li, H., Zhang, J., 2010. Long-term effects of early-life development: evidence from the 1959 to 1961 Chinese famine. In: Ito, T., Rose, A. (Eds.), The Economic Consequences of Demographic Change in East Asia. University of Chicago Press, Chicago, pp. 321–350.
Almond, D., Edlund, L., Palme, M., 2009. Chernobyl’s subclinical legacy: prenatal exposure to radioactive fallout and school outcomes in Sweden. Quarterly Journal of Economics 124 (4), 1729–1772.
Almond, D., Mazumder, B., 2005. The 1918 influenza pandemic and subsequent health outcomes: an analysis of SIPP data. American Economic Review Papers and Proceedings 95 (2), 258–262.
Barker, D.J.P., 1992. Fetal and Infant Origins of Adult Disease. British Medical Journal Publishing Group, London.
Barker, D.J.P., 1994. Mothers, Babies, and Disease in Later Life. British Medical Journal Publishing Group, London.
Beebe, G.W., 1975. Follow-up studies of World War II and Korean War Prisoners, II: morbidity, disability, and maladjustments. American Journal of Epidemiology 101, 400–422.
Behrman, J.R., Rosenzweig, M.R., 2004. Returns to birth weight. Review of Economics and Statistics 86 (2), 586–601.
Black, S.E., Devereux, P.J., Salvanes, K.G., 2007. From the cradle to the labor market? The effect of birth weight on adult outcomes. Quarterly Journal of Economics 122 (1), 409–439.
Brakman, S., Garretsen, H., Schramm, M., 2004. The strategic bombing of cities in Germany in World War II and its impact on city growth. Journal of Economic Geography 4 (1), 1–18.
Bleker, O.P., Roseboom, T.J., Ravelli, A.C.J., van Montfrans, G.A., Osmond, C., Barker, D.J.P., 2005. Cardiovascular disease in survivors of the Dutch famine. In: Hornstra, G., Uauy, R., Yang, X. (Eds.), The Impact of Maternal Malnutrition on the Offspring: Nestle Nutrition Workshop Series Pediatric Program, vol. 55. Karger, Basel, Switzerland, pp. 183–195.
Bozzoli, C., Deaton, A., Quintana-Domeque, C., 2009. Adult height and childhood disease. Demography 46 (4), 647–669.
Brown, R., 2011. The 1918 U.S. influenza pandemic as a natural experiment. In: Duke University Working Paper (revisited).
Camacho, A., 2008. Stress and birth weight: evidence from terrorist attacks. American Economic Review Papers and Proceedings 98 (2), 511–515.
Chay, K.Y., Greenstone, M., 2003. The impact of air pollution on infant mortality: evidence from the geographic variation in pollution shocks induced by recession.
38
This result is not necessarily inconsistent with the finding that intense bombing Quarterly Journal of Economics 118 (3), 1121–1167.
during a war has no long-term effect on the local population or economic growth Chen, Y., Zhou, L.A., 2007. The long-term health and economic consequences of
(Davis and Weinstein, 2002; Brakman et al., 2004; Miguel and Roland, 2008). First, 1959–1961 famine in China. Journal of Health Economics 26 (4), 659–681.
the evidence drawn from variations across localities within a country does not pro- Chung, B.J., 2010. The types and characteristics of the civilian casualties in South
Korea during the Korean War. In: The Group for Korean Historical Studies (Ed.),
vide a counterfactual outcome that would have been achieved by the nation as a
The Korean War in Historical Perspectives. Humanist, Seoul, pp. 471–504 (in
whole had it not been for the war. Second, even if the human capital development
Korean).
of a particular birth cohort born in a particular region is negatively affected by a war,
Chun, S.K., 2006. The Korean War that I experienced. In: Park, D. (Ed.), The One
it may not significantly affect the overall economic growth if individuals from neigh- Hundred Scenes of the Korean War. Nunbit, Seoul, pp. 153–160 (in Korean).
boring birth cohorts are close substitutes in the labor market and if labor mobility Chun, W.Y., 2011. The Birth of Modern Men. Isoon, Seoul (in Korean).
across regions is high, which is likely to be the case in post-war South Korea. Conti, G., Heckman, J., Yi, J., Zhang, J., 2011. Early health shocks, prenatal responses,
39
The average monthly wages of males and females are 588,320 won and 323,691 and child outcomes. In: Paper Presented at the 2011 NBER Cohort Studies Meet-
won, respectively, in 1990 (drawn from the website of the Korea Statistical Office: ing, April, 2011.
www.kosis.kr). Assuming that these amounts are the average lifetime monthly Costa, D.L., Kahn, M.E., 2008. Heroes and Cowards. Princeton University Press,
wages of the 1950 and 1951 birth cohorts and that the average length of work Princeton, NJ.
life of these cohorts is 40 years, the lifetime earnings of these cohorts are estimated Currie, J., Hyson, R., 1999. Is the impact of shock cushioned by socioeconomic status?
at 282,393,000 won for males and 155,371.680 won for females. The labor force The case of low birth weight. American Economic Review 89 (2), 245–250.
participation rates of males and females in 1990 (74% and 47% respectively; drawn Currie, J., Moretti, E., 2007. Biology as destiny? Short- and long-run determinants of
intergenerational transmission of birth weight. Journal of Labor Economics 25
from www.kosis.kr) are assumed as the average participation rates of these cohorts
(2), 231–264.
during the study period. The numbers of the 1950 and 1951 male birth cohorts and
Currie, J., Rossin-Slater, M., 2012. Weathering the storm: hurricanes and birth out-
the 1951 female birth cohort who were born in the Central Region and who were
comes. In: NBER Working Paper No. 18070.
alive in 1990 are 96,900, 91,300, and 87,500, respectively (computed from the micro Dahl, G., Meretti, E., 2008. The demand for sons. Review of Economic Studies 75,
sample of the 1990 census, with the data drawn from www.kosis.kr). The aggregate 1085–1120.
lifetime earnings made by each of the three groups is computed by multiplying the Davis, D., Weinstein, D., 2002. Bones, bombs, and break points: the geography of
individual lifetime earnings, the average labor force participation rate, and the total economic activity. American Economic Review 92 (5), 1269–1289.
number of individuals. The amount of the Korean currency was converted into the Edwards, R.D., 2010. U.S. war costs: two parts temporary, one part permanent. In:
2010 value based on the GDP deflator reported in Kim (2012). NBER Working Paper No. 16108.
Eccleston, M., 2011. In utero exposure to maternal stress: effects of 9/11 on birth and early schooling outcomes in New York City. In: Harvard University Working Paper.
Geber, W.F., 1970. Cardiovascular and teratogenic effects of chronic intermittent noise stress. In: Welch, B.L., Welch, A.S. (Eds.), Physiological Effects of Noise. Plenum Press, New York, pp. 85–90.
Goldin, C.D., Lewis, F.D., 1975. The economic cost of the American Civil War: estimates and implications. Journal of Economic History 35 (2), 299–326.
Halberstam, D., 2007. The Coldest Winter: America and the Korean War. Hyperion, New York.
Huttunen, M.O., Niskanen, P., 1978. Prenatal loss of father and psychiatric disorders. Archives of General Psychiatry 35 (4), 429–431.
Ichino, A., Winter-Ebmer, R., 2004. The long-run educational cost of World War II. Journal of Labor Economics 22 (1), 57–87.
Jones, F.N., Tauscher, J.T., 1978. Residence under an airport landing pattern as a factor in teratism. Archives of Environmental Health 33 (1), 10–12.
Kelly, E., 2009. The scourge of Asian flu: in utero exposure to pandemic influenza and the development of a cohort of British children. In: Institute for Fiscal Studies Working Paper 09/17. University College London.
Kesternich, I., Siflinger, B., Smith, J.P., Winter, J.K., 2011. The effects of World War II on economic and health outcomes across Europe. In: Working Paper.
Kim, D.-S.S., Eun, P.K., 2002. Population of Korea, vol. 1. Korea National Statistical Office, Seoul.
Kim, N.N. (Ed.), 2012. National Accounts of Korea 1911–2010. Seoul National University Press, Seoul.
Kim, T.W., 2008. A study of the aerial bombing by the United States air force during the Korean War. Department of History, Seoul National University (Ph. D. Dissertation; in Korean).
Kim, W.I., 2006. Three months in Seoul under the North Korean occupation. In: Park, D. (Ed.), The One Hundred Scenes of the Korean War. Nunbit, Seoul, pp. 33–43 (in Korean).
Knipschild, P., Meijer, H., Sallé, H., 1981. Aircraft noise and birth weight. International Archives of Occupational and Environmental Health 48, 131–136.
Korean Office Military Manpower, 1985. History of Military Manpower Administration, vol. 1. Office of Military Manpower, Seoul.
Lauderdale, D.S., 2006. Birth outcomes for Arabic-named women in California before and after September 11. Demography 43 (1), 185–201.
Lawlor, D.A., Shah, E., Davey Smith, G., 2002. Is there a sex difference in the association between birth weight and systolic blood pressure in later life? Findings from a meta-regression analysis. American Journal of Epidemiology 156 (12), 1100–1104.
Lee, C., 2005. Wealth accumulation and the health of Union Army Veterans, 1860–1870. Journal of Economic History 65 (2), 352–385.
Lee, C., 2008. Health, information, and migration: post-service geographic mobility of Union Army Veterans, 1860–1880. Journal of Economic History 65, 352–385.
Lee, C., 2013. In-utero exposure to the Korean War and its long-term effects on socioeconomic and health outcomes. In: Working Paper. Seoul National University.
Lee, J.W., 2001. The impact of the Korean War on the Korean economy. International Journal of Korean Studies 5 (1), 97–118.
Lhila, A., Simon, K.I., 2008. Prenatal health investment decisions: does the child sex matter? Demography 45 (4), 885–905.
Luo, Z., Mu, R., Zhang, X., 2006. Famine and overweight in China. Review of Agricultural Economics 28 (3), 296–304.
Meijer, A., 2007. Child psychiatric sequelae of maternal war stress. Acta Psychiatrica Scandinavica 72 (6), 505–511.
Meng, X., Qian, N., 2009. The long-term consequences of famine on survivors: evidence from a unique natural experiment using China’s great famine. In: NBER Working Paper No. 14917.
Miguel, E., Roland, G., 2008. The long-run impact of bombing Vietnam. Journal of Development Economics 96 (1), 1–15.
Mu, R., Zhang, X., 2008. Gender difference in the long-term impact of famine. In: International Food Policy Research Institute Discussion Paper 00760.
Neelsen, S., Stratmann, T., 2011. Effects of prenatal and early life malnutrition: evidence from the Greek famine. Journal of Health Economics 30, 479–488.
Nefzger, D.M., 1970. Follow-up studies of World War II and Korean War prisoners, I. Study plan and mortality findings. American Journal of Epidemiology 91, 123–138.
Neugebauer, R., Wijbrand Hoek, H., Susser, E., 1999. Prenatal exposure to wartime famine and development of antisocial personality disorder in adulthood. Journal of the American Medical Association 281 (5), 455–462.
Os, J.V., Selten, J.-P., 1998. Prenatal exposure to maternal stress and subsequent schizophrenia. The May 1940 invasion of The Netherlands. British Journal of Psychiatry 172, 324–326.
Psacharopoulos, G., 1989. Time trends of the returns to education: cross-country evidence. Economics of Education Review 8 (3), 225–231.
Robson, D., Welch, E., Beeching, N.J., Gill, G.V., 2009. Consequences of captivity: health effects of far east imprisonment in World War II. Quarterly Journal of Medicine 102 (2), 87–96.
Rofe, Y., Goldberg, J., 2011. Prolonged exposure to a war environment and its effects on the blood pressure of pregnant women. British Journal of Medical Psychology 56 (4), 305–311.
Roseboom, T.J., Meulen, J.H.P., Osmond, C., Barker, D.J.P., Ravelli, A.C.J., Bleker, O.P., 2001. Effects of prenatal exposure to the Dutch famine on adult disease in later life: an overview. Twin Research 4 (5), 293–298.
Selten, J., Cantor-Graae, E., Nahon, D., Levav, I., Aleman, A., Kahn, R., 2003. No relationship between risk of schizophrenia and prenatal exposure to stress during the six-day war or Yom Kippur War in Israel. Schizophrenia Research 63, 131–135.
Sutker, P., Winstead, D., Galina, Z., Allan, A., 1991. Cognitive deficits and psychopathology among former prisoners of war and combat veterans of the Korean conflict. American Journal of Psychiatry 148, 67–72.
St. Clair, D., Xu, M., Wang, P., Yu, Y., Fang, Y., Zhang, F., Zheng, X., Gu, N., Feng, G., Sham, P., He, L., 2005. Rates of adult schizophrenia following prenatal exposure to the Chinese famines of 1959–1961. Journal of the American Medical Association 294 (5), 557–562.
Stanner, S.A., Bulmer, K., Andrès, C., Lantseva, O.E., Borodina, V., Poteen, V., Yudkin, J.S., 1997. Does malnutrition in utero determine diabetes and coronary heart disease in adulthood. British Medical Journal 315 (7119), 1342–1348.
Tucker, S.C. (Ed.), 2002. Encyclopedia of the Korean War: a political, social, and military history. Checkmark Books, New York.
Weinstock, M., 2001. Alterations induced by gestational stress in brain morphology and behaviour of the offspring. Progress in Neurobiology 65, 427–451.
Yang, Y.C., 2010. The conditions of refugees and aid activities in Daegu area during the Korean War. In: The Group for Korean Historical Studies (Ed.), The Korean War in Historical Perspectives. Humanist, Seoul, pp. 569–594 (in Korean).
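
The back-of-envelope earnings calculation described in the text and in footnote 39 can be sketched as follows. This is only an illustration under the stated assumptions: the wage, participation and cohort figures are the ones quoted in the footnote, while the function and variable names are hypothetical, and the final conversion to 2010 values with the GDP deflator from Kim (2012) is not reproduced because the deflator is not given here.

```python
# Illustrative sketch only; function and variable names are hypothetical.
MONTHS_PER_YEAR = 12
WORK_LIFE_YEARS = 40          # assumed length of working life (footnote 39)
RETURN_TO_SCHOOLING = 0.111   # rate of return to schooling, Korea 1980 (Psacharopoulos, 1989)


def lifetime_earnings(monthly_wage_won: int) -> int:
    """Lifetime earnings if the 1990 monthly wage is earned for the whole working life."""
    return monthly_wage_won * MONTHS_PER_YEAR * WORK_LIFE_YEARS


def earnings_loss_share(schooling_loss_years: float) -> float:
    """Share of lifetime earnings lost: return to schooling times years of schooling lost."""
    return RETURN_TO_SCHOOLING * schooling_loss_years


def aggregate_lifetime_earnings(monthly_wage_won: int,
                                participation_rate: float,
                                cohort_size: int) -> float:
    """Individual lifetime earnings x labor force participation rate x number of individuals."""
    return lifetime_earnings(monthly_wage_won) * participation_rate * cohort_size


# Figures quoted in footnote 39 (1990 won, 1990 participation rates, 1990 survivors).
MALE_WAGE, FEMALE_WAGE = 588_320, 323_691
MALE_LFP, FEMALE_LFP = 0.74, 0.47

groups = {
    "male_1950": aggregate_lifetime_earnings(MALE_WAGE, MALE_LFP, 96_900),
    "male_1951": aggregate_lifetime_earnings(MALE_WAGE, MALE_LFP, 91_300),
    "female_1951": aggregate_lifetime_earnings(FEMALE_WAGE, FEMALE_LFP, 87_500),
}

# Loss share for the 1951 male cohort, from the 0.43-year schooling loss.
loss_share_male_1951 = earnings_loss_share(0.43)
```

The aggregate figures in `groups` are in 1990 won, so they are not directly comparable with the 2010-value totals quoted in the text.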

Nurses’ labour supply elasticities: The importance of accounting for extensive margins☆

Barbara Hanel a,∗, Guyonne Kalb a,b, Anthony Scott a
a Melbourne Institute of Applied Economic and Social Research, The University of Melbourne, Australia
b Institute for the Study of Labor (IZA), Germany

Article history:
Received 17 May 2012
Received in revised form 21 October 2013

JEL classification: J22, J24, I10, I11

Keywords: Nursing; Labour supply; Shift work; Wage elasticities

We estimate a multi-sector model of nursing qualification holders’ labour supply in different occupations. A structural approach allows us to model the labour force participation decision, the occupational and shift-type choice, and the decision about hours worked as a joint outcome following from maximising a utility function. Disutility from work is allowed to vary by occupation and also by shift type in the utility function. Our results suggest that average wage elasticities might be higher than previous research has found. This is mainly due to the effect of wages on the decision to enter or exit the profession, which was not included in the previous literature, rather than from its effect on increased working hours for those who already work in the profession.
© 2013 Elsevier B.V. All rights reserved.
☆ Funding from ARC linkage grant LP0669209 and the Victorian Department of Health is gratefully acknowledged. The paper uses the unconfidentialised unit record file from the Department of Families, Housing, Community Services and Indigenous Affairs’ (FaHCSIA) Household, Income and Labour Dynamics in Australia (HILDA) survey managed by the Melbourne Institute of Applied Economic and Social Research. We would like to thank Richard Blundell, Deborah Cobb-Clark and Denise Doiron for their comments and suggestions on an earlier draft. We are also grateful for feedback and suggestions by participants at the 3rd Australasian Workshop on Econometrics & Health Economics, 2011 HILDA Conference, the Microeconometrics Workshop at the University of Melbourne, the School of Economics Seminar at the University of Sydney, and the IAB Colloquium in Nuremberg. The views expressed in this paper and any remaining errors are those of the authors solely.
∗ Corresponding author at: Melbourne Institute of Applied Economic and Social Research, Faculty of Business and Economics Building, Level 5, 111 Barry Street, University of Melbourne 3010, Australia. Tel.: +61 3 9035 4565; fax: +61 3 8344 2111.
E-mail address: bhanel@unimelb.edu.au (B. Hanel).

1. Introduction

Maintaining an adequate supply of nurses is a key ongoing policy issue in many countries. The World Health Organisation (2006) estimated global shortages of health workers of 4.3 million. Shortages are important in the light of evidence, largely from the US, of the impact of the level of nurse staffing on patients’ health outcomes (Lankshear et al., 2005; Kane et al., 2007; Cook et al., 2010). It is clear that shortages are common across many low and high income countries, which means that research on the determinants of nursing labour supply remains a priority. Even where overall shortages are less of an issue, shortages in specific sectors, disease areas, and geographic areas highlight the role of maldistribution. Although the budgetary pressures from the global financial crisis may have reduced demand for nurses employed in the public sector in some countries, the ‘need’ for health care, and therefore for nurses, continues to grow as the population ages and as developing countries continue to struggle to provide basic health services.

Shortages are a long-term issue since labour markets, especially for nurses who work in the public sector, are often slow to adjust to changes in demand and supply conditions partly due to centralised wage bargaining. At the same time, the nursing workforce is also ageing and a high number of ‘baby boomer’ nurses are leaving the profession as they retire. This trend is aggravated by relatively low numbers of young graduates who enter the profession.

The stock of trained nurses who could in principle work in the profession but are currently not doing so is relatively high. For example in Australia, around 17% of registered or enrolled nurses were not working as nurses in 2005 (Australian Institute of Health and Welfare (AIHW), 2008). Raising wages in order to attract them back to the profession may be one possible option to tackle the
B. Hanel et al. / Journal of Health Economics 33 (2014) 94–112 95
problem, but increasing wage offers will be an optimal strategy for the demand side of the market (i.e. public and private health care providers) only if the response of the supply side is strong enough. Otherwise, the cost of inducing any substantial change in labour supply will be too high and have an even greater dampening impact on demand. Most previous research suggests that the labour supply elasticity with respect to wage is fairly low among nurses at around 0.3 (see Antonazzo et al., 2003; Shields, 2004 for a review). However, a major drawback of these studies is that they do not model the decision to exit or enter the nursing profession versus other occupations. The elasticity of labour supply in nursing at the extensive margin might differ considerably from that at the intensive margin. This is of particular importance as the nursing workforce primarily consists of women. Not only is their general labour force attachment lower than that of men, but their career paths, moving out of and back into the labour force around childbirth, also lower the opportunity costs of changes in occupation and thus make career changes more likely. In order to obtain reliable estimates of the total labour supply elasticity in nursing, it is thus important to model not only individuals’ labour force participation decisions – which has been done in previous research using a broad range of methods – but also their occupational choices.

This paper seeks to investigate the viability of wage policies to increase nurses’ labour supply through a multi-sector labour supply model. We extend the literature by considering a sample of nursing qualification holders, rather than those who work as registered nurses only. This means that our sample contains potentially available nurses regardless of their labour force participation and regardless of the occupation they are currently working in. Thus our sample enables us to model the extensive margin not only in terms of workforce participation (Creedy and Duncan, 2002), but also in terms of choice of non-nursing occupations. This allows us to address the following questions: (1) how responsive is the decision to enter the nursing profession to wages in nursing?, (2) how responsive is the decision to exit the profession to wages in non-nursing occupations?, and (3) is the estimated total elasticity of labour supply in nursing higher than suggested by previous research, once transitions between nursing and other occupations are allowed for in the model? The answers to these questions will indicate whether increasing wages in nursing could be a more promising policy to tackle the problem of nurse staffing shortages than previously thought.

Additionally, we examine the choice for shift type (distinguishing regular day shifts, regular night shifts and irregular shifts) as part of nurses’ labour supply. In our structural approach employment in different shift types enters the utility function as separate arguments. This leads to our fourth research question: (4) does the responsiveness of labour supply to wage changes differ by shift type? A further novelty of our research is that we investigate the importance of family circumstances for nurses’ labour supply, as these are expected to affect the utility derived from income and the disutility derived from working hours in nursing jobs with different shift types or a non-nursing occupation. This leads to our final research question: (5) to what extent does the responsiveness to wage changes depend on a nurse’s family circumstances?

To give a preview of our findings, the results suggest that nurses’ labour supply might be more responsive to changes in wages than earlier research concluded, as we find an average elasticity of hours worked in nursing with respect to nursing wages of 1.3. This higher elasticity is due to changes in the entry into and exit from the nursing profession in response to wage changes. Earlier research has ignored the possibility of entry from or exit to other occupations. Work in night shifts turns out to be associated with a higher disutility from work than work in other shift patterns, and thus the responsiveness to wages is considerably smaller. Estimating labour supply elasticities for different groups of nurses with different family circumstances, it is found that the labour supply elasticity with respect to other non-labour income is considerably higher for nursing qualification holders with children and for those with a higher qualification. For the labour supply elasticity with respect to nursing wages, it is found that lower qualified, childless, and older nursing qualification holders respond more strongly to a given wage increase than their counterparts.

The paper continues as follows. Section 2 briefly discusses previous literature, and Section 3 the proposed estimation strategy. This is followed in Section 4 by a description of the data, including some summary statistics. The results are presented in Section 5. Section 6 concludes.

2. Literature review

Antonazzo et al. (2003) and Shields (2004) provide extensive reviews of the literature on the labour supply of nurses. The majority of studies find that nurses’ labour supply is fairly unresponsive to wages, at an elasticity of around 0.3 (Shields, 2004). A major methodological challenge in modelling labour supply in the highly female-dominated profession of nursing is to model the labour force participation decision together with the decision about the number of hours worked. Previous studies have applied a broad range of methods to deal with the endogeneity of the participation decision. Early studies estimated systems of equations using a 2SLS- or 3SLS-framework based on regional data, where a continuous hours equation is estimated simultaneously with an equation representing the labour force participation rate in the region (Benham, 1971; Link and Settle, 1981b). Some studies based on individual data apply Tobit models for the estimation of working hours (Sloan and Richupan, 1975; Link and Settle, 1979, 1981a). In later studies, Heckman selection models are used to estimate the number of supplied working hours while accounting for the participation decision (Link, 1992; Ault and Rutmann, 1994; Chiha and Link, 2003). The use of panel data is rare, given the scarcity of panel data on nurses. Among the first studies using panel data are Askildsen et al. (2003) and Rice (2005). While Rice (2005) used data from the British Household Panel Study, Askildsen et al. (2003) drew upon a large panel data set of registered nurses in Norway. Askildsen et al. estimated the number of supplied hours together with a participation equation, accounting for unobserved heterogeneity through correlated error terms. Regardless of the applied estimation methods, with few exceptions (e.g. Sloan and Richupan, 1975; Link and Settle, 1985), previous studies found the elasticity of working hours as well as the elasticity of labour force participation to be well below one, i.e. nurses’ labour supply appears to be relatively inelastic with respect to wages.

Recent research has therefore paid more attention to other, non-monetary, factors that might drive nurses’ labour supply, such as relations with colleagues, degree of autonomy, or intrinsic rewards (e.g. Clark, 2001). Shields and Ward (2001) investigate the role of job satisfaction for nurses’ intentions to quit the profession, and find that satisfaction with promotion and training opportunities is of higher importance than, for example, satisfaction with wages. This has also been investigated using discrete choice experiments (Lagarde and Blaauw, 2009; Lagarde, 2010). The range of job characteristics that might be relevant for nurses’ labour supply is broad. One job characteristic that seems to distinguish nursing jobs from most other professions is the high incidence of shift work. Only two studies have examined shift work in a labour supply context. Askildsen et al. (2003) use the shift type in a nurse’s work contract as an explanatory variable for the number of supplied hours and find it to be an important determinant that influences the size of
the wage elasticity. Di Tommaso et al. (2009) are the first to model shift type as a choice together with supplied hours in a discrete choice setting. In their structural model the labour supply decision is the solution to an individual’s utility maximisation which has leisure (or its complement, hours of work) and consumption (usually assumed to be equivalent to net income) as its arguments. Hours worked in different types of jobs (day shifts versus rotating shifts, and hospitals versus primary care facilities) enter the utility function as separate arguments. While they find the overall elasticity of labour supply to be rather low, they find strong inter-job-type responses to wages. Moreover, work in irregular shifts responds more strongly to a wage increase than work in regular day shifts.

We build on the approach used by Di Tommaso et al. (2009). A structural multi-sector model of labour supply is ideally suited to answer the research questions raised in the introduction: how is the decision to supply labour in nursing versus another occupation affected by wages in nursing and wages in competing professions? Is the labour supply elasticity different for work in different shift patterns? Similar to Di Tommaso et al. we estimate a discrete choice labour supply model with work in different jobs entering the utility function as separate arguments. We extend their approach, however, by using a sample of nursing qualification holders, rather than those who work as registered nurses only.1 Hours of work in different shift patterns and in different occupations enter the utility function as separate arguments. Maximisation of this utility function then determines the labour force participation decision, the hours decision and the occupational choice. However, while many of the studies discussed above model the labour force participation decision, the individual’s occupational choice has not been incorporated in the estimations. Given the number of nurses leaving the occupation, understanding how these nurses can be attracted back to the profession and how current nursing staff can be prevented from leaving will inform policy makers in the design of effective policies, keeping nurses’ labour supply at a higher level. What is needed to keep nurses from leaving specific shift types may well differ. A better understanding of this latter aspect is also crucial for effective policy making.

3. Estimation strategy

We estimate a structural model of individual labour supply, directly based on an underlying utility function, to obtain estimates of labour supply elasticities with respect to income and wages. The utility function has household net income (assumed to represent household consumption) and four different types of working hours as its arguments: working hours in non-nursing jobs, and working hours in nursing jobs in the three different shift types ‘regular day shifts’, ‘regular night shifts’ and ‘irregular shifts’. It is assumed that nursing qualification holders choose hours so that utility is maximised. We use a quadratic specification for the utility function (following Keane and Moffitt, 1998). The quadratic specification is simple but quite flexible in that it allows for the leisure of each person and consumption to be substitutes or complements. This means the model can represent complex interactions. Furthermore, the quadratic utility function can be expressed as a function of labour supply rather than leisure without the need to choose a value for total endowment of time as would be required in alternative specifications. These advantages make the quadratic utility function a good choice, even though it is not automatically quasi-concave. However, in a discrete labour supply model this is not a problem, because if utility $U$ is increasing in income at the observed income and leisure time, and the matrix of second order derivatives of income with respect to leisure along the indifference surface is positive at the observed income and leisure time, then $U$ is quasi-concave at that point (Varian, 1992, pp. 96–97). In the discrete approach taken here, these two conditions can be tested at all data points after estimation of the parameters.2

In each period, each nursing qualification holder $i$ can choose between alternatives $j$ from a set of combinations $M$ of income and working hours in different employment types: $\{(y_{ji}, \mathit{hd}_{ji}, \mathit{hn}_{ji}, \mathit{hi}_{ji}, \mathit{ho}_{ji});\ j = 1, 2, \ldots, m\}$, where $\mathit{hd}_{ji}$, $\mathit{hn}_{ji}$ and $\mathit{hi}_{ji}$ denote the individual’s working hours in nursing jobs, either in regular day shifts ($\mathit{hd}$), regular night shifts ($\mathit{hn}$) or in irregular shifts ($\mathit{hi}$), and $\mathit{ho}$ are working hours in non-nursing jobs. $y$ is the household net income associated with the choice of working hours and employment type. The quadratic utility function to be estimated is:

$$U_{ji} = \beta_0\, y_{ji} + \beta_1\, y_{ji}^2 + \sum_{x=d,n,i,o} \left( \beta_{2x}\, \mathit{hx}_{ji} + \beta_{3x}\, \mathit{hx}_{ji}^2 + \beta_{4x}\, \mathit{hx}_{ji}\, y_{ji} \right) \qquad (1)$$

We estimate various specifications where we allow the linear preference parameters $\beta$ in the utility function to differ by family status.

Labour supply is treated as a discrete choice problem rather than a continuous choice, similar to the approach by Van Soest (1995). This approach is chosen for a number of reasons. First, we want to distinguish the choice between four types of occupation (three nursing occupations with different shift types and other occupations combined), which is a discrete choice. Second, in the labour supply literature it has been shown that a discrete representation of continuous labour supply is adequate, and perhaps even preferred since in reality workers often have a limited number of hours of work they can choose from. We estimate the model for a number of different labour supply points to explore the sensitivity to a different number of points, and to find the minimum number of points at which the results become stable (i.e. the point at which they do not change much when increasing the number of points). Third, we can estimate the preference parameters in the utility function directly, allowing the model to be used to predict the effects of any policy changes that affect net income. Fourth, in a model with continuous hours of labour supply, quasi-concavity conditions would need to be imposed a priori to guarantee coherency, substantially complicating estimation.

In 2002, Creedy and Duncan already noted “...in recent years there has been a move towards discrete modes of estimation.” (p. 13), and this move has continued up to now. The discrete choice labour supply model is particularly popular in the context of tax and transfer behavioural microsimulation modelling.

We observe between 0 and 80 working hours per week, measured in 1-hour units. Individuals can hold more than one job, and thus different combinations of working hours in different hours-types are possible. To maintain computational tractability, we apply two simplifying assumptions: (i) for each alternative, only one of the variables $\mathit{hd}$, $\mathit{hn}$, $\mathit{hi}$, $\mathit{ho}$ is allowed to be positive,3 and (ii) we allow three discrete labour supply choices (which means that we approximate actual labour supply by one of the choices).4 The discrete

1 This allows us to analyse the decision to work in nursing versus working in another occupation or being out of the labour force, and to examine the response to wages in competing professions together with the response to wages in nursing.
2 Allowing for expressing utility as a function of hours worked instead of leisure, we have checked for these two conditions. In our case, they were found to be fulfilled in 99.24% of the observations.
3 89.7% of the observations in the sample do not hold more than one job and are thus not affected by this restriction. For those who hold more than one job, the shift pattern in the main job is considered to determine the chosen alternative. The shift pattern in other jobs is not observed. Hours worked in all jobs are added together.
4 The intervals of working hours per week are [4, 25], [26, 37] and [38, 80]. The discrete hours points $\mathit{hd}_j$, $\mathit{hn}_j$, $\mathit{hi}_j$ or $\mathit{ho}_j$ are set to the average number of hours
Table 1
Person-year-observations by shift type/occupation and hours worked.
Occupation/shift type
Nursing: day Nursing: night Nursing: irregular Outside nursing Not employed Total
Hours worked:
4–25 h 209 73 328 300 – 910
26–37 h 171 111 373 250 – 905
38+ h 262 58 456 525 – 1301
Not employed – – – – 856 856
Total 642 242 1157 1075 856 3972
# of individuals 234 98 343 293 260 696
labour supply points are chosen in such a way that the actual Restricting the number of possible working hours to a limited
labour supply is represented as well as possible.5 This leaves us set of discrete values has the advantage that for this limited set
with twelve different choices of employment, and we add a thir- of hours, one can calculate the level of utility that each possible
teenth alternative “non-employment”, where hours in each of the combination of hours and occupation would generate, according
four different work arrangements and labour income is set to zero. to the specified utility function. If we can calculate utility levels for
The utility function is approximated by a deterministic compo- each of the discrete values of hours of work and income, and the
nent represented by the quadratic function specified in (1), plus error terms are specified, then for each possible combination we
a random disturbance that is assumed to follow a type I Extreme can calculate the probability of that combination being preferred
Value distribution. Choosing an extreme value specification for the according to the estimated model. Under the assumption of a type I
error term results in a multinomial logit model (see Maddala, 1983). Extreme Value error term, the probability that individual i chooses
Due to the tractability of the multinomial logit model, this choice alternative j (from the m alternatives) is:
has been popular in discrete choice labour supply modelling.
exp(Uji )
The basic assumption underlying the model estimation is that Pr(Uji > Uki , k =
/ j) = m (2)
every individual chooses the alternative that leads to the highest k=1
exp(Uki )
utility. There is reasonable support for this assumption of utility To estimate these probabilities, we need to know the budget
optimisation since actual hours worked and preferred hours of constraint in order to determine the household net income (and
work correspond quite closely. Preferred hours are measured in thus the utility level) associated with each choice j. Hourly gross
HILDA by asking employed respondents “whether and how many wages for the different shift types in nursing and for working in
hours per week they would prefer to work more or less than they non-nursing jobs are predicted for all nursing qualification holders
currently do, taking into account how that would change their using wage regressions. We regress wages in non-nursing occu-
income”. For 72% of all observations on employed individuals, the pations on tenure (as a proxy for job-specific human capital), age
reported preferred working hours fall in the observed hours inter- (as a proxy for labour market experience and thus general human
val. For those working in nursing this is even higher at 74%. This is capital), and state and year dummies. Age and tenure enter the
some indication that nursing qualification holders may be slightly model in linear, quadratic and cubic terms. Age and tenure are
better able to optimise their hours than individuals in other profes- fully interacted with a dummy variable indicating tertiary educa-
sions, and certainly no worse. This is possibly due to the shortage tion. A separate regression equation for nurses’ wages has the same
of nurses’ supply. Thus we can expect models of labour supply, functional form, but additionally includes a dummy variable that
which are based on the assumption that people can choose their indicates the shift type, which is again interacted with tertiary edu-
hours freely, to work at least as well for nursing qualification hold- cation. Using tenure in the wage models allows us to capture nurses
ers as for other groups. In addition, the distribution of hours worked who have been in nursing for a longer time and so are less likely to
within each shift type suggests that both part-time and full-time switch occupation. Tenure in the nursing wage models will be high,
hours ranges are well covered (see Table 1). This indicates that a and tenure in another occupation would be zero, thus making the
broad range of combinations of shift type and hours worked is on relative wage in nursing higher than for nurses with lower levels
offer, providing nursing qualification holders with varied choices of tenure. Naturally the reverse is true as well, with longer tenure
of shift type and hours worked, facilitating the supply of preferred in other occupations making the wage in other occupations more
hours without facing demand side constraints.6 attractive relative to nursing.
To account for selection into employment and into nursing, we
apply a two-step Heckman selection model. The participation deci-
worked observed in each of these intervals, and the average number of hours worked
sions for employment in nursing and for employment in other
is used to determine the corresponding labour income at that labour supply point. occupations are estimated as a function of age, education, gender,
Individuals working less than 4 hours per week are considered to be non-employed. State and year, family circumstances (presence of a partner and
5
We also examine the sensitivity of results to choosing a larger number of labour partner’s employment status, as well as the presence of children
supply points: four, six and eight instead of three. This does not change the estimated
and interactions between children and partner’s characteristics).7
elasticities much. Selected results are included in Appendix C, and the full results
are available from the authors upon request. Using the estimated wage models, the expected gross labour
6
Individuals who are most likely to be facing demand side factors that lead to income at different choices of hours worked per week can be cal-
sub-optimal working hours are those for whom observed hours are not equal to culated. Non-work household income and partner’s gross income
preferred hours. This may potentially lead to bias in the estimation of the model’s
parameters due to measurement error. Therefore, in the empirical section of the
paper, we follow Ribeiro (2001) who uses information from the sample (whether
7
workers were looking for another job) to exclude individuals from the analysis, and As a robustness check, we have repeated the estimation with predicted wages
estimate an alternative version of the model, excluding all observations who are from a simple OLS regression without controlling for selection into employment.
not working at their preferred hours. This provides an indication of the bias of the The results are very similar, and results from both wage models are available from
estimated elasticities due to sub-optimal labour supply reported in the data. the authors upon request.
are treated as exogenous. The sum of resulting labour income, other weekly net household income higher than $10,000 or lower than
non-labour income and partner’s income is used to compute taxes $200. The remaining sample contains information on 696 nursing
paid and family payments received by both partners based on the qualification holders (3972 person-year observations), of whom
relevant tax and transfer system (which were slightly different for 474 are observed working in a nursing occupation at least once dur-
each of the years observed in the data).8 The resulting net house- ing the observation period (2041 person-year observations). The
hold income is adjusted for inflation using the consumer price index data also show that a substantial proportion of nursing qualification
depending on the year of observation. holders work outside nursing (27.1% of all person-year observa-
The most important advantages of this type of model for our tions), indicating that nursing qualification holders can quite easily
analysis are (i) that we fully incorporate the participation deci- move to other occupations. Six hundred and three nursing qual-
sion and the hours decision in the same modelling framework and ification holders are observed at least twice. In this group, 111
thus our results are not biased by endogenous sample selection change from a nursing occupation to another occupation at least
into working as a nurse, (ii) that the choice for different shift types once (18.4%), and 103 from a non-nursing occupation to a nursing
and occupations can be incorporated in the model easily, and (iii) occupation (17.1%) with 67 individuals observed in both groups.
that the direct estimation of the utility function allows us to sim- Within the 9-year observation period we observe a total of 184
ulate responses to different wage policies. A minor limitation of transitions from another occupation or out of the labour force into
the approach used in this paper is that the partner’s labour supply nursing, and 216 exits from nursing into other occupations or out
is treated as exogenous. Although the model in principle allows of the labour force.
estimating partner’s labour supply jointly by extending the set of HILDA provides information on hours worked in nursing
choices for couple families, this is not a feasible strategy given the occupations in different shift types and hours worked in other occu-
size of the data set (see Section 3), since it would require separate pations. Potentially important personal characteristics include,
models for single-adult and two-adult families.9 besides socio-demographics controls such as age, gender and high-
est educational degree; the family circumstances (presence of a
4. Data partner and of young children, partner’s employment status, part-
ner’s income and partner’s education); and information on the level
4.1. Description and summary statistics of the respondent’s satisfaction with their partner and a variable
that indicates if they feel a lack of support. We also explored the
In this analysis, we use pooled data from nine waves relevance of the nursing qualification holders’ personality which
(2001–2009) of the Household, Income and Labour Dynamics in is measured in five dimensions: extroversion, emotional stability,
Australia Survey (HILDA). HILDA is an annual household survey openness to new experiences, agreeableness and conscientious-
that is representative of all Australian households; individual inter- ness (the “Big Five”). However, these personality traits are not
views are conducted with all members of selected households who found to be very important, so we have only included them in the
are aged 15 or over. In their first interview, respondents are asked descriptive statistics and models, and we have left out these results
whether they hold a nursing qualification. In subsequent waves, from the labour supply model.10
respondents are asked whether they gained any further qualifica- Table 2a shows descriptive statistics for the nursing qualifica-
tions since the last interview. It is not known, however, whether tion holders. The vast majority is female, less than 10% are male.
any of these further qualifications were obtained in nursing. We The individuals in our sample are 45 years old on average, one out
assume that a qualification gained after the first interview is a of five has a tertiary educational degree in nursing, and they work
nursing qualification, if the individual works in nursing within two 25 h per week on average. Those who have a job in nursing are about
years of gaining the additional qualification. Whether an individual three years younger, work 6 h per week more, and are more likely
works in a nursing occupation is determined based on the 4-digit to hold a bachelor degree. Of those with a job in nursing, 74% work
ANZSCO-Code. Nursing occupations include Midwifery and Nurs- as a registered nurse or midwife, and 12% as an enrolled nurse or
ing Professionals, Enrolled Nurses, Nursing Support Workers and mothercraft nurse. The partner characteristics of those who work
Personal Care Workers. 4933 observations of 788 individuals with in nursing and the average nursing qualification holder appear to
a nursing qualification are available in HILDA across waves 1–9. be similar: about two thirds live with a partner, the vast majority
We exclude all observations after they reach age 65, observa- of partners is employed (although more so for those still working
tions of individuals with missing information on hours worked, on in nursing), about 40% of partners hold bachelor degrees, and part-
their shift type or on their partners, and individuals with a reported ners’ income is comparable for both groups. However, those who
work in nursing are more likely to have young children below age 5.
This is in line with Dockery and Barns (2005), who find that flexible
8
Included in the calculation of payable taxes and government payments are working arrangements in nursing jobs are an important motivation
income tax, low income rebates, dependent spouse rebates, Medicare levy and Medi- to study nursing for young women who intend to have children.
care levy surcharge, as well as family tax benefit part A and family tax benefit part Table 2b reports average predicted hourly wages in non-
B. See http://www.centrelink.gov.au/internet/internet.nsf/publications/co029.htm nursing occupations and in nursing by shift type.11 Predicted
for an overview of the current (and recent) Australian social security system and
http://www.ato.gov.au for the Australian tax system.
wages for nursing qualification holders who are currently not
9
The approach of treating partner’s labour supply as exogenous is frequently working are substantially lower than for those in employment,
used in the literature on nurses’ labour supply, as is evident from the review of regardless of the occupation or shift type they would commence
labour supply studies on nurses by Antonazzo et al. (2003). This approach is also work in. This is in line with self-sorting into employment. Nurses
adopted by a recent study by Di Tommaso et al. (2009) on which this paper builds.
working in regular night shifts have a lower predicted wage level
The approach of treating partner’s labour supply as exogenous has also been used
in recent studies on mother’s labour supply and childcare decisions (Kornstad and as well, reflecting their somewhat lower educational attainment.
Thoresen, 2007; Breunig et al., 2012). In these cases including the partner’s labour
supply would make the model more computationally demanding and lead to few
additional insights, since their labour supply behaviour was unlikely to be very
10
responsive to, for example, childcare cost. In addition, given that the majority of Readers who are interested in these results can request them from the authors.
11
partners (82%) in our data are working and are working full time (81% of working The wage rates were adjusted for inflation using the consumer price index calcu-
partners work 38 h or more), treating the partner’s labour supply as exogenous is a lated for all groups of goods and services in September 2009, as a weighted average
reasonable simplification. of the eight capital cities in Australia (Australian Bureau of Statistics, 2012).
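The choice set and the choice probabilities of Eq. (2) can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not the authors' estimation code: the function names and labels are hypothetical, hours are assumed to be whole numbers (as in the 1-hour units reported above), and the utilities passed to `choice_probabilities` would in practice come from the estimated utility function.

```python
import math

# Hours intervals from footnote 4; fewer than 4 hours counts as non-employment.
HOURS_BANDS = [(4, 25), (26, 37), (38, 80)]
# Hypothetical labels for the four work arrangements (hd, hn, hi, ho).
WORK_TYPES = ["day", "night", "irregular", "non_nursing"]


def hours_band(hours):
    """Return the index of the hours band, or None for non-employment (< 4 h)."""
    if hours < 4:
        return None
    for idx, (low, high) in enumerate(HOURS_BANDS):
        if low <= hours <= high:
            return idx
    raise ValueError("hours must be a whole number between 0 and 80")


def alternatives():
    """Enumerate the 13 alternatives: 4 work types x 3 bands, plus non-employment."""
    alts = [(work, band) for work in WORK_TYPES for band in range(len(HOURS_BANDS))]
    alts.append(("not_employed", None))
    return alts


def choice_probabilities(utilities):
    """Multinomial logit probabilities exp(U_j) / sum_k exp(U_k), as in Eq. (2)."""
    u_max = max(utilities)  # subtracting the maximum avoids overflow in exp()
    weights = [math.exp(u - u_max) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]
```

With equal utilities for all thirteen alternatives, `choice_probabilities` returns 1/13 for each of them; any difference in predicted probabilities therefore comes entirely from differences in the deterministic utility levels.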
Table 2a
Descriptive statistics. Entries are means, with standard deviations in parentheses.

| Variable | Nursing qualification holders | Working in nursing occupations |
|---|---|---|
| Female | 90.66% | 91.52% |
| Age (in years) | 44.9 (10.8) | 42.4 (10.6) |
| Holds a bachelor degree | 47.68% | 55.07% |
| Holds a bachelor degree in nursing | 19.21% | 27.00% |
| Working as registered nurse or midwife | 38.09% | 74.13% |
| Working as enrolled nurse or mothercraft nurse | 6.07% | 11.81% |
| Working as a nursing support or personal care worker | 7.22% | 14.06% |
| Number of hours worked per week | 25.74 (17.37) | 31.93 (11.12) |
| Family situation | | |
| Has child ≤ 4 years | 10.42% | 12.40% |
| Has partner | 68.00% | 67.91% |
| Has child ≤ 4 years and no partner | 1.86% | 2.01% |
| If partner | | |
| Partner is employed | 81.75% | 88.31% |
| Partner’s annual net income (in 10,000 AUD) | 4.75 (3.09) | 4.80 (2.97) |
| Partner holds a bachelor degree | 39.50% | 41.13% |
| Personality variables (scale 0–7) | | |
| Extroversion | 4.56 (1.08) | 4.56 (1.03) |
| Emotional stability | 5.23 (1.06) | 5.16 (1.03) |
| Agreeableness | 5.68 (0.77) | 5.71 (0.75) |
| Openness to new experiences | 4.28 (0.98) | 4.14 (0.94) |
| Conscientiousness | 5.29 (1.01) | 5.25 (1.03) |
| Social relationships | | |
| Satisfaction with relationship to partner (0: completely dissatisfied; 10: completely satisfied) | 8.02 (1.71) | 7.97 (1.70) |
| Often needs help but cannot get it (0: strongly disagree; 7: strongly agree) | 2.32 (1.49) | 2.32 (1.45) |
Table 2b
Descriptive statistics – predicted wages. Entries are means, with standard deviations in parentheses.

| Predicted wage | Out of the labour force | Working in non-nursing | Working in nursing, day shifts | Working in nursing, night shifts | Working in nursing, irregular shifts |
|---|---|---|---|---|---|
| In nursing, day shifts | 32.22 (8.63) | 33.53 (8.41) | 39.06 (8.53) | 36.05 (9.46) | 37.78 (8.56) |
| In nursing, night shifts | 35.17 (10.52) | 36.77 (10.25) | 42.87 (10.18) | 39.03 (11.35) | 41.59 (10.05) |
| In nursing, irregular shifts | 34.68 (9.14) | 36.08 (8.90) | 41.76 (8.97) | 38.53 (9.95) | 40.48 (8.95) |
| In non-nursing | 43.06 (8.90) | 48.30 (9.37) | 47.58 (10.28) | 43.53 (8.04) | 47.70 (9.31) |

Notes: Predicted wages are in AUD$/hour, adjusted for inflation to 2009 prices.
Nursing qualification holders who work in an occupation other than nursing would have to accept a drop in wages if they were to change to nursing. In contrast to that, nursing qualification holders who do work in nursing can expect higher wage rates if they were to change occupation. Due to the return to experience in an occupation, nursing qualification holders who are working in nursing are predicted to have higher wages in nursing than those who are working in another occupation. Vice versa, nursing qualification holders who are working in a non-nursing occupation are predicted to have higher wages in a non-nursing occupation than those who are working in nursing. Wages across shift types are relatively similar, with irregular shifts and regular night shifts paying somewhat more than regular day shifts for all nursing qualification holders, consistent with collective wage agreements.

4.2. Correlation patterns between employment type, shift type and family situation

In a purely descriptive analysis to show the importance of family situation for the choice of employment type and shift type, we estimate two separate multinomial Logit models on the pooled HILDA data for: (i) the type of employment and (ii) the shift type conditional on being employed as a nurse. This allows us to examine correlation patterns between nursing qualification holders’ characteristics (including interaction terms between the partner’s characteristics and the presence of a young child) and their observed work arrangements. Table A1 (in Appendix A) reports marginal effects of individual characteristics on the probability of full-time employment in nursing, part-time employment in nursing, employment in other occupations, or no employment.[12]

While the comparison of individual characteristics in Table 2a did not yield any strong differences in the family situation between those who work in nursing and those who do not, the results are different when we distinguish between not working at all and working in other occupations, and between working as a nurse in full-time jobs or in part-time jobs. The complex patterns implied by the model presented in Table A1 are illustrated in Fig. 1. It shows predicted probabilities of different employment types, evaluated at different individual characteristics: with and without young children, without partners, and with employed partners at different levels of the partner’s income.

Fig. 1. Predicted probability of employment type by presence of children, partner and partner’s annual net income (in AUD).

For nursing qualification holders without children, we see almost no variation in the probability of being in a nursing job or not when the partner’s income increases and only a slightly decreasing probability of being full-time employed as a nurse instead of being part-time employed. For nursing qualification holders with children, results are very different. Without a partner, working as a nurse is unlikely for those with children compared to those without children. If there is a partner, but with a low income, working in nursing jobs becomes more likely. With increasing partner’s income, nursing appears to become less attractive, and there is a large shift from full-time nursing towards part-time nursing. Fig. 1 suggests that (a) income effects are important in pushing nursing qualification holders out of the profession, and particularly out of full-time nursing, (b) that these income effects are more important when children are present, and (c) that working as a nurse and raising a child might be easier to organise when there is a partner (and in fact that, in that case, nursing might be a slightly preferred option compared to other occupations).

The latter finding immediately leads to the question whether nurses in different family circumstances work different shift patterns. Table A2 shows the marginal effects of a multinomial Logit model that we estimated for the sub-sample of nursing qualification holders who work as a nurse. The model includes three different outcomes: regular day shifts, regular night shifts and irregular shifts.

Fig. 2 shows predicted probabilities for different shift types depending on the family situation. The effect of having children depends strongly on the presence of a non-employed or employed partner with high or low income. The graph suggests that it is the presence of a partner that allows for coordination of working times among parents and thus for irregular shift schedules, which are in turn chosen more often. On the other hand, income effects might decrease the relative attractiveness of the comparatively high-paid irregular shifts, and thus increase the probability of working according to a standard schedule when the partner’s income is high enough. Both effects seem to be important for nurses with young children, but play a minor role for their childless counterparts.

Fig. 2. Predicted probability of shift type by presence of children, partner and partner’s annual net income (in AUD).

5. Results of a structural model of labour supply – estimating labour supply elasticities

We estimate a structural model of nurses’ labour supply as outlined in Section 3 to obtain estimates of the elasticity of nurses’ labour supply in different shift types with respect to their wages and incomes.[13] All observations are clustered at the individual level.[14] Table 3 shows the estimated preference parameters of the quadratic utility function as presented in Eq. (1), as well as the marginal effects of income and hours worked in the different shift types or a non-nursing occupation on the utility averaged over all individuals. Standard errors of marginal effects are bootstrapped. As expected from theory, utility increases significantly in household net income, and decreases in hours worked. The strongest disutility from work occurs for regular night shifts, although the 95%-confidence intervals for disutility from work overlap for all types of working hours, except for regular night shifts and irregular shifts (which have the lowest disutility).

The results from Section 4.2 indicate heterogeneity in nurses’ labour market behaviour depending on characteristics. Differences in labour supply for different population groups may arise from differences in the utility an individual gains from income and leisure.

[12] Observations on the same individual are clustered to allow for the panel nature of the data in a limited way by correcting the bias in standard errors arising from correlation between observations on the same person.

[13] For the calculation of net income, we estimated hourly wages in different shift types (cf. Section 3). We excluded observations with an hourly wage rate below $10 or above $85 from the auxiliary wage regression sample. Furthermore, observations were not used for the wage regressions when a wage growth of more than 100% was observed from one year to the next, followed by an immediate decrease to less than 50% (cf. Munasinghe et al., 2008). The wage regression results are available from the authors upon request.

[14] To test the robustness of the results, we have repeated the estimation using only the first observation year per individual. There were no substantial changes in the magnitude or in the significance level of estimated coefficients or elasticities. The results are available from the authors upon request.
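One way an income elasticity can be obtained by simulation from a discrete choice model of this kind is sketched below in Python. This is a toy illustration under strong assumptions and not the authors' procedure: it uses a single work arrangement (day shifts) with the income and day-shift coefficients reported in Table 3, assumes a simple quadratic functional form for the utility, ignores taxes and transfers, and uses hypothetical hours points, wage and other income. The resulting number is therefore not comparable to the elasticities reported in the paper.

```python
import math

# Income and day-shift coefficients as reported in Table 3
# (income in 1000 AUD per week). The functional form is an assumption.
B_Y, B_YY = 7.653, -0.942
B_H, B_HH, B_YH = -0.219, 0.002, 0.020


def utility(income, hours):
    """Assumed quadratic utility in weekly net income and weekly hours."""
    return (B_Y * income + B_YY * income ** 2
            + B_H * hours + B_HH * hours ** 2
            + B_YH * income * hours)


def expected_hours(hours_points, wage, other_income):
    """Expected weekly hours under multinomial logit choice probabilities."""
    utils = [utility((wage * h + other_income) / 1000.0, h) for h in hours_points]
    u_max = max(utils)  # subtract the maximum for numerical stability
    weights = [math.exp(u - u_max) for u in utils]
    total = sum(weights)
    return sum(h * w / total for h, w in zip(hours_points, weights))


def income_elasticity(hours_points, wage, other_income, pct=0.01):
    """Percentage change in expected hours per percentage change in other income."""
    base = expected_hours(hours_points, wage, other_income)
    shocked = expected_hours(hours_points, wage, other_income * (1.0 + pct))
    return ((shocked - base) / base) / pct


# Hypothetical inputs: hours points, net hourly wage (AUD), weekly other income (AUD).
HOURS_POINTS = [0, 16, 32, 40]
elasticity = income_elasticity(HOURS_POINTS, wage=39.0, other_income=500.0)
```

Because the zero-hours alternative is part of `HOURS_POINTS`, the simulated change in expected hours combines movements out of work with movements between hours points, which is the distinction between unconditional and conditional responses that the discrete choice framework is designed to capture.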
Table 3
Structural model of labour supply, estimation of the preference parameters of the utility function.
Model (1): U = F(w, hd, hn, hi, ho)

| Variable | Coeff. | Std. Err. | Sig. | 95% CI lower bound | 95% CI upper bound |
|---|---|---|---|---|---|
| Income (weekly net income in 1000 AUD) | 7.653 | 0.833 | ** | 6.020 | 9.285 |
| Income² | −0.942 | 0.337 | ** | −1.602 | −0.282 |
| Hours (non-nursing) | −0.236 | 0.020 | ** | −0.274 | −0.198 |
| Hours (non-nursing)² | 0.002 | 0.000 | ** | 0.001 | 0.003 |
| Income × hours (non-nursing) | 0.022 | 0.014 | | −0.004 | 0.049 |
| Hours (day) | −0.219 | 0.016 | ** | −0.252 | −0.187 |
| Hours (day)² | 0.002 | 0.000 | ** | 0.001 | 0.002 |
| Income × hours (day) | 0.020 | 0.010 | ◦ | −0.001 | 0.040 |
| Hours (night) | −0.265 | 0.024 | ** | −0.311 | −0.218 |
| Hours (night)² | 0.002 | 0.000 | ** | 0.001 | 0.003 |
| Income × hours (night) | 0.017 | 0.017 | | −0.017 | 0.051 |
| Hours (irregular) | −0.171 | 0.017 | ** | −0.204 | −0.139 |
| Hours (irregular)² | 0.001 | 0.000 | ** | 0.001 | 0.002 |
| Income × hours (irregular) | 0.009 | 0.012 | | −0.014 | 0.033 |

# observations: 3972
# person observations: 696
log-Likelihood: −8991.1885
Wald-test of model significance, χ² (dF): 507.43 (14) **

| Variable | Marg. Eff. | Std. Err. | Sig. | 95% CI lower bound | 95% CI upper bound |
|---|---|---|---|---|---|
| Income | 5.347 | 0.477 | ** | 4.412 | 6.282 |
| Hours (non-nursing) | −0.175 | 0.013 | ** | −0.200 | −0.150 |
| Hours (day) | −0.163 | 0.011 | ** | −0.185 | −0.141 |
| Hours (night) | −0.207 | 0.014 | ** | −0.234 | −0.180 |
| Hours (irregular) | −0.138 | 0.011 | ** | −0.160 | −0.116 |

Notes: **, * and ◦ indicate statistical significance at the 1, 5 and 10 percent level.
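To see how marginal effects follow from the coefficients in Table 3, the derivatives of a quadratic utility function can be written out. The form below is an assumption made for illustration, since Eq. (1) is not reproduced in this section: utility is taken to be the sum of the linear, squared and interaction terms listed in the table, with y denoting weekly net income in 1000 AUD and h_s weekly hours in work arrangement s. The β symbols are notation introduced here, not the authors'.

```latex
% Assumed form: U = \beta_y y + \beta_{yy} y^2
%                 + \sum_s ( \beta_s h_s + \beta_{ss} h_s^2 + \beta_{ys}\, y\, h_s )
\begin{align*}
\frac{\partial U}{\partial y}   &= \beta_y + 2\beta_{yy}\, y + \sum_s \beta_{ys}\, h_s , \\
\frac{\partial U}{\partial h_s} &= \beta_s + 2\beta_{ss}\, h_s + \beta_{ys}\, y .
\end{align*}
% Worked example with the Table 3 coefficients, y = 1 and 30 hours of night shifts:
\begin{align*}
\frac{\partial U}{\partial y}          &= 7.653 - 2(0.942)(1) + 0.017(30) = 6.279 , \\
\frac{\partial U}{\partial h_{night}}  &= -0.265 + 2(0.002)(30) + 0.017(1) = -0.128 .
\end{align*}
```

These point values depend on the chosen income and hours and on the rounding of the published coefficients, so they are not expected to match the sample-averaged marginal effects in the lower panel of Table 3.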
To incorporate such differences in individuals’ utility functions, we estimated additional specifications that include income and hours worked as in the base model, but also include interactions of the linear terms of income and hours with personal characteristics. Because of the sample size, it was not possible to include the full range of socioeconomic characteristics and their interactions as described in Section 4.2. Instead, we estimated a more parsimonious specification to explore heterogeneity in the utility functions of nursing qualification holders. This alternative includes interactions of the linear terms of income and hours with age, presence of young children below age 5, presence of a partner and educational degree. Table 4 shows the coefficients of the additional interaction terms. Almost none of them are individually significant at the 5% level. However, when tested jointly, the presence of children, age and education are highly significant. This implies that the optimal choice of labour supply, as well as the response of labour supply to changes in wages and income, varies with personal characteristics.

To assess the performance of the estimation, we compared the actually observed labour supply decisions to the choices that are predicted by each of the model specifications in Table 5. The average predicted probabilities for non-employment, occupation and shift types almost equal the observed frequencies, indicating that the model reflects observed labour market behaviour well. The number of working hours in each of the shift types and in both occupation types is also predicted well by the model, with the exception of working hours in night shifts. The estimated probability of working a high or a low number of hours in night shifts is somewhat higher than the observed frequencies, while the probability of working between 26 and 38 h is somewhat lower. However, working in night shifts is an infrequent choice, with less than 3% of the observations being observed as working in each of the hours bands in night shifts, and the absolute difference between observed frequencies and average predicted frequencies is small. Further checks on the performance of the model are presented in Appendix B. First, the predicted and observed choices for individuals are compared. Second, making use of the panel aspect of the data, the consistency of the estimated model with the observed transitions from one labour supply choice to another over the nine available years of data is investigated. The results from both checks are reassuring.

5.1. Income elasticity

After estimating the utility function, we run several simulations to explore the expected labour supply response of nursing qualification holders to different scenarios with respect to income. First, while holding their net labour income constant, we increase the net non-labour income and net partner’s income by 1% to predict the resulting changes in their labour market behaviour. The results, based on specification (1), can be seen in column 1 of Table 6a. If we examine the changes in the probability of working in specific shift types or outside nursing jobs, we see that nursing qualification holders are more likely to stop working or to switch into regular day shifts when their non-work income increases.

The second panel of the table gives the expected working hours in day shifts, night shifts, irregular shifts or non-nursing jobs, conditional on being in one of these working arrangements. As one would expect, working hours decrease as a result of the increase in other net income. However, the effects are very small in magnitude, and the resulting elasticities are, although significant, close to zero. We get a similar result for expected hours in nursing jobs conditional on working in a nursing job; the resulting elasticity is only −0.04. However, when we look at the unconditional elasticity of working hours in nursing, this is more than twice as high as the conditional elasticity (although still relatively small). This result implies that most of the changes in expected working hours in nursing arise not because working nurses supply fewer hours when other net income increases, but mainly because nursing qualification holders stop working in nursing. Finally, the overall
Table 4
Structural model of labour supply, estimation of the preference parameters of the utility function by groups.
Model (2) = U = F(w, hd, hn, hi, ho, Age, Children, Partner, Bachelor Degree)
Model (2) = Model (1), as in Table 3, plus interactions: Coeff. Std. Err.
Income times. . .
Age (1 = upper 50th percentile) 2.146 1.100 ◦
Children ≤ 4 years (1 = yes) −2.357 1.063 *
Has partner (1 = yes) 0.421 1.205
Holds bachelor degree (1 = yes) 4.294 1.154 **
Hours(non-nursing) times. . .
Age (1 = upper 50th percentile) −0.057 0.021 **
Children ≤ 4 years (1 = yes) 0.020 0.021
Has partner (1 = yes) −0.015 0.022
Holds bachelor degree (1 = yes) −0.092 0.024 **
Hours(day) times. . .
Has partner (1 = yes) −0.016 0.017
Hours(night) times. . .
Has partner (1 = yes) −0.014 0.023
Hours(irregular) times. . .
Has partner (1 = yes) −0.013 0.019
Wald-Tests of Joint Significance χ² (dF) p-value
Interactions: Income, Hours(non-nursing), Hours(day),Hours(night), Hours (irregular) times:. . .

Age 41.18 (5) 0.000 **
Children ≤ 4 years 31.56 (5) 0.000 **
Has partner 2.64 (5) 0.756
Holds bachelor degree 50.65 (5) 0.000 **
Notes: **, * and ◦ indicate statistical significance at the 1, 5 and 10 percent level. The standard errors were bootstrapped with 100 draws from the original sample. Model 2 is
estimated on the full sample (3974 observations).
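As a rough illustration of how the significance markers in Table 4 relate to the reported coefficients and standard errors, the sketch below computes a two-sided p-value for z = Coeff. / Std. Err. It assumes a standard normal reference distribution, which is an assumption made here for illustration; the source only states that the standard errors were bootstrapped. The function names `two_sided_p` and `marker` are hypothetical.

```python
import math

def two_sided_p(coeff, std_err):
    """Two-sided p-value of z = coeff / std_err under a standard normal (assumed)."""
    z = abs(coeff / std_err)
    return math.erfc(z / math.sqrt(2.0))

def marker(p):
    """Map a p-value to the significance symbols used in the table notes."""
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    if p < 0.10:
        return "◦"
    return ""

# Income interaction rows copied from Table 4: (Coeff., Std. Err.)
rows = {
    "Income x Age": (2.146, 1.100),
    "Income x Children <= 4 years": (-2.357, 1.063),
    "Income x Has partner": (0.421, 1.205),
    "Income x Bachelor degree": (4.294, 1.154),
}

for name, (coeff, se) in rows.items():
    p = two_sided_p(coeff, se)
    print(f"{name}: z = {coeff / se:.2f}, p = {p:.3f} {marker(p)}")
```

Under this assumption the four income interactions reproduce the markers printed in the table (10 percent, 5 percent, none, 1 percent).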
Table 5
Model performance: observed frequencies and average predicted probabilities of shift types and working hours.
Alternative: Observed probability Specification (1) Specification (2)
Predicted probability Ratio Predicted probability Ratio
No employment 21.55% 21.17% 0.98 21.18% 0.98
Day shift, 4–26 h 5.26% 5.33% 1.01 5.34% 1.01

Day shift, 26–37 h 4.31% 4.22% 0.98 4.21% 0.98
Day shift, ≥38 h 6.60% 6.63% 1.01 6.64% 1.01
Day shift 16.16% 16.18% 1.00 16.19% 1.00
Night shift, 4–26 h 1.84% 2.85% 1.55 2.86% 1.56

Night shift, 26–37 h 2.79% 1.56% 0.56 1.55% 0.56
Night shift, ≥38 h 1.46% 1.97% 1.35 1.97% 1.35
Night shift 6.09% 6.38% 1.05 6.38% 1.05
Irregular shift, 4–26 h 8.26% 9.10% 1.10 9.10% 1.10

Irregular shift, 26–37 h 9.39% 8.36% 0.89 8.36% 0.89
Irregular shift, ≥38 h 11.48% 11.90% 1.04 11.90% 1.04
Irregular shift 29.13% 29.37% 1.01 29.37% 1.01
Other occupation, 4–25 h 7.55% 6.97% 0.92 6.95% 0.92

Other occupation, 26–37 h 6.29% 7.01% 1.11 7.03% 1.12
Other occupation, ≥38 h 13.22% 12.92% 0.98 12.92% 0.98
Other occupation 27.06% 26.90% 0.99 26.90% 0.99
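The Ratio column of Table 5 can be read as the average predicted probability divided by the observed frequency. The short sketch below recomputes it for three rows of Specification (1), together with the absolute gap in percentage points; the helper names `ratio` and `gap_pp` are hypothetical and the input values are copied from the table.

```python
def ratio(predicted_pct, observed_pct):
    """Predicted probability relative to observed frequency, rounded as in Table 5."""
    return round(predicted_pct / observed_pct, 2)

def gap_pp(predicted_pct, observed_pct):
    """Absolute difference in percentage points."""
    return round(abs(predicted_pct - observed_pct), 2)

# (observed %, predicted % under Specification (1)), copied from Table 5
table5_rows = {
    "No employment": (21.55, 21.17),
    "Night shift, 4-26 h": (1.84, 2.85),
    "Night shift, 26-37 h": (2.79, 1.56),
}

for name, (observed, predicted) in table5_rows.items():
    print(f"{name}: ratio = {ratio(predicted, observed)}, "
          f"gap = {gap_pp(predicted, observed)} pp")
```

The night-shift rows show why a ratio far from one can coexist with a small absolute gap: the observed frequencies themselves are below 3%.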
Table 6a
Predicted changes in probability of shift type and expected working hours for changes in income and wages.
Increase in other net Increase in gross wage for all Increase in gross wage for nursing Increase in gross wage for
income by 1% occupations by 1% occupations by 1% non-nursing occupations by 1%
Pred. Std. Err. Pred. Std. Err. Pred. Std. Err. Pred. Std. Err.
Prob (no employment) (in %-points) 0.061 0.012 ** −0.377 0.047 ** −0.223 0.028 ** −0.157 0.023 **
Prob (day shift) (in %-points) 0.008 0.008 0.049 0.013 ** 0.186 0.024 ** −0.137 0.018 **
Prob (night shift) (in %-points) −0.002 0.007 0.011 0.009 0.067 0.011 ** −0.056 0.009 **
Prob (irregular shift) (in %-points) −0.045 0.009 ** 0.060 0.022 ** 0.330 0.042 ** −0.271 0.035 **
Prob (other occupation) (in %-points) −0.022 0.010 * 0.257 0.034 ** −0.360 0.040 ** 0.621 0.069 **
E(hours in day shift|day shift) −0.007 0.003 * 0.081 0.009 ** 0.081 0.009 ** 0 0
E(hours in night shift|night shift) −0.011 0.006 ◦ 0.080 0.012 ** 0.080 0.012 ** 0 0
E(hours in irregular shift|irregular shift) −0.016 0.003 ** 0.069 0.009 ** 0.069 0.009 ** 0 0
E(hours in other occupation|other occupation) −0.010 0.003 ** 0.085 0.011 ** 0 0 0.085 0.011 **
Elasticity (hours in day shift|day shift) −0.021 0.011 * 0.255 0.030 ** 0.255 0.030 ** 0 0
Elasticity (hours in night shift|night shift) −0.039 0.025 0.276 0.044 ** 0.276 0.044 ** 0 0
Elasticity (hours in irregular shift|irregular shift) −0.052 0.011 ** 0.214 0.029 ** 0.214 0.029 ** 0 0
Elasticity (hours in other occupation|other occupation) −0.030 0.010 ** 0.255 0.035 ** 0 0 0.255 0.035 **
B. Hanel et al. / Journal of Health Economics 33 (2014) 94–112

E(hours in nursing occupation|nursing occupation) −0.012 0.003 ** 0.074 0.009 ** 0.074 0.009 ** 0 0
Elasticity (hours in nursing occupation|nursing occupation) −0.038 0.010 ** 0.236 0.029 ** 0.236 0.029 ** 0 0
E(hours in nursing occupation) −0.018 0.005 ** 0.077 0.012 ** 0.227 0.024 ** −0.151 0.018 **
Elasticity (hours in nursing occupation) −0.112 0.038 ** 0.444 0.078 ** 1.374 0.168 ** −0.929 0.126 **
E(total hours) −0.029 0.005 ** 0.187 0.022 ** 0.103 0.013 ** 0.086 0.012 **
Elasticity of total hours −0.121 0.028 ** 0.752 0.105 ** 0.412 0.062 ** 0.347 0.057 **
Notes: **, * and ◦ indicate statistical significance at the 1, 5 and 10 percent level. The standard errors were bootstrapped with 100 draws from the original sample. All predictions are based on the Model presented in Table 3.
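The gap between the conditional and the unconditional elasticity in Table 6a can be illustrated with a first-order decomposition. Since expected nursing hours equal P(nursing) × E(hours | nursing), the unconditional elasticity is approximately the conditional elasticity plus the elasticity of the probability of working in nursing. The sketch below applies this to the nursing-wage column. It uses the observed shares from Table 5 as base probabilities, which is an assumption made here for illustration and not necessarily the base used in the source's simulations.

```python
# Observed shares of the three nursing work arrangements (percent), from Table 5.
base_share = {"day": 16.16, "night": 6.09, "irregular": 29.13}

# Predicted change in probability (percentage points) for a 1% increase in
# nursing wages, from Table 6a, third column.
delta_pp = {"day": 0.186, "night": 0.067, "irregular": 0.330}

# Elasticity of hours in nursing conditional on working in nursing (Table 6a).
conditional_elasticity = 0.236

p_nursing = sum(base_share.values())   # share working in nursing, in percent
d_p = sum(delta_pp.values())           # change in that share, in percentage points

# Percent change in P(nursing) per 1% wage change (extensive margin).
participation_elasticity = 100.0 * d_p / p_nursing

# First-order approximation of the unconditional elasticity.
approx_unconditional = conditional_elasticity + participation_elasticity

print(f"extensive margin: {participation_elasticity:.2f}")
print(f"intensive margin: {conditional_elasticity:.2f}")
print(f"approximate unconditional elasticity: {approx_unconditional:.2f}")
```

Under these assumptions the approximation lands close to the reported unconditional elasticity of 1.374, with most of it coming from the extensive margin.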
Table 6b
Predicted changes in probability of shift type and expected working hours for changes in wages by shift.
Increase in gross wage (day shifts) Increase in gross wage (night Increase in gross wage (irregular
by 1% shifts) by 1% shifts) by 1%
Pred. Std. Err. Pred. Std. Err. Pred. Std. Err.
Prob (no employment) (in %-points) −0.072 0.011 ** −0.027 0.005 ** −0.126 0.018 **
Prob (day shift) (in %-points) 0.354 0.046 ** −0.029 0.004 ** −0.137 0.022 **
Prob (night shift) (in %-points) −0.030 0.004 ** 0.155 0.022 ** −0.057 0.010 **
Prob (irregular shift) (in %-points) −0.142 0.023 ** −0.057 0.009 ** 0.530 0.064 **
Prob (other occupation) (in %-points) −0.110 0.014 ** −0.043 0.006 ** −0.210 0.028 **
E(hours in shift with wage increase|shift with wage increase) 0.081 0.009 ** 0.080 0.012 ** 0.069 0.009 **
Elasticity (hours in shift with wage increase|shift with wage increase) 0.255 0.030 ** 0.276 0.044 ** 0.214 0.029 **
E(hours in nursing occupation|nursing occupation) 0.028 0.006 ** 0.002 0.003 0.046 0.008 **
Elasticity (hours in nursing occupation|nursing occupation) 0.087 0.019 ** 0.006 0.010 0.146 0.026 **
E(hours in shift with wage increase) 0.130 0.018 ** 0.052 0.008 ** 0.196 0.025 **
Elasticity (hours in shift with wage increase) 2.433 0.274 ** 2.589 0.329 ** 1.937 0.231 **
E(hours in nursing occupation) 0.073 0.010 ** 0.023 0.004 ** 0.133 0.017 **

Elasticity (hours in nursing occupation) 0.443 0.065 ** 0.140 0.026 ** 0.802 0.115 **
Notes: **, * and ◦ indicate statistical significance at the 1, 5 and 10 percent level. The standard errors were bootstrapped with 100 draws from the original sample. All predictions are based on the Model presented in Table 3.
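Because the probability changes in each column of Table 6b sum to approximately zero, the increase in the probability of the shift type with the higher wage can be split by origin: inflows from outside nursing (non-employment and other occupations) and reallocation from the other two nursing shift types. The sketch below is one way to read that split off the table; the function name `inflow_shares` is hypothetical and the values are copied from Table 6b.

```python
# Predicted changes in probability (percentage points) per 1% wage increase
# in the named shift type, copied from Table 6b.
table6b = {
    "day": {"none": -0.072, "day": 0.354, "night": -0.030,
            "irregular": -0.142, "other": -0.110},
    "night": {"none": -0.027, "day": -0.029, "night": 0.155,
              "irregular": -0.057, "other": -0.043},
    "irregular": {"none": -0.126, "day": -0.137, "night": -0.057,
                  "irregular": 0.530, "other": -0.210},
}
NURSING_SHIFTS = ("day", "night", "irregular")

def inflow_shares(shift):
    """Return (share from outside nursing, share from other nursing shifts)."""
    column = table6b[shift]
    outside = -(column["none"] + column["other"])
    within = -sum(column[s] for s in NURSING_SHIFTS if s != shift)
    total = outside + within
    return outside / total, within / total

for shift in NURSING_SHIFTS:
    outside, within = inflow_shares(shift)
    print(f"{shift}: {outside:.0%} from outside nursing, "
          f"{within:.0%} from other nursing shifts")
```

On this reading, roughly half of the inflow into day and night shifts comes from outside nursing, while the outside share is somewhat larger for irregular shifts.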
income elasticity of expected working hours (i.e. in nursing jobs and in non-nursing jobs) amounts to −0.12, which is comparable to labour supply elasticities with respect to income that were found in the literature for nurses and for females in general before (see for example Skatun et al., 2005; Phillips, 1995 for nurses; Killingsworth, 1983, pp. 193–199 or Blundell and MaCurdy, 1999, pp. 1649–1651 for the general female population; and Birch, 2005 for women in Australia).

5.2. Labour supply elasticity with respect to wages

The second column of Table 6a shows the adjustment of labour market behaviour to an increase in all hourly gross wages (regardless of the occupation and shift type) by 1%.¹⁵ As one would expect, the probability of being non-employed decreases, while the probability of employment increases in all working arrangements. Moreover, hours supplied given a certain work arrangement increase, with elasticities ranging between 0.21 (irregular shifts in nursing) and 0.28 (regular night shifts in nursing). Labour supply conditional on working in nursing increases by only 0.07 h per week, which translates into a wage elasticity of about 0.24. This is at the lower end of the range of previously found wage elasticities for nurses (Shields, 2004), for women in general in Australia and New Zealand (Birch, 2005; Kalb, 2010), for women in general (Blundell and MaCurdy, 1999) and in an overview by Hotz and Scholz (2003) focussing on lower and middle income earners. However, the unconditional elasticity of working hours in nursing with regard to the gross wage is about 0.44. This is because, in addition to longer working hours, formerly non-employed nursing qualification holders now also enter the occupation.

The same phenomenon is observed for a third scenario, which is the most interesting one from a policy perspective. Column 3 shows estimated labour supply changes if only the wage rates in nursing occupations increase, while other wage rates are held constant. Not only are formerly non-employed nursing qualification holders now predicted to enter the occupation, but also nursing qualification holders who worked in different occupations. While the conditional elasticity that ignores this possibility by construction remains at 0.24, the unconditional elasticity now increases from 0.44 to 1.37. This elasticity is not readily comparable to those in other studies since it allows for entry of nursing qualification holders currently working in other occupations into nursing. Therefore, even if overall hours worked do not change much, a shift from other occupations into nursing may take place, resulting in a high elasticity.

We have checked the sensitivity of this key elasticity to excluding nursing qualification holders who are not working at their preferred hours from the analysis. Although the estimated elasticity for this restricted sample is somewhat smaller at 1.21, it is still substantially larger than previous estimates, leaving the main conclusion from this analysis unchanged.¹⁶

Naturally, the reverse pattern is found when only wages in competing jobs are raised, which leads to a decline in labour supply in nursing. The labour supply elasticity in nursing with respect to non-nursing wages is estimated at around −0.93.

While the labour supply elasticity with respect to wage in the profession is relatively high, there are two reasons to look at this result with some caution. First, the relatively small sample size only allowed us to define a limited number of labour supply points per employment type, which is an approximation of the full set of choices a nursing qualification holder faces. However, comparing our basic model estimated using three labour supply points with one based on four, six, and eight labour supply points, we find that the results are very similar, indicating that the results are not sensitive to increasing the number of points (see Appendix C). Second, there is measurement error in wages, as the exact feasible wage if a nursing qualification holder was to exit or enter the profession is unknown. Both issues lead to imprecise estimates. However, while the magnitude of the point estimate may not be precise, there is an important conclusion that can be drawn. It appears that the number of supplied working hours is fairly unresponsive to wages, yet the decision to enter the occupation, be it from non-employment or from other occupations, is far more responsive. Therefore, if policy makers aim to increase nursing labour supply, nursing qualification holders who currently do not work in nursing appear to be an important target group.

Fig. 3 shows the distribution of labour supply elasticities for the different wage and income changes described before, rather than just the average elasticities as shown in Table 6a. The figure clearly shows the heterogeneity in estimated elasticities across the sample. That is, not all nursing qualification holders respond to wage and income changes to the same extent. We observe that each individual in the data set has a negative labour supply elasticity with respect to net non-labour income, and a positive elasticity with respect to wages, be it nursing wages, non-nursing wages or both. The picture gets somewhat more complicated if we look at the elasticity of labour supply in nursing jobs only: while the vast majority of nursing qualification holders in the dataset decrease their expected hours in nursing when other net income increases, a small proportion will increase their labour supply in nursing. These are individuals who currently work in non-nursing jobs and decide to change the occupation due to an increase in other net income, in most cases into regular day shifts. Likewise, an increase in wages regardless of the occupation, i.e. an overall wage growth in real terms, leads to an increase in labour supply in nursing for most qualification holders as expected, although a small group will exit nursing and enter other occupations in response to such an economic development. If wages are increased in nursing occupations only, an incentive is provided to increase working hours in nursing jobs unambiguously. Of course, the reverse pattern is found if wages grow in competing jobs.

5.3. Labour supply elasticity with respect to wages in certain shift types

To examine whether and to what extent a shortage in nurses’ labour supply in a particular shift type could be addressed by a wage increase in that particular shift type, we carried out three additional simulations: hourly gross wages in a given shift type were increased by 1%, holding constant the wages for all other work patterns and for non-nursing jobs, as well as other household income. The results are presented in the three columns of Table 6b. In each case, as one would expect, the probability of working in the now higher-paid shift type increases, while the probabilities of working in any other work arrangement and of non-employment decrease. Only about half of the increase in the probability of working in the shift type for which the wage was increased is caused by previously non-employed nursing qualification holders and those who worked in other occupations. The other half of the effect stems from changes in the labour supply decision within the current nursing workforce. This result is in line with Di Tommaso et al. (2009), who find that increasing wages lead to substantial

¹⁵ For full-time employed individuals, an increase in gross wages by 1% translates into an average increase in nominal weekly household net income by $14.42 in 2009.
¹⁶ Ikenwilo and Scott (2007) have also found that excluding individuals at sub-optimal hours of work does not change their results much. They therefore used the results from their full sample to avoid potential sample selection problems; an approach that we follow as well.
Fig. 3. Distribution of elasticities.
shifts of labour supply between job types. Conditional on shift type, expected working hours increase by about 0.07–0.08 h per week, which translates into a labour supply elasticity of about 0.21–0.28. The unconditional labour supply elasticity is much higher, at around 2, but standard errors are large. The 90% confidence intervals around the point estimates range between 1.5 and 3.2 across the three shift types. Again, this demonstrates that labour supply in nursing is more responsive at the extensive margin than it is at the intensive margin. The overall responsiveness of labour supply in nursing to a wage change in a specific shift type varies by the proportion of nurses working in each shift type. That is, the smallest effect is observed for a change of the wage in night shifts, which is the smallest shift type (see Table 2a).

5.4. Heterogeneity in nurses’ labour supply elasticities

As described in the introduction of Section 4.2, the utility function estimated in specification (2) varies with personal characteristics, which allows labour supply elasticities to vary by characteristics. The elasticities of labour supply with respect to income and with respect to nurses’ wages respectively are shown in Tables 7 and 8 for six subgroups. Although the standard errors
Table 7
Predicted changes in probability of shift type and expected working hours for increase in other net income by 1%.
Children Bachelor Age

No No Lower 50th percentile
Prob (no employment) (in %-points) 0.049 0.011 ** 0.042 0.016 ** 0.053 0.013 **
Prob (day shift) (in %-points) 0.010 0.009 0.003 0.007 0.015 0.011
Prob (night shift) (in %-points) 0.001 0.008 0.000 0.011 0.003 0.010
Prob (irregular shift) (in %-points) −0.052 0.010 ** −0.041 0.009 ** −0.065 0.013 **
Prob (other occupation) (in %-points) −0.008 0.010 −0.004 0.010 −0.007 0.012
E(hours in nursing occupation) −0.018 0.005 ** −0.015 0.006 ** −0.022 0.006 **

Elasticity (hours in nursing occupation) −0.110 0.042 ** −0.107 0.053 * −0.120 0.047 **
E(hours) −0.023 0.005 ** −0.017 0.007 ** −0.027 0.006 **

Elasticity of working hours −0.096 0.027 ** −0.081 0.038 * −0.105 0.029 **

Yes Yes Upper 50th percentile
Prob (no employment) (in %-points) 0.087 0.023 ** 0.064 0.012 ** 0.051 0.012 **
Prob (day shift) (in %-points) 0.005 0.013 0.017 0.012 0.001 0.007
Prob (night shift) (in %-points) −0.001 0.013 0.003 0.007 −0.003 0.008
Prob (irregular shift) (in %-points) −0.080 0.018 ** −0.070 0.014 ** −0.038 0.009 **
Prob (other occupation) (in %-points) −0.011 0.011 −0.013 0.013 −0.011 0.009
E(hours in nursing occupation) −0.032 0.009 ** −0.025 0.006 ** −0.016 0.004 **

Elasticity (hours in nursing occupation) −0.204 0.073 ** −0.134 0.040 ** −0.119 0.045 **
E(hours) −0.038 0.009 ** −0.033 0.006 ** −0.021 0.005 **

Elasticity of working hours −0.179 0.055 ** −0.129 0.027 ** −0.103 0.031 **
Notes: **, * and ◦ indicate statistical significance at the 1, 5 and 10 percent level. The standard errors were bootstrapped with 100 draws from the original sample. Elasticities
by children, bachelor degree and age are estimated based on Model (2) and the full sample (3974 observations).
are, again, large and thus none of the elasticities are significantly different from each other across groups, some interesting patterns are found. First, considerable difference in responses to increased non-labour income can be seen for nurses with and without children. While nursing qualification holders without children will increase the probability of supplying labour in day shifts along with the probability of quitting work, nursing qualification holders with children are expected to increase their probability of
Table 8
Predicted changes in probability of shift type and expected working hours for 1% increase in gross wages in nursing.

Children Bachelor Age
No No Lower 50th percentile
Coeff. Std. Err. Coeff. Std. Err. Coeff. Std. Err.
Prob (day shift) (in %-points) 0.231 0.033 ** 0.320 0.054 ** 0.195 0.031 **
Prob (night shift) (in %-points) 0.078 0.014 ** 0.063 0.017 ** 0.054 0.012 **
Prob (irregular shift) (in %-points) 0.402 0.047 ** 0.589 0.081 ** 0.377 0.059 **

E(hours) 0.112 0.014 ** 0.148 0.024 ** 0.100 0.016 **

Elasticity of working hours 0.442 0.062 ** 0.546 0.103 ** 0.364 0.064 **

Yes Yes Upper 50th percentile
Coeff. Std. Err. Coeff. Std. Err. Coeff. Std. Err.
Prob (day shift) (in %-points) 0.158 0.049 ** 0.135 0.028 ** 0.268 0.062 **
Prob (night shift) (in %-points) 0.046 0.026 ◦ 0.086 0.017 ** 0.108 0.026 **
Prob (irregular shift) (in %-points) 0.200 0.075 ** 0.191 0.038 ** 0.387 0.055 **

E(hours) 0.100 0.032 ** 0.076 0.014 ** 0.127 0.019 **

Elasticity of working hours 0.425 0.159 ** 0.344 0.073 ** 0.562 0.100 **
Notes: **, * and ◦ indicate statistical significance at the 1, 5 and 10 percent level. The standard errors were bootstrapped with 100 draws from the original sample. Elasticities
by children, bachelor degree and age are estimated based on Model (2) and the full sample (3974 observations).
non-employment by a larger amount. Consequently, the elasticity of working hours in nursing and overall working hours is nearly twice as high for individuals with children compared to their childless counterparts. The reverse is true for the response to increased wages. Childless nursing qualification holders increase their expected working hours in nursing by nearly twice as much as nursing qualification holders with children, and the corresponding labour supply elasticity is 1.70 for those without children, and only 0.89 for those with children. As the total response to a wage increase consists of the sum of the negative income effect and the positive substitution effect, the stronger, negative income elasticity of nursing qualification holders with children contributes to their lower wage responsiveness.

The second columns of Tables 7 and 8 show the income elasticity and wage elasticity by educational level. Highly qualified nurses respond less to wage incentives than lower-qualified nurses, although their responsiveness to other income is comparable. This might be due to higher intrinsic rewards from work for nurses who work in higher positions, but also be due to the higher cost of changing occupation or leaving the labour force for those who have invested more in their nursing-related education. When we examine differences in the wage elasticity, a similar pattern is found for older vs. younger nurses: for older nurses, the probability of being non-employed or being employed in another occupation is much more responsive to a given wage increase than for younger nurses, which is (considering that their responsiveness to other income is comparable) in line with a weaker labour market attachment of the older population.

6. Conclusions

Can wage policy help to maintain an adequate level of nurse staffing? How responsive are nurses to relative wages in nursing versus other occupations in their decision to exit and enter the profession? Our evidence shows nurses’ labour supply elasticity to be higher than previous research suggested. Wage increases thus appear to be a more promising policy to increase labour supply than was implied by previous research. We have estimated a structural model of labour supply for nursing qualification holders that distinguishes explicitly between working in different shift types and in different occupations. The responsiveness of nursing qualification holders with respect to wages in nursing jobs is high at an elasticity of 1.37, and this result is mainly due to our explicit inclusion of the decision to enter and exit the occupation, rather than examining only changes in the number of hours worked of those already working in nursing. Our results thus show that nursing qualification holders who currently do not work in nursing are an important group to target from a policy perspective. At the same time, wage increases for non-nursing jobs are expected to draw individuals out of nursing occupations to almost the same extent as increases in nursing wages might attract nursing qualification holders, who currently work in another occupation, back to nursing. The high responsiveness of the occupational choice implies not only that wage increases in nursing are a promising policy for governments to increase labour supply in the public sector, but also that it is crucial to ensure that wages in nursing do not fall behind the development of wage levels offered in the alternative occupations for nursing qualification holders, as this will reduce labour supply in nursing substantially.

Naturally, nurses’ observed hours of work are not solely determined by the supply side of the nurses’ market. The demand side influences nurses’ observed hours of work as well, e.g. through restricting the number of combinations of shift type and hours worked on offer. We have investigated the sensitivity of results to this by exploring how results would change if only nurses at their optimal (or preferred) labour supply are included in the analysis. It turns out that this would reduce the elasticity from 1.37 to 1.21, which remains a substantial elasticity, still indicating a potential role for wage policy. The majority of nursing qualification holders are at their preferred hours of work (72%), and perhaps those who are not quite at their preferred hours are not far away from them, so that the impact of including those at sub-optimal hours in the analysis is not large. From the observed hours of work and shift type combinations (Table 1), it appears that a broad range of choices are available to employees in this sector, making it more likely that they work at their preferred hours.

A further finding is that changes in wages for certain shift types are predicted to result in considerable shifts in labour supply between shift patterns, which is in line with previous research. If policymakers increase wages for a specific shift type only, they need to be aware that such a policy will lead to a decrease in labour supply in other shift types to a substantial extent, as about half of the labour supply response is caused by the current nursing workforce changing shift type, rather than by attracting nursing qualification holders who are currently not working in nursing, or by increasing the supplied hours of nurses who work in the shift type already. Increased labour in one shift type thus comes at the cost of decreased labour supply in another, if wages are increased only for certain shift patterns. Such a policy may therefore be useful to adjust the distribution across shift types when there is a mal-distribution across shift types, but it is unlikely to provide a solution when there is a shortage in one shift type without oversupply in other shift types.

Comparing wage and income elasticities for different groups of nurses shows that wage elasticities are higher for low qualified, older and childless nursing qualification holders. At the same time, available non-labour income plays a particularly important role for nursing qualification holders with young children. This indicates that wage policies are not going to be equally effective amongst all individuals who hold a nursing qualification. Some groups will be far more responsive than other groups. This needs to be considered in designing wage policies.

Appendix A. Multinomial logit tables

Tables A1 and A2 show the marginal effects of personal and family characteristics on nursing qualification holders’ probability of working in different employment types and shift types.

Appendix B. Checks of model performance

As described in Section 5, the average probability of choosing each of the employment types and working hours bands predicted by the model matches the observed frequencies for each of the choices closely. However, this indicator of overall model performance does not tell us to what extent differences in labour market behaviour are predicted by the model, both between individuals with different characteristics as well as within individuals whose personal characteristics and expected wages vary over time.

To assess how well the model reflects variation in labour market behaviour between individuals, we compare the average predicted probability for choosing a given employment type and working hours band for the sub-sample of individuals who were observed to have chosen that alternative and the sub-sample who were observed to have chosen any other alternative. Table B1 shows the model performance for each specification.
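The between-individual check described in Appendix B can be sketched as follows: for a given alternative, average its predicted probability separately over observations that chose it and over observations that chose something else. The toy predictions and choices below are hypothetical values invented purely for illustration, and the function name `chosen_vs_not` is likewise an assumption; none of them come from the source data.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def chosen_vs_not(pred_probs, choices, alternative):
    """Average predicted probability of `alternative` among observations that
    chose it versus observations that chose any other alternative.

    pred_probs: list of dicts mapping alternative -> predicted probability
    choices:    list of the alternatives actually chosen (same order)
    """
    chosen = [p[alternative] for p, c in zip(pred_probs, choices) if c == alternative]
    others = [p[alternative] for p, c in zip(pred_probs, choices) if c != alternative]
    return mean(chosen), mean(others)

# Hypothetical toy data: four observations, three alternatives.
toy_pred = [
    {"none": 0.6, "nursing": 0.3, "other": 0.1},
    {"none": 0.2, "nursing": 0.6, "other": 0.2},
    {"none": 0.1, "nursing": 0.7, "other": 0.2},
    {"none": 0.3, "nursing": 0.3, "other": 0.4},
]
toy_choices = ["none", "nursing", "nursing", "other"]

avg_chosen, avg_others = chosen_vs_not(toy_pred, toy_choices, "nursing")
print(f"nursing: {avg_chosen:.2f} among choosers, {avg_others:.2f} among others")
```

A model that captures differences between individuals should show a clearly larger average among those who chose the alternative, as in this toy case.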
Table A1
Probability of employment type in nursing: multinomial logit model.
No employment Other occupation Part-time nursing Full-time nursing
Variable Marg. Eff. Std. Err. Marg. Eff. Std. Err. Marg. Eff. Std. Err. Marg. Eff. Std. Err.
Is female −0.014 0.050 −0.016 0.062 0.168 0.046 ** −0.118 0.057 *

Age 0.085 0.015 ** 0.009 0.014 −0.061 0.015 ** −0.025 0.012 *
Holds a bachelor degree −0.099 0.029 ** −0.029 0.035 0.027 0.041 0.165 0.040 **
Family situation
Has partner 0.090 0.035 ** −0.088 0.051 ◦ 0.008 0.056 −0.017 0.039
Partner is employed −0.258 0.038 ** 0.152 0.040 ** 0.095 0.047 * 0.024 0.032
Has child ≤ 4 years 0.053 0.041 −0.036 0.041 0.068 0.040 ◦ −0.085 0.026 **
Marginal effect of child if. . .
No partner −0.015 0.106 0.047 0.094 0.114 0.088 −0.168 0.040 **
Non-employed partner 0.124 0.097 0.023 0.091 −0.126 0.089 0.031 0.092
Employed partner 0.077 0.051 −0.117 0.058 * 0.052 0.055 −0.010 0.060
Partner’s annual net income (in 10,000 AUD) 0.021 0.006 ** −0.013 0.008 ◦ 0.001 0.006 −0.011 0.006 ◦
Marginal effect of partners’ income if. . .
Does not have child ≤4 years 0.019 0.006 ** −0.013 0.008 0.000 0.006 −0.008 0.006
Has child ≤4 years 0.035 0.011 ** −0.025 0.017 0.013 0.015 −0.029 0.015 ◦
Partner holds a bachelor degree −0.032 0.033 0.031 0.044 0.011 0.043 0.004 0.037
Personality variables (0 = lower 50th percentile, 1 = upper 50th percentile)

Extroversion −0.053 0.028 ◦ 0.045 0.035 0.006 0.036 −0.002 0.027
Emotional stability 0.009 0.031 0.050 0.037 −0.024 0.036 −0.036 0.029
Agreeableness −0.015 0.031 −0.020 0.037 0.008 0.037 0.028 0.028
Openness to new experiences 0.061 0.028 * 0.052 0.035 −0.026 0.034 −0.083 0.028 **
Conscientiousness 0.006 0.029 −0.011 0.037 0.002 0.036 −0.003 0.027
Social relationships (0 = lower 50th percentile, 1 = upper 50th percentile)

Satisfaction with relationship to partner 0.028 0.021 0.018 0.030 −0.021 0.028 −0.024 0.024
Feels lack of support when needed 0.014 0.023 −0.012 0.023 0.038 0.027 −0.040 0.020 *
Information on the estimated model:

# person-year observations 3329
Wald-tests of joint significance χ² (dF) p-value χ² (dF) p-value χ² (dF) p-value
Family situation 65.24 (8) 0.00 ** 14.95 (8) 0.06 ◦ 9.80 (8) 0.28
Personality variables 5.99 (5) 0.31 Base outcome 4.53 (5) 0.48 11.05 (5) 0.05 ◦
Social relationships 0.74 (2) 0.69 1.99 (2) 0.37 2.22 (2) 0.33
Notes: **, * and ◦ indicate statistical significance at the 1, 5 and 10 percent level. The standard errors were calculated using the Delta-method.
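The p-values in the Wald-test rows can be checked against the chi-squared distribution with the stated degrees of freedom. The sketch below uses the closed-form upper-tail probability for integer degrees of freedom, so it needs only the standard library. The function name `chi2_sf` is hypothetical, and the assumption that the reported statistics are asymptotically chi-squared is the usual one for Wald tests rather than something the table notes state.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with integer df.

    Uses the recurrence Q(x; n + 2) = Q(x; n) + (x/2)^(n/2) e^(-x/2) / Gamma(n/2 + 1),
    started from df = 2 (even case) or df = 1 (odd case).
    """
    if df % 2 == 0:
        term = math.exp(-x / 2.0)          # Q(x; 2)
        total = term
        for k in range(1, df // 2):
            term *= (x / 2.0) / k
            total += term
        return total
    total = math.erfc(math.sqrt(x / 2.0))  # Q(x; 1)
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return total

# (statistic, dF, reported p-value) copied from the Wald-test rows of Table A1
wald_rows = [(14.95, 8, 0.06), (9.80, 8, 0.28), (5.99, 5, 0.31),
             (4.53, 5, 0.48), (0.74, 2, 0.69), (2.22, 2, 0.33)]

for stat, df, reported in wald_rows:
    print(f"chi2 = {stat}, dF = {df}: p = {chi2_sf(stat, df):.3f} (reported {reported})")
```

The recomputed values agree with the reported p-values up to rounding.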
Specification 1 appears to reflect differences between individuals with respect to choosing non-employment, employment as a nurse or employment in other occupations well: the average predicted probabilities of choosing one of these options are considerably larger for those nursing qualification holders who were observed to have chosen this alternative than for those who were not. However, specification 1 does not reflect differences in the number of hours worked in a given shift type to the same extent. Adding additional personal characteristics improves the model performance considerably. Specification 2 predicts the actually chosen alternatives with a substantially higher probability than specification 1, particularly for frequent choices such as no employment, full-time employment in nursing (day shifts), full-time employment in nursing (irregular shifts) and full-time employment in other occupations.

Besides the model’s performance in predicting variation in labour supply between individuals, one might also be interested in how well the model predicts variation in labour market behaviour within individuals over time. This is of particular interest since the model is estimated as a pooled cross-section. If individuals have an unobserved, time-constant preference for working in a certain employment type or for not working at all, and these unobserved differences are correlated with expected income in each of these alternatives, the estimated parameters of the utility function might reflect unobserved heterogeneity rather than the true utility from income and disutility from work. Although the model includes the family circumstances, this is only a small part of the differences between individuals and thus only contributes to some of the difference in individuals’ preferences. If the estimated utility function only reflects unobserved heterogeneity in time-constant preferences rather than the effect of income and hours of work on utility, it should not be possible to predict changes in an individual’s labour market behaviour over time based on that utility function. However, if the estimated utility function predicts changes in the probability of the outcomes over time that match actually observed changes in choices over time, then the utility function contains at least some information on variation in behaviour over time within individuals.

To investigate the question, we correlate observed changes in choices over time (Z_it) with changes in the predicted probabilities for any given choice over time (P_jt and P_kt). Z_it takes the value one if an individual experiences a transition from any alternative k in period t − 1 to any other alternative j in period t, and zero otherwise:

$$
Z_{it} =
\begin{cases}
1 & Y_{it} = j \wedge Y_{it-1} = k;\ k \neq j \\
0 & Y_{it} = j \wedge Y_{it-1} = k;\ k = j
\end{cases}
$$
Table A2
Probability of shift type in nursing, multinomial logit model, marginal effects.
Variable Day shift Night shift Irregular shift
Marg. Eff. Std. Err. Marg. Eff. Std. Err. Marg. Eff. Std. Err.
Is female 0.028 0.087 0.043 0.050 −0.070 0.088

Age 0.021 0.019 0.024 0.015 −0.045 0.021 *
Holds a bachelor degree 0.015 0.045 −0.111 0.036 ** 0.096 0.047 *
Family situation
Has partner −0.002 0.072 −0.014 0.038 0.016 0.073
Partner is employed −0.001 0.067 −0.051 0.045 0.052 0.063
Has child ≤ 4 years 0.021 0.047 0.065 0.044 −0.086 0.048 ◦
Marginal effect of child if. . .
No partner 0.150 0.149 0.192 0.148 −0.342 0.088 **
Non-employed partner −0.140 0.143 −0.219 0.058 ** 0.359 0.142 **
Employed partner −0.034 0.047 0.037 0.045 −0.004 0.055
Partner’s annual net income (in 10,000 AUD) 0.012 0.007 ◦ 0.001 0.006 −0.013 0.007 ◦
Marginal effect of partners’ income if. . .
Does not have child ≤4 years 0.012 0.007 0.000 0.006 −0.011 0.007
Has child ≤4 years 0.012 0.018 0.014 0.020 −0.026 0.019
Partner holds a bachelor degree 0.017 0.054 −0.001 0.044 −0.016 0.054
Personality variables (0 = lower 50th percentile, 1 = upper 50th percentile)

Extroversion 0.078 0.045 ◦ −0.019 0.034 −0.059 0.046
Emotional stability −0.029 0.044 0.029 0.037 0.000 0.046
Agreeableness 0.055 0.046 0.006 0.034 −0.061 0.047
Openness to new experiences −0.117 0.043 ** −0.006 0.033 0.123 0.044 **
Conscientiousness 0.008 0.044 −0.019 0.036 0.012 0.046
Social relationships
Satisfaction with relationship to partner −0.012 0.039 0.023 0.029 −0.011 0.039
Feels lack of support when needed −0.016 0.032 0.019 0.022 −0.003 0.034
Information on the estimated model:
# observations 1741
Wald-tests of joint significance (Day shift; Night shift: χ² (dF), p-value; Irregular shift: χ² (dF), p-value)
Family situation Base outcome 131.70 (8) 0.00 ** 12.52 (8) 0.13
Personality variables Base outcome 2.94 (5) 0.71 13.88 (5) 0.02 *
Social relationships Base outcome 1.21 (2) 0.55 0.08 (2) 0.96
Notes: **, * and◦ indicate statistical significance at the 1, 5 and 10 percent level. The standard errors were calculated using the Delta-method.
Table B1
Average predicted probabilities for choosing an alternative, by individual’s choice.
Alternative: Average predicted probability of choosing alternative j
Specification (1) Specification (2)
Individuals choose Individuals choose Ratio Individuals choose Individuals choose Ratio
alternative j other alternative alternative j other alternative
No employment 26.76% 19.64% 1.36 29.80% 18.81% 1.58
Day shift, 4–26 h 5.79% 5.31% 1.09 5.99% 5.30% 1.13

Day shift, 26–37 h 4.59% 4.20% 1.09 4.70% 4.19% 1.12
Day shift, ≥38 h 8.17% 6.52% 1.25 9.23% 6.45% 1.43
Day shift 17.75% 15.88% 1.12 18.71% 15.70% 1.19
Night shift, 4–26 h 2.96% 2.85% 1.04 3.24% 2.85% 1.14

Night shift, 26–37 h 1.59% 1.55% 1.03 2.05% 1.54% 1.33
Night shift, ≥38 h 2.17% 1.97% 1.10 2.00% 1.97% 1.02
Night shift 6.66% 6.36% 1.05 7.71% 6.29% 1.23
Irregular shift, 4–26 h 9.32% 9.08% 1.03 9.72% 9.04% 1.08

Irregular shift, 26–37 h 9.20% 8.27% 1.11 10.17% 8.18% 1.24
Irregular shift, ≥38 h 14.68% 11.54% 1.27 18.02% 11.11% 1.62
Irregular shift 31.56% 28.46% 1.11 35.12% 27.00% 1.30
Other occupation, 4–25 h 7.70% 6.91% 1.12 8.40% 6.83% 1.23

Other occupation, 26–37 h 7.83% 6.95% 1.13 8.40% 6.94% 1.21
Other occupation, ≥38 h 17.90% 12.17% 1.47 18.92% 12.00% 1.58
Other occupation 31.14% 25.33% 1.23 33.01% 24.63% 1.34
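The "Ratio" columns of Table B1 appear to be the average predicted probability among individuals who chose alternative j divided by the average among individuals who chose another alternative. The following minimal Python sketch checks that reading against a few rows copied from specification (1); the function and variable names are illustrative and do not come from the paper.

```python
# Average predicted probabilities (in percent) copied from Table B1, specification (1).
# Each entry: alternative -> (mean among those who chose it, mean among those who did not).
table_b1_spec1 = {
    "No employment": (26.76, 19.64),
    "Day shift, >=38 h": (8.17, 6.52),
    "Irregular shift, >=38 h": (14.68, 11.54),
    "Other occupation, >=38 h": (17.90, 12.17),
}


def prediction_ratio(p_chosen: float, p_not_chosen: float) -> float:
    """A ratio above 1 means the model gives more probability to the observed choice."""
    return p_chosen / p_not_chosen


ratios = {alt: round(prediction_ratio(*p), 2) for alt, p in table_b1_spec1.items()}
for alt, ratio in ratios.items():
    print(f"{alt}: {ratio:.2f}")
```

For these four rows the computed values agree with the printed ratios (1.36, 1.25, 1.27 and 1.47).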
Table B2
Correlation between within-variation in behaviour predicted by the model and observed within-variation in behaviour.
                                   corr(Z_it, ΔP_jt)   corr(Z_it, ΔP_kt)
Specification (1)   Corr. Coeff.   0.152               −0.079
                    p-value        0.000               0.000
Specification (2)   Corr. Coeff.   0.120               −0.057
                    p-value        0.000               0.000
Table C1
Unconditional elasticity of hours in nursing occupations.
Elasticity w.r.t.                                                       Number of labour supply points in specification:
                                                                        3 points   4 points   6 points   8 points
Increase in other net income by 1%                         Elasticity   −0.112     −0.114     −0.112     −0.111
                                                           Std. Err.    0.038      0.037      0.039      0.034
Increase in gross wage for nursing occupations by 1%       Elasticity   1.374      1.356      1.365      1.348
                                                           Std. Err.    0.168      0.165      0.161      0.157
Increase in gross wage for non-nursing occupations by 1%   Elasticity   −0.929     −0.913     −0.955     −0.937
                                                           Std. Err.    0.126      0.123      0.127      0.123
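As a reading aid for Table C1, an elasticity gives the approximate percentage change in supplied nursing hours for a 1% change in the respective variable. The sketch below scales the three-point (baseline) estimates to a larger change; this linear scaling is a first-order approximation assumed here for illustration, and the names are hypothetical.

```python
# Point estimates from Table C1, model with 3 labour supply points.
elasticities = {
    "other_net_income": -0.112,
    "nursing_wage": 1.374,
    "non_nursing_wage": -0.929,
}


def approx_hours_change(elasticity: float, pct_change: float) -> float:
    """First-order approximation: % change in hours = elasticity * % change in the variable."""
    return elasticity * pct_change


# Example: approximate effect of a 5% rise in the gross nursing wage on nursing hours.
change = approx_hours_change(elasticities["nursing_wage"], 5.0)
print(f"Approximate change in nursing hours: {change:.2f}%")
```

The approximation ignores the standard errors reported in the table and any non-linearity of the response for larger wage changes.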
where $Y_{it}$ denotes the labour supply choice of individual $i$ in period $t$. $j$ and $k$ are elements of the individual’s choice set $M$ as described in Section 2. $k$ denotes the alternative that an individual chooses in $t-1$, and $j$ is the alternative that the individual chooses in $t$. $\Delta P_{jt}$ and $\Delta P_{kt}$ denote the change in the predicted probability of choosing either alternative $j$ or alternative $k$:

$$
\Delta P_{jt} = \hat{P}(Y_{it} = j) - \hat{P}(Y_{it-1} = j), \quad \text{and} \quad \Delta P_{kt} = \hat{P}(Y_{it} = k) - \hat{P}(Y_{it-1} = k).
$$

We expect that $\mathrm{corr}(Z_{it}, \Delta P_{jt}) > 0$ and $\mathrm{corr}(Z_{it}, \Delta P_{kt}) < 0$. Note that we calculate the correlation of a transition from one alternative to another with the change in predicted probability of choosing the relevant alternatives, rather than with the change in the predicted utility of the alternative. This is because the individual chooses the alternative from choice set $M$ with the highest utility, and thus takes the change in utility in each alternative from choice set $M$ into account. The change in the predicted probability of choosing a given alternative thus also reflects changes in the estimated utilities from all other alternatives in the choice set $M$ relative to the utility in $j$ or $k$.

The sample contains 3276 observations in $t$, for which the individual’s choice in $t-1$ is known as well. For 2030 observations, the individual’s choices in $t$ and $t-1$ are the same. We observe 424 transitions within a given employment type to a higher or lower number of working hours, and 822 transitions into another employment type. Table B2 shows the resulting correlation coefficients. As expected, the correlation with $\Delta P_{jt}$ is positive and the correlation with $\Delta P_{kt}$ is negative for both specifications, and both are statistically highly significant at the 0.1%-level, confirming that the model can indeed predict variation in behaviour within individuals over time, even though within-individual variation over time has not been explicitly taken into account in the estimation.

Appendix C. Robustness checks – variations in the number of labour supply points

We have repeated the analysis with four, six and eight intervals of labour supply (resulting in 17, 25 and 33 discrete choices), to compare the estimated elasticities to the elasticities estimated in the specification with three points as in our baseline model. The unconditional elasticities of supplied hours in nursing with respect to net income, wages in nursing occupations, and wages in non-nursing occupations are presented in Table C1. Full results are available upon request from the authors.
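The within-individual check of Appendix B (correlating the transition indicator with the change in predicted probabilities) can be sketched in Python as follows. This is a minimal illustration on made-up data, not the authors' code: the function name, the array layout, and the treatment of stayers (for whom the alternatives in t − 1 and t coincide, so both changes refer to the same alternative) are assumptions.

```python
import numpy as np


def transition_correlations(choice_prev, choice_now, prob_prev, prob_now):
    """
    choice_prev, choice_now: integer arrays (n,), chosen alternative in t-1 and in t.
    prob_prev, prob_now: arrays (n, n_alternatives), predicted probabilities in t-1 and t.
    Returns the correlation of the transition indicator with the change in the
    predicted probability of (a) the alternative chosen in t and (b) the one chosen in t-1.
    """
    choice_prev = np.asarray(choice_prev)
    choice_now = np.asarray(choice_now)
    d_p = np.asarray(prob_now, dtype=float) - np.asarray(prob_prev, dtype=float)
    rows = np.arange(len(choice_now))
    z = (choice_now != choice_prev).astype(float)  # 1 if the individual switched
    d_p_j = d_p[rows, choice_now]                  # change for the alternative chosen in t
    d_p_k = d_p[rows, choice_prev]                 # change for the alternative chosen in t-1
    corr_j = np.corrcoef(z, d_p_j)[0, 1]
    corr_k = np.corrcoef(z, d_p_k)[0, 1]
    return corr_j, corr_k


# Toy data: six individuals, three alternatives (values are invented).
choice_prev = [0, 0, 1, 2, 1, 2]
choice_now = [1, 0, 1, 0, 1, 2]
prob_prev = [[0.6, 0.3, 0.1], [0.5, 0.3, 0.2], [0.2, 0.6, 0.2],
             [0.2, 0.2, 0.6], [0.3, 0.5, 0.2], [0.1, 0.3, 0.6]]
prob_now = [[0.3, 0.6, 0.1], [0.5, 0.3, 0.2], [0.2, 0.6, 0.2],
            [0.5, 0.2, 0.3], [0.25, 0.55, 0.2], [0.1, 0.35, 0.55]]

corr_j, corr_k = transition_correlations(choice_prev, choice_now, prob_prev, prob_now)
print(f"corr with change for new alternative: {corr_j:.3f}")
print(f"corr with change for old alternative: {corr_k:.3f}")
```

In the toy data, switchers gain predicted probability for the new alternative and lose it for the old one, so the first correlation is positive and the second negative, which is the sign pattern the appendix expects.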

A mass phenomenon: The social evolution of obesity

Holger Strulik ∗
University of Goettingen, Wirtschaftswissenschaftliche Fakultaet, Platz der Goettinger Sieben 3, 37073 Goettingen, Germany
Article history: Received 17 January 2012; received in revised form 12 October 2013
JEL classification: D11, I14, Z13
Keywords: Obesity epidemic; Social dynamics; Income gradient; Feeling fat; Feeling unhealthy
Abstract: This paper proposes a theory for the social evolution of obesity. It considers a society in which individuals experience utility from consumption of food and non-food, the state of their health, and the evaluation of their appearance by others. The theory explains under which conditions poor persons are more prone to be overweight although eating is expensive and it shows how obesity occurs as a social phenomenon such that body mass continues to rise long after the initial cause (e.g. a lower price of food) is gone. The paper investigates the determinants of a steady state at which the median person is overweight and how an originally lean society arrives at such a steady state. Extensions of the theory towards dietary choice and the possibility to exercise in order to lose weight demonstrate robustness of the basic mechanism and provide further interesting results.
1. Introduction

Since about the last quarter of the 20th century we have witnessed an unprecedented change in the phenotype of human beings. In the US, for example, the share of overweight (obese) persons was almost constant at about 45 percent (15 percent) of the population in the years 1960–1980. Since then, the share of overweight adults rose to 64.7 percent in the year 2008 and the share of obese adults rose to 34.3 percent (Ogden and Carroll, 2010). If these trends continue, by 2030, 86 percent are predicted to be overweight and 51 percent to be obese (Wang et al., 2008).1 The phenomenon of increasing waistlines is particularly prevalent in the US but is also observed globally (OECD, 2010; WHO, 2011). The world is getting fat (Popkin, 2009).

Obesity entails substantial health costs. Obese persons are more likely to suffer from diabetes, cardiovascular disease, hypertension, stroke, various types of cancer and many other diseases (Field et al., 2001; Flegal et al., 2005). As a consequence, obese persons not only spend more time and money on health care (Finkelstein et al., 2005; OECD, 2010) but also die earlier. For example, compared to their lean counterparts, 20-year-old US Americans can expect to die about four years earlier when their BMI exceeds 35 and about 13 years earlier when their BMI exceeds 45 (Fontaine et al., 2003). According to one study, obese persons actually incur lower health care costs over their lifetime due to their early death (van Baal et al., 2008).

The simple answer for why people are overweight is that they like to eat more than their body can burn. In the US, for example, 70 percent of the adult population in the year 2000 said that they eat “pretty much whatever they want” (USDA, 2001). Although a fully satisfying answer is certainly more complex, involving biological and psychological mechanisms, perhaps the most striking observation in this context is that overeating seems not to be driven by affluence. At the beginning of the 20th century, when the developed countries were certainly no longer constrained by subsistence income, the English physiologist W.M. Bayliss wrote that “it may be taken for granted that every one is sincerely desirous of avoiding unnecessary consumption of food” (Bayliss, 1917, p. 1). Indeed, caloric intake per person in the US remained roughly constant between 1910 and 1985. But it then rose by 20% between 1985 and 2000 (Putnam et al., 2002; see also Cutler et al., 2003; Bleich et al., 2008).

∗ Tel.: +49 551 39 10614; fax: +49 551 39 8173. E-mail address: holger.strulik@wiwi.uni-goettingen.de

1 Overweight is defined as a body mass index (BMI) above 25 and obesity as a BMI above 30. In this paper we thus apply the inclusive definition of overweight by the WHO (2011), according to which obese persons are also regarded as overweight. Some other studies apply an exclusive definition according to which only persons with BMI between 25 and 30 are regarded as overweight. The BMI is defined as weight in kilograms divided by the square of height in meters.
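The BMI definition and the inclusive classification described in footnote 1 can be written down in a few lines of Python. This is only an illustration of the stated thresholds (above 25 overweight, above 30 obese, with obese persons also counted as overweight); the function names and the example persons are made up.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by squared height in meters."""
    return weight_kg / height_m ** 2


def classify(bmi_value: float) -> dict:
    """Inclusive definition: everyone who is obese is also counted as overweight."""
    return {"overweight": bmi_value > 25, "obese": bmi_value > 30}


# Three hypothetical persons of the same height (1.75 m) and different weight.
people = {"lean": (65.0, 1.75), "overweight": (80.0, 1.75), "obese": (95.0, 1.75)}
results = {name: classify(bmi(w, h)) for name, (w, h) in people.items()}
for name, (w, h) in people.items():
    print(f"{name}: BMI {bmi(w, h):.1f}, {results[name]}")
```

Under the exclusive definition mentioned in the footnote, the overweight flag would instead be restricted to BMI values between 25 and 30.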
114 H. Strulik / Journal of Health Economics 33 (2014) 113–125
Across the population, within countries, the historical association between affluence and body mass actually changed its sign over the 20th century; “where once the rich were fat and the poor were thin, in developed countries these patterns are now reversed.” (Pickett et al., 2005). But while it is true that the severity of overweight and obesity is much stronger for the poor than for the non-poor (Joliffe, 2011), it is also true that persons from all social strata are equally likely to be overweight (in the US) and that the secular increase of overeating and overweight is equally observed among – presumably richer – college graduates and non-college educated persons (Ruhm, 2010). Across countries, obesity and calorie consumption appear to be more prevalent in unequal societies (Pickett et al., 2005).2

2 Many, but not all, empirical studies of the income obesity nexus find it unambiguously negative for all subgroups of society. For example, Lakdawalla and Philipson (2009) document a hump-shaped association of BMI and income for male US American workers but a monotonically negative association for female workers.

The evolving new human phenotype cannot be explained by genetics because it occurred too rapidly (e.g. Philipson and Posner, 2008). It has to be conceptualized as a social phenomenon. With affluence being an unlikely candidate, the question arises what has caused the social evolution of overweight and obesity? The most popular factors suggested in the literature are decreasing food prices, decreasing effective food prices through readily available convenience foods and restaurant supply, and less physical activity on the job and in the household (see e.g. Finkelstein et al., 2005; OECD, 2010). But these explanations entail some unresolved puzzles with respect to the timing of the obesity epidemic.

The most drastic changes of potential causes of obesity occurred well before obesity prevalence became a mass phenomenon. The price of food declined substantially from the early 1970s through the mid 1980s but changed little thereafter, when the obesity epidemic took off. Eating time declined substantially from the late 1960s to the early 1990s, but stabilized thereafter (see Ruhm, 2010). Likewise, the gradual decline in manual labor and the rise of labor saving technologies at home began before the rapid rise in obesity and slowed down afterwards (Finkelstein et al., 2005). This means that calories expended have not decreased much further since the 1980s (Cutler et al., 2003).

From these facts some studies conclude that food prices and caloric expenditure are unlikely to be major contributors to the evolution of obesity because the prevalence of obesity continues to rise after the alleged causes have (almost) disappeared. The present paper proposes an alternative conclusion based on social dynamics. It explicitly considers that one’s appearance is evaluated by others. The social disapproval for displaying an overweight body is continuously but slowly updated by the actual observation of the prevalence of overweight members of society. This view provides (i) a social multiplier that amplifies the “impact effect” of exogenous shocks and (ii) an explanation for why we observe an evolving human phenotype long after the impact effect is gone.

The theory establishes two exclusively existing, stable, and qualitatively distinct social equilibria. At one equilibrium the median person is lean and after an exogenous shock that favors overeating (e.g. lower food prices) social pressure leads society back to the lean equilibrium. This means that, although there are overweight and obese persons in society, obesity is not an evolving social problem. At the other equilibrium the median is overweight and after an exogenous shock that favors overeating, society at large converges towards an equilibrium where people are, on average, heavier than before. The historical evolution of BMI in the US, for example, is conceptualized according to the theory as a stable lean steady state until the 1970s and a transition towards a stable obese steady state afterwards.

The theory explicitly takes into account that preferences and income vary across individuals. Holding income constant it predicts that people with a high preference for food consumption are heavier. Holding preferences constant it predicts that poorer people are heavier, at least if income is sufficiently large and the elasticity of substitution between food and non-food is larger than unity. The reason is that rich persons inevitably consume more (food or non-food) than poor ones. Given non-separable utility, they thus experience higher marginal utility from being lean (or less overweight) and consequently they consume fewer calories. A poor person, in contrast, puts less emphasis on the evaluation of her appearance by others and on the health consequences of being overweight because the scale of consumption (food or non-food) is low. Due to the lower emphasis on weight a larger share of experienced utility results from food consumption, in particular if food prices are low compared to other goods. Since the median is poorer in unequal societies, the theory predicts that, ceteris paribus, unequal societies are more afflicted by the obesity epidemic.

In Section 3 it is shown that the social multiplier produces some perhaps unexpected non-linearities. In particular, an obesity-related health innovation (e.g. beta-blockers, dialysis) can go awry. The impact effect of such an innovation is initially better health for everybody. But the lower health consequences of being overweight induce some people to eat more and put on more weight. This may set in motion a bandwagon effect and convergence towards a new steady state at which society is, on average, not only heavier but also less healthy than before the health innovation.

The basic model fails to capture some further aspects of the obesity epidemic, most importantly the role of energy-density of food and that of physical exercise. Section 4 thus extends the model to account for these factors and shows that all basic results are preserved under mild conditions. It also derives some refinements of the original theory. For example, while richer people continue to be predicted to be, ceteris paribus, less overweight, leaner bodies are no longer necessarily a consequence of eating less. Instead, richer people are predicted to exercise more for weight loss. In a two-diet model, a rising energy density of the less healthy diet is predicted to increase body mass if the diet is sufficiently cheap and its consumer sufficiently poor. If this applies to the median person, society at large is predicted to get heavier due to the social multiplier.

There exists some evidence supporting the basic assumption that being overweight generates less disutility if many others are overweight or obese as well, that is if the prevalence of being overweight in society is high. Blanchflower et al. (2009) find that females across countries are less dissatisfied with their actual weight when it is relatively low compared to average weight. Using the German Socioeconomic Panel they furthermore find that males, controlling for their actual weight, experience higher life satisfaction when their relative weight is lower. In the US, about half of the respondents to the Pew Review (2006) who are classified as overweight according to the official definition characterize their own weight as “just about right”. Etilé (2007) provides similar results for France and argues that social norms and habitual BMI affect ideal BMI, which in turn influences actual BMI. Christakis and Fowler (2007) show how obesity spreads from person to person in a large social network and find that a person’s chances to become overweight increase by 57 percent if he or she has a friend who became obese. Trogdon et al. (2008) find that for US adolescents in 1994–1995 individual BMI was correlated with mean peer BMI and that the probability of being overweight was correlated with the proportion of overweight peers. Comparing different periods of observation from the National Health and Nutrition
Examination Survey, Burke et al. (2009) show that, controlling for a pt . Each individual faces a given income y(i) and thus the budget
host of confounders, self-assessed overweight declines with rising constraint (1).
average BMI and the obesity rate in the reference group (persons
of same age and sex). In contrast, using the methodology of Glaeser y(i) = ct (i) + pt vt (i). (1)
et al. (2003) and Auld (2011) finds only small contemporaneous We allow income to be individual-specific but keep it constant over
social multipliers for BMI at the county and state level in the US time in order to focus on social dynamics.
in 1997–2002. Since the method focusses on contemporaneous Units of food are converted into units of energy by the energy
interaction, it provides indirect support for a dynamic process of a exchange rate such that individual i consumes vt (i) energy
gradually decreasing social disapproval of being overweight. units in period t. For simplicity we assume that the period is long
There exists a large literature of economic theories on obe- enough – say, a month – such that we can safely ignore the specific
sity but social interaction is relatively neglected.3 Some empirical (thermo-) dynamics of how energy consumption relates to energy
studies on obesity and social interaction are motivated with rudi- expenditure and fat cell generation and growth. Instead we assume
mentary models (Etilé, 2007; Blanchflower et al., 2009). Burke that there exists an ideal consumption of energy per period such
and Heiland (2007) propose a model of social dynamics of obe- that any consumption beyond it translates one-to-one into excess
sity which – like the present study – emphasizes the role of a weight. Let parameter (i) denote lean body mass of individual i.
social multiplier in the gradual amplification of obesity prevalence. The overweight status of individual i in period t is then given by
The solution method, however, is purely numerical; no general (2).6
results are derived analytically. Wirl and Feichtinger (2010) pro-
pose a mathematically more involved model in a similar spirit.4 ot (i) = vt − (i)≥0. (2)
Levy (2002) and Dragone and Savorelli (2011) propose dynamic
In order to simplify the algebra we impose the constraint that
theories of body size evolution in which a socially desirable weight
ot (i) ≥ 0. This condition requires that all individuals eat at least as
is exogenous and parametrically given. In contrast, the present
much to fulfill the metabolic needs of a lean body and allows us to
paper assumes a simpler static relation between food consumption
avoid considering explicitly that some people in society may pre-
and body weight and focusses on the influence of an endogenous
fer to be underweight. The energy consumption needed to support
and slowly evolving socially desirable weight on the distribution
the metabolic needs of a lean body, vt (i) = (i), can be conceptual-
of body weight in a society.5
ized as “subsistence needs”. We allow (i) to be individual-specific
The present study tries to prove as many results as possible
because subsistence needs vary with height and occupation (mus-
analytically and explicitly considers idiosyncratic differences in
cle mass and energy exertion at work).
preferences and income. Moreover, extensions of the basic model
In the basic model, in which there exists just one type of food,
demonstrate robustness of results with respect to dietary choice
all individuals face the same energy exchange rate. Section 4 sets
and physical exercise, factors, which have not been addressed in
up an extension with two types of diet for which different prices
this context so far. The present study thus proposes a theory that
and energy exchange rates apply (junk food and healthy food). This
is suitable to explain the socio-economic gradient of obesity and to
allows us to optimize over the selection of a specific diet. It will be
provide a comprehensive understanding of the social evolution of
shown that all results from the basic model, in which individuals
obesity.
can only choose the quantity but not the quality of their food, hold
true in the extended version as well.
2. The model

2.1. Setup of society

Consider a society consisting of a continuum of individuals of fixed height, which is for simplicity normalized to unity, implying that weight equals BMI. Later on, in the numerical part of the paper, the normalization allows for an easy comparison of results with actual data on obesity. Individuals experience utility from food consumption and from consumption of other goods. Consumption of food of individual $i$ in period $t$ is denoted by $v_t(i)$ and other consumption is denoted by $c_t(i)$. The relative price of food is given by $p_t$.

Being overweight causes health costs and social costs, which are both assumed to increase in excess weight. Health costs per unit of excess weight, $\eta$, are treated parametrically, which provides an interesting experiment of comparative statics with respect to medical technological progress. The arrival of beta blockers, for example, can be thought of as a reduction of $\eta$. The social cost of being overweight, $s_t$, in contrast, is explicitly treated as a variable in order to address social dynamics. The presence of health costs and social costs diminishes utility from consumption. The compound of health implications and social disapproval of being overweight multiplied by the individual's actual overweight status, given by $(s_t + \eta)o_t$, captures the negative utility resulting from the actual overweight status $o_t$. To better relate to the sociological literature, the compound can be interpreted as weight-dissatisfaction and as a proxy of the self-perception of being overweight or obese. A similar assumption has been made by Blanchflower et al. (2009).

3 See the extensive discussion in Cutler et al. (2003), and Philipson and Posner (2009). The economic literature on social norms, based on Granovetter (1978) and Bernheim (1994), has already provided important insight into other phenomena, including the growing welfare state (Lindbeck et al., 1999), out-of-wedlock childbearing (Nechyba, 2001), family size (Palivos, 2001), women's labor force participation (Hazan and Maoz, 2002), occupational choice (Mani and Mullin, 2004), contraceptive use (Munshi and Myaux, 2006), work effort (Lindbeck and Nyberg, 2006), cooperation in prisoner's dilemmas (Tabellini, 2008), doping in sports (Strulik, 2012), and education (Strulik, 2013a).

4 The focus of the study by Wirl and Feichtinger (2010) is on general social dynamics and in particular on the possibility of thresholds, i.e. unstable social equilibria that separate multiple locally stable equilibria. The application to body size is thus rather rudimentary and does not include a role for income, non-food consumption, health effects, physical exercising for weight loss, dietary choice, and the energy density of food.

5 An extension of the paper considering body weight as a slowly moving state variable is available as Strulik (2013b).

6 In an accompanying technical paper (Strulik, 2013b) I develop and solve the associated dynamic model, in which body weight evolves as a state variable and consumers maximize utility over an infinite horizon. I show that all the main results developed here hold true there as well. Focusing on the simple model with a one-to-one correspondence of eating and body weight and consumers with a short (one-period) planning horizon thus causes little loss of information but provides a great gain in simplicity. See, for example, Cutler et al. (2003) for a similar simplifying assumption.

Specifically, we assume that the utility of individual $i$ in period $t$ is given by

\[ U_t(i) = [c_t(i) + \beta(i)v_t(i)]^{\alpha} \cdot [1 - (s_t + \eta)o_t(i)]^{1-\alpha}. \tag{3} \]
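As a rough numerical illustration of the utility function (3), the following sketch evaluates utility on a grid of food consumption levels and picks the best one. It assumes that the budget constraint (1) has the form $c = y - p v$ and that the weight constraint (2) has the form $o = \max\{0, \varepsilon v - \mu\}$; the function names and all parameter values are hypothetical and chosen only for illustration, not taken from the paper's calibration.

```python
def utility(v, y, p, beta, alpha, s, eta, eps, mu):
    """Utility (3) for food consumption v, under the assumed constraints."""
    # Assumed budget constraint: non-food consumption c = y - p*v.
    consumption_term = (y - p * v) + beta * v
    # Assumed weight constraint: excess weight o = max(0, eps*v - mu).
    overweight = max(0.0, eps * v - mu)
    penalty_term = 1.0 - (s + eta) * overweight
    # Utility is only defined for positive bracket terms.
    if consumption_term <= 0.0 or penalty_term <= 0.0:
        return float("-inf")
    return consumption_term ** alpha * penalty_term ** (1.0 - alpha)


def best_food_consumption(y, p, beta, alpha, s, eta, eps, mu, steps=5000):
    """Grid search over v in [0, y/p] for the utility-maximizing food level."""
    grid = [(y / p) * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda v: utility(v, y, p, beta, alpha, s, eta, eps, mu))


# Illustrative (hypothetical) parameter values.
PARAMS = dict(y=5.0, p=1.0, beta=2.0, alpha=0.8, s=0.1, eta=0.1, eps=2.0, mu=1.0)
v_star = best_food_consumption(**PARAMS)
o_star = max(0.0, PARAMS["eps"] * v_star - PARAMS["mu"])
print(f"food consumption: {v_star:.3f}, excess weight: {o_star:.3f}")
```

A grid search is used instead of an analytical solution so that the sketch depends only on (3) and the two assumed constraints.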
Here, $\alpha$ measures the relative importance of consumption for utility compared to the consequences of food consumption on health and social disapproval, $0 < \alpha < 1$. In order to arrive at an explicit solution, I have assumed that utility is non-separable and that the elasticity of substitution between food and non-food is infinite (once metabolic subsistence needs are met). Later on I show that most of the results can also be obtained when the elasticity of substitution is finite.7 The parameter $\beta(i)$ measures how pleasurable food consumption is compared to other consumption. It can be thought of as a compound consisting of a common term $\beta$ and an idiosyncratic term $\hat{\beta}(i)$, that is $\beta(i) \equiv \beta \cdot \hat{\beta}(i)$. Whereas $\hat{\beta}(i)$ measures the "sweet tooth" of person $i$, the common component $\beta$ controls for the state of food processing technology, i.e. the general desirability of food (influenced, for example, by flavor enhancing technologies).

Individual self-control problems, although not explicitly modeled, can be thought of as being captured by the idiosyncratic preference parameter $\hat{\beta}(i)$. Persons with a dominant affective system experience more gratification from food consumption (above metabolic needs) and display a larger $\hat{\beta}(i)$ compared to more deliberative persons. A detailed understanding of how psychological mechanisms affect obesity is certainly useful and it has been advanced within an economic framework elsewhere (e.g. Cutler et al., 2003; Philipson and Posner, 2003; Ruhm, 2010). Lumping these aspects together in one compound parameter is only justified by the focus of the present study, which focuses neither on psychological nor on technological aspects but on the interaction of eating behavior and social disapproval of obesity.

2.2. Individual utility maximization

Any individual $i$ is assumed to maximize utility (3) subject to the budget constraint (1) and the weight constraint (2). The first order condition for an interior solution requires that

\[ \alpha(\beta(i) - p_t)\,[y(i) + (\beta(i) - p_t)v_t(i)]^{\alpha-1} \cdot [1 - (s_t + \eta)(\varepsilon v_t(i) - \mu(i))]^{1-\alpha} = \varepsilon(s_t + \eta)(1 - \alpha)\,[y(i) + (\beta(i) - p_t)v_t(i)]^{\alpha} \cdot [1 - (s_t + \eta)(\varepsilon v_t(i) - \mu(i))]^{-\alpha}. \tag{4} \]

Marginal utility from food consumption, at the left hand side of the equation, is required to equal marginal disutility from the consequences of food consumption on being overweight, at the right hand side. A necessary but not sufficient condition for excess food consumption is $\beta(i) > p_t$. To see this, notice that the terms in square brackets in (4) have to be positive for positive and concave utility and recall that $0 < \alpha < 1$. The result is intuitive. Because the price of non-food has been normalized to one, it means that for excess food consumption to occur, food consumption has to provide higher utility than non-food consumption ($\beta(i) > 1$), or the price of food has to be lower than the price of non-food ($p_t < 1$), or both. Otherwise, the non-negativity constraint binds and individuals derive pleasure from eating only until their ideal metabolic needs are fulfilled, $\varepsilon v_t(i) = \mu(i)$. This means that for overweight status to be an observable phenomenon, $\beta(i) > p_t$ has to hold for at least some individuals in society.8

The solution of (4) provides the optimal food consumption for person $i$ in period $t$:

\[ v_t(i) = \frac{\alpha}{\varepsilon(s_t + \eta)} + \frac{\alpha\mu(i)}{\varepsilon} - \frac{(1 - \alpha)y(i)}{\beta(i) - p_t}. \tag{5} \]

Together with the weight constraint (2) it implies that the overweight status of person $i$ is obtained as (6).

\[ o_t(i) = \max\left\{0,\ \frac{\alpha}{s_t + \eta} - \omega(i)\right\}, \qquad \omega(i) \equiv (1 - \alpha)\left[\frac{\varepsilon y(i)}{\beta(i) - p_t} + \mu(i)\right]. \tag{6} \]

2.3. Comparative statics

We next discuss the solution for given prices $p_t$ and social disapproval $s_t$. As shown in (6), excess food consumption of individual $i$ is decreasing in the degree of social disapproval of being overweight $s_t$. For any given $s_t$, inspection of (6) proves the following comparative statics.

Proposition 1. Consider a society defined as a probability distribution of tastes $\beta(j)$ and incomes $y(j)$ for persons $j \in N$. Then, the probability that a person $i$ is overweight is decreasing in her or his personal income $y(i)$, the unhealthiness of being overweight $\eta$, the price of food $p_t$, and the energy exchange rate $\varepsilon$. It is increasing in the personal degree of gratification from food consumption $\beta(i)$ and the weight of consumption in utility $\alpha$.

Proposition 2. The weight of an overweight person $i$ is decreasing in income $y(i)$, the unhealthiness of being overweight $\eta$, the price of food $p_t$, and the energy exchange rate $\varepsilon$. It is increasing in the personal degree of gratification from food consumption $\beta(i)$ and the weight of consumption in utility $\alpha$.

The result with respect to income helps to explain the observed negative socioeconomic gradient in obesity, that is, why richer people are, ceteris paribus, less heavy. For an intuition it is useful to return to the first order condition (4). Higher income allows for a higher level of consumption, $c + \beta v$, be it in terms of food or non-food. A higher level of consumption in turn means lower marginal utility from consumption relative to the marginal disutility experienced from being overweight. It implies that the marginal utility experienced from being less heavy, measured by the right hand side of (4), is higher for richer persons. In simple words, when many consumption needs are fulfilled, health considerations and social approval of one's appearance become relatively more important for individual happiness. Consequently, richer persons are, on average, less heavy. Notice that food is not an inferior good in the conventional sense. According to the subutility function $u(c, v)$ an increase in income would induce more food consumption for $\beta(i) > p$ and leave food consumption unaffected for $\beta(i) \le p$. The concerns about health and social disapproval are responsible for the negative income effect on food consumption.

7 Most of the results from comparative static analysis can be obtained for general utility functions $U_t = u(c_t, v_t) \cdot x(o_t)$ or $U_t = u(c_t, v_t) + x(o_t)$ with $\partial u/\partial c_t > 0$, $\partial^2 u/\partial c_t^2 < 0$, $\partial u/\partial v_t > 0$, $\partial^2 u/\partial v_t^2 < 0$, and $\partial x/\partial o_t < 0$, $\partial^2 x/\partial o_t^2 < 0$, irrespective of whether the subutilities $u$ and $x$ are combined multiplicatively or additively. However, the result that poor persons are more prone to be overweight requires additional assumptions on curvature of the utility function, which are more easily fulfilled in the non-separable case. Notice furthermore that positive utility in (3) requires that $1 > (s_t + \eta)o_t$, which is assumed to hold by appropriate choice of units of measurement.

8 The condition thus requires that for the subutility function $u(c, v) = (c + \beta v)^{\alpha}$ the marginal rate of substitution, $(\partial u/\partial v)/(\partial u/\partial c) = \beta$, exceeds the relative price of food $p$. This is so because the consumer also takes into consideration the health consequences and social disapproval of overweight.

The other comparative static results from Proposition 1, except for the energy exchange rate, are intuitive. The result with respect to the energy exchange rate, at first sight, appears to contradict the
empirical observation that obese people are consuming particularly energy-dense food. Within the present framework, however, this seemingly counterfactual result is consistently explained: a higher energy exchange rate increases the negative consequences of excess food consumption on health and social disapproval, a fact which discourages the incentive to eat a lot. In order to explain the empirical regularity between energy density and obesity the model has to be extended by allowing individuals to choose a particular diet. This will be done in Section 4. The seemingly counterfactual result will be resolved by allowing energy-dense diets to be either cheaper or more pleasurable or both. All other results from the simple model will be preserved.

2.4. Robustness check: arbitrary elasticity of substitution between food and non-food

The assumption of perfect substitution between food and non-food has been made in order to allow for an explicit solution of the problem at hand. But how far is the simplifying assumption driving the results? In order to answer this question we re-consider the original problem, now allowing for an arbitrary elasticity of substitution between food and non-food. The generalized utility function is given by

\[ U_t(i) = \{\theta[c_t(i)]^{\rho} + (1 - \theta)[v_t(i)]^{\rho}\}^{\alpha/\rho} \cdot [1 - (s_t + \eta)o_t(i)]^{1-\alpha}, \]

with $\rho \le 1$. The implied elasticity of substitution between food and non-food is $\sigma = 1/(1 - \rho)$. For $\rho = 1$ we obtain the simple model discussed so far ($\sigma = \infty$).

After inserting the budget constraint into the utility function the maximization problem reads

\[ \max_{v_t(i)} U_t(i) = \{\theta[y(i) - p_t v_t(i)]^{\rho} + (1 - \theta)[v_t(i)]^{\rho}\}^{\alpha/\rho} \cdot [1 - (s_t + \eta)o_t(i)]^{1-\alpha}. \]

The first order condition for optimal food consumption, after applying some algebra, reduces to

\[ 0 = G(v_t(i), \ldots) \equiv \alpha\{\theta[y(i) - p_t v_t(i)]^{\rho-1} \cdot (-p_t) + (1 - \theta)[v_t(i)]^{\rho-1}\} \cdot [1 - (s_t + \eta)(\varepsilon v_t(i) - \mu(i))] - (1 - \alpha)\varepsilon(s_t + \eta)\{\theta[y(i) - p_t v_t(i)]^{\rho} + (1 - \theta)[v_t(i)]^{\rho}\}. \tag{7} \]

Dividing (7) by $\theta$, defining $\beta \equiv (1 - \theta)/\theta$, and setting $\rho = 1$ we get (4), confirming that the simple model is a special case of the general model when $\sigma = \infty$.

Generally, (7) has no explicit solution but the comparative statics can be assessed using the implicit function theorem (see Appendix A):

Proposition 3. Given an arbitrary elasticity of substitution between food and non-food $\sigma$, the body weight of an overweight person is decreasing in the unhealthiness of being overweight $\eta$ and the energy exchange rate $\varepsilon$ and increasing in the weight of consumption in utility $\alpha$. For $\sigma \le 1$ body weight is monotonically increasing in income and for $\sigma = \infty$ body weight is monotonically decreasing in income. For $1 < \sigma < \infty$ body weight is increasing in income if income is sufficiently small and decreasing in income otherwise.

The proposition is the generalized version of Proposition 2 and, because of symmetry, an analogous generalized version can be stated for Proposition 1 (here omitted for brevity). While most of the results generalize towards arbitrary elasticity of substitution, the conclusions with respect to the income-obesity nexus are qualified in an interesting way, suggesting a non-monotonic response of body weight to income for $1 < \sigma < \infty$.9 The result is helpful to explain the long-run historical trends of the association between income and body weight, i.e. why income and body size were positively associated for most of human history, when income was low and average income close to subsistence level, and why income and body size are negatively associated when income is relatively high, as in present-day developed countries. The reason is that, for $1 < \sigma < \infty$, the effect of health and social disapproval becomes dominating only when income is high enough, i.e. when the marginal utility from consumption is low enough. The result may be employed as well to explain why some studies find for contemporaneous societies that body weight is monotonically decreasing in income for some subsamples, for example for women, but inversely u-shaped for other subsamples, for example for men (Lakdawalla and Philipson, 2009). Most importantly the result ensures that, as long as $\sigma > 1$, the simple model is a reasonable approximation of the general model if income is sufficiently large. Having made this qualifying note, we now return to the simple model.

2.5. Social disapproval

Inspired by the observed social attitudes towards obesity (presented in Section 1) we assume that social disapproval of obesity is inversely related to the actual prevalence of obesity in society. The simplest conceivable way to implement this notion is to assume that social disapproval is inversely related to the overweight status of the median person, denoted by $\bar{o}_t$. Henceforth idiosyncratic parameters that apply to the median are identified by "upper bars", i.e. for example, $o_t(i) = \bar{o}_t$ for the median.10

In order to discuss social dynamics explicitly we assume that social disapproval evolves as a lagged endogenous variable depending on the observation of actual obesity in the history of the society. Let $\delta$ denote the rate of oblivion by which the historical prevalence of obesity is depreciated in mind so that disapproval is given by $s_t = (1 - \delta)\sum_{i=0}^{\infty} \delta^i g(\bar{o}_{t-i})$. Alternatively, this can be written in period-by-period notation as $s_t = \delta \cdot s_{t-1} + (1 - \delta) \cdot g(\bar{o}_t)$. Another transformation of this expression writes the change of disapproval as a function of its level, i.e. $s_t - s_{t-1} = (1 - \delta)[g(\bar{o}_t) - s_{t-1}]$, and illustrates that $\delta$ controls the adjustment speed at which social disapproval responds to changes in body weight of the median person. The lower $\delta$, the greater is the influence of the currently observable weight of the median for social disapproval of being overweight. This formulation is reminiscent of the adjustment of external consumption habits (e.g. Carroll et al., 2000). In the present context it means that the median person does not internalize that his or her eating behavior has an impact on the evolution of social disapproval of being overweight.

9 A similar non-monotonic response of body weight to price changes can be obtained for $\sigma < 1$ (here omitted for brevity).

10 The theory is not generally based on the notion that $\bar{o}_t$ is attached to the median person. In principle social disapproval could originate from the weight of any reference individual (or reference group). Taking the median person makes it easier to focus on a society as the population of a country and thus to relate to the empirical studies from Section 1. It is also essential for the numerical exercise on evolving BMI distributions (Section 3.4). In the Conclusion I briefly discuss the possibility of different reference individuals or groups.

11 The results of comparative static analysis below do not depend on the functional specification of $g(\bar{o}_t)$ and can be obtained for general $g > 0$ with $g' < 0$, given that the function supports an equilibrium at which the median is overweight.

Using the simplest conceivable inverse function $g(x) = 1/(\gamma + x)$, social disapproval of being overweight in period $t$ can be expressed as11

\[ s_t = \delta \cdot s_{t-1} + (1 - \delta) \cdot g(\bar{o}_t), \qquad g(\bar{o}_t) \equiv \frac{1}{\gamma + \bar{o}_t}. \tag{8} \]
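The interaction of the median's weight choice (6) with the adjustment rule (8) can be traced with a short simulation. The sketch below is only an illustration under stated assumptions: the compound parameter of the median from (6) is passed in directly as a number (`omega`), the median's weight in a period is computed from the disapproval level inherited from the previous period (a timing assumption made here to keep the recursion explicit), and all parameter values are hypothetical rather than the paper's calibration.

```python
def median_overweight(s, alpha, eta, omega):
    """Overweight of the median from (6), given disapproval s."""
    return max(0.0, alpha / (s + eta) - omega)


def g(o, gamma):
    """Inverse relation between median overweight and disapproval, as in (8)."""
    return 1.0 / (gamma + o)


def simulate(s0, periods, alpha, eta, omega, gamma, delta):
    """Iterate (6) and (8); returns a list of (overweight, disapproval) pairs."""
    s = s0
    path = []
    for _ in range(periods):
        # Timing assumption: weight responds to last period's disapproval.
        o = median_overweight(s, alpha, eta, omega)
        # Update rule (8): partial adjustment towards g(o) at speed 1 - delta.
        s = delta * s + (1.0 - delta) * g(o, gamma)
        path.append((o, s))
    return path


# Illustrative (hypothetical) parameter values.
ALPHA, ETA, OMEGA, GAMMA, DELTA = 0.8, 0.1, 2.2, 5.0, 0.9
# Start from the disapproval level of a society with a lean median, 1/gamma.
path = simulate(1.0 / GAMMA, 400, ALPHA, ETA, OMEGA, GAMMA, DELTA)
first_o, _ = path[0]
last_o, last_s = path[-1]
print(f"median overweight: {first_o:.3f} -> {last_o:.3f}, disapproval -> {last_s:.3f}")
```

With these values the median gains weight over time while disapproval declines, and both settle at a level where disapproval equals $g$ evaluated at the median's overweight.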
The parameter $\gamma > 0$ controls the strength of social norms. The positivity of $\gamma$ ensures that a social equilibrium is feasible at which the median is lean and some members of society are overweight. Without $\gamma$ social disapproval would not be defined when the median is lean (for $\bar{o}_t = 0$). Moreover $\gamma$ works as a control for the strength of social disapproval for being overweight; $1/\gamma$ is the maximum disapproval generated by society, that is the disapproval (per kilogram overweight) that a person experiences when the median person is lean. Notice that a lean median person does not imply that no-one is overweight in society, because individuals are heterogeneous in preferences and income and some individuals, poorer than the median or with a stronger preference for eating, may be overweight even if a lean median implies a high degree of social disapproval for being big.

The important implication of (8) is that overweight individuals experience gradually less social disapproval of their appearance as the median (or reference) person gets bigger. In conjunction with the utility function it means that for any given weight individuals less strongly assess themselves as being overweight. This implication is empirically supported for US citizens by Burke et al. (2009) who compare different periods of observation from the National Health and Nutrition Examination Survey and find that self-assessments of being overweight decline with rising average BMI and the obesity rate in the reference group. A similar observation has been made for the UK by Johnson et al. (2008).

3. The social evolution of obesity

3.1. Steady-state

At the steady state, $p_t = p$, $s_t = s$, and $\bar{o}_t = \bar{o}$ for all $t$ and solving (8) for $s$ provides (9).

\[ s = g(\bar{o}) \equiv \frac{1}{\gamma + \bar{o}}. \tag{9} \]

From (6) we observe excess food consumption of the median person in period $t$ as $\bar{o}_t = \alpha/(s_t + \eta) - \bar{\omega}$, in which the compound parameter $\bar{\omega}$ summarizes the impact of preferences and income of the median, $\bar{\omega} = (1 - \alpha)[\varepsilon\bar{y}/(\bar{\beta} - p_t) - \bar{\mu}]$. Solving for $s_t$:

\[ s_t = \frac{\alpha}{\bar{o}_t + \bar{\omega}} - \eta \equiv h(\bar{o}_t). \tag{10} \]

Diagrammatically, (9) and (10) establish two equations for social disapproval. Eq. (10) holds everywhere, Eq. (9) holds only at the steady state, implying that the steady state fulfills both equations. Equating (9) and (10) and solving for $\bar{o}$ provides (11).

\[ \bar{o} = \bar{o}^* = -\frac{r}{2} + \sqrt{\frac{r^2}{4} - q}, \qquad r \equiv \frac{1 - \alpha + \eta(\bar{\omega} + \gamma)}{\eta} > 0, \qquad q \equiv \frac{\bar{\omega}(1 + \eta\gamma) - \alpha\gamma}{\eta}. \tag{11} \]

From the fact that $r > 0$ it follows that there exists a unique steady state at which the median is overweight iff $q < 0$, that is iff $1/\gamma < (\alpha/\bar{\omega}) - \eta$.

The steady state and its comparative statics can be best analyzed diagrammatically. For that purpose note that both $g(\bar{o})$ and $h(\bar{o})$ are decreasing and convex in $\bar{o}$. The graph of $g(\bar{o})$ originates at $1/\gamma$ and approaches zero as $\bar{o}$ goes to infinity. The graph of $h(\bar{o})$ originates at $\alpha/\bar{\omega} - \eta$ and approaches $-\eta$ as $\bar{o}$ goes to infinity. From (11) we know that there exists either no or one intersection in the positive quadrant, identifying the obesity equilibrium. These two cases are displayed in Fig. 1.

If there exists no intersection of $g(\bar{o})$ and $h(\bar{o})$, as displayed on the left hand side of Fig. 1, there exists no steady state of obesity as a social phenomenon. For any given perturbation resulting in the median person being overweight, social disapproval $s$ is higher than the level needed to sustain this weight as a steady state. Consequently, the median eats less until he or she returns to the corner solution where $\bar{o} = 0$. At the steady state the median person is not overweight. This in turn means that, although there are overweight persons in society at the steady state (for example those poorer than the median or those with a "sweeter tooth"), being overweight is not a mass phenomenon and there exists no obesity epidemic. Any perturbation or any marginal change of parameters would induce adjustment dynamics back to $\bar{o} = 0$. There is no permanent evolution towards larger bodies in society.

The right hand side of Fig. 1 shows the other, more interesting, possibility. Here the $h(\bar{o})$-curve lies above the $g(\bar{o})$-curve for small $\bar{o}$. This means that for any overweight status $\bar{o} < \bar{o}^*$ social disapproval is lower than needed to sustain this weight as a steady state. Consequently, the median person (and thus any overweight person in society) eats more and puts on more weight and social disapproval of being overweight decreases until $\bar{o}$ approaches $\bar{o}^*$. Excess weight above $\bar{o}^*$, though, is not sustainable. The associated disapproval leads to less excess food consumption and less overweight. In other words, the equilibrium at $\bar{o}^*$ is stable. The following proposition summarizes the results.

Proposition 4. There exists a stable steady state at which the median person is overweight and being overweight receives relatively little social disapproval iff

\[ \frac{1}{\gamma} < \frac{\alpha}{\bar{\omega}} - \eta. \tag{12} \]

Otherwise, the median person is not overweight at the steady state and social disapproval of being overweight is relatively high.

Proposition 5. There exists a stable steady state at which the median person is overweight and being overweight receives little social disapproval if individuals care sufficiently little about the consequences of being overweight (if $\alpha$ is sufficiently large), if being overweight entails sufficiently minor consequences on health (if $\eta$ is sufficiently low), if the steady-state price of food is sufficiently low ($p$ is sufficiently low), if the median person likes eating sufficiently strongly (if $\bar{\beta}$ is sufficiently large), and if the median is sufficiently poor ($\bar{y}$ is sufficiently low).

The proof evaluates $\omega(i)$ in (6) for the median and inserts the result into (12), which provides the condition

\[ \frac{1}{\gamma} + \eta < \frac{\alpha}{1 - \alpha} \cdot \frac{\bar{\beta} - p}{\varepsilon\bar{y} - \bar{\mu}(\bar{\beta} - p)}. \tag{13} \]

Inspection of (13) verifies the proposition.

Using Propositions 1 and 2 and inspecting Fig. 1 it is straightforward to derive the comparative statics of the social equilibrium. For that purpose it is helpful to note that the $g(\bar{o})$-curve remains unaffected by value changes of the parameters $\alpha$, $\beta$, $p$, $y$, $\eta$, and $\varepsilon$. The fact that Proposition 2 holds true for any person (and thus in particular for the median person) and at any $s_t$ (and thus in particular at the steady state) implies that comparative statics for these parameters can be obtained simply by observing how they shift the $h(\bar{o})$-curve. Applying Proposition 2 we see that increasing $\alpha$, $\beta$ and decreasing $\eta$, $p$, and $y$ shift the $h(\bar{o})$-curve to the right, in the direction of heavier bodies. This observation proves the following proposition.

Proposition 6. If a social equilibrium of obesity $\bar{o}^*$ exists, then the median person is heavier and the prevalence of being overweight in society is higher if individuals care less about the consequences of eating (if $\alpha$ is larger), if the median has a greater preference for eating (if $\bar{\beta}$ is larger), if health consequences of overeating are smaller (if $\eta$ is smaller), if the price of food $p_t$ is lower, and if the median person is poorer (if $\bar{y}$ is smaller).
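The steady-state relations (9) to (12) can be checked numerically with a few lines of code. The sketch below takes the compound parameter of the median as a given number (`omega`) rather than computing it from income and preferences, and uses hypothetical parameter values chosen only for illustration; the function names are likewise illustrative.

```python
import math


def g(o, gamma):
    """Steady-state disapproval implied by the median's overweight, eq. (9)."""
    return 1.0 / (gamma + o)


def h(o, alpha, eta, omega):
    """Disapproval at which the median chooses overweight o, eq. (10)."""
    return alpha / (o + omega) - eta


def obesity_steady_state(alpha, eta, gamma, omega):
    """Median overweight at the steady state; 0.0 if condition (12) fails."""
    if not (1.0 / gamma < alpha / omega - eta):  # condition (12)
        return 0.0
    r = (1.0 - alpha + eta * (omega + gamma)) / eta
    q = (omega * (1.0 + eta * gamma) - alpha * gamma) / eta
    # Positive root of o**2 + r*o + q = 0, as in (11); q < 0 here.
    return -r / 2.0 + math.sqrt(r * r / 4.0 - q)


# Illustrative (hypothetical) parameter values.
ALPHA, ETA, OMEGA = 0.8, 0.1, 2.2
o_star = obesity_steady_state(ALPHA, ETA, 5.0, OMEGA)   # weak norm: 1/gamma = 0.2
o_lean = obesity_steady_state(ALPHA, ETA, 3.0, OMEGA)   # strong norm: 1/gamma = 1/3
print(f"overweight steady state: {o_star:.4f}, lean case: {o_lean:.1f}")
```

At the overweight steady state the two curves intersect, so `g(o_star, 5.0)` and `h(o_star, ALPHA, ETA, OMEGA)` agree up to rounding, while the stronger norm in the second call violates (12) and leaves the median lean.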
[Figure 1: two panels plotting social disapproval $s$ against the overweight of the median $\bar{o}$. Each panel shows the curve $g(\bar{o})$, which starts at $1/\gamma$, and the curve $h(\bar{o})$, which starts at $\alpha/\bar{\omega} - \eta$ and approaches $-\eta$. Left panel: no intersection. Right panel: intersection at $(\bar{o}^*, s^*)$.]

Fig. 1. Steady-states and dynamics: lean vs. overweight median. Left hand side: no equilibrium with excess food consumption of the median person: for any given overweight $\bar{o}$ of the median person, social disapproval is higher than needed to sustain $\bar{o} > 0$ as a steady state. Right hand side: Stable equilibrium at which overweight of the median $\bar{o}^*$ is supported as a steady state.

The last result provides a rationale for why, apparently, obesity is more prevalent in unequal societies (see Section 1). Controlling for average income, the median is poorer in unequal societies and, due to the mechanism explained above, motivated to eat more. This implies that being overweight attracts less social disapproval and that other members of society are (more severely) overweight as well.

For the comparative static result on $\gamma$, note that the size of $\gamma$ affects only the $g(\bar{o})$-curve but not the $h(\bar{o})$-curve. A higher $\gamma$ shifts the $g(\bar{o})$-curve downwards. This means that, if a social equilibrium of obesity $\bar{o}^*$ exists, the median is heavier and the prevalence of being overweight in society is higher if being overweight is less punished with disapproval by society.

Shifts of parameters that apply to all persons have a two-fold consequence on individual body size. There is a social multiplier at work. We next consider two examples for the multiplier with interesting and perhaps non-obvious results.

3.2. Feeling unhealthy

If medical technological progress (e.g. the arrival of beta blockers, dialysis, coronary stents) reduces the health consequences of being overweight, some persons are motivated to eat more. If the median is among these persons, which is the case when $\bar{o}^*$ exists, there is a social multiplier at work. Formally we can define unhealthiness as the part $\eta \cdot o_t \equiv u(o_t)$ in utility. Evaluating this expression for the median at the steady state and taking the first derivative with respect to $\eta$ we get:

\[ \frac{\partial u}{\partial \eta} = \bar{o} + \eta \cdot \frac{\partial \bar{o}}{\partial \eta}. \tag{14} \]

The first term in (14) identifies the direct effect of the health innovation on health of the median. It is positive. For decreasing $\eta$, representing medical technological progress, this means that the median feels less unhealthy. This fact, however, motivates her (and thus a majority of society) to eat more and to put on more weight. The second term in (14) identifies the negative consequences of increasing body weight on health through the social multiplier. It is negative (recall Proposition 2).

Due to the counteracting forces on the response of unhealthiness, it can happen that the social effect dominates the individual effect such that the median (and other members of society) are less healthy at the new steady state after a positive innovation of health technology. Fig. 2 verifies this claim by way of example. It shows the steady-state value of weight and the experienced unhealthiness by the median for alternative levels of medical technology. Without excess eating the parameterized median would have a lean body mass index of 23. Values for the other parameters are given in Fig. 2. Notice that the abscissa is scaled by $1 - \eta$ such that movements to the right represent improvements of medical technology. Coming from a low level of obesity-related health technology, that is from the left (high $\eta$), a situation which is associated with a mildly overweight median person, the social multiplier causes the median to be heavier and unhealthier at the steady state when $\eta$ decreases. This means that unhealthiness $u(\bar{o})$ is increasing with medical technological progress. Only if the state of medical technology is very high is the $u(\bar{o})$ curve negatively sloped, implying that further improving technology leads to less severe health consequences at the steady state.

[Figure 2: two panels with medical technology ($1 - \eta$) on the horizontal axis. Left panel: BMI. Right panel: feeling unhealthy ($\eta\bar{o}$).]

Fig. 2. Medical technology, obesity, and health of the median person. Evaluated at steady state. Lower values of $\eta$ are associated with a higher level of medical technology, i.e. technology improves from left to right. Body mass index (BMI) is given by $\bar{\mu} + \bar{o}$. Unhealthiness is measured by $u(\bar{o}) = \eta\bar{o}$. Parameters: $p = 1$, $\alpha = 0.8$, $\beta = 2$, $\gamma = 50$, $\varepsilon = 2.5$, $\mu = 23$, $y = 10$.

3.3. Feeling fat

A similar consideration can be made for the impact of social attitudes on self-perception. The impact of social disapproval on the experienced disutility from being overweight is measured by the degree of "feeling fat" $f(o_t) \equiv s_t o_t$. Evaluating the expression for the median at the steady state,

\[ f(\bar{o}) = \frac{\bar{o}}{\gamma + \bar{o}}, \]
and taking the derivative with respect to $\gamma$ we obtain (15).

\[ \frac{\partial f(\bar{o})}{\partial \gamma} = -\frac{1}{(\gamma + \bar{o})^2} \cdot \bar{o} + \frac{\gamma}{(\gamma + \bar{o})^2} \cdot \frac{\partial \bar{o}}{\partial \gamma}. \tag{15} \]

The first term identifies again the direct effect and it is negative. When $\gamma$ rises, individuals experience less social disapproval at any given body size. The median (and other persons in society) are feeling less fat. At a steady state of obesity $\bar{o}^*$, however, this fact motivates eating more and putting on more weight. The negative effect of the social multiplier on disapproval is measured by the second, positive term. Individuals "feel fatter" due to the weight gain.

Again, it can happen that the social effect dominates the direct effect. Another example, shown in Fig. 3, corroborates this claim. It shows the steady-state weight and the experienced utility loss from social disapproval for alternative values of $\gamma$. When disapproval for being overweight is very low ($1/\gamma$ is low), the median person is very fat at the steady state but suffers relatively little from the evaluation of others. Many other persons are obese themselves anyway. At the other extreme, when being overweight is severely punished with disapproval, the median is only mildly overweight and suffers mildly from "feeling fat". At an intermediate degree of social disapproval and an intermediate degree of overweight the suffering from social disapproval is largest. In other words, coming from the right from a situation of high social disapproval of overweight (high $1/\gamma$), less disapproval per unit of overweight leads to heavier persons and actually to more suffering from social disapproval.

[Figure 3: two panels with maximum disapproval ($1/\gamma$) on the horizontal axis. Left panel: BMI. Right panel: feeling fat ($s\bar{o}$).]

Fig. 3. Social disapproval of overweight, obesity, and "feeling fat" of the median person. Evaluated at steady state. Lower values of $1/\gamma$ are associated with lower social disapproval of being overweight. Body mass index (BMI) is given by $\bar{\mu} + \bar{o}$. The degree of "feeling fat" is given by $f(\bar{o}) = \bar{o}/(\gamma + \bar{o})$. Parameters as for Fig. 2 and $\eta = 0.1$.

3.4. BMI distribution

At a steady state of obesity $\bar{o}^*$ any overweight person responds in the same direction as the median to changes of common parameters (recall Proposition 2). Quantitatively, however, individuals can respond quite differently. To see that, take the difference of body weight for any two persons, $i = j, k$, at the steady state. From (5) we get

\[ w(j) - w(k) = \alpha[\mu(j) - \mu(k)] - (1 - \alpha)\varepsilon\left[\frac{y(j)}{\beta(j) - p} - \frac{y(k)}{\beta(k) - p}\right]. \tag{16} \]

The result shows that a change of almost every parameter changes the relative position of individuals in the weight distribution. A comprehensive discussion of the effect of innovations on overweight of all persons would thus require complete specification of a distribution of preferences and incomes.

But inspection of (16) also shows that value changes of $\eta$ and $\gamma$ do not affect the differential $w(j) - w(k)$. Since this is true for any $j$ and $k$, it means that changes of these parameters do not change the variance of any distribution of body weight. The effect of a change of $\eta$ or $\gamma$ on body weight can thus be discussed conveniently not only with respect to the median person but with respect to the distribution of body weight of the whole society.

We exploit this fact with a numerical experiment and begin with the empirical observation that body weight $w$ is approximately log-normally distributed, i.e. the probability density function for body weight is given by

\[ f(w) = \frac{1}{w\sqrt{2\pi\nu}} \cdot e^{-(\log w - x)^2/(2\nu)}. \]

Recall that $x$ and $\nu$ are the mean and the variance of $\log(w)$. The median of the log-normally distributed $w$ is given by $e^x$ and thus $x = \log(\bar{w})$. The variance of $w$ is given by $\mathrm{var}(w) = (e^{\nu} - 1) \cdot e^{2x + \nu}$. The fact that the variance of $w$ does not respond to changes of $\eta$ or $\gamma$ can be utilized by solving $\mathrm{var}(w)$ for the shift-parameter $\nu$:

\[ \nu = \log\left(\frac{1}{2} + \frac{e^{-x}\sqrt{4\,\mathrm{var}(w) + e^{2x}}}{2}\right) = \log\left(\frac{1}{2} + \frac{(1/\bar{w})\sqrt{4\,\mathrm{var}(w) + \bar{w}^2}}{2}\right). \]

The result implies that the shift parameter of the log-normal can be inferred from the weight of the median person. This means that in order to study the impact of changing health consequences $\eta$ or changing social disapproval of being overweight $\gamma$ on the whole distribution of body weights one needs only to know the variance of the initial distribution of body weight and how $\eta$ or $\gamma$ affect body weight of the median person. Yet, the latter is already known from Proposition 2. An evolving body weight distribution can thus be conveniently derived by calibrating the initial variance and then feeding in new values for health technology ($\eta$) or for the strength of social disapproval of being overweight ($\gamma$).

[Figure 4: density of the body weight distribution plotted against BMI (kg/m²), three curves.]

Fig. 4. Obesity distribution for alternative levels of medical technology. Parameters: $p = 1$, $\alpha = 0.8$, $\beta = 2$, = 4, $\mu = 22$, $y = 10.5$, $\nu = 0.03$, and $\eta = 0.1$ (low medical technology, solid), $\eta = 0.075$ (medium medical technology, dashed), $\eta = 0.05$ (advanced medical technology, dash-dotted).

An application of this result is shown in Fig. 4. The black curve in Fig. 4 shows the density function of a log-normal distribution such that it approximates the actually observed density function in 1971-1975 (see Cutler et al., 2003, and Veerman et al., 2007). In particular, we have assumed that $\eta = 0.1$ and adjusted $\nu$ to 0.03 in order to fit the empirical observation. The blue (dashed) line
[Figure 5: three panels plotted against time (0 to 80 quarters) showing food price, social disapproval, and BMI.]
Fig. 5. Social evolution of obesity. The figure displays dynamics after a drop of food prices by 5 percent per quarter in quarters 0–12. Parameters: p = 1, α = 0.9, β = 1.75, = 2, = 4, = 24, δ = 0.9, and = 0.02. Income: y = 10 (median, blue); yp = 8 (red); yr = 12 (green). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
shows the resulting weight distribution after a medical innovation had lowered the consequences of obesity. Specifically, the health-consequence parameter had been reduced by a quarter to 0.075. The BMI distribution shifts to the right and the right-hand tail gets fatter. The result roughly approximates the actual movement documented by Cutler et al. and Veerman et al. The red (dash-dotted) line represents the prediction of the obesity distribution after a further reduction of the same parameter to 0.05.

3.5. Adjustment dynamics: the evolution of obesity

Innovations in medical technology can explain the actual evolution of the BMI distribution only imperfectly. The last decades have seen other, potentially more important changes affecting body size. In particular, a falling relative price of food has been proposed in the literature. Some researchers, however, are puzzled by the observation that the major decrease of food prices occurred in the 1970s and 1980s while body weight continues to grow today (Cutler et al., 2003; Ruhm, 2010). The present model is helpful in resolving the puzzle. With a social multiplier at work it can well be that most of the increase of body size occurs long after food prices declined. In other words, lower food prices initiated the rise of body weight, but it is the social multiplier that developed it further and amplified it such that the phenomenon evolved towards an “obesity epidemic” which affects a majority of society.

We next demonstrate the social evolution of obesity and the power of the social multiplier with a numerical example. For that purpose we set weight of the non-overweight median to 24 kg/m². We think of the period length as one quarter and set δ to a relatively high value of 0.9. This means that past observations of weight status in society play a relatively large role in the determination of social disapproval of being overweight, which is only gradually updated on the basis of currently observable weight status. As a result we observe a high persistence of BMI and a slow evolution of the social evaluation of appearances. Slowly evolving social disapproval is an assumption that helps to rationalize the actually observed evolution of body sizes over the last decades. It is consistent with Auld’s (2011) observation that the contemporaneous social multiplier is small. The full set of parameter values is specified in Fig. 5.

For the initial price of food, p(0) = 1, the system is situated in the lean steady state. Small perturbations and small changes of parameters do not affect the existence of the steady state. Driven by social disapproval the median always returns to lean body mass and the BMI distribution in society is time-invariant. This setup approximates the historical situation in the US and many other developed countries before the 1980s. The experiment shown in Fig. 5 assumes that, starting in such a situation, the price of food drops by 5 percent per quarter for 12 quarters. The solid line in the right panel shows the resulting evolution of median BMI. During the first half of the period of declining food prices, body weight of the median stays constant; the system is still associated with the lean steady state. Only after food prices have been falling long enough does the lean steady state cease to exist; the median person then puts on weight and social disapproval of being overweight declines.

After 12 quarters the price stops declining but the social multiplier continues to operate. The new steady state is not yet reached. In fact, median weight in society increases by more during the period of constant prices than during the period of declining prices. Because social disapproval of being overweight continues to decline, the median (and thus society at large) continues to eat more, which in turn further reduces social disapproval, and so on. An observer unaware of the underlying social dynamics might thus wrongly conclude that falling food prices cannot have caused the obesity epidemic. A similar argument can be made with respect to the preference parameter β, which affects weight status inversely to p (see Eq. (6)). Technological innovations which improved the palatability (flavor enhancers) or availability (convenience food) of food and thereby increased its likeability, measured by an increase of β, can have initiated an obesity evolution which becomes fully visible only long after the innovation took place.

The dashed (green) and dash-dotted (red) lines in Fig. 5 reflect the associated BMI evolution for individuals who are 20 percent richer or poorer, respectively, than the median but otherwise face identical preferences and technologies. While the poor individual reacts immediately to falling food prices, the rich individual keeps a lean body as long as prices are falling (because of the still high social disapproval of a non-lean appearance) and starts putting on weight only after food prices have settled down. One could thus argue that overeating by the rich individual was not motivated by falling prices but by declining social disapproval. This view, however, fails to acknowledge that falling food prices and the triggered median behavior have caused social disapproval to decline sufficiently such that overeating became attractive for the rich individual.
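The calibration step described in Section 3.4 can be checked numerically. The following is a minimal sketch and not the author's code: it writes `sigma2` for the variance of log(w) (a symbol name assumed here), uses purely illustrative input values, and confirms that the closed-form expression recovers `sigma2` from the median and the variance of w.

```python
import math


def lognormal_variance(median, sigma2):
    """Variance of a log-normal w with median e^x and var(log w) = sigma2."""
    x = math.log(median)
    return (math.exp(sigma2) - 1.0) * math.exp(2.0 * x + sigma2)


def shift_parameter(median, var_w):
    """Solve var(w) for sigma2, given the median of w and var(w)."""
    return math.log(0.5 + math.sqrt(4.0 * var_w + median ** 2) / (2.0 * median))


# Illustrative (hypothetical) inputs, not calibrated values from the paper.
median_bmi = 25.0
sigma2 = 0.03

var_w = lognormal_variance(median_bmi, sigma2)
recovered = shift_parameter(median_bmi, var_w)
print(f"var(w) = {var_w:.4f}, recovered sigma2 = {recovered:.6f}")

# With var(w) held fixed, a heavier median implies a smaller shift parameter.
heavier = shift_parameter(27.0, var_w)
print(f"sigma2 at median 27: {heavier:.6f}")
```

Because var(w) is held fixed, only the median weight needs to be updated to obtain the new distribution, which is the shortcut the text relies on.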
4. Extensions

4.1. Physical exercise and weight loss

In this section we investigate the robustness of results when individuals have the possibility to lose weight through physical exercise. An individual i who decides to spend et(i) units of leisure on physical exercise gets rid of et(i) units of body weight, that is ot(i) = vt(i) − (i) − et(i). The parameter controls how effective exercising is with respect to weight loss. The opportunity cost of exercising is that less leisure time is available for other activities. Keeping the multiplicative form of utility in order to allow for an explicit solution we assume that exercising reduces utility by the factor (1 + et(i))^(−(i)). The parameter (i) controls how much the person dislikes physical exercise, 0 < < 1. This formulation contains the simple model from Section 2 as the corner solution when the individual does not exercise (et = 0). In the following we focus, however, on the interior solution. Specifically, utility of person i is rewritten as

ut(i) = [ct(i) + β(i)vt(i)]^α · [1 − (st + )(vt(i) − (i) − et(i))]^(1−α) · [1 + et(i)]^(−(i)). (17)

The first order conditions with respect to food consumption and exercise can be solved for the interior solution (18) and (19). They imply overweight status (20).

vt(i) = {α(β(i) − pt)[1 + (st + )((i) − )] − (1 − α + )(st + )y} / {(β(i) − pt)(s + )(1 − (i))} (18)

et(i) = {(β(i) − pt)[(i) + ((i)(i) − )(st + )] + (i)(st + )y} / {(β(i) − pt)(s + )(1 − (i))} (19)

ot(i) = (β(i) − pt){α[1 + ((i) − )(st + )] − [(i) + ((i) − )(st + )] − (1 − α)(st + )y} / {(β(i) − pt)(s + )(1 − (i))} (20)

The solutions for vt(i) and ot(i) look structurally similar to those of the simple model. But there are also interesting differences. Taking the derivatives with respect to income provides:

∂et(i)/∂yt(i) = (i)/D > 0, ∂vt(i)/∂yt(i) = −(1 − α − (i))/D, ∂ot(i)/∂yt(i) = −(1 − α)/D < 0, D ≡ (β(i) − pt)(1 − ).

Recalling that β(i) > p is necessary for being overweight and observing the sign of the derivatives proves the following proposition.

Proposition 7. Ceteris paribus, individuals with higher income exercise more for weight loss and are less overweight. They eat less if (i) < 1 − α.

The possibility of getting rid of weight through exercising breaks the causal link from food consumption to being overweight. Only if the impact of body size on utility is sufficiently large (1 − α is sufficiently large) do richer people eat less. Otherwise they eat more and work off the weight gain through increased exercising. In any case, however, the original result that richer individuals are, ceteris paribus, less overweight is preserved. In line with the empirical observation the extended model predicts that richer people, on average, exercise more for weight loss (Gidlow et al., 2006).

For social dynamics the impact of st on body weight is of particular interest. Taking the derivatives we obtain:

∂et(i)/∂st = −(i)/D̃ < 0, ∂vt(i)/∂st = −α/D̃ < 0, ∂ot(i)/∂st = −(α − (i))/D̃, D̃ ≡ (st + )^2 (1 − (i)).

As in the simple model, individuals react to increasing social disapproval of being overweight by eating less. Perhaps surprisingly, they also exercise less. This is so because higher disapproval st increases the marginal utility from exercising. In order to equalize marginal utility and marginal cost, individuals reduce et. As a result the response of overweight status to st is generally ambiguous. In order to preserve the mechanism and results from the simple model, we have to assume that the median person likes consuming sufficiently strongly, α > , that is, that he regards consuming as more important than not exercising. This restriction appears to be rather mild.

4.2. Diet selection, energy-density, and obesity

In this section we explore one possible explanation of the positive association between energy-density and obesity. For that purpose we extend the model such that there are two food goods. The unhealthy good, identified by index u, is relatively cheap, energy dense, and potentially tasty (junk food). The second good is relatively expensive, light, and potentially less palatable. Since the expressions become rather long we omit the time index and the index i for idiosyncratic variables whenever this does not lead to confusion. Specifically we assume that

(βu − pu) > (βh − ph), u > h.

Good u is cheaper or more preferable or both compared to good h and its energy exchange rate is higher. Let vu and vh denote consumption of good u and good h. The budget constraint and weight constraint are then given by

y = c + pu vu + ph vh, (21)

o = u vu + h vh − . (22)

Furthermore, we allow consumption of good u to be unhealthy beyond its impact on weight (for example because of a high content of sugar or trans-fats) and measure the health effect by the parameter . Using this fact and (21) and (22), utility (3) can be restated as

U = [y + (βu − pu)vu + (βh − ph)vh]^α · [1 − (s + )(u vu + h vh − ) − vu]^(1−α). (23)

Individuals maximize utility by choosing vu ≥ 0 and vh ≥ 0. The double linearity in (21) and (22) implies that only corner solutions are optimal. Individuals either choose the healthy diet or the
unhealthy diet.12 In Appendix A it is shown that the solution is either vu or vh:

vu = {α(βu − pu) + (s + )[α(βu − pu) − (1 − α)u y] − (1 − α)y} / {(βu − pu)[u(s + ) + ]}, (24)

vh = {α(βh − ph) + (s + )[α(βh − ph) − (1 − α)h y]} / {(βh − ph)h(s + )}. (25)

Inspecting (25) and (5) lets us conclude that the solution for the healthy diet vh is isomorphic to the solution of the simple one-diet model. The interesting case is thus when at least some individuals prefer the unhealthy diet. Their overweight status is then given by ou = u vu − , that is by

ou = {αu(βu − pu) − (1 − α)[u(s + ) + ]u y + αu(s + )(βu − pu)} / {(βu − pu)[u(s + ) + ]} − . (26)

Inspecting the response of being overweight to energy-density provides the following result.

Proposition 8. Consider a person who prefers the unhealthy diet and is overweight. Then, an increase in the energy exchange rate u results in even more weight gain for any given level of social approval s if the unhealthy food is sufficiently cheap (pu sufficiently low) or sufficiently tasty (βu sufficiently large) or if the person is sufficiently poor (y is sufficiently low).

The proof evaluates the first order derivative

∂ou/∂u = {α(βu − pu)[1 + (s + )] − (1 − α)[ + u(s + )]^2 y} / {(βu − pu)[ + u(s + )]^2}

and the second order derivatives

∂²o/(∂u ∂y) = −(1 − α)/(βu − pu) < 0, ∂²o/(∂u ∂(βu − pu)) = (1 − α)/(βu − pu)^2 > 0.

Observing that one can always find a (βu − pu) high enough and a y low enough such that ∂ou/∂u > 0 completes the proof.

For social dynamics and steady states it now matters whether the median person prefers the healthy or the unhealthy diet. Naturally, in case of a healthy diet all results from the simple model carry over to the two-diet model, because the solution for the median is isomorphic. If the median person prefers the unhealthy diet, results are generally ambiguous. The response of being overweight to social disapproval is obtained as

∂ou/∂s = −αu(u − ) / [u(s + ) + ]^2.

Increasing social disapproval of being overweight evokes the normal response of weight loss if the constraint u > holds. This means the energy density of the unhealthy good must be sufficiently high. In this case, as well as generally if the median person picks the healthy diet, all results from the basic model carry over to the two-diet model.

Comparing the dietary choices (24) and (25) shows that one can always find a triple {βu, βh, y} for which the unhealthy diet is strictly preferred. Since consumption under both diets is strictly decreasing in income, body weight in a society which is stratified only by income is distributed as follows. The poorest individuals indulge in the cheap unhealthy diet and are potentially overweight. At some level of income, vu ≥ 0 becomes binding with equality and the richer individuals enjoy the healthy diet. While they are potentially overweight as well, eventually, as income rises further, the metabolic constraint h vh − ≥ 0 becomes binding with equality. The richest individuals, due to the mechanism explained in Section 2, refrain from excess food consumption and are not overweight.

Living on the unhealthy diet, however, does not necessarily make people fatter. To see this, consider a society stratified only by food preferences, βu and βh, and focus on the limiting case in which diet u is not unhealthy aside from its energy density, that is = 0. Holding income y (and lean body size ) constant, computing oh(vh) = h vh − from (25) and subtracting it from (24) provides the body size differential

o(vu) − o(vh) = [(βu − pu)h − (βh − ph)u] / [(βu − pu)(βh − ph)] · (1 − α)y.

The unhealthy eaters are thus only bigger if

(βu − pu)/(βh − ph) > u/h.

That is, only if eating the unhealthy food provides sufficiently great pleasure or if it is sufficiently cheap compared to its energy exchange rate and relative to the healthy good are the unhealthy eaters more overweight.

12 To square the corner solution with reality it is helpful to conceptualize the diets as bundles of foods and h as the on average more healthy diet, which may include an occasional donut.

5. Final remarks

This paper has proposed a theory of the social evolution of being overweight which explains the changing human phenotype since the 1980s. A social multiplier rationalizes why declining food prices or technological innovations could have initiated an obesity epidemic although the most dramatic weight gain is observed long after the initiating innovation is gone. The social multiplier can also explain how obesity-related health innovations may have detrimental steady-state effects on health and why unequal societies are, ceteris paribus, heavier. Within societies the theory explains the socio-economic gradient, i.e. why poorer people are more severely afflicted by obesity although eating is costly. Extensions have shown that the basic mechanism is robust against the consideration of dietary choice and exercising for weight loss. The extensions have furthermore provided theoretical support for the observation that exercising is more popular among richer individuals as well as a condition under which an increasing energy density of food may have caused and/or aggravated the obesity dynamic, namely if the median person indulges in an unhealthy diet and is sufficiently poor.

The theory has been based on the assumption that social disapproval is influenced by the overweight status of the median person. While it seems intuitively reasonable that the weight of the average person is an important determinant, nothing hinges on this assumption. In particular we could have assumed a “role model” less heavy than the median without qualitative impact on results. A more comprehensive assumption would certainly allow the social norm to depend on the whole BMI distribution. This assumption, however, would severely complicate the analysis and has been abandoned in favor of analytically provable results. The median has been imagined (implicitly) as the median of a country since most empirical studies are carried out at the country level or across countries. But in terms of theory, the type of the investigated society is actually undetermined. It is easily conceivable that the theory of obesity evolution applies to smaller societies than countries, that is, at the level of local neighborhoods or among peers at school or at work.
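The body size differential stated in Section 4.2 can be verified with a short numerical check. This is only a sketch under stated assumptions: the names `eps` (energy exchange rate), `mu` (lean body size) and `s_plus` (social disapproval s plus the health-consequence term that accompanies it) are assumed labels for symbols that are not legible in this copy, the one-good solution follows the form of Eq. (25), and all parameter values are hypothetical.

```python
def overweight(alpha, beta, p, eps, s_plus, mu, y):
    """Overweight o = eps * v - mu for a single-good diet, using the form of Eq. (25)."""
    margin = beta - p
    v = (alpha * margin + s_plus * (alpha * margin * mu - (1 - alpha) * eps * y)) / (
        margin * eps * s_plus
    )
    return eps * v - mu


def differential(alpha, margin_u, eps_u, margin_h, eps_h, y):
    """Closed-form o(v_u) - o(v_h) for the limiting case without an extra health effect."""
    return (margin_u * eps_h - margin_h * eps_u) / (margin_u * margin_h) * (1 - alpha) * y


# Hypothetical values satisfying (beta_u - p_u) > (beta_h - p_h) and eps_u > eps_h.
common = dict(alpha=0.9, s_plus=0.2, mu=24.0, y=10.0)
o_u = overweight(beta=3.0, p=1.0, eps=1.5, **common)
o_h = overweight(beta=2.0, p=1.2, eps=1.0, **common)
gap = differential(0.9, 3.0 - 1.0, 1.5, 2.0 - 1.2, 1.0, 10.0)

print(f"o_u = {o_u:.4f}, o_h = {o_h:.4f}, difference = {o_u - o_h:.4f}, closed form = {gap:.4f}")

# Unhealthy eaters are heavier exactly when the margin ratio exceeds the energy ratio.
ratio_condition = (3.0 - 1.0) / (2.0 - 1.2) > 1.5 / 1.0
print(ratio_condition, o_u > o_h)
```

In this example the ratio condition holds, so the unhealthy eater is heavier; lowering the margin of the unhealthy good until the ratio falls below the energy ratio reverses the ranking.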
Appendix A.

A.1. Proof of Proposition 3

The derivative of (7) with respect to food consumption is given by

∂G/∂vt(i) = α( − 1)[(y(i) − pvt(i))^(−2) p^2 + (1 − )vt(i)^(−2)] · [1 − (st + )(vt(i) − (i))] − α(st + )[(y(i) − pvt(i))^(−1) · (−p) + (1 − )vt(i)^(−1)] − (1 − α)(st + )[(y(i) − pvt(i))^(−1) · (−p) + (1 − )vt(i)^(−1)],

which can be rewritten as

∂G/∂vt(i) = α( − 1)[(y(i) − pvt(i))^(−2) p^2 + (1 − )vt(i)^(−2)] · [1 − (st + )(vt(i) − (i))] − [ + α(1 − )] · (st + )[(y(i) − pvt(i))^(−1) · (−p) + (1 − )vt(i)^(−1)] < 0

because ≤ 1 and because [(y(i) − pvt(i))^(−1) · (−p) + (1 − )vt(i)^(−1)] > 0 for a solution of G(vt(i)) = 0 according to (7) to exist.

The signs of the derivatives ∂G/∂α > 0, ∂G/∂ < 0, ∂G/∂ < 0 are obvious from (7). The derivative with respect to income is obtained as

∂G/∂y(i) = −p( − 1)α(y(i) − pvt(i))^(−2) · [1 − (st + )(vt(i) − (i))] − (1 − α)(st + )(y(i) − pvt(i))^(−1).

The derivative is positive for ≤ 0 and negative for = 1. It is ambiguous for 0 < < 1. To see this clearly divide the expression by (y(i) − pvt(i))^(−2) and conclude that for 0 < < 1 the sign of ∂G/∂y(i) is the same as the sign of a − b,

a ≡ p(1 − )α · [1 − (st + )(vt(i) − (i))] > 0, b ≡ (1 − α)(st + )(y(i) − pvt(i)) > 0.

While a is independent of income, b is zero if all income is spent on food, i.e. for y(i) = pvt(i), and otherwise linearly increasing in income. Hence the derivative is positive for low income and negative for high income.

Next use, from the implicit function theorem, dvt(i)/dx = −((∂G/∂x)/(∂G/∂vt(i))), in which x ∈ {α, , , y(i)}, and notice that ot(i) = vt(i) to complete the proof of the proposition.

A.2. Derivation of (24) and (25)

The Kuhn–Tucker conditions for a maximum of (23) can be simplified to

v1 U1 = 0, U1 ≡ α(β1 − p1)[1 − (s + )(1 v1 + 2 v2 − ) − v1] − (1 − α)[(s + )1 + ] · [y + (β1 − p1)v1 + (β2 − p2)v2] (A.1)

v2 U2 = 0, U2 ≡ α(β2 − p2)[1 − (s + )(1 v1 + 2 v2 − ) − v1] − (1 − α)(s + )2 · [y + (β1 − p1)v1 + (β2 − p2)v2]. (A.2)

Suppose that both v1 > 0 and v2 > 0. Solving the Kuhn–Tucker conditions for v1 and v2 provides:

v1 = −N1/D, v2 = N2/D,

N1 ≡ (β1 − p1)[1 + (s + )] + (s + )1 y + y > 0
N2 ≡ (β2 − p1)[1 + (s + )] + (s + )2 y > 0
D ≡ (β2 − p2)[(s + )1 + ] − (β2 − p2)2(s + ).

Since both N1 and N2 are positive, sgn(v1) = −sgn(v2), a contradiction to the initial claim that both v1 and v2 are positive. Thus, either v1 = 0 or v2 = 0. Solving (A.1) for v2 = 0 provides (24) and solving (A.2) for v1 = 0 provides (25).

References

Auld, M.C., 2011. Effect of large-scale social interactions on body weight. Journal of Health Economics 30, 303–316.
Bayliss, W.M., 1917. The Physiology of Food and Economy in Diet. Longman, Green & Co, London.
Bernheim, B.D., 1994. A theory of conformity. Journal of Political Economy 102, 841–877.
Blanchflower, D.G., Oswald, A.J., Van Landeghem, B., 2009. Imitative obesity and relative utility. Journal of the European Economic Association 7, 528–538.
Bleich, S., Cutler, D., Murray, C., Adams, A., 2008. Why is the developed world obese? Annual Review of Public Health 29, 273–295.
Burke, M.A., Heiland, F., 2007. Social dynamics of obesity. Economic Inquiry 45, 571–591.
Burke, M.A., Heiland, F.W., Nadler, C.M., 2009. From “overweight” to “about right”: evidence of a generational shift in body weight norms. Obesity 18, 1226–1234.
Carroll, C.D., Overland, J., Weil, D.N., 2000. Savings and growth with habit formation. American Economic Review 90, 341–355.
Christakis, N.A., Fowler, J.H., 2007. The spread of obesity in a large social network over 32 years. New England Journal of Medicine 357, 370–379.
Cutler, D., Glaeser, E.L., Shapiro, J.M., 2003. Why have Americans become more obese? Journal of Economic Perspectives 17 (Summer), 93–118.
Dragone, D., Savorelli, L., 2011. Thinness and obesity: a model of food consumption, health concerns, and social pressure. Journal of Health Economics 31, 243–256.
Etilé, F., 2007. Social norms ideal body weight and food attitudes. Health Economics 16, 945–966.
Field, A.E., Coakley, E.H., Must, A., Spadano, J.L., Laird, N., Dietz, W.H., Rimm, E., Colditz, G.A., 2001. Impact of overweight on the risk of developing common chronic diseases during a 10-year period. Archives of Internal Medicine 161, 1581–1586.
Finkelstein, E.A., Ruhm, C.J., Kosa, K.M., 2005. Economic causes and consequences of obesity. Annual Review of Public Health 26, 239–257.
Flegal, K.M., Graubard, B.I., Williamson, D.F., Gail, M.H., 2005. Excess deaths associated with underweight, overweight, and obesity. Journal of the American Medical Association 293, 1861–1867.
Fontaine, K.R., Redden, D.T., Wang, C., Westfall, A.O., Allison, D.B., 2003. Years of life lost due to obesity. Journal of the American Medical Association 289, 187–193.
Gidlow, C., Johnston, L.H., Crone, D., Ellis, N., James, D., 2006. A systematic review of the relationship between socio-economic position and physical activity. Health Education Journal 65, 338–367.
Glaeser, E.L., Sacerdote, B., Scheinkman, J.A., 2003. The social multiplier. Journal of the European Economic Association 1, 345–353.
Granovetter, M., 1978. Threshold models of collective behavior. American Journal of Sociology 83, 1420–1443.
Hazan, M., Maoz, Y., 2002. Women’s labor force participation and the dynamics of tradition. Economics Letters 75, 193–198.
Johnson, F., Cooke, L., Croker, H., Wardle, J., 2008. Changing perceptions of weight in Great Britain: comparison of two population surveys. British Medical Journal 337, a494.
Joliffe, D., 2011. Overweight and poor? On the relationship between income and the body mass index. Economics and Human Biology 9, 342–355.
Lakdawalla, D., Philipson, T., 2009. The growth of obesity and technological change. Economics and Human Biology 7, 283–293.
Levy, A., 2002. Rational eating, can it lead to overweightness or underweightness? Journal of Health Economics 21, 887–899.
Lindbeck, A., Nyberg, S., Weibull, W., 1999. Social norms and economic incentives in the welfare state. Quarterly Journal of Economics 114, 1–35.
Lindbeck, A., Nyberg, S., 2006. Raising children to work hard: altruism work, norms, and social insurance. Quarterly Journal of Economics 121, 1473–1503.
Mani, A., Mullin, C.H., 2004. Choosing the right pond: social approval and occupational choice. Journal of Labor Economics 22, 835–861.
Munshi, K., Myaux, J., 2006. Social norms and the fertility transition. Journal of Development Economics 80, 1–38.
Nechyba, T.J., 2001. Social approval, values, and AFDC: a reexamination of the illegitimacy debate. Journal of Political Economy 109, 637–672.
Ogden, C.-L., Carroll, M.D., 2010. Prevalence of overweight, obesity, and extreme obesity among adults: United States, Trends 1960–1962 Through 2007–2008. M.S.P.H., Division of Health and Nutrition Examination Surveys.
OECD, 2010. Obesity and the Economics of Prevention: Fit Not Fat. OECD, Paris.
Palivos, T., 2001. Social norms, fertility and economic development. Journal of Economic Dynamics and Control 25, 1919–1934.
Pew Review, 2006. Americans See Weight Problems Everywhere But in the Mirror. Pew Research Center.
Philipson, T.J., Posner, R.A., 2003. The long-run growth in obesity as a function of technological change. Perspectives in Biology and Medicine 46, S87–S107, Summer.
Philipson, T.J., Posner, R.A., 2008. Is the obesity epidemic a public health problem? A decade of research on the economics of obesity (NBER Working Paper 2008).
Philipson, T.J., Posner, R.A., 2009. The growth of obesity and technological change. Economics and Human Biology 7, 283–293.
Pickett, K.E., Kelly, S., Brunner, E., Lobstein, T., Wilkinson, R.G., 2005. Wider income gaps, wider waistbands? An ecological study of obesity and income inequality. Journal of Epidemiology and Community Health 59, 670–674.
Popkin, B., 2009. The World Is Fat: The Fads, Trends, Policies, and Products That Are Fattening the Human Race. Avery Publication Group.
Putnam, J., Allshouse, J., Kantor, L.S., 2002. US per capita food supply trends: more calories, refined carbohydrates, and fats. Food Review 25 (3), 2–15.
Ruhm, C.J., 2010. Understanding overeating and obesity (NBER Working Paper, 16149).
Strulik, H., 2012. Riding high – success in sports and the rise of doping cultures. Scandinavian Journal of Economics 114, 539–574.
Strulik, H., 2013a. School attendance and child labor – a model of collective behavior. Journal of the European Economic Association 11, 246–277.
Strulik, H., 2013b. The Social Evolution of Obesity when Body Weight is a State Variable. University of Goettingen (Discussion Paper).
Tabellini, G., 2008. The scope of cooperation: values and incentives. Quarterly Journal of Economics 123, 905–950.
Trogdon, J.G., Nonnemaker, J., Pais, J., 2008. Peer effects in adolescent overweight. Journal of Health Economics 27, 1388–1399.
USDA, 2001. Profiling Food Consumption in America. In: United States Department of Agriculture’s Agriculture Fact Book: 2001–2002. http://www.usda.gov/factbook/
van Baal, P.H.M., Polder, J.J., De Wit, G.A., Hoogenveen, R.T., Feenstra, T.L., Boshuizen, H.C., Engelfriet, P.M., Brouwer, W.B.F., 2008. Lifetime medical costs of obesity: prevention no cure for increasing health expenditure. PLoS Medicine 5, 242–249.
Veerman, J.L., Barendregt, J.J., Van Beeck, F., Seidell, J.C., Mackenbach, J.P., 2007. Stemming the obesity epidemic: a tantalizing prospect. Obesity 15, 2365–2370.
Wang, Y., Beydoun, M.A., Liang, L., Caballero, B., Kumanyika, S.K., 2008. Will all Americans become overweight or obese? Estimating the progression and cost of the US obesity epidemic. Obesity 16, 2323–2330.
Wirl, F., Feichtinger, G., 2010. Modelling social dynamics (of obesity) and thresholds. Games 4, 395–414.
WHO, 2011. Obesity and overweight, Factsheet No. 311. http://www.who.int/mediacentre/factsheets/fs311/en/

Peer effects on risky behaviors: New evidence from college roommate assignments

Daniel Eisenberg a,∗, Ezra Golberstein b,1, Janis L. Whitlock c,2

a Department of Health Management and Policy, University of Michigan, 1415 Washington Heights, School of Public Health, Ann Arbor, MI 48109-2029, United States
b Division of Health Policy and Management, University of Minnesota School of Public Health and Minnesota Population Center, 420 Delaware Street SE, MMC 729, Minneapolis, MN 55455, United States
c Bronfenbrenner Center for Translational Research and Department of Human Development, Cornell University, Beebe Hall, Ithaca, NY 14853, United States

Article history: Received 2 May 2012; Received in revised form 30 September 2013

JEL classification: I10, I12, D830, Z13

Keywords: Peer effects; Risky behaviors; Substance use; Alcohol

Abstract

Social scientists continue to devote considerable attention to spillover effects for risky behaviors because of the important policy implications and the persistent challenges in identifying unbiased causal effects. We use the natural experiment of assigned college roommates to estimate peer effects for several measures of health risks: binge drinking, smoking, illicit drug use, gambling, having multiple sex partners, suicidal ideation, and non-suicidal self-injury. We find significant peer effects for binge drinking but little evidence of effects for other outcomes, although there is tentative evidence that peer effects for smoking may be positive among men and negative among women. In contrast to prior research, the peer effects for binge drinking are significant for all subgroups defined by sex and prior drinking status. We also find that pre-existing risky behaviors predict the closeness of friendships, which underscores the significance of addressing selection biases in studies of peer effects.

© 2013 Elsevier B.V. All rights reserved.
∗ Corresponding author. Tel.: +1 734 615 7764; fax: +1 734 764 4338.
E-mail addresses: daneis@umich.edu (D. Eisenberg), egolberstein@umn.edu (E. Golberstein), jlw43@cornell.edu (J.L. Whitlock).
1 Tel.: +1 612 626 2572.
2 Tel.: +1 607 254 2894.

1. Introduction

The spread of substance use and other risky behaviors in social networks is important to understand in order to inform health and social policy. Information about spillover effects can improve predictions about the dynamics of behaviors in populations and assist the design of interventions that mitigate harmful spillovers or leverage beneficial spillovers. Behaviors such as heavy alcohol consumption and other substance use have substantial impacts on health, functioning, and educational outcomes (Rice, 1999; Carpenter and Dobkin, 2011; Carrell et al., 2011).

Economists and other social scientists continue to devote considerable attention to measuring spillover effects for risky behaviors not only because of the important policy implications but also because of the challenges in identifying unbiased causal effects. As Manski (1993) and others have described, there are three main factors that may bias estimates of social interaction effects: (1) the reflection problem, in which the effect of others on the self cannot be disentangled from the reverse; (2) selection into social networks, which may lead to correlations in unmeasured individual characteristics and generate spurious correlations in outcomes; and (3) unmeasured contextual factors, or “common shocks,” which may also generate spurious correlations in outcomes. In addition, from a policy perspective it is useful to distinguish between “endogenous” peer effects that imply multiplier effects (behavior A by one person is directly influenced by behavior A by another person), versus “contextual” peer effects that imply causal but not necessarily multiplier effects (behavior A by one person is influenced by being around another person engaging in behavior A, but the causal mechanism is through other peer characteristics correlated with behavior A).

In this study we use the natural experiment of assigned college roommates to estimate peer effects for substance use and
D. Eisenberg et al. / Journal of Health Economics 33 (2014) 126–138 127
other risky behaviors. This empirical approach addresses the identification issues noted above and thus yields unbiased estimates. The approach has been used in previous studies mainly to look at academic outcomes, particularly grade point average (GPA), using administrative data from colleges and universities. For this study we collected new survey data to examine a range of behaviors with important implications for health and wellbeing: binge drinking, cigarette smoking, illicit drug use, gambling, sexual activity, suicidal ideation, and non-suicidal self-injury.

We find significant peer effects for binge drinking but little evidence of effects for the other outcomes. The effects for binge drinking are robust to controlling for a range of additional peer characteristics, suggesting that these are true spillover effects (“endogenous” rather than “exogenous” effects, in Manski’s terminology), although we cannot rule out the possibility that the effects are driven at least in part by unmeasured roommate characteristics. The magnitude of the effects (an 8.6 percentage point increase in the probability of binge drinking, as a result of having a binge-drinking roommate) is somewhat smaller than in most previous studies of peer effects. As compared to a previous study based on college roommate assignments, which finds significant peer effects on binge drinking only for men with prior binge drinking (Duncan et al., 2005), we find more widespread effects: for both women and men, and for both prior binge drinkers and prior non-binge drinkers. We also examine the closeness of roommates’ relationships, as reported in the follow-up survey. This analysis indicates that similarity in pre-existing behaviors predicts closeness of relationships, for the most part. Also, roommates who end up being close friends exhibit stronger apparent peer effects on binge drinking; this differential is less robust, however, when we look at predicted friendship levels based on baseline measures, rather than the endogenous actual friendship levels.

2. Background and prior research

2.1. Conceptual discussion

The discussion of social interaction effects by Glaeser and Scheinkman (2001) offers a useful starting point for considering how peers might influence each other’s behaviors. They describe various mechanisms that could produce such effects, including what they term learning, stigma, and taste-related interactions. Learning about risky behaviors from peers may take place through direct communication as well as observation. The new information may in turn cause changes in the net price of the behavior (e.g., by lowering the search costs) and in the perceived benefits and costs (e.g., by demonstrating positive and negative consequences of the behavior). Whereas learning refers to effects on information, stigma and taste-related interactions refer to effects on preferences. Stigma-related interactions include situations in which one’s opinion about the desirability of a behavior is influenced by observing other people doing that behavior and one’s opinions or feelings toward those people. For example, if a student observes that her roommate uses marijuana and the student likes or respects the roommate, this may lower the student’s stigmatizing attitudes about drug use. Taste-related interactions refer to a more direct influence on preferences: one may simply have a desire for conformity or imitation, such that observing someone else’s behavior raises the desirability of doing the same (Cutler and Glaeser, 2007).

These mechanisms suggest that peer effects on risky behaviors may go in either direction. For example, peers’ behaviors may reduce one’s behavior if the learning from peers highlights adverse consequences, whereas peers’ behaviors may increase the behavior if the learning highlights positive consequences. Also, the possible mechanisms imply that, other things equal, peer effects should be larger (in either direction) for behaviors that are more likely to be observed or discussed among peers, as awareness of peer behaviors is a necessary precursor to learning or preference effects. This suggests that, among the behaviors we examine, binge drinking is more likely to exhibit large peer effects, because in college settings drinking frequently takes place in social contexts with many peers (Beck et al., 2008). Also, heavy drinking has relatively low stigma in college-age populations (and on the contrary, is often considered a positive marker of social status), suggesting that it is likely to be openly discussed among students (Neighbors et al., 2007). Another reason peer effects might be especially strong for alcohol, as well as drug use, is that peers may have more influence on search costs for goods that cannot be legally purchased.

Young people may also experience peer effects differently depending on their gender and their previous risky behaviors. For instance, males and females differ somewhat in their exposures and responses to social pressures during adolescence and young adulthood, and they also differ more generally in their developmental processes with respect to risky behaviors (Byrnes et al., 1999). Young people with previous risky behaviors may be more influenced by peer effects, if they are more likely to be near the margin in their propensity to engage in a behavior or not. On the other hand, people with previous risky behaviors may be less influenced by peer effects, if they have more solidly formed preferences and information about the behaviors.

2.2. Prior empirical evidence

A number of studies in the recent economics literature estimate peer effects on substance use among adolescents in secondary schools, using various combinations of instrumental variables and fixed effects (Gaviria and Raphael, 2001; Powell et al., 2005; Lundborg, 2006; Clark and Loheac, 2007; Fletcher, 2010, 2012). All of these studies find significant peer effects for the behaviors under examination, and in most cases the estimates imply fairly large effects.³ These approaches improve upon prior studies in terms of addressing the key identification issues noted earlier, but their validity still depends on some untestable assumptions about the lack of correlations in unobserved variables among peers.⁴ A recent study by Card and Giuliano (2013) addresses this issue by specifying structural assumptions about selection into friendships and carefully examining the robustness of these assumptions, finding significant peer effects for sexual behavior, marijuana use, cigarette smoking, and truancy.

The most similar study to ours is that by Duncan et al. (2005), the only published study using college roommate assignments to estimate peer effects on risky behaviors. Their sample includes

³ For example, these studies find that for each percentage point increase in peers engaging in a behavior, individuals have the following percentage point increases in the likelihood of the behavior: (1) Gaviria and Raphael (2001): 0.35 (SE = 0.13) for drinking, 0.32 (0.08) for drug use, 0.16 (0.12) for smoking. (2) Powell et al. (2005): 0.58 (0.10) for smoking. (3) Lundborg (2006): 0.23 (0.08) for binge drinking, 0.17 (0.05) for smoking, 0.07 (0.02) for drug use. (4) Clark and Loheac (2007): mostly significant effects for smoking, drinking, marijuana, in the range of 0.10–0.20. (5) Fletcher (2010): 0.35 (0.17) for smoking. (6) Fletcher (2012): 0.57 (0.24) for binge drinking.
⁴ Another noteworthy contribution to this literature has been the evidence from a community-level adult sample in the Framingham Heart Study on social interaction effects for cigarette smoking (Christakis and Fowler, 2008) and alcohol use (Rosenquist et al., 2010). Although economists have questioned whether these studies sufficiently address the key identification issues (Cohen-Cole and Fletcher, 2008), the studies have garnered significant media attention (Kolata, 2008) and have enlivened interest in spillover effects in the fields of medicine, public health, and beyond. Also, an important strength of these studies is the rich information on social networks, including neighbors, family members, and friends.
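As a reading aid for the estimates quoted in footnote 3, the following minimal sketch converts each slope into the implied change for a 10-point rise in peer prevalence and attaches an approximate 95% interval. The normal approximation (the constant `Z_95`), the 10-point shift, and the function name `summarize` are assumptions made here for illustration; the cited studies may assess significance under different specifications.

```python
# Point estimates and standard errors as quoted in footnote 3
# (percentage-point change in own behavior per 1-point change in peer prevalence).
ESTIMATES = {
    ("Gaviria and Raphael (2001)", "drinking"): (0.35, 0.13),
    ("Gaviria and Raphael (2001)", "drug use"): (0.32, 0.08),
    ("Gaviria and Raphael (2001)", "smoking"): (0.16, 0.12),
    ("Powell et al. (2005)", "smoking"): (0.58, 0.10),
    ("Lundborg (2006)", "binge drinking"): (0.23, 0.08),
    ("Lundborg (2006)", "smoking"): (0.17, 0.05),
    ("Lundborg (2006)", "drug use"): (0.07, 0.02),
    ("Fletcher (2010)", "smoking"): (0.35, 0.17),
    ("Fletcher (2012)", "binge drinking"): (0.57, 0.24),
}

Z_95 = 1.96  # assumption: normal approximation for a two-sided 95% interval


def summarize(estimate, se, peer_shift=10.0):
    """Return (implied change for a peer_shift-point rise, interval low, interval high)."""
    implied = estimate * peer_shift
    low = estimate - Z_95 * se
    high = estimate + Z_95 * se
    return round(implied, 2), round(low, 3), round(high, 3)


for (study, behavior), (est, se) in ESTIMATES.items():
    implied, low, high = summarize(est, se)
    print(f"{study}, {behavior}: +{implied} points per 10-point peer rise; "
          f"approx. 95% interval for slope [{low}, {high}]")
```

Intervals that include zero under this rough approximation do not by themselves contradict the source studies, which may rely on other specifications or significance thresholds.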
three cohorts who began college in fall 1998, 1999, or 2000 at a large, academically competitive university. In the winter/spring semester of 2002 they surveyed these students when they were in their 2nd, 3rd, or 4th year of college, and they linked the responses to a pre-existing data set with survey responses from the students and their randomly assigned roommates during the summer before their first year of college. They find no evidence of peer effects on marijuana use or number of sexual partners. In contrast, they find significant peer effects on the frequency of binge drinking among men who binge drank in high school, but not among men who did not binge drink in high school and not among women regardless of prior binge drinking.

Two other roommate studies are also relevant to ours, although these studies do not directly examine risky behaviors as outcomes. First, in a study connected to that of Duncan et al. (2005), using data from the same cohorts at the same institution, Kremer and Levy (2008, 2003) find that men who binge drank in high school obtain lower grade point averages (GPAs) when paired with another binge drinker in their first year of college. As noted by the authors, this is consistent with the finding in Duncan et al. (2005) that peer effects for binge drinking are larger among men, and indicates that there are significant academic implications of this peer effect. Kremer and Levy also find that the effect on GPA persists beyond the first year of college, which points toward more lasting mechanisms such as preferences and social networks rather than the contemporaneous disruption of the studying environment in the first-year room. Second, Sacerdote (2001) finds that, among randomly assigned first-year roommates, whether a man joins a fraternity is positively and significantly correlated with his roommate’s decision, and that living in a dormitory with more students who drank beer prior to college is also positively and significantly correlated with the probability of joining a fraternity (for men) or a sorority (for women). Given the well-established correlation between fraternity/sorority participation and drinking (McCabe et al., 2005), this finding is also consistent with a significant peer effect for binge drinking, particularly among men, and suggests that an important part of the mechanism may be the influence on which social networks people join.

While our basic purpose and approach are the same as in the study by Duncan et al. (2005), there are several features that extend our study beyond a straightforward replication (which would be valuable in itself, given their intriguing findings). First, by collecting data on the extent to which roommates are close friends and spend time together, we are able to characterize the “exposure” in our natural experiment. This information makes it easier to interpret the peer effects we observe (and the extent to which they generalize to other contexts), and also quantifies the extent to which risky behaviors are intertwined with selection into friendships, which can introduce biases in observational studies of peer effects that are not based on plausibly exogenous natural experiments.

Second, our data are from a more recent cohort (entering college in 2009), for whom peer effects may be quite different, considering how behaviors and social context among young people have evolved over the past 10–15 years.⁵ Third, our sample includes two institutions, which is useful for examining whether peer effects might generalize across campuses. Fourth, we look at outcomes during the first year of college; these outcomes, as compared to outcomes later in college, may be more strongly influenced by first-year roommates. Fifth, we expand the set of risky behaviors to include gambling, smoking, suicidal ideation,⁶ and non-suicidal self-injury, in addition to the behaviors in the previous study (binge drinking, drug use, and sexual behavior). Finally, we examine a supplemental sample of students with requested roommates, as a point of comparison that further illustrates the likely biases from measuring peer effects without fully addressing selection biases.

3. Methods and data

3.1. Overview

Our data come from online surveys of first-year college students at two large and academically competitive universities: one public (hereafter “university A”), and one private (“university B”). We fielded the baseline survey in August 2009, shortly before students arrived at college, and the follow-up survey in March–April 2010, shortly before the end of the academic year. We linked the survey data to administrative data on housing preferences, room assignments, and academic and demographic characteristics.

First-year students are required to live in campus housing at both universities, except in unusual circumstances. Students have the option of requesting specific roommates, and these requests are typically granted. Students who do not request specific roommates are assigned their roommates. Our analysis focuses on students with assigned roommates, although for comparison’s sake we also examine a smaller sample with requested roommates. We describe the roommate assignment process later in this section.

Our main empirical approach is analogous to that of previous studies of peer effects among college roommates, estimating regressions of the form:

$$Y_{t+1} = \beta_0 + \beta_1\,\mathit{Prefs}_t + \beta_2\,\mathit{Ypeers}_t + \beta_3 Y_t + \beta_4 X_t + \varepsilon_{t+1} \qquad (1)$$

The subscript t denotes a measurement in the baseline survey, and t + 1 denotes a measurement in the follow-up survey. Y refers to the risky behavior being examined, Prefs is a vector of housing preferences and all other variables used to make roommate assignments (described in more detail later), Ypeers is the average risky behavior of the peers⁷ with whom the student is assigned to live (roommates in most analyses, but hallmates in some), and X is a vector of individual characteristics including gender, age (exact to the day), race/ethnicity, and parents’ education. The key coefficient is $\beta_2$, which represents the effect of peer behavior on the individual’s behavior. For binary outcomes we estimate probit regressions and report average marginal effects (with standard errors estimated with the delta method), and for other outcomes we estimate linear or ordered probit regressions. In all specifications heteroskedasticity-robust standard errors are corrected for correlated outcomes among roommates (or hallmates in analyses of hallmate effects).

⁵ For example, between 1998 and 2008 there were significant decreases in the prevalence of cigarette smoking, and increases in the proportion reporting that their friends would disapprove of their cigarette smoking, among Americans ages 18–22 (Johnston et al., 2009). Also, during that period there was some closing in the gender gap for binge drinking: the prevalence for men fell from 52% in 1998 to 49% in 2008, whereas the prevalence for women rose from 31% to 34%.
⁶ Suicidal ideation is not a risky behavior per se, but it is a primary risk factor for suicidal behavior. Suicidal behavior is too rare to be studied meaningfully in most samples including our own; past-year suicide attempts were reported by only 0.6% of college students in a national sample (Eisenberg et al., 2013a,b).
⁷ In cases where students have more than one roommate, as in most other studies of peer effects, we focus on the average peer behavior under the logic that it is the best proxy of the peer context. Our estimated roommate effects remain very similar if we instead define the roommate variable as the maximum behavior among multiple roommates.
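The construction of the peer variable Ypeers in Eq. (1) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `peer_measure`, the dictionary layout, and the toy data are all hypothetical. It takes the average baseline behavior among assigned roommates, offers the maximum as the alternative definition mentioned in footnote 7, and, following the paper's later note on partially responding rooms, uses only roommates who completed the baseline survey.

```python
from statistics import mean


def peer_measure(student, rooms, baseline, how="mean"):
    """Baseline behavior of a student's initially assigned roommates.

    rooms:    room id -> list of student ids (initial assignments)
    baseline: student id -> baseline behavior (e.g., 1 = any binge drinking);
              students who did not answer the baseline survey are absent
    how:      "mean" (main definition) or "max" (alternative definition)
    Returns None when no roommate answered the baseline survey.
    """
    for members in rooms.values():
        if student in members:
            values = [baseline[m] for m in members
                      if m != student and m in baseline]
            if not values:
                return None
            return mean(values) if how == "mean" else max(values)
    raise KeyError(student)


# Hypothetical toy data: one double and one triple; student "e" did not respond.
ROOMS = {"r1": ["a", "b"], "r2": ["c", "d", "e"]}
BASELINE = {"a": 1, "b": 0, "c": 0, "d": 1}

print(peer_measure("a", ROOMS, BASELINE))         # roommate b only
print(peer_measure("c", ROOMS, BASELINE))         # d responded, e did not
print(peer_measure("e", ROOMS, BASELINE))         # average of c and d
print(peer_measure("e", ROOMS, BASELINE, "max"))  # maximum of c and d
```

In a double room the mean and the maximum coincide, so the two definitions can differ only in triples and quads.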
3.2. Survey data collection and sample characteristics

At both baseline and follow-up we recruited students for the surveys by first sending an introductory letter with a $10 bill (a “pre-incentive,” with no obligation to participate), and then sending up to four email invitations to those who had yet to respond, spaced by 3–5 days each. All communications included a web link to the survey and a unique, randomly assigned log-in ID for each student. Recruitment messages also informed students that they were entered into a sweepstakes for cash prizes regardless of participation.

Recruitment for the baseline survey was timed at each school to take place during the three weeks prior to the start of the semester. The follow-up survey data collection also lasted three weeks and was timed to conclude one week prior to final exams in the spring. Because obtaining informed consent of minors typically requires parental consent, from the outset of the study we excluded students if they were going to be under the age of 18 as of the follow-up survey in March 2010; this restriction excluded 0.9% of otherwise eligible students.

As implied by Eq. (1), our primary analytic sample consists of students who completed both baseline and follow-up surveys and whose roommate(s) also completed the baseline survey.⁸,⁹ Prior to the baseline survey, the initial number of eligible students with assigned roommates was 4971, including 3876 from university A and 1095 from university B (which has a large proportion of first-year students in single rooms, unlike university A). A total of 3501 (70%) of these students completed the baseline survey. Among baseline responders, 2589 (74%) had at least one roommate who was also a baseline responder. And among baseline responders with at least one roommate baseline responder, 1641 (63%) completed the follow-up survey.¹⁰

Because our primary analytic sample is only 33% (1641/4971) of the initially eligible sample, it is important to examine potential biases related to survey non-response. The first two columns in Appendix Table A show that the sample responding to the baseline survey is nearly identical to the initial sample in terms of age, gender, race/ethnicity, and U.S. versus international citizenship. The table also reveals that the other layers of attrition (response by roommate at baseline, and own response at follow-up) are not strongly related to risky behaviors and other characteristics. Despite the reasonably large sample size, the only statistically significant differences across layers of attrition are a slightly higher proportion of women in the final analytic sample (0.53) as compared to the initial sample (0.50) and a slightly lower proportion of binge drinkers (0.34) in the final analytic sample as compared to all baseline respondents (0.37).¹¹

Additional characteristics of the primary analytic sample are shown in Table 1. Most students (79%) are in double rooms (i.e., with one roommate), 17% are in triples, and 4% in quads. The typical socioeconomic background is high, with 83% of students having at least one parent with a college degree. Compared to the national population of students in higher education (Planty et al., 2009), our sample has higher percentages of whites (70% versus 63% nationally) and Asians (17% versus 7%), and lower percentages of blacks (3% versus 14%) and Hispanics (5% versus 12%).

We examine substance use and other risky behaviors that are relatively common among adolescents and young adults. Our survey questions are adapted from the Healthy Minds Study, an annual national survey of college student health (Eisenberg et al., 2007). Binge drinking is measured using the question, “Over the past 30 days, on how many occasions have you had X drinks in a row?”, where, following standard definitions, X is shown as 4 for women and 5 for men. The answer categories in the survey are “None,” “Once,” “Twice,” “3 to 5 times,” “6–9 times,” and “10 or more times.” In most analyses we use a binary coding of binge drinking (none versus any), but in some analyses we examine the frequency. As shown in Table 1, binge drinking increases substantially between baseline (33% at least once) and follow-up (54%). Cigarette smoking is measured using the question, “In the past 30 days, how many cigarettes did you smoke on average?” The prevalence of smoking is low at baseline (only 6% with any smoking) and increases slightly at follow-up (8%). Illicit drug use is measured using the question, “In the past six months, have you used any of the following drugs? (Select all that apply)” The answer choices include marijuana, cocaine, heroin, methamphetamines, other stimulants, ecstasy, other, and none of the above. Drug use increases from 22% at baseline to 32% at follow-up, and the vast majority of drug use is marijuana (among students reporting any illicit drug use, 97% used marijuana, 5% stimulants, 3% ecstasy, 2% cocaine, 0.3% heroin, and 10% other drugs). Gambling is measured by the question, “In the past month, did you make any sort of bet? (By ‘bet’ we mean betting on sports, playing cards for money, playing gambling games online, buying lottery tickets, playing pool for money, playing slot machines, betting on horse races, or any other kind of betting or gambling)” Gambling decreases slightly, from 22% at baseline to 19% at follow-up. The number of sexual partners is measured by the question, “In the past six months, with how many different people have you had sex (oral sex or sexual intercourse)?” As shown in the table, the distribution of responses shifts upwards for this question between baseline and follow-up, with more students with at least one partner (from 35% to 46%) and also more with multiple partners (from 12% to 15%). Suicidal ideation is measured by the question, “In the past six months, did you ever seriously think about attempting suicide?” The percentage answering yes was 2.5% at baseline and 4.1% at follow-up. Finally, non-suicidal self-injury was measured by the question, “This question asks about ways you may have hurt yourself on purpose, without intending to kill yourself. In the past six months, have you ever done any

⁸ If a student has multiple roommates and some but not all completed the baseline survey, we still include that student in the sample. In those cases we code the roommate variable as the average among roommates who completed the baseline survey.
⁹ Throughout our analysis roommates are defined based on initial assignments. Therefore one can think of our estimates as “intention-to-treat,” ignoring the endogenous changes in roommates during the school year. These changes are discouraged by the universities and occurred for only a small proportion of students. Specifically, between our baseline and follow-up surveys 3% of students received a new room assignment (but remained in a campus residence), and 1.5% of students moved out of campus housing. These numbers are similar across the two universities.
¹⁰ This lower response rate at follow-up is somewhat surprising, given that it is conditional on responding at baseline (which indicates a propensity to respond to surveys). We believe that the response rates were higher at baseline than at follow-up for several reasons: (a) just prior to arrival students may have been especially attentive to solicitations related to the university; (b) by the time of the follow-up survey, students had received a number of requests to complete surveys, in addition to our baseline survey (we do not know the exact number of other surveys but we are aware of at least a couple others at each campus); (c) students were busier while school was in session.
¹¹ It is also interesting to examine correlations in the likelihood of survey participation among roommates. As expected, at baseline the roommates’ participation in the survey is not significantly associated with the probability of participation (p = 0.64), controlling for housing preferences. Participation in the follow-up survey, however, is significantly associated with roommates’ participation in the follow-up survey (in a linear probability model: B = 0.067, SE = 0.018, p < .001), controlling for housing preferences and participation at baseline. Furthermore, participation at follow-up is significantly associated with roommates’ participation at baseline (B = 0.045, SE = 0.019, p = 0.02), indicating that at least part of the correlation in participation at follow-up is due to a causal effect of roommate participation.
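The sample-attrition figures reported in Section 3.2 can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the stage labels and the helper name `funnel` are illustrative choices, not part of the original study materials.

```python
# Counts reported in Section 3.2 for students with assigned roommates.
STAGES = [
    ("eligible with assigned roommates", 4971),
    ("completed baseline survey", 3501),
    ("at least one roommate also completed baseline", 2589),
    ("completed follow-up survey (primary analytic sample)", 1641),
]

UNIVERSITY_A, UNIVERSITY_B = 3876, 1095  # eligible students by campus


def funnel(stages):
    """Return (label, n, percent of previous stage, percent of first stage)."""
    rows = []
    first = stages[0][1]
    previous = None
    for label, n in stages:
        step = None if previous is None else round(100 * n / previous)
        overall = round(100 * n / first)
        rows.append((label, n, step, overall))
        previous = n
    return rows


for label, n, step, overall in funnel(STAGES):
    step_text = "" if step is None else f", {step}% of previous stage"
    print(f"{label}: {n}{step_text}, {overall}% of eligible")
```

The stage-to-stage rates reproduce the 70%, 74%, and 63% quoted in the text, and the final stage corresponds to the 33% (1641/4971) overall retention.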
Table 1
Mean values of primary analytic sample (N = 1641).

                                          Baseline
University A (large public)               0.69
University B (large private)              0.31
Double room                               0.79
Triple room                               0.17
Quad room                                 0.04
Age                                       18.4 (SD = 0.41)
Female                                    0.54
White                                     0.70
Asian                                     0.17
Black                                     0.03
Hispanic                                  0.05
Other                                     0.02
Multi                                     0.04
Parents’ education
  Less than college degree                0.16
  College degree                          0.27
  Graduate degree                         0.56

                                          Baseline   Follow-up
Binge drinking (past 30 days)
  None                                    0.67       0.46
  Once                                    0.11       0.14
  Twice                                   0.08       0.12
  3–5 times                               0.09       0.16
  6–9 times                               0.04       0.10
  10+ times                               0.02       0.03
Smoking (past 30 days)
  None                                    0.94       0.92
  <1 cigarette/day                        0.04       0.06
  1–5 cigarettes/day                      0.01       0.02
  About 1/2 pack/day                      0.002      0.001
  About 1 pack/day                        0.002      0.001
  >1 pack/day                             0.001      0.001
Illicit drugs (past 30 days)              0.22       0.32
Gambled (past 30 days)                    0.23       0.19
Sex partners (past 6 months)
  None                                    0.65       0.54
  1                                       0.27       0.32
  2                                       0.06       0.07
  3–5                                     0.02       0.06
  6 or more                               0.004      0.02
Suicide ideation (past 6 mos.)            0.025      0.041
Non-suicidal self-injury (past 6 mos.)    0.102      0.114

All values are proportions, except age. Primary sample consists of first-year undergraduates meeting these conditions: (a) at least 18 years old as of follow-up survey (March 15, 2010); (b) assigned to their roommate(s) (i.e., did not request their roommate(s)); (c) completed both baseline and follow-up surveys; (d) at least one roommate completed baseline survey.
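The binary coding of binge drinking used in most analyses (none versus any) can be related to the frequency categories in Table 1 with a short sketch. The function names and the category spellings below are illustrative assumptions; the shares are the Table 1 values. Because the published shares are rounded, the sketch computes the "any" share as the complement of the "None" row rather than by summing the other rows.

```python
# Response categories for the binge drinking item and their Table 1 shares.
CATEGORIES = ["None", "Once", "Twice", "3-5 times", "6-9 times", "10+ times"]
BASELINE = dict(zip(CATEGORIES, [0.67, 0.11, 0.08, 0.09, 0.04, 0.02]))
FOLLOW_UP = dict(zip(CATEGORIES, [0.46, 0.14, 0.12, 0.16, 0.10, 0.03]))


def drinks_threshold(sex):
    """Number of drinks in a row shown in the survey question (X in the text)."""
    return 4 if sex == "female" else 5


def any_binge(category):
    """Binary coding: 0 for 'None', 1 for any reported occasion."""
    return int(category != "None")


def share_any(shares):
    """Share with at least one occasion, as the complement of the 'None' row."""
    return round(1 - shares["None"], 2)


print(share_any(BASELINE), share_any(FOLLOW_UP))
```

The two printed shares correspond to the 33% and 54% prevalence figures quoted in Section 3.2.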
of the following intentionally? (Select all that apply)” There were 13 different answer choices, including 11 types of self-injury, an “Other (specify)” option, and “No, none of these.” The percentage reporting at least one type of self-injury was 10.2% at baseline and 11.4% at follow-up, and the specific types of self-injury, in order of prevalence at follow-up, were “punched or banged myself” (3.7%), “punched or banged an object to hurt myself” (3.5%), “scratched or pinched myself severely” (2.5%), “prevented wound from healing” (2.0%), “cut myself” (1.9%), “bit myself” (1.9%), “pulled my hair, eyelashes, or eyebrows with intent to hurt myself” (1.6%), “ripped or tore my skin” (1.1%), “burned myself” (0.9%), “other” (0.8%), “rubbed sharp objects into my skin” (0.7%), and “carved words or symbols into skin” (0.6%).

3.3. Exogeneity of roommate assignments

For students who do not request roommates, the assignment processes differ somewhat between the two universities in our sample, but the common feature is that assignments are based only on known variables that we observe in our data set. Therefore, any variation in roommate characteristics (such as substance use), conditional on the variables that explicitly determine the assignments, should be uncorrelated with the error term in Eq. (1).

At university A, a public school with approximately 6000 first-year students, the housing administrators match roommates based on gender plus preferences regarding the following variables (as indicated on housing applications): geographic area of campus (three options); room type (double, triple, or quad); co-ed versus same-sex hallway; and substance use environment (the student can indicate that he or she is a smoker, a non-smoker, or someone who wants to live in an entirely substance-free residence). To the extent possible the housing administrators match roommates with identical preferences on the above variables. If a perfect match is not available, the housing officials prioritize the variables in the order listed above. For students who submit their housing application by a certain deadline, the order in which they are allocated to residences and rooms is determined by a random lottery (generated by the housing officials using Microsoft Excel’s random number function). This accounts for the vast majority (89%) of first-year students with assigned roommates at university A. The remaining students (11%) who miss the deadline are assigned in the order in which their housing applications are received (e.g., a student who prefers a double room would be matched with the next student to submit an application with identical preferences). This implies that for university A, roommate assignments are truly random, conditional on the preferences noted above, for the vast majority of students, whereas the date of housing application needs to be controlled for as flexibly as possible for the remaining students.

At university B, a private school with approximately 4000 first-year students, the housing office uses a commercial software program to match students based on a more extensive list of variables. Although we do not have access to the proprietary algorithm by which the matching is done, we have complete data on all variables used and we know from conversations with housing officials that the variables receiving the most weight are similar to those for university A: gender (which is always matched among assigned roommates at both universities), preferred room type (double, triple, quad), preference for co-ed versus single-sex hallways, and smoking status. The secondary matching variables include preferences about sleeping hours, background noise while studying, types of music, and the extent of socializing in the room.

The key assumption in our empirical strategy is that the roommate’s baseline characteristics (risky behaviors or any factors that might influence risky behaviors) are uncorrelated with the error term in the regression in Eq. (1). This assumption cannot be tested unequivocally, but as in prior studies in the roommate literature we obtain suggestive evidence by examining the correlation among roommates in baseline variables, conditional on the variables used to make assignments. In these checks of exogeneity as well as our
main analyses, we control for the variables used to make housing assignments in the following way. We relax parametric assumptions to the extent possible by using a set of dummy variables corresponding to all combinations of the primary variables used for matching roommates at each university. To control for the date of housing application at university A, we include a vector of dummy variables corresponding to each week during the 9-week period in which late applications were received (and on-time applications are denoted by a tenth dummy variable). For university B, we include the secondary matching variables as sets of categorical dummies corresponding to each answer choice for each variable. We also include a dummy variable for university (A or B).¹²

If housing assignments are exogenous conditional on these variables, then the conditional correlations among roommates at baseline should not be significantly different from zero. We check this by estimating Eq. (2) below for each risky behavior variable that we consider as an outcome in this paper, as well as several other characteristics that might plausibly be related to risky behaviors (happiness, depression, anxiety, eating disorder symptoms, suicidal ideation, non-suicidal self-injury, parents’ education, religiosity, binge drinking, physical activity, hours studying for school, admissions test scores, and GPA in high school). We estimate this equation both with and without the correction proposed by Guryan et al. (2009); we implement this correction by adding a control for the average value of the Y variable among other students with an identical combination of values for the primary housing variables (i.e., the pool of potential roommates).¹³

$$Y_t = \beta_0 + \beta_1\,\mathit{Prefs}_t + \beta_2\,\mathit{Ypeers}_t + \varepsilon_t \qquad (2)$$

Appendix Table B shows the estimates of $\beta_2$ in Eq. (2) for the main outcome variables in this study as well as several other

rule out large peer effects with high confidence, but we cannot rule out small but meaningful effects. Particularly in the case of smoking, which is reported at follow-up by only 8% of our sample, we cannot rule out peer effects that are nontrivial relative to the mean.

It is important to interpret the results in Table 2 with the caveat that these could be considered multiple tests of a related hypothesis: that peers influence each other’s risky behaviors in general, or that peers influence each other’s health in general. When thinking of the hypotheses in this composite way, one should adjust the p-values to reflect the greater risk of type I errors (false positives). Appendix C illustrates this issue by showing the estimated peer effects for not only the risky behaviors but also other health-related outcomes in our larger project.¹⁴ Although the simple Bonferroni adjustment has been criticized as overly conservative (over-adjusting to eliminate type I errors, at the expense of introducing type II errors), it illustrates the basic story that would remain true under other types of adjustments: the estimated peer effect for binge drinking remains highly significant, and a composite null hypothesis of no peer effects for risky behaviors is easily rejected.

We also estimate hallmate effects, where hallmates are defined as students who live on the same floor and in the same residence. In order to focus on hallmate effects independent of roommate effects, we also control for roommate behaviors in the regressions with hallmate behaviors. Because hallways typically include many students (the majority of our sample has at least 10 hallmates also in the sample), there is only modest variation in hallmate averages, which results in relatively large standard errors in our estimated peer effects. We find for all seven behaviors that the estimates are not significant at p < 0.10 (results available on request). When we restrict attention to same-gender hallmates, however, we do find some evidence for
variables that might plausibly be related to risky behaviors. As peer effects among men for smoking (B = 0.18, SE = 0.07, p = 0.01)
shown in the table none of the variables are significantly correlated and gambling (B = 0.17, SE = 0.10, p = 0.07). The same-gender peer
among roommates at baseline except for illicit drug use. The over- effect is not significant for binge drinking among men (B = −0.10,
all pattern of results in the table is consistent with what one would SE = 0.10, p = 0.31), and it is also not significant for any of the behav-
expect due to chance if the null hypothesis of conditionally random iors among women.
assignment were true: only one of 15 variables exhibits a signifi-
cant correlation at p < 0.10. More generally, when we examine all 4.1. “Endogenous” versus “contextual” peer effects
33 variables available in our baseline data, we continue to find a
pattern consistent with random chance: 3 of 33 variables exhibit As noted earlier, from a policy perspective it matters whether, in
conditional correlations significant at p < 0.10, with one positive Manski’s terminology, peer effects are “endogenous” versus “con-
correlation (for illicit drug use) and two negative. Also, as shown textual.” Both are causal peer effects, but only the former implies
in the table, the correction from Guryan et al. does not change multiplier effects whereby changing the behavior in one person
the estimates appreciably, which is not surprising, given that they leads to changes in the same behavior among peers. Careful consid-
demonstrate that the correction has little impact in larger samples. eration of this issue has often been absent from peer effects studies,
perhaps because it is already challenging enough to identify any
4. Estimates of peer effects type of causal effect. As in other peer effects studies, there is no way
for us to rule out definitively the importance of unmeasured peer
As shown in Table 2, we find strong evidence for peer effects on characteristics, but we can examine the sensitivity of our findings
binge drinking, but no apparent effects for the other behaviors. Hav- to controlling for additional roommate characteristics.15
ing a roommate who binge drinks at baseline increases the proba- We find that the main pattern of results is robust to includ-
bility of binge drinking at follow-up by 8.6%, or a 19% increase rela- ing many other roommate characteristics in the regressions,
tive to the mean. The null results for the other behaviors allow us to
12 Note that residence fixed effects are not necessary to obtain unbiased estimates, because our identification strategy, like most previous roommate studies, focuses on roommates’ behaviors at baseline (prior to when any common “shocks” could occur within residences). Nevertheless, we confirm that our estimates are not sensitive to including residence fixed effects: the significant peer effect for binge drinking remains unchanged and the peer effects for other behaviors are still non-significant.

13 This correction is intended to counteract a negative conditional correlation among roommates that can be present mechanically when there are small numbers of people in groups defined by identical housing preferences. The negative correlation arises because roommates are increasingly likely to be on opposite sides of the group mean (which the dummy variables control for) as the group size gets smaller.

14 The mental health results are described in more detail in Eisenberg et al. (2013a,b), and the obesity and physical activity results are in Yakusheva et al. (2013).

15 Even in a randomized trial it would be very difficult to make a definitive distinction between endogenous and contextual peer effects. Consider, for example, a trial that randomizes individuals to an educational and counseling intervention designed to reduce binge drinking, and then assesses outcomes among the intervention group, the control, and the peers of both groups. If one finds that the peers of the intervention group have lower binge drinking than the peers of the control group, this would suggest a social multiplier effect from the intervention. But it would not necessarily imply a universal social multiplier driven by binge drinking behavior per se. It is possible, for example, that the mechanism for the peer effect was information or attitudes, rather than the behavior per se.
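The control described in footnote 13 and alongside Eq. (2) can be read as a leave-one-out mean: for each student, the average of Y among the other students in the same pool of potential roommates. The sketch below illustrates only that computation. It is not the authors' code, and the function name, the record layout, and the toy data are assumptions made for this example.

```python
from collections import defaultdict


def leave_one_out_pool_mean(records):
    """For each (pool_id, y) record, return the mean of y among the OTHER
    members of the same pool, or None if the student is alone in the pool.

    `pool_id` stands for one combination of primary housing variables
    (a hypothetical encoding chosen for this sketch).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for pool, y in records:
        totals[pool] += y
        counts[pool] += 1

    result = []
    for pool, y in records:
        n_others = counts[pool] - 1
        if n_others == 0:
            result.append(None)
        else:
            # Remove the student's own value before averaging.
            result.append((totals[pool] - y) / n_others)
    return result


# Toy data (made up): pool "a" has three students, pool "b" has two.
toy = [("a", 1), ("a", 0), ("a", 0), ("b", 1), ("b", 0)]
loo = leave_one_out_pool_mean(toy)
```

In the two-person pool, a student with Y = 1 necessarily faces a mean of 0 among the others and vice versa, which is the mechanical negative correlation the footnote describes; the pattern weakens as pools grow.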
Table 2
Effects of roommate behaviors on own behaviors.

| | Binge drinking | Smoking | Illicit drug use | Gambling | Multiple sex partners | Suicidal ideation | Non-suicidal self-injury |
|---|---|---|---|---|---|---|---|
| Roommate behavior at baseline | | | | | | | |
| Average marginal effect | 0.086 | 0.002 | 0.012 | −0.004 | −0.035 | −0.001 | 0.016 |
| Standard error of marginal effect | 0.024 | 0.026 | 0.025 | 0.023 | 0.031 | 0.033 | 0.024 |
| p-Value | <.001 | 0.950 | 0.616 | 0.875 | 0.259 | 0.964 | 0.503 |
| Own behavior at baseline | | | | | | | |
| Average marginal effect | 0.375 | 0.243 | 0.443 | 0.256 | 0.278 | 0.131 | 0.267 |
| p-Value | <.001 | <.001 | <.001 | <.001 | <.001 | <.001 | <.001 |

N = 1641. Average marginal effects calculated from probit regressions with controls for housing preferences and gender, age, race/ethnicity, and parents’ education.
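As a rough illustration of the multiple-testing caveat, a Bonferroni adjustment can be applied to the seven roommate-effect p-values in Table 2. This is a sketch under stated assumptions rather than the Appendix C calculation, which covers a larger family of outcomes: only the seven Table 2 outcomes are treated as the family here, and the binge drinking p-value, reported only as "<.001", is replaced by its upper bound of 0.001.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Roommate-effect p-values from Table 2, in column order.
# 0.001 for binge drinking is an assumed upper bound for "<.001".
table2_p = {
    "binge drinking": 0.001,
    "smoking": 0.950,
    "illicit drug use": 0.616,
    "gambling": 0.875,
    "multiple sex partners": 0.259,
    "suicidal ideation": 0.964,
    "non-suicidal self-injury": 0.503,
}

adjusted = dict(zip(table2_p, bonferroni(list(table2_p.values()))))
significant = [name for name, p in adjusted.items() if p < 0.05]
```

Under these assumptions only binge drinking stays below 0.05 after adjustment, in line with the statement that the binge drinking effect remains highly significant.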
suggesting (but not proving) that we are measuring a true social multiplier effect for roommate binge drinking. The additional roommate characteristics in these regressions include: the other risky behaviors in this study, a measure of mental health (the K-6 psychological distress score (Kessler et al., 2003)), parents’ educational attainment, religiosity, frequency of exercise in the past 30 days, amount of studying per day in high school, standardized test scores, and high school grade point average. With these additional covariates the estimated peer effect for binge drinking remains almost identical (B = 0.085, SE = 0.028, p = 0.002), and the estimates for other behaviors also remain nearly identical as before (in each case, close to zero and insignificant). In future work, a more confident distinction between endogenous and contextual peer effects might be made by controlling for a richer set of additional peer characteristics, such as personality type, and also examining specific mechanisms related to information and attitudes.

4.2. Subgroup analyses of roommate effects

When the roommate effects are estimated separately by gender (Table 3), the binge drinking effect is significant and similar in magnitude for men and women. This is a notable difference from the results in Duncan et al. (2005), where the effects are only present for men. In our results the only apparent difference by gender is for smoking, where there is a positive, though not significant, peer effect for men and a negative and significant effect for women. This difference in estimated effects by gender is significant at p = 0.02.

Next we examine how one’s susceptibility to peer effects varies as a function of one’s behavior at baseline (Table 4). For binge drinking the peer effect is similar whether or not the individual is a binge drinker at baseline, and if anything the effect is a bit stronger for non-binge drinkers at baseline. This is again somewhat different than Duncan et al.’s findings, in which the only group experiencing a peer effect is men with prior binge drinking. Because our analysis uses a binary outcome, the peer effect on binge-drinkers at baseline could be attenuated by a ceiling effect (87% of baseline binge drinkers report binge drinking at follow-up), which we explore later by looking at frequency of binge drinking. There are no significant differences by baseline behavior in the peer effects for the other behaviors, either. We obtain large negative peer effects for students who smoked or had multiple sexual partners at baseline, but these estimates are very imprecise due to the small size of these subgroups.

We also estimate the peer effects separately by university (not shown in tables), as tentative evidence on whether effects might generalize across campuses. The binge drinking effect is highly significant at both universities in our sample, with average marginal effects of 0.08 (SE = 0.03, p = 0.01) at university A and 0.12 (SE = 0.04, p = 0.01) at university B. Although the sample includes just two universities, this suggests that the peer effects we observe for binge drinking may be a general phenomenon on college campuses, or at least on large, academically competitive campuses such as the two in our study. We also find that the null results for the other risky behaviors hold across the two universities, with the exception that university B has a marginally significant peer effect for smoking (B = 0.06, SE = 0.04, p = 0.10).

4.3. Asymmetries in peer effects based on social status or influence

It is possible that peer effects between roommates are asymmetric. For example, student A may have strong influence over his or her roommate, student B, whereas student B has little influence over student A. This might occur if student A has higher social status. To examine this possibility, we consider three possible markers of social status: parental education (i.e., socioeconomic background), admissions test score (as an indicator of academic status/ability), and number of sexual partners in the previous six months (as an indicator of sexual experience, which is perceived by many adolescents to have a positive association with social status (Ott et al., 2006)). For each marker of social status, we code a student as being higher, lower, or the same as the roommate (or roommates’ average for multiple roommates). This is straightforward for the categorical variables (parental education and number of sexual partners), and for the continuous standardized test score we use plus or minus one standard deviation as the threshold for higher or lower. We then re-run our main regressions with the addition of an interaction between the roommate behavior variable and the dummy for being higher status than the roommate and an interaction between the roommate behavior variable and the dummy for being lower status.

For binge drinking the results provide some support for the idea that peer influence is stronger from higher to lower status students. Students with lower test scores than their roommates experience higher peer effects (interaction term: B = 0.11, SE = 0.07, p = 0.09) and students with more sexual experience than their roommates experience lower peer effects (B = −0.16, SE = 0.06, p = 0.008). For other risky behaviors, however, students with lower status than their roommates appear to experience smaller peer effects, if anything. For example, students with lower test scores than their roommates experience lower peer effects for drug use (B = −0.21, SE = 0.06, p = 0.001), gambling (B = −0.16, SE = 0.06, p = 0.008) and self-injury (B = −0.10, SE = 0.06, p = 0.11). Also, students with lower parental education experience lower peer effects for smoking (B = −0.10, SE = 0.06, p = 0.10) and students with fewer sexual partners experience lower peer effects for drug use (B = −0.08, SE = 0.06, p = 0.14). Overall, these results suggest that asymmetries related to social status are highly variable by indicator of status and by behavioral outcome.
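The status coding and interaction terms described in Section 4.3 can be sketched as follows. This is one reading of the text, not the authors' implementation: the function names are hypothetical, and the one-standard-deviation rule is interpreted as "own score more than one SD above (or below) the roommate's score".

```python
def relative_status(own, roommate, threshold=0.0):
    """Classify a student's status marker relative to the roommate's value
    (or the roommates' average) as "higher", "lower", or "same".

    `threshold` is the band within which values count as the same:
    0 for the categorical markers, 1.0 for a test score expressed in
    standard-deviation units (an interpretation of the text).
    """
    diff = own - roommate
    if diff > threshold:
        return "higher"
    if diff < -threshold:
        return "lower"
    return "same"


def interaction_terms(roommate_behavior, status):
    """Return the two regressors added to the main specification:
    roommate behavior x higher-status dummy, and x lower-status dummy."""
    higher = 1 if status == "higher" else 0
    lower = 1 if status == "lower" else 0
    return roommate_behavior * higher, roommate_behavior * lower


# Example: own z-score 0.2 versus roommate z-score 1.5, threshold of one SD.
status = relative_status(0.2, 1.5, threshold=1.0)
terms = interaction_terms(1, status)
```

With "same" as the omitted category, the coefficient on each interaction measures how the peer effect for higher- or lower-status students differs from that for students of similar status.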
Table 3
Subgroup analysis by gender: effects of roommate behaviors on own behaviors.

| | Binge drinking | Smoking | Illicit drug use | Gambling | Multiple sex partners | Suicidal ideation | Non-suicidal self-injury |
|---|---|---|---|---|---|---|---|
| Men (N = 774) | | | | | | | |
| Average marginal effect | 0.079 | 0.057 | 0.033 | −0.004 | −0.016 | 0.015 | 0.021 |
| p-Value | 0.023 | 0.084 | 0.303 | 0.898 | 0.756 | 0.69 | 0.564 |
| Women (N = 863) | | | | | | | |
| Average marginal effect | 0.095 | −0.078 | −0.004 | −0.015 | −0.060 | −0.02 | 0.011 |
| p-Value | 0.005 | 0.053 | 0.901 | 0.700 | 0.152 | 0.64 | 0.732 |

Average marginal effects calculated from probit regressions with controls for housing preferences and gender, age, race/ethnicity, and parents’ education. Sample sizes do not always add up across sub-groups to the exact total reported in Tables 1 and 2 because of a small number of missing values for specific variables.
Table 4
Subgroup analysis by baseline behavior: effects of roommate behaviors on own behaviors.

| | Binge drinking | Smoking | Illicit drug use | Gambling | Multiple sex partners | Suicidal ideation | Non-suicidal self-injury |
|---|---|---|---|---|---|---|---|
| Baseline behavior = 0 (no) | | | | | | | |
| N | 1083 | 1531 | 1277 | 1261 | 1495 | 1605 | 1479 |
| Average marginal effect | 0.101 | −0.001 | 0.002 | −0.015 | −0.032 | 0.008 | 0.013 |
| p-Value | 0.003 | 0.967 | 0.954 | 0.555 | 0.313 | 0.787 | 0.604 |
| Baseline behavior = 1 (yes) | | | | | | | |
| N | 549 | 103 | 354 | 372 | 135 | 41 | 167 |
| Average marginal effect | 0.072 | −0.395 | 0.062 | 0.043 | −0.258 | n/a (a) | 0.024 |
| Standard error of marginal effect | 0.035 | 0.408 | 0.053 | 0.059 | 0.134 | | 0.179 |
| p-Value | 0.042 | 0.333 | 0.241 | 0.465 | 0.054 | | 0.895 |

Average marginal effects are calculated from probit regressions with controls for housing preferences and gender, age, race/ethnicity, and parents’ education. Sample sizes do not always add up across sub-groups to the exact total reported in Tables 1 and 2 because of a small number of missing values for specific variables.
(a) Insufficient observations to estimate the regression (more right-hand-side variables than observations).
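The relationship between the marginal effects, standard errors, and p-values in Table 4 can be checked approximately with a two-sided normal test. The sketch below assumes a standard normal reference distribution, so small discrepancies from the published p-values are expected because the table entries are rounded; the function name is an assumption of this example.

```python
import math


def two_sided_p(estimate, std_error):
    """Two-sided p-value for estimate / std_error under a standard normal."""
    z = abs(estimate / std_error)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal CDF Phi.
    return math.erfc(z / math.sqrt(2))


# Table 4, baseline behavior = 1 (yes):
p_binge = two_sided_p(0.072, 0.035)         # binge drinking, reported p = 0.042
p_multi_sex = two_sided_p(-0.258, 0.134)    # multiple sex partners, reported p = 0.054
p_smoking = two_sided_p(-0.395, 0.408)      # smoking, reported p = 0.333
```

The smoking row shows why the large negative point estimate carries little information: its standard error is about as large as the estimate itself.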
Table 5
Subgroup analysis by baseline behavior and gender (binge drinking only).

| | Men, baseline = 0 | Women, baseline = 0 | Men, baseline = 1 | Women, baseline = 1 |
|---|---|---|---|---|
| N | 501 | 578 | 269 | 280 |
| Roommate behavior at baseline | | | | |
| Average marginal effect | 0.130 | 0.089 | 0.014 | 0.127 |
| Standard error of marginal effect | 0.050 | 0.046 | 0.057 | 0.053 |
| p-Value | 0.009 | 0.051 | 0.809 | 0.016 |

Average marginal effects are calculated from probit regressions with controls for housing preferences (as explained in the text) and gender, age, race/ethnicity, and parents’ education. Sample sizes do not always add up across sub-groups to the exact total reported in Tables 1 and 2 because of a small number of missing values for specific variables. The difference between the estimated effect for men baseline non-drinkers (0.130) and men baseline drinkers (0.014) is not quite significant (p = 0.15).

4.4. More detailed examination of binge drinking effects

Given the strong evidence of peer effects on binge drinking in our sample, as well as the interesting findings in Duncan et al. (2005), we examine the findings for this behavior in more detail. First, to provide a more direct comparison with Duncan et al., we estimate subgroup effects by both baseline behavior and gender (Table 5). These results are nearly the opposite of those in the prior study: we find the strongest evidence for peer effects in the subgroups other than men who reported binge drinking prior to college, whereas they find peer effects only for that subgroup. As noted earlier, our lack of evidence for peer effects on prior binge drinkers could be due to a ceiling effect. This does not appear to be the story, however. When we mimic the main specification in Duncan et al. (2005), with frequency of binge drinking as the dependent variable and a binary measure of roommate binge drinking as the key independent variable,16 we find a pattern of results mostly similar to that in Table 5: no evidence for peer effects on male baseline binge drinkers (B = −0.07, SE = 0.46, p = 0.87), significant effects for both male (B = 0.47, SE = 0.24, p = 0.05) and female non-binge drinkers at baseline (B = 0.45, SE = 0.21, p = 0.04), and a positive but not significant effect on female binge drinkers at baseline (B = 0.17, SE = 0.45, p = 0.71).

We also examine the possibility that peer effects may be different depending on the frequency of binge drinking by roommates.17 We define roommate binge drinking (or roommates’ average, for multiple roommates) as occasional if the number of episodes is greater than zero but less than three times in the past month,

16 In these specifications we estimate linear regressions, where the dependent variable is linear and corresponds to the number of times in the previous two weeks. The answer choices in our survey question about binge drinking include numerical intervals, so we approximate the number of times binge drinking as 4 if “3 to 5 times” was selected, 8 if “6 to 9 times” was selected, and 10 if “10 or more times” was selected.

17 We can also examine the effects of intensity/frequency for two other behaviors, smoking and the number of sexual partners. The problem, however, is that there is little variation to work with; as indicated in Table 1, at baseline only 2% of students report smoking more than one cigarette per day, and similarly only 2% report more than two sexual partners in the past six months. Given this lack of variation, it is not surprising that we still find non-significant peer effects for these behaviors when we distinguish between lower intensity/frequency (<1 cigarette per day; 1 or 2 sexual partners) and higher intensity/frequency (>1 cigarette per day; >2 sexual partners) of roommate behaviors.
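The interval coding in footnote 16 can be written as a small lookup. Only the three interval answers and their point values come from the footnote; the handling of every other answer (treated here as an exact count written in digits) and the function name are assumptions made for this sketch.

```python
# Point values for the interval answer choices, as stated in footnote 16.
INTERVAL_VALUES = {
    "3 to 5 times": 4,
    "6 to 9 times": 8,
    "10 or more times": 10,
}


def binge_frequency(answer):
    """Map a survey answer to an approximate number of binge episodes.

    Interval answers use the footnote 16 values. Any other answer is
    ASSUMED to be an exact count given as digits, e.g. "0" or "2";
    the source does not list the remaining answer choices.
    """
    if answer in INTERVAL_VALUES:
        return INTERVAL_VALUES[answer]
    return int(answer)


responses = ["0", "2", "3 to 5 times", "6 to 9 times", "10 or more times"]
counts = [binge_frequency(r) for r in responses]
```

Coding the open-ended top category as 10 is a lower bound, so heavy drinkers are compressed toward the rest of the distribution in the linear regressions.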
and frequent if it is three times or more. The reference category is no binge drinking by roommates. We find that being assigned to roommates with occasional binge drinking causes a 0.04 (SE = 0.03, p = 0.11) increase in the probability of binge drinking, and being assigned to roommates with frequent binge drinking causes a 0.11 (SE = 0.03, p = 0.001) increase. The distinction between roommates’ occasional and frequent binge drinking appears to matter more for men: the effects for men are 0.01 (SE = 0.04, p = 0.86) from occasional roommate binge drinking versus 0.12 (SE = 0.05, p = 0.01) from frequent roommate binge drinking, as compared to 0.08 (SE = 0.04, p = 0.04) and 0.11 (SE = 0.04, p = 0.01) for women.

4.5. Correlations among roommate outcomes

For the sake of comparison to our main results, we also estimate how roommates’ behaviors are correlated at follow-up, controlling for baseline behaviors. This analysis is biased by any contextual factors shared by roommates during the academic year (“common shocks”), but is useful to examine as a point of reference. The analytic sample size is now reduced to 1041, because data are required from the reference individual and his or her roommate(s) at both baseline and follow-up. We find non-significant peer effects for all risky behaviors (even binge drinking), except for a surprising negative peer effect for having multiple sexual partners (B = −0.12, SE = 0.05, p = 0.02).

4.6. Results with requested roommates

Also for the sake of comparison, we replicate our analysis with a smaller supplementary sample of students who requested their roommates. We drew this survey sample only from university A. In contrast to the assigned roommates, this sample exhibits highly significant correlations among roommates at baseline for some risky behaviors, even after controlling for the housing preference variables in equation 1 (even students who requested roommates are required to fill out the full housing applications). For example, having a roommate who binge drinks at baseline is associated with a 0.34 increase (SE = 0.08, p < 0.001) in the probability of binge drinking at baseline. This further underscores the selection bias that may be present in estimates of peer effects not based on exogenous variation in peer contacts.

When we replicate our main analysis with this sample of requested roommates, we find that the baseline behavior of roommates is a significant predictor of own behavior at follow-up for smoking and drug use, but not for the other behaviors (results available on request). Although we cannot disentangle the true peer effects from selection biases in this supplemental sample, it is important to keep in mind that in theory the true effects among requested roommates may be higher or lower than those among assigned roommates. As Kremer and Levy (2008) point out, students who have known each other for a long time may have already exerted most of their peer effects on each other, which may lower the effects apparent during the first year of college. On the other hand, requested roommates tend to have closer relationships than assigned roommates, as we find in our survey measures at follow-up (e.g., 85% of requested roommates report being close friends), which could contribute to stronger peer effects.

5. Analysis of roommates’ relationships

In the follow-up survey we asked students how much time they typically spend doing things or hanging out with their roommates, whether they are close friends with them, and how much they enjoy spending time with them. Students with multiple roommates were asked to think about their roommates on average. We use this information to supplement our analysis in several ways.

First, we examine the distribution of responses to these questions, to understand the extent and nature of contact between peers in our natural experiment. In our sample of assigned roommates, 41% spend at least an hour per day doing things or hanging out with their roommates, 49% agree or strongly agree that they are close friends with their roommates, and 54% agree or strongly agree that they enjoy being in the room together with their roommates. This suggests that assigned roommates have close relationships in about half of cases, whereas in the other half of cases the “treatment” in this natural experiment can be thought of as sharing a small living area without a close relationship.

Second, we conduct subgroup analyses estimating how peer effects vary depending on the closeness between roommates. These estimates should be viewed with caution, because the subgroups are defined by a variable that is measured at follow-up and is endogenous with respect to peer and own binge drinking. We find, as one might expect, that the peer effects on binge drinking are stronger for students who report being close friends with their roommates (B = 0.12, SE = 0.03, p < 0.001) as compared to those who do not report being close friends with their roommates (B = 0.05, SE = 0.04, p = 0.14), and this difference in effects is significant at p = 0.07. For the other risky behaviors the estimated peer effects are close to zero and not significant regardless of closeness between roommates.

Third, we examine whether the risky behaviors of roommates at baseline predict the closeness of their relationship. As shown in Table 6, for three of the risky behaviors (binge drinking, illicit drug use, and gambling) a student who has engaged in the behavior before arriving at college ends up spending significantly more time with a roommate who has also engaged in the behavior, as compared to one who has not (B3 is significantly larger than B2). The opposite is true, however, for smoking and self-injury: a smoker (or someone who self-injures) at baseline spends less time with a smoker (or self-injuring) roommate, as compared to a non-smoker (or non-self-injuring) roommate.18 This may be related to stigma and shame associated with these behaviors. On the other hand, for a student who has not engaged in risky behaviors at baseline, the time spent with their roommate is not significantly predicted by the risky behavior of the roommate (B1 is not significantly different from zero). Each of these patterns remains similar when we look at our other measures of closeness among roommates: whether they consider themselves close friends, and whether they enjoy spending time together (results not shown in table, available on request).19

Although peer effects for binge drinking influence students regardless of their own prior binge drinking (as shown earlier in the paper), the results in Table 6 suggest a more nuanced pattern for how the closeness of roommate relationships is influenced by risky behaviors. It appears that the type of roommate (having engaged in risky behavior or not) influences the relationship

18 Although we previously showed that peer effects for smoking appear to be negative among women and positive among men, this negative effect of peer smoking on friendships among roommates is present for both men and women (although more precise for women): for women the estimate of B3 is −0.95 and the p-value is <0.001, whereas for men the estimate is −0.85 with p-value = 0.07.

19 Note also that, if “time spent with roommate” is an objective and accurate measure of time that roommates spend together, then we should impose the constraint $\beta_1 = \beta_2$ in the regressions shown in Table 6. We thank an anonymous reviewer for pointing this out. When those constraints are imposed, the basic pattern of results remains the same, however. We display the unconstrained version of the regression because this seems somewhat more intuitive to interpret and it acknowledges that relationship closeness among roommates can be asymmetric for other measures (such as perceived closeness).
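The occasional versus frequent coding of roommate binge drinking defined in Section 4.4 can be expressed as a short classifier. This is a sketch of the stated thresholds only; the function name and the category labels as strings are assumptions of this example.

```python
def roommate_binge_category(episodes):
    """Classify roommate binge drinking, or the roommates' average for
    students with multiple roommates, using the Section 4.4 thresholds."""
    if episodes <= 0:
        return "none"        # reference category: no roommate binge drinking
    if episodes < 3:
        return "occasional"  # greater than zero but fewer than three episodes
    return "frequent"        # three episodes or more


# An average across two roommates can be fractional, e.g. (0 + 3) / 2 = 1.5.
categories = [roommate_binge_category(x) for x in (0, 1.5, 3, 8)]
```

Because averages are classified after pooling roommates, a student with one abstaining roommate and one roommate with three episodes falls into the occasional category.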
Table 6
Interactive effects of baseline behavior of self and roommate (RM), on time spent with RM (h/day).

| | Binge drinking | Smoking | Illicit drug use | Gambling | Multiple sex partners | Suicidal ideation | Non-suicidal self-injury |
|---|---|---|---|---|---|---|---|
| Self no, RM no (reference category) | | | | | | | |
| Self no, RM yes | | | | | | | |
| $\beta_1$ | −0.048 | −0.032 | −0.188 | 0.011 | 0.023 | −0.317 | −0.146 |
| SE | 0.133 | 0.170 | 0.125 | 0.137 | 0.186 | 0.244 | 0.151 |
| p-Value | 0.718 | 0.852 | 0.131 | 0.935 | 0.900 | 0.194 | 0.335 |
| Self yes, RM no | | | | | | | |
| $\beta_2$ | 0.050 | −0.214 | −0.206 | −0.116 | −0.006 | −0.058 | −0.341 |
| SE | 0.139 | 0.207 | 0.134 | 0.133 | 0.188 | 0.323 | 0.148 |
| p-Value | 0.717 | 0.301 | 0.126 | 0.381 | 0.975 | 0.856 | 0.021 |
| Self yes, RM yes | | | | | | | |
| $\beta_3$ | 0.374 | −0.925 | 0.530 | 0.297 | −0.546 | −1.33 | −0.587 |
| SE | 0.172 | 0.257 | 0.243 | 0.230 | 0.395 | 0.25 | 0.21 |
| p-Value | 0.030 | <0.001 | 0.029 | 0.196 | 0.168 | <.001 | 0.005 |
| p-Value for Wald test of $\beta_2 = \beta_3$ | 0.072 | 0.011 | 0.004 | 0.079 | 0.215 | 0.001 | 0.301 |

Sample is limited to students in double rooms (with only one roommate) (N = 1274). Average marginal effects are calculated from OLS regressions with controls for housing preferences (as explained in the text) and gender, age, race/ethnicity, and parents’ education. When we run regressions with just the “main effects” of roommate behavior at baseline, we find the following estimates of the effects on time spent with roommate(s): for binge drinking, B = 0.10, SE = 0.10, p = 0.33; for smoking, B = −0.06, SE = 0.16, p = 0.73; for illicit drug use, B = 0.08, SE = 0.11, p = 0.51; for gambling, B = 0.11, SE = 0.11, p = 0.32; for multiple sex partners, B = −0.02, SE = 0.17, p = 0.92; for suicidal ideation, B = −0.36, SE = 0.23, p = 0.13; for non-suicidal self-injury, B = −0.16, SE = 0.13, p = 0.24.
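One way to read Table 6 is through the gap between the third and second coefficients: among students who engaged in a behavior at baseline, it gives the estimated difference in hours per day spent with a roommate who also engaged in it versus one who did not. The sketch below only re-arranges numbers from the table. The assignment of values to behaviors assumes the same column order as Tables 2 and 4 and the table note, and the variable names are hypothetical.

```python
BEHAVIORS = [
    "binge drinking", "smoking", "illicit drug use", "gambling",
    "multiple sex partners", "suicidal ideation", "non-suicidal self-injury",
]

# Coefficients and Wald-test p-values copied from Table 6 (column order assumed).
beta2 = [0.050, -0.214, -0.206, -0.116, -0.006, -0.058, -0.341]   # self yes, RM no
beta3 = [0.374, -0.925, 0.530, 0.297, -0.546, -1.33, -0.587]      # self yes, RM yes
wald_p = [0.072, 0.011, 0.004, 0.079, 0.215, 0.001, 0.301]        # test of beta2 = beta3

# Extra hours per day with a matching roommate, for students with the behavior.
gap = {name: round(b3 - b2, 3) for name, b2, b3 in zip(BEHAVIORS, beta2, beta3)}

# Behaviors where the Wald test rejects beta2 = beta3 at the 10% level.
differs = [name for name, p in zip(BEHAVIORS, wald_p) if p < 0.10]
```

The sign of each gap separates the two patterns discussed in the text: positive for binge drinking, illicit drug use, and gambling, and negative for smoking.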
among roommates only for students who themselves have engaged in the behavior. Also, these results are direct causal evidence of self-selection into friendship networks based on risky behavioral characteristics (in this case, a positive selection for binge drinking, drug use, and gambling, but a negative selection for smoking), and highlight the potential biases in analyses of peer effects that do not sufficiently control for such selection.

Fourth, we take a broader approach to estimating the determinants of closeness among roommates. We estimate regressions similar to those shown in Table 6, except with the addition of several other indicators of roommate similarity/difference at baseline: dummy variables for whether the roommates are of different race, binge drinking, smoking, drug use, gambling, multiple sexual partners, and sexual orientation; and the absolute value of the difference in religiosity, political orientation, parental education, standardized test z-score, average study hours, body mass index (BMI), frequency of exercise, tendency to disclose feelings, psychological distress score, and happiness score. Nearly all of these variables have negative coefficients, indicating that larger differences at baseline predict lower closeness of relationships, but only three factors are significant at p < 0.10: difference in drug use (p = 0.003), difference in religiosity (p < 0.001), and difference in happiness score (p = 0.02).20

Finally, we examine the predicted friendship level as a possible moderator of peer effects, in a cleaner version of the previous analysis that examined actual friendship level as a moderator. In this case, we do not find clear-cut evidence that peer effects are stronger for closer friends. For example, students with low (below the median) predicted friendships with their roommates experience only slightly smaller peer effects for binge drinking, as compared to students with high predicted friendships (B = 0.08 versus 0.11, and the difference is not significant in a regression with an interaction between predicted friendship level and roommate binge drinking). This analysis is limited, however, by the fact that predicted friendship levels have much less variation than actual friendship levels.21 Our lack of clear evidence for differential peer effects by friendship closeness is consistent with a previous study using a different identification strategy (based on year-to-year continuity in residential co-location) that finds no evidence for larger academic peer effects among students who are more likely to be friends (Foster, 2006).

6. Discussion

In this study we estimate peer effects for risky behaviors using a natural experiment based on college roommate assignments. The analysis yields four notable findings. First, the estimated peer effects are significant for binge drinking, but not for smoking, illicit drug use, gambling, sexual activity, suicidal ideation, and non-suicidal self-injury. Second, in contrast to the study by Duncan et al., the peer effects for binge drinking are significant for not only men but also women, and are stronger for students who did not binge drink prior to college. Third, there is tentative evidence that peer effects for smoking may be positive for men and negative for women. Fourth, the matching of baseline substance use behaviors between roommates significantly predicts friendships.

The robust peer effects for binge drinking are consistent with the highly social nature of this behavior among young people, particularly college students. The null findings for smoking and illicit drug use are somewhat surprising, however, in light of the large effects estimated for secondary school students in other studies. Similarly, the null findings for suicidal ideation and self-injury are somewhat surprising, considering the widespread notion that suicide and self-injury can spread like a contagion (Prinstein et al., 2010; Velting and Gould, 1997). Although our estimates are not sufficiently precise to rule out meaningful effects, particularly for smoking, we can easily reject that the effects for substance use are as large as in the studies of secondary school students, most of which are in the range of 0.30 or higher. Even our estimates for binge drinking are significantly lower than those in most studies of secondary school students. Our lower estimates could be due to either or both of two factors: (1) differences in true effects between secondary and postsecondary settings; (2) upward biases that might remain

20 Other studies find that significant predictors of friendship among college students include similarity in race/ethnicity, socioeconomic background, political orientation, and academic performance (Marmaros and Sacerdote, 2006; Mayer and Puller, 2008; Foster, 2005).

21 Baseline differences in roommate characteristics are too weak to be used as instruments in an instrumental variable version of this analysis (with differences in baseline roommate characteristics and their interactions with roommate risky behaviors as instruments for actual friendship levels and interactions with roommate behaviors). For example, the first-stage partial F-statistic is only 1.7 in this specification for binge drinking.
in empirical identification strategies used in the secondary school setting, where natural experiments such as conditionally random roommate assignments are not available.

There are a few possible explanations for the differences in binge drinking effects in our study versus that of Duncan et al. The differences are probably not related to the campus settings: the university in their study is the same as university A in our study, which accounts for 69% of our sample, and our main pattern of results remains consistent when we restrict our analysis to university A (results available on request). The differences may be related to changes in drinking behaviors over time in college populations, however, given that our cohorts entered college 10–12 years later than those in their study. For example, the prevalence of binge drinking among college-age women has been catching up to that of college-age men (Grucza et al., 2009), and changes in norms may be shifting how women respond to peer behavior. Another possibility relates to the fact that Duncan et al. estimate the peer effects of the first-year roommates as of the second, third, or fourth year of college, whereas we estimate effects near the end of the first year. Peer effects on male binge drinkers may persist and even grow over the course of the college career because of the strong interrelationship between drinking and fraternity participation. Peer effects for the other three subgroups may dissipate, on the other hand, if fraternity/sorority participation does not mediate drinking outcomes to the same extent for those subgroups.

The discrepancies between our results and those of Duncan et al. need to be understood further, because they have different implications. Their results suggest that intervention strategies designed to leverage or mitigate peer effects should focus specifically on men with prior binge drinking, and also that avoiding the pairing of binge drinking men as roommates would reduce the overall prevalence of binge drinking. Our results, on the other hand, suggest that intervention focused on peer effects should pay roughly equal attention to men and women. Also, our results raise questions about whether the overall prevalence of binge drinking can be influenced by accounting for prior drinking when pairing roommates. Among women, the effect of a binge drinking roommate is similar for prior drinkers and non-drinkers. Among men, the roommate effect is larger for prior non-drinkers, suggesting that overall binge drinking might be lowered by pairing drinkers together and pairing non-drinkers together, but this difference in effects is not statistically significant (p = 0.15).

The apparent differences by gender that we find for smoking peer effects should be regarded as tentative and warranting further study, considering the lack of a clear reason to anticipate this difference a priori. If the difference is real, it may be related to differences by gender in motivations for smoking. For example, concerns about controlling body weight are frequently connected to smoking behavior among women, but less so among men (French and Jeffery, 1995). If this type of personal reason is less prominent for men, it may be that social context has a relatively larger role. In future work it would also be valuable to learn more about whether our estimate of an inverse peer effect among women is real. This estimate suggests that women have a negative reaction to being around a smoker, such that they become less likely to smoke. A better understanding of this scenario might be leveraged in peer-based interventions to reduce smoking.

Finally, the mutual affinity among roommates who engage in certain risky behaviors underscores the potential biases in analyses of peers who select each other. This finding also raises the question of whether the behaviors draw these peers closer together, or whether there are other characteristics correlated with the behaviors that draw the peers together. A related question is whether we have estimated pure spillover effects for binge drinking (in which binge drinking by one person directly affects the likelihood of binge drinking of another person), or if our estimates also reflect the effects of unmeasured peer characteristics that are correlated with binge drinking. The robustness of our estimates to controlling for a variety of roommate characteristics supports the former interpretation, but there still may be other characteristics, such as personality traits, that could be important. At a minimum, our estimates imply a true causal effect of living with someone who binge drinks. Also, our findings suggest a mutually reinforcing dynamic between binge drinking behavior and the closeness of friendships, and this angle will also be useful to investigate in future research. In addition, the interesting finding by which smokers actually tend to have less close relationships with a roommate who also smokes, as compared to a nonsmoker roommate, is also worth exploring further. This finding is present for both men and women, but might still be connected to some of the same factors that lead to negative peer effects for smoking among women.

Overall, the clearest conclusion from our study is that peer effects for binge drinking are important, regardless of gender or prior drinking behavior. Peer effects for the other risky behaviors, on the other hand, appear to be small at most. Our study also raises questions that warrant further exploration, particularly related to the surprising null effects for a number of behaviors, possible differences by gender in peer effects for smoking, and the relationship between risky behaviors and the formation of peer relationships.

Acknowledgments

Funding from the W.T. Grant Foundation and from the National Institute for Child Health and Human Development (5R24HD041023) is gratefully acknowledged. We received helpful comments from many people including: Martha Bailey; Jason Fletcher; Adriana Lleras-Muney; Amanda Purington; Ed Seidman; Stephanie von Hinke Kessler Scholder; and seminar participants at the University of Michigan, UC-Berkeley, and the 2011 NBER Health Summer Institute. Much of this work was conducted while Ezra Golberstein was an NIMH Research Fellow at the Department of Health Care Policy at Harvard Medical School. The survey data collection was coordinated by Scott Crawford, Sara O’Brien, and their colleagues at the Survey Sciences Group, LLC. We are also grateful to the university staff members who helped us obtain administrative data and to the students who participated in the surveys.
Appendix A. Baseline characteristics by sample attrition (examining nonresponse bias)

Columns: (1) Initial sample; (2) Baseline respondents (BRs); (3) BRs w/roommate (RM) BRs; (4) Final analytic sample (BRs who responded at follow-up, w/RM BRs).

                                                (1)     (2)     (3)     (4)
N                                               4971    3501    2589    1641
Age                                             18.4    18.4    18.4    18.4
Female                                          0.50    0.50    0.51    0.53
Asian or Pacific Islander                       0.15    0.16    0.16    0.16
Black                                           0.04    0.04    0.04    0.04
Hispanic or Latino                              0.04    0.04    0.04    0.04
Other or multiple categories                    0.07    0.06    0.07    0.06
White                                           0.70    0.70    0.69    0.70
U.S. citizen                                    0.91    0.92    0.91    0.92
Parents’ education: less than college degree            0.16    0.16    0.16
Parents’ education: college degree                      0.28    0.28    0.27
Parents’ education: graduate degree                     0.56    0.56    0.56
Binge drinking (past 2 weeks)                           0.37    0.36    0.34
Smoking (past 30 days)                                  0.07    0.07    0.06
Illicit drug use (past 30 days)                         0.23    0.23    0.22
Gambling (past 30 days)                                 0.24    0.25    0.23
Multiple sex partners (past 6 months)                   0.10    0.09    0.08
Suicide ideation (past 6 months)                        0.03    0.03    0.02
Non-suicidal self-injury (past 6 months)                0.11    0.11    0.10

Note: None of the differences are significant across a single layer of attrition (from one column to the next one on the right); the difference in the proportion of females in the initial sample versus the final sample is significant, however (Z = 2.1, p = 0.04).
Appendix B. Conditional correlations among roommates at baseline (randomness checks)

Columns 1–3: RM coefficient, w/o correction in Guryan et al. (2009). Columns 4–6: RM coefficient, with correction in Guryan et al. (2009).

                                                  Without correction      With correction
                                                  β       SE      p       β       SE      p
Binge drinking (past 30 days) (0/1)               0.011   0.033   0.75    0.020   0.031   0.51
Smoking (past 30 days) (0/1)                      −0.002  0.031   0.94    0.006   0.027   0.84
Illicit drug use (past 30 days) (0/1)             0.077   0.033   0.02    0.065   0.030   0.03
Gambling (past 30 days) (0/1)                     −0.013  0.034   0.69    0.006   0.030   0.85
Multiple sex partners (past 6 months) (0/1)       −0.002  0.033   0.96    0.006   0.030   0.84
Suicide ideation (past 6 months) (0/1)            −0.005  0.024   0.85    0.006   0.021   0.77
Non-suicidal self-injury (past 6 months) (0/1)    −0.013  0.030   0.67    −0.003  0.028   0.91
Happiness score (0–9)                             −0.001  0.033   0.98    0.014   0.029   0.64
Psychological distress score (0–24)               −0.013  0.034   0.70    0.006   0.030   0.83
Parents’ education (highest attainment) (1–7)     0.041   0.034   0.23    0.025   0.031   0.79
Religiosity (0–3)                                 −0.005  0.032   0.87    −0.004  0.030   0.89
Exercise (frequency in past 30 days) (0–3)        −0.012  0.033   0.71    −0.016  0.031   0.60
Studying (time per day in high school) (0–5)      −0.037  0.034   0.28    −0.025  0.031   0.43
Admissions test (standardized z-score, SAT/ACT)   0.035   0.032   0.27    0.01    0.029   0.72
GPA in high school (standardized z-score)         0.037   0.032   0.25    0.037   0.029   0.21

N = 1053 (we only include one student per room, as explained in the text). Each row corresponds to a separate linear regression; for each regression only the estimate for the key coefficient on the RM variable is shown. All regressions include controls for the variables used for housing assignments, as described in the text.
Appendix C. Adjusting for multiple hypothesis testing across all health outcomes in the broader study

Outcome                       Average marginal effect   Standard error   p-Value
Risky behaviors
  Binge drinking              0.086                     0.024            0.0004
  Smoking                     0.002                     0.026            0.950
  Illicit drug use            0.012                     0.025            0.616
  Gambling                    −0.004                    0.023            0.875
  Multiple sex partners       −0.035                    0.031            0.259
  Suicide ideation            −0.001                    0.033            0.964
  Non-suicidal self-injury    0.0162                    0.024            0.503
Mental health
  Happiness                   −0.020                    0.028            0.483
  Depression                  0.012                     0.031            0.712
  Anxiety                     0.053                     0.027            0.049
Physical health
  Body weight (BMI)           0.013                     0.011            0.230
  Physical activity           0.047                     0.027            0.080

Bonferroni p-value: risky behaviors                           0.003
Bonferroni p-value: mental health                             0.147
Bonferroni p-value: physical health                           0.160
Bonferroni p-value: health overall (all three domains)        0.005
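
The appendix does not spell out how the Bonferroni p-values were computed. The sketch below assumes the standard rule: multiply the smallest p-value in a family by the number of tests in that family, capped at 1. Under that assumption, the p-values transcribed from Appendix C reproduce the four reported figures to three decimals. The function name `bonferroni_min_p` and the variable names are illustrative, not the authors'.

```python
def bonferroni_min_p(p_values):
    """Bonferroni-adjusted p-value for the smallest p in a family of tests.

    Assumed rule (not stated in the source): min(p) * number of tests, capped at 1.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    return min(1.0, min(p_values) * len(p_values))


# Unadjusted p-values transcribed from Appendix C.
risky_behaviors = [0.0004, 0.950, 0.616, 0.875, 0.259, 0.964, 0.503]
mental_health = [0.483, 0.712, 0.049]
physical_health = [0.230, 0.080]
all_outcomes = risky_behaviors + mental_health + physical_health

families = [
    ("risky behaviors", risky_behaviors),
    ("mental health", mental_health),
    ("physical health", physical_health),
    ("health overall", all_outcomes),
]
for label, family in families:
    print(f"Bonferroni p-value, {label}: {bonferroni_min_p(family):.3f}")
```

Because only the smallest p-value in each family enters the calculation, the overall figure (0.005) is driven entirely by the binge drinking estimate spread across all twelve outcomes.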
References

Beck, K.H., Arria, A.M., Caldeira, K.M., Vincent, K.B., O’Grady, K.E., Wish, E.D., 2008. Social context of drinking and alcohol problems among college students. American Journal of Health Behavior 32, 420–430.
Byrnes, J.P., Miller, D.C., Schafer, W.D., 1999. Gender differences in risk taking: a meta-analysis. Psychological Bulletin 125, 367–383.
Card, D., Giuliano, L., 2013. Peer effects and multiple equilibria in the risky behavior of friends. Review of Economics and Statistics 95, 1130–1149.
Carpenter, C., Dobkin, C., 2011. The minimum legal drinking age and public health. Journal of Economic Perspectives 25, 133–156.
Carrell, S., Hoekstra, M., West, J., 2011. Does drinking impair college performance? Evidence from a regression discontinuity approach. Journal of Public Economics 95, 54–62.
Christakis, N.A., Fowler, J.H., 2008. The collective dynamics of smoking in a large social network. New England Journal of Medicine 358, 2249–2258.
Clark, A.E., Loheac, Y., 2007. It wasn’t me, it was them! Social influence in risky behavior by adolescents. Journal of Health Economics 26, 763–784.
Cohen-Cole, E., Fletcher, J., 2008. Detecting implausible social network effects in acne, height, and headaches: longitudinal analysis. BMJ 337, a2533.
Cutler, D.M., Glaeser, E., 2007. Social interactions and smoking, NBER Working Paper #13477.
Duncan, G.J., Boisjoly, J., Kremer, M., Levy, D.M., Eccles, J., 2005. Peer effects in drug use and sex among college students. Journal of Abnormal Child Psychology 33, 375–385.
Eisenberg, D., Golberstein, E., Gollust, S.E., 2007. Help-seeking and access to mental health care in a university student population. Medical Care 45, 594–601.
Eisenberg, D., Hunt, J.B., Speer, N., 2013a. Mental health in American colleges and universities: variation across student subgroups and across campuses. Journal of Nervous and Mental Disease 201, 60–67.
Eisenberg, D., Golberstein, E., Whitlock, J.L., Downs, M.F., 2013b. Social contagion of mental health: evidence from college roommates. Health Economics 22, 965–986.
Fletcher, J.M., 2012. Peer influences on adolescent alcohol consumption: evidence using an instrumental variables/fixed effect approach. Journal of Population Economics 25, 201–218.
Fletcher, J.M., 2010. Social interactions and smoking: evidence using multiple student cohorts, instrumental variables, and school fixed effects. Health Economics 19, 466–484.
Foster, G., 2006. It’s not your peers, and it’s not your friends: some progress toward understanding the educational peer effect mechanism. Journal of Public Economics 90, 1455–1475.
Foster, G., 2005. Making friends: a nonexperimental analysis of social pair formation. Human Relations 58, 1443–1465.
French, S.A., Jeffery, R.W., 1995. Weight concerns and smoking: a literature review. Annals of Behavioral Medicine 17, 234–244.
Gaviria, A., Raphael, S., 2001. School-based peer effects and juvenile behavior. The Review of Economics and Statistics 83, 257–268.
Glaeser, E.L., Scheinkman, J., 2001. Measuring social interactions. In: Durlauf, S., Young, P. (Eds.), Social Dynamics. MIT Press, Cambridge, pp. 83–132.
Grucza, R.A., Norberg, K.E., Bierut, L.J., 2009. Binge drinking among youths and young adults in the United States: 1979–2006. Journal of the American Academy of Child & Adolescent Psychiatry 48, 692–702.
Guryan, J., Kroft, K., Notowidigdo, M.J., 2009. Peer effects in the workplace: evidence from random groupings in professional golf tournaments. American Economic Journal: Applied Economics, 34–68.
Johnston, L.D., O’Malley, P.M., Bachman, J.G., Schulenberg, J.E., 2009. Monitoring the future national survey results on drug use, 1975–2008: Vol. II, College Students and Adults Ages 19–50, Bethesda, MD.
Kessler, R.C., Barker, P.R., Colpe, L.J., Epstein, J.F., Gfroerer, J.C., Hiripi, E., Howes, M.J., Normand, S.L.T., Manderscheid, R.W., Walters, E.E., 2003. Screening for serious mental illness in the general population. Archives of General Psychiatry 60, 184–189.
Kolata, G., 2008. Study Finds Big Social Factor in Quitting Smoking. Times Science, New York.
Kremer, M., Levy, D., 2008. Peer effects and alcohol use among college students. The Journal of Economic Perspectives 22, 189–206.
Kremer, M., Levy, D.M., 2003. Peer effects and alcohol use among college students, NBER Working Paper #9876.
Lundborg, P., 2006. Having the wrong friends? Peer effects in adolescent substance use. Journal of Health Economics 25, 214–233.
Manski, C.F., 1993. Identification of endogenous social effects: the reflection problem. Review of Economic Studies 60, 531–542.
Marmaros, D., Sacerdote, B., 2006. How do friendships form? Quarterly Journal of Economics 121, 79–119.
Mayer, A., Puller, S.L., 2008. The old boy (and girl) network: social network formation on university campuses. Journal of Public Economics 92, 329–347.
McCabe, S.E., Schulenberg, J.E., Johnston, L.D., O’Malley, P.M., Bachman, J.G., Kloska, D.D., 2005. Selection and socialization effects of fraternities and sororities on U.S. college student substance use: a multi-cohort national longitudinal study. Addiction 100, 512–524.
Neighbors, C., Lee, C.M., Lewis, M.A., Fossos, N., Larimer, M.E., 2007. Are social norms the best predictor of outcomes among heavy-drinking college students? Journal of Studies on Alcohol and Drugs 68, 556–565.
Ott, M.A., Millstein, S.G., Ofner, S., Halpern-Felsher, B.L., 2006. Greater expectations: adolescents’ positive motivations for sex. Perspectives on Sexual and Reproductive Health 38, 84–89.
Planty, M., Hussar, W., Snyder, T., Kena, G., KewalRamani, A., Kemp, J., Bianco, K., Dinkes, R., 2009. The condition of education 2009 (NCES 2009-081), NCES 2009-081.
Powell, L.M., Tauras, J.A., Ross, H., 2005. The importance of peer effects, cigarette prices and tobacco control policies for youth smoking behavior. Journal of Health Economics 24, 950–968.
Prinstein, M.J., Heilbron, N., Guerry, J.D., Franklin, J.C., Rancourt, D., Simon, V., Spirito, A., 2010. Peer influence and nonsuicidal self injury: longitudinal results in community and clinically-referred adolescent samples. Journal of Abnormal Psychology 38, 669–682.
Rice, D.P., 1999. Economic costs of substance abuse, 1995. Proceedings of the Association of American Physicians 111, 119–125.
Rosenquist, J.N., Murabito, J., Fowler, J.H., Christakis, N.A., 2010. The spread of alcohol consumption behavior in a large social network. Annals of Internal Medicine 152, 426–433.
Sacerdote, B., 2001. Peer effects with random assignment: results for Dartmouth roommates. Quarterly Journal of Economics 116, 681–704.
Velting, D.M., Gould, M.S., 1997. Suicide contagion. In: Maris, R.W., Silverman, M.M., Canetto, S.S. (Eds.), Review of Suicidology, 96–137.
Yakusheva, O., Kapinos, K., Eisenberg, D., 2013. Estimating heterogeneous and hierarchical peer influences on body weight using roommate assignment as a natural experiment. Journal of Human Resources (in press).

What a difference a day makes: Quantifying the effects of birth timing manipulation on infant health

Lisa Schulkind a, Teny Maghakian Shapiro b,∗
a Trinity College, Department of Economics, 300 Summit Street, Hartford 06106, USA
b Santa Clara University, Leavey School of Business, Lucas Hall, 500 El Camino Real, Santa Clara, CA 95053, USA

Article history:
Received 21 September 2012
Received in revised form 4 November 2013

JEL classification: H31, J13

Keywords: Infant health; Birth weight; Cesarean section; Induction; Maternity care

Abstract

Scheduling births for non-medical reasons has become an increasingly common practice in the United States and around the world. We exploit a natural experiment created by child tax benefits, which reward births that occur just before the new year, to better understand the full costs of elective c-sections and inductions. Using data on all births in the U.S. from 1990 to 2000, we first confirm that expectant parents respond to the financial incentives by electing to give birth in December rather than January. We find that most of the manipulation comes from changes in the timing of c-sections. Small birth timing changes, even at full-term, lead to lower birthweight, a lower Apgar score, and an increase in the likelihood of being low birthweight.

1. Introduction

Over the past 30 years, the United States has seen a dramatic change in the way that both doctors and parents approach labor and delivery. An increasingly large fraction of births are being scheduled at times that are more convenient for doctors or parents (Gans and Leigh, 2008; Gans et al., 2007; Lo, 2003). Inductions, stimulations, and cesarean sections (both elective and non-elective) are becoming more common. In 2008, c-sections accounted for 32.3% of all births in the U.S., while 23.1% of all births in the U.S. were induced, up from 21% and 8.5%, respectively, in 1989.¹ While the higher c-section rate may be a reflection of an increase in the number of women who genuinely need a cesarean, the incidence of c-sections is much lower in other parts of the world, such as France, the Netherlands, and Sweden, whose c-section rates in 2008 were 18.8%, 13.5%, and 17.3%, respectively (Gibbons et al., 2012).

The growing trend toward c-sections, inductions, and scheduled births has alarmed some health professionals, expectant mothers, and policy-makers. In Healthy People 2010, the U.S. government outlined a goal of reducing the rate of c-sections among low-risk women by 2010.² Instead, the cesarean section rate increased during the decade, even among low-risk women, and as a result, the government has restated the same goal in Healthy People 2020. At the local level, some hospitals and clinics have set smaller-scale goals to decrease the prevalence of c-sections, inductions, and elective birth timing manipulation (Perkes, 2010; Langrew and Morgan, 1996; Landro, 2011). However, there are those who support the current trend, arguing in favor of the convenience that c-sections and inductions offer mothers and doctors.

The consequences of small changes in birth timing on the health of the newborn, when used for non-medical reasons, have not been assessed. Despite interest among physicians in understanding the ramifications of birth timing changes, traditional experiments are challenging to conduct due to financial, time, and possibly also ethical constraints. However, by exploiting a natural experiment created by the U.S. income tax system – the child tax benefit – we are able to determine the consequences of small, exogenous

∗ Corresponding author. Tel.: +1 408 554 4052.
E-mail address: tshapiro@scu.edu (T.M. Shapiro).
1 Induction status is unknown for 1.2% of births in 2008 and for 5.5% of births in 1989.
2 Healthy People is a 10-year plan, now in its third round, aimed at improving the health of Americans.
140 L. Schulkind, T.M. Shapiro / Journal of Health Economics 33 (2014) 139–158
changes in birth timing. A parent can claim a new child on their yearly income taxes if the child was born anytime within the calendar year.³ This strict rule gives parents who are expecting a child at the beginning of January a financial incentive to time the birth at the end of December instead. Although this may seem “extreme” to some parents, there is both anecdotal and statistical evidence showing that the timing of births is altered in order to receive the maximum financial benefit.⁴ Countless stories document that the tax benefit is on the minds of many expectant parents; on pregnancy forums, mothers-to-be talk openly about wanting December babies and how they were able to persuade their doctors to schedule a December birth.⁵ Dickert-Conlin and Chandra (1999) find that an additional $500 in tax benefits increases the probability of having a child in the last week of December versus the first week of January by 26.9%. In addition, there is evidence that other personal decisions are responsive to tax incentives. Both Sjoquist and Walker (1995) and Alm and Whittington (1995) find that a higher marriage tax penalty increases the probability of a couple delaying marriage from the last quarter of one year to the first quarter of the next year.⁶ There is also evidence that the timing of death is responsive to changes in tax consequences (Kopczuk and Slemrod, 2003; Gans and Leigh, 2006).

In this study, we exploit the natural experiment created by the strict timing requirement for tax benefits, as well as exogenous variation in the generosity of child tax benefits during the 1990s, to investigate the mechanisms through which women alter their birth timing and whether these elective timing changes affect the health of the newborn. Using data from the Vital Statistics on the universe of all births in the United States from 1990 to 2001, in addition to census income data to estimate the size of tax benefits, we first verify that expectant mothers are in fact manipulating the timing of their births in order to gain additional tax benefits. We then assess the mechanisms through which women are altering their births: through shifting the timing of inevitable cesarean sections and inductions, or by forcing a c-section or induction in December when a vaginal birth would have occurred in January. Finally, we explore the effects of birth timing manipulation on the health of the newborn.

Our results complement those of Dickert-Conlin and Chandra (1999), whose analysis is based on a sample of 170 births from the National Longitudinal Survey of Youth, supporting their finding that larger child tax benefits are associated with a higher chance of being born in December relative to the adjacent January. Our paper is the first to confirm their assumption that this effect is driven by shifts in inevitable c-sections from January to December. We also find that manipulation in the timing of births results in lower birthweight and Apgar scores at birth. Since these measures are associated with many short- and long-term consequences, our findings suggest that doctors and parents should use extreme caution when scheduling a birth for non-medical reasons before the natural arrival of the baby.

2. Child tax benefits in the United States

Like many other countries, the United States offers subsidies to families based on the number of children they have, with the goal of helping to offset the cost of raising children. In the U.S., these subsidies work through the income tax system and have been evolving over time. There are different channels through which child tax benefits operate: the Earned Income Tax Credit, Dependent Exemption, Child Tax Credit, Dependent Care Credit, and education benefits. For this study, we focus on the first three, as they are the most widely claimed and offer the most substantial benefits.

The Dependent Exemption is the broadest of the child credits, including older children and higher income families as well. The exemption is available for any child under 19 and any full-time student under 24 for whom the family provides over half of the support. The Dependent Exemption reduces a family’s tax liability by lowering their taxable income. The value of the exemption to the family depends both on the amount of the exemption and the marginal tax rate the family faces. Because of this, the dependent exemption is largest for the wealthiest families facing the highest marginal tax rates. For example, in 1999, the dependent exemption was $2750 per dependent. For families in the 15% tax bracket, the benefit was worth $413 in tax savings, but for families in the 31% tax bracket, it was worth $853.⁷

The Child Tax Credit, which began in 1998, allows families with children a credit against their federal income tax for each qualifying child. The credit was initially $400 for each child, but has increased gradually to $1000. To qualify for the credit, the child must be under the age of 17 and live with the claiming parent for more than half the year. The credit begins phasing out at moderately high income levels.⁸ Unlike the dependent exemption, the Child Tax Credit is a direct reduction in a family’s tax liability, not just a deduction from taxable income. The credit is also partially refundable.

The Earned Income Tax Credit (EITC) was first introduced in 1975 as an earnings subsidy for low-income families with children. The EITC has been expanding over the decades, with its largest overhaul occurring in the early 1990s. Unlike the other credits, the EITC is fully refundable and its benefits are more generous, in terms of percent of earnings refunded, for those with lower income. In its most generous phase, earnings are matched by an EITC equal to 34% of earnings for a family with one child, and 40% for a family with two children. For example, in 1999, the credit’s maximum was $2312 for a family with one child and $3816 for a family with two or more children. The EITC begins to phase out at a family income of $12,460. The EITC offers a large subsidy for children in low income families, especially as a proportion of the family’s income.

The biggest changes in the tax treatment of children came in the 1990s, because of both the introduction of the Child Tax Credit in

3 If a child is born on December 31, 2010, he can be claimed as dependent beginning on his parents’ 2010 taxes. However, a child born on January 1, 2011 can be claimed starting in 2011. Children are claimed on parents’ taxes every year they are a dependent. Thus, having a child born at the end of the year gives parents an additional year of tax benefits.
4 Because health insurance deductibles and health savings accounts often reset at the beginning of the year, some parents have an additional incentive to schedule their births in December.
5 For example, from Baby Community Center, posted October 6, 2010: “My Dr. told me she wont let me go past 38 weeks (January 7) but I’m hoping to deliver in December for two reasons. . .
1 Taxes
2 I have a high deductible insurance plan. I’ve met my deductible for the year, so everything from now on until December 31 will be covered 100%. After January 1, my $5600 deductible refreshes!”
6 A couple that is married anytime within a year must claim themselves as married when filing their taxes for that year. Those who delay their marriage until the new year are able to avoid the marriage tax penalty on one additional year of income taxes.
7 2750 × 0.15 = 412.5; 2750 × 0.31 = 852.5.
8 Specifically, the phase-out begins at $110,000 for parents who are married filing jointly, $55,000 for married filing separately, and $75,000 for all others (for tax year 2008).
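
The 1999 figures quoted in Section 2 and footnote 7 can be checked with a short sketch. It covers only the two calculations the text gives enough information for: the dependent exemption's value (exemption amount times marginal tax rate) and the EITC in its phase-in range (a percentage match on earnings, capped at the stated maximum). The function names are illustrative, and the EITC phase-out is deliberately not modeled because the text gives its starting income but not its rate.

```python
def exemption_tax_savings(exemption, marginal_rate):
    """Tax saved by one dependent exemption: a deduction from taxable
    income is worth the exemption amount times the marginal tax rate."""
    return exemption * marginal_rate


def eitc_phase_in(earnings, match_rate, max_credit):
    """EITC in the phase-in range only: earnings are matched at
    match_rate until the credit reaches max_credit.

    Assumption: valid only for incomes below the phase-out threshold
    ($12,460 in the text); the phase-out rate is not given, so it is
    not modeled here.
    """
    return min(earnings * match_rate, max_credit)


# 1999 dependent exemption of $2750, as in footnote 7.
for rate in (0.15, 0.31):
    savings = exemption_tax_savings(2750, rate)
    print(f"marginal rate {rate:.0%}: exemption worth ${savings:.2f}")

# 1999 EITC: 34% match, $2312 maximum (one child);
# 40% match, $3816 maximum (two or more children).
print(eitc_phase_in(5000, 0.34, 2312))   # still in the phase-in range
print(eitc_phase_in(10000, 0.40, 3816))  # capped at the maximum credit
```

The contrast between the two functions mirrors the text: the exemption's value rises with the family's marginal rate, while the EITC is a direct credit whose value is tied to earnings at the bottom of the income distribution.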
Fig. 1. Child tax benefits – married parents. Panels: (A) From Child #1; (B) From Child #2; (C) From Child #3; (D) From Child #4. Each panel plots tax savings from an extra dependent (vertical axis, 0 to 5000) against family income (horizontal axis, 0 to 150,000), with separate lines for 1990, 1995, and 2000.
1998 and the expansions of the EITC in 1990 and 1993. We exploit this exogenous variation in tax liability to identify the impact of tax benefits on the timing of births. This allows us to examine the mechanism through which parents manipulate birth timing and to identify the causal effect of birth timing manipulation on infant health.

The size of a family’s tax benefit for an additional child varies by income, state, year, marital status and how many children they already have. Figs. 1 and 2 show how the size of tax benefits varies by income for married couples and single mothers, respectively. The graph in panel (A) shows how the size of the benefit varies by income for a family having their first child. The other three graphs show the same relationship, but for families having their second, third or fourth child. The variation in benefits within income level in each graph comes from yearly variation in tax policy. There is some additional variation by state, but it is not shown in these graphs.⁹ For families having their first or second child, the largest savings come at lower income levels, while savings increase with income for families having their third or fourth child. An analysis of variance shows that most of the variation in the magnitude of the tax benefit is generated by variation across birth order and year.

9 Some states offer state income tax relief for families with children. These benefits are generally quite small.

3. Previous studies

3.1. Financial incentives and the timing of births

Dickert-Conlin and Chandra (1999) first examined the extent to which tax benefits shift births from the first week of January to the last week of December using the National Longitudinal Survey of Youth (NLSY). They find that a $500 increase in tax benefits increases the probability of having a child in the last week of December by 26.9%. Our study focuses on the health consequences of this manipulation. In order to do so, we use Vital Statistics data, which includes information on the type of delivery and health measures of the newborn. This allows us to determine the mechanisms through which mothers are altering the timing of their births and to assess the health consequences of these actions.

The first part of our study confirms the Dickert-Conlin and Chandra (1999) result, although the magnitudes of our point estimates are much smaller. There are several notable differences between the Dickert-Conlin and Chandra (1999) study and the first stage of ours. First, because they use the NLSY their sample is limited to 170 births. In contrast, our Vital Statistics analysis is based on all births that occurred in the U.S. between 1990 and 2000. In addition, Dickert-Conlin and Chandra (1999) exploit tax benefit variation during the 1980s and early 90s, when benefits were less generous. Because we use more recent data, we are able to leverage the variation in benefits brought forth by the expansions of the
[Figure 2, four panels: (A) From Child #1, (B) From Child #2, (C) From Child #3, (D) From Child #4. Vertical axis ticks run from 0 to 5000; horizontal axis ticks run from 0 to 150,000. Series plotted for 1990, 1995 and 2000.]

Fig. 2. Child tax benefits – single mother.
EITC and the Child Tax Credit in the 1990s. This is particularly useful variation because it is not always an increasing function of income, as it almost exclusively was in the 1980s. This is important if we are concerned that higher income individuals are more likely to alter the timing of their child’s birth, independent of the financial benefit of doing so. If child tax benefits simply increase with income and those with higher income are more likely to alter birth timing, the estimate of the effect of tax benefits on birth timing would be biased upward. The non-linear relationship between income and tax benefits in the 1990s tax code allows us to eliminate this bias.
More recently, LaLumia et al. (2013) have revisited this question using data from the universe of tax returns filed between 2001 and 2010. They find a positive, but very small effect of tax incentives on birth timing, very much in line with the findings from this study. They also document that those with a late-December newborn are substantially more likely to report self-employment income that maximizes their benefits from the Earned Income Tax Credit compared to low-income filers with an early-January newborn. This is strong evidence of tax evasion.
In a similar vein, Gans and Leigh (2009) explore how the 2004 Australian Baby Bonus affected the timing of births. The one-time bonus, worth approximately $2250 USD, was announced just seven weeks before its introduction, when the treatment group was already in their third trimester of pregnancy. They estimate that over 1000 births were moved to a later date to ensure that the parents would receive the bonus. Most of the effect was due to changes in the timing of cesarean sections and inductions. Babies whose births were shifted to be eligible for the bonus were more likely to be of higher birth weight. Because the bonus was offered to babies born after a certain date, those altering their date of delivery were almost entirely those originally planning a c-section or induction. Of course, women who naturally went into labor before the bonus date could not opt to wait. Using the year-end cutoff of the U.S. tax benefit system allows us to see how all expectant mothers react to such incentives, and whether they “switch” their delivery method in order to ensure benefit receipt. We are also able to leverage variation in the size of benefits, rather than examining a flat benefit.
A similar result was found by Neugart and Ohlsson (2010) in Germany when the government changed its parental benefit system. Expectant mothers who were working would receive more generous maternity benefits if their child was born on or after January 1, 2007. The policy change increased economic incentives for women to postpone delivery, provided they were employed. They found that employed women near the end of their term were shifting their births to the new year so they could benefit from the new, more generous parental benefit system. There was no change in birth timing for mothers who were not employed and would not benefit from the policy change.
We make four important contributions to the current literature. First, by using the Vital Statistics, our study includes the universe of all births that occurred in the U.S. during our time period. Thus, we
are also able to examine how the tax benefits affect the behavior of different subgroups who we expect may either behave differently than the average parent or may be more at-risk for health complications. Second, by focusing our analysis on the 1990s, we are able to exploit the large variation in tax benefits caused by the Earned Income Tax Credit and the Child Tax Credit. Third, we determine the mechanisms through which birth timing manipulation is occurring. Lastly, and perhaps most importantly, we assess the health consequences to the newborn of elective birth timing changes. Although the current literature has been able to establish that parents respond to financial incentives in the timing of their child’s birth, it has not determined what the costs of this behavior are in terms of the health of the child. To our knowledge, ours is the first study to do so.

3.2. Manipulating birth timing

The advancement of medicine has allowed patients and doctors increased discretion over the timing of births. Cesarean sections allow doctors to deliver babies days (sometimes weeks) before they would have been born naturally. Labor can also be induced using a combination of drugs that first dilate the cervix (Cervidil or Cytotec) and then contract the uterus (Pitocin) (Block, 2008).
Unfortunately, not all women are well-informed about the consequences of elective labor induction or c-sections. Induced labors tend to be more difficult and painful and they increase a woman’s chance of having a c-section by two to three times (Block, 2008). 10 A c-section is major surgery and brings with it its own set of risks and complications: infection, hemorrhage, injury to organs, scar tissue formation inside the pelvic region, and extended recovery time. Babies born by cesarean are at higher risk for breathing problems, low Apgar scores, and fetal injury, as they may be nicked or cut during the incision (Shearer, 1993).
Although the altering of birth timing is often done for medical reasons, there is substantial evidence that birth timing is sometimes manipulated for reasons other than the health of the fetus or mother. A study of patients undergoing repeat cesarean sections found that of the 24,077 repeat cesarean deliveries at term observed, over half were performed electively; of these, 35.8% were performed before 39 completed weeks of gestation (Nita et al., 2009). Doctors often schedule c-sections at convenient times or at times when the hospital is better-staffed; far fewer babies are born on Saturdays and Sundays than on weekdays. In fact, throughout the 1990s, birthrates were approximately 30% higher on weekdays compared to weekends. 11 Gans and Leigh (2008) find that nearly one-third of U.S. and Australian babies who would have been born on a weekend were “moved” to a weekday. They conclude that increases in cesarean sections and inductions account for around four-fifths of the shift away from weekend births. Also using data from Australia and the U.S., Gans et al. (2007) find that there are fewer births on the days of the annual obstetricians and gynecologists’ conferences. There is, alternatively, a rise in births prior to the conferences, suggesting that physicians are shifting the births earlier to accommodate their schedules.
Parents may also want to shift the timing of their child’s birth. For instance, in Taiwan some people believe that date of birth determines much of a child’s fate; thus, they ask for their child’s birth to be scheduled on particular days (Lo, 2003). Using data from Australia, Gans and Leigh (2012) estimate the bargaining power that parents have over the timing of the birth of their children by exploiting the fact that parents are averse to having their children born on February 29 and April 1. Since physicians are not averse to working on those days, unless those days fall on a weekend, they examine what happens when parents’ desire to avoid those birth dates conflicts with doctors’ desire to avoid working on a weekend. Their results suggest that in approximately three-fourths of cases, the physician’s views prevail over those of the patient. Although most of the power rests with the doctors, parents win in approximately one-quarter of the cases. This suggests that parents are often willing and able to alter their delivery dates.

4. Data

We use data on the universe of all births in the United States from 1990 to 2001 from the National Center for Health Statistics’ Vital Statistics Natality Files. These data are constructed using birth certificate information from each state, and include demographic characteristics of the mothers and detailed information about the birth. Relevant to our study, the data include information on the month and year of birth, how the child was delivered, and a number of health measures for the newborn.
One limitation of the Natality data is that they do not include any information on family income, which is essential for determining the potential financial benefit of having a baby in December versus January. Because of this, we impute tax benefits based on the demographic information available to us in the Vital Statistics. These include state, year, race, mother’s education, mother’s age, marital status, and parity (number of children). A benefit of using education and other demographic characteristics to derive income estimates is that they are largely predetermined and not easily modifiable, whereas income can be manipulated in order to receive additional benefits (LaLumia et al., 2013). We derive income estimates using the 1990 5% Public-Use Census to calculate the average total family income by demographic cell. In order to exploit variation in the generosity of tax benefits over time, rather than potentially endogenous variation in cross-sectional income growth, we hold income fixed in real terms by adjusting the 1990 income estimates for inflation and using those income estimates for 1991–2000. 12 This is a common strategy in the public finance literature, and is used in order to avoid endogenous changes in income due to tax structures (Medicaid expansions: Gruber, 2005; EITC: Meyer and Rosenbaum, 1999).
We use NBER’s TAXSIM to calculate yearly tax liabilities based on income, filing status (single, married, or head of household), and number of dependents. To determine the financial benefit of having a birth in December instead of January, we find the difference between the tax liability without and with an additional dependent for each observation from the 1990 Census. We then average the benefit of the December birth by demographic cell and assign the average benefit to the observations in the Natality data. If a mother is listed as married, we assume she files her taxes jointly with her husband. If she is single, we assume that her filing status is “single” when she has no children, but changes to “head of household” once she has at least one child. Our sample includes all 50 states and the District of Columbia and all White and Black mothers. 13

10 Pitocin creates contractions that are stronger and more frequent than those produced naturally by the body; this can lead to hyperstimulation of the uterus – contractions that come too hard and too fast – which can cause oxygen deprivation for the fetus. If the baby is not delivered quickly, a c-section is needed to remove the baby before its oxygen levels drop too low.
11 Authors’ calculation using the Vital Statistics.
12 We are most concerned that changes in the child tax benefits led to changes in family income. Particularly with the EITC, small adjustments in income could lead to larger benefits. The annual inflation measures are from the Bureau of Labor Statistics.
13 Hispanics are included in these race categories: Black Hispanics as Black and White Hispanics as White.
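The imputation described in Section 4 can be sketched in a few lines. This is only an illustration under loud assumptions: `toy_tax_liability` is a hypothetical flat-rate schedule standing in for NBER's TAXSIM (it is not the TAXSIM interface and ignores filing status, the EITC and state taxes), and the Census and Natality rows are made-up records. Only the three steps follow the text: difference the liability without and with the extra dependent, average by demographic cell, and assign the cell average to birth records.

```python
from collections import defaultdict


def toy_tax_liability(income, dependents):
    """Hypothetical tax schedule, used only so the sketch runs (NOT TAXSIM)."""
    deduction = 10_000                 # assumed standard deduction
    exemption = 2_500 * dependents     # assumed per-dependent exemption
    taxable = max(income - deduction - exemption, 0)
    return 0.15 * taxable              # assumed flat marginal rate


def december_benefit(income, dependents_before_birth):
    """Liability without the new dependent minus liability with it."""
    return (toy_tax_liability(income, dependents_before_birth)
            - toy_tax_liability(income, dependents_before_birth + 1))


def average_benefit_by_cell(census_rows):
    """Average the per-observation benefit within each demographic cell."""
    totals = defaultdict(lambda: [0.0, 0])
    for row in census_rows:
        acc = totals[row["cell"]]
        acc[0] += december_benefit(row["income"], row["dependents"])
        acc[1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}


# Made-up Census observations; a cell here is (marital status, education, parity).
census = [
    {"cell": ("married", "college", 1), "income": 60_000, "dependents": 0},
    {"cell": ("married", "college", 1), "income": 40_000, "dependents": 0},
    {"cell": ("single", "hs", 2), "income": 11_000, "dependents": 1},
]
cell_benefit = average_benefit_by_cell(census)

# Made-up birth records: each inherits the average benefit of its cell.
natality = [
    {"id": 1, "cell": ("married", "college", 1)},
    {"id": 2, "cell": ("single", "hs", 2)},
]
imputed = [{**birth, "tax_benefit": cell_benefit[birth["cell"]]} for birth in natality]
```

Because the benefit is attached to the cell rather than to the individual, two mothers with the same demographic characteristics receive the same imputed value, which is why the text stresses that the cell-defining characteristics are largely predetermined.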
Table 1
Summary statistics – comparison of December and January.

                                (1)           (2)           (3)           (4)
Variable                        December      January       Difference    Standard Error
Number of births                2,432,731     2,383,621
Mean family income ($000)       49.14         49.02         0.12          0.02***
Mean tax saving ($000)          0.787         0.783         −0.003        0.0004***
Mean percent of income saved    2.862         2.843         0.019         0.0034***
% Married                       76.67         76.88         −0.21         0.04***
% Black                         13.66         13.82         0.16          0.03***
Mean age                        28.21         28.12         −0.09         0.004***
% C-sections                    24.87         24.39         −0.48         0.04***
% Inductions                    16.80         15.32         −1.48         0.03***
* p < 0.10. ** p < 0.05. *** p < 0.01.

Mother’s education is categorized as less than high school diploma, high school diploma, and college degree or more. We limit our sample to mothers between the ages of 20 and 45, and group mother’s age into five-year age intervals. Marital status is assigned according to whether the mother is married or single at the time of birth. Finally, we define parity as the number of children the mother has, including the new birth. We limit our sample to families with four or fewer children in order to avoid the idiosyncrasies of especially large families.
Table 1 presents the summary statistics for the births in our sample that occur in December or January. Mean family income for our sample is approximately $49,000 (in year 2000 dollars) and the average savings from having a December birth is approximately $790. While the benefit is, on average, 2.9% of family income, the largest savings are for families at the lowest income levels. The table also shows mean attributes of mothers and births that occur in December and January. Mothers who give birth in December are, on average, more likely to be White, married, and older than mothers who give birth in January. December births are also more likely to be induced or c-sections than those in January. These differences are statistically significant. We seek to assess how much of these differences is attributable to changes in birth timing due to tax incentives.
In later analyses, we expand our sample to include a set of comparison months, October and November, in order to assess the effect of the tax benefits on birth and health outcomes. Table 2 shows summary statistics for December and January and October and November. The second column shows the summary statistics for the months of December and January combined, while the third column shows the statistics for October and November combined. The fourth column of the table shows the difference in means between the attributes of December/January births and October/November births. While the differences in the mean characteristics of these births are not very large, many of the differences are statistically significant. These differences may be attributed to changes in birth timing and/or seasonality in the types of women who give birth across the year, with winter babies having, on average, poorer health (Stupp and Warren, 1994; Doblhammer and Vaupel, 2001; Bound and Jaeger, 1996). That is, the average mother who gives birth in December or January is slightly different than the average woman who gives birth in October and November. In order to assess how much of these differences in infant health is due to changes in birth timing, we are careful to control for demographic characteristics in our estimating equation as well as demographic-specific time trends so that seasonality does not contaminate our results.

5. Empirical analysis

5.1. Altering the timing of births

The empirical analysis utilizes individual-level Vital Statistics data, and we begin by restricting our sample to births that occurred in December or January. Consistent with the empirical strategies used in Dickert-Conlin and Chandra (1999) and LaLumia et al. (2013), we relate the probability that a child is born in December to the size of the benefit parents receive from having the child born before the new year. Thus, we estimate regressions in which the dependent variable, DecBirth, is an indicator variable equal to one if the birth occurred in December and zero if it occurred in January. The regressions take the form:

$$\mathrm{DecBirth}_i = \alpha + \beta\,\mathrm{TaxBenefit}_i + X_i\gamma + \delta_t + \varepsilon_i \qquad (1)$$

where $\mathrm{TaxBenefit}_i$ is the tax savings associated with a December birth relative to a January birth for individual $i$ (in thousands), $X_i$ is a vector of demographic characteristics as described below and $\delta_t$ is a vector of year fixed effects. 14 Our coefficient of interest is $\beta$, which measures the effect of a one thousand dollar increase in child tax benefits on the likelihood of having the birth in December relative to January. Tax benefits vary across year, family income, parity, marital status, and state. We exploit all aspects of this variation in order to determine whether those with higher tax benefits are more likely to have a December birth. However, if we are concerned that those with higher child tax benefits are more likely to have a December birth, independent of the benefit, then we want to control for these characteristics as well as possible. Table 3 shows estimates from this regression for a number of different specifications.
All specifications in Table 3, with the exception of Columns (3) and (4), include year fixed effects to control for unobservable factors within each year that might impact the likelihood of having a December birth and are estimated using a linear probability model. 15 Standard errors are clustered by state-year, because much of the variation in tax policy occurs at this level. 16 Columns (1) and (2) replicate Dickert-Conlin and Chandra (1999)’s methodology by controlling for total family income, mother’s income, mother’s age, race, marital status and education. We include an indicator variable equal to one if the family lives in an urban area and an indicator variable equal to one if the child is the first or second child born to the mother. Column (2) also includes an interaction term between the tax benefit for having the child in December and total family income to account for the possibility that families differ in their responsiveness to tax incentives across income levels.
The estimate in Column (1) of 0.0037 indicates that a $1000 increase in tax benefits is associated with approximately a 0.37 percentage point increase in the probability of a December birth. This point estimate corresponds to approximately 1 out of every 134 January births being moved to December for a $1000 increase in tax benefits. When we additionally control for the interaction between tax incentives and income in Column (2), the estimated effect of the tax benefit increases to 0.54 percentage points. Both effects are precisely estimated. These translate to approximately a 0.7% change on the base probability of being born in December

14 Dollar values are indexed to the year 2000 and imputed based on the individual’s demographic characteristics.
15 The marginal effects from a probit estimation, as in Dickert-Conlin and Chandra (1999), are very similar, which is not surprising because the predicted probabilities for all observations are close to 0.5. Linear probability is our preferred model because in many specifications we will be interpreting interaction terms, and this can be problematic in nonlinear models (Ai and Norton, 2003).
16 Alternatively, we have clustered on year-parity and year-state-parity, and this does not substantively change our results.
Table 2
Summary statistics – December and January compared to comparison months.

                                (1)             (2)             (3)           (4)
Variable                        Dec. and Jan.   Oct. and Nov.   Difference    Standard Error
Number of births                4,816,352       4,850,797
Family income ($000)            49.08           49.55           0.47          0.02***
Tax saving ($000)               0.785           0.787           −0.001        0.00***
% of income saved               2.85            2.83            −0.02         0.00***
% Married                       76.77           77.29           0.52          0.03***
% Black                         13.74           13.25           −0.49         0.02***
Age                             28.17           28.25           0.08          0.00***
% C-sections                    24.63           24.56           −0.07         0.02***
% Inductions                    16.09           16.43           −0.36         0.02***
% Non-induced C-sections        20.90           20.79           −0.11         0.03***
% Induced C-sections            3.11            3.20            0.08          0.01***
% Non-induced vaginal           63.36           63.06           0.29          0.03***
% Induced vaginal               13.02           13.29           −0.27         0.02***
Birthweight (g)                 3352.83         3362.11         9.27          0.38***
% Low birthweight (<2500 g)     6.79            6.59            −0.20         0.02***
1 Min Apgar Score               8.071           8.070           −0.001        0.0014
5 Min Apgar Score               8.950           8.948           0.0013        0.0006**
% Normal Apgar Score (>6)       99.00           99.00           0.00          0.00
% Ventilation (<30 min)         1.86            1.90            0.04          0.01***
% Ventilation (≥30 min)         0.797           0.788           −0.01         0.006*
% Any ventilation               2.66            2.69            −0.03         0.01***
* p < 0.10. ** p < 0.05. *** p < 0.01.
Table 3
Probability of a December birth.

Variable                  (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)
                          Dec birth   Dec birth   Dec birth   Dec birth   Dec birth   Dec birth   Dec birth   Dec birth
TaxBen (000s)             0.0037***   0.0054***   0.0002      0.0031**    0.0035***   0.0031**    0.0032**
                          (0.0008)    (0.0012)    (0.0011)    (0.0015)    (0.0009)    (0.0013)    (0.0014)
% Saving                                                                                                      0.0003***
                                                                                                              (0.0001)
Family Inc (000s)         −0.0002***  −0.0002**               −0.0001***  −0.0001***
                          (0.0001)    (0.0001)                (0.0000)    (0.0000)
Mother’s Inc (000s)       0.0001      0.0001
                          (0.0001)    (0.0001)
TaxBen × Family Inc                   −0.0000
                                      (0.0000)
First or Second Child     −0.0016**   −0.0015**
                          (0.0007)    (0.0007)
Marital Status            0.0122***   0.0118***               0.0099***   0.0091***
                          (0.0023)    (0.0023)                (0.0020)    (0.0022)
Mother’s Ed.              −0.0001     −0.0001
                          (0.0002)    (0.0002)
Urban Residence           −0.0039***  −0.0039***
                          (0.0010)    (0.0010)
Mother’s Age              0.0005***   0.0005***
                          (0.0001)    (0.0001)
Black                     −0.0007     −0.0006                             −0.0010
                          (0.0011)    (0.0011)                            (0.0014)
Year FE                   Yes         Yes         No          No          Yes         Yes         Yes         Yes
State FE                  No          No          No          No          Yes         Yes         Yes         Yes
Age FE                    No          No          No          No          Yes         No          No          No
Parity FE                 No          No          No          Yes         Yes         Yes         Yes         Yes
Education FE              No          No          No          No          Yes         No          No          No
Income × Parity           No          No          No          No          No          Yes         Yes         No
Demographic Group FE      No          No          No          No          No          Yes         Yes         Yes
Dem. Group Time Trend     No          No          No          No          No          No          Yes         Yes
Standard errors in parentheses. Notes: Standard errors are clustered by state-year. The mean of Dec Birth is 0.505. There are 4,816,352 observations in columns (1)–(7), and 4,816,076 observations in column (8). The estimates displayed in each column include different control variables, as described in pages 16–18.
* p < 0.10. ** p < 0.05. *** p < 0.01.
of 50.5%. These estimates are significantly smaller in magnitude than those of Dickert-Conlin and Chandra (1999), who estimate the effect to be between 22.7 and 34.4 percentage points. Even when we scale up our estimates by four to account for the fact that we are looking at the entire month of December instead of just the last week as in Dickert-Conlin and Chandra (1999), our estimates are less than one sixteenth the magnitude of theirs. While this discrepancy in our findings may be attributable to a number of causes, which we discussed in Section 3.1, we are confident that this difference is not a result of attenuation bias caused by our imputed income measures, as our findings are quite close to those of LaLumia et al. (2013), who use exact income data from tax returns to measure this relationship.
In Columns (3)–(7), we test the sensitivity of these results to alternative specifications. Column (3) does not include any control variables, while Column (4) includes only a linear control for income, an indicator variable equal to one if the mother is married, and a full set of parity fixed-effects to account for the fact that parents may be more (or less) comfortable altering the timing of their subsequent births as they become more familiar with the birth process and child tax benefits. To control more flexibly for demographic characteristics, we first include age-group and education level fixed-effects (Column (5)), and subsequently use demographic group fixed-effects, where a demographic group is defined by race, age-group, education level, and marital status (Column (6)). 17 The demographic group fixed-effects control for differences in preferences and ensure our results are not driven by seasonality in births that varies across demographic groups. We include state fixed-effects to account for time-invariant geographic variation in medical practices that might influence the probability of a December birth and be spuriously correlated with tax values. We also control for income interacted with parity. We include these interaction terms, rather than one linear control for income, because of income’s differential effect on child tax benefits by parity. 18 In Column (7), we also include a linear time trend for each demographic group to control for changes in the prevalence of birth timing manipulation by demographic group as well as changes in seasonality over time. Finally, in Column (8) we show estimates when using the percent of income saved as the treatment variable. This specification is otherwise identical to that in Column (7), except we exclude the income controls. In additional analyses shown in Section 6, we isolate separate components of the variation in child tax benefits. Those estimates lack statistical power, which implies that in order to identify the relationship between the tax benefits and birth timing, we must utilize all the different sources of variation in the tax benefits.
The difference between Column (3) and Column (4) highlights the importance of including controls for marital status, income, and parity. Including additional control variables in Columns (5)–(7) has very little effect on the point estimates, and although the precision of our estimates drops slightly, they are still statistically significant at the 5% level. Our preferred specification is that in Column (7), which includes the full set of fixed-effects as well as the demographic group time trend. This is the most conservative specification, as the demographic group time trend absorbs some of the variation in tax benefits. However, in order to identify the causal effect of the tax benefit on birth timing (and subsequently on health outcomes), it is vital that we properly control for differences in demographic characteristics of the mothers.
The estimate on the effect of the percent of income saved from the tax benefit shown in Column (8) further supports our findings. A one percentage point increase in the percent of income saved increases the probability of being born in December by 0.03 percentage points. This point estimate, along with the negative coefficient on the interaction between income and tax benefit from Column (2), implies that it may be the lower income families who are most responsive to the tax incentive. We investigate this further in Section 6 when we assess the effect of an EITC policy change on birth timing for low income mothers. We use the level of tax savings as our treatment variable in all subsequent analyses for consistency with the previous literature.

5.2. Mechanisms

Next, we seek to identify the mechanisms through which birth timing manipulation occurs. We estimate two additional equations for each delivery method in order to determine whether birth timing and/or delivery methods are manipulated in response to the child tax benefits. By interpreting these coefficients together with our preferred specification from Column (7) of Table 3, we are able to identify two different types of manipulation: “shifting” and “switching.” We define shifting as having a birth occur in December rather than in January. If only shifting occurs, the type of delivery remains unchanged, but occurs earlier than it would have in the absence of the tax benefits. Switching is defined as a change in the type of delivery. An example of switching is having a c-section instead of a vaginal delivery in order to ensure a December birth. A birth can be shifted and/or switched in order to receive the tax benefits. One can schedule an inevitable c-section for December rather than January. In this instance, the birth is shifted, but not switched. A birth that is switched, but not shifted, would occur if a vaginal birth would have happened in December, but is switched to a December c-section to ensure receipt of the benefit. A birth can also be both shifted and switched. This would occur if a child would have been born by vaginal delivery in January, but is instead born by c-section in December. Both types of manipulation may have health consequences for the newborn, which will be addressed later in the paper.
To determine whether c-sections are shifted in order to receive the tax benefit, we estimate Eq. (1), while further isolating our sample to births that occurred by c-sections. 19 In this regression, the coefficient on $\mathrm{TaxBenefit}_i$, displayed in Column (2) of Panel A of Table 4, can be interpreted as the effect of an additional $1000 in tax benefits on the likelihood of a December c-section birth. We estimate that an additional $1000 in tax benefits shifts approximately 1 per 75 January c-sections to December. An analogous estimate for vaginal births is displayed in Column (3) and for induced births in Column (4). While the coefficient for vaginal births is not statistically significant, the coefficient for induced births suggests that 1 per 72 January inductions is shifted to December. Evaluating these point estimates at their respective 95% confidence intervals suggests that between 2 and 11 for every 489 January c-sections shift earlier and between 1.5 and 12 for every 489 January inductions shift earlier. These results allow us to confirm that changes in birth timing happen through the most likely channel: scheduling c-sections and inductions earlier. 20

17 There are 60 demographic groups: 5 age groups × 3 education groups × 2 races × 2 marital statuses.
18 Results do not rely on the manner in which we control for income. Results are robust to inclusion of a smooth polynomial or spline in income.
19 Births that result in a c-section can either be induced or non-induced.
20 It is important to note that if we were to also find evidence of “switching”, that effect would account for part of this estimated effect. For example, if five December births switch from vaginal to c-section deliveries, it will inflate the number of December c-sections relative to January c-sections, even though they were not shifted earlier. We do not, however, find evidence of switching.
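The "1 out of every N January births" figures quoted in Sections 5.1 and 5.2 appear to follow from dividing the January share of births (one minus the December mean) by the coefficient, and the confidence-interval counts from applying plus or minus 1.96 standard errors to the coefficient, expressed per 1000 December and January births of that delivery type (489 of which fall in January for c-sections and inductions). The sketch below reproduces that arithmetic from the coefficients, standard errors and means reported in Tables 3 and 4; the function names are illustrative, and the interpretation of the conversion is a reading of the text rather than a formula stated by the authors.

```python
def one_in_n_january(beta, december_mean):
    """January births per one birth moved to December, per $1000 of benefit."""
    january_share = 1.0 - december_mean
    return january_share / beta


def ci95_per_thousand(beta, se):
    """95% interval for the coefficient, scaled to births per 1000."""
    low = (beta - 1.96 * se) * 1000
    high = (beta + 1.96 * se) * 1000
    return low, high


# Table 3, Column (1): all births, December mean 0.505.
all_births = one_in_n_january(0.0037, 0.505)

# Table 4, Panel A, Columns (2) and (4): c-sections and inductions, mean 0.511.
c_sections = one_in_n_january(0.0065, 0.511)
inductions = one_in_n_january(0.0068, 0.511)

# Intervals per 1000 December and January births of each type.
c_section_ci = ci95_per_thousand(0.0065, 0.0023)
induction_ci = ci95_per_thousand(0.0068, 0.0027)

# Relative change on the base December probability for Column (1), in percent.
relative_change = 100 * 0.0037 / 0.505

print(round(all_births), round(c_sections), round(inductions))
print([round(x, 1) for x in c_section_ci], [round(x, 1) for x in induction_ci])
print(round(relative_change, 2))
```

Rounded, these give 134, 75 and 72 January births per shifted birth, intervals of roughly 2 to 11 and 1.5 to 12, and a relative change of about 0.7%, in line with the figures reported in the text.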
Table 4
Shifting and switching births.

(Panel A) Shifting
                (1)          (2)          (3)          (4)          (5)          (6)          (7)          (8)
                All          C-section    Vaginal      Induction    C-section    C-section    Vaginal      Vaginal
                                                                    Non-Induced  Induced      Non-Induced  Induced
TaxBen (000s)   0.0032**     0.0065***    0.0024       0.0068**     0.0051**     0.0089       0.0020       0.0051*
                (0.0014)     (0.0023)     (0.0015)     (0.0027)     (0.0025)     (0.0057)     (0.0016)     (0.0030)
Mean            0.505        0.511        0.503        0.511        0.510        0.507        0.501        0.512
N               4,816,352    1,184,823    3,631,529    789,270      993,602      151,391      2,976,285    637,879

(Panel B) Switching
                  (1)          (2)          (3)          (4)          (5)          (6)          (7)
                  C-section    Vaginal      Induction    C-section    C-section    Vaginal      Vaginal
                                                         Non-Induced  Induced      Non-Induced  Induced
TaxBen × DJ Birth −0.0005      0.0004       0.0003       −0.0009      0.0000       0.0002       0.0002
                  (0.0013)     (0.0007)     (0.0007)     (0.0008)     (0.0003)     (0.0008)     (0.0006)
TaxBen (000s)     0.0018       −0.0012      −0.0032***   0.0049***    −0.0034***   −0.0015      0.0002
                  (0.0015)     (0.0013)     (0.0012)     (0.0013)     (0.0006)     (0.0016)     (0.0010)
DJ Birth          0.0008       −0.0008      0.0013*      0.0016*      −0.0002      −0.0024***   0.0016***
                  (0.0017)     (0.0007)     (0.0007)     (0.0010)     (0.0003)     (0.0008)     (0.0006)
Mean              0.246        0.763        0.165        0.208        0.032        0.630        0.134
N                 9,667,149    9,553,218    9,551,033    9,551,033    9,551,033    9,505,286    9,505,286
Standard errors in parentheses. Notes: All regressions include state, birth order, year and demographic group fixed effects, as well as demographic group linear time trends and income × birth order controls. Standard errors are clustered by state-year.
* p < 0.10. ** p < 0.05. *** p < 0.01.
Columns (5)–(8) further divide the sample into four mutually exclusive delivery methods: non-induced c-section, induced c-section, non-induced vaginal, and induced vaginal. We are more cautious about these results since the data for inductions is less complete. Induction status is unknown for 3.4% of c-sections and 0.7% of vaginal births, and we only use observations for which we know the induction status. However, this analysis provides additional insight on which types of births are most likely to have their timing changed for non-medical reasons. Induced c-sections should be considered unplanned c-sections, as labor is only induced when a vaginal birth is intended. Both induced c-sections and vaginal births can be scheduled or not. A non-induced vaginal birth is the closest proxy we have for a spontaneous, non-scheduled birth. We consider the estimates for these births a falsification test to show that birth timing, and not conception, is being manipulated in order to gain the additional tax benefit of a December birth. We lose some precision when assessing each delivery type separately, particularly for induced c-sections, which account for only 3.2% of all births. Nonetheless, we find that non-induced c-sections and induced vaginal births are driving the shifting results. The coefficient for vaginal non-induced births is small and is not statistically different from zero, which confirms that shifting is only occurring for births whose timing can be shifted.

In order to examine whether any of the changes in the timing of birth can be attributable to switching the type of delivery a woman has, we assess whether the tax benefit is correlated with the likelihood of having a certain type of delivery in December or January relative to a set of comparison months. For c-section births, the regression takes the form:

\[ \mathrm{CSec}_i = \alpha + \beta_1\,\mathrm{TaxBenefit}_i \times \mathrm{DJ}_i + \beta_2\,\mathrm{TaxBenefit}_i + \beta_3\,\mathrm{DJ}_i + X_i + t + \varepsilon_i \qquad (2) \]

where we extend our sample to also include two comparison months, October and November. 21 CSec_i is an indicator variable equal to one for c-section births and zero otherwise, and DJ_i is an indicator variable equal to one for births that occur in December or January and zero otherwise. The remaining variables are the same as in Eq. (1). We assign the observations from the comparison months the benefit they will receive from having their child. It is important to note that these parents are not at risk of not receiving the benefit based on small timing changes (unlike the December and January parents). However, they act as a comparison group as they allow us to observe how those who receive the same benefits as their December counterparts behave in the absence of a timing change incentive. β1 is our coefficient of interest; it measures whether c-sections are more likely in December and January compared to October and November for families with higher potential tax benefits. Families with higher tax benefits may be more or less likely to have a c-section for reasons unrelated to the timing change incentive. Similarly, babies born in December or January may be more or less likely to be born by c-section (again, unrelated to the timing change incentive). By controlling for these factors, we can isolate the impact of the tax benefit on type of delivery. This regression does not capture shifts from January to December, as these two months are both “treated.” Rather, it captures whether the tax benefits affect the difference in the likelihood of having a c-section in December or January relative to the comparison months. In order to fully understand whether mothers are shifting and/or switching their births in order to gain additional tax benefits, we interpret these results together with the shifting results. We estimate an analogous equation for each type of delivery.

21
Results are robust to using a variety of comparison months. In order to minimize potential confounding factors resulting from differences in mothers’ characteristics throughout the year, we use the two months most similar to December and January based on a number of predetermined characteristics, including number of births, income, tax savings, percent of income saved, mother’s age, education, marital status, and race. Compared to all other consecutive month-pairs in the year, October and November is the most similar month-pair to December and January.

The estimates of β1 for each delivery type are found in the top row of Panel B of Table 4. For all delivery methods, the coefficients are small and are not statistically different from zero. This suggests that parents are not manipulating birth timing by changing the method of delivery and that changes in birth timing are driven solely by scheduling inevitable c-sections and inductions earlier. Accordingly, any health effects we find in our following analyses are considered to be primarily a result of changing the timing of the births and not changing the type of delivery.

5.3. Infant health outcomes

One of the benefits of using the Vital Statistics for this analysis is the great deal of information it has about each birth, including information about labor and short-term health outcomes of the infant. By assessing the effect of the tax benefit on a number of health outcomes, we seek to answer the broader question of whether small, elective changes in birth timing have any effect on newborn health. Our main infant health outcomes of interest are birthweight and Apgar score. We consider two birthweight measures: log birthweight and whether a newborn is low birthweight. The log of birthweight is used rather than the level in order to more easily relate our findings to the existing literature on the longer-term consequences of lower birthweight. Low birthweight babies are those weighing less than 2500 g at birth. The Apgar scores come from a test given 1 and 5 min after birth that quickly assesses the infant’s health through activity, pulse, grimace (reflex irritability), appearance, and respiration. Each category is scored from 0 to 2. A total score between 7 and 10 is generally considered normal. Birthweight and Apgar score are valuable outcomes to assess for four main reasons. First, they both are reported for nearly all births. This allows us the statistical power necessary to determine even small effects of birth timing changes on these outcomes. Second, since they are both continuous variables, we can observe marginal changes in the newborns’ health. They are also both strong indicators of overall health, and can reflect more specific health problems not observed in our data. Important developmental gains are made in the final weeks of gestation that would be reflected in birthweight and Apgar score. For instance, the lungs and liver are not fully developed until 36 to 38 weeks, and even when fully developed, they continue to strengthen during the final week(s) (March of Dimes, 2012c). Basic weight gain is also an important part of the final weeks of fetal development. In the third trimester, babies gain an average of 225 g per week (The American College of Obstetricians and Gynecologists, 2010). Finally, a number of studies have sought to determine the causal effect of birthweight on longer-term outcomes, allowing us to estimate the effect of birth timing changes on the health and well-being of children later on in life. We discuss these in more detail in Section 7.

We also assess the effect of birth timing manipulation on assisted ventilation. Newborns with respiratory failure are placed in mechanical ventilators, which help them breathe. The Vital Statistics reports whether the newborn used assisted ventilation for less than 30 min or for 30 min or more. Since lung function is one of the final stages of fetal development, ventilation is a valuable health outcome to assess. Newborns that are born early are particularly susceptible to respiratory difficulties. 22

22
The main drawback of assessing ventilation as an outcome is that ventilation status is not reported for all births, in a way that appears to be non-random.

To assess the health consequences of birth timing manipulation, we compare infant health measures in December and January to the comparison months, October and November, as in Eq. (2). Again, β1 (the coefficient on the tax benefit and December/January birth interaction) is the coefficient of interest, as it describes the effect of an additional $1000 in tax benefits on the health of babies born in December or January relative to those born in the comparison months. β3 captures the difference in the average health of the “treated” and comparison babies that is not attributable to the tax benefit. By controlling for the tax benefit and December/January births separately, we account for factors that may affect infant health independent of birth timing changes. The summary statistics from Table 2 show that the average infant born in December or January is different (both in terms of their health and delivery method) than the average infant born in October or November, as are their parents.

We use this approach, rather than comparing December to January, in order to avoid the issue of selection in the decision to manipulate birth timing. For example, it is likely that only low-risk births are shifted earlier, in which case the estimated impact would be biased downward. 23 By assessing the impact on the health of December and January babies together, we are able to eliminate this source of bias. 24 In Appendix A Table A.10, we show that not using a comparison group would lead one to conclude that babies born in December look healthier as a result of shifting. This is what we would expect from positive selection into birth timing.

23
Imagine two babies who are due on January 5. Baby A would weigh 3500 g if born on January 5, while Baby B would be 2500 g. The doctors know that Baby B is very small, so they hold off as long as possible, and she is delivered on January 5, weighing 2500 g. The c-section to deliver Baby A is scheduled for December 31 and she is born at 3300 g. If we compare the shifted baby to the non-shifted one, we would think that the shifting made the baby bigger. In reality, Baby A is 200 g smaller because of the earlier delivery.
24
This would also eliminate any upward bias stemming from selection, though we have no reason to expect this.

It is important to note that the estimates from our estimation strategy are also capturing the health effects of births that shift within December or January as a result of the tax benefit. That is, there may be parents who would have had a baby born in December anyway who decide to move up the birth to ensure a December delivery. There may also be parents who try to have a December birth, but because of delays from the induction process still end up having a January birth. 25 Doctors may also move up the scheduled births of December babies in order to accommodate the additional babies that will be born in the final week of December. In all these cases the birth is hastened as a result of the tax benefit, but there is no shifting from January into December. Our estimates in Sections 5.1 and 5.2 do not capture these changes in birth timing; however, the estimates from our health regressions do. As a result, we are cautious when interpreting the estimates from our earlier analyses together with the health results.

25
The time between induction and delivery differs greatly across women. It takes just a few hours for some and up to three days for others (March of Dimes, 2012a).

Table 5 displays the health results for all births. Columns (1) and (2) show that higher potential tax benefits are associated with a decrease in log birthweight and an increase in the likelihood of being a low birthweight baby. This is consistent with the shorter gestational periods that result from shifting births earlier (which we investigate in more detail in Section 5.4). Evaluating the point estimate from Column (1) at the bounds of its 95% confidence interval suggests that an additional $1000 in benefits causes between a 0.07% and 0.19% decrease in average birthweight, or between 2.41 and 6.37 g. We also estimate a 0.08 percentage point increase in the likelihood of being born below 2500 g. On average, approximately 68 babies
Table 5
Infant health outcomes for all births.
                    (1)          (2)          (3)          (4)          (5)          (6)          (7)          (8)
                    Log BW       Low BW       Apgar1       Apgar5       Apgar5 ≥7    Any Vent     Vent. <30 m  Vent. >30 m
                                 (<2500 g)
TaxBen × DJ Birth   −0.0013***   0.0008**     −0.0029      −0.0030**    −0.0003      0.0002       0.0001       0.0002
                    (0.0003)     (0.0004)     (0.0054)     (0.0015)     (0.0002)     (0.0003)     (0.0002)     (0.0002)
TaxBen (000s)       0.0021***    −0.0015**    −0.0035      0.0048**     0.0000       −0.0006      −0.0006      −0.0000
                    (0.0006)     (0.0006)     (0.0062)     (0.0021)     (0.0003)     (0.0005)     (0.0005)     (0.0002)
DJ Birth            −0.0017***   0.0012***    0.0061*      0.0015       0.0003*      −0.0002      −0.0002      0.0000
                    (0.0003)     (0.0004)     (0.0035)     (0.0013)     (0.0002)     (0.0002)     (0.0002)     (0.0001)
Mean                8.096        0.068        8.074        8.947        0.987        0.027        0.019        0.008
N                   9,659,321    9,659,321    3,233,350    7,493,949    7,493,949    9,082,543    9,082,543    9,082,543

and flexible income controls interacted with demographic characteristics. Standard errors are clustered by state-year.
* p < 0.10. ** p < 0.05. *** p < 0.01.
per 1000 are classified as low birthweight, and we find that this number increases by approximately 0.8 for an additional $1000 in benefits. The 5-min Apgar score of the newborn is also negatively affected by the tax benefit, but the point estimate is quite small (a 0.003 point decrease for a $1000 increase in benefits). The point estimate for the 1-min Apgar score is very similar, but it is only recorded for approximately one third of births and is less precisely estimated. We find no effect of the tax benefit on ventilation. While the point estimates from the ventilation regressions are positive (suggesting an increased need for ventilation), ventilation is a rare occurrence and the standard errors on the estimates are large.

The average effects that we find are quite small, but we know that these are driven by the small number of births that are actually shifted. If we assume that the entire change in the average is driven by the few births that are successfully shifted from January to December (what we show in Table 4), the health cost to each of the shifted babies would be quite large. 26 We also test whether TaxBenefit_i × DJ_i has negative health effects across all outcomes we look at, 27 and we are able to reject the null hypothesis through bootstrapped replications. 28

26
For example, using our estimates from Section 5.1 and dividing our “second stage” estimate by our “first stage” estimate, we can calculate that each shifted baby’s birthweight declines by an average of 40% (−0.0013/0.0032). We remind the reader, however, that this calculation assumes that the effect on birthweight only operates through the births shifted into December, an assumption that we believe is very unlikely to be true. In addition, the 90% confidence interval, calculated using a bootstrap with 200 replications, is quite large. The 5th and 95th percentiles of the sampling distribution are −111% and −20%, respectively.
27
We include ventilation of <30 min and >30 min, but don’t include “any” ventilation, because it is a linear combination of those two variables.
28
Specifically, we test the null hypothesis that the β1 for the outcomes satisfy: LogBW ≥ 0 & LowBW ≤ 0 & Apgar1 ≥ 0 & Apgar5 ≥ 0 & NormApgar ≥ 0 & Vent less than 30 m ≤ 0 & Vent more than 30 m ≤ 0. In 200 bootstrap replications, there weren’t any where all 7 of these restrictions held. Table A.12 in Appendix A gives the information needed to test alternative joint tests. It lists the percent of replications where 0–7 of the restrictions held.

Table 6 contains the results for the delivery methods responsible for the majority of the manipulation: non-induced c-sections and induced vaginal deliveries. The magnitudes of the birthweight point estimates are larger for babies born via non-induced c-sections, suggesting that this group is driving the health estimates. It is important to keep in mind, however, that while we do not find evidence of switching from one delivery method to another, the observed responses in these tables could be a result of changes in the composition of babies born via the delivery method examined, and should be interpreted with caution.

5.4. Gestation

We next seek to determine the magnitude of the birth timing changes. While the Vital Statistics does not include any information on the due date of the infant or whether the birth was considered “earlier than normal,” we can assess the effect of the tax benefit on gestational age at birth. Similar to the methodology used to determine whether delivery methods were being switched, we estimate the following equation:

\[ \mathrm{Gestation}_i = \alpha + \beta_1\,\mathrm{TaxBenefit}_i \times \mathrm{DJ}_i + \beta_2\,\mathrm{TaxBenefit}_i + \beta_3\,\mathrm{DJ}_i + X_i + t + \varepsilon_i \qquad (3) \]

where gestation is measured in weeks. Again, β1 is our coefficient of interest, as it tells us whether the tax benefit is correlated with gestation for infants born in December or January. Column (1) of Table 7 shows results from this regression. We find that, on average, a $1000 increase in the tax benefit for a child leads to a 0.033 week decrease in gestation relative to the comparison months. Once again, this estimate captures the decrease in gestation caused by moving births from January into December as well as births that may have occurred earlier in December or January than they would have otherwise in order to secure receipt of the benefits. Thus, these estimates do not only tell us by how much the January births are shifted in order to receive the benefits as a December birth. The 0.033 week decrease in gestation, on average, equates to approximately a 0.23 day (5.5 h) reduction in gestation per child born in December and January (not just those who are shifted). The implied birthweight decrease from this reduction in gestation is consistent with the birthweight estimates in Table 5. Babies gain approximately 225 g per week in the final trimester of pregnancy. 225 g/week × −0.033 weeks = −7.43 g. A 7.43 g decrease in birthweight is 0.22% of the mean birthweight of 3351 g.

While gestation is certainly an important outcome for us to assess, it is not without its shortcomings. Gestation is measured in weeks, while the shifting we observe may only be on the magnitude of days. Thus, not all shifts in birth timing translate to changes in reported gestation. Gestation is computed using the dates of the birth of the child and the mother’s last normal menstrual period (LMP). The LMP is used as the initial date because it can be more accurately determined than the date of conception, which usually occurs 2 weeks after the LMP. In some instances (5.1% of births in 1999) a clinical estimate of gestation is used rather than the time between the LMP and birth. This is done for normal weight births of apparently short gestations and very low birthweight births reported to be full term. The most common use of the clinical estimate is for instances when the date of the LMP is not reported.
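The back-of-the-envelope figures quoted in Sections 5.3 and 5.4 can be retraced from the rounded values printed in Tables 4, 5 and 7. The sketch below is only an arithmetic check, not the authors' code: it assumes a normal critical value of 1.96 for the 95% interval, and because it starts from rounded table entries its results can differ slightly from figures the text derives from unrounded estimates (for example, the gram range for birthweight).

```python
# Arithmetic check using rounded values copied from Tables 4, 5 and 7.
Z_95 = 1.96  # assumed normal critical value for a 95% interval


def ci95(coef, se, z=Z_95):
    """Return (lower, upper) bounds of a normal-approximation interval."""
    return coef - z * se, coef + z * se


# Table 5, Column (1): effect of $1000 in benefits on log birthweight.
bw_lower, bw_upper = ci95(-0.0013, 0.0003)
bw_range_pct = (abs(bw_upper) * 100, abs(bw_lower) * 100)  # smallest, largest decrease

# Table 7, Column (1): effect of $1000 in benefits on gestation, in weeks.
gest_weeks = -0.0330
gest_days = gest_weeks * 7
gest_hours = gest_days * 24

# Implied birthweight change at 225 g of weight gain per week,
# expressed in grams and as a percent of the 3351 g mean birthweight.
implied_grams = 225.0 * gest_weeks
implied_share_pct = implied_grams / 3351.0 * 100

# Footnote 26: "second stage" estimate divided by "first stage" estimate.
per_shifted_birth = -0.0013 / 0.0032

print(f"birthweight decrease: {bw_range_pct[0]:.2f}% to {bw_range_pct[1]:.2f}%")
print(f"gestation change: {gest_days:.2f} days ({gest_hours:.1f} h)")
print(f"implied birthweight change: {implied_grams:.3f} g ({implied_share_pct:.2f}%)")
print(f"per-shifted-birth ratio: {per_shifted_birth:.3f}")
```

The ratio in the last line inherits the strong assumption stated in footnote 26, namely that the whole average effect runs through births shifted into December.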
Table 6
Infant health outcomes by delivery type.
(Panel A) Non-induced C-section births
(1) (2) (3) (4) (5) (6) (7) (8)

TaxBen × DJ Birth −0.0029*** 0.0030*** −0.0179 −0.0023 −0.0002 −0.0007 −0.0003 −0.0004
(0.0010) (0.0011) (0.0141) (0.0039) (0.0007) (0.0007) (0.0006) (0.0005)
TaxBen (000s) 0.0020 −0.0048*** 0.0050 −0.0116** −0.0021*** 0.0000 −0.0006 0.0007
(0.0013) (0.0014) (0.0151) (0.0045) (0.0008) (0.0010) (0.0008) (0.0006)
DJ Birth −0.0013 0.0010 0.0122 −0.0006 0.0001 0.0012* 0.0005 0.0007
(0.0009) (0.0011) (0.0094) (0.0034) (0.0006) (0.0007) (0.0005) (0.0004)
Mean 8.065 0.120 7.870 8.872 0.979 0.045 0.027 0.018

N 1,988,256 1,988,256 676,412 1,507,491 1,507,491 1,868,544 1,868,544 1,868,544
(Panel B) Induced vaginal births
(1) (2) (3) (4) (5) (6) (7) (8)

TaxBen × DJ Birth −0.0004 −0.0006 0.0471*** 0.0004 0.0002 0.0002 −0.0002 0.0004
(0.0007) (0.0009) (0.0156) (0.0034) (0.0005) (0.0007) (0.0007) (0.0004)
TaxBen (000s) 0.0017 0.0011 −0.0071 0.0103** 0.0003 0.0001 −0.0002 0.0003
(0.0011) (0.0013) (0.0167) (0.0041) (0.0006) (0.0011) (0.0010) (0.0004)
DJ Birth −0.0027*** 0.0017** −0.0289*** −0.0001 0.0000 −0.0007 −0.0003 −0.0003
(0.0007) (0.0008) (0.0109) (0.0031) (0.0004) (0.0007) (0.0006) (0.0003)
Mean 8.125 0.047 8.073 8.957 0.991 0.028 0.023 0.005

N 1,271,380 1,271,380 350,223 1,067,918 1,067,918 1,229,634 1,229,634 1,229,634
and flexible income controls interacted with demographic characteristics. Standard errors are clustered by state-year.
* p < 0.10. ** p < 0.05. *** p < 0.01.
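The role of the interaction coefficient β1 in Eqs. (2) and (3) can be illustrated on simulated data. This is a minimal sketch under loud assumptions: the data are synthetic, the controls X_i and the fixed effects are omitted, and the coefficients used to generate the data are arbitrary values chosen for illustration. With only an intercept, TaxBenefit, DJ and their interaction on the right-hand side, the least-squares estimate of β1 equals the slope of the outcome on TaxBenefit among December/January births minus the same slope among comparison-month births, which is what the code computes.

```python
import random


def ols_slope(x, y):
    """Slope from a least-squares regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def simulate(n, beta1, seed=0):
    """Draw synthetic births; none of these numbers come from the Vital Statistics."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        tax_benefit = rng.uniform(0.0, 5.0)  # hypothetical benefit in $1000s
        dj = rng.randint(0, 1)               # 1 = Dec/Jan birth, 0 = Oct/Nov birth
        gestation = (39.0 + beta1 * tax_benefit * dj + 0.04 * tax_benefit
                     + 0.03 * dj + rng.gauss(0.0, 0.5))
        rows.append((tax_benefit, dj, gestation))
    return rows


def interaction_estimate(rows):
    """Estimate beta1 as the difference in TaxBenefit slopes between the two groups."""
    treated = [(t, g) for t, d, g in rows if d == 1]
    comparison = [(t, g) for t, d, g in rows if d == 0]
    return ols_slope(*zip(*treated)) - ols_slope(*zip(*comparison))


TRUE_BETA1 = -0.033  # illustrative value only
rows = simulate(100_000, TRUE_BETA1)
beta1_hat = interaction_estimate(rows)
print(f"true beta1 = {TRUE_BETA1}, estimate = {beta1_hat:.4f}")
```

Because the comparison months absorb the relationship between tax benefits and the outcome that is unrelated to the timing incentive, the slope difference isolates the part specific to December and January.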
For about 1% of births in our sample, gestation is not known. Column (2) of Table 7 shows results from estimating the regression while excluding observations for which a clinical estimate of gestation is used. These results are very similar in magnitude to the results using all measures of gestation, which assures us that mis-measurement of gestation from the clinical estimate is not biasing our results.

We may also be concerned that gestation may be more likely to be misreported or missing when the timing of the birth has been changed for non-medical reasons. Thus, we estimate the regression, first using an indicator variable that is equal to one if gestation information is missing from the vital statistics for that birth as the dependent variable, and second using an indicator variable equal to one if the gestation measure is a clinical estimate. The estimates, shown in Columns (3) and (4), are not statistically different from zero.

To complete our gestation analysis, we seek to determine at what gestational age the shifting is occurring. That is, are the infants more or less likely to be born at a certain gestational age as a result of the tax benefit? To do this, we estimate the following equation for each gestational age between 30 and 45 weeks:

\[ \mathrm{GestAgeInd}_i = \alpha + \beta_1\,\mathrm{TaxBenefit}_i \times \mathrm{DJ}_i + \beta_2\,\mathrm{TaxBenefit}_i + \beta_3\,\mathrm{DJ}_i + X_i + t + \varepsilon_i \qquad (4) \]

where GestAgeInd_i is equal to one if the infant is born at that specific gestational age. That is, if the infant is born at 37 weeks,
Table 7
The effect of tax benefits on gestation-related outcomes.
                    (1)                 (2)                 (3)                 (4)
                    Gestation           Gestation           Gestation Missing   Gestation Imputed
TaxBen × DJ Birth   −0.0330***          −0.0323***          0.000351            −0.00101
                    (0.00524)           (0.00534)           (0.000285)          (0.000719)
TaxBen (000s)       0.0422***           0.0467***           −0.000307           0.00749***
                    (0.00631)           (0.00618)           (0.000420)          (0.00182)
DJ Birth            0.0275***           0.0219***           0.000407*           0.00732***
                    (0.00480)           (0.00501)           (0.000237)          (0.00104)
Mean                38.983              39.004              0.008               0.125
N                   9,594,292           8,433,510           9,667,149           9,667,149

and income × birthorder controls. Column (1) shows the regression results from estimating Eq. (3) for all individuals and Column (2) displays the results from estimating the regression while excluding observations for which a clinical estimate of gestation is used. Column (3) shows the regression results when the dependent variable is an indicator variable that is equal to one if gestation information is missing from the vital statistics for that birth, and Column (4) shows the results for an indicator variable equal to one if the gestation measure is a clinical estimate. Standard errors are clustered by state-year.
* p < 0.10. ** p < 0.05. *** p < 0.01.
the indicator variable for gestational age of 37 weeks will be equal to one, while all other gestation age indicator variables will equal zero for that observation. Once again, β1 is the coefficient of interest as it tells us whether a larger tax benefit affects the likelihood of being born at a certain gestational age in the “treated” months, December and January, compared to October and November. By assessing the estimates for all regressions simultaneously, we can better understand which babies are being shifted to earlier births. To best show these findings, we graph the estimated coefficients from Eq. (4) along with their 90% confidence intervals in Fig. 3. The first figure is for all births, while the two subsequent figures are for non-induced c-sections and induced vaginal births, respectively. While many of the estimates are not statistically different from zero, a pattern emerges in the first two figures: babies may be more likely to be born before 38 weeks, but less likely to be born at 41 weeks or later. For induced vaginal births, the pattern isn’t as clear. However, babies born by induced vaginal birth are more likely to be born at exactly 37 weeks the larger the tax benefit. Babies are considered full-term beginning at 37 weeks of gestation; thus, it appears that physicians wait to cross this potentially important threshold before inducing the birth. It is worth noting that since the mean of each of the gestation age variables is quite different, the magnitudes of the plotted coefficients should not be interpreted as the magnitude of the effect at the average.

Fig. 3. Effect of tax benefit on share of December and January births at each week of gestation. Panels: (A) All Births; (B) Non-Induced C-Section; (C) Induced Vaginal. Vertical axis: coefficient with 90% CI; horizontal axis: gestation (in weeks), 30 to 45.

Since most of the changes in birth timing occur after 35 weeks of gestation, we replicate our main analysis isolating the sample to births that occurred at 35 weeks of gestation or later to verify that our results are driven by the higher gestational aged newborns. Table A.11 in Appendix A shows these results, which are very consistent with our main findings.

6. Supplemental analyses and robustness checks

6.1. Heterogeneity in responsiveness and consequences

We next investigate whether any heterogeneity exists in both the timing response to the tax incentive and in the health consequences of birth timing changes across subgroups. To assess differences in the timing of births, we estimate the following equation:

\[ \mathrm{DecBirth}_i = \alpha + \beta_1\,\mathrm{TaxBenefit}_i \times \mathrm{Group}_i + \beta_2\,\mathrm{TaxBenefit}_i + \beta_3\,\mathrm{Group}_i + X_i + t + \varepsilon_i \qquad (5) \]

As in our main estimates from Section 5.1, this analysis only includes December or January births, but seeks to determine whether certain groups are more responsive to the tax benefit. Group_i is an indicator variable equal to one if the birth belongs to one of the various groups we assess. β1 is the coefficient of interest, as it shows the differential effect of the tax benefit for the subgroup.

The results from these analyses are shown in Column (1) of Table 8. The first subgroup we consider is higher-order births. We find that parents having their second, third, or fourth child are less likely to alter the timing of the birth in order to receive the tax benefits. This may be because higher-order births have shorter gestation, on average, and are less likely to go to post-term (42 weeks or more) than first births. This gives parents having their subsequent child less of an opportunity to change the timing of the birth while still being able to deliver at “full-term.” The second subgroup we consider is those who had a repeat c-section. This group does not respond any differently than the average birth. While this may be surprising, considering our earlier finding that most of the shifting in the timing of births is driven by inevitable c-sections,
Table 8
Additional sub-groups.
Dec Birth Gestation Log BW Low BW Apgar Score Ventilation
Not 1st Birth −0.0038** −0.0970*** −0.0071*** 0.0046*** −0.0124*** 0.0018***
(0.0019) (0.0091) (0.0008) (0.0010) (0.0034) (0.0006)
Mean 0.505 38.926 8.101 0.067 8.96 0.026
Repeat C-Sec −0.0024 −0.0472*** −0.0041*** 0.0024 −0.0048 −0.0004

(0.0029) (0.0166) (0.0015) (0.0018) (0.0060) (0.0013)
Mean 0.517 38.695 8.114 0.063 8.968 0.034
Female 0.0006 0.0061 0.0004 0.0002 0.0013 −0.0003

(0.0009) (0.0079) (0.0006) (0.0008) (0.0030) (0.0005)
Mean 0.505 39.044 8.079 0.074 8.959 0.025
Weekday 0.0119** 0.0009 −0.0005 0.0004 −0.0008 −0.0003

(0.0051) (0.0089) (0.0008) (0.0010) (0.0036) (0.0006)
Mean 0.508 39.007 8.100 0.066 8.951 0.027
Longer Work Week 0.0025 −0.0105 0.0000 0.0001 0.0013 −0.0006

(0.0017) (0.0121) (0.0008) (0.0009) (0.0033) (0.0006)
Mean 0.507 38.955 8.096 0.068 8.947 0.027
Smoke or Drink −0.0011 −0.0148 0.0000 0.0011 −0.0005 0.0003

(0.0017) (0.0119) (0.0011) (0.0013) (0.0040) (0.0009)
Mean 0.503 38.921 8.037 0.109 8.936 0.033
Black 0.0038*** −0.0161 −0.0016* 0.0012 −0.0067* 0.0003

(0.0013) (0.0110) (0.0009) (0.0011) (0.0036) (0.0006)
Mean 0.503 38.465 8.024 0.119 8.869 0.028
Less than HS −0.0011 −0.0363*** −0.001 0.0000 −0.0023 0.0008

(0.0015) (0.0101) (0.0008) (0.0011) (0.0037) (0.0006)
Mean 0.505 38.99 8.079 0.076 8.945 0.022
Age 35+ 0.0005 −0.0157 −0.0013 0.0031 0.0021 −0.0002

(0.0021) (0.0166) (0.0015) (0.0020) (0.0059) (0.0009)
Mean 0.507 38.696 8.092 0.083 8.929 0.03
Standard errors in parentheses. Notes: Each coefficient comes from a separate regression. All regressions include state, birth order, year and demographic group fixed effects,
as well as demographic group linear time trends and income × birthorder controls. Standard errors are clustered by state-year. For rows 1–6 the 60 demographic groups are
the same as in the rest of the paper, and they are defined by age, race, marital status and education. For rows 7–9, the characteristic that is being examined is excluded from
the demographic group creation and is included as a separate fixed effect.
* p < 0.10. ** p < 0.05. *** p < 0.01.
since parents having their first child seem to be the most responsive to the benefits, we aren’t alarmed that parents having repeat c-sections are not responding additionally to the incentive.
Because of the widespread use of ultrasound, parents often know the gender of the child prior to birth. Thus, we can assess whether parents behave differently depending on the gender of their child. Consistent with previous literature on gender preference in the United States, we do not find that parents having female babies are any more likely to alter the timing of the birth (Lhila and Simon, 2008). Next, we determine whether timing changes as a result of the tax benefits are more likely to happen on weekdays rather than weekends. In this case, Group_i is equal to one if the birth occurs on a Monday–Friday and zero otherwise. Since doctors prefer to work on weekdays and a birth timing change requires a doctor’s compliance, we expect that the timing of weekday births would be most responsive to the tax benefit. We find that this is certainly the case. All of the effect of tax benefits on birth timing operates through weekday births. The estimate of 0.0119 equates to approximately a one percentage point increase in the probability of being born in December, if the birth occurs on a weekday. Finally, we assess whether the timing response is stronger in years in which the last week of December has more work days. Depending on the year, Christmas and New Year’s Day may fall on a weekend or a weekday. Thus, the last week of December may have either 4 or 5 work days. We find no differential effect on birth timing for those years.
We also look at the responsiveness of four subgroups who are more likely to have “high-risk” pregnancies (March of Dimes, 2012b): mothers who reported smoking or drinking while pregnant, Black mothers, mothers with less than high school education, and mothers aged 35 or older. We find that Black mothers are more responsive to the incentive to change the timing of their child’s birth, but the other groups show no statistically significant differences.
We next want to determine whether the consequences of altering the timing of births differ across these subgroups. To do so, we again use our comparison months, October and November, and estimate the following equation:
\[
\begin{aligned}
\text{Health}_i = {} & \alpha + \beta_1\,\text{TaxBenefit}_i \times \text{Group}_i \times DJ_i + \beta_2\,\text{TaxBenefit}_i \times DJ_i + \beta_3\,\text{Group}_i \times DJ_i \\
& + \beta_4\,\text{TaxBenefit}_i \times \text{Group}_i + \beta_5\,\text{TaxBenefit}_i + \beta_6\,\text{Group}_i + \beta_7\,DJ_i + X_i\gamma + \delta_t + \varepsilon_i
\end{aligned}
\tag{6}
\]
The coefficient on the triple interaction term, \beta_1, tells us whether the health of babies born to mothers in the subgroup during the treated months is especially responsive to the size of the tax benefit. We first look at gestation in Column (2) of Table 8. Estimates from this column tell us the magnitude of the timing shifts for these groups relative to the average. Both higher-order
births and repeat c-sections show larger shifts in birth timing. The only other subgroup that shows a statistically significant differential response in the magnitude of the timing shift is mothers with less than high school education. The health outcomes we assess are log birthweight (shown in Column (3)), low birthweight (Column (4)), 5-min Apgar score (Column (5)), and ventilation (Column (6)). We consider the health estimates indicative of whether certain groups are more adversely affected by birth timing changes (as a result of the tax incentive).
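To make the role of the triple interaction in Eq. (6) concrete, the following is a minimal sketch and not the estimation used in the paper. It assumes a hypothetical binary `high_benefit` indicator in place of the continuous tax benefit and omits the controls and fixed effects. Under those assumptions the model is saturated, and the triple-interaction coefficient equals a triple difference of cell means. The coefficient values are made up solely to generate illustrative data.

```python
from statistics import mean


def triple_difference(rows):
    """Triple difference of cell means.

    rows: iterable of (health, high_benefit, group, dj) tuples whose last
    three entries are 0/1 indicators. high_benefit is a hypothetical binary
    stand-in for the continuous tax benefit variable.
    """
    cells = {}
    for health, high_benefit, group, dj in rows:
        cells.setdefault((high_benefit, group, dj), []).append(health)
    m = {key: mean(values) for key, values in cells.items()}

    def benefit_gap_change(group):
        # High-vs-low benefit gap in Dec/Jan minus the same gap in Oct/Nov.
        gap_dj = m[(1, group, 1)] - m[(0, group, 1)]
        gap_on = m[(1, group, 0)] - m[(0, group, 0)]
        return gap_dj - gap_on

    return benefit_gap_change(1) - benefit_gap_change(0)


# Made-up coefficients (not estimates from the paper), one noise-free
# observation per cell, generated from the saturated version of Eq. (6).
ALPHA, B1, B2, B3, B4, B5, B6, B7 = 8.0, -0.5, -0.25, 0.125, 0.0, 0.5, -1.0, 0.25

toy_rows = [
    (ALPHA + B1 * t * g * d + B2 * t * d + B3 * g * d + B4 * t * g
     + B5 * t + B6 * g + B7 * d, t, g, d)
    for t in (0, 1) for g in (0, 1) for d in (0, 1)
]

estimate = triple_difference(toy_rows)  # recovers B1 on this noise-free data
print(estimate)
```

With a continuous benefit measure and additional controls, as in the text, the coefficient would instead come from a regression; the cell-mean version only illustrates what the triple interaction isolates.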
We begin by considering higher-order births. The health of higher-order births is more adversely affected by birth timing changes on all our measures. This is likely caused by the larger timing shifts they experience, even though they are less likely to be shifted than first births. We see a similar pattern for repeat c-sections. While higher-order births are no more likely to switch from January to December as a result of the tax benefit, the magnitude of the switch is larger. A $1000 increase in tax benefits decreases mean gestation by about 0.10 weeks. This is three times the magnitude for the general population. While some of the loss in health is occurring because the births are moved earlier, the decrease in gestation does not explain all of it, suggesting that higher-order births are either a particularly vulnerable group or that the consequences of timing changes differ across the gestation distribution. 29 The birthweight and Apgar scores of Black newborns are also more adversely affected by birth timing changes. While mothers with less than high school education are hastening their birth timing by a larger magnitude than others, the health of their newborns is not more adversely affected by these timing changes. The infant health of the other subgroups we assess is not more adversely affected by birth timing changes than that of the average shifted newborn.
This analysis provides us with information that may help inform optimal maternity care policies. We have found that a few groups are especially vulnerable to changes in birth timing – Blacks, higher-order births, and repeat c-sections. These three subgroups have lower mean gestation, suggesting that the adverse effects of birth timing changes are exacerbated the earlier the birth occurs. All these birth timing changes are presumably occurring under the supervision of a physician. Thus, timing changes even at the end of pregnancy can have adverse effects on the health of the infant.

6.2. Isolating the variation in the tax code

Identification in our main analysis from Eq. (1) relies on all sources of variation in the tax benefits – year, income, parity, marital status, and state. As shown in Figs. 1 and 2, there are several sources of variation in tax benefits that we can leverage in order to identify how one specific source of variation affects birth timing.
First, we isolate variation in benefits over time by looking at the effect of the introduction of the Child Tax Credit in 1998. As previously discussed, the Child Tax Credit was a $400 per child credit in tax liability. Everyone except those with the lowest and highest incomes was eligible to claim the benefit. Thus, we use a difference-in-differences approach to determine the effect of the increase in tax benefits from the Child Tax Credit. The treated group consists of those families that become eligible for the benefit in 1998, and the control group consists of families that do not benefit from its introduction. We compare outcomes in 1996, the year before the legislation was introduced, and 1998, the first year of eligibility. 30

Fig. 4. Falsification test: difference in births for all month pairs. Panels: (A) All Births; (B) C-Sections; (C) Inductions. Horizontal axis: each month compared to the following month.
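The difference-in-differences comparison described above can be sketched as a contrast of four group means. This is an illustration only: the function name, the data layout, and the toy outcomes are assumptions, and the actual regressions also include fixed effects and controls that are omitted here.

```python
from statistics import mean


def diff_in_diff(outcomes):
    """outcomes maps (treated, after) -> list of 0/1 December-birth indicators.

    treated = 1 for families in the credit-eligible income range,
    after = 1 for the post-introduction year.
    """
    change_treated = mean(outcomes[(1, 1)]) - mean(outcomes[(1, 0)])
    change_control = mean(outcomes[(0, 1)]) - mean(outcomes[(0, 0)])
    return change_treated - change_control


# Toy data, invented for illustration (not from the paper).
toy = {
    (1, 0): [1, 0, 0, 0],  # treated, before
    (1, 1): [1, 1, 0, 0],  # treated, after
    (0, 0): [1, 0, 0, 0],  # control, before
    (0, 1): [1, 0, 0, 0],  # control, after
}

effect = diff_in_diff(toy)
print(effect)
```

The control group's change nets out anything that shifted December births for everyone between the two years, so only the differential change for the newly eligible families remains.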
29 The health effects are more than 3 times as large as for the rest of the population.
30 A full, more precise description of the creation of the treatment and control groups is included in Appendix B.
Table 9
Isolating variation in the tax code.

(Panel A) Child tax credit
                    (1)        (2)        (3)        (4)        (5)        (6)        (7)
                    Total      C-Sec      Induct     Log BW     Low BW     Apgar5     Gestation
CTC × After         0.0050     0.0045     −0.0124    −0.0039    0.0024     −0.0314**  −0.0611
                    (0.0036)   (0.0070)   (0.0090)   (0.0040)   (0.0036)   (0.0144)   (0.0421)
CTC × After × DJ                                     −0.0027    0.0039     0.0197     −0.0131
                                                     (0.0039)   (0.0044)   (0.0143)   (0.0453)
Mean                0.253      0.256      0.257      8.101      0.064      8.952      39.009
N                   873,899    198,169    171,699    873,264    873,264    700,045    868,120

(Panel B) Earned income tax credit
                    (1)        (2)        (3)        (4)        (5)        (6)        (7)
                    Total      C-Sec      Induct     Log BW     Low BW     Apgar5     Gestation
EITC × After        0.0101     0.0037     0.0001     −0.0077    0.0071     −0.0420    −0.1723*
                    (0.0072)   (0.0146)   (0.0254)   (0.0077)   (0.0092)   (0.0265)   (0.0947)
EITC × After × DJ                                    0.0031     0.0014     0.0421     0.1410
                                                     (0.0071)   (0.0092)   (0.0282)   (0.0964)
Mean                0.255      0.258      0.252      8.035      0.111      8.909      38.613
N                   79,085     15,683     7,305      78,976     78,976     70,772     78,717

Standard errors in parentheses. Notes: Regressions in Panel A include state, birth order and demographic group fixed effects, as well as income × birth order controls. Regressions in Panel B include state and marital status fixed effects. Standard errors are clustered by state-year.
* p < 0.10. ** p < 0.05. *** p < 0.01.
Panel A of Table 9 displays the results of this analysis. Row 1 displays the coefficients on the interaction of CTC, an indicator variable equal to one for families in the eligible income range, and after, an indicator variable equal to 1 in 1998. The estimation in Column (1) examines whether a December birth is more likely for those who are newly eligible for the Child Tax Credit. While the coefficient is positive, it is not statistically significant. The same is true in Column (2), when the sample is restricted to only c-sections. Column (3) shows a statistically insignificant decrease in December inductions. Row 2 in Columns (4)–(7) shows the coefficient of interest for the health outcomes, with an additional term in the interaction, DJ, which is equal to one for births in December and January and zero for births in October and November. With the exception of a positive coefficient for the 5-min Apgar score, the coefficients show the same pattern as the main results, but are not statistically different from zero.
Next, we leverage differences in benefits across birth order. As shown in Figs. 1 and 2, the 1993 EITC reform increased benefits for the second child for those with low incomes, but had no effect on the benefit (or lack thereof) for third and fourth children. Thus, we perform a differences-in-differences analysis isolating our sample to 1992 (the pre period) and 1994 (the post period) and to families with incomes that make them eligible for the EITC benefits for their second child. Mothers having their second child are the treated group and mothers having their third or fourth child are the control group. 31 In Columns (1)–(3) of Panel B of Table 9, we see a statistically insignificant increase in December births for all three categories, and in Columns (4)–(7) we see a statistically insignificant increase in all health measures. The standard errors are quite large for all of the estimates. Isolating only the across-birth-order variation or the over-time variation in benefits (as in the Child Tax Credit regressions) limits our statistical power greatly and justifies our need to use all sources of variation in tax benefits in our main identification strategy.

6.3. Falsification tests

To verify that the difference in births between December and January is not just spuriously correlated with tax benefits, we estimate Eq. (1) for every other month pair. For example, we look at whether the likelihood of being born in February rather than March is correlated with the child tax benefit they will receive. We assign each observation the benefit they will receive for having their child in that year. This is the same benefit that is assigned to the December/January observations from that year, except, unlike the December/January observations, these families are not on the margin of receiving the benefit. 32 Panel (A) of Fig. 4 plots the estimated effect of the tax benefit for each month pair as well as their 90% confidence interval. For reference, we include the December/January estimate – the rightmost point – in the graph as well. Two of the estimates are statistically different from zero, but the point estimates are smaller in magnitude than the December/January estimate. We next estimate whether the likelihood of having a c-section and induction in the earlier month of each of the month pairs is correlated with the tax benefit (this is our “shifting” analysis). The estimates from these regressions are plotted in Panels (B) and (C) of Fig. 4. For c-sections, only one of the month-pairs, September and October, has a statistically significant estimate. However, it is again smaller in magnitude than the December/January estimate. For inductions, none of the estimates for the other month pairs is statistically different from zero.
We next estimate the health regressions for the other month pairs. As in our main analysis, we use the two months prior to each month pair as their comparison group. In order to exclude December and January from any of the comparison groups, April/May is the first month pair of the year we can assess. The plotted estimates for each of the health outcomes are shown in Fig. 5. The estimates for birthweight are statistically significant for a number of the month-pairs, but in most months are less
31 A full, more precise description of the creation of the treatment and control groups is included in Appendix B.
32 We do not include January/February or November/December, as these comparisons would be contaminated by the shifting into (out of) December (January).
Fig. 5. Falsification test: difference in health for all month pairs. Panels: (A) Birthweight; (B) Low Birthweight; (C) Apgar Score; (D) Gestation. Horizontal axis: month pairs (A/M through D/J), each compared to the previous 2 months.
precisely estimated than the December/January estimate. The estimates for low birthweight are not statistically significant, except for May/June, which is inconsistent with the finding for log birthweight showing that the tax benefit is correlated with higher birthweight for May/June babies. Additionally, Apgar score is not statistically significantly correlated with the benefits for any of the month pairs except our actual estimate for December/January. Panel (D) shows the estimates when considering gestation as the outcome. For these regressions, nearly all of the estimates are statistically significant, but not correctly signed to support the estimates in Fig. 4 Panel (A). However, these findings further contribute to our doubt about drawing any conclusions based on the gestation results. While the falsification tests are not all statistically insignificant, the inconsistency of the estimates makes us confident that our main findings are not driven by spurious correlation, but reflect the true relationship between tax benefits and birth timing changes and its health consequences.

7. Discussion

There have been a number of studies on the effects of birthweight on both short- and long-term outcomes. To better understand the full costs of birth timing changes, we assess our estimated effects of the tax benefits on birthweight in conjunction with the estimates from other studies. There are a couple of caveats to these calculations. First, we are assessing the average effect of the tax incentive on these later-life outcomes. While we know that approximately 1 of every 155 January births is moved to December for a $1000 increase in tax benefits, we believe that there are a number of December and January births that are hastened as a result of the tax benefit but do not result in being born in a different month. Our estimates on health outcomes include the effects on all December and January births and not just those whose birth month is changed because of the incentive. Second, since the relationship between birthweight and other life outcomes is subject to a number of omitted variables, the most reliable estimates come from within-twin comparisons. While these studies eliminate much of the bias, they consider a very different sample than we do. Twins have a lower gestational age and are smaller, on average, than the births we believe are impacted by elective changes in birth time. While we do see an increase in low birthweight babies as a result of the tax incentive, on average, there is positive selection into which babies’ births are shifted as a result of the tax benefits. Hence, we are estimating the effects on a population of newborns who may be larger and healthier than those from the twins-based studies.
One way of assessing this potential difference in the vulnerability of the births in our samples is by comparing the estimated effect of birth timing on the 5-min Apgar score, a common outcome measure among our studies. We estimate that a 1 g decrease in birthweight corresponds to a 0.0007 point decrease in the Apgar score. For the twins samples in both the Almond et al. (2005) and Black et al. (2007) papers, a 1 g decrease in birthweight corresponds
to approximately a 0.015 decrease in Apgar score. This discrepancy in the magnitudes of our estimates suggests that Apgar score is not as strongly affected by birthweight changes for our sample of likely larger and healthier newborns. Our sub-analysis of higher-order births in Table 8 supports this assertion. Higher-order births have a lower birthweight than first births and their health is more adversely affected by birth timing changes than the average shifted birth. 33 This brings to light the fact that the relationship between birthweight and Apgar score may differ greatly across the birthweight distribution.
Keeping these important caveats in mind, we continue by assessing the relationship between birthweight and hospital costs. Using twin fixed-effects, Almond et al. (2005) estimate that a 1 g increase in birthweight decreases hospital costs by $4.93. Thus, the 95% confidence interval of our estimates on log birthweight from Table 5 would suggest that a $1000 increase in the tax benefit would increase hospital costs between $11.59 and $31.40 per child, on average, for every child born in December or January. With approximately 480,000 babies born each December and January, a $1000 increase in tax benefits per family would roughly lead to a $10 million increase in overall hospital costs each year.
Also using a population of twins, Figlio et al. (2013) use administrative data from the state of Florida linking Vital Statistics records to standardized test scores to estimate the effect of birthweight on academic outcomes. They estimate that a 10% increase in birthweight corresponds to a 0.045 standard deviation increase in reading and math test scores in third through eighth grade. Using this estimate, we would conclude that a $1000 increase in tax benefits leads to a 0.006 standard deviation decrease in standardized test scores, on average, for children born in December or January. Black et al. (2007) estimate the relationship between birthweight and longer-term outcomes using a sample of twins and rich administrative data from Norway. Consistent with other studies on the consequences of birthweight, they find that relative to their counterparts, low birthweight infants tend to have lower IQ, lower educational attainment, poorer self-reported health status, lower earnings as adults, and lower birthweight children (Behrman and Rosenzweig, 2004; Currie and Hyson, 1999; Black et al., 2007). Interpreting our estimates with those from Black et al. (2007), we find that a $1000 increase in tax benefits will decrease the average height of those born in December/January by 0.041 cm, will increase their Body Mass Index 34 by 0.0015, will decrease their IQ by 0.0008 stanines, and will decrease their earnings by 0.02%.
In interpreting the magnitude of these effects, there are two points we encourage the reader to keep in mind. First, work by Almond and Mazumder (2013) suggests that parental investments in their children reinforce initial endowment differences. While many of the twin fixed-effect analyses differ from their OLS counterparts in initial health outcomes, they are actually quite similar when assessing longer-term outcomes. If this is the case, many of the longer-term effects of birthweight that have been estimated are biased by differential parental investment and overestimate the causal relationship between birthweight and later-life well-being. Second, we estimate the average effects for the entire population of December/January babies subject to the tax benefit. While we expect that the birth timing of more than just the few births we estimated to be shifted from January to December is impacted by the benefit increase, these effects are driven by a relatively small number of births. That is, while the average effects may be quite small, the effects for the treated are likely quite sizable.

8. Conclusion

Altering the timing of births for convenience has become an increasingly common practice in the United States and around the world. The consequences of these actions were previously unknown, however. The U.S. income tax system is structured in such a way that it gives women who are due to deliver in the beginning of January a financial incentive to give birth in December instead. This gives us a unique natural experiment with which to estimate how elective birth timing changes affect the wellbeing of newborns.
We verify that expectant mothers are in fact manipulating the timing of their births in order to gain additional tax benefits, and that this increase in December births is mostly driven by women who shift their January c-sections and inductions to December. Roughly 1 of every 75 January c-sections is moved to December for every $1000 in additional tax benefits, while 1 of every 72 January inductions is moved to December for the same increase. These results support previous findings that parents respond to financial incentives when planning the birth of their child, although the magnitudes of our estimated effects are much smaller. There is no evidence that parents switch their method of delivery in order to guarantee a December birth and receipt of the benefits. These findings have allowed us to estimate the health effects of changing the timing of birth, holding fixed the method of delivery.
Our health results indicate that this type of medical intervention has consequences for infant health: an increase in tax benefits, and therefore an increased probability of altering the timing of birth, is associated with lower birthweight for the infant and a higher probability that the infant will be low birthweight. Apgar scores are also negatively affected by changes in the timing of births. The health effects are mostly driven by non-induced c-section births. Our gestation analysis suggests that the infants being shifted are those who would have otherwise been born at 40 weeks of gestation or later, but are born at 37–39 weeks of gestation instead. While still considered full-term, these infants experience negative consequences of small changes in the timing of their births. As we compare our findings with those from previous studies on the short-term health effects of birthweight, we find that the relationship between birthweight and the 5-min Apgar score is less pronounced for the presumably larger and heavier infants whose birth timing is changed in order to receive the tax benefit. Findings from our subgroup analysis suggest that the adverse effects of birth timing changes are exacerbated the earlier the birth occurs.
Over time, both patients and doctors have experienced increasing discretion with respect to when a baby is born. Because even small reductions in gestation length can have negative, long-term effects on the newborn, the cost of timing births for convenience or financial gain may be far greater than the benefit. Since the consequences of birth timing manipulation were previously unknown, doctors and parents may have felt comfortable making adjustments to the delivery date, especially when done at full-term and when there were personal gains to doing so. Our findings shed light on the fact that these changes are not without consequence, and there should be increased scrutiny within the medical community regarding elective birth timing prior to the baby’s natural arrival. As the prevalence of c-sections and inductions continues to rise, an important next step in this line of research will be to determine whether changing the method of delivery, especially for non-medical reasons, has any effect on infant health.

33 The mean birthweight is 3351 g for first births and 2989 g for second–fourth births.
34 Body mass index is calculated by dividing weight in kilograms by height (in meters) squared.
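The hospital-cost figures quoted in the Discussion can be retraced with simple arithmetic. The sketch below uses only the numbers stated in the text. Reading the "roughly $10 million" total as the midpoint of the two bounds is an assumption made here for illustration, not something the text states.

```python
# Figures quoted in the Discussion (Section 7).
COST_PER_GRAM = 4.93                 # USD saved per 1 g of birthweight (Almond et al., 2005)
COST_LOW, COST_HIGH = 11.59, 31.40   # USD per child for a $1000 benefit increase
BIRTHS = 480_000                     # approximate December plus January births

# Birthweight change implied by each per-child cost bound.
grams_low = COST_LOW / COST_PER_GRAM
grams_high = COST_HIGH / COST_PER_GRAM

# Aggregate annual cost at each bound.
total_low = COST_LOW * BIRTHS
total_high = COST_HIGH * BIRTHS

# Assumption: take the midpoint of the bounds as the headline figure.
total_mid = (total_low + total_high) / 2

print(f"implied birthweight loss: {grams_low:.2f} to {grams_high:.2f} g")
print(f"aggregate cost: ${total_low:,.0f} to ${total_high:,.0f} (midpoint ${total_mid:,.0f})")
```

Under these inputs the per-child bounds correspond to a birthweight reduction of a few grams, and the aggregate bounds bracket the $10 million figure given in the text.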
Table A.10
Health results: December vs. January.
(1) (2) (3) (4) (5) (6) (7) (8)
Log BW Low BW (<2500 g) Apgar1 Apgar5 Apgar5 ≥7 Any vent. Vent. <30 m Vent. >30 m
TaxBen × Dec Birth 0.0005 0.0003 0.0031 0.0012 −0.0002 0.0004 −0.0000 0.0004
(0.0005) (0.0006) (0.0065) (0.0021) (0.0003) (0.0004) (0.0003) (0.0002)
Dec Birth −0.0005 −0.0005 0.0084* 0.0020 0.0003 −0.0006* −0.0001 −0.0005***
(0.0005) (0.0005) (0.0043) (0.0019) (0.0003) (0.0003) (0.0003) (0.0002)
TaxBen (000s) 0.0010 −0.0008 −0.0052 0.0003 −0.0003 −0.0004 −0.0003 −0.0001
(0.0008) (0.0008) (0.0085) (0.0027) (0.0004) (0.0005) (0.0005) (0.0003)
Mean 8.096 0.068 8.078 8.948 0.987 0.027 0.019 0.008

N 4,812,365 4,812,365 1,531,961 3,740,278 3,740,278 4,525,480 4,525,480 4,525,480
* p < 0.10. ** p < 0.05. *** p < 0.01.
Table A.11
Shifting and switching births (35+ weeks gestation).
(Panel A) Shifting
(1) (2) (3) (4) (5) (6) (7) (8)

All C-section Vaginal Induction C-section C-section Vaginal Vaginal
Non-induced Induced Non-induced Induced
TaxBen (000s) 0.0034** 0.0073*** 0.0025 0.0071** 0.0061** 0.0085 0.0020 0.0055*
(0.0014) (0.0025) (0.0015) (0.0028) (0.0027) (0.0060) (0.0016) (0.0030)
Mean 0.505 0.511 0.503 0.512 0.511 0.508 0.501 0.513

N 4,570,839 1,091,779 3,479,060 767,355 909,745 145,723 2,841,429 621,632
(Panel B) Switching
(1) (2) (3) (4) (5) (6) (7)

C-section Vaginal Induction C-section C-section Vaginal Vaginal
Non-induced Induced Non-induced Induced
TaxBen × DJ Birth −0.0008 0.0007 0.0006 −0.0013 0.0001 0.0003 0.0004

(0.0013) (0.0007) (0.0007) (0.0008) (0.0003) (0.0008) (0.0006)
TaxBen (000s) 0.0024 −0.0018 −0.0034*** 0.0054*** −0.0034*** −0.0018 0.0000
(0.0015) (0.0013) (0.0012) (0.0013) (0.0006) (0.0017) (0.0010)
DJ Birth 0.0008 −0.0007 0.0014** 0.0016* −0.0003 −0.0024*** 0.0017***
(0.0017) (0.0007) (0.0007) (0.0010) (0.0003) (0.0008) (0.0006)
Mean 0.239 0.770 0.169 0.201 0.032 0.633 0.137

N 9,187,161 9,080,800 9,080,637 9,080,637 9,080,637 9,036,988 9,036,988
* p < 0.10. ** p < 0.05. *** p < 0.01.
Acknowledgements

We thank Marianne Page, Hilary Hoynes, Scott Carrell, Amitabh Chandra, Doug Miller, Nyree Abrahamian, and seminar participants at UC Davis, Santa Clara University, the Western Economic Association International Annual Meetings, the Society of Labor Economists Annual Meetings, the All-California Labor Conference, Union College, and Acumen for their helpful comments and suggestions.

Appendix A. Additional tables

See Tables A.10–A.12.

Table A.12
Health outcomes joint test.
# Restrictions    # Reps    % Reps
0                 50        25.0
1                 82        41.0
2                 50        25.0
3                 12        6.0
4                 4         2.0
5                 1         0.5
6                 1         0.5
7                 0         0.0
This table gives the number and percent of bootstrap replications where 0–7 restrictions held. When <7 restrictions hold (as described in footnote 28), we can reject the null hypothesis that TaxBen × DJ has a positive effect on all health outcomes. When 0 restrictions hold, we can reject the null hypothesis that TaxBen × DJ has a positive effect on any health outcome.

Appendix B. Construction of groups for child tax credit and EITC analysis

The treatment and control groups for the CTC regressions are created according to the income ranges displayed in Table B.13. In each case, the treated group consists of those families that are eligible for the full CTC when having a child at the parity listed in the first column of that row. Those who are not eligible are included in the control group. Those who fall in either the phase in or phase
out of the CTC are not included in either group. Similarly, those who are eligible for the EITC are also excluded.
The treatment and control groups for the EITC regressions are created according to the income ranges displayed in Table B.14. The treatment group consists of those families that are newly eligible for the EITC for their second child after the 1993 reform. The control group consists of those families who give birth to their 3rd or 4th child, and are not eligible for additional EITC benefits even after the reform.

Table B.13
Child tax credit.
          Single                  Married
Parity    Treat     Control       Treat     Control
1st       –a        –a            28–116    >125
2nd       31–88     >109          32–122    >133
3rd       19–96     <20, >111     26–125    <18, >141
4th       21–105    <24, >114     25–128    <21, >150
All values are in thousands of dollars, indexed to the year 2000.
a Single mothers having their first child are not included because there is no clear range where the child tax credit matters but the EITC does not.

Table B.14
Earned income tax credit.
          Single                  Married
Parity    Treat     Control       Treat     Control
1st       –         –             –         –
2nd       <12       –             <16       –
3rd       –         <12           –         <16
4th       –         <12           –         <16

References

Ai, C., Norton, E.C., 2003. Interaction terms in logit and probit models. Economics Letters 80 (1), 123–129.
Alm, J., Whittington, L., 1995. Does the income tax affect marital decisions? National Tax Journal 48 (4), 565–572.
Almond, D., Chay, K.Y., Lee, D.S., 2005. The costs of low birth weight. Quarterly Journal of Economics.
Almond, D., Mazumder, B., 2013. Fetal origins and parental responses. Annual Review of Economics 5 (1), 37–56.
Behrman, J.R., Rosenzweig, M.R., 2004. Returns to birthweight. Review of Economics and Statistics 86 (2), 586–601, http://ideas.repec.org/a/tpr/restat/v86y2004i2p586-601.html.
Black, S.E., Devereux, P.J., Salvanes, K.G., 2007. From the cradle to the labor market? The effect of birth weight on adult outcomes. Quarterly Journal of Economics 122 (1), 409–439.
Block, J., 2008. Pushed: The Painful Truth About Childbirth and Modern Maternity Care. Da Capo Press.
Bound, J., Jaeger, D.A., 1996. On the validity of season of birth as an instrument in wage equations: a comment on Angrist & Krueger’s ‘does compulsory school attendance affect schooling and earnings?’. In: NBER Working Papers 5835. National Bureau of Economic Research, Inc.
Currie, J., Hyson, R., 1999. Is the impact of health shocks cushioned by socioeconomic status? The case of low birth weight. In: AEA Papers and Proceedings.
Dickert-Conlin, S., Chandra, A., 1999. Taxes and the timing of births. Journal of Political Economy 107 (1).
Doblhammer, G., Vaupel, J.W., 2001. Lifespan depends on month of birth. Proceedings of the National Academy of Sciences of the United States of America 98 (5), 2934–2939.
Figlio, D.N., Guryan, J., Karbownik, K., Roth, J., 2013. The effects of poor neonatal health on children’s cognitive development. In: NBER Working Papers 18846. National Bureau of Economic Research, Inc.
Gans, J.S., Leigh, A., 2006. Did the Death of Australian Inheritance Taxes Affect Deaths? Topics in Economic Analysis & Policy 6 (1) (Article 23).
Gans, J.S., Leigh, A., 2008. What Explains the Fall in Weekend Births?
Gans, J.S., Leigh, A., 2009. Born on the first of July: an (un)natural experiment in birth timing. Journal of Public Economics 93, 246–263.
Gans, J.S., Leigh, A., 2012. Bargaining over labor: do patients have any power? The Economic Record, The Economic Society of Australia 88 (281).
Gans, J.S., Leigh, A., Varganova, E., 2007. Minding the shop: the case of obstetrics conferences. Social Science & Medicine 65, 1458–1465.
Gibbons, L., Belizan, J.M., Lauer, J.A., Betran, A.P., Merialdi, M., Althabe, F., 2012. Inequities in the use of cesarean section deliveries in the world. American Journal of Obstetrics and Gynecology 206 (4), 331.e1–331.e19.
Gruber, J., 2005. Tax policy for health insurance. In: Tax Policy and the Economy, vol. 19. National Bureau of Economic Research, Inc.
Kopczuk, W., Slemrod, J., 2003. Dying to save taxes: evidence from estate-tax returns on the death elasticity. Review of Economics and Statistics 85 (2), 256–265.
LaLumia, S., Sallee, J.M., Turner, N., 2013. New Evidence on Taxes and the Timing of Birth.
Landro, L., 2011. A Push for More Pregnancies to Last 39 Weeks.
Langrew, D., Morgan, M., 1996. Decreasing the cesarean section rate in a private hospital: success without mandated clinical changes. American Journal of Obstetrics and Gynecology 174 (1), 184–191.
Lhila, A., Simon, K.I., 2008. Prenatal health investment decisions: does the child’s sex matter? Demography 45 (4), 885–905.
Lo, J.C., 2003. Patients’ attitudes vs. physicians’ determination: implications for cesarean sections. Social Science & Medicine 57, 91–96.
March of Dimes, 2012a. Inducing Labor, www.marchofdimes.com/pregnancy/inducing-labor.aspx.
March of Dimes, 2012b. Low Birthweight, www.marchofdimes.com/baby/low-birthweight.aspx.
March of Dimes, 2012c. Why at Least 39 Weeks is Best for Your Baby, http://www.marchofdimes.com/pregnancy/why-at-least-39-weeks-is-best-for-your-baby.aspx.
Meyer, B.D., Rosenbaum, D.T., 1999. Welfare, the earned income tax credit, and the labor supply of single mothers. In: NBER Working Papers 7363. National Bureau of Economic Research, Inc.
Neugart, M., Ohlsson, H., 2010. Economic incentives and the timing of births: evidence from the German parental benefit reform. In: Working Paper Series. Center for Fiscal Studies, Uppsala University Department of Economics.
Nita, A., Landon, M., Spong, C., Mercer, B., Caritis, S., Sorokin, Y., 2009. Timing of elective repeat cesarean delivery at term and neonatal outcomes. New England Journal of Medicine 360 (2).
Perkes, C., 2010. O.C. Doctors Push to Lower C-section Rates.
Shearer, E.L., 1993. Cesarean section: medical benefits and costs. Social Science & Medicine 37 (10), 1223–1231.
Sjoquist, D.L., Walker, M.B., 1995. The marriage tax and the rate and timing of marriage. National Tax Journal 48 (4), 547–558.
Stupp, P.W., Warren, C.W., 1994. Seasonal differences in pregnancy outcomes: United States, 1971–1989. Annals of the New York Academy of Sciences 709 (1), 46–54.
The American College of Obstetricians and Gynecologists, 2010. How Your Baby Grows During Pregnancy (technical report).

Measuring the effects of reducing subsidies for private insurance on public expenditure for health care☆
Terence Chai Cheng ∗
Melbourne Institute of Applied Economic and Social Research, The University of Melbourne, Melbourne, Australia
Article history:
Received 30 April 2012
Received in revised form 14 November 2013

Abstract
This paper investigates the effects of reducing subsidies for private health insurance on public sector expenditure for hospital care. An econometric framework using simultaneous equation models is developed to analyse the interrelated decisions on the intensity and type of health care use and private insurance. The framework is applied to the context of the mixed public–private system in Australia. The simulation projections show that reducing premium subsidies is expected to generate net cost savings. This arises because the cost savings achieved from reducing subsidies are larger than the potential increase in public expenditure on hospital care.

JEL classification:
H42
C31
C15
Keywords:
Private health insurance
Subsidies
Public and private finance
Simultaneous equation models
Count data
☆ I am grateful for the comments from Deborah Cobb-Clark, John Deeble, Guyonne Kalb, Elizabeth Savage, and participants at the 3rd Australasian Workshop on Econometrics and Health Economics, the 8th World Congress on Health Economics, and seminars at the University of Melbourne and Monash University. The suggestions and comments from two anonymous referees have improved the paper considerably. This paper uses unit record data from the in-confidence version of the Household, Income and Labour Dynamics in Australia (HILDA) survey. The HILDA project was initiated and is funded by the Australian Government Department of Families, Housing, Community Services and Indigenous Affairs (FaHCSIA) and is managed by the Melbourne Institute of Applied Economic and Social Research (Melbourne Institute). I gratefully acknowledge financial support from the Australian Research Council Discovery Project Grant (ID: DP0880429) and through grants provided by the Faculty of Business and Economics at The University of Melbourne. The findings and views reported in this paper are mine and should not be attributed to either FaHCSIA, Melbourne Institute or the funding organisations.
∗ Tel.: +61 3 83442124; fax: +61 3 83442111.
E-mail addresses: techeng@unimelb.edu.au, cheng.terence@gmail.com

1. Introduction

In many modern economies, the public sector plays an important role in the financing of health care. Nearly all OECD countries have universal health systems, where health care is funded either through taxation, or through publicly-sponsored or subsidised health insurance. Even in the market-oriented health system of the United States, health insurance is directly subsidised for individuals and families with low incomes and the elderly, while employment-based private health insurance is indirectly subsidised through the tax system. The extent of public involvement in health care financing in the United States is expected to rise with the implementation of the Patient Protection and Affordable Care Act, which aims to ensure that all individuals have health insurance through a combination of mandates, premium regulations and subsidies.

While there are generally strong justifications for the provision of subsidies for health care and health insurance, the arguments for subsidising duplicate private health insurance (PHI) in countries with universal health systems are less compelling. In these countries (e.g. Australia, Spain, United Kingdom), a private health care market coexists alongside the public sector, providing health services already covered under the public system. Public subsidies for private insurance, either in the form of tax incentives or monetary rebates on premiums, have been a source of policy contention (Colombo and Tapay, 2004). It is often argued that incentives for PHI can stimulate the private health care market, relieving both capacity and cost pressures on the public system, and improving access to and quality of public sector care. However,
160 T.C. Cheng / Journal of Health Economics 33 (2014) 159–179
questions have been raised as to whether an expanding private sector diverts valuable resources away from the public sector. In addition, issues of equity arise as privately insured individuals, who usually have higher incomes, can bypass the public sector queues and obtain faster access to care.

An important question in evaluating the effectiveness of subsidies for duplicate PHI is whether subsidies are self-financing, that is, whether their introduction would lead to cost savings within the public health care system that exceed the cost of the subsidy program. The converse question, one that is pursued in this paper, is whether public savings achieved from reducing subsidies can more than offset the potential increase in public expenditure. Understanding the costs and benefits of subsidy programs for private insurance is important as it informs the effectiveness of policy instruments available to governments seeking to influence the public and private composition of health expenditures. This is especially relevant as policy makers look towards sources of private finance to pay for the health care demands of their populace in the face of rapidly growing public spending.

A number of studies have examined the self-financing nature of subsidies for private insurance. Emmerson et al. (2001) and Frech and Hopkins (2004) investigate this issue through an ex-post policy evaluation for the United Kingdom and Australia respectively. The studies conclude that the cost of subsidising PHI exceeds the fiscal benefits to the public sector. López Nicolás and Vera-Hernández (2008), through an ex-ante policy simulation, arrive at a similar conclusion when simulating the effects of abolishing tax subsidies for private insurance in Spain. In addition to the significant cost involved in subsidising PHI, Colombo and Tapay (2004) highlight two other reasons why duplicate PHI is expected to have little cost-shifting effect (pp. 193 and 194). Firstly, relative to the public sector, the private health care sector usually focusses on elective treatments for patients with less complex and severe medical conditions. Secondly, privately insured individuals can continue to utilise the public sector. Following this, a question that is central to the debate is whether privately insured individuals ‘opt out’ of the public system by substituting private for public health care, or ‘top up’ and enlarge their use of health care without reducing their reliance on the public system (Fabbri and Monfardini, 2011).

This paper contributes to the literature investigating the self-financing nature of PHI subsidies. It develops a microeconometric framework to analyse the interrelated decisions on the intensity and type of health care use and PHI. The framework builds on the work by López Nicolás and Vera-Hernández (2008), who construct a discrete choice model to analyse the (binary) decisions surrounding public and private health care use and private insurance. The framework proposed here comprises three simultaneous equation models which accommodate the count data nature of the health care utilisation measures (number of hospital admissions, length of overnight stay) and the binary nature of the measures of the type (public vs. private) of hospital care and PHI.

The econometric framework is applied to the context of the mixed public and private health system in Australia. Australia is an interesting case study for examining the self-financing hypothesis. In the late 1990s a series of policy measures was introduced in the PHI market within a short time frame to encourage the purchase of private insurance. These measures include a tax penalty on high income individuals without private health cover, a generous 30 percent rebate on premiums, and the introduction of entry-age adjusted premiums. To evaluate the self-financing nature of PHI, the estimates from the econometric models are used in an ex-ante simulation analysis in which the premium rebates are scaled back.

This study combines two themes within the literature on the economics of health care and health insurance. The first concerns the effects of tax subsidies on the demand for health insurance, for which there has been considerable research in the United States (e.g. Gruber and Poterba, 1994; Gruber and Washington, 2005) and Canada (e.g. Stabile, 2001; Finkelstein, 2002). These studies, like most program evaluation research, focus on the ex-post evaluation of a policy or program. They exploit variation in the insurance price that arises from changes in the tax treatment of premiums and benefits affecting only a part of the population (e.g. defined by occupation types or geography), and examine how demand has changed relative to a subpopulation or ‘control group’ not affected by the policy change. While this treatment-control approach has its merits, it is not always feasible. For many important applications, it is often the case that the policy of interest affects the entire population, or that a comparable ‘control group’ is not available. In some instances, a series of policies could have been introduced either concurrently or in close succession, as in the case of the PHI market in Australia, for which isolating the effects of a particular policy is difficult. Another limitation of the ex-post analytical techniques is that they do not allow the evaluation of the impacts of policies prior to their implementation. It is often important for governments to be able to assess the expected impacts, and costs, arising from a range of hypothetical policy options, hence facilitating the optimal design of policies to achieve the outcomes desired (Todd and Wolpin, 2006).

The second theme concerns the determinants of the demand for PHI, and the relationship between private insurance and health care use in the context of a National Health Service. On the former, the literature emphasises the role of public sector waiting times and quality, household income, and educational attainment as important determinants of the decision to purchase PHI (see Barros and Siciliani (2012) for a comprehensive review). There is evidence from a variety of countries that individuals with private insurance consume more public and private health care.¹ In these studies, a key methodological issue that has to be addressed is that health insurance status is potentially endogenous to health care use, which arises as a result of the interdependency in the decisions to insure, and to consume health care (Cameron et al., 1988).

¹ This is the subject of substantial research in a number of countries such as Australia (Savage and Wright, 2003); Germany (Riphahn et al., 2003); Ireland (Harmon and Nolan, 2001); and Spain (Vera-Hernandez, 1999). Jones et al. (2006) focus on the use of specialist services in four European countries.

The endogeneity problem is addressed using three simultaneous equation models that accommodate the mixed count-binary nature of the utilisation, type and insurance outcome variables. Traditional workhorse models for non-negative and integer-valued (count) outcomes such as the Poisson and Negative Binomial models have been extended to more advanced models with a variety of applications such as the multivariate count data models (e.g. Munkin and Trivedi, 1999; Fabbri and Monfardini, 2009; Hellström, 2006), and count data models as a system of simultaneous equations (Deb and Trivedi, 2006; Atella and Deb, 2008; Cheng and Vahid, 2011). A novel econometric model developed in this paper is a bivariate lognormal Poisson model with a common endogenous binary regressor, used to jointly analyse two hospital care utilisation measures (number of day and overnight hospital admissions) and the decision to purchase PHI. This novel model contributes to the growing literature on multivariate simultaneous equation count data models.

The econometric results show that individuals with private health insurance are more likely to seek hospital care as a private patient compared to those without private insurance. However, resource usage in terms of hospital admissions and length of overnight stay does not differ between the insured and uninsured groups. The simulation projections show that reducing premium
subsidies is expected to generate net cost savings. This is because the cost of treating patients who drop private cover and rely on the public system is substantially lower than the cost of subsidising private insurance for the whole population. The savings are mainly driven by the inelastic demand for private insurance, as only a small fraction of individuals are likely to respond to higher prices for insurance by dropping private health coverage.

The remainder of the paper is organised as follows. Section 2 describes the institutional context in Australia. Section 3 presents the econometric framework and discusses the estimation and identification strategies. Section 4 describes the data used in the empirical analysis. The results from the econometric analysis are discussed in Section 5 and those of the simulation analysis are discussed in Section 6. Section 7 assesses the sensitivity of the results. Finally, Section 8 concludes with a discussion of the key findings in the paper.

2. The institutional context in Australia

Health care in Australia is funded through a combination of public and private sources of financing. Medicare, the universal tax-funded public health insurance scheme, subsidises medical services and technologies according to a schedule of fees referred to as the Medicare Benefit Schedule. Primary care is predominantly provided by general practitioners in private practice who are usually paid fee-for-service, and act as gatekeepers for specialist care as a referral is required to access the Medicare subsidy. Medicare also provides free access to public hospitals. Patients can choose hospital care as public (Medicare) patients in public hospitals and receive free treatment from doctors nominated by hospitals, as well as free (shared) accommodation and meals. Private care can be obtained from either private or public hospitals. Individuals who choose private care are entitled to their choice of treating doctor, better amenities such as private rooms, and quicker access to treatment by avoiding public hospital waiting lists. Private medical specialists are free to charge patients what the market will bear, with a fixed subsidy resulting in a patient copayment. The difference is met either as out-of-pocket expenditure, or covered by insurance funds if individuals have PHI. Unlike private specialist fees, private hospital charges (e.g. room and theatre charges, medical expendables) do not attract any Medicare subsidy. Having PHI does not preclude the use of hospital care as a public patient, and individuals may do so if they perceive that there is no advantage in paying the copayments to be a private patient.

The PHI industry in Australia is heavily regulated. The role of PHI, coverage and benefit requirements, and premium-setting behaviour by health insurance funds are constrained by legislation. There are two types of PHI cover, with the first being private hospital coverage which provides financial protection towards private hospital expenditures. The second type is ancillary or extra coverage which covers services such as dental care, allied health (e.g. physiotherapy), and items such as eye glasses which are not covered under Medicare. There is mandatory community rating on PHI premiums which stipulates that insurers must charge the same premium for a given insurance plan regardless of individuals’ age, gender, health status, and utilisation and claims history.

For any given health fund, premiums are allowed to vary across different products, and across states, to reflect the costs of claims in local markets. The former allows insurers significant flexibility in the product design, particularly to offer policies that explicitly exclude coverage for certain conditions or treatments (e.g. joint replacements for young adults) (Buchmueller, 2008). A system of risk equalisation, in place since 2007, exists to partially compensate health funds with enrollees that have riskier demographic profiles and higher claims experiences. Ministerial approval is required before health insurers can change the premiums charged.

A series of three policy changes was introduced in the PHI market from the late 1990s with the aim of encouraging the uptake of PHI, which had been steadily declining since the introduction of Medicare in 1984.² At the time, the prevailing policy position supported a balanced public and private involvement in the delivery of health care, and the declining PHI membership was seen as threatening the financial viability of the private health care sector, which could eventually have spillover effects on the public hospital system.

² Hall et al. (1999) provide an excellent discussion of the institutional environment and motivations that underpin the policy changes. See also Butler (2002) for a full description of the three policies.

The first policy change in 1997 was the Private Health Insurance Incentive Scheme, which involves the use of tax subsidies for low income individuals, and a tax levy for high income individuals and households without private insurance. Under this policy, singles and families with taxable income below $35,000 and $70,000 respectively are eligible for a subsidy, either in the form of a premium discount, direct payment, or tax offset. For the tax penalty, singles and families (inclusive of couples) with an annual household income greater than $50,000 and $100,000 respectively are liable for a tax levy amounting to one percent of their taxable income if they do not have private health insurance. This levy is known as the Medicare Levy Surcharge. By 1998, a second policy change was introduced, replacing the subsidy component with a non-means tested 30 percent rebate on premiums.

A third policy is the Lifetime Health Cover, which involves the modification of the community rating regulations. Introduced in July 2000, this policy allows health funds to charge a premium loading on those without private health cover, which is applied when these individuals decide to purchase insurance in the future. The loading is calculated as 2 percent of the base premium for each year of age above 30, and applies over the remainder of individuals’ lifetimes. What followed the implementation of these three policies was a dramatic increase in the percentage of the population with PHI, from a low of 30.1 percent in 1999 to 45.7 percent in September 2000, which has since stabilised at around 44 percent.

3. Econometric framework

The key elements of a decision model of the demand for hospital care and PHI in a mixed system such as Australia’s are described below to motivate the empirical modelling. This model is elaborated in Cheng (2011). The subject of interest is an individual who is an expected utility maximiser, who solves the following resource allocation problem

$$\max_{m,\, q,\, d} \; \sum_{s} \pi(s)\, U[C, h(m, q \mid s)] \tag{1}$$

with illness severity $s$ that is distributed $\pi(s)$, consumption $C$, and health status $h$. The inputs to the health production function $h(m, q \mid s)$ are hospital care services $m$ of quality $q$. Here, $m = (m^{y}, m^{ov}, st_{k})$ is a three-dimensional vector where $m^{y}$ and $m^{ov}$ denote the number of day and overnight hospital admissions respectively, and $st_{k}$ is the length of overnight stay (the number of nights in hospital) for the $k$th overnight admission. The quality indicator vector is $q = (q^{y}_{j}, q^{ov}_{k})$, where $q^{y}_{j}, q^{ov}_{k} \in \{0, 1\}$ and the subscripts refer to the $j$th day and $k$th overnight admission respectively. These are binary indices which assume the value of 0 if the individual chooses publicly (Medicare) funded hospital care, or 1 if private care is chosen. The individual may buy PHI, a decision denoted by $d$ where $d \in \{0, 1\}$, which is available
at nominal premium $P$. The effective premium $\tilde{P}$, defined as the nominal premium net of a premium subsidy ($r$ percent), and a tax levy of $\tau$ percent of income which applies to individuals with incomes above $Y_{T}$ and who do not have private insurance, is written as

$$\tilde{P} = \begin{cases} (1 - r)P & \text{if } Y \le Y_{T} \\ (1 - r)P + \tau Y & \text{if } Y > Y_{T} \end{cases} \tag{2}$$

The solution to the optimisation problem is given by the simultaneous equilibria in the demands for day and overnight admissions; the length of overnight stay in each overnight admission; the choice of public or private admissions; and the choice to purchase PHI. These are represented by

$$
\begin{aligned}
\tilde{m}^{y} &= m^{y}(\tilde{q}, \tilde{d}, s) \\
\tilde{m}^{ov} &= m^{ov}(\tilde{q}, \tilde{d}, s) \\
\widetilde{st}_{k} &= st_{k}(\tilde{q}, \tilde{d}, s) \\
\tilde{q}^{y}_{j}(s) &= \arg\max_{q^{y}_{j} \in \{0,1\}} V^{q^{y}_{j}}(\tilde{d}, s) \\
\tilde{q}^{ov}_{k}(s) &= \arg\max_{q^{ov}_{k} \in \{0,1\}} V^{q^{ov}_{k}}(\tilde{d}, s) \\
\tilde{d} &= \arg\max_{d \in \{0,1\}} EV^{d}
\end{aligned}
\tag{3}
$$

where $V^{q^{y}_{j}}$ and $V^{q^{ov}_{k}}$ are the indirect utilities associated with the admission type (i.e. public or private) strategies $q^{y}_{j}$ and $q^{ov}_{k}$, and $EV^{d}$ is the expected utility associated with insurance choice $d$.

The joint estimation of the outcomes in Eq. (3) involves estimating an econometric model consisting of six non-linear equations, which is a computationally intensive task. To simplify the econometric analysis, an econometric framework consisting of three distinct econometric models is developed. The first model jointly estimates the demands for day and overnight hospital admissions, and the demand for PHI (Section 3.1). The second model jointly analyses the admission type choice for day admission and the demand for PHI (Section 3.2). The third model jointly estimates the admission type choice for overnight admission, the length of hospital stay, and the demand for PHI (Section 3.3).

There are two reasons why the econometric framework is specified in the manner described above. Firstly, this specification distinguishes the frequency of admissions from outcomes that are conditional on admission, namely public–private choice and length of stay. Secondly, there is a data limitation in that the information on whether individuals chose to receive public or private care, and the length of overnight hospital stay, is only available for the most recent admission episode. Given this limitation, alternative specifications such as a bivariate model of public and private hospital admissions, or a joint model of day admissions and the total number of hospital nights, are not feasible.

Below, the econometric framework is described. It takes into account the simultaneity of the decisions on hospital care use and PHI, and is congruent with the count data nature of the admission and length of stay variables and the binary nature of the care type and insurance choice variables.

3.1. Demand for day and overnight hospital admissions

Let $m^{y}_{i}$ and $m^{ov}_{i}$ be the observed frequencies for day and overnight hospital admissions for the $i$th individual. Suppose that, conditional on the exogenous covariates $X_{i1}$ and $X_{i2}$, the endogenous variable $d_{i}$, and random unobserved heterogeneity terms $\varepsilon_{1}$ and $\varepsilon_{2}$, $m^{y}_{i}$ and $m^{ov}_{i}$ have independent Poisson distributions, with mean parameters $\mu^{y}_{i}$ and $\mu^{ov}_{i}$:

$$f(m^{y}_{i} \mid X_{i1}, d_{i}, \varepsilon_{i1}) = \mathrm{Po}[\mu^{y}_{i}] \tag{4}$$

$$f(m^{ov}_{i} \mid X_{i2}, d_{i}, \varepsilon_{i2}) = \mathrm{Po}[\mu^{ov}_{i}] \tag{5}$$

$$\mu^{y}_{i} = \exp(X_{i1}'\beta_{1} + \gamma_{1} d_{i} + \lambda_{1}\varepsilon_{i1}) \tag{6}$$

$$\mu^{ov}_{i} = \exp(X_{i2}'\beta_{2} + \gamma_{2} d_{i} + \lambda_{2}\varepsilon_{i2}) \tag{7}$$

where $\varepsilon_{1}$ and $\varepsilon_{2}$ are standardised and distributed standard normal, that is $\varepsilon_{1}, \varepsilon_{2} \sim N(0, 1)$. The decision rule to purchase private hospital insurance is represented by the continuous latent variable $d^{*}_{i}$, and the observed insurance choice $d_{i}$ is in turn related to $d^{*}_{i}$ by a dichotomous rule. These are written as

$$d^{*}_{i} = X_{i3}'\beta_{3} + \varepsilon_{i3} \tag{8}$$

$$d_{i} = 1[d^{*}_{i} > 0] \tag{9}$$

where $\varepsilon_{3} \sim N(0, 1)$. The RHS insurance variables $d_{i}$ in (6) and (7) are allowed to be endogenous by assuming that $\varepsilon_{1}$ and $\varepsilon_{2}$ are correlated with $\varepsilon_{3}$. A mixed bivariate Poisson lognormal model is adopted to gain efficiency, in which the unobserved heterogeneity terms $\varepsilon_{1}$ and $\varepsilon_{2}$ are allowed to be correlated. More specifically, it is assumed that the unobserved variables $\varepsilon_{1}$, $\varepsilon_{2}$ and $\varepsilon_{3}$ are distributed trivariate normal with zero means, unit variances, and correlation parameters $\rho_{\varepsilon_{j}\varepsilon_{k}}$, $\forall\, j \ne k$; $j, k = 1, 2, 3$. The likelihood function of the model is shown in Eq. (A.1) in Appendix A.

3.2. Admission type choice for day hospital admission

Suppose the decisions to seek day hospital care as a private patient and to purchase insurance are given by the continuous latent variables $q^{y*}$ and $d^{*}$ respectively, where

$$q^{y*}_{i} = Z_{i1}'\alpha_{1} + \delta d_{i} + v_{i1}; \qquad q^{y}_{i} = 1[q^{y*}_{i} > 0] \tag{10}$$

$$d^{*}_{i} = Z_{i2}'\alpha_{2} + v_{i2}; \qquad d_{i} = 1[d^{*}_{i} > 0] \tag{11}$$

The latent variables are related to the observed care type and insurance choices via the dichotomous rules given in (10) and (11). The RHS variable $d_{i}$ in (10) is allowed to be endogenous by assuming that $v_{i1}$ and $v_{i2}$ are distributed bivariate normal with zero means, unit variances and correlation parameter $\rho_{v}$. The insurance outcome $d$ is observed for all individuals, whereas $q^{y}$ is observed only for those for whom the number of day hospital admissions is positive. The likelihood function for this model, shown in Eq. (A.2), accommodates this feature of the data.

3.3. Admission type choice and length of hospital stay for overnight admission

Let $st_{i}$ denote the duration of hospital stay and $q^{ov}_{i}$ the admission type binary variable which takes the value of 1 when private care was chosen and 0 otherwise. The binary variable indicating insurance status is given by $d_{i}$ as above. Suppose that, conditional on $W_{i1}$, $q^{ov}_{i}$, $d_{i}$, and the unobserved heterogeneity $\eta_{i1}$, $st_{i}$ follows a Poisson distribution with truncation at zero, with mean parameter $\mu_{i}$:

$$f(st_{i} \mid W_{i1}, d_{i}, q^{ov}_{i}, \eta_{i1}) = \frac{\exp(-\mu_{i})\, \mu_{i}^{st_{i}}}{st_{i}!\, (1 - \exp(-\mu_{i}))} \tag{12}$$

$$\mu_{i} = \exp(W_{i1}'\theta_{1} + \phi_{1} d_{i} + \phi_{2} q^{ov}_{i} + \lambda_{3}\eta_{i1}) \tag{13}$$
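A minimal Python sketch of the zero-truncated Poisson density in Eq. (12) and the log-linear mean in Eq. (13) may help fix ideas. The function names, the argument names for the coefficients, and all numerical values are illustrative assumptions; they are not estimates or code from the paper.

```python
import math


def truncated_poisson_pmf(st: int, mu: float) -> float:
    """Probability of a stay of `st` nights under a zero-truncated Poisson with parameter `mu`."""
    if st < 1:
        return 0.0  # zero nights cannot occur for an overnight admission
    # Work in logs: log(exp(-mu) * mu**st / st!) avoids overflow for long stays.
    log_untruncated = -mu + st * math.log(mu) - math.lgamma(st + 1)
    return math.exp(log_untruncated) / (1.0 - math.exp(-mu))


def mean_parameter(index: float, coef_d: float, coef_q: float,
                   loading: float, d: int, q: int, eta: float) -> float:
    """Log-linear mean as in Eq. (13): exp(index + coef_d*d + coef_q*q + loading*eta)."""
    return math.exp(index + coef_d * d + coef_q * q + loading * eta)


# Hypothetical values, chosen only for illustration.
mu = mean_parameter(index=0.5, coef_d=0.1, coef_q=0.2, loading=0.3, d=1, q=1, eta=0.0)
probabilities = [truncated_poisson_pmf(st, mu) for st in range(1, 200)]
total_probability = sum(probabilities)
```

Summing the probabilities over st = 1, 2, ... should give a value very close to one, which is a quick check that the truncation term in the denominator has been applied correctly.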
where i1 is N(0, 1). The decision rules to seek private hospital care observations with missing or ambiguous responses, 7089 observa-
and to purchase insurance are given by the continuous latent vari- tions remained in the sample.3
ables qov∗
i
and di∗ and are related to the observed outcomes by the
respective decision rules 4.1. Private hospital insurance

qov∗
i
= Wi2
2 + 3 di + i2 ; qov
i
= 1[qov∗
i
> 0] (14) In the HILDA survey, individuals were also asked to provide
information on whether they have PHI and the type of coverage.
The three coverage types are hospital, ancillary or both. The focus
di∗ = Wi3

3 + i3 ; di = 1[di∗ > 0] (15) of the paper is whether individuals have private hospital insurance,
that is they possess either hospital only or combined cover.4 Table 1
The RHS variables qov
i
and di in (13) and di in (14) are allowed presents the proportion of individuals with private hospital insur-
to be endogenous by assuming that i1 , i2 and i3 are distributed ance by coverage and income unit types. Overall, 50.9 percent of
trivariate normal with zero means, unit variances and correlation the sample have private hospital insurance. Respondents from cou-
parameters j k ∀ j =
/ k ; j, k = 1, 2, 3. It is emphasised that the out- ple only (56.3 percent) and couple family (55.9 percent) households
come variables mi , qov
i
, di are observed only for individuals who are more likely to be privately insured compared with lone persons
have been hospitalised for an overnight admission, and for non- (39.1 percent) or lone parent (28.1 percent) households. Combined
hospitalised individuals only di is observed. The likelihood function hospital and ancillary cover is more common compared to hospital-
accommodates this feature of the data, and is shown in Eq. (A.3). only cover, with 79.5 percent of insured individuals having policies
of this type.
3.4. Estimation The average annual expenditure on private hospital insurance
premiums by coverage and income unit type for the HILDA data
The likelihood functions for the econometric specifications in are shown in Table 1.5 The expenditure on premiums is higher
Sections 3.1 and 3.3 are complex and require the evaluation of inte- for individuals with combined cover compared with hospital-only
grals. Maximum simulated likelihood (MSL) techniques are used and ancillary-only cover, and are generally higher for larger house-
to approximate the likelihoods given that these functions do not holds (e.g. couples vs. lone person).6 Variations in premiums also
have a closed-form expression. Quasi-Monte Carlo draws based on arise from the differences in the generosity of coverage, defined by
the Halton sequence were used in the simulations which have been the level of deductibles, percentage of copayment, as well as com-
shown to be more accurate and faster compared to the conventional prehensiveness in terms of the menu of services covered. For the
random number generator (Bhat, 2001; Train, 2003). The number of purpose of benchmarking the data on premium expenditures, data
simulations S has a considerable effect on the properties of the MSL from the 2003 to 2004 Household Expenditure Survey on house-
estimator (Gouriéroux and Monfort, 1996). 2000 simulations were hold expenditure on hospital, medical and dental insurance are
chosen, beyond which the estimation results obtained were very presented at the bottom of Table 1. One would observe that the
similar. The Berndt, Hall, Hall and Hausman (BHHH) quasi-Newton data on premiums are broadly consistent in both datasets.
algorithm was used to maximise the simulated likelihood using
numerical derivatives. The variance of the MSL estimates was com- 4.2. Defining the price of insurance
puted post convergence using the “cluster-robust” formula (Deb
and Trivedi, 2002, p. 608), which takes into account the presence of multiple observations from each household, as well as minimising the influence of simulation noise (McFadden and Train, 2000). The model in Section 3.2 is estimated via maximum likelihood.

4. Data

The empirical analysis uses data from the In-Confidence version of the Household, Income and Labour Dynamics in Australia (HILDA) survey. HILDA is a nationally representative longitudinal survey which collects extensive information on household and family formation, labour force participation and income, and life satisfaction, health and well-being. Every member of the household aged 15 and over is surveyed via a face-to-face interview and is requested to complete a self-completion questionnaire. The primary data source is wave 4 (2004) of the HILDA survey, where data are available on 17,209 individuals from 8280 households. A health module, in addition to the core survey questions, was included in wave 4, in which information on hospital care use and PHI status was collected. The data are combined with responses on self-assessed health status from wave 3 (2003) and information on households’ expenditure on PHI premiums from wave 5 (2005). In the analysis sample, respondents from multiple family households (N = 2276) and respondents aged below 25 years (N = 5591) were excluded. Given the emphasis on the relationship between hospital care use and PHI, individuals with PHI policies that cover only ancillary services (cf. Section 4.1) were coded as not having hospital insurance. After excluding

3 The following are the variables (top five by frequency) and the corresponding number of observations (in brackets) dropped due to missing or incomplete information: private insurance and policy type (335); whether hospitalised for day (418) or overnight (59) admission; self assessed health (1026); daily alcohol consumption (353).
4 In the original data set, 6.9 percent (N = 428) of the 6183 respondents who indicated that they have PHI have ancillary-only policies.
5 Household expenditure data on premiums for PHI are obtained from Wave 5 (2005) of the HILDA survey and adjusted to account for the average growth in premiums of 7.6 percent between 2004 and 2005 (Private Health Insurance Administrative Council, 2005).
6 Children under 21 years of age and full-time students below 25 years may be covered under their parents’ policy without additional cost.

Estimating the demand elasticity for private insurance requires the specification of an insurance price. In this paper, the price of insurance is defined as the ratio of the effective premium to the expected benefits received. To illustrate this definition, following Phelps (1997), suppose that the expected benefits from health insurance are E(B), which is a function of the prices and quantities of medical services, and patients’ cost sharing (e.g. copayments). The nominal premium is defined as P = (1 + L)E(B), where L is the percentage loading imposed by the insurer. The price of insurance is given by 1 + L = P/E(B), which is the ratio of the nominal premium to the expected benefits. Given the financial incentives, the effective premium P̃ faced by individuals is the nominal premium P net of the premium subsidy and, where applicable, the tax levy. By substituting P̃ for P, the price of insurance is expressed as P̃/E(B). This is interpreted as the effective price that is required for each dollar of expected benefits.
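The price definition above can be illustrated with a minimal sketch. It assumes that the premium subsidy is a proportional rebate on the nominal premium and that the tax levy avoided by insuring enters as a dollar amount; Eq. (2) in Section 3 is not reproduced here, so the function name, the parameter names and this functional form are hypothetical and serve only to show how P̃/E(B) behaves.

```python
def insurance_price(nominal_premium, expected_benefits,
                    subsidy_rate=0.0, levy_avoided=0.0):
    """Return the insurance price P~/E(B).

    Assumed (hypothetical) form of the effective premium:
        P~ = P * (1 - subsidy_rate) - levy_avoided
    where levy_avoided is the tax levy a person would pay if uninsured.
    """
    if expected_benefits <= 0:
        raise ValueError("expected_benefits must be positive")
    effective_premium = nominal_premium * (1.0 - subsidy_rate) - levy_avoided
    return effective_premium / expected_benefits


# With no subsidy or levy the price equals 1 + L (here a 20 percent loading).
price_nominal = insurance_price(1200.0, 1000.0)
# A proportional subsidy lowers the price paid per dollar of expected benefits.
price_subsidised = insurance_price(1200.0, 1000.0, subsidy_rate=0.3)
# A large enough levy makes the effective premium, and hence the price, negative.
price_high_income = insurance_price(1200.0, 1000.0, subsidy_rate=0.3,
                                    levy_avoided=1000.0)
```

Under these assumptions the price falls as either the subsidy or the avoided levy rises, and it becomes negative once the avoided levy exceeds the subsidised premium.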
Table 1
Private hospital insurance status and premiums by coverage and income unit types.

                          Lone person    Lone parent    Couple only    Couple family   Total
Insurance status
  No hospital cover       789 (60.9%)    264 (71.9%)    1086 (43.7%)   1085 (44.9%)    3224 (49.1%)
  With hospital cover     507 (39.1%)    103 (28.1%)    1397 (56.3%)   1330 (55.1%)    3337 (50.9%)
  % Hospital only         22.7           17.5           23.2           17.1            20.5
  % Hospital & ancillary  77.3           82.5           76.8           82.9            79.5
Premiums ($)
  Ancillary only (a)      410.45         670.76         712.03         683.75
  Hospital only           751.95         894.39         1261.79        1345.17
  Hospital & ancillary    1095.72        1549.65        1766.07        1939.09
  Total                   957.20         1326.58        1602.96        1761.21
  HES (2003–04) (b)       982.90         1367.55        1694.08        1740.29

(a) Individuals with ancillary only policies are not included in the analysis but shown here for comparison.
(b) Source: Confidentialised Unit Record Files (CURF) from the Household Expenditure Survey (HES) 2003–04. Derived from weekly household expenditure on ‘hospital, medical and dental insurance’.
Based on this definition, the observed variation in the insurance price arises from two main sources. The first source of variation is introduced through the Medicare Levy Surcharge. This is shown in Eq. (2) in Section 3. For a given insurance contract, variation in the effective premium arises from differences in the levy that individuals are liable to pay if they choose not to buy private insurance. For individuals above the income threshold, those with higher household income are required to pay a larger levy, and correspondingly face a lower effective premium. For those whose income is sufficiently high, the tax liability may exceed the nominal premium for the most generous policy available, resulting in a situation where the effective premium is less than zero.

The second source of variation arises from differences in the expected benefits from insurance, which in turn depend on individuals’ characteristics such as age, gender and household composition. For instance, all else being equal, older individuals are expected to have higher use of hospital care compared to those who are younger. Females in their childbearing years are expected to have higher use compared with males. These utilisation patterns are illustrated in Fig. 1, which shows the expected annual benefits per person for a hospital-only policy by sex, age (available in five-year bands) and states. In terms of household composition, households with children accrue higher expected benefits compared with those without children.

[Figure: line plot of expected annual benefits ($, 0 to 5000) against age in years (0 to 95+), with separate series for males (M) and females (F) in NSW, VIC, QLD, SA and WA.]
Fig. 1. Expected annual hospital benefits by age, sex & states.

An advantage of defining the insurance price in this manner is that it allows one to account for heterogeneity in policy types that is observed in the data. For instance, individuals with combined hospital and ancillary cover pay a larger premium, and receive expected benefits that are higher than those with hospital-only coverage.

Calculating the insurance price requires information on premiums. This is relatively straightforward for individuals with private insurance as one can use data on premium expenditure. For those without private insurance, premiums have to be estimated. This approach has been previously applied in studies that estimate the price elasticity of demand for private health insurance (e.g. Marquis and Long, 2001; Costa and García, 2003). For this paper, premiums for uninsured individuals are estimated using the predictions from a regression of premium expenditure observed from privately insured individuals in the sample. With data on premiums, one can calculate the effective premium, which forms the numerator in the price formula. For the denominator, the expected benefits are estimated using published data on benefits payouts. Here, information on age, sex and state of residence of respondents and their family members (where applicable), as well as data on respondents’ coverage types (hospital only, hospital and ancillary) and policy types (e.g. couple, family) are used to calculate the expected benefit accruing to each household.7 Expected benefits are calculated using statistics published by the Private Health Insurance Administrative Council for the 2004–05 financial year (July 2004 to June 2005). For uninsured individuals, the insurance price is calculated using the estimated premiums and benefits of a combined hospital and ancillary policy as a benchmark. This is because the combined cover is the most common coverage type as shown in Table 1.

For the subsample of individuals where observed premium expenditure is used to calculate the insurance price, the price may be inaccurate given that premiums reflect information on plan characteristics (e.g. deductibles and copayments) whereas benefits do not. As a result, the price may be understated (overstated) for individuals with more (less) generous coverage. This may affect the price elasticity estimate, and the direction and magnitude of any potential bias will depend on the distribution of benefits payouts.8 Unfortunately the lack of detailed data on these aspects does not allow one to make reasonably accurate inference on how the elasticity estimate might be affected.9

To address this potential problem, an alternative formulation of the insurance price is considered, where predicted premiums are used in place of observed premiums. Here, the insurance price is calculated by dividing predicted premiums by expected benefits. This formulation has the advantage that it does not over- or understate benefits (in relation to premiums) and the insurance price, and hence serves as a point of reference to assess the sensitivity of the price elasticity estimate. There are two disadvantages of using predicted premiums, the first of which is the general issue of efficiency loss given that information captured within observed premiums is not utilised. A second disadvantage is that endogeneity problems may be more severe given that plan-specific characteristics are incorporated within neither premium expenditures nor benefits, and these will be subsumed into the unobservables in the insurance equation. Endogeneity arises if these unobservables correlate with the unobserved characteristics that influence admission type choices and hospital use intensity, potentially leading to biased estimates. However, as with the case of observed premiums, such potential biases are mitigated with simultaneous equation estimation. Foreshadowing the results, the estimated price elasticities from using observed and predicted premiums are very similar (see Table 10), which suggests that the former is not significantly affected by unobserved plan characteristics. In addition, the regression estimates show a higher degree of correlation using predicted compared with observed premiums. This is consistent with expectations given that predicted premiums contain less information.10

While the price formulation incorporates the premium subsidy and tax levy, the effect of the Lifetime Health Cover (LHC) policy is not explicitly modelled. Under this policy, individuals without private coverage are charged a premium loading which is applied when they decide to buy private insurance in the future. To empirically model the effect of this policy requires the use of panel data given that the decision of whether and when to buy PHI would affect the premium cost. Ellis and Savage (2008) proposed a simple approach to analyse the effect of this policy by adding into the insurance equation an additional regressor measuring the difference between anticipated higher premiums under LHC and actual premiums. This approach is used in a sensitivity analysis that is discussed in Section 7. Previewing the results, the estimated price elasticities are very similar with or without accounting for the LHC policy.

4.3. Estimating premium expenditure

Premiums are estimated by fitting an ordinary least squares (OLS) regression to the logarithm of annual premium expenditure observed from individuals with private hospital insurance. Community rating requirements imply that individuals’ risk characteristics cannot be used as predictors of premiums. Instead, variations in premiums are explained by differences in coverage and policy types, as well as factors that influence claims experience and operating cost of health insurers across states. To this end, two sets of explanatory variables are included in the regression. The first is a set of binary indicators that represent individual-level information on coverage and policy types. The second is a set of state-level variables. For instance, the percentage of insurance policies with an excess across the different states is used as a proxy for plan characteristics, which are not collected in the data.

To capture claims cost, state-level information on the percentage of population over 65 years is used as it has been shown that benefit payouts are significantly higher for older individuals. Claims are expected to be larger where professional fees charged by doctors are higher, the effect of which is captured by including the average price of an office-based consultation with a private specialist. The specialist-to-population ratio is also included, which could be positively related to claim expenditures arising from higher activity rates, or negatively related if insurance funds are in a better position to negotiate for lower payments. Lastly, to capture economies of scale in the cost of administering private health insurance, the ratio of health insurance contracts to the number of individuals employed in the general and health insurance industry is included as a regressor.11 It should be pointed out that because of multicollinearity, state identifiers cannot be included alongside the state-level covariates. Hence the estimates on these covariates also reflect heterogeneity across regions.

Table 2
OLS regression of log annual expenditure on premiums.

Variables                     Coeff         Std Err.
Hospital & ancillary cover    0.310***      0.025
Family policy                 0.575***      0.023
Couple-only policy            0.543***      0.030
Sole parent policy            0.198         0.184
% of policy with excess       4.778***      1.822
% over 65 years               9.127***      1.990
Avg specialist fees (’00)     0.0045***     0.00086
Specialist density            −0.923***     0.253
(Specialist density)^2        0.0045**      0.0012
Policy to employees ratio     −0.035***     0.013
Constant                      51.186***     12.367
Number of obs.                3533
Adjusted R-squared            0.201

* p < 0.1. ** p < 0.05. *** p < 0.01.

7 The approach adopted in a number of studies estimating the price elasticity of private health insurance is to include premiums as a determinant of the insurance choice (see Kiil (2012) for a recent review). In addition, covariates such as age, gender, and household size are usually added as additional controls. These variables account for the expected benefits of insurance, although in a non-specific way. Hence the definition of the insurance price adopted in this paper provides a more structured interpretation in that these characteristics affect insurance choice by influencing individuals’ assessment of the expected benefits from insurance. The implied price elasticity estimates from both approaches are quite similar. These results are available upon request from the author.
8 The price of insurance would potentially be mismeasured if individuals with higher utilisation switch into plans with more generous coverage. This is a possibility given that data on premium expenditure is collected in the year after the hospitalisation data. If plan switching occurs, the price of insurance would be overstated (understated) if individuals with higher (lower) spending switch into more (less) generous plans. The extent to which plan switching takes place in the Australian context is unclear. Models of risk selection under community rating predict adverse selection while more recent research (e.g. Doiron et al., 2008) suggests advantageous selection. Hence whether the institutional context supports adverse or advantageous selection, and whether these factors drive plan-switching, remain open research questions. I am grateful to a referee for raising this issue.
9 Aggregate data on plan characteristics indicate that 96 percent of insurance contracts in-force at the end of 2004 have no exclusion conditions, and 59 percent have a front-end deductible (Private Health Insurance Administrative Council, 2004).
10 The results of the simulation analysis based on predicted premiums are presented as a sensitivity check in the paper, but not the regression estimates. These are available upon request.
11 Data on the supply-side variables are obtained from a variety of sources: information on policies with excess and the number of health insurance contracts is based on 2004–05 data published by the PHI Administrative Council; information on the number of individuals employed in the insurance industry and characteristics of the Australian population is based on the 2006 Census; information on the number of specialists in Australia is based on the sampling frame from wave 1 (2008) of the Medicine in Australia: Balancing Employment and Life (MABEL) longitudinal survey of doctors, while the price of private specialists’ services is based on self-reported data also from the MABEL data.

The results from the OLS regression on log expenditure on premiums are shown in Table 2. The coefficients are mostly statistically significant and are consistent with expectations. Expenditure on premiums is higher for individuals with combined hospital and ancillary cover compared with hospital only cover, and is larger for family and couple only policies relative to single policies. The estimate on the fraction of policies with excess is
statistically significant, albeit with an unexpected sign. Premiums are positively related to the percentage of individuals over 65 years and specialists’ fees, and have a U-shaped relationship with specialist density. Finally, premiums are negatively associated with the ratio of policies to individuals employed in the insurance industry.

Studies have applied sample selection models to impute premiums which are otherwise not observed for the uninsured individuals. For example, Marquis and Long (2001, 2002) used this approach on the grounds that selectivity correction removes the effects of omitted variables that influence both the health insurance premium and the decision (by firms) to offer insurance. The theoretical justification is based on a behavioural model of firms’ decision to offer insurance to their employees whereby the problem of selectivity occurs when firms that do not offer insurance face higher premiums compared to those that offer insurance, for reasons such as higher perceived risk by insurers that are not observed by the researcher. In the Australian context, community rating regulations prohibit health insurers from setting premiums to discriminate based on individuals’ risk and hence the selectivity problems observed by the preceding studies are likely not to be relevant.12

For high income individuals, the combination of the tax and subsidy programs results in a situation where the potential tax liabilities through the Medicare Levy Surcharge exceed the premium cost of PHI. When this occurs, the price of insurance is less than zero. In the sample of 7089 observations, 495 individuals have an effective price of insurance that is less than zero.13 These observations are excluded from the regression analysis but are included in the simulation analysis. The subsequent sections describe the characteristics of the 6594 observations in the sample.

4.4. Measures of hospital care use

In the survey, individuals were asked about the number of occasions they had been admitted to hospital as a day patient and for an overnight stay in the twelve months preceding the survey. Individuals who had been hospitalised were further queried on the duration of hospital stay and whether they were admitted as a Medicare (public) or private patient on the most recent day and overnight admission. The descriptive statistics for the hospital care utilisation measures are summarised in Table 3. The count data utilisation measures are the number of day admissions, overnight admissions, and the number of hospital nights. The measures have mass points at 0 and 1 and exhibit overdispersion where the unconditional variance (S_y^2) is larger than its mean (ȳ). The number of hospital nights, and the binary outcome of whether individuals chose to be admitted as private patients in their last overnight hospital stay, are observed only for the 881 individuals in the analysis sample who have been admitted for an overnight stay. On the latter, 46.9 percent of individuals chose private care. Of the 800 individuals who had at least one day hospitalisation episode, 58.6 percent chose to be admitted as private patients in the last day admission.

4.5. Remaining explanatory variables

The remaining explanatory variables that are used in this study can be classified into the following categories: demographics and socioeconomic characteristics (e.g. age, gender, household income), health status measures (presence of chronic conditions), health risk factors (drinker, smoker) and geographical information (state/territories, remoteness). The choice of variables is similar to that in Cameron et al. (1988), Cameron and Trivedi (1991), Savage and Wright (2003) and Propper (2000). In addition to variables that are available in the survey, two variables that are obtained through external data sources are included. The first is the number of individuals employed in the “general and health insurance” industry within the intermediate local area of the survey respondents’ residential location. This variable is included to capture individuals’ accessibility to insurance services such as information on health insurance products.14 The second variable is the distance to the nearest private hospital from the survey respondents’ location of residence, which is included to capture the accessibility to private hospital services.15 This is defined as the Euclidean distance between the centroid of the survey respondent’s postal area and the centroid of the postal area of the nearest private hospital.16 Distance is calculated using data on the coordinates (longitude and latitude) of centroids via the Haversine formula.

The descriptive statistics for the explanatory variables are presented in Table 4. The average age of the sample is 49.9 years, with a range of 25–99 years. Females make up roughly 54 percent of the sample. The average annual household income is $60,120. Approximately 47 percent of individuals have no postschool qualification (year 12 and below). By occupation, 40 percent of individuals are not in the labour force, while the largest group is Professionals (24 percent). Measures of individuals’ health status include indicators of self-assessed health status collected in wave 3 and a set of binary variables that indicate the presence of chronic conditions that affect physical and social functioning. For the self-assessed health status, 45 percent of individuals reported being in excellent or very good health in the wave 3 survey, with 36 percent indicating that their health is good and 19 percent fair or poor. Indicators of health risk factors include whether individuals consume alcohol daily (9.7 percent) and are regular smokers (19 percent). Geographical information includes state/territory indicators as well as remoteness categories. Approximately 58 percent of individuals reside in major cities in Australia. The average number of individuals employed in the general and health insurance industry in a statistical subdivision is 800 and the mean distance to the nearest private hospital is 30.3 km.

12 The possibility of selectivity bias is investigated by estimating a regression of observed premiums using a Heckman selection model. The results obtained were counter-intuitive: predicted premiums were higher for insured individuals compared with those without insurance.
13 It is expected that these individuals would purchase private hospital insurance given they would be financially better off, although it is observed that a small percentage (4.9 percent) are not privately insured. This may be explained by a variety of factors, including the presence of measurement error in household income. For these uninsured individuals, the magnitude of the effective outlay on insurance premiums is small (median = −$451, mean = −$705), which suggests that the effects of measurement error, if any, are not likely to be significant.
14 The variable is derived using information on industry and location of employment based on data from the 2001 Australian Census of Population and Housing. The industry category is Industry Code 742 (Other Insurance), which includes 7421 (Health Insurance) and 7422 (General Insurance) based on the Australian and New Zealand Standard Industrial Classification (ANZSIC) in 1993. The unit of reference for defining the location of employment is the Statistical Subdivision (SSD), a spatial unit of intermediate size. In the 2001 Australian Standard Geographical Classification, there were a total of 207 SSDs, with each SSD containing an average of 22 postal areas. The postal codes of respondents in the HILDA survey are linked to the SSD using the 2001 Australian Bureau of Statistics (ABS) “Statistical Subdivision and Postal Area Concordance” data that is available by request from the ABS.
15 Patients in principle can elect to receive hospital care as private patients in public hospitals but this practice was not common until a few years ago when public hospitals began to actively encourage public patients to use their private health insurance. For instance, some health jurisdictions (e.g. Queensland Health) have a policy of paying the cost of patients’ front-end deductibles or co-payments to encourage them to use their private insurance coverage.
16 Data on the centroids of postal areas are obtained from the Australian Bureau of Statistics (2006).
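The distance variable described in Section 4.5 can be sketched as follows. This is a minimal illustration of the Haversine formula applied to pairs of centroid coordinates, not the author's code: the function names, the choice of a mean Earth radius of 6371 km and the representation of hospitals as a list of (latitude, longitude) pairs are all assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; not stated in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def nearest_hospital_km(home, hospitals):
    """Distance from a respondent's postal-area centroid to the closest
    private-hospital postal-area centroid (hypothetical helper)."""
    return min(haversine_km(home[0], home[1], lat, lon) for lat, lon in hospitals)


# Illustrative coordinates only (approximate city centres, not survey data).
sydney = (-33.8688, 151.2093)
hospital_centroids = [(-37.8136, 144.9631), (-33.9, 151.2)]
distance = nearest_hospital_km(sydney, hospital_centroids)
```

In this sketch the minimum is taken over all candidate hospital centroids, which mirrors the "nearest private hospital" definition in the text.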
Table 3
Summary statistics: hospital care utilisation measures.

Frequency          Day admissions   Overnight admissions   Hospital nights   Private: day   Private: overnight
Pr(y = 0)          0.874            0.866                                    0.414          0.531
Pr(y = 1)          0.098            0.098                  0.310             0.586          0.469
Pr(y = 2)          0.018            0.022                  0.132
Pr(y = 3)          0.005            0.006                  0.124
Pr(y = 4)          0.001            0.004                  0.081
Pr(y = 5) (a)      0.001            0.002                  0.104
Range              0–12             0–12                   1–135             0–1            0–1
Mean (ȳ)           0.177            0.194                  5.058
Variance (S_y^2)   0.406            0.372                  65.930
S_y^2/ȳ            2.283            1.919                  13.035
N                  6594             6594                   881               800            881

(a) For brevity, only frequencies up to Pr(y = 5) are presented for the count utilisation measure. See ‘range’ for information on all realisations.
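The overdispersion measure reported in Table 3 is the ratio of the variance to the mean of each count outcome; a ratio above one is at odds with the equal mean and variance implied by a Poisson distribution. The following is a minimal sketch of that calculation. The function name and the use of the sample (n − 1) variance are assumptions, and the counts are made-up illustrative values rather than survey data.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Return (mean, variance, variance / mean) for a list of count outcomes.

    Uses the sample variance (n - 1 denominator); the text does not say
    which variance estimator underlies Table 3, so this is an assumption.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    v = variance(counts)
    return m, v, v / m


# Illustrative counts with a mass of zeros and one large value.
example_counts = [0, 0, 0, 1, 5]
m, v, ratio = dispersion_ratio(example_counts)

# The hospital-nights column of Table 3 can be checked from its reported
# mean and variance: 65.930 / 5.058 is approximately 13.035.
reported_ratio_nights = 65.930 / 5.058
```

A ratio well above one, as for hospital nights, is the pattern that motivates count models allowing for overdispersion.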
Table 4
Descriptive statistics: explanatory variables (N = 6594).

Variable                    Description                                                 Mean     Std dev
Insurance price             Log of insurance price                                      −0.40    0.83
Age                         Age                                                         49.91    15.45
Female                      Female (0/1)                                                0.54     0.50
Couple                      Couple income unit (1/0)                                    0.75     0.43
Depchild                    Have dependent children (1/0)                               0.42     0.49
Country of birth:
  Australia (Ref)           Person is born in Australia (0/1)                           0.77     0.42
  Main English              Person is born in main English speaking countries (0/1)     0.12     0.32
  Other                     Person is born in other countries (0/1)                     0.11     0.31
HH Income                   Annual household income ($’000)                             60.12    40.63
Education qualification (qual.):
  School (Ref)              Person’s highest qual. is year 12 or below (0/1)            0.47     0.50
  Certificate               Person’s highest qual. is a Certificate (0/1)               0.23     0.42
  Dipl/Adv Dipl             Person’s highest qual. is a (Advanced) Diploma (0/1)        0.097    0.30
  Bach. and postgrad        Person’s highest qual. is a degree or above (0/1)           0.21     0.41
Occupation category:
  Unemploy (Ref)            Not in employment (0/1)                                     0.40     0.49
  Manager/Admin             Managers and Administrators (0/1)                           0.063    0.24
  Professional              Professionals (0/1)                                         0.24     0.43
  Clerical/Service          Clerical and Service workers (0/1)                          0.15     0.36
  Trades/Transport          Trades, Production, Transport, Labourers (0/1)              0.15     0.36
Self assessed health (SAH):
  Very Good/Excellent (Ref) Person’s SAH in t − 1 is excellent or very good (0/1)       0.45     0.50
  Good                      Person’s SAH in t − 1 is good (0/1)                         0.36     0.48
  Fair/Poor                 Person’s SAH in t − 1 is fair or poor (0/1)                 0.19     0.39
Chronic health conditions (conds.):
  Work Limiting             Conds. limit amount and type of work (0/1)                  0.39     0.67
  Self Care                 Conds. cause difficulties with self care (0/1)              0.041    0.20
  Mobility                  Conds. cause difficulties with mobility activities (0/1)    0.086    0.28
  Communication             Conds. cause difficulties with communication (0/1)          0.0080   0.089
Alcohol daily               Person drinks alcohol daily (0/1)                           0.097    0.30
Regular smoker              Person is a regular smoker (0/1)                            0.19     0.39
State:
  NSW (Ref)                 Person lives in New South Wales (0/1)                       0.29     0.45
  VIC                       Person lives in Victoria (0/1)                              0.25     0.43
  QLD                       Person lives in Queensland (0/1)                            0.21     0.41
  SA                        Person lives in South Australia (0/1)                       0.091    0.29
  WA                        Person lives in Western Australia (0/1)                     0.10     0.30
  TAS/NT                    Person lives in Tasmania or Northern Territory (0/1)        0.039    0.19
  ACT                       Person lives in the Australian Capital Territory (0/1)      0.017    0.13
Remoteness:
  Major cities (Ref)        Person resides in major cities (0/1)                        0.58     0.49
  Inner region              Person resides in inner regional areas (0/1)                0.28     0.45
  Other                     Person resides in outer regional and (very) remote (0/1)    0.15     0.35
Headcount                   Number (’000) of persons working in the health and general  0.80     1.89
                            insurance industry (respondents’ residential local area)
Distance                    Euclidean distance (in km) to the nearest private hospital  30.30    90.18
In the regression analysis, the entire set of covariates described above is included in the intensity of hospital use, choice of admission type and insurance equations, except for the age, gender and income unit type (presence of dependent children, couple vs. single) variables, which are excluded from the insurance equation.17 The effects of these covariates on the propensity to insure have been factored into the insurance price variable, through the calculation of the expected benefits from insurance. The implicit assumption here is that age, gender and household composition are not expected to have an independent effect on the propensity to insure after controlling for the price of insurance. It is emphasised that these exclusions are justified on theoretical grounds, and are not required for the identification of the econometric models. The identification strategy is discussed next.

4.6. Identification and exclusion restrictions

Formally, the econometric model described in Section 3 is identified by the nonlinearity of the functional form and error distributions. However, the reliance on such a strategy is unappealing and hence exclusion restrictions are imposed to strengthen the identification. The model in Section 3.1 constitutes a mixed bivariate Poisson lognormal model with a single common endogenous binary insurance variable, while that in Section 3.2 is a probit model for the admission type (day hospitalisation) equation with one endogenous binary variable. The model in Section 3.3 consists of a Poisson lognormal model with endogenous insurance and admission type binary variables, as well as a probit model for the admission type (overnight hospitalisation) equation with one endogenous insurance variable. Imposing exclusion restrictions for the model in Section 3.1 requires that there is at least one variable in X3 that is excluded from X1 and X2, and for the model in Section 3.2 one variable in Z2 that is excluded from Z1. For the model in Section 3.3, this requires that there is one variable in W3 that is excluded from W1 and W2, and one variable in W2 that is excluded from W1.

To satisfy the above exclusion restrictions, the number of individuals employed in the general and health insurance industry is included in the insurance equation (X3, Z2, W3) but not in the admission type choice (Z1, W2), frequency of admissions (X1, X2) and length of stay (W1) equations. This variable serves as a proxy for the accessibility to insurance services, which influences the ease with which individuals can acquire information on health insurance products. This is likely to influence whether individuals choose to purchase private hospital insurance but not the choice between public and private hospital care and the intensity of hospital care use. In addition, the distance to the nearest private hospital is included in the admission type equation for overnight hospitalisation but excluded from the length of stay equation. Data on distance to hospitals have frequently been employed as instruments to address selection bias in studies on treatment outcomes and hospital quality (e.g. McClellan et al., 1994; Gowrisankaran and Town, 1999). For our purpose, individuals’ choice to seek private or public hospital care is based on a variety of factors which include the types and severity of illness, the availability of private hospital insurance, as well as the proximity of private hospitals. The distance to private hospitals is very likely to be uncorrelated with the unobserved type and severity of individuals’ medical conditions and for this reason is justified as an excluded variable in the length of stay equation.

5. Results

The simultaneous equation models are estimated, along with their corresponding single equation variants, to assess the relative merits of the different models. For the single equation models, the number of hospital admissions and length of stay are estimated using a Poisson lognormal model, while the admission type and insurance equations are estimated using simple probit regressions. An underlying assumption of these single equation models is that the insurance and admission type regressors, d and q, may be treated as exogenous. For the bivariate Poisson lognormal model in Section 3.1, the likelihood ratio test overwhelmingly rejects the single equation model in favour of the simultaneous model. The likelihood ratio test statistic is 1553.18, which rejects the null hypothesis that the correlation parameters (indexed jk) are jointly equal to zero. For the models in Sections 3.2 and 3.3, the likelihood ratio tests cannot reject the null hypothesis that the correlation parameters are zero.

It is noted that the correlation parameters (see bottom of Tables B.1 and B.2) are individually statistically significant in some instances. For the model of the number of hospital admissions (Table B.1), there is significant positive correlation (the parameter indexed 12) between the unobserved determinants of day and overnight hospital admissions. This result is expected as individuals’ treatment regimen may involve overnight hospitalisations for more intensive investigations and interventions and further day admissions for follow-up procedures. For the model of length of hospital stay and admission type for overnight hospitalisation (Table B.2), the estimate of the correlation parameter between the unobservables in the conditional mean equation for length of stay and those in the insurance equation (the parameter indexed 13) is statistically significantly different from zero, suggesting that there is evidence of endogeneity in the decision to purchase insurance and the length of overnight stay. The positive signs of these correlation parameters indicate that the unobserved factors that positively influence the propensity to purchase private insurance have a positive effect on the intensity of overnight stay.

Goodness of fit measures for the simultaneous and single equation models are shown in Table 5. The measures, which include the correctly classified ratio (CCR), root mean squared prediction error (RMSPE), and the mean absolute prediction error (MAPE), quantify the deviations between observed outcomes and their predictions. Both models performed almost identically on how well the predictions fit the observed data for most outcome variables except for hospital nights. For the binary outcomes, the percentage of correctly classified predictions ranges from 71.6 percent for private health insurance status to 87.1 percent for the proportion of day admission patients who sought private hospital care. For the count data measures, the frequencies of day and overnight hospital admissions are considerably better predicted compared with hospital nights. This may be because the distribution of hospital nights has a long upper tail that is not satisfactorily accommodated by conventional models for overdispersed data such as the Poisson lognormal and negative binomial. An alternative is to use more flexible approaches such as the semi-parametric finite mixtures
and flexible count models based on series expansions (Cameron and Trivedi, 2013). These, however, are not attempted as they are computationally complex, even without considering the presence of endogenous regressors.

17 A case may be made to include the supply-side variables that are used to predict premiums into the insurance equation. This would be justified if individuals residing in states where specialist fees are higher are more likely to purchase insurance to afford these high fees. Including the supply-side variables as additional regressors in the insurance equation produced estimates that are virtually identical.

Below, the results on the determinants of the intensity of hospital care use and the choice to receive public or private care are discussed. The estimates of the marginal effects of private
Table 5
Goodness of fit statistics.

| Variables | Day admissions | Overnight admissions | Hospital nights | Private patient: day | Private patient: overnight | Private health insurance |
|---|---|---|---|---|---|---|
| Simultaneous equation | | | | | | |
| CCR | 76.69% | 75.45% | 11.0% | 87.13% | 86.04% | 71.57% |
| RMSPE | 0.75 | 0.74 | 7.99 | | | |
| MAPE | 0.29 | 0.32 | 3.70 | | | |
| Single equation | | | | | | |
| CCR | 76.58% | 75.28% | 12.83% | 84.40% | 87.06% | 71.58% |
| RMSPE | 0.74 | 0.74 | 7.80 | | | |
| MAPE | 0.29 | 0.31 | 3.52 | | | |

Note: CCR, correct classification ratio; RMSPE, root mean square prediction error; MAPE, mean absolute prediction error.
insurance are discussed, followed by a brief overview of the coefficient estimates for the remaining explanatory variables. These results are based on the estimates from the simultaneous equation models. For the discussion on the marginal effects, the results obtained under the single equation models are included for comparison.

Before proceeding, a note on interpreting the marginal effects. For the count data outcome measures, the marginal effect for a binary variable (e.g. insurance) is expressed as a proportional change in the expected outcome $E(m \mid X)$ from changing $x_j$ from 0 to 1, which is calculated as

$$\frac{E(m \mid x_j = 1, \bar{X})}{E(m \mid x_j = 0, \bar{X})} = \frac{e^{\beta_j + \sum_{i \neq j} \beta_i \bar{x}_i}}{e^{\sum_{i \neq j} \beta_i \bar{x}_i}} = e^{\beta_j}.$$

For the binary outcome measures, the marginal effects are calculated in the usual way with the remaining covariates at their mean values. All standard errors for the marginal effects are calculated via the delta method.

5.1. Marginal effects of insurance

Table 6 presents the marginal effects of PHI on the intensity and type of hospital care use. The results from the simultaneous equation models are presented in column (1). Those from the single equation modelling are shown in column (2) for comparison.

Table 6
Marginal effect of private insurance on hospital use and public/private choice.

| Variables | (1) Simultaneous equation | (2) Single equation |
|---|---|---|
| Day admissions | 1.317 (0.640) | 1.270** (0.115) |
| Overnight admissions | 1.094 (0.357) | 1.210** (0.101) |
| Hospital nights: private patients^a | 1.616 (0.578) | 2.422** (0.558) |
| Private patient: Day admission^b | 0.779*** (0.132) | 0.732*** (0.031) |
| Private patient: Overnight admission^b | 0.555*** (0.171) | 0.746*** (0.027) |

Note: For count outcomes, statistical significance is tested against the null hypothesis that the marginal effect, which is interpreted as a proportional change in the expected outcome, is equal to 1. For binary outcomes, the marginal effect is the change in the expected probability of the event occurring. Standard errors in parentheses.
^a Length of overnight stay for private patients in the last admission episode.
^b Admitted as private patient in the last day/overnight admission.
* p < 0.1. ** p < 0.05. *** p < 0.01.

The first two sets of estimates are the marginal effects of private insurance on the number of day and overnight admissions. The estimates from the simultaneous models are 1.32 and 1.09 for day and overnight admissions respectively, and are not statistically significantly different from unity. These results indicate that privately insured individuals do not have significantly higher intensity of hospital admissions compared to those without private insurance. The results from the single equation estimation, on the other hand, indicate that privately insured individuals utilise both day and overnight hospital care at a higher intensity.

A priori, it is not clear whether individuals with double insurance coverage, through both PHI and the public health insurance system, are expected to have higher or lower use of hospital care compared with individuals insured only under the public system. This is because the measures of hospital admission observed in the data reflect total (i.e. public and private) admissions. On one hand, privately insured individuals face a lower effective price for private hospital care, and may therefore have higher use of private hospital services. On the other hand, these individuals may concurrently reduce their use of public hospital services. The net effect depends on whether individuals with private insurance substitute private for public health care, or expand their use of private health care without reducing their reliance on the public system.18 The relative efficiency of the public and private sectors is another important factor (Vera-Hernandez, 1999). It may be the case that private hospitals are more efficient, hence the treatment of a given medical condition would require fewer visits in private hospitals compared with public ones.

18 Fabbri and Monfardini (2011) find evidence that individuals with voluntary health insurance consume more private specialist visits, and at the same time demand fewer public specialist visits in the Italian National Health Service.

The interpretation of the marginal effect of insurance on the duration of private overnight hospital stay is slightly more straightforward. Given that privately insured individuals face a lower effective price for private care, one would expect that the length of private overnight stay is on average higher among those with PHI. The point estimate from the simultaneous model is 1.62 but is not statistically significantly different from unity. This result suggests that privately insured individuals do not have longer duration of private hospital stays compared to those without private insurance. In contrast, the estimate from the single equation modelling suggests that privately insured individuals have a significantly higher duration of hospital stay.

The last two estimates are the marginal effects of private insurance on the probability that an individual chooses to obtain day and overnight hospital care as private patients. The choice variables are binary and hence the marginal effect is interpreted as the difference in the expected probability of choosing private care by individuals with or without private insurance. The estimates from the simultaneous model indicate that privately insured individuals are 78 percentage points and 56 percentage points more likely to choose private care for day and overnight admissions respectively. An interesting observation can be seen by comparing the estimates from the two different modelling approaches. For day admissions, the marginal effect from the simultaneous model is larger compared with the single equation model whereas for overnight admissions the relative magnitude is reversed. These results suggest that the direction
of the endogeneity effects differs between day and overnight admissions.

5.2. Coefficient estimates

Before proceeding to the policy simulations, which are the key focus of the paper, the coefficient estimates from the regressions are briefly discussed. These are presented in Tables B.1 and B.2 in Appendix B.19 The coefficient estimates have the expected signs, and indicate that demographic and socioeconomic characteristics such as age, sex, household income and employment status are important determinants of the demand for hospital care, the choice to seek private care, and the decision to purchase private insurance. Health status plays an important role in influencing the intensity of hospital admissions and propensity to be privately insured. The headcount and distance variables, which serve as exclusion restrictions, are mostly statistically significant.20 Finally, the positive and statistically significant estimates on the standard deviations of the heterogeneity terms strongly suggest the presence of overdispersion in the data, which justifies the use of the Poisson lognormal model.21

19 The estimates in columns (1)–(3) of Table B.1 correspond to the model of hospital admissions and insurance (cf. Section 3.1), and those presented in columns (4)–(5) refer to the model of the choice of private care for day hospital admission (cf. Section 3.2). The estimates in Table B.2 refer to the model of private care and length of stay for overnight admission (cf. Section 3.3). To interpret the coefficient estimates on binary variables for the count data measures, $e^{\beta_j} \approx 1 + \beta_j$ when $\beta_j$ is small, so the coefficient approximates the proportional increase in $E(m \mid X)$ as $x_j$ changes from 0 to 1. For a continuous explanatory variable $x_k$, the coefficient $\beta_k$ is a semi-elasticity. Hence an increase in $x_k$ by 0.01 changes the expected length of stay $E(m \mid X)$ by $\beta_k$ percent.

20 The sensitivity of the regression estimates was assessed in relation to the instrument set. The results obtained are largely similar in sign, magnitude, and statistical significance compared with the benchmark specification. Convergence problems were encountered when the headcount variable was excluded from the variable set, plausibly due to weak model identification in the absence of this instrument. These results are available on request from the author.

21 For the Poisson lognormal model, the conditional variance $V[m_i \mid X_i]$ is given by $E[m_i \mid X_i, \varepsilon_i]\{1 + \tau E[m_i \mid X_i, \varepsilon_i]\}$ where $\tau = [\exp(\sigma^2) - 1]$ (see Eqs. (2.2-23) and (2.2-26) in Greene (2005)). Overdispersion is present in the data if $V[m_i \mid X_i] > E[m_i \mid X_i, \varepsilon_i]$, which occurs if $\tau > 0$.

6. Policy simulations

The estimates from the econometric models are used in a simulation analysis to investigate the impact of reducing subsidies for PHI on the use of public and private hospital care. A representative sample is first constructed by taking 10,000 random draws from the analysis sample, using information on the age–sex distribution of the Australian population in 2005.

To allow for uncertainty in the projections, distributions of the simulated outcomes (e.g. proportion of individuals with PHI, number of day admissions) are obtained by repeatedly drawing from the estimated sampling distribution of the parameters (Creedy and Kalb, 2006). Random draws are taken from a multivariate normal distribution with the means and covariances given by the point estimates and the estimated variance-covariance matrix. For each draw, model calibration is performed for binary outcomes by taking random draws of error terms via Cholesky transformation using the vector of correlation parameters. A total of 200 sets of draws are taken, and the simulated mean outcomes from each set of draws are used in the construction of confidence intervals. For the count data measures, the conditional mean $E[m_i \mid X_i]$ is calculated as $\hat{\lambda}_i \exp(\hat{\sigma}^2/2)$.

For the simulation, private hospital insurance rebates are reduced in steps of 5 percent, from a minimum of 5 percent to a maximum of 25 percent. The outcomes of interest are (i) the change in the proportion of individuals with PHI; (ii) changes in the proportions of individuals obtaining day and overnight hospital care as private patients; and (iii) changes in the frequency of day and overnight hospitalisations, and the number of hospital nights. In the simulations, the predicted PHI choice for every observation in the sample is first determined. Thereafter, these predicted insurance choices are used to compute the predicted admission type choices and the expected intensity of day and overnight hospital use in a fashion that is consistent with the recursive structure of the econometric models.

The expected impact on public and private hospital use, based on the estimates obtained from the simultaneous equation models, is presented in Table 7. In the sample of 10,000 observations, the numbers of individuals hospitalised for day and overnight care are 1144 and 1276 respectively. To illustrate the results, consider the scenario where rebates are reduced by 10 percent (see column (2) in Table 7). As shown in Panel A, the reduction in the premium subsidy is expected to reduce the proportion of individuals with PHI from a baseline of 52.37 percent to 51.61 percent. This translates to a decrease of 0.76 percentage points.

The point estimates of the percentage change in the number of day and overnight admissions are −0.27 percent and −0.06 percent respectively. However, these estimates are not statistically significantly different from zero given that the 95 percent confidence intervals contain the null value of zero. Hence, the decrease in the proportion of privately insured individuals is not expected to have any effect on the frequency of hospital admissions. This result is not surprising given that the estimates of the marginal effect of insurance on the number of day and overnight hospital admissions (cf. Table 6) are not statistically significant.

The decrease in the number of individuals with private insurance is expected to reduce the proportion of individuals seeking hospital care as private patients. As shown in Panel B of column (2), the fraction of individuals choosing to be private patients for day hospital admissions is predicted to decrease from a baseline of 59.81 percent to 59.22 percent, or 0.59 percentage points. This is also the case for overnight hospital admissions (Panel C), for which the fraction is predicted to drop by 0.45 percentage points, from 46.94 percent to 46.49 percent. As with the number of hospital admissions, the change in the proportion of individuals with PHI has no predicted effect on the number of hospital nights. This is because the mean predicted changes in hospital nights are small in magnitude, and are not statistically significantly different from zero.

The implied arc price elasticity for private health insurance can be calculated using the estimates in Table 7. This is calculated as the percentage change in the proportion of individuals with private insurance divided by the percentage change in the price of insurance. For the latter, each reduction of premium rebates by 5 percent corresponds approximately to a 2.14 percent increase in the insurance price. Consider the scenario where premium rebates are reduced by 10 percent (column (2) of Panel A), which corresponds to a 4.29 percent increase in the insurance price. The price elasticity is calculated as −1.45/4.29 = −0.34. Overall, the implied price elasticities across the different simulated scenarios fall in the range of −0.32 to −0.35.

The simulation results based on the single equation econometric models (Table B.3) are broadly similar to the results from the simultaneous equation models. The notable difference is that the estimate of the change in the proportion seeking private overnight hospital care from the single equation model is not statistically significantly different from zero (Panel C in Table B.3). This is probably because the corresponding coefficient estimate from this model is less precisely estimated.
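The parameter-draw procedure described in Section 6 can be illustrated with a minimal sketch. Everything numerical here is hypothetical: the three-element parameter vector (intercept, insurance coefficient, heterogeneity standard deviation), its covariance matrix, and the four-row sample are made up for illustration and are not the paper's estimates. The sketch covers only the count outcome, using the conditional mean $\hat{\lambda}_i \exp(\hat{\sigma}^2/2)$, and omits the calibration step for binary outcomes.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def draw_parameters(mean, chol, rng):
    """One draw from a multivariate normal with the given mean and Cholesky factor."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]


def mean_count(params, rows):
    """Sample average of lambda_i * exp(sigma^2 / 2), with lambda_i = exp(x_i'beta)."""
    *beta, sigma = params
    total = 0.0
    for row in rows:
        lam = math.exp(sum(b * x for b, x in zip(beta, row)))
        total += lam * math.exp(sigma ** 2 / 2.0)
    return total / len(rows)


def percentile(sorted_values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def simulate_interval(mean, cov, rows, n_draws=200, seed=0):
    """2.5 and 97.5 percentiles of the simulated mean outcome over parameter draws."""
    rng = random.Random(seed)
    chol = cholesky(cov)
    sims = sorted(mean_count(draw_parameters(mean, chol, rng), rows)
                  for _ in range(n_draws))
    return percentile(sims, 2.5), percentile(sims, 97.5)


# Hypothetical estimates (NOT from the paper): intercept, insurance coefficient, sigma.
theta_hat = [-1.5, 0.25, 0.8]
cov_hat = [[0.010, -0.004, 0.000],
           [-0.004, 0.020, 0.000],
           [0.000, 0.000, 0.005]]
sample = [[1.0, 0.0], [1.0, 1.0], [1.0, 1.0], [1.0, 0.0]]  # constant, insured dummy

point = mean_count(theta_hat, sample)
lower, upper = simulate_interval(theta_hat, cov_hat, sample)
print(f"point = {point:.3f}, 95% interval = [{lower:.3f}, {upper:.3f}]")
```

The default of 200 draws and the 2.5/97.5 percentile bounds follow the description in the text; a policy scenario would be simulated by recomputing the outcome under the changed insurance price for each draw and taking percentiles of the resulting differences.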
Table 7
Simulating reductions in premium rebates on hospital use (simultaneous equation).
N = 10,000

| % change in premium rebate | Baseline | (1) −5% | (2) −10% | (3) −15% | (4) −20% | (5) −25% |
|---|---|---|---|---|---|---|
| Panel A: Number of day and overnight admissions | | | | | | |
| % with private hospital insurance | 52.37 | 51.99 | 51.61 | 51.26 | 50.90 | 50.56 |
| % change | | −0.73% [−1.03, −0.52] | −1.45% [−1.85, −1.09] | −2.13% [−2.64, −1.70] | −2.80% [−3.42, −2.28] | −3.46% [−4.18, −2.81] |
| % change in day admissions | 0.18 | −0.13% [−8.75, 11.87] | −0.27% [−8.82, 12.17] | −0.39% [−8.88, 12.48] | −0.52% [−8.94, 12.84] | −0.64% [−8.99, 13.07] |
| % change in overnight admissions | 0.21 | −0.03% [−7.54, 9.01] | −0.06% [−7.64, 8.96] | −0.09% [−7.67, 9.07] | −0.12% [−7.71, 9.16] | −0.15% [−7.74, 9.11] |
| Panel B: Public–private choice: day admission; N(Day admission) = 1144 | | | | | | |
| % change | | −0.73% [−1.00, −0.52] | −1.44% [−1.85, −1.13] | −2.13% [−2.60, −1.72] | −2.79% [−3.42, −2.31] | −3.43% [−4.11, −2.85] |
| % private patient: day admission | 59.81 | 59.50 | 59.22 | 58.90 | 58.60 | 58.32 |
| % change | | −0.52% [−1.21, −0.04] | −0.97% [−1.79, −0.18] | −1.53% [−2.67, −0.62] | −2.02% [−3.25, −0.91] | −2.49% [−3.83, −1.35] |
| Panel C: Public–private choice & length of stay: overnight admission; N(Overnight admission) = 1276 | | | | | | |
| % change | | −0.75% [−1.03, −0.52] | −1.46% [−1.87, −1.09] | −2.16% [−2.75, −1.72] | −2.84% [−3.55, −2.26] | −3.50% [−4.32, −2.85] |
| % private patient: overnight admission | 46.94 | 46.70 | 46.49 | 46.26 | 46.04 | 45.85 |
| % change | | −0.50% [−1.17, −0.002] | −0.97% [−2.17, −0.17] | −1.46% [−2.67, −0.17] | −1.92% [−3.67, −0.34] | −2.33% [−4.34, −0.34] |
| % change in hospital nights: public patient | 5.97 | 0.01% [−20.80, 26.88] | 0.04% [−20.79, 27.17] | 0.06% [−20.79, 27.35] | 0.09% [−20.81, 27.45] | 0.10% [−20.98, 27.31] |
| % change in hospital nights: private patient | 4.09 | −0.06% [−17.09, 18.62] | −0.12% [−17.06, 18.72] | −0.20% [−17.21, 18.08] | −0.26% [−17.36, 18.02] | −0.32% [−17.42, 17.34] |

Note: 95% confidence intervals in brackets, based on the 2.5 and 97.5 percentiles of the distribution.
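The arc elasticity calculation in Section 6 can be reproduced from the point estimates in Panel A of Table 7. The sketch below rests on one assumption that is not stated alongside the elasticity calculation itself: that the baseline rebate is 30 percent of the premium (the figure mentioned in the concluding discussion), so that consumers pay 70 percent and a 5 percent relative cut in the rebate raises the net price by 0.015/0.70, or about 2.14 percent. The function and variable names are illustrative only.

```python
BASE_REBATE = 0.30  # assumption: baseline rebate of 30 percent of the premium


def price_increase_pct(rebate_cut_pct, base_rebate=BASE_REBATE):
    """Percentage increase in the net insurance price after a relative rebate cut."""
    new_rebate = base_rebate * (1.0 - rebate_cut_pct / 100.0)
    return 100.0 * ((1.0 - new_rebate) / (1.0 - base_rebate) - 1.0)


# Point estimates of the % change in the proportion with PHI (Table 7, Panel A).
phi_change_pct = {5: -0.73, 10: -1.45, 15: -2.13, 20: -2.80, 25: -3.46}

elasticities = {
    cut: change / price_increase_pct(cut)
    for cut, change in phi_change_pct.items()
}

for cut, elasticity in elasticities.items():
    print(f"rebate cut {cut:>2}%: price +{price_increase_pct(cut):.2f}%, "
          f"elasticity {elasticity:.2f}")
```

Because the inputs are rounded table entries, the computed values land close to, but need not exactly match, the reported range of −0.32 to −0.35.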
6.1. Predicted impact on public expenditure for hospital care

The estimates from the simulation analysis are combined with published statistics on hospital activity and expenditure to assess the effects of reducing premium subsidies on public expenditure for hospital care. The predicted numbers of day and overnight public hospital admissions are first calculated by applying the simulation estimates in Table 7 to public hospital discharges in 2004–05 (Australian Institute of Health and Welfare, 2006). These are then multiplied by the expenditure per unit discharge from public hospitals to calculate the changes to the total expenditure on public hospitals. Published data on the total recurrent expenditure on public hospitals ($21,758 million Australian dollars in 2004–05) are used to calculate the expenditure per unit discharge. Both the data on activity and expenditure are adjusted using the public patient day proportion, and where applicable, the admitted patient cost proportion.22 For the activity data, the average public cost weights for day and overnight discharges are applied to adjust for differences in the resource use across the two types of hospital care.

22 The admitted patient cost proportion and the public patient day proportion are 0.70 and 0.84 respectively. See Table 4.1 in Australian Institute of Health and Welfare (2006).

Table 8 summarises the expected impact on public sector expenditure. These projections are based on the simulation estimates from the simultaneous equation models (cf. Table 7). As an illustration, consider the effects of reducing premium rebates by 10 percent (column (2) in Table 8). A 10 percent reduction in rebates is expected to increase public hospital expenditure from $12,794 million to $12,902 million, corresponding to an increase of $108.76 million or 0.85 percent. The estimated increase in expenditure is, however, not statistically significant given that its confidence interval contains the null of zero. Concurrently, the reduction of premium rebates by 10 percent is expected to decrease government expenditure on the subsidy by $298.5 million given that total public spending on these subsidies was $2985 million in 2004–05. Based on the point estimates, the reduction in premium rebates is expected to generate net savings of $189.74 million to the public budget, as the reduced spending on premium rebates is larger than the increase in spending on public hospitals.

The cost projections presented in Table 8 are imprecisely estimated given that a number of the simulated outcomes have large confidence intervals. These include changes in the frequency of day and overnight hospital admissions and changes in hospital nights (cf. Table 7). An alternative modelling strategy is to incorporate only statistically significant outcomes into the cost projections. This approach effectively assumes that changing the proportion of privately insured individuals does not have any effect on the number of hospital admissions and hospital nights. These projections are shown in Table 9. For a 10 percent reduction in premium rebates, the expected increase in public expenditure is $124.82 million, and the corresponding projected net savings is $173.68 million. Not surprisingly, the confidence intervals are narrower and indicate that the cost increase is statistically significant.

7. Robustness checks

To assess the robustness of the results, a set of sensitivity analyses is conducted. The analyses consider the simulation scenario where rebates are reduced by 10 percent. These results are shown
Table 8
Predicted impact on public hospital use and expenditure.

| Percentage change in premium rebates | Baseline | (1) −5% | (2) −10% | (3) −15% | (4) −20% | (5) −25% |
|---|---|---|---|---|---|---|
| (A) Expenditure on public hospitals ($ mil.) | 12,794^a | 12,851 [11,957, 13,717] | 12,902 [12,002, 13,763] | 12,959 [12,064, 13,858] | 13,012 [12,141, 13,928] | 13,058 [12,192, 13,965] |
| (A1) Change in expenditure ($ mil.) | | 56.84 [−836.62, 923.01] | 108.76 [−791.31, 969.55] | 165.55 [−729.47, 1,064.66] | 218.15 [−652.61, 1,134.71] | 264.83 [−601.54, 1,171.56] |
| (A2) Percentage change | | 0.44% [−6.54, 7.21] | 0.85% [−6.19, 7.58] | 1.29% [−5.70, 8.32] | 1.71% [−5.10, 8.87] | 2.07% [−4.70, 9.16] |
| (B) Change in public expenditure on rebates ($ mil.) | | 149.25 | 298.50 | 447.75 | 597.00 | 746.25 |
| (C) Net effect on public expenditure ($ mil.) [(C) = (A1) − (B)] | | −92.41 | −189.74 | −282.20 | −378.85 | −481.42 |

Note: 95% confidence intervals in brackets.
^a The baseline estimate of the total recurrent expenditure on public inpatient care is calculated by multiplying the total recurrent expenditure on public hospitals by the admitted patient cost and public patient day proportions ($21,758 mil × 0.7 × 0.84 = $12,794 mil).
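The arithmetic behind rows (A) to (C) of Table 8 can be checked with a short sketch. All figures are taken from the text and the table; the variable and function names are illustrative, and the changes in hospital expenditure in row (A1) are treated as given inputs rather than recomputed from discharge data.

```python
TOTAL_RECURRENT_EXPENDITURE = 21758.0  # $ mil., public hospitals, 2004-05
ADMITTED_PATIENT_COST_SHARE = 0.70
PUBLIC_PATIENT_DAY_SHARE = 0.84
REBATE_SPENDING = 2985.0               # $ mil., public spending on rebates, 2004-05

# Baseline expenditure on public inpatient care (footnote a of Table 8).
baseline = (TOTAL_RECURRENT_EXPENDITURE
            * ADMITTED_PATIENT_COST_SHARE
            * PUBLIC_PATIENT_DAY_SHARE)

# Row (A1): projected change in public hospital expenditure, by rebate cut (%).
hospital_change = {5: 56.84, 10: 108.76, 15: 165.55, 20: 218.15, 25: 264.83}


def net_effect(rebate_cut_pct, change_in_hospital_spending):
    """Row (C) = (A1) - (B); negative values are net savings to the public budget."""
    rebate_saving = REBATE_SPENDING * rebate_cut_pct / 100.0  # row (B)
    return change_in_hospital_spending - rebate_saving


print(f"baseline = {baseline:,.0f} $ mil.")
for cut, change in hospital_change.items():
    print(f"rebate cut {cut:>2}%: net effect = {net_effect(cut, change):.2f} $ mil.")
```

For the 10 percent scenario this gives 108.76 − 298.50 = −189.74, matching the net savings of $189.74 million quoted in the text.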
Table 9
Predicted impact on public hospital utilisation and expenditure (excluding statistically insignificant simulated outcomes).

| Percentage change in premium rebates | Baseline | (1) −5% | (2) −10% | (3) −15% | (4) −20% | (5) −25% |
|---|---|---|---|---|---|---|
| (A) Expenditure on public hospitals ($ mil.) | 12,794^a | 12,859 [12,801, 12,924] | 12,919 [12,828, 13,033] | 12,983 [12,857, 13,118] | 13,043 [12,886, 13,194] | 13,098 [12,907, 13,265] |
| (A1) Change in expenditure ($ mil.) | | 64.90 [7.47, 130.77] | 124.82 [33.83, 238.97] | 189.49 [63.01, 324.16] | 249.83 [92.47, 400.64] | 304.23 [113.00, 471.16] |
| (A2) Percentage change | | 0.51% [0.06, 1.02] | 0.98% [0.26, 1.87] | 1.48% [0.49, 2.53] | 1.95% [0.72, 3.13] | 2.38% [0.88, 3.68] |
| (B) Change in public expenditure on rebates ($ mil.) | | 149.25 | 298.50 | 447.75 | 597.00 | 746.25 |
| (C) Net effect on public expenditure ($ mil.) [(C) = (A1) − (B)] | | −84.35 | −173.68 | −258.26 | −347.17 | −442.02 |

Note: 95% confidence intervals in brackets.
^a The baseline estimate of the total recurrent expenditure on public inpatient care is calculated by multiplying the total recurrent expenditure on public hospitals by the admitted patient cost and public patient day proportions ($21,758 mil × 0.7 × 0.84 = $12,794 mil).
in Table 10. As with the cost projections presented in Table 9, the sensitivity analyses use only simulated outcomes that are statistically significant in the cost projections. In the discussion herein, the reference case refers to the results from the simultaneous equation model (column (1)).

7.1. Endogeneity

The first two sets of robustness checks assess the sensitivity of the results when different estimation strategies are used to address the potential problem that the insurance and patient type binary variables are endogenous. Column (2) of Table 10 shows the cost estimates and the implied price elasticity when each equation in the econometric model is estimated separately rather than simultaneously. This estimation approach assumes that there is no endogeneity problem. The point estimate of the increase in public expenditure is $38.61 million, considerably smaller than in the reference case. This is because the number of overnight hospital admissions is expected to remain unchanged, hence the cost increase is driven solely by changes in day admissions.

The second robustness check involves modelling endogeneity using two-stage residual inclusion estimation (column (3)). The projected increase in public cost is $36.24 million, and is also considerably smaller than the reference case. As with the projections from the single equation model, the expected increase in cost results only from changes in day admissions. The two alternative estimation strategies produced price elasticity estimates that are very similar to those from the simultaneous equation model.

7.2. Predicted premiums

As discussed in Section 4.2, the price of insurance is formulated using the observed premium expenditure data for individuals who have private health insurance, and predicted premiums for those
Table 10
Robustness check – expected cost and price elasticity from a 10 percent reduction in premium rebates ($ mil.).

| Specifications | (1) Simultaneous equation | (2) Single equation | (3) 2SRI | (4) Predicted premiums | (5) LHC loading |
|---|---|---|---|---|---|
| Change in expenditure (%) | 124.82 (0.98) [33.83, 238.97] | 38.61 (0.30) [11.14, 71.81] | 36.24 (0.28) [5.52, 76.63] | 83.64 (0.65) [11.50, 175.81] | 126.29 (0.99) [54.06, 231.34] |
| Implied insurance elasticity (–) | 0.32–0.35 | 0.31–0.34 | 0.32–0.35 | 0.36–0.41 | 0.33–0.36 |

Note: The above set of cost projections uses only simulated outcomes that are statistically significant. The specifications considered are as follows: (1) Simulated effects from simultaneous equation model (cf. Table 9). (2) Single equation (cf. Table B.3). (3) Using two-stage residual inclusion (2SRI) approach to account for endogeneity in insurance and admission type binary variables. (4) Using predicted premiums to construct insurance price. (5) Premium loading under Lifetime Health Cover in accordance with Ellis and Savage (2008).
who are not privately insured. As a robustness check, predicted potential endogeneity problems, the conclusion that reducing sub-
premiums are used for all observations in the sample to com- sidies generate cost savings remains unchanged.
pute the insurance price. Using predicted premiums (column (4)), A question that naturally arises is whether the simulation results
public expenditure is projected to increase by $83.64 million. The are driven primarily by the inelastic demand for private insurance,
expected cost increase is lower because this model predicts that a or whether it is due to the differential rates of hospital care use
smaller shift from private to public care for both day and overnight by the insured and uninsured groups. The simulation projections
admissions. indicate that the frequency of hospital admissions and the number
The implied price elasticity lie within the range −0.36 to −0.41 of hospital nights remain unchanged when the proportion of pri-
and is similar to that of the reference case. In Section 4.2, it was vately insured individuals decreases. This suggests that differences
suggested that the price elasticity estimates may be biased if the in utilisation across the insured and uninsured groups are not the
price of insurance variable, formulated using observed premiums, reason why cost savings occur. It is also not the case that individuals
omits plan characteristics that are unobserved. It is worth empha- continue to use private hospital services despite dropping private
sising that since the implied price elasticities from using either health insurance cover. The inelastic demand is the main reason
observed or predicted premiums are similar, there is no indication for the cost savings, as only a small fraction of individuals are likely
of a significant bias. to respond to the higher prices for insurance by dropping private
health coverage. This conclusion is consistent with previous stud-
ies by Emmerson et al. (2001) and Frech and Hopkins (2004), who
7.3. Lifetime health cover
both show that the price elasticity needs to be substantially higher
for private health insurance subsidies to be self-financing.
The last set of sensitivity checks uses the approach adopted by Ellis and Savage (2008) to analyse the impact of the Lifetime Health Cover policy. As shown in column (5), this approach produces estimates of the price elasticities and projections of cost increases that are very similar to the reference case.

8. Discussion and conclusions

Policy incentives such as premium or tax rebates for duplicate private health insurance are often justified based on the argument that private financing can relieve cost and capacity pressures on the public system. This paper investigates the question of whether cost savings that are achieved from reducing subsidies for private health insurance outweigh the potential increase in public expenditure on hospital care in the context of the Australian health care system. The study develops a microeconometric framework to analyse the effect of policy incentives on the decision to buy private health insurance, and the effect of private insurance on the decisions of whether to obtain hospital care from the public or private sector, and how much care to consume. The econometric estimates are used in a simulation analysis to assess whether reducing premium rebates on private health insurance in Australia is likely to generate net cost savings to the public budget.
The econometric results indicate that individuals with private health insurance are more likely to seek hospital care as a private patient compared to those without private insurance. However, the intensity of hospital admissions and length of overnight stay do not differ between the insured and uninsured groups. The estimates of the elasticity of demand for private insurance are in the range of −0.32 to −0.35. The simulation projections show that reducing premium subsidies is expected to generate net cost savings to the public budget. This result is obtained in all 5 different simulation scenarios that are considered, and is robust to different methods used to account for uncertainty in the projections. To illustrate, a 10 percent reduction in rebates is projected to lead to an increase in public expenditure that is either not statistically different from zero, or substantially smaller than the cost savings achieved through lower spending on premium subsidies.
The estimates of the price elasticity and the simulation results are mostly robust to different approaches used to formulate the insurance price. For example, using predicted premiums in place of observed premiums, and accounting for entry-age rating through the Lifetime Health Cover policy, produce elasticity estimates that are very similar. Although the cost projections from the simulation analysis vary slightly by the estimation strategies used to address
The estimates of the demand elasticity for private insurance obtained in this study are broadly comparable with those from previous Australian studies. Butler (1999), for instance, uses a similar definition of the insurance price and obtains a price elasticity estimate of −0.44 for hospital cover. That study is based on data that predate the policy reform when financial incentives were introduced. It is conceivable that consumers were more responsive to price changes back then, in the absence of the tax levy that penalises high income individuals without private insurance. Frech III et al. (2003) use time series data to model the spike in private health insurance coverage in the Australian population following the introduction of the 30 percent rebate and obtain a price elasticity estimate of −0.37, which is very similar to those in this study. In a recent study, Ellis and Savage (2008) use survey data from the post-reform period and obtain elasticity estimates with respect to premiums of −0.6 for singles and −0.4 for families. In this paper, using the method by Ellis and Savage (2008) to incorporate entry-age rating produces elasticity estimates in the range of −0.33 to −0.36, which are quite similar to their estimate for families. It is interesting to note that the estimates of the price elasticities for duplicate and supplementary PHI are broadly comparable internationally (e.g. −0.5 in the UK and Spain (King and Mossialos, 2005; López Nicolás and Vera-Hernández, 2008), −0.4 to −0.49 in Canada (Stabile, 2001; Finkelstein, 2002)). Obviously, differences in definitions and empirical methods may limit comparability. Notwithstanding, the slightly smaller elasticity estimates obtained in this paper are likely due to the tax levy and the entry-age rating policy in Australia. It follows that individuals, particularly those with high incomes, have considerable financial incentives to take up or maintain private insurance cover even in the face of premium increases.
It is important to highlight that the results in this paper may not be generalisable, and do not necessarily imply that reducing subsidies on private health insurance would generate cost savings in other countries. This is because the Australian context is unique in that a series of three different policy incentives exists to encourage individuals to buy private health insurance. Reducing the premium subsidy may have little effect given that individuals would still have considerable financial incentives to buy private health insurance because of the tax levy and entry-age premium adjustments.
With further cost pressures on public health care budgets, interest in a greater role of private financing is only likely to increase. Two follow-on questions that are open for further research arise from this study. First, what are the alternative measures that are available to policy makers seeking to expand the role of private financing? Second, how do these measures compare in terms of
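their costs and effectiveness in achieving the stated health policy goals? These are some important questions that would inform the debate surrounding the appropriate roles and relative sizes of the public and private sectors for health care.

Appendix A. Likelihood functions

Demand for day and overnight hospital admissions (cf. Section 3.1)

$$
\ell_{i1}(\theta_1) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(m_i \mid X_{i1}, d_i, \varepsilon_{i1}) \cdot f(m_i^{ov} \mid X_{i2}, d_i, \rho_{\varepsilon_1\varepsilon_2}, \varepsilon_{i1}, \eta_i) \cdot \left[ d_i\,\Phi\!\left(\frac{X_{i3}\beta_3 + \omega_1}{\omega_2}\right) + (1 - d_i)\left(1 - \Phi\!\left(\frac{X_{i3}\beta_3 + \omega_1}{\omega_2}\right)\right) \right] \times \phi(\eta_i)\,\phi(\varepsilon_{i1})\,d\eta_i\,d\varepsilon_{i1} \tag{A.1}
$$

where

$$
\omega_1 = \rho_{\varepsilon_1\varepsilon_3}\,\varepsilon_{i1} + \frac{\rho_{\varepsilon_2\varepsilon_3} - \rho_{\varepsilon_1\varepsilon_2}\rho_{\varepsilon_1\varepsilon_3}}{\left(1 - \rho_{\varepsilon_1\varepsilon_2}^2\right)^{1/2}}\,\eta_i, \qquad
\omega_2^2 = \left(1 - \rho_{\varepsilon_1\varepsilon_3}^2\right) - \frac{\left(\rho_{\varepsilon_2\varepsilon_3} - \rho_{\varepsilon_1\varepsilon_2}\rho_{\varepsilon_1\varepsilon_3}\right)^2}{1 - \rho_{\varepsilon_1\varepsilon_2}^2}
$$

and $\theta_1$ is the set of parameters to be estimated.

Admission type choice for day hospital admission (cf. Section 3.2)

$$
\ell_{i2}(\theta_2) = H^{d}_i \cdot \Phi_2\left[\,y_{i1}(Z_{i1}\alpha_1 + \delta d_i),\; y_{i2}(Z_{i2}\alpha_2),\; y_{i1}y_{i2}\rho_v\,\right] + (1 - H^{d}_i) \cdot \Phi\left[\,y_{i2}(Z_{i2}\alpha_2)\,\right] \tag{A.2}
$$

where $H^{d}_i = 1$ if the ith individual has been hospitalised for a day admission and 0 otherwise; $\theta_2$ is the set of parameters to be estimated; $\Phi$ and $\Phi_2$ denote the univariate and bivariate normal cumulative distribution functions respectively.

Admission type choice and length of hospital stay for overnight admission (cf. Section 3.3)

$$
\ell_{i3}(\theta_3) = H^{o}_i \cdot \int_{-\infty}^{+\infty} f(st_i \mid W_{i1}, q_i, d_i, \nu_{i1}) \cdot \Phi_2\left[\,y_{1i}\omega_3,\; y_{2i}\omega_4,\; \rho^{*}\,\right] \times \phi(\nu_{i1})\,d\nu_{i1} + (1 - H^{o}_i) \cdot \Phi\left[\,y_{2i}(W_{i3}\gamma_3)\,\right] \tag{A.3}
$$

with

$$
\omega_3 = \frac{W_{i2}\gamma_2 + \lambda_3 d_i + \rho_{\nu_1\nu_2}\nu_{i1}}{\left(1 - \rho_{\nu_1\nu_2}^2\right)^{1/2}}, \qquad
\omega_4 = \frac{W_{i3}\gamma_3 + \rho_{\nu_1\nu_3}\nu_{i1}}{\left(1 - \rho_{\nu_1\nu_3}^2\right)^{1/2}}, \qquad
\rho^{*} = y_{1i} \cdot y_{2i} \cdot \frac{\rho_{\nu_2\nu_3} - \rho_{\nu_1\nu_2}\rho_{\nu_1\nu_3}}{\sqrt{\left(1 - \rho_{\nu_1\nu_2}^2\right)\left(1 - \rho_{\nu_1\nu_3}^2\right)}}
$$

where $H^{o}_i = 1$ if the ith individual has been hospitalised overnight and 0 otherwise; $\theta_3$ is the set of all unknown parameters to be estimated; and $y_{1i} = 2q_i^{ov} - 1$ and $y_{2i} = 2d_i - 1$.

Appendix B. Coefficient estimates & sensitivity analysis

Tables B.1–B.3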
Table B.1
Model estimates: demand for hospital admissions and choice of private care for day admissions.
Hospital admissions Private day admission
(1) (2) (3) (4) (5)

Variable Day admission Overnight admission Private health insurance Private patient: day Private health insurance
Insurance Price −0.476*** −0.478***

(0.035) (0.034)
Insurance 0.276 0.090 2.474***
(0.486) (0.242) (0.703)
Age 0.0035 −0.111*** −0.023
(0.018) (0.015) (0.026)
(Age)2 4.92e−06 0.0010*** 0.00025
(0.00016) (0.00014) (0.00024)
Female 0.264*** 0.206** −0.236*
(0.084) (0.081) (0.134)
Couple 0.102 0.074 −0.082
(0.100) (0.097) (0.153)
Depchild −0.199** 0.074 −0.182
(0.099) (0.097) (0.161)
Country of birth:
Main English 0.104 0.139 −0.300*** −0.178 −0.305***
(0.137) (0.113) (0.060) (0.198) (0.060)
Other −0.162 −0.079 −0.240*** −0.184 −0.241***
(0.135) (0.124) (0.064) (0.198) (0.064)
Education:
Certificate 0.097 0.040 0.051 0.012 0.057
(0.096) (0.101) (0.043) (0.180) (0.042)
Dipl/Adv Dipl. 0.149 −0.062 0.250*** 0.553*** 0.253***
(0.139) (0.137) (0.060) (0.225) (0.060)
Bach. and postgrad. 0.131 0.102 0.300*** −0.048 0.301***
(0.124) (0.116) (0.056) (0.192) (0.056)
HH Income 0.0048 0.0014 0.014*** 0.0052 0.013***
(0.0039) (0.0032) (0.0016) (0.0065) (0.0015)

(HH Income)2 −1.33e−05 1.71e−05 −4.46e−05*** −1.54e−05 −4.38e−05***

(1.47e−05) (1.22e−05) (8.32e−06) (2.80e−05) (7.87e−06)
Occupation:
Manager/Admin −0.366* −0.664*** 0.598*** −0.027 0.595***
(0.204) (0.202) (0.090) (0.335) (0.090)
Professional −0.535*** −0.568*** 0.319*** 0.281 0.318***
(0.130) (0.126) (0.057) (0.206) (0.057)
Clerical/Service −0.299** −0.662*** 0.207*** −0.025 0.207***
(0.130) (0.138) (0.059) (0.203) (0.059)
Trades/Transport −0.554*** −0.762*** −0.130** −0.104 −0.134**
(0.151) (0.162) (0.058) (0.244) (0.058)
Self assessed health:

Good 0.212** 0.285*** 0.044 −0.148 0.045
(0.089) (0.091) (0.040) (0.141) (0.040)
Fair/Poor 0.642*** 0.793*** −0.151*** −0.229 −0.150***
(0.117) (0.112) (0.055) (0.171) (0.055)
Health conditions:
Work Limiting 0.292*** 0.379*** 0.0142 0.129 0.014
(0.054) (0.053) (0.028) (0.093) (0.028)
Self Care 0.436** 0.383** −0.093 −0.156 −0.099
(0.181) (0.150) (0.098) (0.247) (0.082)
Mobility Activities −0.011 0.257** −0.091 0.043 −0.082
(0.149) (0.127) (0.071) (0.214) (0.072)
Communication Diff. −0.124 0.077 −0.211 −1.332** −0.222
(0.414) (0.295) (0.198) (0.592) (0.198)
Alcohol daily 0.169 −0.023 0.149** 0.487** 0.154***
(0.128) (0.134) (0.062) (0.210) (0.062)
Regular smoker −0.155 −0.0045 −0.482*** 0.089 −0.485***
(0.134) (0.111) (0.050) (0.206) (0.050)
State:
VIC 0.386*** 0.153 0.123** −0.063 0.125**
(0.105) (0.103) (0.058) (0.2+2) (0.058)
QLD 0.217* 0.150 −0.075 0.337* −0.069
(0.112) (0.110) (0.063) (0.179) (0.064)
SA 0.321** 0.303** 0.252*** 0.209 0.264***
(0.143) (0.140) (0.080) (0.241) (0.081)
WA 0.097 0.186 0.231*** 0.279 0.233***
(0.147) (0.134) (0.076) (0.262) (0.076)
TAS/NT 0.451** −0.413* −0.049 0.578** −0.040
(0.188) (0.213) (0.117) (0.267) (0.117)
ACT 0.397 0.420 0.114 −0.422 0.111
(0.285) (0.291) (0.189) (0.391) (0.189)
Remoteness:
Inner regional 0.0080 0.305*** −0.038 −0.368** −0.041
(0.093) (0.090) (0.057) (0.160) (0.057)
Other 0.127 0.242* −0.038 −0.327 −0.044
(0.117) (0.115) (0.096) (0.247) (0.096)
Headcount 0.138*** 0.139***
(0.038) (0.038)
(Headcount)2 −0.016*** −0.016***
(0.0038) (0.0038)
Distance −0.157* −0.090 −0.153*
(0.085) (0.230) (0.086)
(Distance)2 0.015 0.014 0.014
(0.011) (0.027) (0.011)
Constant −3.737*** −0.409 −0.844*** −0.602 −0.847***
(0.521) (0.445) (0.082) (0.742) (0.081)
1.291*** 1.247***
(0.046) (0.049)
Correlation
ε1 ε2 0.440***
(0.073)
ε1 ε3 −0.012
(0.236)
ε2 ε3 0.059
(0.121)
v −0.153
(0.498)

Loglikelihood −9111.45 −4004.52

Number of observations 6594 6594
Robust standard errors in parentheses, clustered at household level.
* p < 0.1. ** p < 0.05. *** p < 0.01.
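As a reading aid for the significance markers in Table B.1, the following minimal Python sketch recomputes the stars from a coefficient and its standard error. It assumes a two-sided test under a standard normal approximation, which is an assumption made here for illustration and may differ from the exact reference distribution used in the estimation. The function names `two_sided_p` and `stars` are hypothetical, and the values are taken from the first reported column of the corresponding rows.

```python
import math


def two_sided_p(coef, se):
    """Two-sided p-value for H0: coefficient = 0, using a normal approximation (an assumption)."""
    z = coef / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def stars(coef, se):
    """Map the p-value to the table's markers: * p < 0.1, ** p < 0.05, *** p < 0.01."""
    p = two_sided_p(coef, se)
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""


# (coefficient, standard error) pairs copied from the first column of Table B.1
rows = {
    "Female": (0.264, 0.084),
    "Depchild": (-0.199, 0.099),
    "Manager/Admin": (-0.366, 0.204),
    "Couple": (0.102, 0.100),
}

for label, (coef, se) in rows.items():
    print(f"{label}: z = {coef / se:.2f}, p = {two_sided_p(coef, se):.4f} {stars(coef, se)}")
```

Under this approximation the recomputed markers for these four rows agree with those printed in the table.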
Table B.2
Model estimates: demand for hospital stay, choice of private patient and private insurance.
Overnight hospital stay
(1) (2) (3)

Variable Hospital nights Private patient: overnight Private health insurance
Insurance Price −0.479***

(0.034)
Insurance −0.462 1.538***
(0.339) (0.587)
Private Pat. −0.855**
(0.369)
Insurance X Private Pat. 0.942***
(0.287)
Age −0.043** −0.0099
(0.018) (0.025)
(Age)2 0.00051*** 0.00023
(0.00018) (0.00022)
Female 0.073 −0.027
(0.094) (0.118)
Couple 0.0049 −0.140
(0.110) (0.132)
Depchild 0.393*** −0.248
(0.139) (0.174)
Country of birth:
Main English 0.070 −0.280* −0.303***
(0.131) (0.167) (0.060)
Other 0.194 −0.515*** −0.240***
(0.134) (0.180) (0.064)
Education:
Certificate 0.133 −0.018 0.052
(0.107) (0.152) (0.042)
Dipl/Adv Dipl. 0.155 0.152 0.252***
(0.156) (0.215) (0.060)
Bach. and postgrad. 0.059 −0.052 0.299***
(0.131) (0.179) (0.055)
HH Income −0.0015 0.011* 0.013***
(0.0033) (0.0056) (0.0015)
(HH Income)2 1.63e−05 −1.02e−05 −4.39e−05***
(1.34e−05) (2.96e−05) (7.83e−06)
Occupation:
Manager/Admin −0.419* 0.107 0.600***
(0.1235 (0.286) (0.090)
Professional −0.278** 0.597*** 0.321***
(0.139) (0.177) (0.057)
Clerical/Service −0.202 0.077 0.209***
(0.153) (0.208) (0.059)
Trades/Transport 0.013 0.038 −0.127**
(0.190) (0.289) (0.058)
Self assessed health:

Good −0.033 0.021 0.044
(0.112) (0.145) (0.040)
Fair/Poor 0.179 −0.164 −0.155***
(0.126) (0.168) (0.055)
Health conditions:
Work Limiting 0.102 0.157* 0.016
(0.062) (0.086) (0.028)
Self Care 0.203* −0.0099 −0.096
(0.150) (0.224) (0.098)
Mobility Activities −0.086 0.069 −0.095
(0.120) (0.175) (0.072)
Communication Diff. −0.134 −0.602 −0.259

(0.202) (0.455) (0.194)
Alcohol daily 0.117 0.050 0.147**
(0.128) (0.194) (0.062)
Regular smoker −0.224* −0.466*** −0.486***
(0.120) (0.163) (0.050)
State:
VIC −0.063 −0.074 0.122**
(0.112) (0.157) (0.058)
QLD −0.176 0.042 −0.075
(0.122) (0.160) (0.063)
SA −0.148 0.192 0.252***
(0.140) (0.184) (0.080)
WA −0.085 0.198 0.230***
(0.156) (0.213) (0.076)
TAS/NT 0.120 −0.135 −0.056
(0.183) (0.295) (0.116)
ACT −1.047*** −0.221 0.113
(0.375) (0.318) (0.189)
Remoteness:
Inner regional 0.067 −0.270* −0.040
(0.093) (0.142) (0.057)
Other 0.052 −0.681*** −0.045
(0.120) (0.258) (0.095)
Headcount 0.135***
(0.037)
(Headcount)2 −0.015***
(0.0038)
Distance 0.542** −0.149*
(0.241) (0.085)
(Distance)2 −0.075** 0.015
(0.031) (0.011)
Constant 1.620*** −1.445** −0.849***
(0.520) (0.736) (0.080)
0.989***
(0.044)
1 2 0.167
(0.151)
1 3 0.328***
(0.137)
2 3 0.475
(0.347)
Loglikelihood −6114.97
Number of observations 6594
Robust standard errors in parentheses, clustered at household level.
* p < 0.1. ** p < 0.05. *** p < 0.01.
Table B.3
Simulating reductions in premium rebates on hospital use (single equation).
N = 10,000
% reduction in premium rebate
(1) (2) (3) (4) (5)
Baseline 5% 10% 15% 20% 25%
Panel A: Number of day and overnight admissions

% change −0.72% −1.41% −2.10% −2.77% −3.41%
[−0.97, −0.49] [−1.81, −1.08] [−2.52, −1.67] [−3.31, −2.27] [−4.05, −2.80]
% change in day admissions 0.18 −0.09% −0.18% −0.27% −0.36% −0.45%

[−8.97, 10.51] [−9.07, 10.39] [−9.16, 10.21] [−9.23, 10.07] [−9.33, 9.96]
% change in overnight admissions 0.22 −0.07% −0.13% −0.20% −0.26% −0.32%
[−8.10, 7.92] [−8.16, 7.82] [−8.21, 7.79] [−8.26, 7.75] [−8.32, 7.73]
Panel B: Public–private choice: day admission

N(Day admission) = 1144
% change −0.73% −1.43% −2.11% −2.78% −3.41%
[−1.01, −0.49] [−1.78, −1.06] [−2.58, −1.69] [−3.30, −2.27] [−3.96, −2.79]
% private patient: day admission 59.84 59.54 59.26 58.97 58.70 58.45
% change −0.49% −0.97% −1.45% −1.90% −2.32%
[−1.10, −0.00] [−1.79, −0.28] [−2.21, −0.56] [−2.90, −0.83] [−3.46, −1.11]
Panel C: Public–private choice & length of stay: Overnight admission

N(Overnight admission) = 1276
% change −0.72% −1.41% −2.06% −2.71% −3.33%
[−2.66, 1.10] [−3.34, 0.44] [−4.04, −0.28] [−4.64, −0.98] [−5.40, −1.60]
% private patient: overnight admission 45.98 45.74 45.51 45.27 45.06 44.85
% change −0.53% −1.02% −1.55% −2.00% −2.46%
[−5.78, 5.64] [−6.25, 5.16] [−6.57, 4.52] [−7.19, 3.89] [−7.51, 3.26]
% change in hospital nights: public patient 4.92 −0.05% −0.10% −0.15% −0.19% −0.23%
[−11.77, 13.80] [−11.89, 13.80] [−11.87, 13.56] [−11.89, 13.54] [−11.76, 13.38]
% change in hospital nights: private patient 4.81 −0.13% −0.25% −0.40% −0.53% −0.66%
[−11.79, 13.76] [−11.81, 13.58] [−11.99, 13.41] [−12.13, 13.26] [−12.32, 13.01]
Note: 95% confidence intervals in brackets, based on the 2.5 and 97.5 percentiles of the distribution.
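The percentile-based intervals described in the note to Table B.3 can be illustrated with a short Python sketch. This is not the simulation procedure used in the paper. It assumes a 30 percent rebate (the figure mentioned in the discussion), reads a "10 percent reduction" as a cut of the rebate from 30 to 27 percent of the premium, and draws the elasticity from a normal distribution with a hypothetical standard error of 0.03. All function names are illustrative.

```python
import random
import statistics


def coverage_change_pct(elasticity, rebate, cut):
    """Approximate % change in insurance demand when the rebate is cut by the fraction `cut`.

    Assumes the consumer pays (1 - rebate) of the premium and that the elasticity
    applies to the resulting % change in the net price.
    """
    old_price = 1.0 - rebate
    new_price = 1.0 - rebate * (1.0 - cut)
    pct_price_change = 100.0 * (new_price - old_price) / old_price
    return elasticity * pct_price_change


def percentile_interval(draws):
    """Return the 2.5th and 97.5th percentiles of a list of simulated values."""
    cuts = statistics.quantiles(draws, n=40, method="inclusive")
    return cuts[0], cuts[-1]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical sampling distribution for the elasticity: mean -0.33, s.e. 0.03
draws = [
    coverage_change_pct(rng.gauss(-0.33, 0.03), rebate=0.30, cut=0.10)
    for _ in range(10000)
]
low, high = percentile_interval(draws)
print(f"mean change: {statistics.mean(draws):.2f}%, 95% interval: [{low:.2f}, {high:.2f}]")
```

Using percentiles of the simulated distribution, rather than a symmetric normal interval, lets the reported interval reflect any skewness in the simulated outcomes.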
References

Atella, V., Deb, P., 2008. Are primary care physicians, public and private sector specialists substitutes or complements? Evidence from a simultaneous equations model for count data. Journal of Health Economics 27, 770–785.
Australian Bureau of Statistics, 2006. Census of Population and Housing: Census Geographic Areas Digital Boundaries 2006. Australian Bureau of Statistics, Cat no. 292.0.30.001.
Australian Institute of Health and Welfare, 2006. Australian Hospital Statistics 2004–05. AIHW Health Services Series no. 26, Canberra, AIHW cat. no. HSE 41.
Barros, P.P., Siciliani, L., 2012. Public and private sector interface. In: Pauly, M.V., McGuire, T.G., Barros, P.P. (Eds.), Handbook of Health Economics, vol. 2. Elsevier Science B.V. (chapter 15).
Bhat, C.R., 2001. Quasi-random maximum simulated likelihood estimation of the mixed multinomial logit model. Transportation Research Part B 35, 677–693.
Buchmueller, T., 2008. Community rating, entry-age rating and adverse selection in private health insurance in Australia. The Geneva Papers 33, 588–609.
Butler, J.R.G., 1999. Estimating elasticities of demand for private health insurance in Australia. In: NCEPH Working Paper Number 43, Canberra.
Butler, J.R.G., 2002. Policy change and private health insurance: did the cheapest policy do the trick? Australian Health Review 25 (6), 33–41.
Cameron, A.C., Trivedi, P.K., Milne, F., Piggott, J., 1988. A microeconometric model of the demand for health care and health insurance in Australia. Review of Economic Studies 55 (1), 85–106.
Cameron, A.C., Trivedi, P.K., 1991. The role of income and health risk in the choice of health insurance: evidence from Australia. Journal of Public Economics 45, 1–28.
Cameron, A.C., Trivedi, P.K., 2013. Regression Analysis of Count Data, 2nd ed. Cambridge University Press, New York, USA.
Cheng, T.C., 2011. Measuring the effects of removing subsidies for private insurance on public expenditure for health care. In: Melbourne Institute of Applied Economic and Social Research, Working Paper 26/11. University of Melbourne.
Cheng, T.C., Vahid, F., 2011. Demand for hospital care and private health insurance in a mixed public–private system: empirical evidence using a simultaneous equation modeling approach. In: Melbourne Institute of Applied Economic and Social Research, Working Paper 22/11, University of Melbourne.
Colombo, F., Tapay, N., 2004. The OECD Health Project. Private Health Insurance in OECD Countries. OECD.
Costa, J., García, J., 2003. Demand for private health insurance: how important is the quality gap? Health Economics 12, 587–599.
Creedy, J., Kalb, G., 2006. Labour Supply and Microsimulation: The Evaluation of Tax Policy Reforms. Edward Elgar, Cheltenham, Northampton, MA.
Deb, P., Trivedi, P.K., 2002. The structure of demand for health care: latent class versus two-part models. Journal of Health Economics 21, 601–625.
Deb, P., Trivedi, P.K., 2006. Specification and simulated likelihood estimation of a non-normal treatment-outcome model with selection: application to health care utilisation. Econometrics Journal 9, 307–331.
Doiron, D., Jones, G., Savage, E., 2008. Healthy, wealthy and uninsured? The role of self-assessed health in the demand for private health insurance. Health Economics 17, 317–334.
Ellis, R., Savage, E., 2008. Run for cover now or later? The impact of premiums, threats and deadlines on private health insurance in Australia. International Journal of Health Care Finance and Economics 8 (4), 257–277.
Emmerson, C., Frayne, C., Goodman, A., 2001. Should Private Medical Insurance be Subsidised? Health Care UK, pp. 49–65.
Fabbri, D., Monfardini, C., 2009. Rationing the public provision of healthcare in the presence of private supplements: evidence from the Italian NHS. Journal of Health Economics 28, 290–304.
Fabbri, D., Monfardini, C., 2011. Opt Out or Top Up? Voluntary Healthcare Insurance and the Public vs. Private Substitution. IZA Discussion Papers 5952, Institute for the Study of Labor (IZA).
Finkelstein, A., 2002. The effect of tax subsidies to employer-provided supplementary health insurance: evidence from Canada. Journal of Public Economics 84, 305–339.
Frech, H., Hopkins, S., 2004. Why subsidise private health insurance. Australian Economic Review 37 (3), 243–256.
Frech III, H., Hopkins, S., Macdonald, G., 2003. The Australian private health insurance boom: was it subsidies or liberalised regulation? Economic Papers 22, 58–64.
Gouriéroux, C., Monfort, A., 1996. Simulation-based Econometric Methods. Oxford University Press, New York, USA.
Gowrisankaran, G., Town, R.J., 1999. Estimating the quality of care in hospitals using instrumental variables. Journal of Health Economics 18, 747–767.
Greene, W., 2005. Functional form and heterogeneity in models for count data. In: Foundations and Trends in Econometrics.
Gruber, J., Poterba, J., 1994. Tax incentives and the decision to purchase health insurance: evidence from the self-employed. Quarterly Journal of Economics 109 (3), 701–733.
Gruber, J., Washington, E., 2005. Subsidies to employee health insurance premiums and the health insurance market. Journal of Health Economics 24, 253–276.
Hall, J., De Abreu Lourenco, R., Viney, R., 1999. Carrots and sticks – the fall and fall of private health insurance in Australia. Health Economics 8, 653–660.
Harmon, C., Nolan, B., 2001. Health insurance and health services utilisation in Ireland. Health Economics 10, 135–145.
Hellström, J., 2006. A bivariate count data model for household tourism demand. Journal of Applied Econometrics 21, 213–226.
Jones, A., Koolman, X., Van Doorslaer, E., 2006. The impact of supplementary private health insurance on the use of specialists in selected European countries. Annales d’Economie et de Statistiques 83, 251–275.
Kiil, A., 2012. What characterises the privately insured in universal health care systems? A review of the empirical evidence. Health Policy 106, 60–75.
King, D., Mossialos, E., 2005. The determinants of private medical insurance prevalence in England. Health Services Research 40 (1), 195–212.
López Nicolás, A., Vera-Hernández, M., 2008. Are tax subsidies for private medical insurance self-financing? Evidence from a microsimulation model. Journal of Health Economics 27, 1285–1298.
Marquis, M.S., Long, S.H., 2001. To offer or not to offer: the role of price in employer demand for insurance. Health Services Research 36 (5), 935–958.
Marquis, M.S., Louis, T., 2002. On using sample selection methods in estimating the price elasticity of firms’ demand for insurance. Journal of Health Economics 21, 137–145.
McClellan, M., McNeil, B.J., Newhouse, J.P., 1994. Does more intensive treatment of acute myocardial infarction in the elderly reduce mortality? Analysis using instrumental variables. Journal of the American Medical Association 272, 859–866.
McFadden, D., Train, K., 2000. Mixed MNL models for discrete response. Journal of Applied Econometrics 15, 447–470.
Munkin, M., Trivedi, P., 1999. Simulated maximum likelihood estimation of multivariate mixed-Poisson regression models, with application. Econometrics Journal 2, 29–48.
Phelps, C., 1997. Health Economics. Addison-Wesley, New York, USA.
Private Health Insurance Administration Council, 2004. Data Tables PHIAC A, December 2004. Private Health Insurance Administration Council (PHIAC), Canberra. Retrieved from http://www.phiac.gov.au/for-industry/industry-statistics/datatablesphiaca/
Private Health Insurance Administration Council, 2005. Operations of the Registered Health Benefits Organisations Annual Report 2004–05. Private Health Insurance Administration Council (PHIAC), Canberra.
Propper, C., 2000. The demand for private health care in the U.K. Journal of Health Economics 19, 855–876.
Riphahn, R.T., Wambach, A., Million, A., 2003. Incentive effects in the demand for health care: a bivariate panel count data model. Journal of Applied Econometrics 18 (4), 387–405.
Savage, E., Wright, D., 2003. Moral hazard and adverse selection in Australian private hospitals: 1989–1990. Journal of Health Economics 22, 331–359.
Stabile, M., 2001. Private insurance subsidies and public health care markets: evidence from Canada. Canadian Journal of Economics 34 (4), 921–942.
Todd, P., Wolpin, K., 2006. Ex ante evaluation of social programs. In: Penn Institute for Economic Research Working Paper 06-022.
Train, K., 2003. Discrete Choice Methods with Simulation. Cambridge University Press, New York, USA.
Vera-Hernandez, M., 1999. Duplicate coverage and demand for health care. The case of Catalonia. Health Economics 8 (7), 579–598.

Estimating the price elasticity of beer: Meta-analysis of data with heterogeneity, dependence, and publication bias

Jon P. Nelson ∗
Department of Economics, Pennsylvania State University, University Park, PA 16802, USA

Article history:
Received 3 December 2012
Received in revised form 18 November 2013
Available online 7 December 2013

JEL classification: I120, I100, C10, C590

Keywords: Alcohol; Prices; Taxes; Elasticity; Meta-analysis

Abstract
Precise estimates of price elasticities are important for alcohol tax policy. Using meta-analysis, this paper corrects average beer elasticities for heterogeneity, dependence, and publication selection bias. A sample of 191 estimates is obtained from 114 primary studies. Simple and weighted means are reported. Dependence is addressed by restricting number of estimates per study, author-restricted samples, and author-specific variables. Publication bias is addressed using funnel graph, trim-and-fill, and Egger’s intercept model. Heterogeneity and selection bias are examined jointly in meta-regressions containing moderator variables for econometric methodology, primary data, and precision of estimates. Results for fixed- and random-effects regressions are reported. Country-specific effects and sample time periods are unimportant, but several methodology variables help explain the dispersion of estimates. In models that correct for selection bias and heterogeneity, the average beer price elasticity is about −0.20, which is less elastic by 50% compared to values commonly used in alcohol tax policy simulations.
1. Introduction

Excessive consumption of beverage alcohol, especially by youth and young adults, can be addressed by a variety of policy tools, including demand-reducing policies in the form of higher alcohol taxes and prices (Phelps, 1988; Cook, 2007; Cawley and Ruhm, 2012). The economic justification for higher alcohol taxes is based on two broad principles. First, there are external costs associated with excessive consumption such as drink-driving, violence, property destruction, and drunken behavior. Higher alcohol taxes reflect social costs not contained in market prices, which restores economic efficiency. Other options include restrictions on time, place or manner of consumption, and severe penalties for some costly behaviors. Second, some individuals are poorly informed about addictive, intoxicating or adverse health effects of excessive alcohol use, and their current consumption may not be optimal due to information costs. Higher alcohol taxes also address this market failure, although better health information or restricted sales are policy alternatives. However, a broad, population-based tax policy is difficult to sustain if there are two types of consumers: heavy drinkers who account for the majority of social costs but whose demands are relatively price inelastic; and light/moderate drinkers with more elastic demands. In the worst case scenario, demands by heavy-drinking individuals are perfectly inelastic, so higher taxes or prices do little to reduce consumption and social costs, and end up imposing welfare losses on moderate drinkers.1 The taxation paradox is often challenged by reviewers citing particular studies suggesting heavy drinkers are highly responsive to prices (e.g., Xu and Chaloupka, 2011), but this view is challenged on empirical grounds by Ayyagari et al. (2011), Manning et al. (1995), Nelson (2013a,b), and Ruhm et al. (2012). Taxation policies are clearly tied to the price elasticity (or inelasticity) of demand for alcohol, but elasticity estimates exhibit substantial dispersion across drinking

∗ Tel.: +1 814 237 0157. E-mail address: jpn@psu.edu
1 An economic model that addresses this paradox is due to Pogue and Sgontz (1989), who base optimal alcohol taxes on relative social gains and losses. Lacking empirical evidence, they assume heavy drinkers and moderate drinkers have identical price elasticities as their optimal tax simulation requires only the relative price elasticity of the two groups; for contrary evidence, see Nelson (2013a,b).
J.P. Nelson / Journal of Health Economics 33 (2014) 180–187 181
patterns, beverages, countries and econometric models and methods, making precise summary estimates difficult to obtain.
The objective of this paper is to summarize and synthesize estimates of the price elasticity of beer, while accounting for dispersion and potential biases due to heterogeneity (factual and methodological); dependence of estimates due to methodology and sampling; and publication selection bias. Meta-analysis and meta-regression analysis are used to address these data issues, and I first discuss deficiencies in several previous meta-analyses that failed to analyze fully the dispersion of estimates for beer price elasticities. The importance of this investigation for public policy is illuminated by prior studies of drinking patterns for what is termed the “beer-drinking subculture.” First, Greenfield and Rogers (1999) and Rogers and Greenfield (1999) present survey evidence showing that the top 5% of US drinkers by volume account for 39–42% of alcohol consumption, men and young adults are overrepresented among heavy drinkers, and beer accounts for the bulk of alcohol consumed by heavy drinkers (80% compared to 4% for wine and 16% for spirits). Second, beer is more often involved with drink-driving and other risky behaviors, regardless of confounding variables such as age, gender, and education (Berger and Snortum, 1985; Hennessy and Saltz, 1990). Third, beer accounts for two-thirds of all alcohol consumed by binge drinkers (Naimi et al., 2007), and generally is the preferred beverage among males, young adults, and college students (Dawson, 1993; Kerr et al., 2004; Snortum et al., 1987). As noted by Cook and Moore (1994, p. 53), there is no basis for claiming that beer is the “drink of moderation.” A refined or targeted tax policy could levy higher taxes on beer compared to wine and spirits, but success of this policy depends on the price elasticity of beer (and cross-price elasticities with wine and spirits). Overall, tax policies that address social costs of excessive consumption must carefully consider price responsiveness of alcohol demands for beverages and consumers, especially the price elasticity of beer drinkers.
Numerous empirical studies estimate demand functions for beer. The present study reviews, summarizes, and synthesizes 191 price elasticity estimates – the effect size – drawn from 114 primary studies. Criteria for inclusion and exclusion of primary studies and estimates are described below. Estimates are based on several types of data for 24 different countries. However, six English-speaking countries – Australia, Canada, Ireland, New Zealand, the UK, and US – account for 143 estimates or 75% of the total. Estimates exhibit substantial dispersion and contain potential biases, which reflect several factors. First, there is variation due to factual heterogeneity, such as sample time period and country. There also are numerous methodological factors that vary from study to study, such as level of aggregation, econometric models, estimation methods, and measurement of prices and income. Second, investigators may report more than one elasticity estimate from the same data or different investigators may use identical or similar data sets. This means that effect-size estimates are not independent, which biases summary estimates and their standard errors (Borenstein et al., 2009; Nelson and Kennedy, 2009). Third, studies and estimates may be subject to publication selection bias, which distorts the magnitude and precision of reported effect sizes (Borenstein et al., 2009; Stanley and Doucouliagos, 2012). A complete analysis must consider all three sources of dispersion. Previous analyses have addressed heterogeneity and dependence, but this is the first meta-analysis to report summary estimates of beer price elasticities that correct for publication bias.
The remainder of the paper is divided into four sections. Section

inverse variances. Section 3 contains an investigation of publication bias. Using several methods, corrected weighted-averages are reported. Section 4 examines the effects of heterogeneity within the set of 191 estimates, and presents meta-regressions that test for effects of independent variables describing factual differences, methodology, and publication bias. Section 5 contains a summary of results and discusses implications for alcohol tax policy. Overall, the demand for beer is less elastic than reported in previous meta-analyses or contained in many policy discussions of alcohol problems.

2. Meta-analysis: procedures in previous analyses and the present study

Meta-analyses in economics must cope with several problems that arise due to the observational nature of data used in econometric studies, reporting methods favored by academic researchers and journals, and potential biases arising from selective reporting and publication of empirical results in primary studies (Nelson and Kennedy, 2009; Stanley and Doucouliagos, 2012). These problems include heteroskedasticity, heterogeneity, outliers, dependence of estimates, and publication bias. Three previous meta-analyses examine price and income elasticities for alcohol beverages. This section first summarizes approaches taken by prior analyses, with particular attention to primary data selection and estimation methodology.

2.1. Previous meta-analyses: procedures and methods

Gallet (2007) was the first to conduct a meta-analysis of alcohol elasticities. He examines 132 primary studies that yield 1172 price elasticities and 1014 income elasticities. Multiple estimates are obtained from many studies, so dependencies must be addressed. Unweighted median values are reported for beer, wine, spirits, and total alcohol. Unweighted averages ignore the precision of elasticities. Meta-regressions are estimated, which pool elasticities for three beverages and total alcohol. These regressions control for heterogeneity, but pooling of beverages makes coefficient estimates difficult to interpret. Gallet uses author-specific dummy variables for studies reporting multiple estimates or authors with multiple studies based on the same or similar data. It is unclear how his study handles outliers and he does not discuss publication bias as a data problem. The meta-analysis by Fogarty (2009) departs in several important ways from procedures used by Gallet. Fogarty employs full and restricted samples, with weights based on inverse variances. This procedure corrects for heteroskedasticity due to different sample sizes and other features of the data (Rhodes, 2012). Heterogeneity is addressed by a variety of covariates, with special attention paid to theoretical concepts being analyzed (e.g., Hicksian vs. Marshallian price elasticities). Both fixed- and random-effects models are estimated. Outliers are selectively deleted (Fogarty, 2009, p. 32), and dependence is addressed by restricting the number of observations to one per study or a few observations that differ in important ways. Fogarty pools observations for three beverages, so his meta-regressions suffer from the same shortcoming as Gallet’s study. Finally, Fogarty presents several funnel graphs, which appear to show asymmetries in the dispersion of price elasticities for each beverage. He reports “moderate publication bias only for the beer own-price elasticity” (Fogarty, 2009, p. 32), but does not correct for any selection bias in the estimates.
2 contains a brief review of three previous meta-analyses of alcohol Wagenaar et al. (2009) fail to improve on the above studies.
price elasticities, and comments on methods used here compared First, the analysis combines price and tax elasticities, which is
to earlier analyses. Several weighted-average estimates for price incorrect and misleading. In some cases, aggregate and survey-
and income elasticities are reported, where weights are based on based results are combined, without attention to basic differences
in results. Procedures for estimating primary standard errors, if any, are not reported. Second, meta-analysis is conducted using partial correlation coefficients: the change in standard deviation units of alcohol consumption for a one standard deviation change in price/tax. While this procedure allows comparisons of studies employing different theoretical concepts, it fails to address an important policy issue – the magnitude of price elasticities by beverage. Third, dependence is handled by restricting observations to one per study, but exclusion criteria applied are uncertain; see Wagenaar et al. (2009, p. 182). The study begins with 1003 estimates, but appears to end up analyzing only 190–392 estimates, including 105 for beer prices/taxes from only 47 studies. Separate analyses are conducted for beer, wine, spirits, and total alcohol, but there are no covariates for observable sources of heterogeneity. Averages reported are unweighted. Fourth, Wagenaar et al. (2009) omit a number of econometric studies that report price elasticities for alcohol beverages. Fifth, publication bias is not addressed.

It is useful for economic policy to know if beer elasticities vary systematically over time or space. Several testable hypotheses emerge from prior analyses. First, Gallet (2007) and Fogarty (2009) report that year or trend effects are important for alcohol price elasticities, but it is unclear if their results apply to beer specifically. Second, Fogarty (2009) reports that country-effects are important for beer prices and income. Third, Wagenaar et al. (2009) report that all effect sizes for alcohol are large, but their study fails to present regression-based estimates, analyze weighted means, or distinguish between price and tax elasticities. The present study provides a more comprehensive meta-analysis for price and income data underlying these hypotheses.

2.2. Present meta-analysis: procedures and methods

I begin by describing the selection of studies and effect-size estimates, and then provide a summary of unweighted and weighted average price and income elasticities for beer.

A search of the substantial literature on alcohol demand was conducted during the months of August–October 2012, with reference lists from recent meta-analyses providing a useful starting point. Articles, chapters, books, reports, dissertations, and working papers were examined on alcohol demand and alcohol-related outcomes, such as traffic fatalities, crime, labor productivity and wages, and other adverse outcomes. Some econometric studies on alcohol harms include first-stage or structural demand estimates, which are easily overlooked. Searching in related literatures also is one way to reduce publication bias. Among search terms used were combinations of “alcohol” AND “tax” OR “price,” OR “elasticity.” Searches also were conducted using “beer,” “wine” OR “spirits.” Among databases searched were AgEcon Search, EconLit, JSTOR, RePEc, SSRN, and the on-line retrieval engines for EBSCO Host, ProQuest, ScienceDirect Journals, and Wiley Online Library. Numerous Google searches also were conducted. The search was restricted to materials in the English language, but not limited to articles in peer-reviewed journals. A total of 578 studies were recovered and examined. Abstracts, tables, and other summaries were screened by the author to select alcohol-consumption studies with price elasticity estimates. This procedure narrowed the search to 232 studies of alcohol demand, which was further reduced to 178 studies that contained sufficient information on price elasticities and standard errors. There are 60 primary studies not included in Fogarty (2009) or Gallet (2007), and 54 studies included in their analyses that are excluded here. There are 135 studies reviewed here that were not included in Wagenaar et al. (2009). Reasons for exclusion of selected studies are found in a bibliography of 578 studies available on the author’s Web page at: http://econ.la.psu.edu/people/biographies/nelson bio.shtml.

Table 1
Summary of meta-analysis variables.

Category/variable (a)                                     No. of beer obs., medians (191 total obs.)
Basic study information
  No. of studies                                          114 studies
  Publication date (median)                               1997
  Countries                                               See Table 2
  Published in Applied Economics                          49 obs.
  Article/report or working paper/dissertation            164 article obs.
Basic sample information
  Median sample year                                      1977
  No. of primary observations (median)                    39 obs.
  Data frequency: annual, qtr./monthly, or survey         133 annual
  Data type: time series, cross-section, or panel         150 time series
  Aggregation level: country, state, province, or person  138 country-level
Effect-size unwt. medians
  Price elasticity – beer (191 obs.)                      −0.320
  Std. error of price                                     0.150
  Std. error reported or estimated                        96 reported
  Income elasticity – beer (169 obs.)                     0.660
  Std. error of income                                    0.130
Econometric models
  Double log                                              64
  System (AIDS, Rotterdam, etc.)                          105
  Other models (linear, ARIMA, etc.)                      22
  Lagged dep. variable                                    44
  Rational addiction model                                4
  Error correction model                                  15
  Standard demand model                                   128
Effect-size features
  Short-run price estimate                                176
  Long-run price estimate                                 15
  Conditional model                                       98
  Unconditional model                                     93
  Hicksian price elasticity                               169
  Marshallian price elasticity                            22
Other data features
  Price index or other measure                            65 index
  Income/total exp. or other measure                      84 income/exp.

(a) Two articles analyze two data sets each and are counted as four studies. Tables presenting the effect sizes and standard errors used in this study are available on the author’s Web page or upon request.

Elasticity and standard error data were collected by the author for beer, wine, spirits, and total alcohol. This paper uses data on beer price and income elasticities from 114 of 178 studies. Following previous meta-analyses, data also were collected on a variety of potential moderator variables, which are summarized in Table 1. In general, moderator variables are conditioned so there are sufficient positive observations in each of several possible categories.

Empirical studies in economics frequently report multiple estimates based on the same data or major portions of the same data set. Using all estimates creates within-study dependence or correlation, which will bias summary statistics. More weight is assigned to primary studies with multiple estimates than to studies with one or two estimates. More importantly, using multiple estimates will bias standard errors of summary statistics, with understatement of errors being a generally expected outcome (Borenstein et al., 2009, p. 226). There are several possible solutions to dependence problems, including use of a single “best” estimate per study or model, cluster robust standard errors, and hierarchical/multi-level models (Nelson, 2013c). Between-study dependence also is encountered when primary investigators use the same data to estimate different models in separate publications or different investigators use similar data. Following Fogarty (2009, p. 10), I restricted the collection
of effect sizes to one or two estimates per primary study or model. In general, the primary author’s preferred result was selected from each study and an important variation was used to capture different model specifications or estimation methods. Results of this selection procedure are summarized in a data appendix, available on the author’s Web page. While this restriction reduces within-study dependence, it does not address between-study dependence. In particular, several authors have published multiple studies using similar data sets, such as repeated country-level models using time-series data for increasingly longer time periods. How serious is this duplication is a matter of judgment, since investigators also tend to use different econometric methods, resulting in different “treatment effects.” A more complete analysis of between-study dependence is reported below, where I consider restricted data sets and meta-regression models with author-related variables. There is no evidence of systematic bias due to sampling methods employed.

Standard meta-analysis methods require estimates of effect size precision in the form of standard errors (Hedges and Olkin, 1985; Rhodes, 2012; Stanley and Doucouliagos, 2012, p. 27). For half of the effect sizes, standard errors could be obtained directly, i.e., the primary study reports an elasticity and standard error (or t-statistic). For the other half, standard errors had to be obtained from estimates reported in other portions of the study. If the primary study reported a regression coefficient for price and its standard error or t-statistic, then I used this information to derive standard errors for reported price elasticities. Thus, a standard error estimate was derived by dividing the elasticity value by the t-statistic on price. (No elasticity values were calculated.) This procedure introduces bias in standard error estimates, with expectation of an upward bias as covariance terms are ignored. As a check, I regressed standard errors on effect sizes and a dummy indicator variable for estimated errors. For prices, the dummy coefficient was 0.053 (se = .025). The coefficient is just significant, and indicates a modest upward bias (mean elasticity standard error is 0.207). For incomes, the dummy coefficient was 0.080 (.03). The estimate is statistically significant and indicates an upward bias (mean elasticity standard error is 0.205). As an additional check, means and regressions reported below use restricted samples or include a dummy variable that identifies observations with estimated errors. There is no evidence of systematic bias due to estimation procedures for effect-size standard errors.

2.3. Summary of average beer elasticities in four meta-analyses

Table 2 summarizes average beer price and income elasticities using three prior analyses and the present study. Unweighted average price elasticities are about −0.30 to −0.50. Unweighted income elasticities are about 0.50–0.70 in Fogarty and the present study, but smaller in Gallet. Possible reasons are exclusion of most individual-level studies in the two former analyses or outliers in Gallet’s income data.

Table 2
Summary of average beer price and income elasticities: three meta-analyses.

Study (smpl. obs. for price, income) (a)  Ave. price        Ave. income
                                        elasticity (se)   elasticity (se)
Gallet (2007, p. 124)
  unwt. median (315, 278 obs.)          −0.360            0.394
Fogarty (2009, p. 24)
  unwt. mean (154, 121 obs.)            −0.450            0.640
  unwt. median (154, 121 obs.)          −0.330            0.670
Fogarty (2009, p. 25)
  Australia, unwt. mean (19, 14 obs.)   −0.340 (.220)     0.770 (.130)
  Canada, unwt. mean (29, 19 obs.)      −0.450 (.310)     0.890 (1.41)
  UK, unwt. mean (42, 33 obs.)          −0.470 (.540)     0.550 (.510)
  USA, unwt. mean (36, 27 obs.)         −0.520 (.490)     0.450 (.570)
Wagenaar et al. (2009, p. 187)
  unwt. mean price/tax (105 obs)        −0.460            na
This study – full samples
  unwt. mean (191, 169 obs.)            −0.424 (.348)     0.604 (.430)
  unwt. median (191, 169 obs.)          −0.320            0.660
  wt. mean – fixed effect               −0.233 (.005)     0.390 (.006)
  wt. mean – random effects             −0.350 (.018)     0.567 (.035)
This study – unwt. medians
  Australia-NZ (24, 22)                 −0.215            0.795
  Canada (28, 22)                       −0.285            0.390
  Nordic (23, 20)                       −0.390            0.430
  UK-Ireland (49, 47)                   −0.310            0.680
  USA (42, 36)                          −0.330            0.315
  All other (25, 22)                    −0.350            0.890
This study – random effects
  Australia-NZ                          −0.250 (.028)     0.703 (.067)
  Canada                                −0.369 (.042)     0.432 (.078)
  Nordic                                −0.448 (.075)     0.450 (.055)
  UK-Ireland                            −0.361 (.045)     0.619 (.046)
  USA                                   −0.287 (.034)     0.351 (.062)
  All other                             −0.423 (.082)     0.926 (.109)
This study – restricted samples
  unwt. mean (161, 139 obs.)            −0.464 (.359)     0.575 (.448)
  unwt. median (161, 139 obs.)          −0.370            0.650
  wt. mean – fixed effect               −0.238 (.005)     0.366 (.006)
  wt. mean – random effects             −0.385 (.021)     0.537 (.039)
  unwt. mean (96, 83 obs.)              −0.450 (.391)     0.570 (.453)
  unwt. median (96, 83 obs.)            −0.315            0.500
  wt. mean – fixed effect               −0.296 (.008)     0.335 (.009)
  wt. mean – random effects             −0.414 (.033)     0.598 (.046)

(a) Weighted means computed using CMA 2.2 (Borenstein et al., 2008). For sample of 95 observations with estimated standard errors, fixed-effect mean price elasticity is −0.198 (se = .01) and random-effects mean is −0.271 (.02).

In a meta-analysis, fixed-effect models use inverse variances for weights and treat dispersion in estimates as due solely to stochastic sampling error in each primary study (Hedges and Olkin, 1985; Borenstein et al., 2010). That is, all studies are viewed as estimating a common, or fixed, population effect size. Estimates with smaller standard errors provide more precise information about the population value and are given greater weight. The plural or random-effects model also accounts for variation in population values by estimating a common between-study variance based on observed dispersion, which is added to the study-level variance. Inverses of combined variances provide weights for each effect size. Thus, in a random-effects model, true effect sizes are similar or comparable, but not identical across studies. This is the simplest procedure for addressing heterogeneity. Random-effect models give greater weight to less precise studies, which is an advantage given the estimation procedure described above. Using random-effects, Table 2 indicates a weighted-mean price elasticity of −0.35 and a weighted-mean income elasticity of 0.57. Fixed-effect means are noticeably less elastic: price elasticity is −0.23 and income elasticity is 0.39.

Table 2 also presents averages for two individual countries (Canada, USA) and four groups of countries: Australia-New Zealand; Nordic countries (Denmark, Finland, Norway, Sweden); United Kingdom-Ireland; and all other countries. Medians range from −0.22 to −0.39 for price. Random-effect price means range from −0.25 for Australia-NZ to −0.45 for Nordic countries. For income, means range from 0.35 for the US to 0.70 for Australia-NZ and 0.93 for all other countries. While average beer price elasticities vary across countries, the extent of variation is not substantial. This outcome is confirmed by meta-regression results reported below. Greater country-to-country variation is observed for average beer income elasticities. Lastly, the bottom rows in Table 2 show, first, results for a restricted sample that deletes nine studies using similar data. This reduces sample sizes to 161 observations for price
elasticities and 139 observations for income elasticities. Results of this restriction are a slightly more elastic price mean (−0.38 compared to −0.35) and less elastic income mean (0.54 compared to 0.57). These are not substantial changes. The second restricted sample excludes observations where standard errors are estimated. Results for this restriction are a slightly more elastic price mean (−0.41 compared to −0.35) and more elastic income mean (0.60 compared to 0.57). Again, these are not substantial changes. Overall, tentative results from the meta-analysis in this section are a beer price elasticity of about −0.35 and a beer income elasticity of 0.55. These results correct for precision and unobserved heterogeneity in random-effects models.
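
As an illustration of the weighting described in Section 2.3, the following Python sketch computes a fixed-effect (inverse-variance) mean and a random-effects mean. It assumes the DerSimonian–Laird method-of-moments estimator for the between-study variance, which is a common choice but is not stated in the text. The function names are hypothetical, and the sketch is not a reproduction of the CMA 2.2 calculations behind Table 2.

```python
import math


def fixed_effect_mean(effects, ses):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    mean = sum(w * e for w, e in zip(weights, effects)) / total
    return mean, math.sqrt(1.0 / total)


def random_effects_mean(effects, ses):
    """Random-effects mean, its standard error, and tau^2.

    Assumption: tau^2 (between-study variance) is estimated with the
    DerSimonian-Laird moment estimator and truncated at zero.
    """
    weights = [1.0 / se ** 2 for se in ses]
    fe_mean, _ = fixed_effect_mean(effects, ses)
    # Cochran's Q: weighted dispersion around the fixed-effect mean
    q = sum(w * (e - fe_mean) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Combined variance = study-level variance + between-study variance
    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    total = sum(re_weights)
    mean = sum(w * e for w, e in zip(re_weights, effects)) / total
    return mean, math.sqrt(1.0 / total), tau2
```

With hypothetical inputs such as `effects = [-0.2, -0.4, -0.9]` and `ses = [0.05, 0.1, 0.3]`, the random-effects mean lies further from zero than the fixed-effect mean because the imprecise, more elastic estimate gains relative weight once the between-study variance is added. This is the same direction as the difference between the two weighted means in Table 2.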
3. Publication bias: detection and treatment

Publication bias occurs when primary researchers search among elasticity estimates and select those with statistically significant coefficients, “correct” signs, and more elastic values. The result is a biased set of published estimates, with general expectations of a positive association between reported effect sizes and their standard errors, i.e., less precise estimates are more likely to be published if they have larger effects.² The purpose of this section is to determine if publication bias exists for the sample of beer price elasticities, measure the extent of bias, and report weighted average price elasticities corrected for bias.

² Card and Krueger (1995) attribute publication bias to three sources: (1) reviewers and journal editors may be predisposed to accept papers that support conventional views such as negative price elasticities; (2) reviewers and journals tend to favor papers with statistically significant results; and (3) primary researchers use t-statistics of two or more for the main covariates as a guide for model specification and selection. For economic models of publication bias, see Young et al. (2008) and Doucouliagos and Stanley (2012).

3.1. Fixed-effect versus random-effects: which model?

A possible indicator of publication bias is a noticeable difference between fixed- and random-effects weighted means (see Table 2). This difference arises because less precise estimates are given greater weight in random- compared to fixed-effect models, reflecting unobservable heterogeneity that can be factual or methodological. As pointed out by Stanley and Doucouliagos (2012, p. 82), “random effects are then, in part, the result of these greater [methodological] efforts to select and report desired estimates.” However, bias also is possible in estimated standard errors (Stanley and Doucouliagos, 2012, p. 112), so “more precise” estimates may simply reflect downward biased or inefficient estimates for standard errors. Using random-effects is a possible offset to this inefficiency and also corrects for error outliers (Fogarty, 2009, p. 28). As a consequence, results reported below use both fixed- and random-effects models.

3.2. Funnel graph and trim-and-fill mean estimates

Fig. 1 displays the funnel graph for beer price elasticities, where standard errors are arrayed on the Y-axis and elasticity estimates on the X-axis. A vertical line indicates the fixed-effect mean (−0.23) and diagonal lines are the 95% confidence interval. In the absence of publication bias, a funnel plot is symmetric about the mean effect size. The graph indicates negative bias because there is greater dispersion of estimates to the left of the mean and a clear indication of “missing” studies to the lower right of the mean. The upper part of the graph shows a concentration of more precise estimates near the mean, but studies also are missing in the mid-range of values.

Fig. 1. Funnel plot for 191 beer price elasticity estimates (horizontal axis: point estimate; vertical axis: standard error).

A method for correcting for publication bias is the trim-and-fill procedure suggested by Duval and Tweedie (2000), which uses an iterative algorithm to add missing values until observations are symmetric about a recomputed mean effect size. Using trim-and-fill, the fixed-effect mean is −0.20 (se = .01) and the random-effects mean is −0.23 (.02). This correction for publication bias reduces weighted-mean estimates by 15–35%.

3.3. Meta-regression analysis of publication bias: FAT–PET results

In the absence of publication bias, effect size estimates and standard errors are independent. If not, they are correlated and summary statistics and standard errors are biased. For the i-th observation ($i = 1, \ldots, N$), a regression-based test of this association can be represented as (Card and Krueger, 1995; Stanley and Doucouliagos, 2012)

$$ES_i = \beta_1 + \beta_0 Se_i + \varepsilon_i \qquad (1)$$

where $ES_i$ is the estimated price elasticity in study i, $Se_i$ is its estimated standard error, and $\varepsilon_i$ is a stochastic error term. In the absence of selection and heterogeneity, observed effects should vary randomly about the true effect size, $\beta_1$, independent of standard errors.

However, if model specifications and estimates are selected based on significance of the main covariates, selection bias will vary directly with standard errors, i.e., larger Se values are associated generally with larger effect-size estimates. Because effect size estimates are inherently heteroskedastic, it is appropriate to divide Eq. (1) by standard errors to yield (Egger et al., 1997; Sterne and Egger, 2005; Stanley, 2005)

$$t_i = \beta_0 + \beta_1 \frac{1}{Se_i} + \nu_i \qquad (2)$$

where $t_i$ is the t-statistic for the i-th observation, $1/Se_i$ is its precision, and $\nu_i$ is an error term corrected for heteroskedasticity. In Eq. (2), a test of $H_0: \beta_0 = 0$ on the intercept (the Egger intercept) is a test of absence of publication bias or funnel-asymmetry test (FAT). If FAT rejects symmetry, the estimate of $\beta_1$, the coefficient on precision, is interpreted as the effect size corrected for publication bias, referred to as the precision-effect test (PET); see Stanley and Doucouliagos (2012).

Table 3 shows results for several FAT–PET regressions for beer price elasticities, estimated with different samples and model specifications. Applying FAT, all intercepts are statistically significant,
Table 3
Regression-based tests of publication bias.

Variable (a)                 (1)             (2)             (3)             (4)             (5)
Intercept                    −1.577 (.235)*  −1.867 (.256)*  −1.643 (.455)*  −1.275 (.236)*  −0.939 (.337)*
Precision (1/Se)             −0.153 (.027)*  −0.148 (.028)*  −0.190 (.059)*  −0.143 (.029)*  −0.155 (.028)*
Australia-NZ                 –               –               –               –               −0.499 (.648)
Canada                       –               –               –               –               −1.079 (.606)
Nordic                       –               –               –               –               −0.595 (.522)
UK                           –               –               –               –               −1.031 (.609)
Other countries (excl. USA)  –               –               –               –               −0.493 (.637)
R-square                     0.243           0.230           0.162           0.462           0.260
Sample size                  191             161             96              95              191
Mean dep. var.               −3.31           −3.42           −3.42           −2.87           −3.31

(a) Dependent variable is t-statistic value for beer price elasticity estimates; standard errors in parentheses based on White’s heteroskedastic robust OLS estimator. Asterisks indicate statistically significant at 95% level.
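
The two-variable regressions in columns (1)–(4) of Table 3 regress each estimate's t-statistic on its precision. The Python sketch below shows one way such a regression could be computed, with White (HC0) standard errors as described in the table note. The function name is hypothetical, the small-sample form of the robust estimator is an assumption, and the sketch is not the author's code.

```python
import math


def fat_pet(effects, ses):
    """OLS of t_i = ES_i / Se_i on precision 1 / Se_i.

    Returns (intercept, slope, se_intercept, se_slope). The intercept is
    the FAT term (publication bias) and the slope on precision is the PET
    term (bias-corrected effect size). Standard errors use the HC0 form of
    White's estimator, which is an assumption about the exact variant.
    """
    t = [e / s for e, s in zip(effects, ses)]
    x = [1.0 / s for s in ses]
    n = len(t)
    x_bar = sum(x) / n
    t_bar = sum(t) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (ti - t_bar) for xi, ti in zip(x, t)) / sxx
    intercept = t_bar - slope * x_bar
    resid = [ti - intercept - slope * xi for xi, ti in zip(x, t)]
    # Each coefficient is a linear combination of the t_i; HC0 weights the
    # squared combination coefficients by squared residuals.
    var_slope = sum(((xi - x_bar) / sxx) ** 2 * u ** 2
                    for xi, u in zip(x, resid))
    var_intercept = sum((1.0 / n - x_bar * (xi - x_bar) / sxx) ** 2 * u ** 2
                        for xi, u in zip(x, resid))
    return intercept, slope, math.sqrt(var_intercept), math.sqrt(var_slope)
```

Under this setup, a significantly negative intercept corresponds to the FAT rows of Table 3, and the slope corresponds to the precision rows that give the corrected price elasticity.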
and values between −1.0 and −2.0 are consistent with substantial selection bias (Doucouliagos and Stanley, 2012). Applying PET, precision coefficients indicate mean price elasticities in the range −0.14 to −0.19, with little difference between the full sample and an author-restricted sample of 161 observations (−0.153 compared to −0.148). Regressions (3) and (4) split the data into two samples based on reporting of primary errors. Price elasticities are −0.19 (Se reported) and −0.14 (Se estimated). Differences are probably more a function of underlying heterogeneity than indicative of different results due to estimation procedures. Lastly, regression (5) includes a set of country dummies, but none of these are statistically significant. All corrected price elasticities in Table 3 are about −0.20, or 50% less elastic than conventional averages. Compared to weighted means in Table 2, meta-regression estimates corrected for publication bias are about 30–35% less elastic. Hence, correcting for publication bias has an important impact on the magnitude of average beer price elasticities.

3.4. Cumulative meta-analysis: mean elasticities by precision restrictions

An alternative to FAT–PET regressions is to order the data by precision and focus attention on more precisely estimated effect sizes (Borenstein et al., 2009, p. 289; Stanley and Doucouliagos, 2012, p. 56). If less precise estimates are biased, it is possible to think of assigning zero weight to these estimates relative to those with smaller errors, regardless of effect sizes (Stanley et al., 2010). With data ordered by precision, a cumulative meta-analysis is an analysis run with the first observation, then repeated with the first two observations, and so on until all 191 observations are analyzed. In this framework, the analysis yields a fixed-effect mean of −0.17 (se = .01) for 20 observations with the smallest standard errors and a mean of −0.23 (.01) for the 50% most precise. Random-effects means are −0.18 (.02) and −0.30 (.02).³

³ If publication bias operates in part through a reading of past results, then ordering observations by publication date might reveal that bias is more important for more recent studies. Bias can creep into temporal patterns of estimates as investigators condition their reported values to lie within an “acceptable” range of estimates. To test this proposition, I ran the cumulative meta-analysis with observations ordered by publication date. For fixed-effect models, values are −0.30 (se = .01) in 1991, −0.25 (.01) in 2005, and −0.23 (.01) in 2011. Bias toward more elastic values in recent studies does not appear to be a problem. This is confirmed by meta-regression results in Table 4.

In summary, a meta-analysis that corrects for publication bias yields mean price elasticities for beer that are less elastic, indicating negative bias in reported results. Using trim-and-fill, mean values are −0.20 and −0.23. Using meta-regression results, fixed-effect values are −0.14 to −0.19. Using a cumulative analysis, fixed- and random-effects means are −0.23 and −0.30 at the median value for precision.

4. Meta-regression analysis of heterogeneity and publication bias

Adding moderator variables to Eq. (1) yields a weighted least-squares (WLS) meta-regression model of heterogeneity and publication bias. Specifying all moderator variables as binary dummies preserves the interpretation of coefficients in Eq. (1), with the intercept as an estimate of the true effect size for a null case and precision coefficient as an index of distortion due to publication bias. As a correction for heteroskedasticity, weights are based on inverse variances of elasticity estimates. Both fixed- and random-effects regressions are estimated. A general-to-specific approach was used to arrive at a preferred model specification (Stanley and Doucouliagos, 2012). All moderator variables were entered in a regression and then ten core significant variables were retained. Results for a robustness test are obtained by back-adding selected variables, which demonstrates that core coefficients are generally stable and statistically significant.

4.1. Fixed-effect meta-regressions

Table 4 displays results for five meta-regressions estimated using WLS. Regression (1) is the preferred model, and the null category is a primary elasticity with the following features: (1) published in a journal article or book using annual data at the country level; (2) theoretical model for unconditional Hicksian compensated price elasticity; (3) estimated using a double-log specification; (4) an index for the price variable, with no lag terms on the right-hand side; and (5) the primary study reports a standard error for the price elasticity. A negative coefficient for a covariate indicates the null category yields a less elastic price effect and a positive coefficient indicates the opposite. For example, other things equal, Hicksian price elasticities (null) should be smaller in absolute value compared to Marshallian elasticities, due to income effects (Fogarty, 2009, p. 23). This relationship is confirmed empirically in Table 4, where in regression (1) the coefficient is −0.20 (se = .04). The intercepts in regressions (1)–(5) are negative and significant, indicating price elasticities from −0.14 to −0.17 for the null category. Precision coefficients are about −1.0 or slightly smaller. The dummy for estimated errors is insignificant.

Regressions (2)–(5) test for robustness by back-adding moderator variables to the model specification. None of the added coefficients are significant, and most have standard errors that are at least as large as the coefficient estimate. Regression (2) shows that adjusting for dependence due to author-specific effects is not important, given sampling procedures. Country-effects are insignificant in regression (3) and time effects are insignificant
Table 4
Meta-regression results for beer price elasticities.
Variablea (1) (2) (3) (4) (5) (6)
Intercept (null category) −0.166 (.059)* −0.168 (.059)* −0.165 (.059)* −0.155 (.066)* −0.135 (.061)* −0.204 (.067)*
Precision (std. error) −1.065 (.262)* −1.144 (.269)* −1.024 (.271)* −1.023 (.264)* −1.129 (.262)* −0.951 (.218)*
Publication (article = 0) −0.104 (.035)* −0.093 (.037)* −0.103 (.036)* −0.112 (.036)* −0.113 (.036)* −0.120 (.057)*
Frequency (annual = 0) −0.206 (.042)* −0.199 (.042)* −0.204 (.043)* −0.222 (.044)* −0.194 (.042)* −0.110 (.051)*
Aggregation level (country = 0) −0.096 (.042)* −0.090 (.043)* −0.097 (.043)* −0.178 (.085)* −0.118 (.049)* −0.110 (.050)*
Theory model (uncond = 0) 0.080 (.042) 0.083 (.043) 0.084 (.043) 0.086 (.042)* 0.101 (.047)* 0.070 (.050)
Price model (Hicks = 0) −0.201 (.042)* −0.209 (.047)* −0.202 (.042)* −0.186 (.043)* −0.230 (.045)* −0.156 (.064)*
Lag model? (no = 0) 0.168 (.041)* 0.167 (.042)* 0.169 (.041)* 0.167 (.043)* 0.185 (.042)* 0.055 (.054)*
Econometric (double log = 0) −0.101 (.047)* −0.106 (.047)* −0.104 (.049)* −0.083 (.049) −0.110 (.050)* 0.002 (.048)
Price index (index = 0) 0.126 (.050)* 0.120 (.050)* 0.123 (.050)* 0.116 (.051)* 0.158 (.054)* 0.027 (.055)
Se estimated? (no = 0) −0.065 (.043) −0.063 (.044) −0.064 (.044) −0.072 (.044) −0.052 (.044) −0.019 (.067)
Author restriction? (no = 0) – 0.043 (.042) – – – –
Applied Econ pub.? (no = 0) – 0.026 (.033) – – – –
Nordic? (no = 0) – – −0.050 (.077) – – –
Other non-Anglo? (no = 0) – – 0.001 (.050) – – –
Data mid-year (<1977 = 0) – – – −0.026 (.029) – –
Time series data? (yes = 0) – – – 0.087 (.087) – –
Pub. year (<1997 = 0) – – – – −0.045 (.026) –
Income meas. by dpi or exp.? (yes = 0) – – – – −0.057 (.059) –
R-square (wt.) 0.444 0.450 0.445 0.450 0.455 0.432
Estimation method WLS WLS WLS WLS WLS MM
a Dependent variable is the price elasticity of beer (wt. mean = −0.233). Coefficient standard errors are in parentheses; asterisks indicate
significance at the 95% level. Sample size is 191 observations. All explanatory variables are specified as binary dummies, except the precision
variable; see Table 1 for variables. WLS is estimated using inverse-variance weights. Regression (6) is a random-effects regression, estimated by
method-of-moments (MM) with the Stata metareg command.
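The estimators named in the table note can be illustrated with a minimal sketch. It is not the author's code: the data are hypothetical (estimate, standard error) pairs invented for illustration, the function names are made up, the regression uses only the standard error as a regressor (the methodological dummies are omitted), and the DerSimonian-Laird formula is assumed as the no-covariate form of a method-of-moments between-study variance.

```python
# Hypothetical (elasticity estimate, standard error) pairs; NOT the study's data.
estimates = [(-0.10, 0.05), (-0.25, 0.10), (-0.40, 0.20), (-0.15, 0.08), (-0.60, 0.30)]


def fixed_effect_mean(data):
    """Inverse-variance weighted mean (fixed-effect summary)."""
    weights = [1.0 / se ** 2 for _, se in data]
    return sum(w * e for w, (e, _) in zip(weights, data)) / sum(weights)


def dl_tau2(data):
    """Method-of-moments (DerSimonian-Laird) between-study variance, floored at 0."""
    weights = [1.0 / se ** 2 for _, se in data]
    mean_fe = fixed_effect_mean(data)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (e - mean_fe) ** 2 for w, (e, _) in zip(weights, data))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    return max(0.0, (q - (len(data) - 1)) / c)


def random_effects_mean(data):
    """Weighted mean with weights 1 / (se^2 + tau^2)."""
    tau2 = dl_tau2(data)
    weights = [1.0 / (se ** 2 + tau2) for _, se in data]
    return sum(w * e for w, (e, _) in zip(weights, data)) / sum(weights)


def wls_precision_regression(data):
    """WLS of the estimate on its standard error with inverse-variance weights.

    Returns (intercept, slope). The intercept is the estimate implied at a
    standard error of zero; the slope captures small-study asymmetry.
    """
    weights = [1.0 / se ** 2 for _, se in data]
    sw = sum(weights)
    x_bar = sum(w * se for w, (_, se) in zip(weights, data)) / sw
    y_bar = sum(w * e for w, (e, _) in zip(weights, data)) / sw
    sxx = sum(w * (se - x_bar) ** 2 for w, (_, se) in zip(weights, data))
    sxy = sum(w * (se - x_bar) * (e - y_bar) for w, (e, se) in zip(weights, data))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


print("fixed-effect mean:", round(fixed_effect_mean(estimates), 3))
print("random-effects mean:", round(random_effects_mean(estimates), 3))
print("WLS intercept, slope:", wls_precision_regression(estimates))
```

Because the random-effects weights add a common variance term to every study, they are more nearly equal than the fixed-effect weights, so imprecise estimates count for relatively more in the random-effects mean.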
in regressions (4) and (5). Robustness checks show that core coefficients are generally stable and significant. Visual inspection
of residuals suggests normality, although a few outliers remain. Overall, regression (1) captures the main features of primary data
and econometric modeling that are important sources of dispersion in primary estimates. R-square is 0.44 for regression (1), which is
consistent with many other meta-analyses in economics.

4.2. Random-effects meta-regression

Regression (6) in Table 4 is a random-effects regression, using the method-of-moments procedure in Stata’s metareg command
(Harbord and Higgins, 2008). As before, this is a more conservative approach. The intercept is −0.20 and the precision slope is −0.95,
which are modest changes in magnitude compared to regression (1). Fewer methodological variables are significant, or they have wider
confidence intervals, but this is a generally expected outcome for random-effects. However, the difference between −0.17 and −0.20
is not substantial, indicating that different meta-regression methods give rise to similar results.

5. Discussion

This study uses meta-analysis to correct summary averages of beer price elasticities for heteroskedasticity, heterogeneity,
dependence, and publication bias. First, weighted means for a sample of 191 estimates are −0.23 and −0.35 for fixed- and
random-effects. Second, correcting for publication bias using trim-and-fill yields estimates of −0.20 and −0.23. At the median
precision, a cumulative meta-analysis yields weighted means of −0.23 and −0.30. Third, correcting for heterogeneity and publication
bias, the fixed-effect estimate for the null category is −0.17 and the random-effects estimate is −0.20. Other samples or categories do
not produce results that differ widely from these estimates, such as dispersion due to dependence, country-specific effects, time
period, and estimation methods. Overall, the analysis indicates the average price elasticity of beer is about −0.20, which is 50% less
elastic than previously reported averages. A highly inelastic estimate has important implications for alcohol tax policy.

The justification for higher alcohol taxes to address social costs rests importantly on elastic demands, especially for beer as the
beverage of choice among heavy and youthful drinkers. In the absence of this condition, tax increases on alcohol may fail a social
benefit-cost test, although revenue, fairness, or “sin tax” objectives might be considerations (Cnossen, 2007; Heien, 1995/1996;
Kenkel and Manning, 1996). Motivated by studies of social costs of alcohol abuse and dependence, there have been a number of attempts
to calculate optimal alcohol taxes. In principle, these calculations are complex and involve alcohol demands that vary by drinking
patterns (heavy vs. moderate drinkers, acute vs. chronic consumption); beverage type (beer vs. wine or spirits); drinker age and
gender; and non-linear marginal external costs. As a result, current attempts to determine or simulate optimal taxes usually end
up with fairly blunt policy instruments. For example, Lhachimi et al. (2012) use an alcohol price elasticity of −0.50, which is constant
across populations and beverage types. In a study by van den Berg et al. (2008), elasticities are specified by beverage using older
values from Clements et al. (1997). Other simulations involve some econometric modeling, such as Purshouse et al. (2010). However,
their elasticity estimates ignore some basic data issues, such as not accounting for zero observations in survey data (Purshouse et al.,
2010, p. 1363). Using meta-analysis, this study demonstrates that the demand for beer, on average, is highly inelastic, so attempts to
construct alcohol taxes that ignore this feature of beverage demand will be misguided. This result, when combined with other survey
evidence on the inelasticity of demands by heavy drinkers (Nelson, 2013a,b), suggests that population-based alcohol tax policies
are unlikely to achieve their desired or stated objectives. Other non-fiscal alcohol policies are deserving of greater attention and
consideration.

Acknowledgements

Research leading to this paper was supported in part by the International Center for Alcohol Policies, Washington, DC. This
paper presents the work product, findings, viewpoints, and conclusions solely of the author. The views expressed are not necessarily
those of ICAP or any of ICAP’s sponsoring companies.

References

Ayyagari, P., Deb, P., Fletcher, J., Gallo, W., Sindelar, J.L., 2011. Understanding heterogeneity in price elasticities in the demand for alcohol for older individuals. Health Economics 22, 89–105.
Berger, D.E., Snortum, J.R., 1985. Alcoholic beverage preferences of drinking-driving violators. Journal of Studies on Alcohol 46, 232–239.
Borenstein, M., Hedges, L., Higgins, J., Rothstein, H., 2008. Comprehensive Meta-Analysis Version 2.0. Biostat, Inc., Englewood.
Borenstein, M., Hedges, L.V., Higgins, J.P.T., Rothstein, H.R., 2009. Introduction to Meta-Analysis. Wiley, Chichester.
Borenstein, M., Hedges, L.V., Higgins, J.P.T., Rothstein, H.R., 2010. A basic introduction to fixed effect and random-effects models for meta-analysis. Research Synthesis Methods 1, 97–111.
Card, D.E., Krueger, A.B., 1995. Time-series minimum-wage studies: a meta-analysis. American Economic Review 85, 238–243.
Cawley, J., Ruhm, C.J., 2012. The economics of risky health behaviors. In: Pauly, M.V. (Ed.), Handbook of Health Economics, vol. 2. Oxford, North-Holland, pp. 95–199.
Clements, K.W., Yang, W., Zheng, S.W., 1997. Is utility additive? The case of alcohol. Applied Economics 29, 1163–1167.
Cnossen, S., 2007. Alcohol taxation and regulation in the European Union. International Tax and Public Finance 14, 699–732.
Cook, P.J., 2007. Paying the Tab – The Costs and Benefits of Alcohol Control. Princeton University Press, Princeton.
Cook, P.J., Moore, M.J., 1994. This tax’s for you: the case for higher beer taxes. National Tax Journal 47, 559–573.
Dawson, D.A., 1993. Patterns of alcohol consumption: beverage effects on gender differences. Addiction 88, 133–138.
Doucouliagos, H., Stanley, T.D., 2012. Theory competition and selectivity: are all economic facts greatly exaggerated? Journal of Economic Surveys 27, 316–339.
Duval, S.J., Tweedie, R.L., 2000. Trim and fill: a simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics 56, 455–463.
Egger, M., Davey-Smith, D., Schneider, M., Minder, C., 1997. Bias in meta-analysis detected by a simple, graphical test. British Medical Journal 315, 629–634.
Fogarty, J., 2009. The demand for beer, wine and spirits: a survey of the literature. Journal of Economic Surveys 24, 428–478.
Gallet, C.A., 2007. The demand for alcohol: a meta-analysis of elasticities. Australian Journal of Agricultural and Resource Economics 51, 121–135.
Greenfield, T.K., Rogers, J.D., 1999. Who drinks most of the alcohol in the U.S.? The policy implications. Journal of Studies on Alcohol 60, 78–89.
Harbord, R.M., Higgins, J.P.T., 2008. Meta-regression in Stata. Stata Journal 8, 493–519 (reprinted in: J.A.C. Sterne (Ed.), Meta-Analysis in Stata, Stata Press, College Station, 70–96).
Hedges, L., Olkin, I., 1985. Statistical Methods for Meta-Analysis. Academic Press, Orlando.
Heien, D.M., 1995/1996. Are higher alcohol taxes justified? Cato Journal 15, 243–257.
Hennessy, M., Saltz, R.F., 1990. The situational riskiness of alcoholic beverages. Journal of Studies on Alcohol 51, 422–427.
Kenkel, D., Manning, W., 1996. Perspectives on alcohol taxation. Alcohol Health and Research World 20, 230–238.
Kerr, W.C., Greenfield, T.K., Bond, J., Ye, Y., Rehm, J., 2004. Age, period and cohort influences on beer, wine and spirits consumption trends in the US National Alcohol Surveys. Addiction 99, 1111–1120.
Lhachimi, S.K., Cole, K.J., Nusselder, W.J., et al., 2012. Health impacts of increasing alcohol prices in the European Union: a dynamic projection. Preventive Medicine 55, 237–243.
Manning, W.G., Blumberg, L., Moulton, L.H., 1995. The demand for alcohol: the differential response to price. Journal of Health Economics 14, 123–148.
Naimi, T.S., Brewer, R.D., Miller, J.W., et al., 2007. What do binge drinkers drink? Implications for alcohol control policy. American Journal of Preventive Medicine 33, 188–193.
Nelson, J.P., 2013a. Does heavy drinking by adults respond to higher alcohol taxes and prices? A review of the empirical literature. Economic Analysis and Policy (forthcoming) http://www.eap-journal.com/forthcoming.php
Nelson, J.P., 2013b. Gender differences in alcohol demand: a systematic review of the role of prices and taxes. Health Economics, http://dx.doi.org/10.1002/hec.2974 (forthcoming).
Nelson, J.P., 2013c. Meta-analysis: statistical methods. In: Johnston, R., Rolfe, J., Rosenberger, R. (Eds.), Benefit Transfer of Environmental and Resource Values: A Handbook for Researchers and Practitioners. Springer, New York (forthcoming).
Nelson, J.P., Kennedy, P.E., 2009. The use (and misuse) of meta-analysis in environmental and natural resource economics: an assessment. Environmental and Resource Economics 42, 345–377.
Phelps, C.E., 1988. Death and taxes: an opportunity for substitution. Journal of Health Economics 7, 1–24.
Pogue, T.F., Sgontz, L.G., 1989. Taxing to control social costs: the case of alcohol. American Economic Review 79, 235–243.
Purshouse, R.C., Meier, P.S., Taylor, K.B., Rafia, R., 2010. Estimated effect of alcohol pricing policies on health and health economics outcomes in England: an epidemiological model. Lancet 375, 1355–1364.
Rhodes, W., 2012. Meta-analysis: an introduction using regression models. Evaluation Review 36, 24–71.
Rogers, J.D., Greenfield, T.K., 1999. Beer drinking accounts for most of the hazardous alcohol consumption reported in the United States. Journal of Studies on Alcohol 60, 732–739.
Ruhm, C.J., Jones, A.S., Kerr, W.C., et al., 2012. What U.S. data should be used to measure the price elasticity of demand for alcohol? Journal of Health Economics 31, 851–862.
Snortum, J.R., Kremer, L.K., Berger, D.E., 1987. Alcoholic beverage preference as a public statement: self-concept and social image of college drinkers. Journal of Studies on Alcohol 48, 243–251.
Stanley, T.D., 2005. Beyond publication bias. Journal of Economic Surveys 19, 309–345.
Stanley, T.D., Doucouliagos, H., 2012. Meta-Regression Analysis in Economics and Business. Routledge, New York.
Stanley, T.D., Jarrell, S.B., Doucouliagos, C., 2010. Could it be better to discard 90% of the data? A statistical paradox. American Statistician 64, 70–77.
Sterne, J.A.C., Egger, M., 2005. Regression methods to detect publication and other bias in meta-analysis. In: Rothstein, H.R., Sutton, A.J., Borenstein, M. (Eds.), Publication Bias in Meta-Analysis – Prevention Assessment and Adjustments. Wiley, Chichester, pp. 99–110.
van den Berg, M., Van Baal, P.H.M., Tariq, L., et al., 2008. The cost-effectiveness of increasing alcohol taxes: a modeling study. BMC Medicine 6, 36, http://dx.doi.org/10.1186/1741-7015-6-36.
Wagenaar, A.C., Salois, M.J., Komro, K.A., 2009. Effects of beverage alcohol price and tax levels on drinking: a meta-analysis of 1003 estimates from 112 studies. Addiction 104, 179–190.
Xu, X., Chaloupka, F.J., 2011. The effects of prices on alcohol use and its consequences. Alcohol Research and Health 34, 236–245.
Young, N.S., Ioannidis, J.P.A., Al-Ubaydli, O., 2008. Why current publication practices may distort science. PLoS Medicine 5, 1418–1422.
