
Organizational Behavior and Human Decision Processes 161 (2020) 3–19


Nudging: Progress to date and future directions☆


John Beshears a,*, Harry Kosowsky b

a Harvard Business School and National Bureau of Economic Research, Soldiers Field, Boston, MA 02163, United States
b National Bureau of Economic Research, 1050 Massachusetts Avenue, Cambridge, MA 02138, United States

Keywords: Nudge; Choice architecture; Behavioral economics; Behavioral science

Abstract: Nudges influence behavior by changing the environment in which decisions are made, without restricting the menu of options and without altering financial incentives. This paper assesses past empirical research on nudging and provides recommendations for future work in this area by discussing examples of successful and unsuccessful nudges and by analyzing 174 articles that estimate nudge treatment effects. Researchers in disciplines spanning the behavioral sciences, using varied data sources, have documented that many different types of nudges succeed in changing behavior in a wide range of domains. Nudges that automate some aspect of the decision-making process have an average effect size, measured by Cohen's d, that is 0.193 larger than that of other nudges. Our analyses point to the need for future research to pay greater attention to (1) determining which types of nudges tend to be most impactful; (2) using field and laboratory research approaches as complementary methods; (3) measuring long-run effects of nudges; (4) considering effects of nudges on non-targeted outcomes; and (5) examining interaction effects among nudges and other interventions.

1. Introduction

In 2008, Richard H. Thaler and Cass R. Sunstein published Nudge: Improving Decisions about Health, Wealth, and Happiness. The authors argued that managers and policy makers can help individuals make wiser choices by subtly altering the features of the environments in which those individuals make decisions. For example, changing the language used to describe the options in a menu, the format in which the options are presented, or the process by which the options are selected can alter individuals' choices in domains ranging from health care to personal finance to environmental conservation. Importantly, to qualify as "nudges," these "choice architecture" strategies do not mandate or forbid options, and they do not meaningfully change the financial incentives associated with various options. Rather, nudges tap into the psychology of decision making and gently guide individuals to different outcomes. In this paper, written a little more than ten years after the publication of Nudge, we provide an assessment of the academic literature evaluating the impact of nudge techniques, and we highlight challenges that this literature will need to confront going forward.

The insight that small changes to the decision-making environment can significantly alter choices has its foundations in a long literature in behavioral economics. Going back at least as far as the work of Herbert A. Simon on bounded rationality (Simon, 1955) and the work of Daniel Kahneman and Amos Tversky on heuristics and biases (Kahneman & Tversky, 1972, 1979; Tversky & Kahneman, 1973, 1974, 1981), social scientists have recognized that individuals face limitations on their ability to process information. Because of these limitations, individuals form many judgments using mental shortcuts that can lead to systematic decision-making errors. Subsequent work in behavioral economics incorporated these psychological phenomena into theoretical models of individual decisions (e.g., Bordalo, Gennaioli, & Shleifer, 2012; Bushong, Rabin, & Schwartzstein, 2019; Gabaix, 2014; Koszegi & Rabin, 2006; Laibson, 1997; O'Donoghue & Rabin, 1999, 2001; Shefrin & Thaler, 1988; Thaler & Shefrin, 1981). Additional work in this area also documented the empirical relevance of such psychological factors for economic outcomes in both laboratory and field settings (e.g., Angeletos, Laibson, Repetto, Tobacman, & Weinberg, 2001; Beshears & Milkman, 2011; Busse, Pope, Pope, & Silva-Risso, 2015; Camerer, Babcock,


This article is an invited submission. It is part of a supplemental issue on “Healthy Habits” edited by Katherine L. Milkman, Dilip Soman, and Kevin G. Volpp and
supported by WW. This supplemental issue collects papers by participants in the Roundtable Discussion on Creating Habit Formation for Healthy Behaviors,
organized in late 2019 by the Wharton-Penn Behavior Change for Good Initiative (BCFG) and the Penn Center for Health Incentives and Behavioral Economics
(CHIBE).
* Corresponding author.
E-mail addresses: jbeshears@hbs.edu (J. Beshears), kosowsky@nber.org (H. Kosowsky).

https://doi.org/10.1016/j.obhdp.2020.09.001
Received 4 March 2020; Received in revised form 7 August 2020; Accepted 20 September 2020
Available online 10 December 2020
0749-5978/© 2020 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
Loewenstein, & Thaler, 1997; Camerer & Lovallo, 1999; Chetty, Looney, & Kroft, 2009; DellaVigna & Malmendier, 2006; Fehr & Goette, 2007; Lacetera, Pope, & Sydnor, 2012; Lerner, Small, & Loewenstein, 2004; Loewenstein & Prelec, 1992; Malmendier & Tate, 2005; Milkman & Beshears, 2009; Odean, 1998, 1999).¹

Building on this large literature that investigates deviations from the neoclassical economic model of decision making, the idea that choice architecture strategies can and should be used to influence behavior relies on two further implications of the behavioral economics perspective. First, the impact of psychological biases on behavior implies that individuals' choices cannot necessarily be relied upon, in the spirit of Afriat (1967), to reflect individuals' normatively relevant preferences. This gap between revealed and normative preferences provides a rationale, beyond standard market failure considerations, for well-intentioned managers and policy makers to intervene in individuals' decision making (Beshears, Choi, Laibson, & Madrian, 2008a). Second, the role of psychological factors in decision making represents not only a challenge but also an opportunity for improving economic outcomes. Managers and policy makers can augment their traditional toolkit of financial incentives with nudges that harness psychological factors in the service of promoting wiser decisions. The literature on nudges is thus a natural extension of prior work in behavioral economics, but it is also distinct from past research in that it focuses on how to apply behavioral economics ideas to important practical problems with the goal of impacting outcomes.

For one example of how to apply behavioral economics ideas in this way, consider automatic enrollment in retirement savings plans. Based on the insights that people are sometimes inattentive (Gabaix, 2019), tend to procrastinate when it comes to taking actions with short-run costs but long-run benefits (Laibson, 1997; O'Donoghue & Rabin, 1999), are often reluctant to switch away from an option because of an aversion to giving up its benefits (Samuelson & Zeckhauser, 1988), and are attracted to options that are perceived to be the social norm (Cialdini & Goldstein, 2004) or are perceived to be endorsed by a trusted authority (McKenzie, Liersch, & Finkelstein, 2006), a manager who hopes to increase employee savings in a firm's retirement plan may wish to change employees' default enrollment status in the plan from non-participation to participation (Thaler, 1994). This nudge changes the enrollment process from an opt-in mechanism (which implements a default contribution rate of zero to the plan for employees who do not actively elect to participate) to an opt-out mechanism (which implements a default contribution rate that is strictly positive for employees who do not actively elect an alternative). This change in the default dramatically increases the fraction of employees contributing to the plan (Beshears et al., 2006, 2008b, 2018; Choi et al., 2002, 2004; Madrian & Shea, 2001). Better yet, retirement savings can be further boosted by automatically escalating employees' contribution rates at future points in time, such as the beginning of each year (Thaler & Benartzi, 2004).

Many public and private organizations have embraced choice architecture approaches to influencing decisions. Adoption of nudge techniques by these organizations has been driven in part by the quantitative evidence demonstrating the potential for nudges like automatic enrollment to cheaply and effectively change behavior. Adoption has also been bolstered by the argument that nudges do not restrict the choice set of well-informed individuals but do help individuals who would otherwise have difficulty selecting and implementing beneficial options from the choice menu (Camerer, Issacharoff, Loewenstein, O'Donoghue, & Rabin, 2003; Thaler & Sunstein, 2003, 2008).

Continuing the savings example, automaticity and defaults are now widely applied in the context of retirement plans. In the United States, the Pension Protection Act of 2006 promoted the adoption of automatic enrollment in defined contribution plans (Beshears, Choi, Laibson, Madrian, & Weller, 2010), and a 2016 survey conducted by the Plan Sponsor Council of America (2018) found that 60% of the 401(k) plans in the sample used automatic enrollment. In several states, including California, Illinois, and Oregon, certain employers that do not sponsor qualified retirement plans are required to automatically enroll their employees in state-administered Individual Retirement Accounts (Center for Retirement Initiatives, 2019). Automatic enrollment also features prominently in national-level retirement systems in the United Kingdom, New Zealand, and Turkey. Beyond the domain of savings, the Organisation for Economic Co-operation and Development reports that 202 organizations around the world apply nudge tactics to public policy (OECD Research, 2018), and multinational firms such as Google, Merck, Swiss Re, and Deloitte have internal groups of employees responsible for incorporating choice architecture techniques into organizational processes.

The enthusiasm for nudge tactics among practitioners has been accompanied by burgeoning interest in nudge tactics among scholarly researchers. In this paper, we use two approaches to discuss the scholarly literature on nudging to date and to highlight important challenges for research in this area to consider in the future. First, we describe individual research articles that serve as examples illustrating key issues. This discussion is far from a comprehensive literature review, but it serves to make our points concrete. Second, we build and analyze a data set of 174 articles that estimated nudge treatment effects. These articles comprise the full set of articles in Elsevier's Scopus database that (1) presented new data evaluating the effect of a nudge, (2) cited one of three seminal works on nudging (Camerer et al., 2003; Thaler & Sunstein, 2003, 2008), and (3) had at least ten citations as of July 2019. This data set represents only a fraction of the universe of research on nudging and is not a perfectly representative sample, but the selection criteria suggest that the data set is unlikely to generate a misleading impression of the portion of the literature receiving citations.

We begin our assessment of the empirical literature on nudging by exploring nudge treatment effect estimates across academic disciplines, domains of application, research settings (field observation versus other approaches), and types of nudges (those that automate some aspect of the decision-making process versus those that do not, as well as nudge categories defined by Beshears and Gino (2015)). For each category along each of these dimensions, we calculate the fraction of treatment effect estimates associated with the category and perform a p-curve analysis (Simonsohn, Nelson, & Simmons, 2014) for estimates that fall in the category. For all categories that have enough treatment effect estimates to permit meaningful analysis, we find that nudges have a statistically reliable effect on behavior. Thus, whether we consider the literature on nudging overall or category by category, the finding that nudges change outcomes is not purely an artifact of statistically unsound research practices or publication bias in favor of statistically significant estimates. However, this result does not rule out the possibility that our database captures a set of nudge treatment effect estimates that reflects some degree of publication bias. Comparisons of average treatment effects across categories are subject to the caveat that the severity of publication bias may vary by category, potentially causing average treatment effects in our sample to be differentially inflated relative to true average treatment effects.

We discuss five issues that future research on nudging should devote more attention to. First, we recommend that researchers investigate which types of nudges tend to have larger effects on outcomes. Such guidance would be useful to managers and policy makers who must select from a large number of possible nudge techniques when seeking to change behavior. In exploratory regressions analyzing our data set of nudge treatment effect estimates, we find suggestive evidence that nudges that automate some aspect of the decision-making process tend to have larger and more robust effects than nudges that do not automate some aspect of the decision-making process, but further work is needed to examine this issue.

¹ Thaler's "Anomalies" column in the Journal of Economic Perspectives played an important role in establishing and disseminating these ideas. See Rabin (1998), DellaVigna (2009), and Bazerman and Moore (2012) for reviews of this literature.
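The opt-in versus opt-out mechanics of automatic enrollment reduce to a simple rule: a passive employee receives the plan's default contribution rate, and only an active election overrides it. A minimal sketch (function and parameter names are ours, purely illustrative; not code from the paper):

```python
def contribution_rate(active_election=None, default_rate=0.0):
    """Contribution rate implied by a plan's default design.

    Opt-in design: default_rate = 0, so passive employees save nothing.
    Opt-out design (automatic enrollment): default_rate > 0, so passive
    employees are enrolled at the default. In either design, an active
    election overrides the default.
    """
    return default_rate if active_election is None else active_election

# A passive employee under an opt-in plan contributes nothing...
assert contribution_rate(active_election=None, default_rate=0.0) == 0.0
# ...but under automatic enrollment at a 3% default, the same passive
# employee contributes 3%.
assert contribution_rate(active_election=None, default_rate=0.03) == 0.03
# Active choices are unaffected by the default regime.
assert contribution_rate(active_election=0.10, default_rate=0.03) == 0.10
```

Automatic escalation in the spirit of Thaler and Benartzi (2004) could be layered on top of this rule by raising the default rate applied to passive employees at scheduled dates.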


Second, researchers should increasingly use field-based and laboratory-based approaches as complementary methods to investigate why and in which situations nudges change outcomes. Third, researchers should place greater emphasis on studying the extent to which nudges lead to cumulative long-run effects on outcomes. Nudges may have long-run effects because they induce changes in habits, because they prompt investments in durable capital (e.g., physical capital or organizational capital as embedded in systems and processes), or because they are applied repeatedly over time. Fourth, researchers should put more effort into measuring the effects of nudges on non-targeted outcomes, as such unintended consequences can partially or even completely offset the intended effects of nudges on targeted outcomes. Fifth, in applications, nudges often represent only one part of a multi-pronged approach to changing behavior, so researchers should increase focus on the interaction effects among nudges and traditional interventions such as financial incentives. Depending on the circumstances, various interventions may be substitutes or complements for one another, and the distinction is critical for designing effective packages of interventions.

Overall, the literature on nudging has been a success, as the empirical evidence indicates that applying behavioral science in this way to solve managerial problems and to advance policy objectives can indeed change behavior. Still, we advocate a more ambitious approach for the next generation of nudging research. Instead of merely asking whether choice architecture strategies can change the action that a person takes in a particular situation (the resounding answer is that they can), future research should go much further by asking which types of choice architecture strategies are most impactful at changing outcomes in an enduring and comprehensive fashion in important contexts.

The paper proceeds as follows. In Section 2, we describe the construction of our data set of nudge treatment effect estimates. In Section 3, we use the data set and a series of example articles to assess the nudge literature to date. We explain our recommendations for future nudge research in Section 4. Section 5 contains our general discussion and conclusions.

2. Data set construction

In this section, we briefly describe the process for constructing our data set capturing information on past empirical research on nudging. The Online Appendix provides further details.

2.1. Selection of articles

To build our data set, we first identified all of the articles in Scopus, Elsevier's abstract and citation database, that cite at least one of three foundational works on nudging: "Regulation for Conservatives: Behavioral Economics and the Case for 'Asymmetric Paternalism'" (Camerer et al., 2003), "Libertarian Paternalism" (Thaler & Sunstein, 2003), and Nudge: Improving Decisions about Health, Wealth, and Happiness (Thaler & Sunstein, 2008). In order to focus on articles that have some degree of influence on the scholarly conversation, we limited the list of articles to those that had at least ten citations in Scopus as of July 1, 2019. We then examined each article in the resulting list of 1052 articles to determine whether it reported at least one treatment effect estimate for a nudge intervention based on novel data (as opposed to a re-analysis of previously reported data). In judging whether or not an intervention qualifies as a nudge, we relied on the definition articulated by Thaler and Sunstein (2008, p. 6): a nudge is an intervention that changes "people's behavior in a predictable way without forbidding any options or significantly changing their economic incentives. To count as a mere nudge, [an] intervention must be easy and cheap to avoid. Nudges are not mandates." This process narrowed the list to 174 articles.

Fig. 1 shows the distribution of publication year for the 174 articles in our data set. The book Nudge, which was cited more frequently in our data set than the other two foundational works on nudging, was first published in 2008, so the publication year for articles in our data set is concentrated in the years following 2008. The distribution tails off in the later years because a later publication date leaves less time for an article to accumulate at least ten citations by July 1, 2019.

2.2. Determination of the number of nudge treatment effect observations

The unit of observation in our data set is the nudge treatment effect estimate. A given article may include multiple experimental conditions, measure multiple outcome variables, and work with multiple populations in multiple settings. Our data set includes separate observations to capture the distinct nudge treatments embedded in a single article. For example, an article comparing two nudge treatments to a control group (using one outcome variable and one study population in one research setting) is represented by two observations in our data set. When an article examines multiple outcome variables, populations, or settings, we follow the authors of the article in determining how to record the results. If the authors report results aggregated across these dimensions, our data set records an observation for the aggregated treatment effect. Otherwise, our data set records multiple observations for the disaggregated treatment effects. This process generated 965 nudge treatment effect observations. See the Online Appendix for details.

2.3. Variables recorded in the data set

For each observation in our data set, we recorded the following variables:

• What academic discipline is associated with the article containing the treatment effect estimate (e.g., economics)?
• In what domain of application was the treatment effect estimated (e.g., health care)?
• In which type of research setting was the treatment effect estimated (e.g., field experiment)?
• What is the nature of the outcome variable (e.g., a summary measure of a series of actions in a field setting)?
• Did the nudge treatment automate some aspect of the individual's decision-making process?
• To which categories and subcategories of nudges does the treatment belong, in the taxonomy of Beshears and Gino (2015)?
• Did the researchers collect follow-up data to measure the treatment effect over a longer time horizon?
• If the researchers collected follow-up data, did the treatment effect persist over the longer time horizon?
• Did the researchers also measure the treatment effect of the nudge on an outcome variable that could offset the treatment effect on the focal outcome variable recorded in this observation? If so, was there such a variable in the same domain as the focal outcome variable? Was there such a variable in a different domain from the focal outcome variable?
• Did the article test whether there is an interaction effect between the nudge and another intervention?
• Is the outcome variable continuous or dichotomous?
• What was the size of the treatment effect, as measured by Cohen's d?
• For the test of the hypothesis that the treatment effect was zero, was the p-value less than 0.10? Was the p-value less than 0.05? What was the exact p-value?²
• Did the nudge backfire?

The Online Appendix describes the details of these variables, but several variables require further commentary here.

² We create three variables to capture information about the p-value because some papers report that the p-value was less than 0.10 or less than 0.05 without reporting the exact p-value.
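A flat record per treatment effect estimate is one natural way to hold the variables listed above. A hypothetical sketch of such a record (the field names and types are ours, not the authors' actual coding scheme; optional fields are missing when an article does not report the information):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class NudgeObservation:
    """One nudge treatment effect estimate from one article."""
    article_id: int
    discipline: str                  # e.g., "economics"
    domain: str                      # e.g., "health care"
    setting: str                     # e.g., "field experiment"
    automates_decision: bool         # automates part of the decision process?
    categories: List[str] = field(default_factory=list)  # Beshears & Gino (2015)
    has_follow_up: bool = False
    effect_persisted: Optional[bool] = None  # defined only with follow-up data
    offsetting_outcome_same_domain: Optional[bool] = None
    offsetting_outcome_other_domain: Optional[bool] = None
    tests_interaction: bool = False
    dichotomous_outcome: bool = False
    cohens_d: Optional[float] = None  # missing when inputs are unreported
    p_value: Optional[float] = None   # exact p-value, when reported
    p_below_05: Optional[bool] = None # some papers report only thresholds
    p_below_10: Optional[bool] = None
    backfired: bool = False
```

Keeping `cohens_d`, `p_value`, and the threshold indicators as separate optional fields mirrors the fact, noted in the footnote above, that some papers report only whether p fell below 0.10 or 0.05.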


[Fig. 1: bar chart of the number of articles (vertical axis, 0–40) by year of publication (horizontal axis, 2004–2018).]
Fig. 1. Distribution of articles in the data set by year of publication.

The variables capturing the research setting and the nature of the outcome variable are closely related to each other. In our analysis, we focus on the research setting and not the nature of the outcome variable. To develop a taxonomy of nudges, Beshears and Gino (2015) build on the idea, popularized by Kahneman (2011), that humans have two basic modes of thinking that they use to make decisions. System 1 thinking is fast and intuitive, but it is prone to errors because it relies on mental shortcuts that can sometimes be led astray. System 2 thinking is slow and deliberative, but it is more likely to reach well-considered conclusions. Beshears and Gino (2015) argue that nudges can alter an individual's decisions by triggering system 1 (that is, by eliciting an intuitive reaction that leads to a different choice), by engaging system 2 (that is, by prompting a momentary pause during which the individual engages in a more reflective cognitive process), or by bypassing both systems (that is, by removing the individual from some aspect of the decision-making process).³ Each of these categories of nudges has subcategories, which we describe in Sections 3.4.2–3.4.4. We allow a given nudge to fall in more than one category or subcategory to recognize that a nudge may operate through multiple mechanisms. Note that the set of nudges that automate some aspect of the decision-making process is larger than the set of nudges that bypass both systems because some nudges that trigger system 1 and some nudges that engage system 2 also involve automaticity. For example, a nudge might automate the process of filling out a registration form for a service, but the individual might still be required to submit the form to sign up for the service. This nudge triggers system 1 by simplifying the process of implementing the decision to sign up. The nudge does not bypass both systems because it has not removed the individual from the process of making the decision to sign up. Thus, the variable recording whether the nudge involves automaticity is not redundant with the variable recording the nudge category.

To calculate Cohen's d for a continuous outcome variable, we divide the estimated treatment effect size in the natural units of the outcome variable by the mean of the standard deviation of the outcome variable across the experimental conditions. When the outcome variable is dichotomous, we use the arcsine transformation 2·arcsin(√p1) − 2·arcsin(√p2) to calculate Cohen's d, where p1 and p2 are the proportions of successes in the two experimental groups (see, for example, Chernev, Böckenholt, & Goodman, 2015).⁴ Many articles do not report sufficient information to calculate Cohen's d, so out of the 965 nudge treatment effect observations in our data set drawn from 174 research articles, Cohen's d is non-missing for only 507 observations drawn from 101 articles. We perform an ordinary least squares regression for which the outcome variable is an indicator for having a missing Cohen's d. The explanatory variables are (a) either an indicator for whether the nudge involves automaticity or indicators for nudge categories in the Beshears and Gino (2015) taxonomy; (b) indicators for academic discipline; (c) indicators for domain of application; (d) an indicator for whether the research setting involves field observation (either via a field experiment or via observational analysis of a natural experiment involving field data); and (e) an indicator for whether the outcome variable is dichotomous. The indicators for academic discipline are jointly statistically significant, as are the indicators for domain of application, suggesting that scholarly norms within a subfield are important determinants of whether researchers report the necessary information for calculating Cohen's d. The indicator for whether the outcome variable is dichotomous is also statistically significant, but this pattern is not surprising because the data requirements for calculating Cohen's d for dichotomous variables are less stringent. The indicator for whether the nudge involves automaticity and the indicator for whether the research setting involves field observation are not significant, and the indicators for nudge categories are not statistically different from each other.

Finally, it is natural to ask whether the analysis recorded in a given observation in our data set was pre-registered. We follow the methodology of Schäfer and Schwarz (2019) to determine whether the analyses captured in our data set were pre-registered, and we find that none of them were. Thus, it is important to keep in mind that the effect size estimates in our data set may overstate true effect sizes (DellaVigna & Linos, 2020; Schäfer & Schwarz, 2019). Furthermore, differences in effect size estimates across the nudges in our data set may be due to differential overstatement of true effect sizes, perhaps driven by differential severity of publication bias. However, for some important comparisons of nudge treatment effect estimates (e.g., the comparison of nudges that involve automaticity versus those that do not), there is little reason to expect differential overstatement of true effect sizes. In addition, the possibility of overstated effect sizes does not call into question the conclusion that nudges affect behavior.

³ See Johnson et al. (2012), Ly, Mažar, Zhao, and Soman (2013), and Halpern (2015) for alternative schemes for categorizing nudges.
⁴ When information on effect size in percentage points is not missing but information on the baseline proportion of successes to total observations is missing, we assume that the effect is symmetric around a proportion of 0.5.
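The two Cohen's d conversions described above, plus the symmetric fallback from footnote 4, can be sketched as follows (helper names are ours, purely illustrative):

```python
import math

def cohens_d_continuous(effect, sds):
    """Treatment effect in the natural units of the outcome, divided by
    the mean of the outcome's standard deviation across the
    experimental conditions."""
    return effect / (sum(sds) / len(sds))

def cohens_d_dichotomous(p1, p2):
    """Arcsine transformation 2*arcsin(sqrt(p1)) - 2*arcsin(sqrt(p2)),
    where p1 and p2 are the proportions of successes in the two
    experimental groups."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

def cohens_d_symmetric(effect_in_proportion):
    """Fallback when only the effect in percentage points is available:
    assume the effect is symmetric around a proportion of 0.5
    (the footnote-4 assumption)."""
    half = effect_in_proportion / 2
    return cohens_d_dichotomous(0.5 + half, 0.5 - half)
```

For instance, a treatment effect of 1 unit with condition standard deviations of 2 and 2 yields d = 0.5, and identical success proportions in the two groups yield d = 0.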


When we conduct p-curve analyses (described in Section 3), the evidence indicates that the conclusion that nudges impact outcomes is not purely the result of publication bias.

3. Assessing the empirical literature on nudging

In Tables 1–4, we analyze our data set by breaking down the nudge treatment effect estimates into categories along several different dimensions: academic discipline, domain of application, research setting, and type of nudge. For each category that we consider, we report the percentage of nudge treatment effect estimates in the data set that are associated with that category. In this calculation, each observation receives a weight proportional to the inverse of the total number of observations associated with the same article. The calculation thereby weights each article equally. We prefer this weighting scheme because it does not place greater weight on an article merely because it measures a larger number of outcome variables.

Within a category that we consider, we also report the mean Cohen's d, the percentage of effects with p < 0.05, and the percentage of effects with p < 0.10. Here, each observation within a category receives a weight proportional to the inverse of the total number of non-missing observations in that category associated with the same article.

We also perform a p-curve analysis (Simonsohn et al., 2014) for each of the categories that we consider. A p-curve analysis is a statistical test that is applied to a collection of null hypothesis statistical tests. In our case, the inputs to the p-curve analysis are the p-values from the tests of the null hypothesis that the nudge treatment effect is zero. If the null hypothesis were true for all of the nudge treatment effects in the category under consideration, we would expect the p-values less than 0.05 to be uniformly distributed over the interval from zero to 0.05. The p-curve analysis formally tests the hypothesis that the distribution is uniform. If this hypothesis is rejected and the nudge treatment effect p-values less than 0.05 are close to zero more frequently than they are close to 0.05, we conclude that there is evidence for the existence of a nudge treatment effect not equal to zero.⁵ In order to weight articles equally in the p-curve analysis and to use independent p-values as inputs to the p-curve algorithm, we implement the following procedure. Among observations within a category, we randomly select one observation per article and enter the resulting subset of p-values into the p-curve algorithm.⁶ We record whether the algorithm rejects (with p < 0.05) the null hypothesis that the collection of p-values represents a body of research that does not have evidential value. We repeat this procedure for 49 additional independent random draws of one observation per article, and we report the percentage of these 50 random draws for which the algorithm rejects the null hypothesis of no evidential value.

In Tables 1–4, we always report the percentage of nudge treatment effect estimates associated with a given category. However, we only report the results of the other calculations when the category is associated with at least ten research articles. In this section, we primarily discuss the percentage of estimates and the mean Cohen's d in each category, and we defer discussion of the likelihood of an effect with p < 0.05, and the likelihood of an effect with p < 0.10 to Section 4, in which we report the results of regression analyses. Every category for which we conduct the p-curve analysis has 86% or more of its associated p-curves rejecting the null hypothesis of no evidential value, with most categories having 100% of their associated p-curves rejecting that null hypothesis. Thus, we do not discuss the p-curve results category by category. We simply note here that the presence of nudge treatment effects is statistically robust.

3.1. Nudge research by academic discipline

Table 1 shows that the literature evaluating nudge interventions is truly multidisciplinary. The most represented academic discipline in our data set is psychology, which is associated with 24.1% of nudge treatment effect estimates. Economics is the second-most represented discipline, with 21.3% of estimates. The category encompassing marketing and consumer behavior follows close behind with 18.4%. Medicine, environmental science, and public health also have substantial representation. In each of the categories with at least ten associated research articles, mean Cohen's d falls in the range 0.20–0.53, and the percentage of effects with p < 0.05 (p < 0.10) falls in the range 49–70% (55–70%).

3.2. Nudge research by domain of application

In Table 2, we see that the effect of nudge interventions on behavior has also proved to be robust across many domains of application. In our data set, 33.9% of the nudge treatment effect estimates are applied to health-related decisions, and 24.1% are applied to decisions related to the environment. Financial decision-making and prosocial behavior are also well represented. Among domains associated with at least ten research articles, mean Cohen's d falls in the range 0.21–0.50, and the percentage of effects with p < 0.05 (p < 0.10) falls in the range 52–70% (60–76%).⁷

3.3. Nudge research by research setting

Table 3 splits our data set into two types of research settings: observation in the field (via field experiments or observational analyses of natural experiments) and observation in other environments (via laboratory experiments, online experiments, or surveys). These two types of settings are roughly equally prevalent, with observation in the field accounting for 44.0% and observation in other environments accounting for 56.0% of the nudge treatment effect estimates. Among treatment effect estimates associated with field observation, mean Cohen's d is 0.32, and the percentage of effects with p < 0.05 (p < 0.10) is 66.0% (70.0%). Among treatment effect estimates not associated with field observation, mean Cohen's d is 0.48, and the percentage of effects with p < 0.05 (p < 0.10) is 60.1% (65.3%).

3.4. Nudge research by type of nudge
discuss the percentages of nudge treatment effect estimates associated In Table 4, we present statistics summarizing nudge treatment effect
with various categories because we wish to establish that many different estimates associated with different types of nudges. Because one of the
types of nudges have been shown to be effective in a wide range of ac­ issues that we highlight in Section 4 is the importance of determining
ademic disciplines, domains of application, and research settings. We which types of nudges are particularly impactful relative to others, this
defer cross-category comparisons of Cohen’s d, the likelihood of an subsection goes into detail regarding our definitions of different cate­
gories and subcategories of nudges.

5 3.4.1. Nudges that do and do not use automaticity


If the hypothesis of a uniform distribution is rejected and the nudge treat­
ment effect p-values less than 0.05 are close to 0.05 more frequently than they
First, we divide nudge treatments into those that do and those that do
are close to zero, we suspect that the collection of results is tainted by unsound not automate some aspect of an individual’s decision-making process.
research practices that allow the researchers to manipulate p-values to be just
below the 0.05 threshold.
6 7
For the p-curve algorithm, we use the code available at http://www.p-c Nearly one-quarter of nudge treatment effect estimates fall in an uncate­
urve.com/ (accessed January 30, 2020). When a research article reports a p- gorized domain of application. Mean Cohen’s d in this group is 0.554, and the
value of zero, we enter a p-value of 0.001 into the algorithm. percentage of effects with p < 0.05 (p < 0.10) is 59.1% (63.0%).
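The inverse-count weighting scheme described above can be made concrete with a short sketch. This is illustrative Python, not the authors' code; the record layout (an "article" key and a "d" key) is a hypothetical convenience:

```python
from collections import defaultdict

def article_weighted_mean(observations):
    """Mean Cohen's d in which each observation is weighted by the inverse
    of the number of non-missing observations from the same article, so
    that every article contributes equally to the category mean."""
    counts = defaultdict(int)
    for obs in observations:
        counts[obs["article"]] += 1
    numerator = sum(obs["d"] / counts[obs["article"]] for obs in observations)
    denominator = sum(1.0 / counts[obs["article"]] for obs in observations)
    return numerator / denominator

# Article A reports two estimates (each receives weight 1/2);
# article B reports one estimate (weight 1).
obs = [
    {"article": "A", "d": 0.2},
    {"article": "A", "d": 0.4},
    {"article": "B", "d": 0.6},
]
print(article_weighted_mean(obs))  # ~0.45: mean of A's average (0.3) and B's 0.6
```

The weighting prevents a single article that reports many estimates from dominating the category mean, which is the property the text emphasizes.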

J. Beshears and H. Kosowsky Organizational Behavior and Human Decision Processes 161 (2020) 3–19

Table 1
Nudge research by academic discipline.

| Discipline | % of effects associated with this discipline | Mean Cohen's d | % of effects with p < 0.05 | % of effects with p < 0.10 | % of p-curves with evidential value (p < 0.05) |
|---|---|---|---|---|---|
| All disciplines | 100% | 0.405 | 61.0% | 65.6% | 100% |
| Economics and finance | 21.8% | 0.201 | 62.4% | 66.1% | 100% |
|   Economics | 21.3% | 0.201 | 61.3% | 65.2% | 100% |
|   Finance | 0.6% | – | – | – | – |
| Environmental science | 8.6% | 0.480 | 65.4% | 67.1% | 100% |
| Marketing / consumer behavior | 18.4% | 0.377 | 48.7% | 54.8% | 100% |
| Medicine | 11.5% | 0.377 | 64.1% | 68.3% | 100% |
| Psychology and cognitive science | 27.0% | 0.526 | 62.4% | 69.1% | 100% |
|   Psychology | 24.1% | 0.531 | 60.9% | 68.4% | 100% |
|   Cognitive science | 2.9% | – | – | – | – |
| Public health | 6.3% | 0.435 | 69.5% | 69.5% | 98% |
| Miscellaneous | 6.3% | 0.490 | 66.0% | 70.0% | 100% |
|   Computer science | 0.6% | – | – | – | – |
|   Engineering | 0.6% | – | – | – | – |
|   Law | 1.2% | – | – | – | – |
|   Management | 1.2% | – | – | – | – |
|   Political science | 0.6% | – | – | – | – |
|   Public administration | 0.6% | – | – | – | – |
|   Transportation | 1.7% | – | – | – | – |

Percentage of effects associated with this discipline is a weighted mean, with an observation’s weight proportional to the inverse of the number of observations
associated with the same article. Mean Cohen’s d and percentage of effects with p < 0.05 (p < 0.10) are weighted means, with an observation’s weight proportional to
the inverse of the number of non-missing observations associated with the same article and with the discipline named in the leftmost column. For the p-curve analysis,
we randomly sample one observation per article and calculate the p-value for the null hypothesis that the random sample does not have evidential value. We repeat this
procedure for 50 independent random samples and report the fraction of samples for which p < 0.05. We only report mean Cohen’s d, the percentage of effects with p <
0.05, the percentage of effects with p < 0.10, and the percentage of p-curves with evidential value if the discipline is associated with at least ten articles.
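The mean Cohen's d columns in Tables 1–4 are standardized mean differences. The authors' exact formulas are given in their Section 2.3, which is not reproduced here, so the following is only a generic illustration: a standard pooled-standard-deviation version for continuous outcomes, plus one common arcsine-based conversion (Cohen's h) for dichotomous outcomes. The dichotomous conversion is an assumption for illustration, not necessarily the one used in the paper:

```python
import math

def cohens_d(mean_treat, mean_ctrl, sd_treat, sd_ctrl, n_treat, n_ctrl):
    """Standardized mean difference for a continuous outcome:
    difference in group means divided by the pooled standard deviation."""
    pooled_var = ((n_treat - 1) * sd_treat**2 + (n_ctrl - 1) * sd_ctrl**2) \
                 / (n_treat + n_ctrl - 2)
    return (mean_treat - mean_ctrl) / math.sqrt(pooled_var)

def cohens_h(p_treat, p_ctrl):
    """Effect size for a dichotomous outcome via the arcsine transformation
    of the two proportions (Cohen's h, often treated as comparable to d)."""
    return 2 * math.asin(math.sqrt(p_treat)) - 2 * math.asin(math.sqrt(p_ctrl))

# A 10-percentage-point mean difference with sd 0.25 in both groups
# corresponds to an effect size of about 0.4.
print(cohens_d(0.60, 0.50, 0.25, 0.25, 100, 100))
```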

Table 2
Nudge research by domain of application.

| Domain | % of effects associated with this domain | Mean Cohen's d | % of effects with p < 0.05 | % of effects with p < 0.10 | % of p-curves with evidential value (p < 0.05) |
|---|---|---|---|---|---|
| Environment | 24.1% | 0.421 | 66.6% | 71.3% | 100% |
| Finance | 7.4% | 0.316 | 70.1% | 76.1% | 98% |
| Health | 33.9% | 0.416 | 59.3% | 62.5% | 100% |
|   Exercise | 1.7% | – | – | – | – |
|   Health care | 10.9% | 0.286 | 57.3% | 59.5% | 96% |
|   Healthy eating | 19.5% | 0.504 | 59.7% | 62.0% | 100% |
|   Miscellaneous health | 1.7% | – | – | – | – |
| Prosocial behavior | 6.9% | 0.213 | 51.8% | 66.4% | 86% |
| Miscellaneous | 27.7% | 0.504 | 59.0% | 62.2% | 100% |
|   Crime / criminal justice | 2.3% | – | – | – | – |
|   Development | 0.6% | – | – | – | – |
|   Education | 1.1% | – | – | – | – |
|   Labor | 0.6% | – | – | – | – |
|   Other | 23.1% | 0.554 | 59.1% | 63.0% | 100% |

Percentage of effects associated with this domain is a weighted mean, with an observation’s weight proportional to the inverse of the number of observations associated
with the same article. Mean Cohen’s d and percentage of effects with p < 0.05 (p < 0.10) are weighted means, with an observation’s weight proportional to the inverse
of the number of non-missing observations associated with the same article and with the domain named in the leftmost column. For the p-curve analysis, we randomly
sample one observation per article and calculate the p-value for the null hypothesis that the random sample does not have evidential value. We repeat this procedure for
50 independent random samples and report the fraction of samples for which p < 0.05. We only report mean Cohen’s d, the percentage of effects with p < 0.05, the
percentage of effects with p < 0.10, and the percentage of p-curves with evidential value if the domain is associated with at least ten articles.
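The repeated-draw procedure described in the table notes can be sketched as follows. This is a simplified stand-in: the actual analyses use the published p-curve code from p-curve.com, whereas `right_skew_test` below is only a crude uniformity check on significant p-values, included to make the one-observation-per-article resampling logic concrete:

```python
import math
import random

def right_skew_test(p_values):
    """Crude stand-in for the p-curve evidential-value test. Under the null
    of no true effect, significant p-values are Uniform(0, 0.05); a mean far
    below 0.025 indicates right skew (evidential value). Returns a one-sided
    p-value from a normal approximation to the mean of the significant p's."""
    sig = [p for p in p_values if p < 0.05]
    if len(sig) < 2:
        return 1.0
    mean = sum(sig) / len(sig)
    se = (0.05 / math.sqrt(12)) / math.sqrt(len(sig))  # sd of Uniform(0, 0.05) mean
    z = (0.025 - mean) / se
    return 0.5 * math.erfc(z / math.sqrt(2))

def fraction_of_draws_with_evidential_value(p_by_article, n_draws=50, seed=0):
    """Sample one p-value per article, test the draw for evidential value,
    repeat n_draws times, and report the fraction of rejections at p < 0.05."""
    rng = random.Random(seed)
    rejections = sum(
        right_skew_test([rng.choice(ps) for ps in p_by_article.values()]) < 0.05
        for _ in range(n_draws)
    )
    return rejections / n_draws
```

Feeding in p-values that cluster near zero yields a rejection fraction of 1.0, while p-values piled just under 0.05 yield 0.0, mirroring the interpretation given in footnote 5.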

An example of a nudge that involves automation is changing the default enrollment status in a defined contribution retirement savings plan from non-participation to participation. If an individual does not actively indicate a desire to contribute to the plan or not contribute to the plan, this nudge automatically implements a strictly positive contribution amount on behalf of the individual. Previous research has documented that this nudge generates large increases in retirement plan participation rates (Beshears et al., 2006, 2008b; Choi et al., 2002, 2004; Madrian & Shea, 2001).

Table 4 reveals that 15.3% of the nudge treatment effect estimates in our data set involve a nudge that uses automaticity. Among this subset of treatment effect estimates, mean Cohen's d is 0.521, and the percentage of effects with p < 0.05 (p < 0.10) is 72.7% (78.6%). Among the subset of treatment effect estimates involving nudges that do not use automaticity, mean Cohen's d is 0.385, and the percentage of effects with p < 0.05 (p < 0.10) is 58.2% (62.7%). While the effects of nudges are statistically robust regardless of whether or not they automate some aspect of the decision-making process, there is suggestive evidence that automaticity produces larger effects. We explore this possibility in more depth in Section 4.1.

3.4.2. Nudges that trigger system 1

We next examine nudges that operate by triggering system 1—that is, nudges that change behavior by invoking a fast, intuitive reaction. In


Table 3
Nudge research by research setting.

| Research setting | % of effects associated with this setting | Mean Cohen's d | % of effects with p < 0.05 | % of effects with p < 0.10 | % of p-curves with evidential value (p < 0.05) |
|---|---|---|---|---|---|
| Field observation (field experiment or observational study) | 44.0% | 0.320 | 66.0% | 70.0% | 100% |
| Laboratory experiment, online experiment, or survey | 56.0% | 0.481 | 60.1% | 65.3% | 100% |

Percentage of effects associated with this setting is a weighted mean, with an observation’s weight proportional to the inverse of the number of observations associated
with the same article. Mean Cohen’s d and percentage of effects with p < 0.05 (p < 0.10) are weighted means, with an observation’s weight proportional to the inverse
of the number of non-missing observations associated with the same article and with the setting named in the leftmost column. For the p-curve analysis, we randomly
sample one observation per article and calculate the p-value for the null hypothesis that the random sample does not have evidential value. We repeat this procedure for
50 independent random samples and report the fraction of samples for which p < 0.05.

Table 4
Nudge research by type of nudge.

| Type of nudge | % of effects associated with this category | Mean Cohen's d | % of effects with p < 0.05 | % of effects with p < 0.10 | % of p-curves with evidential value (p < 0.05) |
|---|---|---|---|---|---|
| Nudges that use automaticity | 15.3% | 0.521 | 72.7% | 78.6% | 100% |
| Nudges that do not use automaticity | 84.7% | 0.385 | 58.2% | 62.7% | 100% |
| Nudges that trigger system 1 | 72.1% | 0.468 | 63.2% | 66.7% | 100% |
|   By arousing emotions | 29.3% | 0.326 | 54.9% | 59.7% | 100% |
|   By harnessing biases | 20.0% | 0.515 | 64.3% | 65.0% | 100% |
|   By simplifying the process | 24.4% | 0.539 | 71.4% | 73.9% | 100% |
| Nudges that engage system 2 | 41.6% | 0.346 | 60.9% | 65.1% | 100% |
|   By encouraging joint evaluation | – | – | – | – | – |
|   By creating opportunities for reflection | 22.4% | 0.329 | 65.0% | 68.7% | 100% |
|   By prompting planning | 1.9% | – | – | – | – |
|   By inspiring broader thinking | 14.7% | 0.396 | 49.5% | 54.2% | 100% |
|   By increasing accountability | 1.2% | – | – | – | – |
|   By emphasizing disconfirming evidence | – | – | – | – | – |
|   By using reminders | 2.6% | – | – | – | – |
| Nudges that bypass both systems | 13.8% | 0.546 | 69.9% | 77.1% | 100% |
|   By setting the default | 13.2% | 0.546 | 68.8% | 76.3% | 100% |
|   By making automatic adjustments | 0.6% | – | – | – | – |

Percentage of effects associated with this category is a weighted mean, with an observation’s weight proportional to the inverse of the number of observations
associated with the same article. Note that percentages add up to more than 100% because a given nudge can belong to more than one category and more than one
subcategory. Mean Cohen’s d and percentage of effects with p < 0.05 (p < 0.10) are weighted means, with an observation’s weight proportional to the inverse of the
number of non-missing observations associated with the same article and with the category named in the leftmost column. For the p-curve analysis, we randomly
sample one observation per article and calculate the p-value for the null hypothesis that the random sample does not have evidential value. We repeat this procedure for
50 independent random samples and report the fraction of samples for which p < 0.05. We only report mean Cohen’s d, the percentage of effects with p < 0.05, the
percentage of effects with p < 0.10, and the percentage of p-curves with evidential value if the discipline is associated with at least ten articles.

Table 4, we see that 72.1% of the treatment effect estimates in our data set involve a nudge that triggers system 1, at least in part (recall that a given nudge may simultaneously trigger system 1, engage system 2, and bypass both systems). Mean Cohen's d for these treatment effect estimates is 0.468, and the percentage of effects with p < 0.05 (p < 0.10) is 63.2% (66.7%).

Beshears and Gino (2015) list three different techniques for triggering system 1: arousing emotions, harnessing biases, and simplifying the decision-making process.

3.4.2.1. Nudges that arouse emotions. The first technique for triggering system 1 is to arouse emotions. For example, Beshears, Choi, Laibson, Madrian, and Zeldes (2014) study framing effects in annuity purchases. Life annuities transform a lump sum payment into a steady stream of monthly income that lasts for the rest of the annuitant's life. When a hypothetical annuity purchase decision is framed as a choice between more versus less guaranteed income, survey respondents annuitize more of their wealth compared to when the decision frame emphasizes the implications of annuities for flexibility and control over asset allocation and the timing of spending. Of course, having "more guaranteed income" and having "less flexibility" are two ways of framing the same fundamental feature of annuities, namely the steady stream of payments that they provide. Emphasizing one lens or the other invokes different emotional reactions that sway an individual's annuity purchase decisions. Table 4 shows that 29.3% of the treatment effect estimates in our data set are associated with a nudge that arouses emotions.

3.4.2.2. Nudges that harness biases. The second technique for triggering system 1 is to harness biases. Nudges that harness biases change behavior by tapping into psychological processes that are known to influence decision making in a systematic fashion. For example, Beshears, Dai, Milkman, and Benartzi (2019) test a retirement savings nudge that harnesses the "fresh start effect," the tendency of individuals to initiate the pursuit of virtuous goals at moments that represent the beginning of a new time period (Dai, Milkman, & Riis, 2014). When individuals received mailings offering the opportunity to increase retirement plan


contributions at a future point in time that was framed as a new beginning (the recipient's birthday, New Year's, or the first day of spring), they contributed more compared to when the future point in time was framed as a control temporal landmark and compared to when the future point in time was discussed without reference to a temporal landmark. Table 4 shows that 20.0% of the treatment effect estimates in our data set are associated with a nudge that harnesses biases.

3.4.2.3. Nudges that simplify the process. The third technique for triggering system 1 is to simplify the process by which decisions are made. By making it easy to select a particular option from the choice menu, a nudge can take advantage of the tendency of system 1 to gravitate towards options that involve little up-front effort to implement. For example, Beshears, Choi, Laibson, and Madrian (2013) apply this technique to increase retirement savings plan participation. The typical plan enrollment process is somewhat complex because it often involves selecting both a contribution rate and an asset allocation for those contributions. When the process is simplified to be a "yes or no" choice of whether to enroll with a pre-selected contribution rate and pre-selected asset allocation, plan participation rates increase by 10–20 percentage points. Here, simplification makes people more likely to take action. In Table 4, we see that 24.4% of the treatment effect estimates in our data set are associated with a nudge that simplifies the decision-making process.

3.4.3. Nudges that engage system 2

We now turn to nudges that operate by engaging system 2—that is, nudges that change behavior by prompting individuals to initiate a more methodical decision-making process than would have otherwise taken place. Table 4 shows that 41.6% of nudge treatment effect estimates in our data set involve a nudge that engages system 2. Mean Cohen's d for these treatment effect estimates is 0.346, and the percentage of effects with p < 0.05 (p < 0.10) is 60.9% (65.1%).

Beshears and Gino (2015) identify seven techniques for engaging system 2, and five of these techniques are represented in our data set: creating opportunities for reflection, prompting planning, inspiring broader thinking, increasing accountability, and using reminders.8

3.4.3.1. Nudges that create opportunities for reflection. The first technique for engaging system 2 is to create opportunities for reflection. Individuals have limited attention, so in some cases it is possible to promote wise decision making by simply encouraging individuals to pause for a moment to consider the options available to them. For example, consider the case of a company that was looking to increase the number of its employees who were receiving maintenance prescription medications for chronic conditions, such as high cholesterol, by mail delivery instead of by in-person pick-up at the pharmacy. Mail delivery is more cost-effective both for the employer and for the employees. The company changed from an opt-in policy, under which employees could elect to receive mail delivery but would otherwise obtain prescriptions by in-store pick-up, to an active choice policy, under which employees were eligible for the prescription drug plan only if they indicated affirmatively whether they wished to use the mail delivery service or wished to pick up prescriptions at the pharmacy. The financial incentives associated with mail delivery and in-store pick-up were unchanged, but the active choice policy called attention to the decision at hand and increased the percentage of employees choosing home delivery from 6% to 42%, creating cost savings for the employer and its employees of approximately $700,000 per year (Beshears, 2016c; Beshears, Rooney, & Sanford, 2016a, 2016b; Beshears et al., in press-b). In Table 4, we see that 22.4% of the treatment effect estimates in our data set are associated with a nudge that creates opportunities for reflection.

3.4.3.2. Nudges that prompt planning. Other nudges engage system 2 by prompting individuals to create plans for implementing valuable actions in the future. The process of thinking through the details of how to enact plans embeds those plans more firmly in memory, raises awareness of possible logistical hurdles (which are then more likely to be resolved), and creates a personal commitment from which deviation is undesirable; all of these factors increase the likelihood of following through on intentions (Beshears, Milkman, & Schwartzstein, 2016). Consistent with this reasoning, when an employer offers free workplace influenza vaccination clinics to employees, prompting employees to write down the date and time when they plan to visit a clinic increases vaccination rates (Beshears, 2016a, 2016b; Milkman, Beshears, Choi, Laibson, & Madrian, 2011). Similar results obtain in the domain of preventive cancer screenings (Dai et al., 2012; Milkman, Beshears, Choi, Laibson, & Madrian, 2013). Table 4 indicates that 1.9% of the treatment effect estimates in our data set are associated with a nudge that prompts planning.

3.4.3.3. Nudges that inspire broader thinking. Some nudges engage system 2 by encouraging individuals to reconsider the lens through which they are viewing a problem and to contemplate the consequences of different possible actions within a broader context. A broader decision-making frame tends to incorporate factors that might not otherwise be considered, such as long-run implications and possible unintended consequences. For example, McKenzie and Liersch (2011) demonstrate that encouraging employees at a company to think about what their future retirement savings plan balance will be causes 41% of them to report greater interest in saving more, whereas encouraging employees to think about their current plan balance causes only 27% of them to report greater interest in saving more. In Table 4, we see that 14.7% of the treatment effect estimates in our data set are associated with a nudge that inspires broader thinking.

3.4.3.4. Nudges that increase accountability. Another technique for engaging system 2 is to increase the extent to which individuals feel accountable for their actions. Even if individuals are holding themselves accountable to some internally generated standard of behavior, and not being held accountable to a standard imposed by an external party, their desire to be consistent with their self-professed values and to honor their promises to themselves can change their behavior. In an experiment conducted at a hotel, guests who were invited during the check-in process to commit to reusing bathroom towels and were given a symbolic pin to wear for making such a commitment were more likely to reuse towels than guests who were invited to make a generic commitment to environmentally friendly behavior or who were not given a symbolic pin (Baca-Motes, Brown, Gneezy, Keenan, & Nelson, 2013). See Aleksovska, Schillemans, and Grimmelikhuijsen (2019) for a review of the literature on accountability. Table 4 shows that 1.2% of the treatment effect estimates in our data set are associated with a nudge that increases accountability.

3.4.3.5. Nudges that use reminders. A final technique for engaging system 2 is to remind individuals of the opportunity to take a particular action. When individuals intend to implement a behavior but have not yet found a convenient moment for following through, or when individuals have simply forgotten their intentions, a reminder can bring their intentions back to the top of mind and prompt them to take action. For example, Altmann and Traxler (2014) show that among patients of a dentist who were due for a check-up, a postcard reminder increased both the likelihood of making an appointment and the likelihood of completing an appointment by approximately ten percentage points. For another example, placing a free workplace influenza vaccination clinic in a location that employees pass regularly in the course of their day-to-

8 The two remaining techniques involve encouraging joint evaluation and emphasizing disconfirming evidence.


day activities increases vaccination rates relative to placing the clinic in a nearby location that employees do not pass regularly, as encountering the clinic on the way to some other activity reminds employees to obtain a vaccination (Beshears, Choi, Laibson, Madrian, & Reynolds, 2016). In Table 4, we see that 2.6% of the treatment effect estimates in our data set are associated with a nudge involving a reminder.

3.4.4. Nudges that bypass both systems

When a nudge bypasses both systems, it takes individuals' actions as inputs and changes how those actions translate into outcomes. Table 4 shows that 13.8% of nudge treatment effect estimates in our data set involve a nudge that bypasses both systems. Mean Cohen's d for these treatment effect estimates is 0.546, and the percentage of effects with p < 0.05 (p < 0.10) is 69.9% (77.1%).

Beshears and Gino (2015) identify two subcategories of nudges that bypass both systems: nudges that set the default and nudges that make automatic adjustments.

3.4.4.1. Nudges that set the default. In Section 1, we discussed automatic enrollment in employer-sponsored retirement savings plans, an example of a nudge that sets the default. By changing the default contribution rate from zero to a strictly positive percentage of pay, automatic enrollment changes what happens when an employee does not actively indicate a preferred contribution rate. Without automatic enrollment, such an employee does not become a plan participant; under automatic enrollment, such an employee does start contributing to the plan. The nudge bypasses both systems in the sense that many individuals passively accept the default without using either system 1 thinking or system 2 thinking to consider their options, and automatic enrollment translates this action (or, more accurately, inaction) into plan participation instead of non-participation as the outcome. Past research has documented that automatic enrollment generates large increases in savings plan participation rates (Beshears et al., 2006, 2008b; Choi et al., 2002, 2004; Madrian & Shea, 2001). In a meta-analysis of nudges that set the default, Jachimowicz, Duncan, Weber, and Johnson (2019) find that these nudges influence outcomes across a wide range of settings. In Table 4, we see that 13.2% of the treatment effect estimates in our data set are associated with a nudge that sets the default.

3.4.4.2. Nudges that make automatic adjustments. Whereas a nudge that sets the default changes the outcome that is implemented when an individual does not actively select an option, a nudge that makes automatic adjustments changes the outcome that is implemented when an individual does make an active choice. For example, consider a firm that offers a traditional before-tax retirement savings account to its employees. When employees contribute money out of their paychecks to such an account, the contributions are tax-deductible in the year they are made, and subsequent withdrawals from the account are taxable. Now consider what happens if the firm introduces a "Roth" savings account as one of its retirement offerings. Contributions to a Roth account are not tax-deductible in the year they are made, but withdrawals from the account are not taxable. If employees use rules of thumb such as "save 10% of income" and ignore the tax treatment of savings, the Roth account increases savings relative to the before-tax account because a given percentage of gross income contributed to a Roth account, which is not taxed at withdrawal, translates into greater after-tax savings than the same percentage of gross income contributed to a before-tax account, which is taxed at withdrawal. Consistent with this idea, Beshears, Choi, Laibson, and Madrian (2017b) analyze eleven companies that introduced Roth retirement accounts and find no evidence that these accounts changed total contributions. Thus, the Roth accounts bypassed both systems because many employees seemed not to change their system 1 thinking or system 2 thinking; instead, they elected to contribute the same percentage of gross income, without regard for the tax treatment of contributions. The Roth accounts made automatic adjustments to employee outcomes in the sense that the same percentage of gross income contributed to a Roth account translated into higher effective (after-tax) savings. Table 4 indicates that 0.6% of the treatment effect estimates in our data set are associated with a nudge that makes automatic adjustments.

4. Future challenges for the empirical nudge literature

Having documented that there is statistically robust evidence in favor of the efficacy of nudges across multiple disciplines, domains of application, research settings, and types of nudges, we turn to a discussion of challenges that the empirical nudge literature must confront if it is to further expand its influence on managerial practice and policy design.

4.1. Comparing categories of nudges based on their impact

To begin our exploration of the categories of nudges that tend to have the largest impact on outcomes, we use our data set of nudge treatment effect estimates and study the correlates of a nudge's effect size as measured by Cohen's d. Recall that we cannot calculate Cohen's d for all of the observations in our data set because the necessary information is not always available, but we can calculate Cohen's d for 507 treatment effect estimates reported in 101 articles.

Columns 1 and 2 of Table 5 report the results from ordinary least squares regressions with Cohen's d as the outcome variable. In column 1, the explanatory variable of interest is an indicator for whether the nudge automates some aspect of the decision-making process. The control variables are indicators for the academic discipline of the article from which the treatment effect estimate is drawn (see Table 1), indicators for the domain of application (see Table 2), an indicator for whether the research setting involves field observation (either via a field experiment or via observational analysis of a natural experiment involving field data), and an indicator for whether the outcome variable for which Cohen's d is calculated is dichotomous (because the calculation of Cohen's d differs for continuous versus dichotomous variables; see Section 2.3). In the regression, each observation has a weight proportional to the inverse of the number of observations in the regression associated with the same article. This procedure gives each article equal weight in the regression. Standard errors are clustered by article. We find that nudges involving automaticity are associated with a Cohen's d that is 0.193 larger (p < 0.05). To put the magnitude of this estimate in context, Fig. 2 shows predicted values and associated 95% confidence intervals from the regression holding all right-hand-side variables fixed at their means except for the indicator for whether the nudge involves automaticity, which takes a value of either zero or one. With other variables held at their means, the predicted Cohen's d for nudges that involve automaticity is approximately 50% larger than the predicted Cohen's d for nudges that do not involve automaticity.
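The regression just described, with article-inverse weights, can be sketched on synthetic data. This is a minimal illustration, not the authors' code: it implements the weighted least squares point estimates and the predicted values underlying Fig. 2, but omits the clustered standard errors and the discipline and domain indicators for brevity, and all variable names and numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
article = rng.integers(0, 60, n)                      # article identifier
automaticity = rng.integers(0, 2, n).astype(float)    # 1 if nudge uses automaticity
field = rng.integers(0, 2, n).astype(float)           # 1 if field observation
dichotomous = rng.integers(0, 2, n).astype(float)     # 1 if outcome is dichotomous
d = 0.35 + 0.19 * automaticity + rng.normal(0, 0.3, n)  # synthetic Cohen's d

# Each observation is weighted by the inverse of its article's observation
# count, so every article carries equal total weight.
counts = np.bincount(article)
w = 1.0 / counts[article]

X = np.column_stack([np.ones(n), automaticity, field, dichotomous])
# Weighted least squares: solve (X' W X) beta = X' W y.
beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * d))

# Predicted Cohen's d with the other regressors at their weighted means and
# the automaticity indicator set to 0 or 1, as in Fig. 2.
means = np.average(X, axis=0, weights=w)
x0, x1 = means.copy(), means.copy()
x0[1], x1[1] = 0.0, 1.0
print(x0 @ beta, x1 @ beta)  # predictions without and with automaticity
```

By construction, the gap between the two predictions equals the estimated automaticity coefficient, which is the comparison Fig. 2 visualizes.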


Table 5
Regression analysis of effect size and statistical significance. Coefficients with standard errors in parentheses.

| Explanatory variable | Cohen's d (1) | Cohen's d (2) | Indicator for p < 0.05 (3) | Indicator for p < 0.05 (4) | Indicator for p < 0.10 (5) | Indicator for p < 0.10 (6) |
|---|---|---|---|---|---|---|
| Nudge uses automaticity | 0.193* (0.083) | – | 0.139* (0.066) | – | 0.155** (0.058) | – |
| Nudge triggers system 1 | – | 0.476** (0.095) | – | 0.828** (0.126) | – | 0.802** (0.150) |
| Nudge engages system 2 | – | 0.421** (0.135) | – | 0.764** (0.138) | – | 0.779** (0.160) |
| Nudge bypasses both systems | – | 0.658** (0.204) | – | 0.815** (0.182) | – | 0.915** (0.191) |
| Research setting uses field observation | −0.113 (0.100) | −0.107 (0.100) | 0.006 (0.063) | 0.028 (0.065) | 0.002 (0.064) | 0.012 (0.065) |
| Dichotomous outcome | −0.125 (0.093) | −0.103 (0.095) | – | – | – | – |
| Indicators for disciplines | yes | yes | yes | yes | yes | yes |
| Indicators for domains | yes | yes | yes | yes | yes | yes |
| p-value for hypothesis "triggers system 1" = "engages system 2" | – | 0.484 | – | 0.368 | – | 0.727 |
| p-value for hypothesis "triggers system 1" = "bypasses both systems" | – | 0.269 | – | 0.926 | – | 0.365 |
| p-value for hypothesis "engages system 2" = "bypasses both systems" | – | 0.140 | – | 0.710 | – | 0.279 |
| R-squared | 0.187 | 0.177 | 0.068 | 0.062 | 0.070 | 0.061 |
| Number of papers | 101 | 101 | 174 | 174 | 174 | 174 |
| Number of observations | N = 507 | N = 507 | N = 965 | N = 965 | N = 965 | N = 965 |

This table reports the results of ordinary least squares (OLS) regressions with the outcome variable in the column heading and the explanatory variables in the leftmost
column. When a nudge belongs to more than one category, the variables recording whether the nudge triggers system 1, whether the nudge engages system 2, and
whether the nudge bypasses both systems take fractional values (1/2 and 1/2, or 1/3 and 1/3 and 1/3). The second, fourth, and sixth regressions omit a constant term.
An observation’s weight in the regression is proportional to the inverse of the number of observations associated with the same article. Standard errors are clustered by
article. The symbols +, *, and ** indicate statistical significance at the 10%, 5%, and 1% levels, respectively.
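The effect-size metric and the weighting rule described in the note above can be sketched in a few lines of Python. This is an illustration, not the authors' code: the function names are ours, and the pooled-standard-deviation formula is the textbook definition of Cohen's d for a continuous outcome.

```python
import math

def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)

def article_weights(article_ids):
    """One weight per observation, proportional to 1 / (observations per article),
    so each article contributes equally in total."""
    counts = {}
    for a in article_ids:
        counts[a] = counts.get(a, 0) + 1
    return [1.0 / counts[a] for a in article_ids]

# A nudge that raises a continuous outcome from 10.0 to 10.5 (SD 2.0, n = 100 per arm):
d = cohens_d(10.5, 10.0, 2.0, 2.0, 100, 100)        # d = 0.25
# Three estimates from article "A" and one from article "B"; each article's weights sum to 1:
weights = article_weights(["A", "A", "A", "B"])     # [1/3, 1/3, 1/3, 1.0]
```

Under this weighting, an article reporting twenty small-sample estimates counts no more in the regression than an article reporting one.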

[Fig. 2 here: a bar chart of predicted Cohen's d (vertical axis, 0 to 0.9) with 95% confidence intervals, one bar for nudges that do not use automaticity and one for nudges that use automaticity; the difference is marked p < 0.05.]
Fig. 2. Predicted Cohen’s d for nudges that do and do not use automaticity. This figure shows predicted values and 95% confidence intervals from an ordinary least
squares (OLS) regression for which the outcome variable is Cohen’s d. The right-hand-side variables are an indicator for whether the nudge uses automaticity,
indicators for academic discipline, indicators for domain of application, an indicator for whether the research setting involves field observation, and an indicator for
whether the outcome variable is dichotomous. An observation’s weight in the regression is proportional to the inverse of the number of observations associated with
the same article. Standard errors are clustered by article. For the predictions, all right-hand-side variables are held fixed at their means except for the indicator for
whether the nudge uses automaticity, which takes a value of either zero or one.
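The predicted values plotted in Fig. 2 come from evaluating a fitted OLS regression at indicator values of zero and one while holding the other covariates at their sample means. The sketch below illustrates that procedure on simulated data; the solver, variable names, and effect sizes are ours (the coefficients loosely echo Table 5), not the paper's data or code.

```python
import random

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = (X'y),
    solved by Gaussian elimination with partial pivoting.
    Each row of X starts with 1.0 for the intercept."""
    k = len(X[0])
    A = [[sum(row[r] * row[c] for row in X) for c in range(k)] for r in range(k)]
    b = [sum(row[r] * yi for row, yi in zip(X, y)) for r in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        beta[r] = (b[r] - sum(A[r][c] * beta[c] for c in range(r + 1, k))) / A[r][r]
    return beta

random.seed(0)
n = 400
uses_automaticity = [random.random() < 0.3 for _ in range(n)]   # hypothetical nudge indicator
field_setting = [random.random() < 0.5 for _ in range(n)]       # hypothetical control
d_observed = [0.25 + 0.19 * a - 0.11 * f + random.gauss(0, 0.3)
              for a, f in zip(uses_automaticity, field_setting)]

X = [[1.0, float(a), float(f)] for a, f in zip(uses_automaticity, field_setting)]
beta = ols(X, d_observed)

# Predict Cohen's d with the indicator toggled and the control held at its mean.
field_mean = sum(field_setting) / n
pred_without = beta[0] + beta[2] * field_mean
pred_with = beta[0] + beta[1] + beta[2] * field_mean
```

Because the covariates are held fixed, the gap between the two predictions equals the coefficient on the automaticity indicator, which is exactly what the vertical distance between the bars in Fig. 2 conveys.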


In column 2 of Table 5, we replace the indicator for whether the nudge automates some aspect of the decision-making process with variables capturing the extent to which a nudge falls into the three major categories identified by Beshears and Gino (2015), namely nudges that trigger system 1, nudges that engage system 2, and nudges that bypass both systems.9 The control variables, weighting procedure, and clustering of standard errors are the same as in column 1. Nudges that bypass both systems tend to have a higher Cohen's d than nudges that trigger system 1 and nudges that engage system 2, but these differences are not statistically significant.

Columns 3 and 4 of Table 5 use the same regression specifications as columns 1 and 2, respectively, except for two changes. First, the outcome variable is an indicator for whether the estimated treatment effect is statistically significant at the 5% level. Second, we drop the indicator for whether the treatment effect is estimated for a dichotomous variable, which was only relevant for the Cohen's d regressions because the calculation of Cohen's d differed for continuous versus dichotomous variables. Columns 5 and 6 of Table 5 are identical to columns 3 and 4, respectively, except the outcome variable is an indicator for whether the estimated treatment effect is statistically significant at the 10% level. Columns 3–6 indicate that nudges involving automaticity are associated with a 13.9 percentage point (15.5 percentage point) higher likelihood of having a treatment effect that is statistically significant at the 5% (10%) level, but there are not systematic differences in the likelihood of a statistically significant treatment effect across the three major categories of nudges.

The results in Table 5 are robust to alternative specifications. The results in columns 1 and 2 are nearly identical if we winsorize Cohen's d at the 1st and 99th percentiles. Also, for our baseline regressions, Cohen's d is treated as missing and dropped from the sample if the estimated nudge treatment effect was the opposite of the predicted direction and either statistically significant or marginally statistically significant. However, the results in columns 1 and 2 are similar if we include those observations in the regression sample, either coded as having a positive value for Cohen's d or coded as having a Cohen's d of zero. In addition, when the nudge treatment effect is estimated for a dichotomous outcome variable and the mean of the outcome variable for the control group is unknown, our baseline regressions assume that the treatment effect is symmetric around 0.5 for the purposes of calculating Cohen's d. The results in columns 1 and 2 are similar if we instead drop these observations from the regression sample. We have also explored alternative methods of weighting observations. If we assign each observation a weight proportional to the inverse of the squared standard error of the Cohen's d estimate, some observations receive extremely high weights, but if we winsorize those weights at the 20th and 80th percentiles, the results in columns 1 and 2 are similar.10 We view equal weighting of the nudge treatment effect observations in our data set as inappropriate because of the disproportionately high weight assigned to articles that include a large number of small-sample studies. Nonetheless, the results in Table 5 are similar if we pursue this strategy, except that the coefficient estimate on the indicator for using automaticity in column 1 is no longer statistically significant or marginally statistically significant (even though the coefficient estimate of 0.160 is close to the coefficient estimate of 0.193 in Table 5). Finally, in columns 3–6 of Table 5, logistic regressions deliver the same conclusion as ordinary least squares regressions.

Overall, the results in Table 5 suggest that nudges involving automaticity have a larger impact, both as measured by Cohen's d and as measured by the likelihood of a statistically significant effect, than nudges that do not involve automaticity. This finding aligns with prior theorizing that automating some aspect of the decision-making process, without relying on individuals to act differently, tends to be more powerful for driving changes in outcomes than encouraging individuals to change their decision-making process, which requires attention and effort (Beshears & Gino, 2015). Of course, our conclusions here are tentative because they are based only on correlational evidence. While we control for some potential confounding factors, the type of nudge that was applied in a given setting was not randomly assigned, so the relationships that we find may be driven by omitted variables, such as how difficult it is to change individuals' behavior in that setting. Moreover, while it is likely that certain types of nudges are consistently more impactful than other types of nudges regardless of context, a critical ingredient for impact may also be the match between the type of nudge and the nature of the behavior to be changed.

It is also important to note that nudges may be beneficial even if they do not exert an effect on the targeted outcome variable. Nudges that simplify the choice environment sometimes have this property. For example, in an experimental study of mutual fund investment decisions, a nudge that simplified the presentation of information regarding the funds did not change participants' investment choices, but participants needed less time to reach decisions of the same quality, suggesting that the nudge was valuable (Beshears, Choi, Laibson, & Madrian, 2011).

Finally, when judging the desirability of a nudge, it is not only the nudge's effect size that is important, but also the cost of implementing the nudge. We return to this point in Section 4.5.

Notwithstanding the caveats above, the evidence in Table 5 suggests that there are important differences in impact across various categories of nudges. Developing a better understanding of this heterogeneity would be a valuable pursuit for the empirical literature on nudging.
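The weighting robustness check described above, inverse-variance weights winsorized at the 20th and 80th percentiles, can be illustrated as follows. The nearest-rank percentile convention used here is an assumption; the text does not specify the authors' exact implementation.

```python
def winsorize(values, lo_pct, hi_pct):
    """Clamp values to the given lower and upper percentiles (nearest-rank rule)."""
    ranked = sorted(values)
    n = len(ranked)
    lo = ranked[min(n - 1, int(lo_pct / 100 * n))]
    hi = ranked[min(n - 1, int(hi_pct / 100 * n))]
    return [min(max(v, lo), hi) for v in values]

# Inverse-variance weights blow up when a reported standard error is tiny,
# so extreme weights are clipped before re-running the regressions.
standard_errors = [0.02, 0.05, 0.10, 0.20, 0.40, 0.80]
raw_weights = [1 / se ** 2 for se in standard_errors]   # largest raw weight is about 2500
clipped_weights = winsorize(raw_weights, 20, 80)        # extremes clipped to about 400 and 6.25
```

Clipping, rather than dropping, the extreme observations preserves the sample while limiting the influence of any single small-standard-error estimate on the fitted coefficients.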

9 We do not simultaneously include the indicator for whether a nudge involves automaticity as a right-hand-side variable because it is moderately highly correlated with the three new variables. Because a given nudge can fall into more than one of the three major categories, we do not use indicator variables for the categories. If a nudge falls into only one category (e.g., nudges that trigger system 1), the variable associated with that category takes a value of one, while the variables associated with the other two categories take values of zero. If a nudge falls into two categories, the variables associated with those categories take a value of 1/2, while the variable associated with the remaining category takes a value of zero. If a nudge falls into all three categories, all three variables take a value of 1/3. The regression omits a constant term, so it models a nudge that spans two or three categories as being a simple average of those categories. As a robustness check, we have estimated regressions that instead use indicator variables for the three categories, allowing multiple indicators to take a value of one for the same nudge, and our qualitative conclusions are unchanged.

10 To calculate these weights, we sometimes must calculate standard errors based on reported p-values. When a reported p-value is zero, we replace it with 0.001 for this calculation.

4.2. Using field research and laboratory research as complementary methods

Field research and laboratory research have both played major roles in the empirical literature on nudging. As indicated in Table 3, 44.0% of the nudge treatment effects in our data set were estimated using field experiments and the observational analysis of field data, and 56.0% of effects were estimated using laboratory-style methods (laboratory experiments, online experiments, and surveys). We argue that research on nudging is most compelling when it uses field and laboratory approaches as complementary methods for understanding why and under what circumstances nudges influence behavior.

Field and laboratory approaches are complementary because they build on each other. Laboratory techniques are often deployed more quickly and easily, so they are well suited for exploring the viability of new nudge interventions that have not been tested previously. Research in the field can then build on laboratory findings by focusing on nudges that have shown promise in laboratory settings. This step is important because even when evidence from laboratory experiments documents that a given nudge can alter individual decisions, evidence from the field may reveal that the size of the effect of the nudge diminishes in field settings, perhaps to the point where it is too small to be of practical use


or too small to detect. In some cases, the discrepancy between the laboratory and field results may be due to the fact that laboratory experiments can amplify the effect of a nudge by stripping away relevant contextual factors, which would have otherwise also influenced the decision at hand. Field research, on the other hand, incorporates those contextual factors, and those factors may drown out the effect of the nudge by drawing attention away from the stimuli by which the nudge is delivered or by triggering alternative ways of thinking about the choice that compete with the decision-making process promoted by the nudge.

To be clear, it is not a critique of laboratory techniques to point out that they tend to strip away contextual factors and are less suited for measuring the size of nudge effects in field settings. The objective of laboratory-style research is often to demonstrate the existence of an effect, not to estimate the size of an effect, and the demonstration of an effect in the absence of contextual factors strengthens the case for the generalizability of the effect across contexts. Furthermore, when field data indicate that a nudge has a small effect or no effect (or even has the opposite of the intended effect) in a particular context, it is often logistically challenging or prohibitively costly to collect additional field data that might shed light on the psychological mechanism behind the result, in part because many nudges deployed in field settings operate through multiple mechanisms simultaneously (Hauser, Gino, & Norton, 2018). Laboratory approaches, which tend to have more carefully controlled experimental manipulations and more targeted measurement of key constructs, are valuable for understanding the decision-making processes that underlie variation in nudge effects across domains and for developing predictions regarding the situations in which a nudge will or will not have the desired effect. We emphasize that the combination of field and laboratory approaches is necessary for providing guidance to practitioners who use nudges to address managerial and public policy problems.

The literature on financial returns aggregation and investment portfolio decisions is one example of the importance of using both field and laboratory methods. Some laboratory studies demonstrate that showing individuals aggregated returns, such as returns aggregated over a long time horizon instead of a short time horizon, increases individuals' willingness to invest in risky assets (Gneezy & Potters, 1997; Thaler, Tversky, Kahneman, & Schwartz, 1997). These findings are consistent with the hypothesis that individuals are myopically loss averse: they evaluate the outcomes of risky investments within a narrow frame, such as a short time window, and when they decide whether or not to invest in a risky asset, a possible loss of a given size within this frame is weighted more heavily than a gain of the same size within this frame (Benartzi & Thaler, 1995). Under this hypothesis, aggregating returns widens the frame, and for most common risky asset return distributions, a wider frame reduces the likelihood of experiencing a loss, which in turn increases willingness to invest in risky assets. Despite the laboratory evidence indicating that an aggregation nudge can change risk-taking behavior, Beshears, Choi, Laibson, and Madrian (2017a) show that the nudge has no effect in a setting that uses mutual funds instead of laboratory gambles and a one-year experimental period instead of a short laboratory session. This evidence suggests that embedding the nudge within a field context drowns out the effect of the nudge. To better understand the mechanisms underlying this result, Beshears et al. (2017a) conduct follow-up laboratory experiments and document that changes to the return distribution or to the amount of time between the moment when a portfolio is chosen and the moment when returns are viewed eliminate the effect of the nudge. These findings do not invalidate the hypothesis that myopic loss aversion drives portfolio choices, but they do suggest that nudges based on this hypothesis are unlikely to impact portfolio choice in field settings.

The literature on nudges that provide information about peer behavior is another example of the importance of using both field and laboratory methods. Field experiments have demonstrated that telling individuals that a particular behavior is common among their peers can make those target individuals more likely to engage in the behavior themselves. This result has been documented in domains including residential energy conservation (Allcott, 2011), contributions to an online community (Chen, Harper, Konstan, & Li, 2010), towel reuse in hotels (Goldstein, Cialdini, & Griskevicius, 2008), voting (Gerber & Rogers, 2009), job choice (Coffman, Featherstone, & Kessler, 2017), food consumption (Sparkman & Walton, 2017), tax compliance (Hallsworth, List, Metcalfe, & Vlaev, 2017; Bott, Cappelen, Sørensen, & Tungodden, 2020), and fare evasion at train stations (Ayal, Celse, & Hochman, in press). In the context of an employer-sponsored retirement savings plan, however, Beshears, Choi, Laibson, Madrian, and Milkman (2015) show that telling certain non-participating individuals about the high participation rates of their peers reduces the target individuals' likelihood of enrolling in the plan, and that this effect is stronger when the participation rate among peers is (plausibly exogenously) higher. The effect is driven by individuals who have low incomes compared to their peers, suggesting that the nudge backfires because it interacts with individuals' concerns regarding relative economic standing, triggering feelings of discouragement and thereby lowering plan enrollment rates. Additional work, including work in laboratory environments, documents that peer information nudges may have no effect or may backfire because individuals' preferences may not depend on perceived social norms, because disclosing the low prevalence of an undesirable behavior may unintentionally make the behavior seem acceptable, because the reference group whose behavior is reported may be interpreted as dissimilar from the target individuals, or because individuals may misremember the peer information in self-serving ways (Bicchieri & Dimant, 2019; Dimant, van Kleef, & Shalvi, 2020).11 Thus, the combination of field and laboratory evidence provides a nuanced understanding of when and how managers and policy makers might successfully use peer information nudges.

11 The possibility of backfiring is not unique to nudges that provide information about peer behavior. See, for example, Brown, Johnstone, Haščič, Vong, and Barascud (2013), Ascarza, Iyengar, and Schleicher (2016), and Bolton, Dimant, and Schmidt (2020).

4.3. Studying the long-run effects of nudges

In our data set of nudge treatment effects, 17 out of the 174 articles collect follow-up data to estimate at least one treatment effect over a longer time horizon than the initial time horizon used to examine the impact of the nudge;12 24 out of the 174 articles estimate at least one treatment effect for an outcome variable that measures the cumulative impact of a series of actions in a field setting (one type of outcome variable that captures a long-run effect); and 36 out of the 174 articles fall in either of the first two categories. Thus, at most 21% of articles attempt to assess the long-run effect of a nudge. Future empirical research on nudging should devote more attention to whether and how nudges exert long-run effects, as nudging is more compelling as a managerial and policy tool if it can generate long-lasting changes in outcomes instead of only short-run effects.

12 See the Online Appendix for details on our method for determining whether the researchers collected follow-up data. For five of the 17 articles, there is ambiguity, but we include these articles in our count to be conservative. Out of the other 12 articles, only two find no evidence of a treatment effect that persists over the longer time horizon.

One particularly intriguing area for future research is the mechanisms by which nudges can help individuals develop habits that lead to sustained changes in behavior. Habits are patterns of behavior characterized by repeated automatic performance of an action or set of actions in response to a routinely occurring cue to act (Wood & Runger, 2016). Unfortunately, in many situations, encouraging individuals to create entirely new routines for engaging in a desirable behavior is unlikely to succeed in promoting habit formation because such new routines are frequently disrupted by competing demands on individuals' time and


attention (Beshears, Lee, Milkman, Mislavsky, & Wisdom, 2019). Success may require devising routines that dovetail with existing recurring events in individuals' lives, but this hypothesis deserves further study.

Given the challenges of helping individuals develop beneficial habits, another promising direction for future research is to explore how nudges can prompt individuals to make one-time, up-front investments that generate a long series of future changes in outcomes. For example, Brandon et al. (2017) document that a nudge showing individuals how their home energy consumption compares to their neighbors' consumption induces investment in energy-saving technologies, such as energy-efficient appliances. These one-time investments lead to persistent reductions in energy use.

Of course, even if a nudge has a short-run effect but not a long-run effect, it may nonetheless be valuable if it can be applied repeatedly and continues to have a short-run effect each time it is applied. Some past work has identified nudges that, upon repeated application, have an effect each time they are applied (see, e.g., Allcott & Rogers, 2014; Altmann & Traxler, 2014; Beshears et al., 2013), but additional work on this issue would be valuable.

4.4. Measuring the effect of nudges on non-targeted outcomes

Out of the 174 articles in our data set of nudge treatment effects, only 12 measure the effect of the nudge on an outcome variable that could offset the treatment effect on the focal outcome variable and that is in the same domain as the focal outcome variable. Only three articles measure the effect of the nudge on an outcome variable that could offset the treatment effect on the focal outcome variable and that is in a different domain from the focal outcome variable. There are strong theoretical reasons to believe that nudges can impact outcomes other than the targeted outcomes. Nudges often operate by changing the choice architecture of one decision-making setting to prompt individuals to change their behavior in that setting. Nudges do little to shift the fundamental costs and benefits of various possible courses of action, so outside of the setting with altered choice architecture, individuals may engage in behavior that compensates for the changes induced by the nudge.

Investigations of nudge effects on non-targeted outcomes do not always find evidence of compensatory behavior. Beshears, Choi, Laibson, Madrian, and Skimmyhorn (2019) study the effects of automatic enrollment into an employer-sponsored retirement savings plan on both targeted and non-targeted outcome variables. Consistent with prior research, they find that automatic enrollment increases contributions to the savings plan, the targeted outcome. One concern with automatic enrollment is that it might also have the unintended consequence of increasing financial distress if individuals have more retirement savings but diminished financial resources for repaying debt. However, Beshears, Choi, Laibson, et al. (2019) do not find that automatic enrollment increases financial distress or credit card and other non-secured debt.13

13 Beshears, Choi, Laibson, et al. (2019) do find suggestive evidence that automatic enrollment increases auto debt and first mortgage debt. Because increases in these types of secured debt are associated with asset purchases, these findings do not necessarily imply decreases in household net worth, but future research should investigate these issues further.

In other situations, however, nudges have effects on non-targeted outcomes that undermine their effects on targeted outcomes. For example, Wisdom, Downs, and Loewenstein (2010) study interventions designed to decrease caloric intake at a fast-food sandwich chain. One nudge intervention decreased the convenience of ordering high-calorie sandwiches and thereby reduced the likelihood with which participants selected high-calorie sandwiches. An unintended consequence was that the nudge simultaneously increased the caloric content of the side dishes and drinks selected by participants, perhaps due to participants' feelings that their virtuous sandwich choices should be rewarded with indulgent side dish and drink choices. The increase in caloric intake from side dishes and drinks entirely offset the decrease in caloric intake from sandwiches.

The dearth of attempts to gauge the effects of nudges on non-targeted outcomes is a glaring omission in the empirical nudge literature, as such unintended consequences can partially offset, entirely eliminate, or even reverse the benefits that nudges deliver on a targeted dimension. In addition to documenting effects on non-targeted outcomes, future research on nudging should attempt to understand the circumstances under which nudging does and does not induce compensatory behavior that undermines the intended impact of the nudge.

4.5. Placing a nudge in the context of other nudges and other interventions

Managers and policy makers often deploy multiple interventions simultaneously. Empirical research on nudging must therefore help managers and policy makers understand how a given nudge fits within a broader constellation of strategies for changing behavior. Towards this end, Benartzi et al. (2017) point out that an important metric by which nudges should be judged is their impact relative to their cost. These authors show across four policy domains that relative to traditional interventions such as financial incentives, nudges tend to deliver greater impact on targeted behaviors per dollar spent. Going forward, evaluations of nudge interventions should place greater emphasis on calculating the costs associated with nudges in order to facilitate comparisons for managers and policy makers who are choosing among interventions.

Furthermore, it is important for managers and policy makers to have evidence on the interactive effects among nudge interventions and other interventions. Out of the 174 articles in our data set of nudge treatment effect estimates, 48 articles (27.6%) feature at least one test of the interaction of a nudge with some other intervention (whether the other intervention is a nudge or not). More research along these lines is needed to provide guidance as to the optimal mix of interventions for achieving managerial and policy objectives.

In some cases, nudges act as complements to other interventions. For example, a nudge that streamlines the process of applying for financial aid for attending college is complementary to existing financial aid programs and can increase college enrollment (Bettinger, Long, Oreopoulos, & Sanbonmatsu, 2012). In other cases, nudges are substitutes for other interventions. For example, in employer-sponsored retirement savings plans, both automatic enrollment and employer matching contributions (contributions from the employer that are deposited in employee accounts contingent upon employees' own contributions) are intended to increase plan participation. These two interventions are likely to be substitutes. In the absence of automatic enrollment, an employer match of $0.25 per dollar of employee contributions is estimated to increase plan participation rates by approximately 10 percentage points relative to not offering a match (Papke, 1995; Basset, Fleming, & Rodrigues, 1998).14 When a plan automatically enrolls employees, an employer match of $0.25 per dollar of employee contributions is estimated to increase plan participation rates by no more than 5–6 percentage points relative to not offering a match (Beshears, Choi, Laibson, & Madrian, 2010), an increase that is substantial but smaller than the increase in the absence of automatic enrollment.

14 Estimates of this treatment effect vary widely, with some estimates as low as 5 percentage points (Engelhardt & Kumar, 2007) and others as high as 33 percentage points (Even & Macpherson, 2005).

An understanding of the extent to which interventions serve as complements or substitutes for each other is a valuable input to managerial and policy decisions. For example, in the case of retirement savings plans, employers with automatic enrollment may wish to direct financial resources away from employer matching contributions, a


substitute for automatic enrollment, towards other programs that promote retirement security, such as employer contributions to employee accounts that are not contingent on employee contributions.

5. General discussion and conclusions

In this paper, we discussed examples of past research on nudging and analyzed a data set of 174 articles in the empirical literature on nudging in order to assess progress to date in this literature and to highlight key challenges for future research on the topic. We documented that many types of nudges, as studied by scholars in several different academic disciplines examining data in field and laboratory settings, have succeeded in changing behavior in a wide range of domains of application. We argued that future research on nudging should place greater emphasis on (1) seeking to identify the types of nudges that tend to be most impactful, (2) using field-based methods and laboratory-based methods as complementary approaches, (3) examining the long-run effects of nudges, (4) considering the effects of nudges on non-targeted outcomes, and (5) studying nudges within the context of a suite of nudges and other interventions designed to change a particular set of behaviors. Many organizations have started to use nudging as a technique for changing behavior, but in order for nudging to take its place alongside traditional interventions (such as financial incentives) in the standard toolkit of managers and policy makers, research on nudging will need to devote much more attention to the five issues above.

As discussed in Section 4, results from the handful of research articles that have begun to address the five issues outlined above suggest that some nudges that are currently considered powerful methods for changing behavior may prove to be less effective than previously thought. For example, nudges that have a large short-run impact may have a negligible long-run impact, necessitating repeated exposure to the nudge in order to generate a meaningful cumulative impact. Nudges that succeed in changing a targeted behavior may simultaneously induce offsetting changes in non-targeted behaviors.

Even when nudges are deemed effective after they have been evaluated in a more comprehensive fashion, it is important to note that their impact on outcomes is often modest. Consider two of the most widely known nudges: automatic enrollment in an employer-sponsored retirement savings plan and personalized reports comparing the recipient's energy consumption to the energy consumption of the recipient's neighbors. When Beshears, Choi, Laibson, et al. (2019) evaluated the implementation of automatic enrollment at a large employer, they estimated that automatic enrollment increased cumulative retirement plan contributions during the first four years after an employee's hire date by only 4.1% of first-year annualized salary, a measurable change but not one that on its own can eliminate the risk that a household will experience a drop in its standard of living at retirement (Munnell, Hou,

costs of implementation. On the basis of impact per unit of cost incurred, nudges are in fact highly valuable tools for changing behavior (Benartzi et al., 2017).

The modest effect sizes of most nudges do imply, however, that it would be unwise to ignore the possibility of implementing other types of interventions merely because a nudge has been deployed to address a given problem (Bhargava & Loewenstein, 2015; Hagmann, Ho, & Loewenstein, 2019; Loewenstein & Chater, 2017). In moving beyond nudges, of course, managers and policy makers must be prepared to accept interventions that fall short of the nudge definition of influencing behavior without limiting choice or meaningfully changing financial incentives. In some cases, even outright prohibition of certain options may be appropriate. Consider the case of restrictions on pre-retirement withdrawals from retirement savings accounts. In the United States, such withdrawals are widely available in exchange for payment of a 10% tax penalty, and they are available penalty-free under special circumstances. In other countries, such as the United Kingdom, such withdrawals are prohibited entirely except in extreme situations (Beshears, Choi, Hurwitz, Laibson, & Madrian, 2015). If the population includes both individuals who have self-control problems that cause them to undersave and individuals who do not have self-control problems (Augenblick, Niederle, & Sprenger, 2015; Beshears et al., 2020), simulations indicate that a policy prohibiting pre-retirement withdrawals leads to large welfare gains for the individuals with self-control problems and very small welfare losses for the individuals without self-control problems, a tradeoff that a policy maker may be willing to make (Beshears, Choi, Clayton, et al., 2019).

When managers and policy makers contemplate the range of possible interventions they might implement, they should consider how the interventions affect consumer welfare, a challenging question when individuals' actions do not necessarily reflect their best interests (see Allcott & Kessler, 2019; Beshears et al., 2008a; Bernheim & Rangel, 2009; Bernheim, Fradkin, & Popov, 2015). A further challenge is that while nudges are expressly intended as tools for improving individual welfare, managers and policy makers who embrace this role for nudges must recognize that profit-maximizing firms may have an incentive to exploit individuals' biases for their own gain (see Baker & Wurgler, 2013; Beshears, Gino, Lee, & Wang, 2016; Heidhues & Koszegi, 2018; and Malmendier, 2018). The designs of interventions should anticipate and account for the ways in which such firms may undermine efforts to improve individual welfare. A complete discussion of formal welfare analysis and strategic interactions with profit-maximizing firms is beyond the scope of this paper, but we briefly highlight a key ethical tension that emerges when contrasting nudge interventions and traditional interventions such as financial incentives. On one hand, traditional interventions may be attractive because they are transparent in their attempts to influence individuals' decisions, whereas nudge in-
& Sanzenbacher, 2018). Similarly, Allcott (2011) estimated that terventions, especially those that trigger system 1 or bypass both sys­
personalized reports containing information about neighbors’ electricity tems, influence individuals’ decisions in ways that those individuals may
utilization reduced energy consumption by 2%, a notable effect but not not fully recognize and understand. On the other hand, nudge in­
one that on its own can reduce carbon emissions to the point where the terventions may be attractive because they help individuals who would
consequences of global climate change are entirely mitigated. otherwise have difficulty making wise decisions and because they do not
The observation that the effects of nudges tend to be modest does not restrict the choices of individuals who make wise decisions on their own
imply that psychological factors are unimportant in determining (Camerer et al., 2003; Thaler & Sunstein, 2003, 2008). Traditional in­
behavior. To the contrary, the fact that it is difficult to change behavior terventions, in contrast, often restrict the choices of the latter group in
in, for example, the retirement savings domain is consistent with a order to help the former group. This tension is one that managers, policy
pervasive role for psychological factors—a nudge that boosts savings in makers, and society at large must grapple with.
one narrow context is easily swamped by the operation of psychological In summary, the empirical literature on nudging has established that
factors driving overspending in all other contexts that an individual choice architecture techniques can succeed in changing behavior in
encounters. Nor does the observation that the effects of nudges tend to many managerial and policy-relevant settings. This paper has outlined
be modest imply that managers and policy makers should abandon the future directions that research in this area should pursue in order to
choice architecture approach to influencing behavior. No single policy make nudges part of the standard toolkit of managers and policy makers.
or intervention should be expected to resolve a major societal problem
on its own—substantial progress requires many policies and in­ Acknowledgments
terventions with modest effects all pushing in the right direction.
Furthermore, nudges may have modest effects, but they also have small This research was supported by the National Institutes of Health


(grant P30AG034532), the Pershing Square Fund for Research on the Foundations of Human Behavior, and Harvard Business School. We thank Max Bazerman, James Choi, Francesca Gino, Brian Hall, David Laibson, George Loewenstein, Brigitte Madrian, Deepak Malhotra, Kathleen McGinn, Katherine Milkman, Mario Small, three anonymous reviewers, and participants in the WW Roundtable Discussion on Creating Habit Formation for Healthy Behaviors, hosted by the Center for Health Incentives and Behavioral Economics and the Behavior Change for Good Initiative, for helpful comments. We are grateful for the research assistance of Alicia Zhang. Beshears has received additional grant support from the TIAA Institute and the National Employment Savings Trust (NEST); is a TIAA Institute Fellow; has received research data from Alight Solutions, Voya Financial, and the Commonwealth Bank of Australia; and is an advisor to and equity holder in Nutmeg Saving and Investment, a robo-advice asset management company. See his website for a complete list of outside activities. The views expressed here are those of the authors and do not reflect the views or position of any agency of the federal government, Harvard University, or the National Bureau of Economic Research.

Appendix A. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.obhdp.2020.09.001.

References

Afriat, S. N. (1967). The construction of utility functions from expenditure data. International Economic Review, 8, 67–77.
Aleksovska, M., Schillemans, T., & Grimmelikhuijsen, S. (2019). Lessons from five decades of experimental and behavioral research on accountability: A systematic literature review. Journal of Behavioral Public Administration, 2(2), 1–18.
Allcott, H. (2011). Social norms and energy conservation. Journal of Public Economics, 95, 1082–1095.
Allcott, H., & Kessler, J. B. (2019). The welfare effects of nudges: A case study of energy use social comparisons. American Economic Journal: Applied Economics, 11, 236–276.
Allcott, H., & Rogers, T. (2014). The short-run and long-run effects of behavioral interventions: Experimental evidence from energy conservation. American Economic Review, 104, 3003–3037.
Altmann, S., & Traxler, C. (2014). Nudges at the dentist. European Economic Review, 72, 19–38.
Angeletos, G.-M., Laibson, D., Repetto, A., Tobacman, J., & Weinberg, S. (2001). The hyperbolic consumption model: Calibration, simulation, and empirical evaluation. Journal of Economic Perspectives, 15(3), 47–68.
Ascarza, E., Iyengar, R., & Schleicher, M. (2016). The perils of proactive churn prevention using plan recommendations: Evidence from a field experiment. Journal of Marketing Research, 53, 46–60.
Augenblick, N., Niederle, M., & Sprenger, C. (2015). Working over time: Dynamic inconsistency in real effort tasks. Quarterly Journal of Economics, 130, 1067–1115.
Ayal, S., Celse, J., & Hochman, G. (2020). Crafting messages to fight dishonesty: A field investigation of the effects of social norms and watching eye cues on fare evasion. Organizational Behavior and Human Decision Processes (in press). https://doi.org/10.1016/j.obhdp.2019.10.003.
Baca-Motes, K., Brown, A., Gneezy, A., Keenan, E. A., & Nelson, L. D. (2013). Commitment and behavior change: Evidence from the field. Journal of Consumer Research, 39, 1070–1084.
Baker, M., & Wurgler, J. (2013). Behavioral corporate finance: An updated survey. In G. M. Constantinides, M. Harris, & R. M. Stulz (Eds.), Handbook of the economics of finance (Vol. 2, Part A, pp. 357–424). Amsterdam: Elsevier Press.
Bassett, W. F., Fleming, M. J., & Rodrigues, A. P. (1998). How workers use 401(k) plans: The participation, contribution, and withdrawal decisions. National Tax Journal, 51, 263–289.
Bazerman, M. H., & Moore, D. A. (2012). Judgment in managerial decision making (8th ed.). Hoboken, NJ: Wiley.
Benartzi, S., Beshears, J., Milkman, K. L., Sunstein, C. R., Thaler, R. H., Shankar, M., … Galing, S. (2017). Should governments invest more in nudging? Psychological Science, 28, 1041–1055.
Benartzi, S., & Thaler, R. H. (1995). Myopic loss aversion and the equity premium puzzle. Quarterly Journal of Economics, 110, 73–92.
Bernheim, B. D., Fradkin, A., & Popov, I. (2015). The welfare economics of default options in 401(k) plans. American Economic Review, 105, 2798–2837.
Bernheim, B. D., & Rangel, A. (2009). Beyond revealed preference: Choice-theoretic foundations for behavioral welfare economics. Quarterly Journal of Economics, 124, 51–104.
Beshears, J. (2016a). Evive Health and workplace influenza vaccinations. Harvard Business School case 916-044.
Beshears, J. (2016b). Evive Health and workplace influenza vaccinations. Harvard Business School teaching note 916-049.
Beshears, J. (2016c). Express Scripts: Promoting prescription drug home delivery (A) and (B). Harvard Business School teaching note 916-047.
Beshears, J., Choi, J. J., Clayton, C., Harris, C., Laibson, D., & Madrian, B. C. (2019). Optimal illiquidity. Working paper.
Beshears, J., Choi, J. J., Hurwitz, J., Laibson, D., & Madrian, B. C. (2015). Liquidity in retirement savings systems: An international comparison. American Economic Review Papers and Proceedings, 105, 420–425.
Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. C. (2006). Retirement saving: Helping employees help themselves. Milken Institute Review, 8(3), 30–39.
Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. C. (2008a). How are preferences revealed? Journal of Public Economics, 92, 1787–1794.
Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. C. (2008b). The importance of default options for retirement saving outcomes: Evidence from the United States. In S. J. Kay, & T. Sinha (Eds.), Lessons from pension reform in the Americas (pp. 59–87). Oxford: Oxford University Press.
Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. C. (2010). The impact of employer matching on savings plan participation under automatic enrollment. In D. A. Wise (Ed.), Research findings in the economics of aging (pp. 311–327). Chicago, IL: University of Chicago Press.
Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. C. (2011). How does simplified disclosure affect individuals' mutual fund choices? In D. A. Wise (Ed.), Explorations in the economics of aging (pp. 75–96). Chicago, IL: University of Chicago Press.
Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. C. (2013). Simplification and saving. Journal of Economic Behavior and Organization, 95, 130–145.
Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. C. (2017a). Does aggregated returns disclosure increase portfolio risk taking? Review of Financial Studies, 30, 1971–2005.
Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. C. (2017b). Does front-loading taxation increase savings? Evidence from Roth 401(k) introductions. Journal of Public Economics, 151, 84–95.
Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. C. (2018). Behavioral household finance. In B. D. Bernheim, S. DellaVigna, & D. Laibson (Eds.), Handbook of behavioral economics – Foundations and applications 1 (Vol. 1, pp. 177–276). Amsterdam: Elsevier Press.
Beshears, J., Choi, J. J., Harris, C., Laibson, D., Madrian, B. C., & Sakong, J. (2020). Which early withdrawal penalty attracts the most deposits to a commitment savings account? Journal of Public Economics, 183, article 104144.
Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. C. (2020). Active choice, implicit defaults, and the incentive to choose. Organizational Behavior and Human Decision Processes (in press).
Beshears, J., Choi, J. J., Laibson, D., Madrian, B. C., & Milkman, K. L. (2015). The effect of providing peer information on retirement savings decisions. Journal of Finance, 70, 1161–1201.
Beshears, J., Choi, J. J., Laibson, D., Madrian, B. C., & Reynolds, G. I. (2016). Vaccination rates are associated with functional proximity but not base proximity of vaccination clinics. Medical Care, 54, 578–583.
Beshears, J., Choi, J. J., Laibson, D., Madrian, B. C., & Skimmyhorn, W. L. (2019). Borrowing to save? The impact of automatic enrollment on debt. Working paper.
Beshears, J., Choi, J. J., Laibson, D., Madrian, B. C., & Weller, B. (2010). Public policy and saving for retirement: The "autosave" features of the Pension Protection Act of 2006. In J. J. Siegfried (Ed.), Better living through economics (pp. 274–290). Cambridge, MA: Harvard University Press.
Beshears, J., Choi, J. J., Laibson, D., Madrian, B. C., & Zeldes, S. P. (2014). What makes annuitization more appealing? Journal of Public Economics, 116, 2–16.
Beshears, J., Dai, H., Milkman, K. L., & Benartzi, S. (2019). Using fresh starts to nudge increased retirement savings. Working paper.
Beshears, J., & Gino, F. (2015). Leaders as decision architects: Structure your organization's work to encourage wise choices. Harvard Business Review, 93(5), 52–62.
Beshears, J., Gino, F., Lee, J., & Wang, S. (2016). T-Mobile in 2013: The Un-Carrier. Harvard Business School case 916-043.
Beshears, J., Lee, H. N., Milkman, K. L., Mislavsky, R., & Wisdom, J. (2019). Creating exercise habits using incentives: The tradeoff between flexibility and routinization. Working paper.
Beshears, J., & Milkman, K. L. (2011). Do sell-side stock analysts exhibit escalation of commitment? Journal of Economic Behavior and Organization, 77, 304–317.
Beshears, J., Milkman, K. L., & Schwartzstein, J. (2016). Beyond beta-delta: The emerging economics of personal plans. American Economic Review Papers and Proceedings, 106, 430–434.
Beshears, J., Rooney, P., & Sanford, J. (2016a). Express Scripts: Promoting prescription drug home delivery (A). Harvard Business School case 916-026.
Beshears, J., Rooney, P., & Sanford, J. (2016b). Express Scripts: Promoting prescription drug home delivery (B). Harvard Business School case 916-040.
Bettinger, E. P., Long, B. T., Oreopoulos, P., & Sanbonmatsu, L. (2012). The role of application assistance and information in college decisions: Results from the H&R Block FAFSA experiment. Quarterly Journal of Economics, 127, 1205–1242.
Bhargava, S., & Loewenstein, G. (2015). Behavioral economics and public policy 102: Beyond nudging. American Economic Review Papers and Proceedings, 105, 396–401.
Bicchieri, C., & Dimant, E. (2019). Nudging with care: The risks and benefits of social information. Public Choice.
Bolton, G., Dimant, E., & Schmidt, U. (2020). When a nudge backfires: Combining (im)plausible deniability with social and economic incentives to promote behavioral change. CESifo Working Paper No. 8070.
Bordalo, P., Gennaioli, N., & Shleifer, A. (2012). Salience theory of choice under risk. Quarterly Journal of Economics, 127, 1243–1285.
Bott, K. M., Cappelen, A. W., Sørensen, E.Ø., & Tungodden, B. (2020). You've got mail: A randomized field experiment on tax evasion. Management Science, 66, 2801–2819.


Brandon, A., Ferraro, P. J., List, J. A., Metcalfe, R. D., Price, M. K., & Rundhammer, F. (2017). Do the effects of social nudges persist? Theory and evidence from 38 natural field experiments. Working paper.
Brown, Z., Johnstone, N., Haščič, I., Vong, L., & Barascud, F. (2013). Testing the effect of defaults on the thermostat settings of OECD employees. Energy Economics, 39, 128–134.
Bushong, B., Rabin, M., & Schwartzstein, J. (2019). A model of relative thinking. Working paper.
Busse, M. R., Pope, D. G., Pope, J. C., & Silva-Risso, J. (2015). The psychological effect of weather on car purchases. Quarterly Journal of Economics, 130, 371–414.
Camerer, C., & Lovallo, D. (1999). Overconfidence and excess entry: An experimental approach. American Economic Review, 89, 306–318.
Camerer, C., Babcock, L., Loewenstein, G., & Thaler, R. (1997). Labor supply of New York City cabdrivers: One day at a time. Quarterly Journal of Economics, 112, 407–441.
Camerer, C., Issacharoff, S., Loewenstein, G., O'Donoghue, T., & Rabin, M. (2003). Regulation for conservatives: Behavioral economics and the case for "asymmetric paternalism". University of Pennsylvania Law Review, 151, 1211–1254.
Center for Retirement Initiatives. (2019). State-facilitated retirement savings programs: A snapshot of program design features. Washington, DC: Georgetown University.
Chen, Y., Harper, F. M., Konstan, J., & Li, S. X. (2010). Social comparisons and contributions to online communities: A field experiment on MovieLens. American Economic Review, 100, 1358–1398.
Chernev, A., Böckenholt, U., & Goodman, J. (2015). Choice overload: A conceptual review and meta-analysis. Journal of Consumer Psychology, 25, 333–358.
Chetty, R., Looney, A., & Kroft, K. (2009). Salience and taxation: Theory and evidence. American Economic Review, 99, 1145–1177.
Choi, J. J., Laibson, D., Madrian, B. C., & Metrick, A. (2002). Defined contribution pensions: Plan rules, participant decisions, and the path of least resistance. In J. M. Poterba (Ed.), Tax policy and the economy (Vol. 16, pp. 67–114). Cambridge, MA: MIT Press.
Choi, J. J., Laibson, D., Madrian, B. C., & Metrick, A. (2004). For better or for worse: Default effects and 401(k) savings behavior. In D. A. Wise (Ed.), Perspectives on the economics of aging (pp. 81–121). Chicago, IL: University of Chicago Press.
Cialdini, R. B., & Goldstein, N. J. (2004). Social influence: Compliance and conformity. Annual Review of Psychology, 55, 591–621.
Coffman, L. C., Featherstone, C. R., & Kessler, J. B. (2017). Can social information affect what job you choose and keep? American Economic Journal: Applied Economics, 9, 96–117.
Dai, H., Milkman, K. L., & Riis, J. (2014). The fresh start effect: Temporal landmarks motivate aspirational behavior. Management Science, 60, 2563–2582.
Dai, J., Milkman, K. L., Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. C. (2012). Planning prompts as a means of increasing rates of immunization and preventive screening. Public Policy and Aging Report, 22(4), 16–19.
DellaVigna, S. (2009). Psychology and economics: Evidence from the field. Journal of Economic Literature, 47, 315–372.
DellaVigna, S., & Linos, E. (2020). RCTs to scale: Comprehensive evidence from two nudge units. Working paper.
DellaVigna, S., & Malmendier, U. (2006). Paying not to go to the gym. American Economic Review, 96, 694–719.
Dimant, E., van Kleef, G. A., & Shalvi, S. (2020). Requiem for a nudge: Framing effects in nudging honesty. Journal of Economic Behavior & Organization, 172, 247–266.
Engelhardt, G. V., & Kumar, A. (2007). Employer matching and 401(k) saving: Evidence from the Health and Retirement Study. Journal of Public Economics, 91, 1920–1943.
Even, W. E., & Macpherson, D. A. (2005). The effects of employer matching in 401(k) plans. Industrial Relations, 44, 525–549.
Fehr, E., & Goette, L. (2007). Do workers work more if wages are high? Evidence from a randomized field experiment. American Economic Review, 97, 298–317.
Gabaix, X. (2014). A sparsity-based model of bounded rationality. Quarterly Journal of Economics, 129, 1661–1710.
Gabaix, X. (2019). Behavioral inattention. In B. D. Bernheim, S. DellaVigna, & D. Laibson (Eds.), Handbook of behavioral economics – Foundations and applications 2 (Vol. 2, pp. 261–343). Amsterdam: Elsevier Press.
Gerber, A. S., & Rogers, T. (2009). Descriptive social norms and motivation to vote: Everybody's voting and so should you. Journal of Politics, 71, 178–191.
Gneezy, U., & Potters, J. (1997). An experiment on risk taking and evaluation periods. Quarterly Journal of Economics, 112, 631–645.
Goldstein, N. J., Cialdini, R. B., & Griskevicius, V. (2008). A room with a viewpoint: Using social norms to motivate environmental conservation in hotels. Journal of Consumer Research, 35, 472–482.
Hagmann, D., Ho, E. H., & Loewenstein, G. (2019). Nudging out support for a carbon tax. Nature Climate Change, 9, 484–489.
Hallsworth, M., List, J. A., Metcalfe, R. D., & Vlaev, I. (2017). The behavioralist as tax collector: Using natural field experiments to enhance tax compliance. Journal of Public Economics, 148, 14–31.
Halpern, D. (2015). Inside the nudge unit: How small changes can make a big difference. New York, NY: Random House.
Hauser, O. P., Gino, F., & Norton, M. I. (2018). Budging beliefs, nudging behavior. Mind & Society, 17, 15–26.
Heidhues, P., & Koszegi, B. (2018). Behavioral industrial organization. In B. D. Bernheim, S. DellaVigna, & D. Laibson (Eds.), Handbook of behavioral economics – Foundations and applications 1 (Vol. 1, pp. 517–612). Amsterdam: Elsevier Press.
Jachimowicz, J. M., Duncan, S., Weber, E. U., & Johnson, E. J. (2019). When and why defaults influence decisions: A meta-analysis of default effects. Behavioural Public Policy, 3, 159–186.
Johnson, E. J., Shu, S. B., Dellaert, B. G. C., Fox, C., Goldstein, D. G., Häubl, G., … Weber, E. U. (2012). Beyond nudges: Tools of a choice architecture. Marketing Letters, 23, 487–504.
Kahneman, D. (2011). Thinking, fast and slow. New York, NY: Farrar, Straus and Giroux.
Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgment of representativeness. Cognitive Psychology, 3, 430–454.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263–292.
Koszegi, B., & Rabin, M. (2006). A model of reference-dependent preferences. Quarterly Journal of Economics, 121, 1133–1165.
Lacetera, N., Pope, D. G., & Sydnor, J. R. (2012). Heuristic thinking and limited attention in the car market. American Economic Review, 102, 2206–2236.
Laibson, D. (1997). Golden eggs and hyperbolic discounting. Quarterly Journal of Economics, 112, 443–478.
Lerner, J. S., Small, D. A., & Loewenstein, G. (2004). Heart strings and purse strings: Carryover effects of emotions on economic decisions. Psychological Science, 15, 337–341.
Loewenstein, G., & Chater, N. (2017). Putting nudges in perspective. Behavioural Public Policy, 1, 26–53.
Loewenstein, G., & Prelec, D. (1992). Anomalies in intertemporal choice: Evidence and an interpretation. Quarterly Journal of Economics, 107, 573–597.
Ly, K., Mažar, N., Zhao, M., & Soman, D. (2013). A practitioner's guide to nudging. University of Toronto Rotman School of Management Research Report Series.
Madrian, B. C., & Shea, D. F. (2001). The power of suggestion: Inertia in 401(k) participation and savings behavior. Quarterly Journal of Economics, 116, 1149–1187.
Malmendier, U. (2018). Behavioral corporate finance. In B. D. Bernheim, S. DellaVigna, & D. Laibson (Eds.), Handbook of behavioral economics – Foundations and applications 1 (Vol. 1, pp. 277–380). Amsterdam: Elsevier Press.
Malmendier, U., & Tate, G. (2005). CEO overconfidence and corporate investment. Journal of Finance, 60, 2661–2700.
McKenzie, C. R. M., & Liersch, M. J. (2011). Misunderstanding savings growth: Implications for retirement savings behavior. Journal of Marketing Research, 48, S1–S13.
McKenzie, C. R. M., Liersch, M. J., & Finkelstein, S. R. (2006). Recommendations implicit in policy defaults. Psychological Science, 17, 414–420.
Milkman, K. L., & Beshears, J. (2009). Mental accounting and small windfalls: Evidence from an online grocer. Journal of Economic Behavior and Organization, 71, 384–394.
Milkman, K. L., Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. C. (2013). Planning prompts as a means of increasing preventive screening rates. Preventive Medicine, 56, 92–93.
Milkman, K. L., Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. C. (2011). Using implementation intentions prompts to enhance influenza vaccination rates. Proceedings of the National Academy of Sciences of the United States of America, 108, 10415–10420.
Munnell, A. H., Hou, W., & Sanzenbacher, G. T. (2018). National Retirement Risk Index shows modest improvement in 2016. Center for Retirement Research at Boston College Brief, 18–21.
O'Donoghue, T., & Rabin, M. (1999). Doing it now or later. American Economic Review, 89, 103–124.
O'Donoghue, T., & Rabin, M. (2001). Choice and procrastination. Quarterly Journal of Economics, 116, 121–160.
Odean, T. (1998). Are investors reluctant to realize their losses? Journal of Finance, 53, 1775–1798.
Odean, T. (1999). Do investors trade too much? American Economic Review, 89, 1279–1298.
OECD Research (2018). Behavioural insights and public policy: Institutions applying BI to public policy around the world. OECD website, https://www.oecd.org/gov/regulatory-policy/behavioural-insights.htm. Accessed January 21, 2020.
Papke, L. E. (1995). Participation in and contributions to 401(k) pension plans. Journal of Human Resources, 30, 311–325.
Plan Sponsor Council of America. (2018). 60th annual survey of profit sharing and 401(k) plans. Chicago, IL: Plan Sponsor Council of America.
Rabin, M. (1998). Psychology and economics. Journal of Economic Literature, 36, 11–46.
Samuelson, W., & Zeckhauser, R. (1988). Status quo bias in decision making. Journal of Risk and Uncertainty, 1, 7–59.
Schäfer, T., & Schwarz, M. A. (2019). The meaningfulness of effect sizes in psychological research: Differences between sub-disciplines and the impact of potential biases. Frontiers in Psychology, 10.
Shefrin, H. M., & Thaler, R. H. (1988). The behavioral life-cycle hypothesis. Economic Inquiry, 26, 609–643.
Simon, H. A. (1955). A behavioral model of rational choice. Quarterly Journal of Economics, 69, 99–118.
Simonsohn, U., Nelson, L. D., & Simmons, J. P. (2014). P-curve: A key to the file-drawer. Journal of Experimental Psychology: General, 143, 534–547.
Sparkman, G., & Walton, G. M. (2017). Dynamic norms promote sustainable behavior, even if it is counternormative. Psychological Science, 28, 1663–1674.
Thaler, R. H. (1994). Psychology and savings policies. American Economic Review Papers and Proceedings, 84, 186–192.
Thaler, R. H., & Benartzi, S. (2004). Save more tomorrow: Using behavioral economics to increase employee saving. Journal of Political Economy, 112, S164–S187.
Thaler, R. H., & Shefrin, H. M. (1981). An economic theory of self-control. Journal of Political Economy, 89, 392–406.
Thaler, R. H., & Sunstein, C. R. (2003). Libertarian paternalism. American Economic Review Papers and Proceedings, 93, 175–179.
Thaler, R. H., & Sunstein, C. R. (2008). Nudge: Improving decisions about health, wealth, and happiness. New Haven, CT: Yale University Press.


Thaler, R. H., Tversky, A., Kahneman, D., & Schwartz, A. (1997). The effect of myopia and loss aversion on risk taking: An experimental test. Quarterly Journal of Economics, 112, 647–661.
Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5, 207–232.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124–1131.
Tversky, A., & Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211, 453–458.
Wisdom, J., Downs, J. S., & Loewenstein, G. (2010). Promoting healthy choices: Information versus convenience. American Economic Journal: Applied Economics, 2(2), 164–178.
Wood, W., & Runger, D. (2016). Psychology of habit. Annual Review of Psychology, 67, 289–314.
