
A MANIFESTO

for Applying Behavioral Science

Michael Hallsworth

The Behavioural Insights Team / A Manifesto for Applying Behavioral Science

ACKNOWLEDGMENTS
Thanks to Lila Tublin for drafting and editing support, Richard O’Brien for
communications support, Dilhan Perera for thoughts on the “Predict and Adjust”
section, and Alex Gyani for his input on the “Replication, Variation, Adaptation”
section.
Thanks to S. Banerjee, E. Berkman, A. Buttenheim, F. Callaway, J. Collins, J. Doctor,
D. Halpern, P. John, S. Kousta, T. Marteau, M. Muthukrishna, C. Payne, D. Perrott,
K. Ruggeri, R. Schmidt, D. Soman, H. Strassheim, C. Sunstein, and members of the
Behavioural Insights Team for their feedback on previous drafts. I am grateful to the
four peer reviewers at Nature Human Behaviour for their helpful suggestions.

SUMMARY
This is a manifesto for how applied behavioral science can fulfill
its true potential.
The behavioral insights movement has flourished over the last decade.1
There is now a vibrant ecosystem of practitioners, teams, and academics
building on each other’s work across the globe. Their focus on robust
evaluation means we know that this work has had an impact on
important issues such as antimicrobial resistance, educational attainment,
climate change, and obesity.
The Behavioural Insights Team is proud to have been a pioneer of this
growth. However, we and others in the field also realize that behavioral
science needs to evolve further over its next decade.
In this manifesto we take a clear-eyed look at the challenges facing the
field and offer ten proposals for making further progress. As a starting
point, we present the main arguments from critics of the behavioral
insights approach on the following page.

THE CRITICISMS
Limited impact. The approach has focused on more tractable and easy-to-measure changes at the expense of bigger impact: it has just been tinkering around the edges of fundamental problems.2

Failure to reach scale. The approach promotes a model of experimentation followed by scaling, but it has not paid enough attention to how successful scaling happens - and the fact it often does not happen.3

Mechanistic thinking. The approach has promoted a simple, linear, and mechanistic way of understanding and influencing behavior that ignores second-order effects and spillovers (and employs evaluation methods that assume a move from A to B against a static background).4

Flawed evidence base. The replication crisis has challenged the evidence base underpinning the behavioral insights approach, adding to existing concerns like the duration of its interventions' effects.5

Lack of precision. The approach lacks the ability to construct precise interventions and establish what works for whom, and when. Instead, it relies either on over-general frameworks or disconnected lists of biases.6

Overconfidence. The approach is affected by the wider problem of overconfidence and can over-extrapolate from its evidence base, particularly when testing is not an option.7

Control paradigm. The approach can be elitist and pays insufficient attention to people's own goals and strategies; it uses concepts like "irrationality" to justify attempts to control the behavior of individuals, since they lack the means to do so themselves.8

Neglect of the social context. The approach has a limited, overly cognitive and individualistic view of behavior that neglects the reality that humans are embedded in established societies and practices.9

Ethical concerns. The behavioral insights approach will face more ethics, transparency, and privacy conundrums as it attempts more ambitious and innovative work.10

Homogeneity of participants and perspectives. The range of participants in behavioral science research has been narrow and unrepresentative;11 homogeneity in the locations and personal characteristics of behavioral scientists influences their viewpoints, practices, and theories.12
The Behavioural Insights Team / A Manifesto for Applying Behavioral Science 5

THE PROPOSALS
We do not agree with all these criticisms, but we do think that they highlight several challenges that
must - and can - be met. Doing so will mean behavioral science is better equipped to help build
policies, products, and services on stronger empirical foundations - and thereby address the world’s
crucial challenges.

Our ten proposals for applied behavioral science fall into three categories: scope (the range and
scale of issues to which behavioral science is applied); methods (the techniques and resources that
behavioral science deploys); and values (the principles, ideals, and standards of conduct that
behavioral scientists adopt).

Category Proposal Recommended action(s)

Scope

01 USE BEHAVIORAL SCIENCE AS A LENS (Summary: Page 11; Detail: Page 25)
Present behavioral science as a lens that improves the view of any public and private issue, in order to break a self-sustaining pattern that has directed behavioral science away from the most significant problems.

02 BUILD BEHAVIORAL SCIENCE INTO ORGANIZATIONS (Summary: Page 12; Detail: Page 29)
Focus less on how to set up a dedicated behavioral science team, and more on how the approach can be integrated into an organization's standard processes by upgrading its "choice infrastructure".

03 SEE THE SYSTEM (Summary: Page 14; Detail: Page 37)
Use aspects of complexity thinking to improve behavioral science so it can: exploit "leverage points"; model the collective implications of heuristics; alter specific features of systems to create wider changes; and understand the longer-term impact on a system of a collection of policies with varying goals.

Methods

04 PUT RCTs IN THEIR PLACE (Summary: Page 15; Detail: Page 45)
Strengthen RCTs to deal better with complexity by: gaining a better understanding of system interactions and anticipating how they may play out; setting up RCTs to measure diffusion and contagion in networks; and building feedback and adaptation into the design of RCTs and interventions.

05 REPLICATION, VARIATION AND ADAPTATION (Summary: Page 16; Detail: Page 51)
Identify the most reliable interventions, develop an accurate sense of the likely size of their effects, and avoid the weaker options. Recognize that heterogeneity requires a much higher bar for claiming that an effect holds true across many unspecified settings. Create multi-site studies to systematically study heterogeneity in a wider range of contexts and participants. Codify and cultivate the practical skills that successfully adapt interventions to new contexts.

06 BEYOND LISTS OF BIASES (Summary: Page 17; Detail: Page 61)
Emphasize theories that are "practical": they fill the gap between high-level frameworks and jumbled lists of biases; they are based on data and generate testable hypotheses, but also specify the conditions under which a prediction applies; they present actionable steps to solve real-world problems.

07 PREDICT AND ADJUST (Summary: Page 18; Detail: Page 69)
Develop the practice of getting behavioral scientists to predict the results of experiments, and then feeding back the results to them.

Values

08 BE HUMBLE, EXPLORE AND ENABLE (Summary: Page 18; Detail: Page 75)
Avoid using the term "irrationality"; practice "epistemic humility"; and design processes and institutions to counteract overconfidence. Pay greater attention to people's own interpretations of their beliefs, feelings and behaviors. Reach a wider range of experiences, including marginalized voices and communities. Recognize how apparently universal cognitive processes are shaped by specific contexts. Use six criteria (detailed in the main text) to assess when to enable people to use behavioral science themselves.

09 DATA SCIENCE FOR EQUITY (Summary: Page 20; Detail: Page 85)
Use data science to identify the ways in which an intervention or situation appears to increase inequalities and introduce features to reduce them. For example, groups that are particularly likely to miss a filing requirement could be offered pre-emptive help.

10 NO VIEW FROM NOWHERE (Summary: Page 21; Detail: Page 91)
Cultivate self-scrutiny; find new ways for the subjects of research to judge researchers; take actions to increase diversity among behavioral scientists and their teams, such as building professional networks between the Global North and Global South.

The figure below shows how each proposal maps onto the criticisms, as well as which groups have responsibility for implementing them: practitioners (individuals or teams who apply behavioral science findings
in practical settings); the clients who commission these practitioners (for example, public or private sector
organizations); academics working in the behavioral sciences (including disciplines such as anthropology,
economics, and sociology); and funders who support the work of these academics.

CRITICISM / PROPOSAL (responsible actors: practitioners, clients, academics, funders)

Scope
- Limited impact: Use behavioral science as a lens
- Failure to reach scale: Build behavioral science into organizations
- Mechanistic thinking: See the system

Methods
- Flawed evidence base: Put RCTs in their place
- Lack of precision: Replication, variation, adaptation
- Overconfidence: Beyond lists of biases
- Control paradigm: Predict and adjust

Values
- Neglect of the social context: Be humble, explore and enable
- Ethical concerns: Data science for equity
- Homogeneity of participants and perspectives: No "view from nowhere"

SCOPE

01
USE BEHAVIORAL
SCIENCE AS A LENS
The early phase of the behavioral insights movement was marked by skepticism about whether findings from laboratories would translate to real-world settings.13 Mindful of this concern, practitioners developed standard approaches that could demonstrate a clear causal link between an intervention and an outcome.14

In practice, these approaches directed attention towards how the design of specific aspects of a policy, product or service influences discrete behaviors by actors who are considered mostly in isolation.15 These standard approaches are strong and have produced compelling results. But they have also encouraged people to see behavioral science as a kind of specialist tool. This view mostly limits behavioral science to fixing concrete aspects of predetermined interventions - rather than shaping broader policy goals. Behavioral science acts as an alternative to standard tools, and it should be applied only to certain kinds of "behavioral" issues.16

Such a view is both misguided and profoundly limiting, but over time it has created a self-reinforcing perception that only certain kinds of tasks are "suitable" for behavioral scientists.17 Opportunities, skills and ambitions have been constricted as a result.

A rebalancing is needed. Behavioral science also has much to say about pressing societal issues like discrimination, pollution, or economic mobility, and the structures that produce them.18 These ambitions have always been present in the behavioral insights movement,19 but the factors just outlined acted against them being realized more fully.20

The first step is to change the way we frame behavioral science itself. We need to see behavioral science as a lens that can be applied to any public and private issue. Using this frame shows that behavioral insights can enhance the way we see policy options (for example, revealing new ways of structuring taxes), rather than just acting as an alternative to them; it also conveys that creating new interventions to change behavior is not always the goal - which means more weight should be placed on the behavioral diagnosis of an issue. Behavioral science itself shows us the power of framing: the metaphors we use shape the way we behave, and therefore can be an agent of change.21 Metaphors are particularly important in this case because the task of broadening the use of behavioral science requires making a compelling case to decision makers.22 The metaphor of behavioral insights as a tool has established credibility and acceptance in a defined area; expanding beyond that area is the task for the next decade.

Full detail on page 25

SCOPE

02
BUILD BEHAVIORAL
SCIENCE INTO
ORGANIZATIONS
There has been too little focus on using behavioral science to shape organizations themselves, as opposed to increasing how much an organization uses behavioral science to achieve its goals.23 We need to talk less about how to set up a dedicated behavioral team, and more about how behavioral science can be integrated into an organization's standard processes. For example, as well as trying to ensure that a departmental budget includes provisions for behavioral science, why not use behavioral science to improve the way this budget is created (e.g., are managers anchored to outdated spending assumptions)?

But we need to understand how this new way of thinking maps against existing debates about how to set up a behavioral function in organizations. We propose that doing so reveals six main scenarios, as shown in the diagram below.

BEHAVIORAL SCIENCE KNOWLEDGE AND CAPACITY (greater potential for scale from left to right)

                                   LIMITED              CONCENTRATED             DIFFUSED
BEHAVIORAL SCIENCE INCORPORATED
INTO ORGANIZATIONAL PROCESSES?
  NO                               Baseline             Proactive consultancy    Behavioral entrepreneurs
  YES                              Nudged organization  "Call for the experts"   Behaviorally-enabled organization

In the "Baseline" scenario there is limited awareness of behavioral science in the organization, and its principles are not incorporated into processes. In the "Nudged Organization," levels of behavioral science awareness are still low, but its principles have been used to redesign processes to create better outcomes for staff or service users. No explicit behavioral science knowledge or capacity is created or needed, which means the return on investment here could be large. For that reason, this model feels like a neglected opportunity.

In "Proactive Consultancy", leaders may have set up a dedicated behavioral team without enough supporting organizational changes. The result is that the team has to work in an enterprising way, going to look for opportunities and having to prove its worth. But these teams may not be in a resilient position, since they lack ways to be grafted onto the standard processes of an organization.

In "Call For The Experts", an organization has concentrated behavioral expertise, but there are also prompts and resources that allow this expertise to be integrated more into "business as usual". Expertise is not widespread, but access to it is. This setup could mean that processes stimulate demand for behavioral expertise that the central team can fulfill. That team may also have the institutional support to proactively monitor activities and respond quickly to specific crises.

In "Behavioral Entrepreneurs", there is behavioral science capacity distributed throughout the organization, either through direct capacity building or recruitment. The problem is that organizational processes do not support these individual pockets of knowledge. Therefore, those with expertise find it hard to apply ideas in practice, evaluate their effects, share findings, and build learning.

Finally, a "Behaviorally-Enabled Organization" is one where knowledge of behavioral science is diffused throughout the organization, which also has processes that reflect this knowledge and support its deployment. Staff apply behavioral science in a deliberate way as part of "business as usual", rather than as special projects. While this is the most resilient setup, it also requires the most resources.

Most discussions make it seem like the meaningful choice is between the different columns in the table above - how to organize dedicated behavioral science resources. Instead, the more important move is from the top row to the bottom row: moving from projects to processes, from commissions to culture. A useful way of thinking about this task is as building or upgrading the "choice infrastructure" of the organization.24

Working out how best to build the choice infrastructure in organizations should be a major priority for behavioral science. One advantage of this approach is that it can help organizations address problems with scaling interventions. Already we can see some features will be crucial: reducing the costs of experimentation; creating a system that can learn from its actions; and developing new and better ways of using behavioral science principles to analyze the behavioral effects of organizational processes, rules, incentives, metrics, and guidelines.25

Full detail on page 29

SCOPE

03
SEE THE SYSTEM
Many important policy challenges emerge from complex adaptive systems, where change often does not happen in a linear or easily predictable way, and where coherent behavior can emerge from interactions without top-down direction.26 There are many examples of such systems in human societies, including cities, markets, and political movements.27 These systems can create "wicked problems" - like the Covid-19 pandemic - where ideas of success are contested, changes are non-linear and difficult to model, and policies have unintended consequences.28

This reality challenges the dominant behavioral science approach, which usually assumes stability over time, keeps a tight focus on predefined target behaviors, and predicts linear effects based on a predetermined theory of change.29 The end result, some argue, is a failure to understand how actors are acting and reacting in a complex system that leads policymakers to conclude they are being "irrational" - and then actually disrupt the system in misguided attempts to correct perceived biases.30

Behavioral science can be improved by using aspects of complexity thinking to offer new, credible, and practical ways of addressing major policy issues. First, we need to reject crude distinctions of "upstream" versus "downstream" or the "individual frame" versus the "system frame".31 Instead, complex adaptive systems show that "higher-level" features of a system can actually emerge from the "lower-level" interactions of actors participating in the system.32 When they become the governing features of the system, they then shape the "lower-level" behavior until some other aspect emerges, and the fluctuations continue. We can see this pattern in the way that new coronavirus variants emerged from specific contexts to re-shape the whole course of the pandemic.

In other words, we are dealing with 'cross-scale behaviors'.33 For example, norms, rules, practices, and culture itself can emerge from aggregated social interactions; these features then shape cognition and behavioral patterns in turn.34

Recognizing cross-scale behaviors means that behavioral science could:

• Identify "leverage points" where a specific shift in behavior will produce wider system effects. For example, if even a subset of consumers decides to switch to a healthier version of a food product, this can have broader effects on a population's consumption through the way the food system responds by restocking and product reformulation.35

• Model the collective implications of individuals using simple heuristics to navigate a system. For example, new models show how small changes to simple heuristics that guide savings (in this case, how quickly households copy the savings behaviors of neighbors) can lead to inequalities in wealth suddenly emerging.36

• Find targeted changes to features of a system that create the conditions for wide-ranging shifts in behavior to occur. For example, a core driver for social media behaviors is the ease with which information can be shared.37 Even minor changes to this factor can drive widespread changes - some have argued that such a change is what created the conditions leading to the Arab Spring, for example.38

This approach also suggests that a broader change in perspective is needed. We need to realize the flaws in launching interventions in isolation and then moving on when a narrowly defined goal has been achieved. Instead, we need to see the longer-term impact on a system of a collection of policies with varying goals.39 The best approach may be "system stewardship", which focuses on creating the conditions for behaviors and indirectly steering adaptation towards overall goals.40

Of course, not every problem will involve a complex adaptive system; for simple issues the standard behavioral approach works well. So behavioral scientists should develop the skills to recognize the type of system that they are facing ("see the system"), and then choose their approach accordingly. These skills can be developed through agent-based simulations,41 immersive technologies,42 or just basic checklists.43

Full detail on page 37
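The savings example above - that merely changing how quickly households copy their neighbors' savings heuristic can reshape the wealth distribution - is the kind of claim an agent-based simulation can make concrete. The sketch below is our own illustrative toy model, not the model from the cited study (ref. 36): households on a ring save a share of income and partially imitate a random neighbor's savings rate each step, and we compare wealth dispersion (Gini coefficient) under slow versus fast imitation.

```python
import random

def gini(values):
    """Gini coefficient: 0 = perfect equality, 1 = maximal inequality."""
    vals = sorted(values)
    n = len(vals)
    cum = sum((i + 1) * v for i, v in enumerate(vals))
    return (2 * cum) / (n * sum(vals)) - (n + 1) / n

def simulate(copy_rate, n_households=200, steps=100, seed=1):
    """Toy model: households earn a unit income, save at an individual rate,
    and each step move their rate toward a ring neighbour's at `copy_rate`."""
    rng = random.Random(seed)
    rates = [rng.uniform(0.0, 0.2) for _ in range(n_households)]
    wealth = [0.0] * n_households
    for _ in range(steps):
        for i in range(n_households):
            wealth[i] += rates[i] * 1.0                    # save a share of income
            j = (i + rng.choice([-1, 1])) % n_households   # pick a ring neighbour
            rates[i] += copy_rate * (rates[j] - rates[i])  # partial imitation
    return wealth

# Compare wealth dispersion under slow vs. fast imitation of neighbours.
slow = gini(simulate(copy_rate=0.05))
fast = gini(simulate(copy_rate=0.8))
```

Because the dynamics are path-dependent, varying the single `copy_rate` parameter is enough to explore how a micro-level heuristic shapes population-level inequality - the "cross-scale" pattern the section describes.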



METHODS

04
PUT RCTs IN
THEIR PLACE
Randomized Controlled Trials (RCTs) have been a core part of applied behavioral science, and they work very well in relatively simple and stable contexts. But they can fare worse in complex adaptive systems, whose many shifting connections can make it difficult to keep a control group isolated, and where a narrow focus on predetermined outcomes may neglect others that are important but difficult to predict.44

We can strengthen RCTs to deal better with complexity. We can try to gain a better understanding of the system interactions and anticipate how they may play out, perhaps through "dark logic" exercises that try to trace potential harms, rather than benefits.45 We can set up RCTs to measure diffusion and contagion in networks, either by creating separate online environments or by randomizing real-world clusters, like separate villages.46

Finally, we can build feedback and adaptation into the RCT design, allowing adjustments to changing conditions.47 Options include using two-stage trial protocols,48 evolutionary RCTs,49 sequential multiple assignment randomized (SMART) trials,50 and "bandit" algorithms that identify high-performing interventions and allocate more people to them.51

We can also use behavioral science to enhance alternative ways of measuring impact - in particular, agent-based modeling, which tries to simulate the interactions between the different actors in a system.52 The agents in these models are mostly assumed to be operating on rational choice principles.53 Therefore, there is a big opportunity to build in more evidence about the drivers of behavior - for example, habits and social comparisons.54

Full detail on page 45
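The "bandit" idea - shifting allocation toward the arms that are performing best while the trial is still running - can be sketched with a minimal epsilon-greedy simulation. This is a hypothetical illustration, not any specific trial design cited above: the `true_rates` are invented and are used only to simulate outcomes the algorithm cannot see directly.

```python
import random

def epsilon_greedy_trial(true_rates, n_participants=5000, epsilon=0.1, seed=42):
    """Adaptively allocate participants across intervention arms.

    With probability `epsilon` a participant is assigned at random (exploration);
    otherwise they go to the arm with the best observed success rate (exploitation).
    Returns per-arm allocation counts and observed success rates.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    trials = [0] * n_arms
    for _ in range(n_participants):
        if rng.random() < epsilon or 0 in trials:
            arm = rng.randrange(n_arms)  # explore (or force one trial per arm)
        else:
            arm = max(range(n_arms), key=lambda a: successes[a] / trials[a])
        trials[arm] += 1
        successes[arm] += rng.random() < true_rates[arm]  # simulated outcome
    return trials, [s / t for s, t in zip(successes, trials)]

# Three hypothetical message variants with different (unknown) response rates.
allocation, observed = epsilon_greedy_trial([0.10, 0.15, 0.30])
```

Over the run, most participants end up in the strongest arm, which is the appeal for applied work: fewer people receive an inferior intervention while the evaluation is still underway.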

METHODS

05
REPLICATION, VARIATION,
ADAPTATION
The "replication crisis" of the last decade has seen intense debate and concern about the reliability of behavioral science findings. Poor research practices were a major cause of the replication crisis; the good news is that many have improved as a result.55

We need to secure and build on these advances, so that we move towards a future where meta-analyses of high-quality studies (including deliberate replications) are used to identify the most reliable interventions, develop an accurate sense of the likely size of their effects, and avoid the weaker options. We have a responsibility to discard ideas if solid evidence now shows they are shaky, and to offer a realistic view of what behavioral science can accomplish.

That responsibility also requires us to have a hard conversation about heterogeneity in results: the complexity of human behavior creates so much statistical "noise" that it's often hard to detect consistent signals and patterns.56 The main drivers of heterogeneity are that a) contexts influence results and b) the effect of an intervention may vary greatly between groups within a population.57 These factors complicate the idea of replication itself: a "failed" replication may not show that a finding was false, but rather that it holds under some conditions and not others.58

These challenges mean that applied behavioral scientists need to set a much higher bar for claiming that an effect holds true across many unspecified settings.59 There is a growing sense that interventions should be talked about as hypotheses that were true in one place, and which may need adapting for them to be true elsewhere as well.60

We need specific proposals as well as narrative changes. The first concerns data collection: expand studies to include (and thus examine) a wider range of contexts and participants, and gather richer data about them. To date, only a small minority of behavioral studies have provided enough information to see how effects vary.61 Coordinated multi-site studies will be needed to collect enough data to explore heterogeneity systematically; "crowdsourced" studies offer particular promise for testing context and methods.62

Behavioral scientists also need to get better at judging how much an intervention's results were linked to its context - and therefore how much adaptation it may need.63 We should use and modify frameworks from implementation science to develop such judgment.64 Finally, we need to codify and cultivate the practical skills that successfully adapt interventions to new contexts; expertise in behavioral science should not be seen as simply knowing about concepts and findings in the abstract. Therefore, it's particularly valuable to learn from practitioners how they adapted specific interventions to new contexts. These accounts are starting to emerge, but they are still rare,65 since researchers are incentivized to claim universality for their results, rather than report and value contextual details.66

Full detail on page 51
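The meta-analytic reasoning this section calls for - pooling effects while quantifying how much they genuinely vary between studies - can be sketched with a DerSimonian-Laird random-effects calculation. The study numbers below are invented for illustration; tau-squared estimates between-study variance and I-squared expresses what share of the variation reflects heterogeneity rather than sampling noise.

```python
def random_effects_meta(effects, variances):
    """DerSimonian-Laird random-effects pooling.

    effects: per-study effect estimates; variances: their sampling variances.
    Returns (pooled effect, tau^2 between-study variance, I^2 heterogeneity %).
    """
    k = len(effects)
    w = [1 / v for v in variances]                        # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sw
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)                    # between-study variance
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]            # random-effects weights
    pooled = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    return pooled, tau2, i2

# Five invented studies of the "same" intervention in different contexts.
effects = [0.30, 0.05, 0.25, -0.10, 0.40]
variances = [0.01, 0.02, 0.015, 0.01, 0.03]
pooled, tau2, i2 = random_effects_meta(effects, variances)
```

A large I-squared is exactly the signal the text warns about: the pooled average conceals real context-dependence, so the "effect" should be treated as a hypothesis that held in some settings and not others.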

METHODS

06
BEYOND LISTS
OF BIASES
The heterogeneity in behavioral science findings also means that our underlying theories need to improve: we are lacking good explanations for why findings vary so much.67 This need for better theories can be seen as part of a wider "theory crisis" in psychology, which has thrown up two big concerns for behavioral science.68

The first stems from the fact that theories of behaviour often try to explain phenomena that are complex and wide-ranging.69 Trying to cover this variability can produce descriptions of relationships and definitions of constructs that are abstract and imprecise. The result is theories that are vague and "weak", since they can be used to generate many different hypotheses - some of which may actually contradict each other.70 That makes theories hard to disprove, and so weak theories stumble on, unimproved.71

The other concern is that theories can make specific predictions, but they are disconnected from each other - and from a deeper, general framework that can provide broader explanations (like evolutionary theory, for example).72 The main way this issue affects behavioral science is through heuristics and biases. Examples of individual biases are accessible, popular, and are how many people first encounter behavioral science. These ideas are incredibly useful, but have often been presented as lists of standalone curiosities, in a way that is incoherent, reductive, and deadening. They can create overconfident thinking that targeting a specific bias (in isolation) will achieve a certain outcome.73

Perhaps most importantly, focusing on lists of biases distracts us from answering core underlying questions. When does one or another bias apply? Which are widely applicable, and which are highly specific? These are highly practical questions when someone is faced with tasks like, for example, taking an intervention to new places.

The concern for behavioral science is that it uses both these high-level frameworks, like dual process theories, and jumbled collections of heuristics and biases - with little in the middle to draw both levels together.74

We think that a priority for responding to this challenge is to develop theories that are practical. By this we mean:

• They fill the gap we've identified in behavioral science: between day-to-day working hypotheses and comprehensive and systematic attempts to find universal underlying explanations.

• They are based on data rather than being derived from pure theorizing.75

• They can generate testable hypotheses, so they can be disproved.76

• However, they also specify the conditions under which a prediction applies or does not.77

• They are geared towards realistic adaptation by practitioners and offer 'actionable steps toward solving a problem that currently exists in a particular context in the real world.'78

We think that resource rationality is a good example of a practical theory. It starts from the basis that people make rational use of their limited cognitive resources.79 Given there is a cost to thinking, people will look for solutions that balance choice quality with effort. Importantly, these principles offer a systematic framework for building useful models for how people act.

A recent study has shown how these models can not only predict how people will respond to different kinds of nudges in certain contexts, but also can be integrated with machine learning to create an automated method for constructing 'optimal nudges'.80 These are highly practical benefits coming from applying a particular theory.

Full detail on page 61
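The core trade-off behind resource rationality - that choice quality is weighed against the cost of further deliberation - can be put into a deliberately simple toy model. This is our own illustrative sketch, not the formal framework of ref. 79: a decision maker inspects options one by one and stops once the expected gain from continuing no longer covers the effort cost.

```python
def resource_rational_choice(options, effort_cost):
    """Pick an option by sequential evaluation, stopping when the average
    improvement still available no longer covers the cost of more thinking.

    options: utilities in the order they would be inspected (hypothetical).
    Returns (chosen utility, number of options actually evaluated).
    """
    best = options[0]
    evaluated = 1
    for value in options[1:]:
        remaining = options[evaluated:]
        # Average improvement over the current best among uninspected options.
        expected_gain = sum(max(0, v - best) for v in remaining) / len(remaining)
        if expected_gain <= effort_cost:   # deliberation no longer pays
            break
        best = max(best, value)
        evaluated += 1
    return best, evaluated

menu = [3.0, 5.0, 4.0, 9.0, 6.0]   # invented utilities
careful = resource_rational_choice(menu, effort_cost=0.1)   # cheap thinking
hasty = resource_rational_choice(menu, effort_cost=5.0)     # expensive thinking
```

The same menu yields different choices purely because the effort cost differs, which is the kind of conditional, testable prediction a "practical" theory is meant to generate.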

METHODS

07
PREDICT AND ADJUST
Hindsight bias is what happens when people feel "I knew it all along", even if they did not.81 When the results of an experiment come in, hindsight bias may mean that behavioral scientists are more likely to think that they had predicted them, or quickly find ways of explaining why they occurred. Hindsight bias is a big problem because it breeds overconfidence, impedes learning, dissuades innovation, and prevents us from understanding what is truly unexpected.82

In response, behavioral scientists should establish a standard practice of predicting the results of experiments, and then receiving feedback on how their predictions performed. Hindsight bias can flourish if we do not systematically capture expectations or "priors" about what the results of a study will be - in other words, it is not easy to check or remember the state of knowledge before an experiment.83 Making predictions provides regular, clear feedback of the kind that is more likely to trigger surprise and reassessment, rather than hindsight bias.84

More and more studies are explicitly integrating predictions.85 But barriers lie in the way of further progress. People may not welcome the ensuing challenge to their self-image; predicting may seem like one thing too many on the to-do list; and the benefits lie in the future.

We propose: make predicting easy by incorporating it into standard organizational processes; minimize threats to predictors' self-image, for example by making and feeding back predictions anonymously;86 give concrete prompts for learning and reflection, in order to disrupt the move from surprise to hindsight bias;87 and build learning from prediction within and between institutions.

Full detail on page 69

VALUES

08
BE HUMBLE, EXPLORE
AND ENABLE
Behavioral scientists (like other experts) may overconfidently rely on decontextualized principles that do not match the real-world setting for a behavior.88 Deeper inquiry can reveal reasonable explanations for what seem to be behavioral biases.89 In response, those applying behavioral science should: avoid using the term "irrationality", which can limit attempts to understand actions in context; acknowledge that diagnoses of behavior are provisional and incomplete ("epistemic humility");90 and design processes and institutions to counteract overconfidence.91

A common theme through these ideas is the need for more and better inquiry into behaviors in context, rather than making assumptions. Open-ended qualitative exploration of the context and drivers for behaviors is not new to the behavioral sciences.92 However, three areas demand particular focus in the future. First, pay greater attention to people's goals and strategies, and their own interpretations of their beliefs, feelings, and behaviors.93 Second, reach a wider range of experiences, including marginalized voices and communities, understanding how structural inequalities can lead to expectations and experiences varying greatly by group and geography.94 Third, recognize how apparently universal cognitive processes are shaped by specific contexts, thereby unlocking new ways for behavioral science to engage with values and culture.95



[Figure: enabling approaches, mapped by level of involvement in the initiative (columns) and by the level and nature of capacity created (rows).

Level of involvement in initiative: NONE (e.g. healthy foods placed prominently in canteen); CO-DESIGN (e.g. employees deciding they want healthy foods to be prominent & developing arrangement); INITIATING/DRIVING (e.g. creating regular automatic bank transfers to savings account).

Level and nature of capacity created:
NONE: nudge → self-nudge (e.g. placing letters to be posted on doorhandle)
SELF-REFLECTION: nudge+ (e.g. gyms introducing commitment devices to increase usage) → nudge+ (e.g. users working with gyms to develop commitment devices that actively remind users of their exercise goals)
ACTION: “paternalistic” boost (e.g. initiator decides that the best option is to teach people heuristics to increase savings) → boost → “self-boost” (e.g. individual reads about heuristics and constructs own heuristics for savings goals)]

In addition, more can and should be done to broaden ownership of behavioral science approaches. Many, but far from all, behavioral science applications have been quite top-down, with a ‘choice architect’ enabling certain outcomes.96 One route is to enable people to become more involved in designing interventions themselves - and “nudge plus”, “self-nudges”, and “boosts” have been proposed as ways of doing this.97 Reliable criteria are needed to decide when enabling approaches may be appropriate, including: whether the opportunity to use an enabling approach exists; ability and motivation; preferences; learning and setup costs; equity impacts; and effectiveness (recognizing evidence on this point is still emerging).98

But these new approaches should not be seen simplistically as “enabling” alternatives to “disempowering” nudges.99 Instead, we need to consider a) how far the person performing the behavior is involved in shaping the initiative itself; and b) the level and nature of any capacity created by the intervention.

People may be heavily engaged in selecting and developing a nudge intervention that nonetheless does not trigger any reflection or build any skills. Alternatively, a policy maker may have paternalistically assumed that people want to build up their capacity to perform an action, when in fact they do not. This is the real choice to be made.

A final piece missing from current thinking is that enabling people can lead to a major decentering of the use of behavioral science. If more people are enabled to use behavioral science, they may decide to introduce interventions that influence others. Rather than just creating self-nudges through altering their immediate environments, they may decide that wider system changes are needed instead. A range of people could be enabled to create nudges that generate positive societal change (with no “central” actors involved), as happened for the “Fair Tax Mark” in the UK.

Full detail on page 75

VALUES

09
DATA SCIENCE
FOR EQUITY
Recent years have seen growing interest in using new data science techniques to reliably analyze the heterogeneity of large datasets.100 Machine learning is claimed to offer more sophisticated, reliable, and data-driven ways of detecting meaningful patterns in datasets.101 For example, a machine learning approach has been shown to be more effective than conventional segmentation approaches at analyzing patterns of US household energy usage to reduce peak consumption.102

A popular idea is to use such techniques to better understand what works best for certain groups, and thereby tailor an offering to them.103 “Scaling” an intervention stops being about a uniform roll-out, and instead becomes about presenting recipients with the aspects that are most effective for them.104

This vision is often presented as straightforward and obviously desirable, but it runs almost immediately into ethical quandaries and value judgements. People are unlikely to know what data has been used to target them, and how; the specificity of the data involved may make manipulation more likely, since it may exploit sensitive personal vulnerabilities; and expectations of universality and non-discrimination in public services may be violated.105

There is also emerging evidence that people often object to personalization. While they support some personalized services, they consistently oppose advertising that is customized based on sensitive information - and they are generally against the collection of the information that personalization relies on.106 When a company tries personalization that crosses into being “creepy,” uproar and damage to its reputation can ensue.107

In order to navigate this landscape, behavioral scientists need to examine four factors.

• Who does the personalization target, and using what criteria? Many places have laws or norms to ensure equal treatment based on personal characteristics. When does personalization violate those principles?

• How is the intervention constructed? To what extent do the recipients have awareness of the personalization, choice over whether it occurs, control over its level or nature, and the opportunity for giving feedback on it?108

• When is it directed? Is it at a time when the participant is vulnerable? Would they likely regret it later, if they had time to reflect?

• Why is personalization happening? Does it aim to exploit and harm or support and protect, recognizing that those terms are often contested?

Taking these factors into account, we propose that the main opportunity is for data science to identify the ways in which an intervention or situation appears to increase inequalities, and reduce them.109 For example, groups that are particularly likely to, say, miss a filing requirement could be offered preemptive help.

We call this idea data science for equity. It addresses the “why” factor by using data science to support, not exploit. But it needs to be supported by other attempts to increase agency (the “how” factors), like a recent study that showed how boosts can be used to help people detect micro-targeting of advertising,110 and studies that obtain more data on which uses of personalization people find acceptable.

Full detail on page 85
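The “data science for equity” idea can be made concrete with a minimal sketch: compute the outcome rate for each subgroup, compare it with the overall rate, and flag the groups that appear worst affected so that preemptive help could be targeted at them. Everything below (group labels, data, and the margin) is a hypothetical illustration, not data from any real service.

```python
# Toy sketch of "data science for equity": flag subgroups whose rate of a
# bad outcome (e.g. missing a filing requirement) is well above the overall
# rate, so preemptive help can be offered. All names and numbers are
# hypothetical.
from collections import defaultdict

def miss_rates_by_group(records):
    """records: list of (group, missed) pairs; returns miss rate per group."""
    totals, misses = defaultdict(int), defaultdict(int)
    for group, missed in records:
        totals[group] += 1
        if missed:
            misses[group] += 1
    return {g: misses[g] / totals[g] for g in totals}

def groups_needing_support(records, margin=0.05):
    """Groups whose miss rate exceeds the overall rate by more than `margin`."""
    rates = miss_rates_by_group(records)
    overall = sum(missed for _, missed in records) / len(records)
    return sorted(g for g, r in rates.items() if r > overall + margin)

# Hypothetical filing records: Group A misses 15% of the time, Group B 30%.
records = ([("Group A", True)] * 3 + [("Group A", False)] * 17
           + [("Group B", True)] * 6 + [("Group B", False)] * 14)

print(groups_needing_support(records))  # ['Group B']
```

In practice the same logic would sit on top of richer models, such as the machine learning approaches cited above, but the equity question stays the same: which groups does the current situation leave worse off, and what support would close the gap?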

VALUES

10
NO VIEW
FROM NOWHERE
Behavioral scientists need to understand how they bring certain assumptions, privileges, and ways of seeing to what they do.111 They are always situated, embedded, and entangled with ideas and situations. They cannot assume there is some set-aside position from which to observe the behavior of others - there is no “view from nowhere”.112

Behavioral scientists are defined by having knowledge, skills, and education; many of them can use these resources to shape public and private actions. Therefore, they are in a privileged position, but may not see the extent to which they hold elite positions that stop them from understanding people who think differently (for example, those who are skeptical of education).113

There have been repeated concerns that the field is still highly homogeneous in other ways as well. Gender, race, physical abilities, sexuality, and geography also influence the viewpoints, practices, and theories of behavioral scientists.114 Only a quarter of the behavioural insights teams catalogued in a 2020 survey were based in the Global South.115 The last decade has shown just how greatly behaviors can vary from culture to culture, even as psychology has tended to generalize from relatively small and unrepresentative samples.116 So, rather than claiming that science is value-free, we need to find realistic ways of acknowledging and improving this reality.117

A starting point is for behavioral scientists to cultivate self-scrutiny by querying how their identities and experiences contribute to their stance on a topic. Hypothesis generation could particularly benefit from this exercise, since arguably it is closely informed by the researcher’s personal priorities and preferences.118 Behavioral scientists could also actively reflect on interventions in progress, including what factors are contributing to power dynamics.119

Self-scrutiny may not be enough. We should also find ways for people to judge researchers and decide whether they want to participate in research - going beyond consent forms. Finally, we should take actions to increase diversity (of several kinds) among behavioral scientists, teams, collaborations, and institutions. These could include increased support for starting and completing PhDs, reducing the significant racial gaps present in much public funding of research, or building professional networks that connect the Global North and Global South.120

Full detail on page 91

CONCLUSION
When considered together, these proposals present a consistent and coherent vision for the future of applied behavioral science. A common theme throughout the ten proposals is the need for self-reflective practice. In other words, a main priority for behavioral scientists is to recognize the various ways that their own behavior is being shaped by structural, institutional, environmental, and cognitive factors.

However, as the field itself shows, a gap often emerges between intention and action. Given what’s at stake, BIT will focus on bridging this gap in the coming years. Realizing these proposals will require sustained work and experiencing the discomfort of disrupting what may have become familiar and comfortable practices. Indeed, this manifesto forms part of a new collection of resources from BIT to start to fulfill the goals set out here.

Improving applied behavioral science has some characteristics of a social dilemma - benefits are diffused across the field as a whole, while costs fall on any individual party who chooses to act (or act first). Practitioners are often in competition. Academics often want to establish a distinctive research agenda. Commissioners are often rewarded for risk aversion. Impaired coordination is particularly problematic, since coordination forms the basis for several necessary actions (such as the multi-site studies needed to measure heterogeneity).

Solving these problems will be hard. Funders need to find mechanisms that adequately reward coordination and collaboration by recognizing the true costs involved. Practitioners need to perceive the competitive advantage from adopting new practices and be able to communicate them to clients. Stepping back, the starting point for these changes needs to be a change in the narrative about what the field does and could do. The “manifesto” presented here aims to help shape this narrative.

Summary of proposals and recommended actions

SCOPE

• Use behavioral science as a lens: Present behavioral science as a lens that improves the view of any public and private issue, in order to break a self-sustaining pattern that has directed behavioral science away from the most significant problems.

• Build behavioral science into organizations: Focus less on how to set up a dedicated behavioral science team, and more on how the approach can be integrated into an organization’s standard processes by upgrading its “choice infrastructure”.

• See the system: Use aspects of complexity thinking to improve behavioral science so it can: exploit “leverage points”; model the collective implications of heuristics; alter specific features of systems to create wider changes; and understand the longer-term impact on a system of a collection of policies with varying goals.

METHODS

• Put RCTs in their place: Strengthen RCTs to deal better with complexity by: gaining a better understanding of system interactions and anticipating how they may play out; setting up RCTs to measure diffusion and contagion in networks; and building feedback and adaptation into the design of RCTs and interventions.

• Replication, variation, adaptation: Identify the most reliable interventions, develop an accurate sense of the likely size of their effects, and avoid the weaker options. Recognize that heterogeneity requires a much higher bar for claiming that an effect holds true across many unspecified settings. Create multi-site studies to systematically study heterogeneity in a wider range of contexts and participants. Codify and cultivate the practical skills that successfully adapt interventions to new contexts.

• Beyond lists of biases: Emphasize theories that are “practical”: they fill the gap between high-level frameworks and jumbled lists of biases; they are based on data and generate testable hypotheses, but also specify the conditions under which a prediction applies; and they present actionable steps to solve real-world problems.

• Predict and adjust: Develop the practice of getting behavioral scientists to predict the results of experiments, and then feeding back the results to them.

VALUES

• Be humble, explore, and enable: Avoid using the term “irrationality”; practice “epistemic humility”; and design processes and institutions to counteract overconfidence. Pay greater attention to people’s own interpretations of their beliefs, feelings, and behaviors. Reach a wider range of experiences, including marginalized voices and communities. Recognize how apparently universal cognitive processes are shaped by specific contexts. Use six criteria to assess when to enable people to use behavioral science themselves.

• Data science for equity: Use data science to identify the ways in which an intervention or situation appears to increase inequalities and introduce features to reduce them. For example, groups that are particularly likely to miss a filing requirement could be offered pre-emptive help.

• No “view from nowhere”: Cultivate self-scrutiny; find new ways for the subjects of research to judge researchers; and take actions to increase diversity among behavioral scientists and their teams, such as building professional networks between the Global North and Global South.

SCOPE

01
USE BEHAVIORAL
SCIENCE AS A LENS
We need to see behavioral science as a lens that improves the
view of any public and private issue, rather than as a tool that
we sometimes pick up. Making this change will help break the
self-sustaining pattern whereby demand for behavioral science,
and the tools we have developed, has pushed work towards
‘downstream’ interventions and away from structural changes.

The recent surge in applying behavioral science to practical issues has made a measurable difference across many domains. The approach has been adopted by public sector bodies at the local, national, and supra-national level,121 and by private companies large and small. They have improved outcomes in health,122 education,123 sustainability,124 transport,125 diet,126 and financial behavior,127 among many areas. Many of these improvements have come at relatively low cost.128

Despite these achievements, objections have emerged. A common one is that there’s been a focus on tractable and easy-to-measure changes, at the expense of bigger impact on major issues. Behavioral science, it’s claimed, has just been tinkering around the edges of fundamental problems.129

We agree with the challenge that behavioral science can and should do more. Every day, new policies cut against well-established evidence of how people behave.130 Services are shaped in ways that people cannot navigate. Products are launched with fundamental misconceptions about how people are likely to approach them. There are few prominent examples that clearly show how governing policies and systems have been designed using concepts from behavioral science, as opposed to specific aspects of how those policies were presented or structured.131

So, how to move forward? Step one is to realize that the strengths that have brought success may also be holding behavioral science back. To explain, let’s go back to the start of the current phase of applied behavioral science (around 2008-2012), when there was a pressing need to demonstrate clear results and build credibility. That pressure led us and others to form standard ways of applying “behavioral insights”. These approaches generally have a common set of features.132 The standard series of actions looks something like this:

• scoping the issue and exploring drivers of behavior
• defining a specific target behavior that can be measured reliably
• generating evidence-based interventions to change the target behavior
• creating a robust experimental design to test the intervention’s effects
• if desired, taking the intervention to new places (e.g., “scaling”)

These actions are usually presented in a step-by-step (linear) way, although most guides stress that people can loop between stages.

Over the past decade, this kind of approach has tended to produce:

• “downstream” interventions that concern how specific aspects of a policy, product or service are designed
• a focus on discrete behaviors by actors (e.g., people, businesses), considered mostly in isolation.133

Perhaps a good example is BIT’s project to reduce missed hospital appointments in the UK.134 This work identified the wording of text message reminders as an opportunity for improvement. We ran two randomized controlled trials in London, which found that a message referring to the cost of a missed appointment for the health system reduced no-shows from 11.1% to 8.5% - a 25% relative change. This low-cost change was then taken up by other health providers around the world.135

As this example shows, the approach is a strong one. There’s a neatly-defined problem, a specific intervention, and a strong causal link between that intervention and the target outcome. There’s a clear story to tell about what happened and why. These strengths mean that there are still many improvements that this approach could achieve. For instance, one priority should be to clear the vast administrative burdens (or ‘sludge’) that prevent people - particularly those with fewest resources - from understanding or accessing government services.136

However, the justified focus on these clear and credible results about downstream impact also becomes self-reinforcing. People start to think that this is the sole way that behavioral science can be applied. In turn, this perception shapes demand: only certain kinds of problems are seen as ones ‘suitable’ for behavioral scientists.137

Opportunities, skills, and ambitions have been constricted as a result. In general, practitioners have focused more on expanding the application of some “tried and tested” interventions to new areas, and less on exploring new ones (or getting a deeper understanding of familiar ones).138

We need a rebalancing. Behavioral science also has much to say about broader, larger issues in society like discrimination, pollution, or economic mobility, and the structures that produce them. Behavioral science has the potential to fundamentally change how we understand the factors shaping behavior and therefore how we constitute an issue and what is possible.139

Take the economy: behavioral science can show how to regulate markets differently;140 how to design taxes to drive wider behavioral changes;141 and even offer a vision for the future “behavioral economy” as a whole.142

As this list shows, there are examples of how behavioral science has tackled more complex, structural, “upstream” issues. But these examples are harder to communicate because they often deal with the fluid, murky, and fractured narratives of politics and policy-making. Unlike the clear, linear stories of the approach outlined above, the contribution of behavioral science may be difficult to trace or may play out over a long timeframe. It’s easier to talk about the neat narratives instead, particularly since many people are aware of and curious about the idea of ‘nudging’, which is often associated (inaccurately) with small presentational tweaks only.143

The wide potential scope of applied behavioral science is an idea that BIT has promoted consistently since its creation.144 But the self-reinforcing limiting factors we outlined have proved strong. Now the increasingly urgent question is: how can we successfully change behavioral science itself?

Our answer is to consider the proposals in this manifesto, which aim to create a package that can achieve that change. We can start by trying to switch the metaphors or frames through which we perceive behavioral science itself.

Behavioral science should be understood as a lens that enhances the view of any public and private issue, rather than as a tool that we sometimes pick up.

The trends we highlight above have tended to reinforce this tool metaphor, which encourages this way of thinking:

Behavioral science is a specialist tool that is applied to certain kinds of problems - and not others. Often these are defined “delivery” issues (“How do we structure this message?”), but sometimes it can be used to solve a “behavioral” problem as an alternative to more traditional approaches like rules and incentives.

These implications lead us down the wrong path. Instead, behavioral science should be understood as a lens that can be applied to any public and private action. This change offers several advantages:

• A lens metaphor shows that behavioral insights can enhance the way we see policy options (for example, revealing new ways of structuring taxes), rather than just acting as an alternative to them.

• A lens metaphor conveys that the uses of behavioral insights are not limited to creating new interventions. A behavioral science lens can, for example, help reassess existing actions and understand how they may have unintended effects. It emphasizes the behavioral diagnosis of a situation or issue, rather than pushing too soon to define a precise target outcome and intervention.145

• Specifying that this lens can be applied to any action conveys the error of separating out “behavioral” and “non-behavioral” issues: most of the goals of private and public action depend on certain behaviors happening (or not). Behavioral science should therefore be integrated into an organization’s core activities, rather than acting as an optional specialist tool.146

In one sense, this proposal is about returning to first principles. Back in 2010 we emphasized that ‘civil servants...need to better understand the behavioral dimension of their policies and actions’, and also stressed how behavioral science ‘powerfully complements and improves conventional policy tools’.147 But, for all the reasons above, this aspect has been less prominent over the last decade.148

Other metaphors apart from a lens would be powerful as well. For example, moving from ‘choice architecture’ to ‘choice infrastructure’ effectively highlights the broader, embedded nature of our behaviors.149 The point is that behavioral science itself shows us the power of framing: the metaphors we use shape the way we behave, and therefore can be agents of change.150

Metaphors are particularly important because the task of broadening the use of behavioral science requires making a compelling case to decision makers. Behavioral science practitioners need to understand their audience and then shape their offers accordingly.

In a way, the metaphor of behavioral science as a tool that produces clear results has served the field well over the last decade - it has established credibility and acceptance in a defined area. The challenge now is to expand beyond that area, allowing behavioral science to fulfill its potential before the self-reinforcing cycle becomes too hard to break.

02
BUILD BEHAVIORAL SCIENCE
INTO ORGANIZATIONS
There has been too little focus on using behavioral science to
shape organizations themselves, as opposed to increasing
how much an organization uses behavioral science to achieve
its goals. We need to talk less on how to set up a dedicated
behavioral function, and more about how behavioral science
can be integrated into an organization’s standard processes.

For example, as well as trying to ensure that a departmental budget includes provisions for behavioral science, why not use behavioral science to improve the way this budget is created (e.g., are managers anchored to outdated spending assumptions)?

But we need to understand how this new way of thinking maps against the existing debate about how to set up a behavioral function in organizations. We propose that doing so reveals six main scenarios.

• In the “Baseline” scenario, there is limited awareness of behavioral science in the organization, and its principles are not incorporated into processes.

• In the “Nudged Organization,” levels of behavioral science awareness are still low, but its principles have been used to redesign processes to create better outcomes for staff or service users.

• In “Proactive Consultancy,” leaders may have set up a dedicated behavioral team without grafting it onto the organization’s standard processes. This lack of institutional grounding puts the team in a less resilient position, meaning it must always search for new work.

• In “Call For Experts,” an organization has concentrated behavioral expertise, but there are also prompts and resources that allow this expertise to be integrated more into “business as usual”. Expertise is not widespread, but access to it is. This setup could mean that processes stimulate demand for behavioral expertise that the central team can fulfill.

• In “Behavioral Entrepreneurs,” there is behavioral science capacity distributed throughout the organization, either through direct capacity building or recruitment. The problem is that organizational processes do not support these individual pockets of knowledge.

• Finally, a “Behaviorally-Enabled Organization” is one where there is knowledge of behavioral science diffused throughout the organization, which also has processes that reflect this knowledge and support its deployment.

The common success factor in these scenarios is an upgrade of the “choice infrastructure” of organizations. To do this, we propose: reducing the costs of experimentation; creating a system that can learn from its actions; and developing new and better ways of using behavioral science knowledge to analyze the behavioral effects of processes, rules, incentives, metrics, and guidelines.

The second proposal is to broaden the scope of how behavioral science is used in organizations. Much attention has been paid to how the practice of applying behavioral science can have more influence within organizations - usually by advising on how a dedicated behavioral science function should be structured.151

This is an important question: there were more than 300 such units worldwide by 2020; BIT and others have advised on setting them up.152 Work in this area has covered important questions like how to arrange the leadership, organizational structure, funding, and goals of behavioral insights teams for maximum success.153

In contrast, there has been less attention paid to how behavioral science can be integrated into an organization’s own processes.154 There has not been enough focus on using behavioral science to shape organizations themselves, as opposed to increasing how much an organization uses behavioral science to achieve its goals.

For example, as well as trying to ensure that a departmental budget includes provisions for behavioral science, why not use behavioral science to improve the way this budget is created (e.g., are managers anchored to outdated spending assumptions)?155

The overriding message here is for greater focus on the organizational changes that indirectly apply or support behavioral science principles, rather than just thinking through how the direct and overt use of behavioral science can be promoted in an organization. There are two main advantages to doing this:

SCALE
Building behavioral science into organizations can address some of the issues surrounding successful scaling of interventions.156 If some of the barriers to scaling concern cognitive biases in organizations, these changes could minimize the effect of such biases.157 Rather than starting with a behavioral science project and then trying to scale it, we could start by looking at operations at scale and understanding how they can be influenced. There is a change of perspective here to a position where ‘what is scalable is not the content of what is learned in any given context, but the capacity for learning itself’.158

BEHAVIORAL SCIENCE KNOWLEDGE AND CAPACITY (left to right: greater potential for scale)

                                        LIMITED               CONCENTRATED              DIFFUSED
BEHAVIORAL SCIENCE
INCORPORATED INTO          NO           Baseline              Proactive consultancy     Behavioral entrepreneurs
ORGANIZATIONAL
PROCESSES                  YES          Nudged organization   “Call for the experts”    Behaviorally-enabled organization

RESILIENCE
The goal would be to produce behaviorally-informed standard or “business as usual” processes, rather than the continued application of behavioral science explicitly. That approach is resilient to changes in the demand for “behavioral science” solutions as such in the future.

BIT’s 2018 Behavioral Government report proposes many practical changes to organizational processes to mitigate biases in government.159 But we need to understand the range of options for implementing such changes. How do we think about them alongside the desire to create a dedicated behavioral insights team, for example?

We think that the diagram above offers a useful way of mapping the options for building behavioral science into organizations.160 The vertical axis represents whether behavioral science has been used to shape the organization’s own structures or processes, using a crude yes/no distinction to make the diagram manageable. We will bring this distinction to life with examples in the following sections, before defining it in more detail.

The horizontal axis deals with the extent and form of behavioral science knowledge and capacity in an organization.161 In the ‘Baseline’ scenario, there is very little or no awareness of behavioral science concepts in the organization. ‘Concentrated’ refers to the setup where there is a dedicated team or resource that applies behavioral science to organizational priorities.162 In the ‘Diffused’ scenario, people or teams with competence in behavioral science are spread throughout the organization. Deciding between concentrated and diffused setups is generally seen as a central choice for organizations looking to build a behavioral science function.163

We can now see how this framework illuminates the different options:

BASELINE
Here, there is limited awareness of behavioral science in the organization, and its principles are not incorporated into processes. Benefits are likely to be limited.

NUDGED ORGANIZATION
Here, levels of behavioral science awareness are still low, but its principles have been used to redesign processes to create better outcomes for staff or service users. For example, the pervasive optimism bias in organizations' plans can be reduced by mandatory 'pre-mortems', where decision makers imagine the future failure of their project and then work back to identify why things went wrong. Group reinforcement (or "groupthink") could be minimized by creating routes for diverse views to be fed in anonymously before and after group discussions, counteracting the pressure to conform face to face.164 And there are vast opportunities to reduce the administrative burdens, or 'sludge', that make services and processes difficult to access and navigate.165

This scenario is termed the 'nudged organization' because no explicit behavioral science knowledge or capacity is created or needed. As with nudging, it is the choice architecture (or choice infrastructure) that produces the outcomes, and there is no neutral choice in the way that an organization's processes are set up. That means no behavioral team or unit is created; the change or goals may not even be framed in terms of behavioral science (as for "administrative burdens").

For this reason, the best starting point is to understand how the existing setup is influencing behavior. Where is the choice architecture currently working well, through accident or design? How can existing processes be amended easily to draw on these practices? Who are the people who oversee the rules, incentives, metrics, and guidelines that influence people throughout the organization?

To give a concrete example, human resource leaders profoundly shape what organizations permit and reward. Yet there has been relatively little focus on "behavioral HR". Recent studies have shown that cognitive biases such as decoy effects, framing effects, anchoring, and halo effects can be created in practical decisions such as procurement and performance appraisal.166 They can also be countered: when people considered the purchase of email software, framing significantly affected intentions (saying 20% of users were dissatisfied versus saying 80% were satisfied), but these effects were eliminated if both percentages were shown (in a random order).167

The big outstanding question in this scenario is who introduces the nudges, since the organization has little internal capacity. Perhaps these could be one-off changes introduced from outside? Answering this question feels important, since the return on investment here could be large - and, for that reason, this model feels like a neglected opportunity that needs more attention.

PROACTIVE CONSULTANCY
In this situation, leaders may have set up a dedicated behavioral team, but perhaps not given much thought to supportive organizational changes. The result is that the team has to work in an enterprising way, going to look for opportunities and having to prove its worth.168

This situation reflects the reality for many teams, who are 'looking to develop networks, positions, and tactics that establish their authority and credibility among decision makers.'169 As a result, much of the discussion has focused on how best to set up these teams. The better contributions have recognized that this question is fundamentally political, rather than technocratic: how do the people leading such a resource build relationships and present their team as useful to their organizations?

The problem with this scenario is that teams may not be in a resilient position, since they lack ways to be grafted onto the standard processes of an organization. For example, leaders may neglect to support and resource evidence-gathering and experimentation. At the same time, they may have unrealistic expectations because they know only the highlights of previous behavioral science success stories.170

Teams will therefore have to continually prove their worth and relevance, while having fewer options for doing so. As current practitioners point out, 'interventions that seem relatively easy to implement (e.g., an RCT with promotion letters) can require a set of system changes that touch a variety of groups in the organization (e.g., printing services, database administrators, processing centers).'171 This kind of broader organizational scaffolding is not always prioritized, despite being needed for behavioral teams to fulfill their potential.

CALL FOR EXPERTS
In the Call for Experts scenario, an organization has similarly concentrated behavioral expertise, but there are also prompts and resources that allow this expertise to be integrated more into 'business as usual'. At its simplest, this might mean that standard procedures prompt staff to recognize that the expertise may be needed (e.g., any new requirement in an application process needs to be assessed for its likely effects on behavior). Expertise is not widespread, but access to it is.

At another level, the organization may have invested to ensure that the ability to randomize has been built into new and existing delivery systems, thereby allowing the team to run experiments when they are called on. If working well, this setup would mean that processes stimulate demand for behavioral expertise that the central team can fulfill. That team may also have the institutional support to proactively monitor activities and respond quickly to specific crises.

One benefit to this kind of setup is that it allows teams to select the most promising collaborations, rather than taking whatever is on offer. For example, the team in Employment and Social Development Canada's Innovation Lab claims that a 'careful selection process is critical to the success of incorporating behavioral insights into an organization', since it identifies partners who are open and willing to innovate, which makes it more likely that the subsequent project demonstrates the true value behavioral science can add.172

BEHAVIORAL ENTREPRENEURS
In this situation, there is behavioral science capacity distributed throughout the organization, either through direct capacity building or recruitment. The distribution of expertise can work if there are effective support networks and efforts at coordination.173

The problem with the behavioral entrepreneurs scenario is that organizational processes do not support these individual pockets of knowledge. Therefore those with expertise find it hard to apply ideas in practice, evaluate their effects, share findings, and build learning. For example, reducing "sludge" often 'requires coordination among a number of teams in the organization', which is a problem when teams work in silos and 'it is not acceptable in the organization's culture to interfere with other teams' affairs.'174

These are not just hypotheticals. A review of the Dutch policy landscape found that 'most behavioral policy practices have not been deeply institutionalized', and their advancement 'depends on the ambition of individual enthusiasts' (or, as we've called them, "behavioral entrepreneurs").175 While they can achieve some successes, their lack of institutional grounding can mean that they become jaded and start looking for other options instead.

BEHAVIORALLY-ENABLED ORGANIZATION
We see a behaviorally-enabled organization as one where there is knowledge of behavioral science diffused throughout the organization, which also has processes that reflect this knowledge and support its deployment.176 This is the most resilient setup, since staff will be applying behavioral science in a deliberate way as part of "business as usual", rather than through special projects.

A behaviorally-enabled organization would bring together some of our previous proposals. It would embed the behavioral lens mentioned earlier into its core functions, including strategy and operations. For example, behavioral science could be integrated into cross-cutting frameworks like the WHO's 'Health In All Policies' to prompt inter-departmental working.177 It would address the need to "see the system", recognizing that sustainable outcomes are difficult to achieve through isolated changes. Such an organization would be self-reflective, and carefully explore the varying perspectives and experiences of its staff and service users.

While this setup has the greatest opportunity for scale and sustainability, it also requires the greatest investment. We conclude by talking about what kinds of investments are needed.

CHOICES AND PRIORITIES FOR THE BUILD

Most discussions make it seem like the meaningful choice is between the different columns in our framework - how to organize your dedicated behavioral science resources. We argue that the more important move is from the top row to the bottom row: moving from projects to processes, from commissions to culture.

A useful way of thinking about this task is about building or upgrading the "choice infrastructure" of the organization, defined as 'the institutional conditions and mechanics of systems - the structures, processes, and capabilities - that directly underlay and support behavioral interventions to help choice architecture solutions work effectively and as planned'.178

In other words, we should place greater focus on the institutional conditions and connections that support the direct and indirect ways that behavioral science can infuse organizations. As the diagram below highlights, there are choices to be made about how this is done, based on ambitions and resources.179

Working out how best to build the choice infrastructure in organizations should be a major priority for behavioral science. As with many systems, the best option may be to focus on creating the conditions for desired behaviors to emerge, rather than over-specifying solutions.180 But already we can see some features will be crucial.

Dilip Soman and Katherine Yeung argue for the importance of reducing the costs of experimentation, including cheaper data collection, creating an experimental mindset, reducing institutional impatience, and building agility so that organizations can easily adapt to learning.181 Others have promoted the importance of sharing learning itself, pointing towards the crucial role of Singapore's Civil Service College in 'curating and facilitating an ecosystem of learning opportunities' for behavioral science.182

To this list, we want to add new and better ways of using behavioral science knowledge to analyze the behavioral effects of processes, rules, incentives, metrics, and guidelines. Such work has surged recently under the labels of 'behavioral public administration' and 'behavioral operations management', building on a longer tradition of organizational behavior research.183 We need to ensure that this agenda produces work that has practical value (and not just for the public sector), as in the proposal of "sludge audits" to reduce administrative burdens.184 Doing so will mean that the behavioral lens we just proposed can be used by an organization's members - and, ideally, by its leaders, who are in a position to achieve broader, systemic change.

BEHAVIORAL SCIENCE KNOWLEDGE AND CAPACITY
(greater potential for scale, from left to right)

                                      LIMITED                CONCENTRATED              DIFFUSED
BEHAVIORAL SCIENCE        NO          Baseline               Proactive consultancy     Behavioral entrepreneurs
INCORPORATED INTO                                            (attention here, but
ORGANIZATIONAL                                               strategically brittle)
PROCESSES                 YES         Nudged organization    "Call for the experts"    Behaviorally-enabled organization
                                      (high return on        (need to avoid            (resilient but ambitious)
                                      investment?)           compliance mentality)

03
SEE THE SYSTEM
Many big policy challenges emerge from complex adaptive
systems, which present major challenges to the dominant way
that behavioral science has been applied. However, we can
adapt behavioral science to deal with complexity better, and
use it to: identify “leverage points” where a specific shift in
behavior will produce wider system effects; understand the
collective implications of individuals using simple heuristics to
navigate a system; and change the rules of that system to make
it more likely that desired behaviors will emerge.
Of course, not every problem will involve a complex adaptive system. So behavioral scientists should first develop the skills to recognize the type of system that they are facing ("see the system"), and then choose their approach accordingly. Fulfilling the broader promise of behavioral science also requires us to expand the ways we tackle problems. The process of identifying targets, exploring drivers, developing solutions, and testing them is strong. The problem is that it contains several assumptions that do not hold when confronting some of the biggest challenges societies face. That's because these challenges often consist of behaviors in complex adaptive systems (CAS).185

The risk of talking about 'complexity' is that it may seem like just another way of saying problems are difficult (indeed some use the term as an excuse to do nothing).186 In fact, we mean applying a particular way of analyzing the world. Our proposal is that combining behavioral science with complexity thinking offers new, credible, practical ways of doing things differently.

First we need to explain complex adaptive systems briefly:

"A complex adaptive system is a dynamic network of many agents who each act according to individual strategies or routines and have many connections with each other. They are constantly both acting and reacting to what others are doing, while also adapting to the environment they find themselves in. Because actors are so interrelated, changes are not linear or straightforward: small changes can cascade into big consequences; equally, major efforts can produce little apparent change. An important point is that coherent behavior can emerge from these interactions - the system as a whole can produce something more than the sum of its parts."187

CAS consist of many different causes, actors, and goals. There are many examples of them in human societies, including cities, markets, criminal justice systems, and political movements. They often create what have been called 'wicked problems', which are "difficult or impossible to solve because of incomplete, contradictory, and changing requirements that are often difficult to recognize".188

Such problems even produce little agreement among different groups about how 'success' can be defined.

The need to understand CAS is becoming more urgent. Complexity has emerged from the sheer number of contacts created by global population growth. At the same time, technological changes like the growth of social media mean that information can be transmitted much faster and more cheaply, in an unchanged form, over many different geographies and networks. The fact we do not fully understand how these changes affect human behavior is 'a principal challenge to scientific progress, democracy, and actions to address global crises'.189

Indeed, the Covid-19 pandemic, perhaps the most acute global policy challenge of recent times, has several features of a wicked problem:190

CONTESTED IDEAS OF SUCCESS
Throughout the pandemic, there has been disagreement between individuals, organizations, and countries about what the overall policy goals should be (suppression, elimination, buying time until vaccines, removing restrictions), let alone consensus about how to achieve those varying goals.191

NON-LINEARITY
The initial coronavirus variant appears to have been 'over-dispersed' - outbreaks were seeded by a handful of super-spreading events.192 It's estimated that around 80% of infections were caused by just 10% of individuals.193 These properties can explain why some nascent outbreaks fizzle and others take off: viral spread is a non-linear process.194

Understanding these features could help target policy responses effectively. They might indicate that stopping many repeated contacts within a small set of individuals has little effect on spread, in contrast to reducing random contacts at events and restaurants.195 In other words, the way the virus interacts with social systems means preventing many social contacts is unlikely to reduce viral spread in a linear way, whereas preventing a few super-spreading events may have an outsized impact.

UNINTENDED CONSEQUENCES
The wide-ranging, intensive actions taken to mitigate the spread of Covid-19 have had many indirect effects. These may have included increases in domestic violence; shortages of hydroxychloroquine in West Africa; reductions in carbon emissions; and changes in healthcare access with shifts to telehealth.196 Our point is not that these actions should not have been taken per se, but that solutions may change the nature of a problem or create many new ones.

These ideas challenge the assumptions underlying the dominant behavioral science approach.

The issue is that 'there are fewer examples of behavioral insights applied to understand behavior in complex change processes'.197

Why? Because the realities of complex adaptive systems challenge the main assumptions underlying the dominant behavioral science approach: tight focus on a target behavior, linear effects, and stability.198 We outline each of these before offering a way forward.

TIGHT FOCUS ON A TARGET BEHAVIOR
The assumption: people can agree on a specific, measurable target behavior. Interventions that shift this behavior are successful - wider effects are minor and may not be considered, unless pre-specified.

Some behavioral science organizations focus on 'breaking problems down into their constituent parts to understand the desired behaviours'.199 For some issues, there is value in doing this to identify a core target behavior to drive improvement. But you cannot understand a complex system by breaking it down into parts and mapping it - the way its connections function is key.200

We cannot assume that changing a particular part of the system will have the desired overall outcome. For a start, there may be intense disagreement between parties about how a behavior contributes to an issue: some people may see pre-school provision as key to regenerating an urban area; others may view that as an unimportant contributor, compared with reducing crime.
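To make the over-dispersion figures under "Non-linearity" concrete, here is a minimal branching-process sketch (our own illustration, not taken from the report or the studies it cites). Each case infects a negative-binomially distributed number of others - a standard way to model super-spreading - and the R0 of 2.5 and the dispersion values are illustrative assumptions.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's Poisson sampler (adequate for modest lambda)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= L:
            return k - 1

def outbreak_size(r0, k, rng, cap=500):
    """Total cases from one seed case. Each case's offspring count is
    negative binomial with mean r0 and dispersion k, sampled as a
    gamma-Poisson mixture; low k means strongly over-dispersed spread."""
    total, active = 1, 1
    while active and total < cap:
        new = sum(poisson(rng.gammavariate(k, r0 / k), rng)
                  for _ in range(active))
        total += new
        active = new
    return min(total, cap)

# Same average R0, different dispersion (illustrative parameters).
rng = random.Random(42)
for k in (0.1, 10.0):  # k = 0.1 is roughly "super-spreading" territory
    sizes = [outbreak_size(2.5, k, rng) for _ in range(1000)]
    fizzled = sum(s < 10 for s in sizes) / len(sizes)
    print(f"dispersion k={k}: {fizzled:.0%} of seeded outbreaks fizzle")
```

With the same average R0, low dispersion concentrates transmission in rare super-spreading events: most seeded outbreaks die out on their own while a few explode, which is why preventing those rare events can matter more than suppressing typical contacts.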

Moreover, focusing on a single behavior to achieve a specific outcome may disguise unintended consequences that hinder progress towards the larger goal. A tight focus on target behaviors can mean seeing only a slice of the picture, and ignoring how actors in a system may respond by producing effects felt elsewhere.201

There are many such examples. In the US, grocery stores participating in a government voucher program successfully reduced fraud, but this effort also led to many stores no longer stocking high-nutrient foods - ultimately harming recipients' diets.202 A ban on plastic carryout bags led to 40 million fewer pounds of plastic being used for this purpose, but 12 million more pounds purchased as large trash bags instead.203 In Brazil, reminders for upcoming credit card payments reduced late payment fees by 14%, but also increased overdraft fees in bank accounts by 9%.204 Increasing efficiencies in hospitals produces situations where 'activities in one area of the hospital become critically dependent on seemingly insignificant events in seemingly distant areas'.205

LINEAR EFFECTS
The assumption: the intervention affects participants in a direct and linear way, according to a pre-developed theory of change.206

In a CAS, actors adapt to the behavior of others, so there is often not a simple relationship between inputs and outputs. Actors in a system may adapt to 'buffer' the effect of an attempted change and keep things apparently stable - i.e., making it seem that the intervention had no effects.207

However, repeated efforts may weaken these stabilizing factors, and then a minor subsequent event produces a 'tipping point', where change happens suddenly and the system flips into a new state.208 An example might be repeated challenges that weaken the commitment of a country's armed forces to democracy, which do not translate into action but which create the conditions for an apparently minor event to trigger a coup.

The point here is that if you do not see your expected behavior in a certain timeframe, in line with a linear 'theory of change', you may assume that it has failed. But, in fact, change may be happening through routes and over timescales you had not anticipated.

STABILITY
The assumption: you can measure the pre-specified target behavior between point A and point B. The system will remain stable over that time, and people will not adapt in response to the intervention.

Since actors adapt to new conditions, and influence each other in doing so, the nature of the problem may be changed by the introduction of an apparent solution itself.209 A snapshot of behaviors at one point in time is not enough to claim victory. Perhaps the best example is regulation: market players experiment with and gradually adapt to a regulatory regime, working out how to evade its provisions, until a new policy is needed.210 Therefore success may actually lie in how well behavioral scientists adapt to the unanticipated effects their own actions produce.211

What's particularly sobering for behavioral scientists is that their own field provides reasons why they may be holding onto these principles, despite their limitations. One is the 'reductive tendency', which is 'a process through which individuals simplify complex systems into cognitively manageable representations… when faced with complex concepts, individuals are often inclined to treat dynamic concepts as static, or to generalize across dissimilar domains'.212

In other words, behavioral scientists use heuristics based on linear models of change when confronted with complexity, since doing so is 'predictable, comforting, and less mentally taxing'.213 This tendency can exacerbate another one - an illusion of control. Those who are creating and implementing interventions may overestimate how much control they have over events and outcomes, since they are using an inaccurate mental model of how things work.214

The end result, some argue, is a failure to understand how actors are acting and reacting in a complex system, which leads policymakers to conclude that those actors are being 'irrational' - and then to disrupt the system in misguided attempts to correct perceived biases or inefficiencies.215

DEVELOP BEHAVIORAL SCIENCE THAT CAN TACKLE PROBLEMS IN COMPLEX SYSTEMS

We have outlined the criticisms. Now, we want to offer hope and a way forward. There is an opportunity to develop behavioral science so it can tackle the aspects of complexity that are common to major policy issues. The starting point is to show how behavioral science can shed new light on well-known features of CAS: the fact that small changes can have big impacts, and the way that actors often use a simple set of rules to navigate a system.

The idea that 'small changes can have a big impact' is often used in the sense that apparently minor features of the way a choice is designed or presented can have a surprisingly large effect on subsequent behavior. Used this way, the idea fits neatly with the standard approach of specific, isolated changes being applied to change a pre-defined behavior.

But CASs show that the statement is true in a different way. They show that 'higher-level' features of a system can actually emerge from the 'lower-level' interactions of actors participating in the system.216 When they become the governing features of the system, they then shape the 'lower-level' behavior until some other aspect emerges, and the fluctuations continue.

Let's make things tangible with examples. In the Covid-19 pandemic, people were trying to achieve their goals (live their lives) within a broad set of rules and in response to changing events. These adaptive behaviors in specific contexts interacted with the adaptive abilities of the virus. New variants emerged as a result.217 Some of these variants quickly became widespread and, in doing so, changed the nature of the whole pandemic, most obviously through their greater transmissibility or resistance to vaccines. Policymakers attempting to handle the changed situation then had to come up with new overall strategies.

Experiments have also shown how initial, random fluctuations can emerge and solidify into stark divides that shape societies. For example, a recent US study showed how partisan policy divisions may actually be produced by these random fluctuations.218 The experiment created online 'worlds' where self-identified Democrats or Republicans were asked whether they agreed with up to 20 statements. These statements concerned public issues, but had been selected so they did not reflect pre-existing partisan positions - e.g., whether there should be a move to professional full-time jurors.

In eight of the worlds, participants could see if mainly Democrats or Republicans were agreeing with the proposal. When this happened, strong partisan alignment quickly followed. But the important point is that which proposals fell into the Democratic or Republican camp varied greatly between worlds - sometimes the juror proposal was adopted by one side, sometimes the other. There was fluctuation in the initial stages, driven by chance and context, before a sudden non-linear alignment one way or another.

More fundamentally, we can see that norms, rules, practices, and culture itself can emerge from aggregated social interactions. These features then shape cognition and behavioral patterns in turn.219 The implications here disrupt the crude distinctions of 'upstream' versus 'downstream' or 'high-level' versus 'low-level' policies - or, as one recent paper put it, the "individual frame" and the "system frame".220 Instead we have 'cross-scale behaviors',221 where behaviors embedded in specific contexts, shaped by the overall way the system functions, can self-organize and emerge to shape the system itself.222

This way of thinking opens up new possibilities. Behavioral science could be used to identify 'leverage points' where behavior could be nudged in a way that produces wider system effects.223

Perhaps these interventions could be targeted at the stage of random fluctuations, when contingent features of the context can determine which behaviors get locked in. At that stage, a small, well-timed change could shift events onto a different path. For example, a recent study shows that presenting people with a random selection of opinions (a "random dynamical nudge") from others could prevent segregated echo chambers from forming in online environments.224
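The lock-in dynamics of these 'worlds' can be mimicked with a toy reinforcement ("Polya urn" style) simulation - our own sketch with invented parameters, not the study's actual design. Support from one's own party makes agreement more likely, so early chance endorsements snowball:

```python
import random

def simulate_world(n_participants=1000, rng=None):
    """One online 'world'. Each arriving participant belongs to a random
    party and sees how many of each party have endorsed the proposal so
    far. Endorsement probability rises steeply with own-party support
    (superlinear reinforcement), so early random endorsements get locked
    in. Returns the Democratic share of final support.
    All parameters are illustrative assumptions."""
    rng = rng or random.Random()
    yes = {"D": 1, "R": 1}  # smoothing prior so probabilities are defined
    for _ in range(n_participants):
        party = rng.choice("DR")
        other = "R" if party == "D" else "D"
        p_agree = yes[party] ** 2 / (yes[party] ** 2 + yes[other] ** 2)
        if rng.random() < p_agree:
            yes[party] += 1
    return yes["D"] / (yes["D"] + yes["R"])

rng = random.Random(7)
shares = sorted(round(simulate_world(rng=rng), 2) for _ in range(10))
print(shares)  # worlds polarize, but which side 'wins' varies by chance
```

Because the reinforcement is superlinear, each simulated world tends to polarize toward one party, but which party ends up "owning" the proposal varies from run to run - echoing the arbitrary alignment found across the experimental worlds.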

Another option is to identify when and where But there are also limitations to focusing on tipping
‘tipping points’ are likely to occur in a system - and points. One is that potential tipping points can
then either nudge them to occur or not, depending be difficult to identify.231 Inefficiency and multiple
on the policy goal.225 To do this would require attempts may be inevitable. Moreover, this
understanding how close the system is to abrupt approach ignores the possibility of influencing the
change, and whether there are stabilizing forces system to get to a tipping point in the first place.
that can be easily strengthened or weakened. To Fortunately, there is another route for behavioral
take a clear example from the animal kingdom: a science to address the structural features of
study identified that a society of monkeys was near a system.
a critical point, such that just a small disturbance
would lead to widespread violence. But the study We’ve seen that the connections between different
also found that some individuals played a crucial actors in a system can produce complex, multi-
role by adjudicating fights, and thus preventing a scale outcomes. But the individual actors in this
tipping point from being reached - strengthening this system often rely on a core set of relatively simple
role could therefore prevent collapse.226 rules to guide their behavior - e.g., ‘do what others
are doing’, ‘take the first available option’.232
Seen through the lens of behavioral science,
In the human world, a recent study showed how targeting 'structurally influential individuals' can create artificial tipping points in a similar way.227 The same principle can be applied more widely. If even a subset of consumers decide to switch to a healthier version of a food product, this can have broader effects on a population's health through the way the system realigns. It becomes marginally more profitable to stock the healthier option, which then may change the mix of products available to consumers in general.228

Changing the way that the medical use of cannabis is interpreted in U.S. drug law may end up affecting a whole range of substances that are currently illegal.229

Behavioral science could also be used to understand when a tipping point in behaviors and attitudes is incipient and may invite or require a government response. For example, it could have identified that public attitudes and behaviors related to smoking in the UK were shifting, so that legislation banning smoking in pubs would be both respected and welcomed.230

These rules are understood as mental shortcuts or frugal heuristics, and have been closely studied as the means by which people navigate their environments.233

We can now start to see a future approach that fuses behavioral science and complex systems, in order to:
• Provide evidence, not assumptions, about how people use heuristics to guide their interactions.
• Study the collective implications of these heuristics.
• Show how these heuristics play out against the governing conditions and structures of a system.
• Change the rules of that system to make it more likely that desired outcomes will emerge.

There are already examples that point the way. While it is known that factors like "animal spirits" and narratives affect the economy, macroeconomic models play a big role in guiding fiscal policies.234 These models are often built on the assumption of "rational agents" which, as behavioral science shows, is not always accurate.

New studies are emerging that model households as actors embedded in shifting social networks who use heuristics in response to other actors. For example, a recent one assumes that a household 'updates its savings rate by copying the savings rate of its neighbor with the highest consumption'.235 The study shows that if the speed at which this copying happens is changed, then a tipping point occurs and the households divide into a rich group and a poor group. In other words, modifying heuristics can produce non-linear shifts with profound effects.
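To make the mechanism concrete, here is a deliberately minimal agent-based sketch of such a copying heuristic. It is an illustration only - the function name, parameter names, and values are invented, and this is not the model used in the study cited above.

```python
import random

def simulate_savings(n=100, periods=200, copy_prob=0.5, income=1.0,
                     interest=0.05, seed=0):
    """Toy model (illustrative, not the cited study): households sit on a
    ring, earn income plus interest on wealth, and copy the savings rate
    of whichever neighbor consumes more. `copy_prob` stands in for the
    'speed' of copying."""
    rng = random.Random(seed)
    rates = [rng.uniform(0.05, 0.95) for _ in range(n)]  # savings rates
    wealth = [0.0] * n
    for _ in range(periods):
        cash = [income + interest * w for w in wealth]
        consumption = [(1 - r) * c for r, c in zip(rates, cash)]
        wealth = [w + r * c for w, r, c in zip(wealth, rates, cash)]
        new_rates = list(rates)
        for i in range(n):
            left, right = (i - 1) % n, (i + 1) % n
            envied = left if consumption[left] >= consumption[right] else right
            if rng.random() < copy_prob:  # copying happens at some 'speed'
                new_rates[i] = rates[envied]
        rates = new_rates
    return rates, wealth
```

Sweeping `copy_prob` and inspecting the resulting distribution of `wealth` is the kind of exercise that can reveal whether, and when, a population splits into distinct groups.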
What are the practical implications? The standard approach would suggest trying to directly influence the way people use heuristics - for example, by creating interventions to change the way households learn about savings from each other. But a better way could be to harness the power of the system and make a targeted change in one of its features, which then drives wider effects.

Here are some examples of what we mean.

EXAMPLE

We mentioned earlier that social media has been viewed as a major factor in driving changes in
collective human behavior. A core constitutive feature of social media is the ease with which
information can be shared. Even minor changes to this parameter can drive widespread
changes - some have argued that such a change is what created the conditions leading to the Arab
Spring, for example.236 If 'changes to a few lines of code can impact global behavioral processes',
then a priority should be understanding these changes and how they can be made, if appropriate.237

EXAMPLE

The UK's tax on sugared drinks also worked by identifying and modifying a key system parameter.
In that case, the tax was designed in tiers, so that higher levels of sugar content resulted in higher
taxes. This design choice altered the incentives presented to manufacturers, making it more attractive
for them to reformulate their products. Rather than directly persuading consumers to consume less sugar,
this approach instead tries to gear the system dynamics of the market towards reductions in the sugar
content of available products.238

EXAMPLE

In London, red lights are typically the default at pedestrian crossings. People have to push a button
and wait for the green signal to go ahead. In 2021, Transport for London changed 18 crossings so
that their crossing lights default to green. Pedestrians didn’t have to stop and press a button. They’d
see the green light and be able to cross right away - unless a vehicle was detected, at which point,
the signal would turn red automatically. The change meant that pedestrians saved time, complied
with signals more, and there was virtually no increase in delays for traffic.239 This was both a classic
nudge (changing the default) and an example of changing one part of a system to have wider
effects. Increasing compliance with the signals makes walking safer; the default change makes
walking faster. The result could be an increase in pedestrian traffic that reshapes overall transit
policies and services.

In these examples, the idea is to find targeted changes to features of a system that create the conditions for wide-ranging shifts in behavior to occur. Complexity is used to structure effective action, rather than as an excuse for doing nothing.

This approach also suggests that a broader change in perspective is needed. We need to realize the flaws in launching interventions in isolation and then moving on when a narrowly defined goal has been achieved. Instead, we need to see the longer-term impact on a system of a collection of different policies with varying goals.240 Doing so will also address some of the criticism directed at behavioral economics regarding "ergodicity" - that it neglects how repeated decisions are taken over time in changing contexts, and wrongly diagnoses irrationality as a result.241

One way of thinking about this ambition is that behavioral scientists should be thinking more about system stewardship. A decade ago, system stewardship was proposed as a way of harnessing the power of complex adaptive systems, while recognizing that they cannot be controlled in a direct or linear way.242

We don't explain system stewardship in detail here, but it involves setting high-level goals or a set of simple rules for actors in the system, seeking feedback, and responding to how the system is adapting as a result. The idea is to create the conditions for certain behaviors, see how a system can respond to issues through emergence and adaptation, and steer that adaptation towards broad goals if needed.

SEE THE SYSTEM

Finally, we want to turn to the main message of this proposal. Not every problem that behavioral scientists face will involve a complex adaptive system. So practitioners should first understand the type of system that they are dealing with, and then choose their approach accordingly. You need to see the system.

When we are dealing with a simple problem or system, the linear process works well. But in situations of complexity it may produce false precision that misses what is actually going on. In those cases, you should be looking for leverage points, whether tipping points or targeted changes to system features, and taking a system stewardship perspective.

Of course, identifying the nature of the system you are dealing with may not be easy, since different elements of complexity may emerge over time.243 But evidence is emerging that people can be trained to recognize the features of a CAS. One promising route is to share core concepts with people and then have them play agent-based simulations as, say, farmers or policy makers; these simulations can show vividly how well-intentioned changes in one area can have unintended consequences in another.244 Others are exploring the value of novel immersive technologies,245 or just basic checklists.246 We think behavioral science can play a proactive role in this effort.

METHODS

04
PUT RCTs IN THEIR PLACE
RCTs have been a core part of applied behavioral science, and
they work very well in relatively simple and stable contexts. But
they can fare worse in complex adaptive systems, whose many
shifting connections can make it difficult to keep a control group
isolated, and where a narrow focus on predetermined outcomes
may neglect others that are important but difficult to predict.

We can strengthen RCTs to deal better with complexity: we can try to better anticipate indirect outcomes that may occur; to set up RCTs to measure diffusion and contagion in networks; and to adopt adaptive trial designs. We can also use behavioral science to improve alternative ways of measuring impact - in particular agent-based modeling, which often relies on assumptions of rational choice.

Where does all this leave randomized controlled trials (RCTs)? RCTs have been a central part of the applied behavioral science process, given its emphasis on quantitative methods that produce a robust counterfactual to estimate an intervention's impact.247 Yet they may deal poorly with the features of complex adaptive systems. We can see this in two main ways: dealing with change and establishing causality.248

DEALING WITH CHANGE

The generally accepted best practice for RCTs is to choose specific, measurable outcomes in advance and pre-register them; changing them subsequently requires effort and may even attract suspicion.249 But, as we pointed out, a narrow focus on predetermined outcomes risks neglecting ones that are important but difficult to predict.250 And new outcomes are not the only issue: "new questions, causal pathways, stakeholders or even objectives may emerge during the evaluation which the original evaluation design does not accommodate".251

The way that non-linear change happens over time also presents a challenge to RCTs. For example, we may plan to measure effects over a predefined period after an intervention, say 12 weeks, based on a linear theory of change. The problem is that the change may not be linear - nothing may happen initially, but then change occurs suddenly and in a non-linear way after 3 weeks (or 13) and continues to grow after the 12-week cut-off.252 Of course, this is more a point about how many RCTs are applied in practice, rather than about what RCTs can do if they are designed perfectly.

The intervention itself may destabilize the system and reconfigure the nature of the issue at stake. Generally, RCTs are not good at dealing with this kind of emergent change.

ESTABLISHING CAUSALITY

One of the main advantages of RCTs is that they identify causal effects by separating out a control group that is not exposed to an intervention. However, the many shifting connections in a CAS make it more difficult to ensure that a control group remains isolated. This kind of 'contamination' is a particular problem for policy makers who are tasked with influencing a system as a whole - for example, it is difficult to conduct an RCT on the introduction of new tax initiatives or criminal justice guidelines at the national level.

We want to stress that, in contexts that are simpler and more stable, RCTs work well. There were very good reasons why applied behavioral science embraced the precision, skepticism, and emphasis on causality that real-world RCTs can bring. However, we also need to respond to these challenges by:

• Finding ways of strengthening RCTs so they are better aligned to the demands of evaluating changes in complex systems.
• Identifying how behavioral science can use and enhance other ways of evaluating impact.

STRENGTHENING RCTS

The first opportunity to strengthen RCTs comes in the planning phase. We can try to better anticipate indirect outcomes that may occur and draw boundaries or select measures accordingly. For example, BIT worked on an RCT that tested the impact of sending parents texts encouraging them to ask their children questions about their science curriculum.253 As intended, the intervention increased at-home conversations about science. But the RCT design also made it possible to see that the texts also made parents less likely to engage in other school-related discussions and supportive behaviors (e.g., turning off the television). By anticipating this possibility, the RCT could measure the broader shifts in behavior created by the intervention.

The need, therefore, is to gain a better understanding of the system interactions and anticipate how they may play out.254 This can be done through "dark logic" exercises that try to trace potential harms, rather than benefits, thereby challenging assumptions about an intervention's effects.255 One such study indicated that providing clinicians with concerning feedback on their performance may impede (rather than improve) their confidence.256 Engaging the people who will implement and participate in an intervention will be a key part of this effort. Another potential route is to use scenarios or early-stage prototypes to gain insight into the way issues may emerge.257

Similar thinking has been used to develop "hybrid implementation-effectiveness designs".258 As well as measuring overall effectiveness, these designs try to identify how the different parts of an intervention interact: for example, in the case of the education RCT mentioned, the parents, the students, and the school all played different roles. These designs typically require greater stakeholder involvement in order to get good insights into these interactions, but the interventions may be adopted more widely as a result.

The second opportunity is to set up RCTs to measure diffusion and contagion in networks. An example is the study on how political partisanship forms, mentioned earlier, which created separate online "worlds" to compare how contagion and adaptation play out in different conditions. Indeed, online may be the natural home for these kinds of RCTs, since they require large populations, complete adoption data, complete network data, and replication.259

But they can also be done in the real world, through cluster randomizations at the network level (if those networks are not well-connected with each other). One way this has been done is by testing how social networks in different villages adapt to different interventions. For example:

EXAMPLE

Randomizing villages in Honduras to target a) randomly selected villagers, b) villagers with the most social ties, or c) nominated friends of random villagers as the people to promote the use of water purification and vitamin intake. The behaviors spread most widely when route C was taken.260

EXAMPLE

Adoption of new, higher-productivity farming technology was greater in Malawian villages where 'seed farmers' had been selected by network theory-based approaches, rather than using a government extension worker to identify them.261

EXAMPLE

In 522 Indian villages, getting people to nominate 'gossips' led to much higher uptake of immunizations than in villages where the same information was provided to randomly selected individuals.262
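The village studies above all rest on cluster randomization: whole networks, not individuals, are assigned to conditions, so spillovers stay within a cluster. A minimal sketch of the assignment step (function and variable names are hypothetical):

```python
import random

def cluster_randomize(unit_to_cluster, arms, seed=0):
    """Assign whole clusters (e.g., villages) to arms, so everyone in a
    cluster gets the same condition and within-cluster ties cannot
    'contaminate' the comparison between arms."""
    rng = random.Random(seed)
    clusters = sorted(set(unit_to_cluster.values()))
    rng.shuffle(clusters)
    # Balanced assignment: deal the shuffled clusters out across the arms.
    arm_of_cluster = {c: arms[i % len(arms)] for i, c in enumerate(clusters)}
    return {u: arm_of_cluster[c] for u, c in unit_to_cluster.items()}
```

For example, passing a mapping like `{"person_1": "village_a", ...}` returns one condition per person, constant within each village; the analysis then has to respect the clustered design (fewer effective units than people).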

BIT has also produced guidance on how to evaluate complex interventions affecting whole schools.263

The third opportunity is to build feedback and adaptation into the RCT design.264 An obvious starting point is to actively seek out emerging change and 'weak signals' that show how the intervention is playing out in practice.265 To use the previous example of encouraging parental conversations about science - if the researchers had not anticipated that other kinds of parental support might decline, they might have discovered this was happening by talking to students about how things were going. To handle this possibility, BIT has proposed using a 'two-stage trial protocol', where new research questions are added while a trial is in progress, but before analysis starts. This two-stage process allows emerging possibilities to be analyzed, without entering the problematic territory of "hypothesizing after results are known".266

We can also use feedback to adapt the trial design and interventions, as in a Multiphase Optimization Strategy.267 For example, one study aimed to improve the reading skills of third-grade students in Colombia through remedial tutorials in small groups.268 Since three consecutive cohorts of students were involved, the results of each wave could be used to fine-tune the intervention for the next wave. As a result, the effectiveness of the intervention increased over time.

An 'evolutionary RCT' has been offered as a specific form of adaptive trial.269 Here, participants are randomized to the treatment group at a 2:1 ratio compared to the control group. The idea is that behavioral scientists would explore how the intervention is working, run side experiments, adapt the intervention in response, and then 'split off' some of the treatment group to assign them to the adapted intervention.

A similar option is a SMART (Sequential Multiple Assignment Randomized Trial) or a micro-randomized trial.270 The idea here is that you actively explore and monitor how people are responding to an intervention - if the evidence shows it seems to be ineffective for some, then they are re-randomized to another intervention or a control group. For example, if an initial diabetes prevention intervention does not seem to be affecting the blood glucose levels of some people, they are split off and re-randomized to a more intensive option. The idea is that you can evaluate an adaptive series of interventions.

This is a fast-developing area.271 For example, there is much interest in the contributions machine learning can make. "Bandit" algorithms can be used to identify which interventions are working best, and gradually add more people to those experimental arms (in other words, automating the "evolutionary RCT" process).272 These have been commonplace in A/B marketing tests, but have some drawbacks.273 We are still learning which conditions favor adaptive trials - e.g., when outcomes vary widely between groups; when results can be realized and observed in waves, as in school years or terms; and when it is easy to switch people between interventions.

However, adaptive designs do not address all the issues mentioned earlier. And they have disadvantages: for evolutionary RCTs, it can be hard to make the case for recruiting a bigger than normal treatment group; SMARTs can be complex to manage and are quite focused on individuals, rather than gaining system-level feedback.274 But they represent improvements worth considering when dealing with complex adaptive systems.

USING AND ENHANCING OTHER WAYS OF EVALUATING IMPACT

There are other ways of assessing effects in complex environments as well. One approach that is particularly relevant to behavioral science is agent-based modeling (ABM). ABM tries to simulate the interactions between the different actors in a system, rather than imposing the functioning of a system in a top-down way.275 That means ABM can replicate many of the features seen in complex systems, like tipping points and the emergence of higher-level features from lower-level features.

The practical point is that ABM can create a virtual counterfactual - a representation of what would have happened in the absence of an intervention, and what future change may occur.276 For example, one study used data from Sicily to create realistic simulations of how people are recruited into organized crime and then tested the effects of implementing various policies to stop that happening (e.g., targeting leaders, providing education support).277
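To make the "bandit" idea mentioned above concrete, here is a minimal Thompson-sampling sketch for a trial with binary outcomes. It is an illustration with invented names and response rates, not a production trial engine:

```python
import random

def pick_arm(successes, failures, rng):
    """Thompson sampling: draw from each arm's Beta posterior and
    play the arm with the highest draw."""
    draws = [rng.betavariate(s + 1, f + 1)
             for s, f in zip(successes, failures)]
    return draws.index(max(draws))

def run_adaptive_trial(true_rates, n_participants=1000, seed=1):
    """Simulate allocation: arms that look better get more people.
    `true_rates` are hypothetical per-arm success probabilities."""
    rng = random.Random(seed)
    k = len(true_rates)
    successes, failures = [0] * k, [0] * k
    for _ in range(n_participants):
        arm = pick_arm(successes, failures, rng)
        if rng.random() < true_rates[arm]:  # simulated outcome
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]  # allocations
```

With invented response rates of, say, 0.10 and 0.30, most participants end up in the second arm - an automated analogue of 'splitting off' people to the better intervention.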

We single out ABM because we think behavioral science can improve it.278 As we mentioned earlier in relation to macroeconomic models, the agents in ABM are mostly assumed to be operating on rational choice principles.279 Therefore, there is a big opportunity to build in more evidence about the drivers of behavior - for example, habits and social comparisons.280

However, it may not be clear which of these drivers is most relevant for the issue at hand. Therefore, the best course may be to 'include different theories into the model to assess the sensitivity of the results to different assumptions about human decision making'.281 Doing this allows you to assess how resilient your intervention is to the uncertainties about behavior generated by complex adaptive systems. In other words, an improved ABM could offer a sophisticated way of stress testing policies in advance.

While behavioral science can improve ABM, the requirements imposed by formal models also reveal that behavioral science needs to sharpen up its theories in terms of clarifying causal relationships and how change occurs over time.282 We build on this point later in our proposal Beyond lists of biases.

PUT RCTS IN THEIR PLACE

One argument is that 'while an initial focus on rigorous empirical research helped BI teams establish themselves in policy making, strict adherence may represent a risk to their long-term growth and relevance'.283 We agree in the limited sense that finding new options is a good idea: behavioral science should adopt and improve alternative approaches to evaluation. But we stress that the main point is to match the evaluation approach to the type of system and issue at hand. In simpler situations, standard RCTs can work well. When dealing with complex adaptive systems, RCTs can be strengthened to address some of the limitations that become apparent - and the act of doing so should drive behavioral science itself forwards.

05
REPLICATION, VARIATION
AND ADAPTATION
The “replication crisis” of the last decade has seen intense debate
and concern about the reliability of behavioral science findings.
Poor research practices were definitely a major cause of the replication crisis; the good news is that many of these practices have improved as a result.

We need to secure and build on these advances, so we move towards a future where meta-analyses of high-quality studies (some deliberate replications) are used to identify the most reliable interventions, develop an accurate sense of the likely size of their effects, and avoid the weaker options. We have a responsibility to discard ideas if solid evidence now shows they are shaky, and to offer a realistic view of what behavioral science can accomplish.

That responsibility also requires us to have a hard conversation about heterogeneity in results: the complexity of human behavior creates so much statistical "noise" that it's often hard to detect consistent signals and patterns. This much heterogeneity makes the idea of replication itself problematic: a "failed" replication may not show that a finding was false, but rather how it exists under some conditions and not others.

These challenges mean that applied behavioral scientists need to set a much higher bar for claiming that an effect holds true across many unspecified settings.284 There is a growing sense that interventions should be talked about as hypotheses that were true in one place, and which may need adapting in order for them to be true elsewhere as well.

We need specific proposals as well as narrative changes. The first concerns data collection: expand studies to include (and thus examine) a wider range of contexts and participants, and gather richer data about them. Coordinated multi-site studies will be needed to collect enough data to explore heterogeneity systematically; "crowdsourced" studies offer particular promise for testing context and methods.

We also need to get better at judging how much an intervention's results were linked to its context - and therefore how much adaptation it may need. Behavioral scientists should use and modify frameworks from implementation science to develop such judgment. Finally, we need to develop and codify the practical skills that successfully adapt interventions to new contexts; expertise in behavioral science should not be seen as simply knowing about concepts and findings.

WHAT DID THE REPLICATION CRISIS DO FOR US?

The "replication crisis" of the last decade has seen intense debate and concern about the reliability of behavioral science findings.285

This crisis has brought about many advances, and we need to secure and build on them. But we also need to have a hard conversation about how far the underlying goal of creating replicable, generalizable findings is actually achievable.

Replication is seen as 'a cornerstone of science'.286 An attempt to replicate a study should obtain similar results, if it maintains important features of the original. Most people consider "similar" to mean the effects are in the same direction and are statistically significant.287 The problem is that large-scale studies have found that only between a third and two thirds of replication attempts produce similar results to the originals - and often the effect sizes are much smaller.288

A high-profile example is a 2012 study that found that "signing at the top" of a car insurance declaration increased honest reporting of mileage more than signing at the bottom, as is traditional.289 The idea was that signing beforehand made ethical concepts prominent to an individual just as they had the opportunity to be dishonest. After BIT failed to find a similar effect in large real-world studies,290 other careful attempts (including one co-authored by the original researchers) have come to the same conclusion.291 It could be that this intervention does have an effect in some settings but, on a practical level, there is a real cost to keep testing it rather than other ideas.

Poor research practices were a major cause of the replication crisis. Some of the main ones include publication bias, low statistical power, hypothesizing after results are known, over-reliance on null hypothesis testing, and overly flexible approaches to data collection, measurement tools, and analysis (e.g., stopping data collection once a desired result has been achieved).292

The good news is this exposure has improved academic research practices.293 There are sharper incentives to pre-register analysis plans, greater expectations that data and code will be freely shared, and wider acceptance of post-publication review of findings.294

However, we do need to push back against some of the ideas that have emerged from the replication crisis. One is to interpret replication attempts in a binary way - as showing something that "did" or "did not" replicate. There's a good case that this way of thinking is 'inadequate'.295 Instead, replications should be seen as contributions to a larger ongoing meta-analysis, and judged in terms of how much they add to the existing stock of knowledge.296 In fact, this kind of analysis may show that even a non-significant result can advance the conclusion that an effect is real.297

This way of thinking points us towards a future where meta-analyses of high-quality studies (some deliberate replications) are used to identify the most reliable interventions, develop a realistic sense of the likely size of their effects, and avoid the weaker options. Operating in the real world incurs real costs; we have a responsibility to discard ideas if solid evidence now shows they are shaky. Overall, rather than over-inflating claims, practitioners should 'offer a balanced and nuanced view of the promise of behavioral science'.298

This vision seems achievable. Evidence suggests that at least some experts in behavioral science do indeed update their views in response to new replication results.299 More and more reviews based on better quality studies are emerging, giving us a stronger sense of what impact to expect from different kinds of interventions.300

IS REPLICABLE, GENERALIZABLE KNOWLEDGE A REALISTIC GOAL?

We may need to rethink our ambitions, however. It's important to note that there has been a general aim for findings not only to be replicable, but also generalizable. In basic terms, if replicability is about whether the same results would emerge again in the same conditions, generalizability is about whether the same results would emerge again in different conditions.301

We usually place greater value on evidence that is generalizable. From a scientific viewpoint, 'effects that can only be reproduced in the laboratory or under only very specific and contextually sensitive conditions may ultimately be of little genuine scientific interest'.302 From a practical viewpoint, we often look for interventions that can be widely "scaled", and get concerned that effectiveness often seems to fall when they are taken to new places (the so-called "voltage drop").303

Now we can pose the deeper challenge that has been emerging. Is replicable and generalizable knowledge actually a feasible goal for applied behavioral science? Recently, it's become clear that the complexity of human behavior creates so much statistical "noise" that it's hard to detect consistent signals and patterns. In other words, results from studies of behavior are highly heterogeneous.304
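One way to picture the 'ongoing meta-analysis' view described above is basic random-effects pooling, where the quantity tau² captures exactly this between-study heterogeneity. Here is a sketch using the DerSimonian-Laird estimator, with invented effect sizes and standard errors:

```python
import math

def random_effects_meta(effects, ses):
    """DerSimonian-Laird random-effects pooling of study effect sizes.
    Returns the pooled effect, its standard error, and tau^2 - the
    between-study variance, i.e. the heterogeneity discussed above."""
    w = [1.0 / se ** 2 for se in ses]          # fixed-effect weights
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]   # random-effects weights
    pooled = sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star)), tau2
```

Feeding in an original study and its replications (e.g., effects `[0.30, 0.10, 0.05]` with standard errors `[0.10, 0.05, 0.08]` - invented numbers) yields a pooled estimate that sits between the extremes, plus an explicit measure of how much the studies disagree beyond sampling error.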

Several large studies have confirmed that results seem to vary between experiments in ways that are difficult to explain.305 It seems increasingly clear that 'heterogeneity cannot be avoided in psychological research - even if every effort is taken to eliminate it'.306

This situation makes replicable, generalizable findings much less likely. One massive analysis found so much heterogeneity across 8,000 studies that it concluded that 'there will always be a sizable chance of unsuccessful replication, no matter how well conducted or how large any of the studies are'.307 In fact, the idea of replication itself becomes more problematic: a "failed" replication may not show that a finding was false, but rather how it exists under some conditions and not others.308

How can we start cutting a path through this thicket? The first step is to carve out two different (but related) drivers of heterogeneity: 1. Contexts influence results; 2. Effects vary between groups.309

CONTEXTS INFLUENCE RESULTS

Just now, when explaining replication, we said that similar results should emerge if another study 'maintains important features of the original'. But do we know how many and what features are important?310 When people set up experiments, they make myriad choices about things like the precise wording of messages, the time of day they are given, how participants are recruited, and so on.311 Often they are guided by a desire to maximize their chances of detecting an effect.312

The point is that these choices vary greatly between studies and experimenters, in ways that often go unnoticed.313 That's true even for replication studies that explicitly aim to be standardized.314

These choices then interact with the inherent complexity of human behavior. The result is that neglected or undocumented aspects of the setting or intervention design may be influencing the observed outcome. When the context changes, those elements may not be present, and so a study may seem to have failed to "replicate" or "scale up".315

A recent study actually ran an experiment to measure the impact of these contextual factors. Participants were randomly allocated to studies designed by different research teams to test the same hypothesis. For four of the five research questions, studies actually produced effects in opposing directions. These 'radically dispersed' results indicate that 'idiosyncratic choices in stimulus design have a very large effect on observed results'.316

In a way, we shouldn't be too surprised. The strong influence of context on behavior has been one of the central insights from social psychology over the last century.317 Moreover, we are trying to have an effect in the real world. We are not dealing with how laboratories set up experiments. We are dealing with complex adaptive systems that expose people to many cues, and where acting can change the context in unexpected ways.318 For example, while reminders may influence behavior effectively in initial studies, they may lose their power if people become surrounded by too many reminders, each coming from different sources.319

We are also dealing with behavior embedded in institutions and cultures. Those of us trying to shape and improve public policy are confronted by 'intricate bundles of rules, procedures, institutional designs, and contextual constraints'.320 General concepts coming from studies may provide only a vague guide here. Similarly, different cultural meanings can render a study ineffective for unanticipated reasons. While the color red is associated with danger in the West, in Chinese culture it symbolizes good fortune. But even cultural variations are not certain to have an effect - they may remain dormant unless 'the right stimulus is triggered'.321

EFFECTS VARY BETWEEN GROUPS

Context cannot explain all the heterogeneity in studies of behavior.322 Another factor is that the effect of an intervention may vary greatly between groups within a population - even though we often talk in terms of an overall "average treatment effect".323
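Before turning to examples, the arithmetic of this point can be shown with a toy calculation. All numbers below are invented for illustration: an intervention that works for one group and does nothing for another still produces a positive "average" effect.

```python
def average_treatment_effect(treated, control):
    """Difference in mean outcomes between treated and control units."""
    return sum(treated) / len(treated) - sum(control) / len(control)

# Invented outcomes (e.g., points on some behavioral measure).
# Group A responds to the intervention; group B does not.
group_a_treated, group_a_control = [12, 14, 13, 11], [8, 9, 7, 8]
group_b_treated, group_b_control = [8, 9, 7, 8], [8, 9, 7, 8]

ate_a = average_treatment_effect(group_a_treated, group_a_control)      # 4.5
ate_b = average_treatment_effect(group_b_treated, group_b_control)      # 0.0
overall = average_treatment_effect(group_a_treated + group_b_treated,
                                   group_a_control + group_b_control)   # 2.25
```

The pooled effect (2.25) describes nobody: the intervention moves group A by 4.5 and group B not at all. A wider rollout that recruits mostly group B would therefore show a diluted "average" effect even though the effect in group A is unchanged.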

For example, a study in India used a smartphone app to give drivers feedback on their driving, using three different kinds of notifications: a reminder of personal best driving performance, average driving performance, or their performance on their latest trip.324 The first two nudges were effective at improving driving among participants overall, compared to a 'no nudge' control group.

But the study also found significant variation: the "personal best" nudge was significantly better for high-performance drivers who did not seek feedback often; the "average" nudge worked best for low-performance drivers who sought feedback often. Lower performers appeared to be more motivated by goals that were easier to achieve. The study concluded that firms could improve driver performance by 11% through tailored messaging.

Results like these have sparked calls for a 'heterogeneity revolution' in behavioral sciences, which accepts that most effects vary, and rejects the idea that an effect needs to hold across all groups in order to be important or real.325 Such a call is a challenge to a nudge approach that often prioritizes achieving overall marginal shifts in the behavior of large populations.

Let's link this idea back to the replication crisis again. Effects vary between groups. When an intervention is applied more widely, the makeup of the participants may be different from the original study. That means that the intervention may seem to no longer have an overall effect, even though it is having the same impact on some groups as before.

What looks like a failure to replicate a main finding may actually be driven by more nuanced underlying patterns of behavior that sometimes map poorly against the "haphazard" samples that are collected.326 An intervention may be dismissed completely when it actually "works" for a meaningful group of people.

For example, the company OPower showed in several studies that energy consumption was reduced by 2%, on average, if customers were shown how their usage compared with that of their neighbors.327 But later studies showed that the average treatment effect was much smaller when implemented at scale.328 This is because the intervention had more effect on people with higher incomes and more concern for the environment, and there were more of such people in the initial studies.

This is not just an academic debate. Focusing on the population as a whole may lead to a discussion about whether an intervention "works" that is heavily weighted towards the largest group in the population. An intervention that has a benefit on average may disguise how some groups experience no benefit - or even harms.
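The arithmetic behind this masking is simple. A minimal sketch (all shares and effect sizes below are invented for illustration, not taken from the studies cited):

```python
# Illustrative sketch: how subgroup effects combine into an average
# treatment effect (ATE). All shares and effects below are hypothetical.

def average_treatment_effect(shares_and_effects):
    """Weighted average of subgroup effects; weights are population shares."""
    return sum(share * effect for share, effect in shares_and_effects)

# Pilot sample skews towards an "engaged" group that responds strongly.
pilot = [(0.7, -3.0), (0.3, 0.0)]   # (population share, % change in energy use)
# At scale, the responsive group is a much smaller share of the population.
scaled = [(0.2, -3.0), (0.8, 0.0)]

print(round(average_treatment_effect(pilot), 2))   # -2.1
print(round(average_treatment_effect(scaled), 2))  # -0.6
```

Holding the subgroup effects fixed, shifting the population mix alone shrinks the headline number - which is why an intervention can appear to "stop working" at scale.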
MAKE MORE CONSERVATIVE CLAIMS

These challenges mean that applied behavioral scientists need to set a much higher bar for claiming that an effect holds true across many unspecified settings.329 Our narrative and ambitions may need to change. Over the past decade, many organizations have offered frameworks that aim to raise the base level of behavioral science knowledge (for BIT, these include EAST and MINDSPACE). They offer deliberately simplified messages like "people are influenced by the behavior of others".

We don't need to ban these kinds of statements.330 But we do need to talk about them more like the way behavioral scientists have talked about incentives: that incentives definitely influence behavior, but they can also backfire, and we can find ways of making them more effective.331

We need to be more conservative in explaining that findings may have emerged from a particular context (which is different from saying they are poor-quality).332 And we need to be realistic about the size of the impact that interventions based on these findings are likely to have.

It's going to be hard to move away from the simple idea that an intervention "works" - not least because average treatment effects will continue to matter in environments where evidence is scarce and the ability to target interventions is limited. Instead, there is a growing sense that interventions should be talked about as hypotheses that were true in one place, and which may need adapting in order for them to be true elsewhere as well.333

We need specific proposals as well as narrative changes. To address these issues, we propose: multi-site studies to explore heterogeneity systematically; better identification of context-dependence in results; and, most importantly, developing skills for adapting ideas to settings.
RUN MULTI-SITE STUDIES TO GET BETTER DATA

Once we understand that results vary by context and groups, the next step is to understand how and why.334

First, we can recruit participants using existing methods, but gather richer data about them. To date, only a small minority of behavioral studies have provided enough information to show how effects vary.335 Moreover, such gaps in data coverage may result from and create systemic issues in society: certain groups may be excluded or may have their data recorded differently from others.336 Part of the solution may link back to the need for greater diversity among behavioral scientists, since 'diverse researchers collect diverse data' - for example using multiple race-related items, rather than a single binary variable.337

Second, we can expand studies to include (and thus examine) a wider range of contexts and participants. Experiments would be constructed to answer for whom the effect appears, and under what circumstances, rather than "Did it work?"338

Large replication projects took a step in this direction, but have not been designed to systematically vary aspects of method and context to measure their effects.339 BIT is attempting to do this for an intervention that uses the "teachable moment" of hospital attendance to reduce future violence.340

Here's an analogy. Econometricians carry out "sensitivity analyses" to understand under what conditions their models predict behavior - i.e., stress testing them. Behavioral scientists have minimized the need for these analyses by running experiments - but this means that the specific context of the experiment is fixed, so we are advocating for stress testing findings by varying the context.341

'Crowdsourced' studies may be particularly suited to testing context and methods. As explained above, different research teams each come up with a way of testing the same hypothesis, and participants from the same pool are randomly assigned to each. No team sees how the others are approaching the problem. The study can then run a meta-analysis that combines effects across the studies to determine which hypotheses are supported. The ensuing results are the ones that survive the stress test of being subjected to varying approaches (method heterogeneity). A recent study of this kind, with 15,000 participants, found that two of the five hypotheses tested were confirmed by this approach.342

In terms of measuring how effects vary by groups, an obvious first step is to recruit more diverse samples. The lack of cultural diversity in participants in behavioral science research has been much discussed, but things are starting to change.343 For example, the rise of online platforms has made it easier to collect data from participants around the world. A recent study tested the main principles of prospect theory with 4,000+ participants across 19 countries (although most were Western).344

Nevertheless, the truth is that collecting varied data better is going to be hard. Organizations like BIT, which work across many different continents, seem to have the greatest ability to deliberately explore how effects vary by group characteristics. Yet there are many barriers to coordinating data collection across different clients and countries, since we are reliant on budgets and willingness to cooperate. Often we need to work with public administrative datasets, but these are often limited (containing just age, gender, and location, for example).

Realistically, practitioners are going to be looking to academia to initiate these advances. And a major investment in research infrastructure will be needed to set up standing panels of participants, coordinate between institutions, and reduce barriers to data collection and transfer.345 There are promising signs. Recently, projects like Behavior Change for Good have been producing 'mega studies' that test a wide range of interventions with hundreds of thousands of people across many sites.346 These studies represent a real advance, and the model could be adapted to vary context, methods, and participants even more systematically.
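To make the pooling step concrete, here is a minimal sketch of an inverse-variance, fixed-effect meta-analysis of the kind such a study might run. The team estimates are invented, and the cited study's actual method may differ:

```python
import math

# Sketch of a fixed-effect, inverse-variance meta-analysis across teams.
# Each team reports (effect_size, standard_error); the numbers are invented.

def pool_fixed_effect(estimates):
    """Pooled effect and standard error, weighting by inverse variance."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * eff for (eff, _), w in zip(estimates, weights)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Three hypothetical teams testing the same hypothesis with different designs:
teams = [(0.20, 0.08), (0.05, 0.10), (0.12, 0.06)]
effect, se = pool_fixed_effect(teams)
z = effect / se  # |z| > 1.96 is roughly significant at the 5% level
```

In practice, where effects genuinely vary between designs, a random-effects model that adds a between-study variance term would be the more cautious choice.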
PREDICT CONTEXT-DEPENDENCE

We need to get better at judging how much an intervention's results were linked to its context - and therefore how much adaptation it may need. There are reasons to be optimistic here. One study got people to rate how contextually sensitive a topic was; topics with higher scores were less likely to have successful replication studies.347 In other words, judgments of context-dependence did seem to link to replication outcomes.

Stronger evidence comes from the 'crowd-sourced' study mentioned earlier. It found that scientists were able to anticipate the results of an intervention just by examining the materials (e.g., sample size, methodology, materials). Most importantly, they could distinguish between studies that were testing the same hypothesis. That means they had a 'fine-grained sensitivity' to which of the design choices would be most effective at realizing the concept in that setting.348

The need is to cultivate and teach this sensitivity, so we can predict which interventions are likely to require adaptation, and focus resources accordingly. Currently we don't know how these abilities are developed: behavioral scientists have not reached consensus on which aspects of context matter.349 BIT and others have sketched out the aspects of 'scalability', but these are still quite basic and focus more on whether the implementation mechanics can scale, rather than if the results will.350

Behavioral scientists need to realize that fields like implementation science have thought more deeply about this topic. For example, the Consolidated Framework For Implementation Research (CFIR) was created in order to map the factors involved with moving an intervention from one context to another.351 These factors include the criteria of "adaptability", "complexity", and "trialability", which could be used to make the kinds of judgments we outline above. "Adaptability" is based on the idea that there is a distinction between an intervention's "core components" and its "adaptable periphery"; the latter can be modified to fit the setting without undermining the effectiveness of an intervention. Behavioral scientists should be using and modifying frameworks like this to judge how much resource an intervention will need for adaptation.

ADAPT INTERVENTIONS SKILLFULLY

We need to develop and codify the practical skills that successfully adapt interventions to new contexts. The first step is to value them properly. Expertise in behavioral science is still mainly seen as knowing about concepts and findings. That is necessary but not sufficient, given the power of context. We can't simply pluck ideas from a generic "nudge store" and expect them to work.352 Expertise should explicitly include the ability to understand context and how to make choices that are sensitive to that context. (Our later proposal of how to explore the context fully is relevant here.)

In a way, all we are doing is foregrounding a skill that is at the heart of behavioral science already. Whenever someone creates an intervention, they give an abstract concept concrete form to affect behavior.353 This act of creation requires many design choices. Which word should be used here? What time of day should the offer be made? How?

We know these choices can make or break an intervention, yet we just don't pay enough attention to them. Many people will have had the experience of being mildly surprised at coming across a study showing a null result - before they see the actual intervention and realize it was put together badly. For example, a 2001 study found no effect of a descriptive norm message (i.e., "most people pay their taxes") on tax compliance. But the message undermined itself by starting with the statement "many Minnesotans believe other people routinely cheat on their taxes"!354

So far, there seems to be an element of skill or craft to making these choices well - they are not completely reducible to standard procedures, and may never be.355 For example, standard practice when testing messages is to change the least amount possible between treatment arms.356 This is a good general rule. But experience shows that it can produce phrases that have a strange and unnatural quality that itself will bias the results. ("If you contact us now, you won't have to worry about this anymore." / "If you don't contact us now, you will have to worry about this more.")
The skill here is knowing how to navigate conflicting priorities and concepts to produce something that works for the particular situation. So, the task of adapting an intervention is actually quite similar to that of creating one in the first place - meaning, once we realize that's the case, we can build on those existing skills.

We also need to go further and adopt a more structured approach to understand and develop this skill. The CFIR framework is useful here as well, since it sets out five major factors affecting how interventions are implemented, including features of the intervention, different aspects of the setting, features of the individuals involved, and so on. Even a basic guide to these points would be helpful, but it's worth noting that they would only be a starting point. The CFIR authors themselves admit that often the distinction between "core" and "periphery" elements 'can only be discerned through trial and error over time'.357

This point means that it's particularly valuable to learn from practitioners how they adapted specific interventions to new contexts. These accounts are starting to emerge, but they are still rare (see BIT example below).358 Researchers are incentivized to claim universality for their results, rather than report and value contextual details.359 And of course, these skills can be built through running multiple rounds of prototyping and testing with affected service users and communities.360

In 2014 BIT worked with Public Health England to create a letter with a social norm message that reduced antibiotic prescribing among high-prescribing primary care providers.361 Similar interventions have since been tried successfully in countries around the world.362 When BIT came to adapt the intervention to New Zealand, we knew that Māori and Pacific populations experienced specific challenges and were at a higher risk of infectious diseases.363 They might need more antibiotics, rather than fewer. So we adapted the intervention to show antibiotic prescribing split out by Māori, Pacific, and all other populations. The letter significantly reduced prescribing overall; and we saw no significant change in the prescribing of doctors who were high prescribers overall, but low prescribers for Māori and Pacific populations. Therefore, it appears that health inequities were reduced.

There is one obvious recommendation missing here. Hopes have emerged that data science techniques will allow us to better understand how effects vary between groups. We agree, but also think that this idea raises big ethical questions. In the penultimate proposal, we will explore the promise and perils involved.

[Figure: extract from the adapted New Zealand letter, headed "YOUR ANTIBIOTIC PRESCRIBING TO SPECIFIC DEMOGRAPHIC GROUPS*". It states: "We know Māori and Pacific patients may need more antibiotics than other New Zealanders. Below is your prescribing rate to different demographic groups, and for GPs in your DHB." Three charts plot the proportion of GPs against prescribing rates (0-50) for Māori patients ("Your rate: 14.5"), Pacific patients ("Your rate: 34.5"), and non-Māori & non-Pacific patients ("Your rate: 23.4").]
06
MOVE BEYOND
LISTS OF BIASES
Recently there have been claims that our lack of good explanations for heterogeneity is part of a "theory crisis" in psychology. One concern is that psychological theories have become very high-level and vague through their attempts to capture complex behaviors. Another is that theories are specific, but disconnected from each other - and from a deeper, general framework that can provide broader explanations.

This second criticism is very relevant to the way that behavioral science relies on lists of heuristics and biases. These ideas are incredibly useful, but have often been presented as lists of standalone curiosities, in a way that is incoherent, reductive, deadening, breeds overconfidence, and distracts us from answering more important underlying questions (like when does one bias or another apply).

The concern for behavioral science is that it uses and popularizes both these high-level frameworks, like dual process theories, and a collection of disconnected heuristics and biases - with little in the middle to draw both levels together.

We think the way forward is to emphasize theories that are practical. By this we mean: they fill the gap between day-to-day working hypotheses and systematic attempts to find universal underlying explanations; they are based on data; they can generate testable hypotheses; they specify the conditions under which a prediction applies or does not; and they are geared towards realistic adaptation by practitioners.

We think that resource rationality is a good example of the kind of practical theory that applied behavioral science could pursue.

Behaviors often take place in complex systems; intervention effects vary across contexts and groups. Therefore maybe it's not a surprise that we lack good explanations for why findings vary so much.364 When we fail to obtain an expected finding, we may say that "context matters", but often cannot explain why.365

This failure is also an opportunity, since seeking to understand the causes of heterogeneity should lead us to better theories.366 For example, when data from one of the large replication studies was reanalyzed, it revealed that political ideology was having an important but unidentified effect on heuristics like anchoring.367

The need for better theories can be seen as part of a wider "theory crisis" in psychology.368 We see two different concerns here: one, that theories are too vague and high-level; the other, that they are specific, but disconnected and ungrounded. The concern for behavioral science is that it uses both these high-level frameworks, like dual process theories, and jumbled collections of heuristics and biases - with little in the middle to draw both levels together.

THEORIES THAT ARE TOO VAGUE AND HIGH-LEVEL

Theories of behavior often try to explain phenomena that are complex and wide-ranging.369 If you are trying to show how emotion and cognition interact (for example), this involves many causes and interactions. Trying to cover this variability can produce descriptions of relationships and definitions of constructs that are abstract and imprecise.370
The results are theories that are "weak," in the sense that they are vague and therefore can be used to generate many different hypotheses - some of which may actually contradict each other.371 The theory becomes hard to test or reject because its looseness means it can accommodate many different findings.372 And its vagueness prevents us from making useful models or predictions of the conditions under which interventions ought to work (or not).373 This may have been what happened in the crowdsourcing study mentioned earlier - teams got varying results because the hypotheses allowed many different approaches.

The result is that running experiments does not improve our understanding as it should. We can always claim that a theory is basically true, but there are reasons why it was not supported in this specific instance (e.g., a problem with the way the study was set up).374 And so weak theories stumble on, unimproved.

We are not saying that this is an accurate summary of how applied behavioral science uses theory. But it does raise questions. Dual process theories arguably provide the main overarching framework.375 But there are criticisms that they do not sufficiently explain when people use one process or the other, they imply a clear division between the processes that does not exist, they can be hard to refute, and they don't make clear predictions.376 Models of how and why nudges work tend to be ad hoc and related only to certain domains, making it difficult to predict their effects in new settings - although there has been progress as the field has matured.377

THEORIES THAT ARE SPECIFIC BUT DISCONNECTED

On the other hand, we have theories that are more specific, but also disconnected from each other. Some go as far as calling psychology textbooks 'largely a potpourri of disconnected empirical findings'.378 There can be many different 'mini theories' offering different takes on the same behavior - for example, theories based on cultural factors, emotion or cognition can all be used to explain cooperation in an experiment.379 Or there may be a precise elaborate theory based on one dataset that only applies in very specific conditions.380

At the same time, these theories are often also disconnected from a deeper, general framework that can provide broader explanations (like the theory of natural selection does, for example).381 If we don't have such a paradigm, we can't predict when we should see an effect or not and we can't distinguish 'results that are unusual and interesting from results that are unusual and probably wrong'.382 Of course, this disconnect may happen partly because these high-level theories are vague and weak, as noted above.

The main way this issue affects behavioral science is through heuristics and biases.383 Examples of individual biases are accessible, popular, and how many people first encounter behavioral science. These ideas are incredibly useful, but have often been presented as lists of standalone curiosities, in a way that is incoherent, reductive, and deadening.

INCOHERENT

The biases usually overlap and conflict in various ways. How are people vulnerable to "optimism bias" - where someone believes that they are less likely to experience a negative event - but also "negativity bias" - where negative things affect our mood and cognition more than positive things?384 Or how is "optimism bias" meaningfully different from "self-enhancement bias" or "positivity bias"? Where do the boundaries, if any, lie? How do they interact?385 Presenting lists of biases does not help us distinguish or organize them.386
REDUCTIVE

Framing behavioral science as a collection of biases can give the impression that behavior is the product of one factor rather than many. It can create overconfident thinking that targeting a specific bias (in isolation) will achieve a certain outcome - like the illusion of control risk we mentioned earlier in "See The System". And it can give the impression that these biases are universal laws, rather than contingent tendencies.387

DEADENING

Repeatedly working from a list in this way may lead to a "painting by numbers" approach that sucks out creativity and innovation from applied behavioral science, leading to repetition. A deeper engagement is more likely to allow unexpected insights into what might work. This risk has not been discussed widely, but it could be increasing as the field matures and broadens.

Perhaps the biggest issue is that looking at lists of biases distracts us from answering the more important underlying questions. When does one or another (conflicting) bias apply, and why? Which are widely applicable and which are highly specific? Meta-analyses of the "paradox of choice" show zero effect overall, but that it can be very powerful in certain settings - what are our theories for why that happens?388 How does culture or life experience affect whether a bias affects behavior or not?389 These are highly practical questions when you are faced with tasks like, for example, taking an intervention to new places.

PRAGMATIC RESPONSES ARE NOT ENOUGH TO FULFILL BEHAVIORAL SCIENCE'S POTENTIAL

Given these practical challenges, one response is to step away from theories and just prioritize being pragmatic. Three main approaches in this vein are:

"THROWING THINGS AT THE WALL."

Some people advocate just relying on 'lightweight theorizing and outright guesses', but subjecting them to many rigorous experiments to get reliable findings.390 You may not have thought too much about what you are testing, but you've got good data on whether it influences behavior.

"POST THEORY SCIENCE."391

Another proposal is to use algorithms to uncover new relationships and connections in datasets, and use these findings instead of theorizing. The argument here is that psychology has spent too much time trying to explain the causes of behavior, but can't predict future behaviors accurately.392 Machine learning can produce insight without theory.

"MUDDLING THROUGH."393

In this view, applied behavioral science should be seen as 'a kind of learning by trial-and-error that is informed as much by local, practical knowledge and user feedback as by a universal science of human behavior'.394 Using behavioral science becomes more like translating some general strategies into specific tactics - and making limited, incremental adjustments that respond closely to contextual needs.
All of these options present problems. For some, costs and inefficiency loom large. Lots of blind incremental experimenting means lots of failure: the risk of the "contextual brittleness" (disconnect between intervention and context) we mentioned earlier is much greater.395 Having lots of interventions that do not work is fine in contexts where experimentation and measurement is cheap - online environments, for example. But even then there are other costs: an iterative, scattershot approach is likely to take time to produce results.396 And the truth is that, in many contexts, it does require money to run experiments - so designing your interventions by chance may not seem responsible. Particularly in the public sector, most people expect that someone has thought about what is being offered to them, rather than just sticking a finger in the air.

More generally, rejecting theory means it's harder to learn and make general conclusions. Some have compared this approach to trying to write a novel by collecting sentences from random strings of letters.397 Even if an algorithm means we are sure that two behaviors are definitely related in a big dataset, we won't know if that relationship holds true elsewhere.398 We also can't anticipate if an intervention will backfire in a different context.

In other words, giving up on theory for pragmatism will mean that behavioral science cannot meet its challenges or fulfill its potential. For example, being able to tackle bigger, high-level problems may only be possible if you can identify higher-level causes.399 So what is the best way forward?

BE PRACTICAL

In one sense, the obvious answer is that "we need strong theories, not weak ones". Strong theories are seen as 'coherent and useful conceptual frameworks into which existing knowledge can be integrated'.400 They identify the most relevant theoretical and empirical questions; try to explain how phenomena do or do not vary across time and contexts; are parsimonious; and can be used to build models that make testable and non-obvious predictions.401 Various people are trying to propose new ways of developing strong wide-ranging theories,402 and there are some promising candidates, such as "cultural evolutionary behavioral science".403

We could just recommend strong theories as the goal and leave it there. But this definition covers a lot of ground, and we want to focus on helping applied behavioral scientists meet the challenges we face. So, instead we emphasize that a bigger priority for future theories is that they are practical. By this we mean:404

1. They fill the gap we've identified in behavioral science: between day-to-day working hypotheses and comprehensive and systematic attempts to find universal underlying explanations. They are not so abstract as to be unhelpful when confronted with a specific issue, but they allow us to raise our eyes above simply understanding what works in the current context.

2. They are based on data rather than being derived from pure theorizing.405

3. They can generate testable hypotheses, so they can be disproved.406

4. However, they also specify the conditions under which a prediction applies or does not.407

5. They are geared towards realistic adaptation by practitioners and offer 'actionable steps toward solving a problem that currently exists in a particular context in the real world.'408 They involve meaningful exchange between those who use them and those who develop them.

Let's be more concrete and give an example of a practical theory that meets these criteria.

A PRACTICAL EXAMPLE: RESOURCE RATIONALITY

Resource rationality is a way of understanding choices and behavior on the basis that people make rational use of their limited cognitive resources.409 Given there is a cost to thinking, people will only work to find exact solutions if they think doing so is worth the effort. In other words, people look for 'solutions that balance choice quality with effort'.410
RETHINKING THE ANCHORING BIAS WITH RESOURCE RATIONALITY

A common example of the anchoring bias involves people estimating a number, but being biased by an initial 'anchor' that puts a specific number in their mind, even if it's irrelevant. For instance, a famous study spun a wheel of fortune in front of participants, who were asked whether they thought the number of African countries in the United Nations was more or less than the number on the wheel. Then they were asked to estimate the exact number. Their guesses were highly influenced by the irrelevant number they had just seen.411

The traditional explanation is that people make an initial guess as an anchor, and they try to adjust away from it with more information or thought - but don't adjust enough. While anchoring may seem irrational, a resource rational analysis suggests that people are just trading off the cost of being wrong against the cost of taking time to get the answer more right.412

At least one experiment suggests this analysis may be correct. People were asked to predict how long someone would wait at a bus stop, and were given an anchoring time. They were either penalized for how long they took to give a final answer, or for how inaccurate their final answer was. The results showed that people adapted their behavior to get a 'near optimal' tradeoff between speed and accuracy.413

Although the basic principle is similar to familiar ideas of "bounded rationality", what's new is that resource-rational analysis offers a systematic framework for building models of how people will act. Proponents call it a 'unifying framework for a wide range of successful models of seemingly unrelated phenomena and cognitive biases.'414

More importantly, it's also practical. Take the first criterion for practicality: the need to fill the gap between high- and low-level theories. Proponents explicitly frame resource rational analysis as 'building bridges' between high-level and lower-level approaches to understanding decisions.415 (The high-level "computational" one is about working out optimal solutions; the lower-level "algorithmic" one deals with actual cognitive resources and processes.)

This approach is also based on data (point 2). It emerged partly from computational analysis and work on 'bounded optimality' in artificial intelligence. Rather than a traditional approach where 'a theorist imagines ways in which different processes might combine to capture behavior', resource rationality focuses instead on evidence of the problems people have to solve and their resources for doing so.416

For the three final aspects - testable hypotheses, specifying conditions, and actionable steps for practitioners - we look at how resource rational analysis can improve nudging. A recent study on "optimal nudging" starts by echoing some of the criticisms above. The authors also see a high-low level split between 'intuitive, mechanistic accounts of individual nudges' on the one hand and 'formal, abstract accounts of nudging as a whole.'417 Since models of nudges are often 'domain-specific and ad hoc', we don't have a framework to reliably predict how a nudge will perform for a new context or population. That makes designing nudges time-consuming.

Their solution is to apply the principles of resource rational analysis. Doing so means they see nudges as interventions that change the order or way people encounter information. Nudges make it easier to consider some things rather than others, or change the order in which people think about things. These changes influence people's choices.

The authors used these principles of resource rational analysis to build models to predict how people would respond to different kinds of nudges, like defaults, and suggested alternatives. They then tested these models against how people behaved in online games where you invest money to earn more, or settle for what you have. When they tested the effect of a default option nudge in such a game, the resource rational model predicted things well: people were more likely to choose the default if the problem was complex, and less likely if their preferences were different from the average.

The point is that the framework and the models can predict how the effects of nudges will change in new contexts or with tweaks to their presentation. They can also help us understand how effects vary between groups, who may vary in the value they place on certain kinds of information.
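The cost tradeoff described in the box above can be sketched numerically. This is a toy model, not the study's actual analysis: the geometric error decay and all parameter values are assumptions made purely for illustration.

```python
# Toy 'resource rational' model of anchoring-and-adjustment: each
# adjustment step shrinks the gap between estimate and truth, but
# costs time. A resource-rational estimator picks the number of steps
# that minimizes total expected cost. All parameters are illustrative.

def total_cost(steps, anchor, truth, time_cost, error_cost, decay=0.5):
    # Error remaining after `steps` adjustments (geometric decay assumed).
    remaining_error = abs(truth - anchor) * decay ** steps
    return error_cost * remaining_error + time_cost * steps

def optimal_steps(anchor, truth, time_cost, error_cost, max_steps=50):
    # Search over possible numbers of adjustment steps.
    return min(range(max_steps + 1),
               key=lambda s: total_cost(s, anchor, truth, time_cost, error_cost))

# Penalizing time heavily (like the speed condition in the bus-stop
# experiment) leads to fewer adjustment steps - more residual anchoring.
hurried = optimal_steps(anchor=10, truth=50, time_cost=5.0, error_cost=1.0)
careful = optimal_steps(anchor=10, truth=50, time_cost=0.1, error_cost=1.0)
print(hurried, careful)  # the hurried estimator stops adjusting sooner
```

Under this kind of model, "insufficient adjustment" is not a flaw but a cost-minimizing stopping point, which is exactly the reframing the resource rational account offers.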

Moreover, the authors built on these results by using algorithms to create 'an automated method for constructing optimal nudges'. When tested against randomly selected nudges, these optimal nudges increased the rewards people received in games and made those choices easier to make (people spent less time spending money to find out information for their decision).

Such an approach could both reveal new kinds of nudges and make creating them much more efficient. More reliable ways of developing personalized nudges are also possible, since they would just require some initial data on how an individual makes decisions (recognizing that caution is needed here). These are all highly practical benefits coming from applying a particular theory.

THE VISION FOR THE FUTURE

We are not saying that resource rational analysis is completely correct or that it is the only practical theory around. There is still debate about how 'strong' the theory actually is: proponents themselves seem unsure whether it is a unifying framework or just 'a methodological device to efficiently search through the endless space of possible mechanisms.'418 Many decisions in life are vague and difficult to quantify for input into such a framework, which needs reliable data to work.419

Instead, we see resource rationality as having the features we need to aim for in behavioral science theories of the future: a framework that can connect up general and specific ideas; testable, data-driven hypotheses; the ability to make sense of heterogeneity; predictions about when an intervention will work or not; and practical tools informed by the latest data science techniques. That's the vision for moving behavioral science beyond lists of biases and towards greater relevance and sophistication.

07
PREDICT AND ADJUST
It may seem strange that behavioral scientists are overconfident,
since they run so many experiments that may give unexpected
results. The reason is hindsight bias - what happens when people
feel “I knew it all along”, even if they did not. When the results of
an experiment come in, hindsight bias may mean that behavioral
scientists are more likely to think that they had predicted them, or
quickly find ways of explaining why they occurred.

Hindsight bias is a big problem because, alongside overconfidence, it impedes learning, dissuades innovation, and prevents us from understanding what is truly unexpected.

In response, we should develop the practice of getting behavioral scientists to predict the results of experiments, and then feeding back the results to them. Doing so would provide the clear, regular performance feedback whose absence currently enables hindsight bias. But barriers lie in the way. People may not welcome the ensuing challenge to their self-image; predicting may seem like one thing too many on the to-do list; and the benefits lie in the future.

We propose: make predicting easy by incorporating it into standard processes; minimize threats to predictors' self-image, for example by making and feeding back predictions anonymously; give concrete prompts for learning and reflection, in order to disrupt the move from surprise to hindsight bias; and build learning from prediction within and between institutions.

Experiments may appear like an admission of uncertainty: we run experiments because we are not sure of something (the notion of "equipoise"). Overconfident people should not need to run experiments. Yet, as we will explain below, the behavioral science approach can generate overconfidence. So why does experimentation not do more to address this overconfidence?

First, although experimenters are meant to create hypotheses that are confirmed or disconfirmed, most hypotheses are reported to be supported.420 Moreover, hypotheses may not be particularly detailed - they may not state the expected effect sizes of different treatments, or their likely ranking. This may be particularly true if the study is being conducted by practitioners who have to make tactical decisions, rather than run a structured scientific inquiry.

But perhaps the main reason is one that is familiar to behavioral scientists: hindsight bias. Hindsight bias is 'the belief that an event is more predictable after it becomes known than it was before it became known.'421 Or, to put it another way, it's what happens when people feel "I knew it all along", even if they did not. So, when the results of an experiment come in, hindsight bias may mean that behavioral scientists are more likely to think that they had predicted them, or quickly find ways of explaining why they occurred.

Some aspects of hindsight bias are particularly relevant to behavioral science. One is 'knowledge updating', which refers to how new information is integrated into existing memory. Our memory systems are skilled at fitting new facts into existing schemes - we search out knowledge that is consistent with the novel finding, and thereby make it fit and "feel right".422 A similar but more sophisticated approach is 'sense-making', which involves the ease with which we can tell a causal story that explains the new information.423

Both knowledge updating and sense-making are more likely if a result is not what we expected, since we may start to look for ways to preserve a positive self-image of expertise - "ah, this is in line with this fact I already knew". The key is whether our initial surprise turns into "resultant surprise" and reassessment of views, or whether we can quickly resolve it into a coherent but incorrect explanation that creates hindsight bias.424

What are the effects of hindsight bias? For one, it creates general overconfidence in one's predictive abilities, partly because it impedes accurate feedback on them.425 It also impedes learning, since it can create an illusion of understanding that means we fail to learn from the past.426 And the feeling that you "knew it all along" may also dissuade you from finding new approaches to a problem, thereby reducing innovation.427 Finally, hindsight bias can stop us from understanding what is actually unexpected - we do not judge new findings correctly, which stops behavioral science from advancing as a field.428

Hindsight bias can flourish if we do not systematically capture expectations or "priors" about what the results of a study will be - in other words, it is not easy to check or remember the state of knowledge before an experiment.429 Making predictions provides regular, clear feedback of the kind that is more likely to trigger surprise and reassessment, rather than hindsight bias.430 It could also force people to consider that a range of different outcomes are possible, which can reduce hindsight bias.431

Presenting predictions back to participants, along with the actual results, would then provide the clear, regular performance feedback whose absence currently allows hindsight bias to develop. Note that the predictions do not need to relate to future studies - the results just need to be unknown to the predictors. Indeed, the predictions may not relate to a controlled study at all, but rather to real-world behavior. But there do need to be clear and specific outcomes to measure predictions against.

The process of predicting and gaining feedback can improve the "calibration" of behavioral scientists - in other words, it can ensure they have appropriate confidence in their judgments. BIT itself has promoted this idea, showing that the judgments of senior officials (and its own staff) can be overconfident, but also that calibration can improve over time.432

Prediction brings benefits beyond reducing overconfidence. As noted above, it can create a more accurate sense of how unexpected a set of findings is. Is wider reflection needed? Do beliefs about concepts need updating? Predictions can also reveal the importance of unanticipated null results, thereby reducing publication bias and file-drawer effects.

Regular prediction would also show how expert opinion varies, and whether there are consistently good performers whose views should carry more weight. A recent paper took this further: experts predicted that a video would increase parental uptake of free educational resources by 14 percentage points. In fact there was no effect of the intervention, but the researchers could also measure whether the result was significantly different from the expert prediction (it was). As a result, they could show that their results were worth paying attention to.433 If we can build up a high-quality set of predictions and predictors, then this will mean better policy advice in situations where tests simply are not possible.434

Behavioral science has seen some limited use of prediction as a tool. Results so far are mixed on the predictive powers of behavioral scientists, but the following themes seem to be emerging.

• Experts (and, to an extent, non-experts) can predict the effect of real-world interventions fairly accurately.435 But, in general, these predictions tend to overestimate the size of effects created by nudge-type interventions.436

• Predictions of experts may be more accurate than those of non-experts - although not always.437

• Familiarity matters: practitioners were more accurate than academics at predicting the effect of real-world nudge interventions;438 non-experts who were likely to be familiar with an intervention were as good as experts at predicting its effects.439

We can't be sure of these conclusions because only a tiny fraction of current work involves prediction and feedback. Several barriers lie in the way. People may not welcome the challenge to their self-image that evidence of poor predictions may bring. Real-world trials can involve a frantic process of dealing with practical and conceptual challenges; making predictions can seem like one more, inessential, thing for the to-do list. Even if people understand the benefits, those benefits lie in the future.

These barriers mean that making predictions needs to be supported by institutions and processes. But setting up this support means navigating difficult questions: What exactly to predict? How? By whom? When? How should results be communicated? And how should participation be incentivized, if at all?

We offer these proposals as a first step towards getting good answers to these questions.

MAKE PREDICTING EASY BY INCORPORATING IT INTO STANDARD PROCESSES

This proposal is obvious, in line with behavioral science principles, and hard to achieve in practice. Organizations need to create an accessible template for collecting forecasts that makes them easy to both send and complete. The template should at least specify how to express predicted outcomes (e.g., estimated treatment effects, direction of treatment effects, likely ranking of trial arms), and whether or how to record one's level of uncertainty.

A prompt to access this template should be inserted at a point after interventions are near-final, but before a trial is launched - so predictions can influence which interventions are selected, or flag any that need to be improved. (It's worth noting, though, that if people know predictions will influence an outcome, that may distort their predictions!) Ideally, a specific project member or role should be clearly responsible for ensuring predictions have been made. That same person should then trigger the individual results feedback, once findings are confirmed (perhaps through an automated mail merge).

MINIMIZE THREATS TO PREDICTORS' SELF-IMAGE

Being presented with evidence of inaccurate predictions may threaten a predictor's self-image of expertise, prompting them to engage in hindsight bias as a way of reasserting their agency.440 This tendency could be minimized by presenting all predictions as reasonable "based on what you knew then". Such a formulation presents the results as a challenge to one's past self, rather than one's present or intrinsic abilities. Updating of beliefs may be easier to achieve as a result.

Predictions should be made and fed back anonymously, since this further minimizes perceived threats to self-image and encourages re-calibration.441 Note that predictions would be known generally (to aid better group calibration), just not who made them. However, there are clear advantages to identifying a group's most accurate predictors, since their views should be given greater weight when deciding which interventions to test - and potential acclaim could act as an incentive for participating.442
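Identifying a group's most accurate predictors, and tracking re-calibration over time, requires a scoring rule. A standard choice is the Brier score; the sketch below uses invented forecasts and outcomes purely for illustration (it is not BIT's actual scoring system).

```python
# Score forecasters' probability judgments against binary trial outcomes
# using the Brier score: mean squared error between the stated probability
# and what happened (1 = effect found, 0 = no effect). Lower is better.
# The IDs, probabilities, and outcomes below are invented for illustration.

def brier_score(probabilities, outcomes):
    pairs = list(zip(probabilities, outcomes))
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)

outcomes = [1, 0, 0, 1, 0]  # which of five hypothetical trials worked

forecasters = {
    "anon-1": [0.9, 0.8, 0.9, 0.9, 0.7],  # overconfident: always sure
    "anon-2": [0.7, 0.3, 0.4, 0.6, 0.2],  # hedged, tracks outcomes better
}

# Rank anonymous IDs by accuracy; the best-scoring predictors' future
# views could then be given greater weight (and earn acclaim).
ranking = sorted(forecasters,
                 key=lambda k: brier_score(forecasters[k], outcomes))
print(ranking)  # best calibrated first
```

Because the scoring runs on anonymous IDs, feedback can be shared with the group for calibration without revealing who made each forecast, matching the anonymity point above.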

GIVE CONCRETE PROMPTS FOR LEARNING AND REFLECTION

Perhaps the biggest priority is to ensure that feedback leads to learning and adjusting. Understanding the 'epistemic emotions' created by prediction feedback is key here. Common epistemic emotions are surprise, confusion or curiosity, and they are triggered by conflict between existing and new knowledge.443 Surprise is particularly important, since it can be used as a prompt for learning and reorganizing networks of knowledge.444 However, what can happen is that 'initial surprise' turns into hindsight bias, instead of the 'resultant surprise' that can lead to reorganizing knowledge.

The practical implication is to challenge and disrupt the move from surprise to hindsight bias. An obvious starting point is how the results are framed - prompts could make it easier for the recipient to reassess and update their knowledge. A very simple one might be, "If this is true, what else do you need to reassess?" Another idea is to ask predictors to include a reason for their prediction at the time they make it. Presenting this past reasoning alongside the results may disrupt the move to coming up with new explanations in the present.

BUILD LEARNING WITHIN AND BETWEEN INSTITUTIONS

This kind of adjustment and learning is much easier to achieve in conditions of psychological safety, where people and groups can freely say that they got things wrong and accepted ideas can be questioned. BIT is currently working to set up its own formal forecasting and score-keeping system - and we are aware that incentives and leadership will be key to ensuring a prediction culture takes root. Ideally, there would be a wider commitment to learning, so that the organization builds a central store of predictions and findings. Such a store could also exist between institutions, as shown by the example of the Social Science Prediction Platform (www.socialscienceprediction.org). This idea reinforces our earlier calls for behaviorally-enabled organizations.
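The central store of predictions and findings described above could start very simply. The sketch below is hypothetical: the field names and feedback wording are our own inventions, not the format of BIT's system or of the Social Science Prediction Platform. It combines two of the proposals: anonymous IDs, and feedback framed against the predictor's past self ("based on what you knew then").

```python
# Minimal sketch of a store of predictions: forecasts are logged under
# anonymous IDs before results are known, then feedback is framed as a
# challenge to the predictor's past self rather than present abilities.
# All field names and wording here are illustrative inventions.

from dataclasses import dataclass, field
import uuid

@dataclass
class Forecast:
    trial: str
    effect_pp: float   # predicted effect, in percentage points
    interval: tuple    # forecaster's own uncertainty range (lo, hi)
    anon_id: str = field(default_factory=lambda: uuid.uuid4().hex[:8])

def feedback(fc: Forecast, actual_pp: float) -> str:
    lo, hi = fc.interval
    hit = lo <= actual_pp <= hi
    return (f"Trial '{fc.trial}': based on what you knew then, you "
            f"predicted {fc.effect_pp:+.1f}pp; the actual effect was "
            f"{actual_pp:+.1f}pp ({'inside' if hit else 'outside'} "
            f"your stated range).")

store = [Forecast("reminder letters", 3.0, (1.0, 5.0))]
print(feedback(store[0], 0.5))  # a near-null result falls outside the range
```

A person responsible for the trial could generate these messages once findings are confirmed (the automated mail merge suggested earlier), giving the regular, specific feedback that disrupts hindsight bias.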

VALUES

08
BE HUMBLE, EXPLORE
AND ENABLE
Behavioral scientists may over-confidently rely on
decontextualized principles that do not match the real-world
setting for a behavior. Deeper inquiry can reveal reasonable
explanations for what seem to be biased behaviors. Therefore,
we should: avoid using the term “irrationality”, which can limit
attempts to understand actions in context; acknowledge that
our diagnoses of behavior are provisional and incomplete
(“epistemic humility”); and design processes and institutions to
counteract overconfidence.
When exploring behaviors, we need to: pay greater attention to people's goals and their own interpretations of their beliefs, feelings, and behaviors; reach a wider range of experiences, including marginalized voices and communities; and recognize how apparently universal cognitive processes are shaped by specific contexts, thereby unlocking new ways for behavioral science to engage with values and culture.

In addition, more can and should be done to broaden ownership of behavioral science approaches. One route is to enable people to become more involved in designing interventions themselves - and "nudge plus", "self-nudges", and "boosts" have been proposed as ways of doing this. But these new approaches should not be seen simplistically as "enabling" alternatives to "disempowering" nudges. We propose a new set of criteria for deciding when enabling approaches may be appropriate: opportunity; ability and motivation; preferences; learning and setup costs; equity; and effectiveness.

A final piece missing from current thinking is that enabling people can lead to a major decentering of the use of behavioral science. A range of people could be enabled to create nudges for positive societal change (with no "central" actors involved). Rather than just creating self-nudges through altering their immediate environments, they may decide that wider system changes are needed.

Behavioral scientists who succeed in the future will be ones who are humble about their own knowledge, who carefully explore people's intentions and experiences, and who enable individuals to use behavioral science to achieve personal and collective goals.

BE HUMBLE

We need to recognize the limits to our knowledge as behavioral scientists. Yet several factors can push us towards being overconfident in our judgments and the likely consequences of our actions.

Of course, it's important to note that overconfidence affects experts in a wide range of fields.445 In our Behavioral Government report, we showed how and why policy makers are overconfident in their predictions about people's behavior. And overconfidence is a particular risk when dealing with complex adaptive systems, since we may misidentify causes and effects by applying simpler mental models that produce illusions of control.446

Nevertheless, overconfidence can emerge for reasons that are specific to behavioral science.

Behavioral science may be seen as providing a technical justification for seeing decisions as flawed, and thus in need of corrective action. A good example is the behavioral economic concept of 'internalities', which involve a struggle between a person's present self (who may want a cake) and their future self (who desires to be healthy). In practice, policy makers often choose to prioritize the future self, which often loses out otherwise.447 These behavioral concepts, and the language of bias and irrationality, can end up increasing the confidence of choice architects in terms of what people "really" or "should" want, and why they are acting as they are.448

In reality, it may be that people are acting on the basis of reasons that a choice architect has not perceived or anticipated. People may have specific contextual knowledge that cake is unlikely to be available in the near future, for example. Indeed, new evidence from 169 countries shows that, as economic environments worsen, 'there is a stronger and more consistent tendency to discount future values'.449 A recent study showed that apparently irrational loss-gain framing effects may actually reflect an awareness that others may punish you for ignoring such effects.450 The ready technical explanation offered by behavioral science could provide confidence that obscures the need to search more deeply for less obvious explanations.

These technical explanations may also be based on 'an overly cognitive view of people as individual decision-making agents, rather than as social humans who are embedded in established practices and networks'.451 Questions of culture and society, which are likely to complicate matters, may be downplayed as a result.452

These factors mean that behavioral scientists may over-confidently rely on decontextualized principles that do not match the real-world setting for a behavior. Such 'contextual brittleness' is likely to lead to poor outcomes.453

Again, there are many instances where behavioral scientists have avoided the issues just outlined. BIT has always emphasized the importance of understanding behaviors in context and in depth, going beyond ostensible explanations:454

ENERGY SWITCHING

Many households do not switch energy suppliers, despite the potential for large savings; this inertia may seem particularly "irrational" for lower-income households. But during our work on this topic, we came across stories that made the inaction more reasonable. Over time, lower-income households (in particular) may have learned the collections process of their current company. They know that a "final warning" is not final, and they still have two more weeks before they lose power. Effectively, they can then use the power company as a line of credit to prioritize other payments. But they lose this valuable knowledge if they switch providers. Apparent inertia may actually be a considered strategy.

HEALTHCARE CHOICES

Our previous work on reducing patient waiting times in the UK had identified that primary care doctors were generally sending patients to the same set of secondary care providers.455 The issue is that these referrals happened even when those providers had no capacity. Patients would experience long waits, even though there were nearby alternatives with good availability. In many cases, these choices were as suboptimal as they seemed. But deeper exploration showed that sometimes other factors were at play. Unsurprisingly, patient transport was an important factor. Larger, better-known hospitals tended to attract more public transport routes than the alternative providers. So, although these alternatives were all within a few miles of the patient, in some cases they might be less accessible.

ROAD USE

Going into our work to improve the health and safety of food delivery agents in Australia's gig economy, we knew that many workers were frequently breaking road rules. In particular, they rode bicycles on footpaths and other pedestrian-only areas. This was happening despite strong disincentives: there was a significant risk of incurring fines that could wipe out a whole day's earnings (and social media posts from workers suggested that they were well aware of this risk). However, observing the behaviors in context revealed that the footpaths often ran next to narrow, busy roads with heavy truck and bus traffic. So, the agents were trading off the risk of a fine against the risk of injury or death on the roads - a different calculation from the one we had perceived initially.

How can we make this kind of deeper inquiry more likely? Here are some proposals:

AVOID THE TERM "IRRATIONALITY"

We will not focus on the long-running and complex debates about how to define rationality and irrationality.456 Our main concern is with the consequences of how 'irrationality' is used in practice. The act of diagnosing irrationality in others seems to imply that you yourself are rational, or at least have the ability to detect rationality. At the same time, the act can delegitimize the views of the 'irrational' party - their disagreement is not valid because they are not playing by the rules of reason, unlike you.457 Doing so can lead to a failure to understand the reasonableness of people's actions. Of course, we need to avoid setting up straw-man representations of how behavioral scientists think, as some critics do. But we think the use of the label 'irrational' may increase overconfidence and impede deeper inquiry. Dropping it may be a necessary, but not sufficient, way of solving these problems for practitioners.458

EMBRACE EPISTEMIC HUMILITY

Epistemic humility is based on 'the realization that our knowledge is always provisional and incomplete - and that it might require revision in light of new evidence.'459 For behavioral scientists, this might involve recognizing which initial inquiries are essential, rather than simply reaching for a familiar theory, concept or intervention and applying it to the situation at hand. It might be about pausing to reflect on how far existing knowledge can be transferred between contexts and domains. It might be about recognizing the possibility of backfires and spillovers, and how goals and preferences (including our own) may be complex and ambiguous. The Covid-19 pandemic has shown just how difficult it can be to predict behavior, and could act as a spur for recognizing why greater epistemic humility is needed.460

DESIGN PROCESSES AND INSTITUTIONS TO COUNTERACT OVERCONFIDENCE

While behavioral scientists should be familiar with the concept of overconfidence and its causes, applying such ideas to our own actions is much harder. Instead, we should look at how to design and redesign the contexts in which behavioral scientists are making decisions in order to promote greater humility. In its Behavioral Government report, BIT has already shown how policy makers are often affected by issues such as optimism bias and illusions of control.461 We then set out a series of institutional changes to reduce their ensuing overconfidence, including pre-mortems, reference class forecasting, and break points. What are the equivalent changes to processes that might reduce overconfidence among those applying behavioral science?

A common theme through these ideas is the need for more and better inquiry into behaviors in context, rather than making assumptions. At BIT, we refer to these inquiries as Explore work.

EXPLORE

Open-ended qualitative exploration of the context and drivers for behaviors is not new to the behavioral sciences.462 BIT itself has always emphasized the importance of investing in this kind of inquiry - and has worked to incorporate new methods of doing so, such as citizens' juries.463 As part of this effort, BIT has released a new Explore report, an accessible guide for exploring behaviors in order to create interventions. However, three areas demand particular focus in the future.

The first is to pay greater attention to a) people's needs and their existing strategies to fulfill those needs; and b) people's own interpretations of their beliefs, feelings, and behaviors. Of course, there is existing work to build on here. Design thinking offers ways of understanding users' needs, how they may be met, and how they may reshape the nature of the "problem" itself.464 So-called "wise interventions" foreground 'the meanings and inferences people draw about themselves, other people, or a situation they are in', and try to reshape these interpretations to change behavior.465 These approaches have been less prominent (but not absent) because the last decade has seen more focus on changing contexts and situations than attitudes and intentions.466

Second, we need to explore a wide range of experiences, including making greater efforts to reach and recognize marginalized voices and communities.467 We need to understand how structural inequalities can lead to expectations and experiences varying greatly by group and geography.468 We need to be aware of exactly whose experience we are exploring and what may be shaping that experience.

We may need to collect more nuanced data on, say, race and ethnicity, rather than relying on the limitations of administrative data.469 And, finally, we need to acknowledge that the behavioral scientists doing this work bring their own perspectives, given the field itself is still relatively homogeneous - as we discuss in more detail below.

Finally, we need to see how better exploration can reveal the interplay of context, culture, and cognition. Some make the case that recent applications of behavioral science have focused mainly on universal cognitive processes, triggered by immediate cues, as drivers of behavior; less attention has been paid to the role of culture, society, and values.470 Taking this focus has pros and cons.471 One risk is that it neglects the wider significance of explore work, instead seeing it as serving only a limited set of functions - e.g., constructing tailored interventions for specific circumstances, or checking how fixed concepts (like "anchoring") play out in practice.

In reality, a growing body of research shows how context, culture, and cognition are interrelated, and that neglecting one limits your understanding of the others.472 As one summary puts it, 'universal cognitive processes are shaped by the specific cultural repertoires provided by the social environment, which vary between cross-cutting social groups.'473 For example, memory can function differently in Western and East Asian cultures because of varying conceptions of time.474

In other words, there is dynamic interplay between the kinds of things that Explore work uncovers on the one hand, and behavioral science concepts on the other. The former is not somehow secondary to the latter. People's experiences of culture and group identity profoundly influence the way that social norms function, rather than being just interesting variations on a central concept that remains unaffected.475

Explore work - done well - can reveal how these interactions take place. In doing so, it can unlock new ways for behavioral science to engage with values and culture, addressing the criticism that the approach has 'a thin conception of the social'.476 For example, one influential view of culture is that it influences action 'not by providing the ultimate values toward which action is oriented but by shaping a repertoire or "tool-kit" of habits, skills, and styles'.477 There are similarities here to the heuristics and biases 'toolkit' perspective on behavior, and you can see how behavioral scientists could start explaining how and when certain parts of the toolkit become more salient to people.

Since this proposal may seem abstract, a final example may be useful. "Scarcity" is a well-known behavioral science concept. As formulated by Mullainathan and Shafir, scarcity explains that people with a limited resource (e.g., money, food, time) will focus narrowly (or "tunnel") on managing that resource.478 However, tunneling their cognitive energy in one direction (e.g., providing food today) means they end up neglecting other concerns, such as longer-term planning. Scarcity then becomes a driver keeping people in poverty, as well as one of its effects.

Scarcity is a powerful and useful idea that has created valuable research and policy initiatives.479 However, it has been criticized along the following lines, which echo the arguments we make above:480

• The scarcity model assumes that those with limited resources find economic need to be the most salient form of scarcity.

• However, they may be faced with multiple forms of scarcity at once (e.g., health issues, economic insecurity, violence).

• When trying to prioritize, 'hierarchies of concern are far from universal; culturally available frames influence which of several competing needs takes priority to become the object of "tunneling"'.481

• In other words, what it is appropriate to "tunnel in" on may vary according to groups. People with few means may spend their limited money on funerals because worries about the social cost of not doing so outweigh narrow economic concerns.

• By failing to consider the way that values influence the operation of scarcity, research 'risks labeling behaviors that enable survival in specific contexts as "non-optimal"'.482 As we noted above, applying a general concept universally may lead us to see certain choices as irrational - wrongly.

Careful Explore work can avoid these problems by identifying the different forms of scarcity present and how people interpret them using items from their cultural 'toolkit'. Doing so would help us to better understand these high-level concepts and identify the conditions when they do or don't influence behavior. And, in turn, this would prevent instances of the 'behavioral brittleness' mentioned earlier, where decontextualized cognitive frameworks are poorly aligned to specific contexts.

ENABLE

Explore work helps intervention designers, but we can also go further. The people affected by an intervention could be involved in designing it themselves; they could also be enabled to drive changes that benefit themselves or society at large.

Going further like this is important because many applications of behavioral science have been top-down, with a 'choice architect' enabling certain outcomes.483 Critics have argued that such a setup is problematic because it is manipulative, identifies people's preferences poorly, and limits transparency, learning, and autonomy.484

We do not explore those questions here.485 Instead, we emphasize that BIT and others have always supported and practiced a wider range of empowering approaches. From the start, the field has promoted techniques such as using natural frequencies to improve understanding of statistics, an idea that is often placed in opposition to nudging. This fact is only surprising if you have a simplistic either/or view of nudge versus other approaches; the reality is that a broader palette is and always has been in use. The way this palette should be used is still under debate - there are also hurdles to making deliberative engagement feasible in practice.486
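As an illustration of the natural-frequency technique just mentioned: instead of reporting conditional probabilities ("the test is 90% sensitive, with a 9% false positive rate"), one restates them as counts of concrete people. The screening numbers in this sketch are invented purely for illustration:

```python
# Natural frequencies: restate probabilities as counts out of 1,000 people.
# All numbers below are invented for illustration.
population = 1000
prevalence = 0.01           # 10 in 1,000 have the condition
sensitivity = 0.90          # 9 of those 10 test positive
false_positive_rate = 0.09  # ~89 of the 990 without it also test positive

with_condition = population * prevalence
true_positives = with_condition * sensitivity
false_positives = (population - with_condition) * false_positive_rate

# The natural-frequency statement: "Of the ~98 people who test positive,
# only 9 actually have the condition."
ppv = true_positives / (true_positives + false_positives)
print(f"{true_positives:.0f} of {true_positives + false_positives:.0f} "
      f"people who test positive actually have the condition (about {ppv:.0%})")
```

Stated as counts, it is immediately clear that most positive results are false alarms - the kind of understanding that the conditional-probability framing tends to obscure.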

As we explain below, more can and should be done to broaden ownership of behavioral science approaches. Attention has focused on three main ways of doing so:

NUDGE PLUS
Nudge plus is where a prompt to encourage reflection is built into the design and delivery of a nudge (or occurs close to it). People cannot avoid being made aware of the nudge and its purpose, enabling them to decide whether they approve of it or not. While some standard nudges, like commitment devices, already contain an element of self-reflection, a nudge plus must include an 'active trigger' - just having the potential for reflection 'is not sufficient to prompt deliberation and cause lasting behavior change'.487

SELF-NUDGES
A self-nudge is where someone designs a nudge to influence their own behavior. In other words, they 'structure their own decision environments' to make an outcome they desire more likely.488 An example might be creating a reminder to store snacks in less obvious and accessible places after they are bought.

BOOSTS
Boosts emerge from the perspective that many of the heuristics we use to navigate our lives are useful and can be taught. A boost is when someone is helped to develop a skill, based on behavioral science, that will allow them to exercise their own agency and achieve their goals.489 Boosts aim at building people's competences to influence their own behavior, whereas nudges try to alter the surrounding context and leave such competences unchanged. The similar but lesser-known idea of 'steer' preceded boosts.490

When these ideas are discussed, there is often an underlying sense of "we need to move away from nudging and towards these approaches". But to frame things this way neglects two crucial questions: how empowerment actually happens; and when to use the different approaches.

EMPOWERMENT CUTS ACROSS EXISTING LABELS
Right now, there is often a simplistic division between disempowering nudges on one side and enabling nudge plus/self-nudges/boosts on the other. In fact, these labels disguise two real drivers of empowerment that cut across the categories. They are:

1. How far the person performing the behavior is involved in shaping the initiative itself. They could not be involved at all, involved in co-designing the intervention, or initiating and driving the intervention itself.

2. The level and nature of any capacity created by the intervention. The intervention may create none (i.e., have no cognitive or motivational effects), it may create awareness (i.e., the ability to reflect on what is happening), or it may build the ability to carry out an action (e.g., a skill).

In the figure opposite, we show how the different proposals map against these two drivers.491

We now want to highlight the main things this figure reveals. Co-design requires some explanation. Co-design uses creative methods 'to engage citizens, stakeholders and officials in an iterative process to respond to shared problems'.492 In other words, the people affected by an issue or change are involved as participants, rather than subjects. This involvement is intended to create more effective, tailored, and appropriate interventions that respond to a broader range of evidence.493

These co-design methods can mesh well with behavioral science approaches. Both have a pragmatic focus on what can be done in the real world; both highlight aspects of behavior that escape rational actor models.494 Facilitated co-design sessions could help participants see how findings from behavioral science are relevant to an issue and decide what interventions are justified. For example, the UK's move to automatic enrollment for pensions was preceded by deliberative events, including a National Pensions Day where 1,000 people from across the UK were presented with evidence on the issue; 72% voted to adopt automatic enrollment with the right to opt out (and 20% for no right to opt out).495

[Figure: the enabling approaches mapped against the two drivers. Columns show the level of involvement in the initiative (none, co-design, initiating/driving); rows show the level and nature of capacity created (none, self-reflection, action). The cells are: NUDGE (no involvement, no capacity created; e.g., healthy foods placed prominently in a canteen), with a co-design variant (e.g., employees deciding they want healthy foods to be prominent and developing the arrangement); SELF-NUDGE (initiating/driving, no capacity created; e.g., creating regular automatic bank transfers to a savings account, or placing letters to be posted on the doorhandle); NUDGE+ (self-reflection; e.g., gyms introducing commitment devices to increase usage, or users working with gyms to develop commitment devices that actively remind users of their exercise goals); "PATERNALISTIC" BOOST (no involvement, action; e.g., an initiator decides that the best option is to teach people heuristics to increase savings); BOOST (co-design, action); and "SELF-BOOST" (initiating/driving, action; e.g., an individual reads about heuristics and constructs their own heuristics for savings goals).]

The point here is that people may be heavily engaged in selecting and developing a nudge intervention that nonetheless does not trigger any reflection or build any skills (the main focus of approaches such as nudge plus and boost). People may choose, with consideration and full information, that they do not want those things to happen.496

This choice is similar to the idea of self-nudging, in the top right-hand corner of the figure. People are enabled to create their own nudges that they may then forget about, even as they continue to work. They have exercised agency, even though they may not experience autonomy later, while the nudge operates.497

In contrast, in the bottom left of the figure, a policy maker has decided that the best option is for someone to be taught a "boost" (e.g., simple rules of thumb for managing finances). In the absence of greater engagement, there is a risk that this becomes "paternalistic boosting", where a policy maker has assumed that people will want this approach. While proponents of boosts say that 'individuals choose to engage or not to engage with a boost', they also say 'sometimes, lack of motivation may even be addressed with specific boosts (or nudges).'498 These two things seem to be in tension.

"Paternalistic boosting" is in contrast with the bottom right corner of the figure, "self-boosting", where people learn about heuristics and then proactively apply them to their own challenges, absent any prompting from a policy maker.499

Our point is that these distinctions about different ways to enable are absent from current debates about "nudging versus boosting".500 Identifying these two drivers of involvement and capacity shows how the way in which these approaches are applied matters.

There is a final element that is missing from the current debate - how far enabling people can lead to a major decentering of the use of behavioral science. The approaches in the figure above tend to assume either that a central actor is creating the intervention or, if the person concerned is also the creator, that the intervention is focused on themselves.

But if more people are enabled to use behavioral science, they may decide to introduce interventions that influence others. Rather than just creating self-nudges through altering their immediate environments, they may decide that wider system changes are needed instead. In other words, a range of people could be boosted to create nudges that generate positive societal change (with no 'central' actors involved).

These are not new ideas. In 1969, George Miller encouraged psychologists to find how best to 'give psychology away'.501 In a little-noticed section of Nudge, Thaler and Sunstein actually propose that 'workplaces, corporate boards, universities, religious organizations, clubs, and even families' could decide to nudge themselves.502 The creators of Nudge Plus do suggest that it could 'provide a link between citizen action on public policy issues and bottom-up movements for social and political action.'503

However, we need more examples of how this can be done. One might be the way that the Fair Tax Mark campaign emerged in the UK. The Mark was a sign that a vendor had not engaged in tax avoidance; it acted as a nudge, since it provided a signal to consumers without restricting their choice.504 But the Mark was created by a not-for-profit actor, rather than by a government. One could imagine that many other groups could be enabled to launch campaigns like these that draw on behavioral science principles.

These ideas point towards a new dynamic for the public sector. Proponents of co-design argue that more traditional, top-down approaches are inadequate for addressing wicked problems, whose multi-dimensional nature requires input from both experts and the public.505 This idea chimes with our proposals above on 'system stewardship'. If there is an increasing need to shift away from top-down control towards enabling conditions for behavior, then this changes the role of policy designers. Rather than being architects, they may need to be more like facilitators, brokers, and partnership builders.506

We think this is the right direction of travel, and it shows how behavioral science can strengthen democratic engagement rather than weaken it.507 But to move forward we need a clearer understanding of the different ways people can be enabled, rather than trading competing labels.

When to use more enabling approaches?

The second crucial question is not 'is boost or nudge best?', but how to match the approach to the situation.508 We propose that the following criteria should be used to inform that judgment:509

1. OPPORTUNITY
How likely is it that people will be able to use enabling approaches at the moment of action? Put differently, how will people know that this is the moment to "strategically call on automatic processes" if automatic processes are already in play?510 There is much evidence that being aware of decision-making biases is very different from being able to counter them.511 Checks are needed as to whether realistic opportunities exist and how they can be exploited (e.g., by creating new habits).

2. ABILITY AND MOTIVATION
Sometimes people's true preference is not to be actively engaged in a choice or behavior (e.g., choosing not to choose). Similarly, 'if individuals lack the cognitive ability or motivation to acquire new skills or competences, then nudging is likely to be the more efficient intervention'.512

3. PREFERENCES
If someone's preferences about an issue are difficult to discover, then more enabling approaches may be better than nudging. This is also true if goals vary widely within a population, or if the same person has conflicting goals.513 Enabling approaches may also be less successful if the target behavior is not aligned to the individual's interests.514

4. LEARNING AND SETUP COSTS
The investment required from individuals (learning costs) and policy makers (setup costs) may vary between approaches. While creating boosts may take less effort than traditional education, they 'may require some hours of instruction and practice' and 'a regular investment of time'.515 In contrast, a nudge 'typically reduces the cognitive and other costs of decisions for choosers, who are freed from devoting time and attention to the problem'.516

5. EQUITY
If there are learning costs, who will want to bear them? This matters because, as proponents say, 'only those people who seek the competence offered by a boost will adopt it'.517 It is likely that the people with the most time and resources will engage and stick with the learning process, which means there could be inequitable outcomes. Equally, approaches that require less buy-in can reduce inequities.518

6. EFFECTIVENESS
Finally, there is the question of which approach is likely to be most effective at influencing behavior. Various claims have been made about why nudge plus or boosts are likely to produce deeper and longer-lasting changes than nudges alone.519 Studies that compare effectiveness are just emerging - some showing nudges outperforming boosts,520 some the reverse,521 and others that nudge plus does better than nudge alone.522 Clearly, the field is still in the empirical foothills of being able to compare different approaches for certain behaviors.

These criteria are all technical ones; autonomy and transparency may be valued as principles that outweigh all other considerations. But we argue that it's important for the Explore phase to consider these criteria, rather than assuming the kind of intervention that people prefer.

09
DATA SCIENCE FOR EQUITY
Recent years have seen growing interest in using new data
science techniques to reliably analyze the heterogeneity of large
datasets. The idea is to better understand what works best for
certain groups, and thereby tailor an offering to them.

This vision is often presented as straightforward and obviously desirable, but it runs almost immediately into ethical quandaries around manipulation and discrimination. There is also emerging evidence that people often object to personalization.

We propose that the main opportunity is for data science to identify the ways in which an intervention or situation appears to increase inequalities, and reduce them.523 We call this idea data science for equity. It uses data science to support not exploit. But it needs to be supported by other attempts to increase agency, and more data on which uses people find acceptable.

THE PROMISE OF MACHINE LEARNING
There are now more datasets available and better techniques to analyze them.524 While the term "data science" covers a wide range of activities, there has been particular interest in machine learning, a field that develops algorithms that can offer new insights into the heterogeneity we just discussed.525 Rather than explaining the details of machine learning, we want to focus on the results it can produce.

Investigating how effects vary by group is not a new idea. The idea of "segmentation" has been core to marketing for decades.526 Analyzing trial results by subgroups is a common step, albeit one that requires care, as the replication crisis showed.527 Machine learning offers more sophisticated, reliable and data-driven ways of detecting meaningful patterns in datasets.528

• A machine learning approach has been shown to be more effective than conventional segmentation approaches at analyzing patterns of US household energy usage to reduce peak consumption.529

• Danish data has shown that machine learning can accurately predict whether someone has a bacterial infection, thereby complementing physicians' attempts to limit unnecessary antibiotic prescribing.530

• The smartphone driving app in India mentioned earlier used machine learning to identify how feedback messages matched to driving abilities.

There has been much excitement around the idea that these results could be used to create more effective targeted interventions. In other words, people should receive the type or variation of intervention that works best for them - or they should be proactively targeted based on their predicted needs.531 "Scaling" a program becomes about mass tailoring rather than uniform deployment.
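To make subgroup analysis concrete, here is a minimal sketch with entirely invented numbers and segment names: it simulates a trial of a reminder message whose true effect is concentrated in one segment, then estimates the lift separately for each segment. Machine learning methods detect this kind of heterogeneity in a more sophisticated, data-driven way, but the underlying logic is the same.

```python
import random

random.seed(42)

# Simulated trial records: (segment, treated?, responded?). The true lift of
# the reminder is +15 points for "segment_a" and +3 points for "segment_b" --
# invented numbers, purely for illustration.
def simulate(segment, treated):
    base = 0.30 if segment == "segment_a" else 0.20
    lift = (0.15 if segment == "segment_a" else 0.03) if treated else 0.0
    return 1 if random.random() < base + lift else 0

data = []
for _ in range(20000):
    segment = random.choice(["segment_a", "segment_b"])
    treated = random.choice([0, 1])
    data.append((segment, treated, simulate(segment, treated)))

def estimated_lift(segment):
    """Difference in response rates between treated and control in a segment."""
    treated = [r for s, t, r in data if s == segment and t == 1]
    control = [r for s, t, r in data if s == segment and t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

for segment in ("segment_a", "segment_b"):
    print(f"{segment}: estimated lift = {estimated_lift(segment):+.3f}")
```

A uniform evaluation would average the two effects together; segment-level estimates are what make "mass tailoring" possible. Real applications also need the care the text refers to, guarding against false discoveries when many subgroups are searched.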

There are claims that these personalized or "algorithmic" nudges are much more effective than generic ones,532 particularly so-called "hypernudges" that can create highly-personalized online environments.533 More generally, the rise of bespoke interventions is often seen as the future for applied behavioral science.534 Usually the implicit or explicit message is that we should just accelerate towards this future as fast as possible.535

Many theories offer reasons why personalization should produce positive results, such as: perceived self-relevance; feeling of rightness or fit; familiarity or fluency; and self-efficacy.536 Real-world empirical studies also provide support.537 But we need to recognize that this vision of tailored interventions requires major technical, ethical and acceptability challenges to be overcome.

TECHNICAL CHALLENGES
For a start, some of the same criticisms from psychology's replication crisis may also apply to machine learning. Measurement, model selection, and the way claims are communicated have all been the targets of criticism.538 A common concern is whether the datasets that "train" machine learning algorithms introduce bias and reduce their generalizability.539 Some studies also question the added value of machine learning. One looked at whether researchers could predict certain life outcomes of children; complex machine learning models often failed to outperform simple benchmark approaches.540

Then there is the effort required, since 'personalizing nudges is not a small endeavor'.541 While machine learning may produce interesting results, data scientists actually spend more than half their time doing the mundane, painstaking work of cleaning data first.542 Accessing these datasets often raises privacy concerns, since they may include sensitive personal data.543

Finally, the recommendations need to be feasible for policy makers. Choosing the right number of segments involves balancing analytical concerns against knowledge about how many can be practically implemented.544 And the truth is, at least for the public sector, the infrastructure for delivering tailored interventions at scale often just does not exist. While it may exist in the private sector, this is often in online environments only; physical environments are much harder to personalize.545

ETHICAL CHALLENGES
Adopting new technologies is never value-free.546 There is no "view from nowhere" here either: using machine learning is not just a technocratic task. Ethical quandaries soon emerge.

First, using new techniques means that people are unlikely to know what data has been used to target them, and how.547 Arguably, individuals who are affected by algorithms are owed an explanation of how a decision has been made - but that can be hard when those algorithms are continually adapting and processing new data.548 Transparency is particularly important if the effects are widespread and lasting (one study found that targeted digital advertisements altered people's self-perceptions, with effects lasting at least two weeks).549

Machine learning datasets may feature detailed and sensitive information about an individual, including their biases and vulnerabilities.550 When combined with low transparency, the result could be greater opportunities to manipulate recipients into outcomes that are badly aligned with their preferences.551 If done systematically at large scale, this could even lead to market manipulation.552 We need to make sure that increased data capabilities do not just result in instruments of precise control (or perceived control).

Closely related to manipulation concerns is the fear that data science will create new opportunities to exploit, rather than help, the vulnerable. One aspect is algorithmic bias. Models that use datasets that reflect historical patterns of discrimination can produce results that reinforce these outcomes.553 For example, an algorithm will have a biased understanding of what a "successful" candidate is if it uses data from a company with discriminatory hiring practices. Examples range from racial bias in healthcare provision to gender bias in credit scoring, although some argue that algorithms are actually less likely to be biased than human judgment.554

The UK's Department for Work and Pensions (DWP) is trialing an algorithm that analyzes historical data to predict which future cases are fraudulent. The UK's National Audit Office reports that the DWP is aware that 'biased outcomes' could occur, since 'if the model were to disproportionately identify a group with a protected characteristic as more likely to commit fraud, the model could inadvertently obstruct fair access to benefits.'555
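One simple way to probe for such biased outcomes is to compare error rates across groups - for instance, how often genuinely non-fraudulent cases are wrongly flagged (the false positive rate) in each group. A minimal sketch, with invented records and hypothetical group labels:

```python
# Invented model outputs: (group, flagged_by_model, actually_fraudulent).
cases = [
    ("group_a", True, False), ("group_a", False, False), ("group_a", True, True),
    ("group_a", False, False), ("group_a", False, False), ("group_a", True, False),
    ("group_b", True, False), ("group_b", True, False), ("group_b", False, False),
    ("group_b", True, True), ("group_b", True, False), ("group_b", False, False),
]

def false_positive_rate(group):
    """Share of genuinely non-fraudulent cases in a group that the model flags."""
    innocent_flags = [flagged for g, flagged, fraud in cases
                      if g == group and not fraud]
    return sum(innocent_flags) / len(innocent_flags)

for group in ("group_a", "group_b"):
    print(f"{group}: false positive rate = {false_positive_rate(group):.2f}")
# A large gap between groups would be a warning sign that the model
# disproportionately flags one group's legitimate cases for review.
```

This is only one of several possible fairness metrics, and which metric is appropriate is itself a contested, value-laden choice.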

The DWP is attempting to manage these risks by:

• Pre-launch testing and continuous monitoring.

• "Fairness" analysis, which looks at how false positive results are distributed across groups with protected characteristics.

• Keeping the final decision with a human, and not telling caseworkers why each case has been flagged for review.

Since disadvantaged groups are more likely to be subject to the decisions of algorithms, there's a particular risk that inequalities will be perpetuated.556

And even if an algorithm is not making a decision, but just identifying people's situations, that itself can be used to exploit them. For example, new ways of identifying financial security could be used to target support more effectively - or to target high-interest loans more lucratively.

ACCEPTABILITY CHALLENGES
There is also emerging evidence that people object to personalization. While they support some personalized commercial services, like shopping and entertainment, they consistently oppose advertising that is customized based on sensitive information. Moreover, there is an "acceptability gap": even if people accept a personalized service, they are generally against the collection of sensitive information that personalization often relies on.557

While systematic data on acceptability is limited, there are many cases where people have an instinctive negative reaction against personalization.558 When a company tries personalization that crosses into being "creepy," uproar and damage to its reputation can ensue. The Dutch bank ING found this when it tried to introduce targeted advertising for additional products, based on its customers' behavior: after a massive backlash, the Dutch Data Protection Authority formally warned against this kind of marketing.559

Reasons why people react negatively to personalization include: perceiving it as an invasion of privacy, as when the data is too detailed (e.g., a previous transaction history) or clearly comes from another website;560 if it is seen as an explicit attempt at manipulation;561 if it contains already-known content;562 or if it seems to be based on a stereotypical judgment.563 For example, when overweight people thought they had been given information about a weight loss program based on their weight, rather than randomly, this led to more negative thoughts and lower intention to perform healthy behaviors.564

DATA SCIENCE FOR EQUITY
How to navigate these challenges? We propose that the best way forward is for data science to identify the ways in which an intervention or situation appears to increase inequalities, and for behavioral science to be used to reduce them.565 We call this idea data science for equity: using the power of data science to support not exploit.

For example, groups that are particularly likely to, say, miss a filing requirement could be offered preemptive help. Algorithms can be used to better explain the causes of increased knee pain experienced in disadvantaged communities, thereby giving physicians better information to act on.566 They could identify when frontline workers are discriminating (consciously or not) against service users.567 Or they could spot outliers who are achieving better outcomes than their situation would predict, and try to understand why - so good practices could be facilitated more widely.568

There are reasons to be optimistic. Biases in algorithmic judgments may be easier to fix than those in human judgment, and increasing effort is going towards this goal. For example, many models that tried to predict postpartum hemorrhage had 'high risk of bias' that could 'perpetuate or even worsen existing disparities in care'.569 Additional work produced new models that managed to reduce these biases.570 Best practice guides are now recognizing the need to build user experience perspectives into data science projects, so the effects of algorithms are considered fully.571

Behavioral science can make a particular contribution here. Since humans are training the machine learning models, we need to understand how biases in our perceptions and judgments are transferring to algorithms.572 There is growing interest in the ways that behavioral scientists can help understand and modify the "behavior" of algorithms, as part of a wider "behavioral data science" agenda.573

"Data science for equity" may seem like a platitude, but it's a very real choice: the combination of behavioral and data science is powerful, and has been used to create harm in the past. Nor is it a simple choice: we need to work through the implications of "data science for equity". The idea may be more ethically justified because it tries to reduce harm and injustice and help those who are vulnerable.574 But this is only one strand of the objections we just considered. Behavioral scientists need to consider all the "who", "how", "when", and "why" questions:

• Who does the personalization target, and using what criteria? Many places have laws or norms to ensure equal treatment based on personal characteristics, such as age, race, and gender. When does personalization violate those principles? Does personalization undermine social cohesion by exacerbating existing fault lines?575 These questions actually go beyond behavioral science, but they should not be ignored.

• How is the intervention constructed? To what extent do the recipients have awareness of the personalization, choice over whether it occurs, control over its level or nature, and the opportunity for giving feedback on it?576

• When is it implemented? Is it at a time when the participant is vulnerable? Would they likely regret it later, if they had time to reflect?

• Why is personalization happening? Does it aim to exploit and harm or support and protect the recipient, recognizing that those terms are often contested?

“WHY”: PURPOSE OF PERSONALIZATION


EXPLOITATIVE SUPPORTIVE
Manipulation that the field needs “Data science for equity” may be
“HOW”: LEVEL to reject justified in some instances
LOW
OF AWARENESS,
CHOICE, CONTROL,
OPPORTUNITY FOR Will not be tolerated, and Long-term goal for the field, but
HIGH therefore unlikely to occur also presents practical barriers
FEEDBACK

The “how” and “why” are perhaps the two most important factors for the idea of data science for equity. How far does a supportive intent justify an intervention that may not be transparent or apparent? As the simple table above indicates, the goal would be to move the “data science for equity” agenda from the top right-hand corner to the bottom right-hand corner. In other words: aim for a situation that also reduces the ethical issues with how data science is being applied.

Getting there will take time. But we can identify some obvious priorities now. One is to get more data on how acceptable people find different kinds of personalization to be. What tradeoffs would people find acceptable in order to receive higher-value personalized services?

A more practical goal is to increase people’s agency in the face of personalization, perhaps by developing their ability to detect tailoring or targeting when it happens. For example, a recent study found that a short intervention that prompted people to reflect on an aspect of their personality increased their ability to detect advertisements targeted at them by up to 26 percentage points.579 Teaching these kinds of techniques may be a good option, since they could be resilient to constantly changing technologies and strategies.
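To make the “why” × “how” framework concrete, here is a minimal sketch in Python. It is illustrative only: the enum and function names are ours, not part of the manifesto, and the four assessments are simply the cells of the table above.

```python
from enum import Enum

class Purpose(Enum):
    """The 'why': purpose of personalization."""
    EXPLOITATIVE = "exploitative"
    SUPPORTIVE = "supportive"

class Agency(Enum):
    """The 'how': recipients' level of awareness, choice, control, and feedback."""
    LOW = "low"
    HIGH = "high"

# The four cells of the "why" x "how" table, as a lookup.
ASSESSMENT = {
    (Purpose.EXPLOITATIVE, Agency.LOW): "Manipulation that the field needs to reject",
    (Purpose.SUPPORTIVE, Agency.LOW): '"Data science for equity" may be justified in some instances',
    (Purpose.EXPLOITATIVE, Agency.HIGH): "Will not be tolerated, and therefore unlikely to occur",
    (Purpose.SUPPORTIVE, Agency.HIGH): "Long-term goal for the field, but also presents practical barriers",
}

def assess(purpose: Purpose, agency: Agency) -> str:
    """Return the framework's assessment for one combination of purpose and agency."""
    return ASSESSMENT[(purpose, agency)]

print(assess(Purpose.SUPPORTIVE, Agency.HIGH))
```

A plain lookup table is deliberate here: it keeps the four judgments explicit and easy to revise as the field's norms evolve.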

Behavioral science can help guide our questions here as well. For example, one possibility is a move towards using algorithms to personalize prices. Behavioral science offers existing evidence about how people make judgments about the fairness of pricing, which can be relevant here.577 People may tolerate a high level of price tailoring if the products are very similar, or if firms can explain why the differences are fair (e.g., the higher price cross-subsidizes those who can’t afford as much).578

10
NO ‘VIEW FROM NOWHERE’

Behavioral scientists need to understand how they bring certain assumptions, privileges, and ways of seeing to what they do. We are always situated, embedded, and entangled with ideas and situations. We cannot pretend there is some set-aside position from which to observe the behavior of others - there is no “view from nowhere”.

Behavioral scientists may not see the extent to which they hold elite positions that stop them from understanding people who think differently. In addition, homogeneity in terms of gender, race, physical abilities, sexuality, and geography also influences the viewpoints, practices, and theories of behavioral scientists. So, rather than claiming that science is value-free, we need to find realistic ways of acknowledging and improving this reality.

We propose: improved self-scrutiny, with practitioners querying how their identities and experiences contribute to their stance on a topic; new ways for potential subjects of research to judge researchers and decide whether they want to participate; wider participation in intervention design to bring in new perspectives; and greater diversity among behavioral scientists, through actions like professional networks connecting the Global North and Global South.

Our final proposal is one of the most wide-ranging, challenging, and important. Behavioral scientists need to understand how they bring certain assumptions, privileges, and ways of seeing to what they do.581 We cannot pretend there is some set-aside position from which to observe the behavior of others. We are always situated, embedded, and entangled with ideas and situations.582 Yet it may not seem this way, since these entanglements constitute our way of seeing itself.

We sum up this idea as ‘no “view from nowhere”’. For the philosopher Thomas Nagel, the “view from nowhere” was an objective stance that allows us to ‘transcend our particular viewpoint’.583 We do not think achieving such a stance is possible for behavioral scientists. No objective observation deck outside society exists; the place from which we see behaviors and construct issues matters. Assuming otherwise can create harm.

Exploring, enabling, and decentering - the proposals we made just now raise the question of who is acting. It’s a crucial point. Deeper exploration may not be possible if you have not reckoned with where you are starting from. Applying a behavioral lens may be insufficient if you only consider who is seen through the lens, and not who is looking from the other side - or what the act of “seeing” entails.580

This section looks first at how the characteristics and positions of behavioral scientists as people can influence both their findings and the methods and concepts they use. We end by trying to offer ways of making behavioral scientists more aware of their perspectives, of diversifying the field of behavioral science, and of introducing greater equality between the people on either side of the behavioral science “lens”.

WHO BEHAVIORAL SCIENTISTS ARE INFLUENCES WHAT THEY DO

We start by considering the position of applied behavioral scientists. This is a group usually defined by having knowledge, skills, and education. Many of them are in positions where they can shape actions (public and private) through their advice and findings. Therefore, it’s fair to say that behavioral scientists as a group have a fairly privileged or elite position, especially when it comes to policy questions.

This elite positioning means that certain ideas are likely to get strong support in ways that escape full awareness. For example, one critical paper argues that behavioral economists overstate the value of saving for retirement because they are rationalizing their own tendency to save more than average. Similarly, the paper argues that their professional background means they rarely understand stances that are skeptical of education.584 We suspect that most people working to increase uptake of the Covid-19 vaccines have been vaccinated themselves.

The issue is not with behavioral scientists holding these positions per se: it’s about awareness. We may not see the extent to which we hold elite positions that prevent us from understanding people who think differently. There is evidence that politicians and bureaucrats misperceive the opinions of both the public in general and the users for whom they design services.585 In a recent study, US “policy elites” (e.g., judges, media pundits, lobbyists, scientists) misjudged public opinion on issues by around 14 percentage points.586

Often, such elites believe that others’ opinions are more similar to their own than they actually are, and that others will act the way they themselves would.587 In our Behavioral Government report, we called this the “illusion of similarity”.588 The danger is that policy elites are projecting their group’s values and preferences onto others, while thinking they are adopting a view from nowhere. This does not mean that policy makers can never act, but rather that they need to carefully understand their own positionality and that of others before acting.

The risk of this happening is not limited to the policy domain. The positioning of funders, researchers, and those researched affects the way many interventions are realized. In response to a local resident asking, “Why am I always being researched?”, the US non-profit Chicago Beyond argues that:

‘Funders are often cast as “outside of the work,” and researchers as objectively neutral and merely “observing the work.” This does not account for the biases and perspectives every person brings to the work. When data is analyzed and meaning is derived from the research, the power dynamic often mutes voices of those who are marginalized.’589

Of course, behavioral scientists have other aspects of identity apart from educational and professional status, including gender, race, physical abilities, sexuality, and geography. These features also influence our viewpoints.590 And here again there have been concerns that particular perspectives dominate. Only a quarter of the behavioral insights teams cataloged in a 2020 survey were based in the Global South.591 An over-reliance on using English in cognitive science has led to the impact of language on thought being under-estimated.592 In the US, there have been particular concerns over the lack of racial and ethnic diversity in the behavioral and social sciences, which are less diverse than biomedical sciences or engineering.593

Who behavioral scientists are influences what they do. The positions and perspectives of those studying behavior influence their theories and methods.594 So this raises questions about the accuracy and legitimacy of conclusions emerging from a relatively homogeneous elite.595 In the words of Neil Lewis, Jr., ‘Those in dominant positions within society and within disciplines do not notice the centrality of their positions, and as a result end up assuming that their patterns of thoughts, feelings, and behaviors are “normal” and neutral’.596

These assumptions then form a base for developing theories, conceptualizing variables, collecting data, and interpreting findings.597 For example, data from western, educated, industrialized, rich, and democratic (WEIRD) samples may be seen as more generalizable to humans as a whole.

One study examined this possibility by analyzing whether 5,000 psychology articles specified the racial/ethnic/national/cultural characteristics of the sample in question.598 The authors argue that doing so signals limited generalizability - i.e., “this conclusion only applies to x group”.

They find that samples from the United States were featured less in titles compared to both other WEIRD and non-WEIRD regions. At the same time, samples from the US were featured more if they referred to minorities ‘who may be perceived as exceptions to assumed generalizability of the White American population’. In other words, there may be an assumption that findings from White US residents are particularly generalizable. Hence there are now calls for psychology to generalize from - rather than just to - Africa.599

We can step back and see these arguments as part of a bigger debate about the role of science. The “view from nowhere” is one of positivism: the scientific method is a value-free pursuit of objective truths about human behavior that rise above politics and beliefs. But this perspective has been criticized through a long history of skeptical inquiry. Some have seen the scientific method as entrenching power structures or representing a particular Western perspective, for example.600

We are not proposing abandoning the scientific method; we have seen that randomized trials can lead to prior assumptions being questioned and even overturned. But we do agree that ‘values are entering the very process of producing knowledge within the behavioral sciences’.601 Rather than trying to deny this reality in favor of a “view from nowhere”, we should find ways of ensuring it produces better outcomes.

MORE SCRUTINY

We’re conscious that we don’t have all the answers here. A starting point could be for behavioral scientists to cultivate awareness of how their stances and viewpoints affect their practices. At one level, this task would involve researchers examining their relationship to and feelings about the topic in question, and querying how their identities and experiences contributed to that stance.602 Hypothesis generation could particularly benefit

from this exercise, since arguably it is closely informed by the researcher’s personal priorities and preferences.603 Working through the reasons why one is pro-vaccination may be the first step to developing the ability to perceive and feel how someone holds an opposing view - perhaps in terms of concerns about purity and trust.604

This self-reflexive scrutiny could also be applied to theories and methods. Behavioral scientists could be actively reflecting on interventions in progress, including what factors are contributing to power dynamics: ‘How are inequitable approaches, methods, measures filtering into the study, and what are opportunities to do differently?’605 The task may be uncomfortable, but attempts to change approaches can succeed. Anthropology moved away from claiming objectivity to recognizing the central role played by the researcher’s perspective.606

Self-scrutiny may not be enough. We should also find ways for people to judge researchers and decide whether they want to participate in research. We mean this in a broader sense that goes beyond consent forms. For example, Chicago Beyond has created a handbook that provides community organizations with criteria for clarifying what they want to get out of research and whether they want to participate.607 As we said before, the behavioral lens should work both ways.

MORE PARTICIPATION

If people do decide to participate, that participation can be set up to challenge the assumed “view from nowhere” - including through the co-design approaches discussed earlier. Experiments with politicians have shown that misperceptions regarding public opinion can be reduced by increasing exposure to voters.608 The nonprofit Ideas42 gives the example of a project in New York City where feedback from participants revealed that a major barrier to using new waste disposal units was the need to touch a handle. The handle was often dirty, and people were on the way to work or school. The intervention designers note that while ‘our team’s missed opportunities to improve key design elements illustrates our own biases’, these could be limited through resident participation.609

Looking beyond specific interventions, wider participation in behavioral science work could be a way of keeping the field fresh by bringing in new perspectives. Rory Sutherland has emphasized how behavior can be successfully influenced through the “alchemy” of asking non-obvious questions and adopting unique perspectives.610 Applied behavioral science may be weakened if it is a closed system limited to expert input only. We noted earlier the deadening effect of a “painting by numbers” approach that just uses a limited palette of heuristics and biases.

MORE DIVERSITY

If personal viewpoints and situations constrain behavioral science, then expanding the range of people who become behavioral scientists should allow the field to learn more. For example, building on our earlier point about hypothesis creation, there is evidence that cognitive diversity may particularly benefit ‘complex, multi-stage, creative problem solving, during problem posing and hypothesis generation’.611

Many kinds of diversity could be targeted. In terms of geographic diversity, projects in low- and middle-income countries should have researchers from those countries fully integrated. Doing this requires addressing barriers like a lack of professional networks connecting the Global North and Global South, and the time required to build understanding of the tactics required to write successful grant applications to funders.612 Within higher-income countries, much more could be done to increase the racial diversity of the behavioral science field, whether in terms of support for starting and completing PhDs,613 or reducing the significant racial gaps in publicly-funded research that exist in some countries.614 These changes require sustained effort, and may not happen quickly, but they are necessary for the future credibility of the field.615

ENDNOTES

1 Hallsworth, M. & Kirkman, E. (2020) Behavioral Insights. MIT Press. 16 Lepenies, R., Mackay, K., & Quigley, M. (2018). Three challenges for behavioural
science and policy: the empirical, the normative and the political. Behavioural Public
2 Chater, N., & Loewenstein, G. (2023). The i-frame and the s-frame: How focusing Policy, 2(2), 174-182.
on the individual-level solutions has led behavioral public policy astray. Behavioral
and Brain Sciences. DOI: https://doi.org/10.1017&am. Schmidt, R. (2022). A model 17 Schmidt, R., & Stenger, K. (2021).
for choice infrastructure: looking beyond choice architecture in Behavioral Public
Policy. Behavioural Public Policy, 1-26. Marteau, T. M., Ogilvie, D., Roland, M., 18 Van Rooij, B., & Fine, A. (2021). The behavioral code: The hidden ways the law
Suhrcke, M., & Kelly, M. P. (2011). Judging nudging: can nudging improve population makes us better or worse. Beacon Press. Andre, P., Haaland, I., Roth, C., & Wohlfart,
health? BMJ, 342. J. (2021). Narratives about the Macroeconomy (No. 18/21). CEBI Working Paper
Series.
3 Mažar, N., & Soman, D. (Eds.). (2022). Behavioral Science in the Wild.
University of Toronto Press. 19 Dolan, P., Hallsworth, M., Halpern, D., King, D., & Vlaev, I. (2010). MINDSPACE:
influencing behaviour for public policy. London: Institute for Government and Cabinet
4 Schmidt, R., & Stenger, K. (2021). Behavioral brittleness: The case for strategic Office.
behavioral public policy. Behavioural Public Policy, 1-26. Lambe, F., Ran, Y., Jürisoo,
M., Holmlid, S., Muhoza, C., Johnson, O., & Osborne, M. (2020). Embracing 20 Meder, B., Fleischhut, N., & Osman, M. (2018). Beyond the confines of choice
complexity: A transdisciplinary conceptual framework for understanding behavior architecture: A critical analysis. Journal of Economic Psychology, 68, 36-44.
change in the context of development-focused interventions. World Development, 126,
104703. Deaton, A., & Cartwright, N. (2018). Understanding and misunderstanding 21 Thibodeau, P. H., & Boroditsky, L. (2011). Metaphors we think with: The role of
randomized controlled trials. Social Science & Medicine, 210, 2-21 metaphor in reasoning. PloS One, 6(2), e16782.

5 Shrout, P. E., & Rodgers, J. L. (2018). Psychology, science, and knowledge 22 Feitsma, J. (2019). Brokering behaviour change: The work of behavioural insights
construction: Broadening perspectives from the replication crisis. Annual Review experts in government. Policy & Politics, 47(1), 37-56.
of Psychology, 69(1), 487-510. Stanley, T. D., Carter, E. C., & Doucouliagos, H.
(2018). What meta-analyses reveal about the replicability of psychological research. 23 Battaglio Jr, R. P., Belardinelli, P., Bellé, N., & Cantarelli, P. (2019). Behavioral
Psychological Bulletin, 144(12) public administration ad fontes: A synthesis of research on bounded rationality,
cognitive biases, and nudging in public organizations. Public Administration Review,
6 Muthukrishna, M., & Henrich, J. (2019). A problem in theory. Nature Human 79(3), 304-320.
Behaviour, 3(3), 221-229. Bryan, C. J., Tipton, E., & Yeager, D. S. (2021). Behavioural
science is unlikely to change the world without a heterogeneity revolution. Nature 24 Schmidt, R. (2022). A model for choice infrastructure: looking beyond choice
Human Behaviour, 5(8), 980-989. Sanbonmatsu, D. M., & Johnston, W. A. (2019). architecture in Behavioral Public Policy. Behavioural Public Policy, 1-26
Redefining science: The impact of complexity on theory development in social and
25 Soman, D., & Yeung, C. (Eds.). (2020). The behaviourally informed organization.
behavioral research. Perspectives on Psychological Science, 14(4)
University of Toronto Press.
7 IJzerman, H., Lewis, N.A., Przybylski, A.K. et al. Use caution when applying
26 HM Treasury (2020). Magenta Book 2020: Supplementary Guide: Handling
behavioural science to policy. Nature Human Behaviour 4, 1092–1094 (2020).
Complexity in Policy Evaluation.
https://doi.org/10.1038/s41562-020-00990-w
27 Boulton, J. G., Allen, P. M., & Bowman, C. (2015). Embracing complexity:
8 Grüne-Yanoff, T. (2012). Old wine in new casks: libertarian paternalism still
Strategic perspectives for an age of turbulence. Oxford University Press.
violates liberal principles. Social Choice and Welfare, 38(4), 635-645. Rizzo, M. J.,
& Whitman, G. (2020). Escaping paternalism: Rationality, behavioral economics, and 28 Angeli, F., Camporesi, S., & Dal Fabbro, G. (2021). The COVID-19 wicked
public policy. Cambridge University Press. problem in public health ethics: conflicting evidence, or incommensurable values?
Humanities and Social Sciences Communications, 8(1), 1-8
9 Ewert, B. (2020). Moving beyond the obsession with nudging individual
behaviour: Towards a broader understanding of Behavioural Public Policy. Public 29 Bak-Coleman, J. B., Alfano, M., Barfuss, W., Bergstrom, C. T., Centeno, M. A.,
Policy and Administration, 35(3), 337-360. Lamont, M., Adler, L., Park, B. Y., & Xiang, Couzin, I. D., ... & Weber, E. U. (2021). Stewardship of global collective behavior.
X. (2017). Bridging cultural sociology and cognitive psychology in three contemporary Proceedings of the National Academy of Sciences, 118(27), e2025764118.
research programmes. Nature Human Behaviour, 1(12), 866-872. Leggett, W. (2014).
The Politics of Behaviour Change: Nudge, Neoliberalism and the State. Policy & Politics 30 Scott, James C. (1998) Seeing like a State. Yale University Press
42(1), 3–19.
31 Chater, N., & Loewenstein, G. (2023).
10 Lorenz-Spreen, P., Geers, M., Pachur, T., Hertwig, R., Lewandowsky,
S., & Herzog, S. M. (2021). Boosting people’s ability to detect microtargeted 32 Boulton, J. G., Allen, P. M., & Bowman, C. (2015). Embracing complexity:
advertising. Scientific Reports, 11(1), 1-9. Mills, S. (2022). Personalized nudging. Strategic perspectives for an age of turbulence. OUP Oxford.
Behavioural Public Policy, 6(1), 150-159. Mohlmann, M. (2021) Algorithmic Nudges
Don’t Have to Be Unethical. Harvard Business Review. Mills, S. (2022). Personalized 33 Schill, C., et al. (2019). A more dynamic understanding of human behaviour for
Nudging. Behavioural Public Policy, 6(1), 150-159. the Anthropocene. Nature Sustainability, 2(12), 1075-1082.

11 Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the 34 DiMaggio, P., & Markus, H. R. (2010). Culture and social psychology:
world? Behavioral and Brain Sciences, 33(2-3), 61-83. Converging perspectives. Social Psychology Quarterly, 73(4), 347-352.

12 Lewis Jr, N. A. (2021). What counts as good science? How the battle for 35 Hallsworth, M. (2017). Rethinking public health using behavioural science.
methodological legitimacy affects public psychology. American Psychologist, 76(8), Nature Human Behaviour, 1(9), 612-612.
1323. Dupree, C. H., & Kraus, M. W. (2022). Psychological science is not race neutral.
Perspectives on Psychological Science, 17(1), 270-275. 36 Asano, Y. M., Kolb, J. J., Heitzig, J., & Farmer, J. D. (2021). Emergent inequality
and business cycles in a simple behavioral macroeconomic model. Proceedings of the
13 Levitt, S. D., & List, J. A. (2007). What do laboratory experiments National Academy of Sciences, 118(27).
measuring social preferences reveal about the real world? Journal of Economic
Perspectives, 21(2), 153-174. 37 Bak-Coleman, et al. (2021). Stewardship of global collective
behavior. Proceedings of the National Academy of Sciences, 118(27), e2025764118.
14 Hansen, P. G. (2019). Tools and ethics for applied behavioural insights: The
BASIC toolkit. Organisation for Economic Cooperation and Development, OECD. 38 Jones-Rooy, A., & Page, S. E. (2012). The complexity of system effects. Critical
Review, 24(3), 313-342.
15 Schmidt, R., & Stenger, K. (2021). Behavioral brittleness: the case for strategic
behavioral public policy. Behavioural Public Policy, 1-26. 39 Hawe, P., Shiell, A., & Riley, T. (2009). Theorising interventions as events in
systems. American Journal of Community Psychology, 43(3-4), 267-276.
The Behavioural Insights Team / A Manifesto for Applying Behavioral Science 97

40 Hallsworth, M. (2011) System Stewardship: The future of policymaking? Institute 61 https://osf.io/zuh93/


for Government.
62 Landy, J. F., et al. (2020). Crowdsourcing hypothesis tests: Making transparent
41 Rates, C. A., Mulvey, B. K., Chiu, J. L., & Stenger, K. (2022). Examining how design choices shape research results. Psychological Bulletin, 146(5), 451.
ontological and self-monitoring scaffolding to improve complex systems thinking with a
participatory simulation. Instructional Science, 1-23. 63 Van Bavel, J. J., Mende-Siedlecki, P., Brady, W. J., & Reinero, D. A. (2016).
Contextual sensitivity in scientific reproducibility. Proceedings of the National Academy
42 Fernandes, L., Morgado, L., Paredes, H., Coelho, A., & Richter, J. (2019). of Sciences, 113(23), 6454-6459.
Immersive learning experiences for understanding complex systems. In: iLRN 2019
London-Workshop, Long and Short Paper, Poster, Demos, and SSRiP Proceedings from 64 Damschroder, L. J., Aron, D. C., Keith, R. E., Kirsh, S. R., Alexander, J. A., &
the Fifth Immersive Learning Research Network Conference (pp. 107-113). Verlag der Lowery, J. C. (2009). Fostering implementation of health services research findings
Technischen Universität Graz. into practice: a consolidated framework for advancing implementation science.
Implementation Science, 4(1), 1-15.
43 https://www.3ieimpact.org/sites/default/files/2021-07/complexity-blg-
Annex1-Checklist_assessing_level_complexity.pdf 65 Mazar, N. & Soman, D. (eds.) (2022) Behavioral Science in the Wild, University
of Toronto Press
44 Treasury, H. M. (2020). Magenta Book Annex: Handling complexity in policy
evaluation. Deaton, A., & Cartwright, N. (2018). Understanding and misunderstanding 66 Brenninkmeijer, J., Derksen, M., Rietzschel, E., Vazire, S., & Nuijten, M. (2019).
randomized controlled trials. Social Science & Medicine, 210, 2-21. Informal laboratory practices in psychology. Collabra: Psychology, 5(1).

45 Bonell, C., Jamal, F., Melendez-Torres, G. J., & Cummins, S. (2015). ‘Dark 67 McShane, B. B., Tackett, J. L., Böckenholt, U., & Gelman, A. (2019). Large-scale
logic’: theorising the harmful consequences of public health interventions. J Epidemiol replication projects in contemporary psychological research. The American Statistician,
Community Health, 69(1), 95-98. 73(sup1), 99-105.

46 Kim, D. A., Hwong, A. R., Stafford, D., Hughes, D. A., O’Malley, A. J., Fowler, 68 Oberauer, K., & Lewandowsky, S. (2019). Addressing the theory crisis in
J. H., & Christakis, N. A. (2015). Social network targeting to maximise population psychology. Psychonomic bulletin & review, 26(5), 1596-1618. Borsboom, D., van
behaviour change: A cluster randomised controlled trial. The Lancet, 386(9989), 145- der Maas, H. L., Dalege, J., Kievit, R. A., & Haig, B. D. (2021). Theory construction
153. methodology: A practical framework for building theories in psychology. Perspectives
on Psychological Science, 16(4), 756-766.
47 Berry, D. A. (2006). Bayesian clinical trials. Nature Reviews Drug Discovery,
5(1), 27-36. 69 Sanbonmatsu, D. M., Cooley, E. H., & Butner, J. E. (2021). The Impact of
Complexity on Methods and Findings in Psychological Science. Frontiers in Psychology,
48 https://www.bi.team/blogs/running-rcts-with-complex-interventions/ 4028.

49 Volpp, K. G., Terwiesch, C., Troxel, A. B., Mehta, S., & Asch, D. A. (2013, June). 70 Oberauer, K., & Lewandowsky, S. (2019). Addressing the theory crisis in
Making the RCT more useful for innovation with evidence-based evolutionary testing. In psychology. Psychonomic Bulletin & Review, 26(5), 1596-1618.
Healthcare (Vol. 1, No. 1-2, pp. 4-7). Elsevier.
71 Fried, E. I. (2020). Theories and models: What they are, what they are for, and
50 Kidwell, K. M., & Hyde, L. W. (2016). Adaptive interventions and SMART what they are about. Psychological Inquiry, 31(4), 336-344.
designs: application to child behavior research in a community setting. American
Journal of Evaluation, 37(3), 344-363. 72 Muthukrishna, M., & Henrich, J. (2019). A problem in theory. Nature Human
Behaviour, 3(3), 221-229.
51 Caria, S., Kasy, M., Quinn, S., Shami, S., & Teytelboym, A. (2020). An adaptive
targeted field experiment: Job search assistance for refugees in Jordan. SSRN. 73 http://behavioralscientist.org/there-is-more-to-behavioral-science-than-
biases-and-fallacies/
52 https://www.cecan.ac.uk/wp-content/uploads/2020/08/EPPN-No-03-
Agent-Based-Modelling-for-Evaluation.pdf 74 Muthukrishna, M., & Henrich, J. (2019). A problem in theory. Nature Human
Behaviour, 3(3), 221-229.
53 Schlüter, M., et al. (2017). A framework for mapping and comparing behavioural
theories in models of social-ecological systems. Ecological Economics, 131, 21-35. 75 Abner, G. B., Kim, S. Y., & Perry, J. L. (2017). Building evidence for public human
Wijermans, N., Boonstra, W. J., Orach, K., Hentati‐Sundberg, J., & Schlüter, M. (2020). resource management: Using middle range theory to link theory and data. Review of
Behavioural diversity in fishing—Towards a next generation of fishery models. Fish and Public Personnel Administration, 37(2), 139-159.
Fisheries, 21(5), 872-890.
76 Moore, L. F., Johns, G., & Pinder, C. C. (1980). Toward middle range theory.
54 Schill, C., et al. (2019). A more dynamic understanding of human behaviour for Middle range theory and the study of organizations, 1-16.
the Anthropocene. Nature Sustainability, 2(12), 1075-1082.
77 Sanbonmatsu, D. M., Cooley, E. H., & Butner, J. E. (2021). The Impact of
55 Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Complexity on Methods and Findings in Psychological Science. Frontiers in Psychology,
Undisclosed flexibility in data collection and analysis allows presenting anything as 4028.
significant. Psychological Science, 22(11), 1359-1366; Nelson, L. D., Simmons, J., &
Simonsohn, U. (2018). Psychology’s renaissance. Annual Review of Psychology, 69, 78 Berkman, E. T., & Wilson, S. M. (2021). So useful as a good theory? The
511-534. practicality crisis in (social) psychological theory. Perspectives on Psychological
Science, 16(4), 864-874.
56 Stanley, T. D., Carter, E. C., & Doucouliagos, H. (2018). What meta-analyses
reveal about the replicability of psychological research. Psychological Bulletin, 79 Lieder, F., & Griffiths, T. L. (2020). Resource-rational analysis: Understanding
144(12), 1325. human cognition as the optimal use of limited computational resources. Behavioral and
Brain Sciences, 43.
57 Van Bavel, J. J., Mende-Siedlecki, P., Brady, W. J., & Reinero, D. A. (2016).
Contextual sensitivity in scientific reproducibility. Proceedings of the National Academy 80 Callaway, F., Hardy, M., & Griffiths, T. (2022). Optimal nudging for cognitively
of Sciences, 113(23), 6454-6459. Bryan, C. J., Tipton, E., & Yeager, D. S. (2021). bounded agents: A framework for modeling, predicting, and controlling the effects of
Behavioural science is unlikely to change the world without a heterogeneity revolution. choice architectures. Working Paper.
Nature Human Behaviour, 5(8), 980-989.
81 Roese, N. J., & Vohs, K. D. (2012). Hindsight bias. Perspectives on Psychological
58 McShane, B. B., Tackett, J. L., Böckenholt, U., & Gelman, A. (2019). Large-scale Science, 7(5), 411-426.
replication projects in contemporary psychological research. The American Statistician,
73(sup1), 99-105. 82 Henriksen, K., & Kaplan, H. (2003). Hindsight bias, outcome knowledge and
adaptive learning. BMJ Quality & Safety, 12(suppl 2), ii46-ii50. Bukszar, E., &
59 Sanbonmatsu, D. M., Cooley, E. H., & Butner, J. E. (2021). The Impact of Connolly, T. (1988). Hindsight bias and strategic choice: Some problems in learning
Complexity on Methods and Findings in Psychological Science. Frontiers in Psychology, from experience. Academy of Management Journal, 31(3), 628-641. DellaVigna, S.,
4028. Pope, D., & Vivalt, E. (2019). Predict science to improve science. Science, 366(6464),
428-429.
60 Cartwright, N., & Hardie, J. (2012). Evidence-based policy: A practical guide to
doing it better. Oxford University Press. Soman, D. & Mazar, N. (2022) The Science of 83 DellaVigna, S., Pope, D., & Vivalt, E. (2019). Predict science to improve science.
Translation and Scaling. In: Mazar and Soman (eds.) Behavioral Science in the Wild, Science, 366(6464), 428-429.
University of Toronto Press, pp.5-19.
84 Munnich, E., & Ranney, M. A. (2019). Learning from surprise: Harnessing a metacognitive surprise signal to build and adapt belief networks. Topics in Cognitive Science, 11(1), 164-177.

85 Dimant, E., Clemente, E. G., Pieper, D., Dreber, A., & Gelfand, M. (2022). Politicizing mask-wearing: predicting the success of behavioral interventions among republicans and democrats in the US. Scientific Reports, 12(1), 1-12. DellaVigna, S., & Linos, E. (2022). RCTs to scale: Comprehensive evidence from two nudge units. Econometrica, 90(1), 81-116.

86 Ackerman, R., Bernstein, D. M., & Kumar, R. (2020). Metacognitive hindsight bias. Memory & Cognition, 48(5), 731-744.

87 Pezzo, M. (2003). Surprise, defence, or making sense: What removes hindsight bias? Memory, 11(4-5), 421-441.

88 Schmidt, R., & Stenger, K. (2021). Behavioral brittleness: the case for strategic behavioral public policy. Behavioural Public Policy, 1-26.

89 Dorison, C. A., & Heller, B. H. (2022). Observers penalize decision makers whose risk preferences are unaffected by loss–gain framing. Journal of Experimental Psychology: General, 151(9).

90 Porter, T., Elnakouri, A., Meyers, E. A., Shibayama, T., Jayawickreme, E., & Grossmann, I. (2022). Predictors and consequences of intellectual humility. Nature Reviews Psychology, 1(9), 524-536.

91 Hallsworth, M., Egan, M., Rutter, J., & McCrae, J. (2018). Behavioural government: Using behavioural science to improve how governments make decisions. London: Institute for Government.

92 Van Bavel, R., & Dessart, F. J. (2018). The case for qualitative methods in behavioural studies for EU policy-making. Publications Office of the European Union: Luxembourg. https://www.povertyactionlab.org/blog/8-11-21/strengthening-randomized-evaluations-through-incorporating-qualitative-research-part-1

93 Walton, G. M., & Wilson, T. D. (2018). Wise interventions: Psychological remedies for social and personal problems. Psychological Review, 125(5), 617.

94 Lewis Jr, N. A. (2021). What counts as good science? How the battle for methodological legitimacy affects public psychology. American Psychologist, 76.

95 Lamont, M., Adler, L., Park, B. Y., & Xiang, X. (2017). Bridging cultural sociology and cognitive psychology in three contemporary research programmes. Nature Human Behaviour, 1(12), 866-872. Vaisey, S. (2009). Motivation and justification: A dual-process model of culture in action. American Journal of Sociology, 114(6), 1675-1715.

96 Richardson, L., & John, P. (2021). Co-designing behavioural public policy: lessons from the field about how to ‘nudge plus’. Evidence & Policy, 17(3), 405-422. Hallsworth, M. & Kirkman, E. (2020).

97 Banerjee, S., & John, P. (2020). Nudge plus: incorporating reflection into behavioral public policy. Behavioural Public Policy, 1-16. Reijula, S., & Hertwig, R. (2022). Self-nudging and the citizen choice architect. Behavioural Public Policy, 6(1), 119-149. Hertwig, R., & Grüne-Yanoff, T. (2017). Nudging and boosting: Steering or empowering good decisions. Perspectives on Psychological Science, 12(6), 973-986.

98 Hertwig, R. (2017). When to consider boosting: some rules for policy-makers. Behavioural Public Policy, 1(2), 143-161. See also: Grüne-Yanoff, T., Marchionni, C., & Feufel, M. A. (2018). Toward a framework for selecting behavioural policies: How to choose between boosts and nudges. Economics & Philosophy, 34(2), 243-266.

99 Sims, A., & Müller, T. M. (2019). Nudge versus boost: A distinction without a normative difference. Economics & Philosophy, 35(2), 195-222.

100 Big-data studies of human behaviour need a common language. Nature, July 8, 2021. Yarkoni, T., & Westfall, J. (2017). Choosing prediction over explanation in psychology: Lessons from machine learning. Perspectives on Psychological Science, 12(6), 1100-1122.

101 Wager, S., & Athey, S. (2018). Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523), 1228-1242. Künzel, S. R., Sekhon, J. S., Bickel, P. J., & Yu, B. (2019). Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 116(10), 4156-4165.

102 Todd-Blick, A., Spurlock, C. A., Jin, L., Cappers, P., Borgeson, S., Fredman, D., & Zuboy, J. (2020). Winners are not keepers: Characterizing household engagement, gains, and energy patterns in demand response using machine learning in the United States. Energy Research & Social Science, 70, 101595.

103 Mills, S. (2022). Personalized nudging. Behavioural Public Policy, 6(1), 150-159.

104 Soman, D., & Hossain, T. (2021). Successfully scaled solutions need not be homogenous. Behavioural Public Policy, 5(1), 80-89.

105 Mills, S. (2022). Personalized nudging. Behavioural Public Policy, 6(1), 150-159. Lorenz-Spreen, P., Geers, M., Pachur, T., Hertwig, R., Lewandowsky, S., & Herzog, S. M. (2021). Boosting people’s ability to detect microtargeted advertising. Scientific Reports, 11(1), 1-9.

106 Kozyreva, A., Lorenz-Spreen, P., Hertwig, R., Lewandowsky, S., & Herzog, S. M. (2021). Public attitudes towards algorithmic personalization and use of personal data online: Evidence from Germany, Great Britain, and the United States. Humanities and Social Sciences Communications, 8(1), 1-11.

107 De Jonge, P., Verlegh, P. & Zeelenberg, M. (2022) If You Want People to Accept Your Intervention, Don’t Be Creepy. In: Soman, D. & Mazar, N. (eds) Behavioral Science in the Wild, University of Toronto Press, pp.284-291.

108 https://thedecisionlab.com/insights/technology/this-is-personal-the-dos-and-donts-of-personalization-in-tech

109 Matz, S. C., Kosinski, M., Nave, G., & Stillwell, D. J. (2017). Psychological targeting as an effective approach to digital mass persuasion. Proceedings of the National Academy of Sciences, 114(48), 12714-12719.

110 Lorenz-Spreen, P., Geers, M., Pachur, T., Hertwig, R., Lewandowsky, S., & Herzog, S. M. (2021). Boosting people’s ability to detect microtargeted advertising. Scientific Reports, 11(1), 1-9.

111 Lewis Jr, N. A. (2021). What counts as good science? How the battle for methodological legitimacy affects public psychology. American Psychologist, 76(8), 1323.

112 Nagel, T. (1986) The View from Nowhere. Sugden, R. (2013). The behavioural economist and the social planner: to whom should behavioural welfare economics be addressed? Inquiry, 56(5), 519-538.

113 Liscow, Z., & Markovits, D. (2022). Democratizing Behavioral Economics. Yale Journal on Regulation, 39(1217).

114 Roberts, S. O., Bareket-Shavit, C., Dollins, F. A., Goldie, P. D., & Mortenson, E. (2020). Racial inequality in psychological research: Trends of the past and recommendations for the future. Perspectives on Psychological Science, 15(6), 1295-1309.

115 https://gocommonthread.com/work/global-gavi/bi

116 Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33(2-3), 61-83. Cheon, B. K., Melani, I., & Hong, Y. Y. (2020). How USA-centric is psychology? An archival study of implicit assumptions of generalizability of findings to human nature based on origins of study samples. Social Psychological and Personality Science, 11(7), 928-937.

117 Dupree, C. H., & Kraus, M. W. (2022). Psychological science is not race neutral. Perspectives on Psychological Science, 17(1), 270-275.

118 Sendhil Mullainathan, keynote address to the Society of Judgment and Decision Making Annual Conference, 2022.

119 https://chicagobeyond.org/researchequity/

120 https://www.poverty-action.org/blog/locally-grounded-research-strengthening-partnerships-advance-science-and-impact-development

121 https://www.bi.team/blogs/increasing-economic-mobility-in-us-cities/. OECD (2017) Behavioural Insights and Public Policy: Lessons from Around the World; https://www.un.org/en/content/behaviouralscience/

122 Hallsworth, M., Snijders, V., Burd, H., Prestt, J., Judah, G., Huf, S., & Halpern, D. (2016). Applying behavioral insights: Simple ways to improve health outcomes. World Innovation Summit for Health.

123 Damgaard, M. T., & Nielsen, H. S. (2018). Nudging in education. Economics of Education Review, 64, 313-342.

124 Ferrari, L., Cavaliere, A., De Marchi, E., & Banterle, A. (2019). Can nudging improve the environmental impact of food supply chain? A systematic review. Trends in Food Science & Technology, 91, 184-192. Byerly, H., et al. (2018). Nudging pro‐environmental behavior: evidence and opportunities. Frontiers in Ecology and the Environment, 16(3), 159-168.

125 Dyson, P. & Sutherland, R. (2021) Transport for humans: Are we nearly there yet? London Publishing.

126 Cadario, R., & Chandon, P. (2020). Which healthy eating nudges work best? A meta-analysis of field experiments. Marketing Science, 39(3), 465-486.

127 Money Advice Service, Behavioural Insights Team, and Ipsos MORI (2018) A behavioural approach to managing money: Ideas and results from the Financial Capability Lab.
128 Benartzi, S., et al. (2017). Should governments invest more in nudging? Psychological Science, 28(8), 1041-1055.

129 Chater, N., & Loewenstein, G. (2023).

130 Sanders, M., Snijders, V., & Hallsworth, M. (2018). Behavioural science and policy: Where are we now and where are we going? Behavioural Public Policy, 2(2), 144-167.

131 Sanders, M., Snijders, V., & Hallsworth, M. (2018).

132 See, for example, Hansen, P. G. (2019). Tools and ethics for applied behavioural insights: The BASIC toolkit. Organisation for Economic Cooperation and Development, OECD; Datta, S., & Mullainathan, S. (2014). Behavioral design: a new approach to development policy. Review of Income and Wealth, 60(1), 7-35.

133 Schmidt, R., & Stenger, K. (2021). Behavioral brittleness: the case for strategic behavioral public policy. Behavioural Public Policy, 1-26.

134 Hallsworth, M., Berry, D., Sanders, M., Sallis, A., King, D., Vlaev, I., & Darzi, A. (2015). Stating appointment costs in SMS reminders reduces missed hospital appointments: findings from two randomised controlled trials. PloS One, 10(9), e0137306.

135 https://www.nsw.gov.au/behavioural-insights-unit/blog/reducing-missed-hospital-appointments-better-text-messages; Berliner Senderey, A., Kornitzer, T., Lawrence, G., Zysman, H., Hallak, Y., Ariely, D., & Balicer, R. (2020). It’s how you say it: Systematic A/B testing of digital messaging cut hospital no-show rates. PloS One, 15(6), e0234817.

136 Mills, S. (2020). Nudge/sludge symmetry: on the relationship between nudge and sludge and the resulting ontological, normative and transparency implications. Behavioural Public Policy, 1-24.

137 Schmidt, R., & Stenger, K. (2021). Behavioral brittleness: The case for strategic behavioral public policy. Behavioural Public Policy, 1-26.

138 https://medium.com/@DavePerrott/is-applied-behavioural-science-reaching-a-local-maximum-538b536f7e7d

139 Hansen, P. G. (2018). What are we forgetting? Behavioural Public Policy, 2(2), 190-197.

140 Costa, E., King, K., Dutta, R., & Algate, F. (2016). Applying behavioural insights to regulated markets. The Behavioural Insights Team.

141 https://www.bi.team/blogs/behaviour-change-and-the-new-sugar-tax/

142 The Behavioural Insights Team (2020) The Behavioural Economy.

143 Ewert, B., Loer, K., & Thomann, E. (2021). Beyond nudge: Advancing the state-of-the-art of behavioural public policy and administration. Policy & Politics, 49(1), 3-23.

144 Halpern, D. (2015). Inside the nudge unit: How small changes can make a big difference. WH Allen.

145 Hansen, P. G. (2018). What are we forgetting? Behavioural Public Policy, 2(2), 190-197.

146 Soman, D., & Yeung, C. (eds.) (2020). The behaviourally informed organization. University of Toronto Press.

147 Dolan, P., Hallsworth, M., Halpern, D., King, D., & Vlaev, I. (2010). MINDSPACE: influencing behaviour for public policy. London: Institute for Government and Cabinet Office.

148 Meder, B., Fleischhut, N., & Osman, M. (2018). Beyond the confines of choice architecture: a critical analysis. Journal of Economic Psychology, 68, 36-44.

149 Schmidt, R. (2022). A model for choice infrastructure: looking beyond choice architecture in Behavioral Public Policy. Behavioural Public Policy, 1-26. However, the notion of ‘infrastructure’ has connotations of physical rigidity, which may sit badly with the dynamic nature of human infrastructure (interactions and relationships).

150 Thibodeau, P. H., & Boroditsky, L. (2011). Metaphors we think with: The role of metaphor in reasoning. PloS One, 6(2), e16782.

151 Khan, Z. & Newman, L. (2021). Building behavioral science in an organization. Action Design Press.

152 Wendel, S. (October 5, 2020). Who is doing applied behavioural science? Results from a Global Survey of Behavioural Teams. https://behavioralscientist.org/who-is-doing-applied-behavioral-science-results-from-a-global-survey-of-behavioral-teams/

153 https://gocommonthread.com/work/global-gavi-bi/. Halpern, D. (2015) Inside the Nudge Unit. WH Allen. Angawi, A. & Hasanain, W. (2018) The Nuts and Bolts of Behavioral Insights Units. In: The Behavioral Economics Guide 2018, ed. Alain Samson. Robertson, T., Darling, M., Leifer, J., Footer, O., & Gordski, D. (2017). Behavioral design teams: The next frontier in clinical delivery innovation. Issue Brief, 2017, 1-16.

154 Battaglio Jr, R. P., Belardinelli, P., Bellé, N., & Cantarelli, P. (2019). Behavioral public administration ad fontes: A synthesis of research on bounded rationality, cognitive biases, and nudging in public organizations. Public Administration Review, 79(3), 304-320.

155 https://hbr.org/2017/10/why-coos-should-think-like-behavioral-economists

156 Soman, D. (2020). The Science of Using Behavioral Science. In: The Behaviourally Informed Organization (pp. 3-22). University of Toronto Press, pp.12-13.

157 Mayer, S., Shah, R., & Kalil, A. (2021). How cognitive biases can undermine program scale-up decisions. In: The Scale-Up Effect in Early Childhood and Public Policy (pp. 41-57). Routledge.

158 CPI (2021). Human Learning Systems Report, p.62.

159 Behavioral Insights Team (2018). Behavioral Government.

160 Feng, Kim & Soman (2020) have a related but different diagram, which focuses on what issues the behavioral strategies are applied to. Feng, B., Kim, M., & Soman, D. (2020). Embedding Behavioral Insights in Organizations. In: The Behaviourally Informed Organization (pp. 23-40). University of Toronto Press.

161 To minimize complexity, we are combining ‘knowledge’, referring to awareness of the concepts of behavioral science, and ‘capacity’, referring to the ability to implement them - e.g., construct interventions and test them.

162 Obviously, this diagram greatly simplifies matters. It does not fully account for the relationships between organizations (as in those between government departments). The World Bank adds a third category, ‘networked’, to represent resources that are diffused but coordinated. Afif, Z., Islan, W. W., Calvo-Gonzalez, O., & Dalton, A. (2018). Behavioral science around the world: Profiles of 10 countries. World Bank Group: Mind, Behavior, and Development Unit.

163 Feng, B., Kim, M., & Soman, D. (2020). Embedding Behavioral Insights in Organizations. In: The Behaviourally Informed Organization (pp. 23-40). University of Toronto Press, p.31.

164 Behavioral Insights Team (2018) Behavioral Government.

165 Herd, P. & Moynihan, D. (2018) Administrative Burden: Policymaking by Other Means. Russell Sage Foundation. Sunstein, C. R. (2022). Sludge audits. Behavioural Public Policy, 6(4), 654-673.

166 Cantarelli, P., Bellé, N., & Belardinelli, P. (2020). Behavioral public HR: Experimental evidence on cognitive biases and debiasing interventions. Review of Public Personnel Administration, 40(1), 56-81.

167 Cantarelli, P., Bellé, N., & Belardinelli, P. (2020).

168 Feitsma, J. (2019). Brokering behaviour change: the work of behavioural insights experts in government. Policy & Politics, 47(1), 37-56.

169 Hallsworth, M., & Kirkman, E. (2020). p.192.

170 Soman, D. (2020). The Science of Using Behavioral Science. In: The Behaviourally Informed Organization (pp. 3-22). University of Toronto Press, p.18.

171 Soman, D., & Yeung, C. (2020), p.265.

172 Soman, D., & Yeung, C. (2020), p.257.

173 Feitsma, J. N. P. (2019). Inside the behavioural state. Eleven International Publishing.

174 Soman, D., & Yeung, C. (2020), p.265.

175 Feitsma, J. N. P. (2019). Inside the behavioural state. Eleven International Publishing, p.109.

176 Soman and Yeung (2020) put forward the principle of the ‘behaviourally informed organization’. I support and draw on this work. However, for my purposes the term “informed” introduces an ambiguity. The processes in the ‘nudged organization’ may be seen as informed by behavioral science, even though few people may be informed of the principles being applied.

177 Ewert, B., & Loer, K. (2021). Advancing behavioural public policies: in pursuit of a more comprehensive concept. Policy & Politics, 49(1), 25-47.
178 Schmidt, R. (2022). A model for choice infrastructure: looking beyond choice architecture in Behavioral Public Policy. Behavioural Public Policy, 1-26.

179 Obviously, the focus here will change depending on whether you are looking to build behavioral science knowledge and capacity (i.e., move further right in the diagram). There are similarities here to the distinction that Lourenco et al. (2016) make between ‘behaviorally informed’ initiatives (designed after an explicit review of evidence on behavior) and ‘behaviorally aligned’ initiatives (initiatives that do not rely on behavioral evidence but which can be found to be in line with it).

180 Schmidt, R. (2022). A model for choice infrastructure: looking beyond choice architecture in Behavioral Public Policy. Behavioural Public Policy, 1-26.

181 Soman, D., & Yeung, C. (eds.) (2020). The behaviourally informed organization. University of Toronto Press.

182 Soman, D., & Yeung, C. (2020), p.284.

183 Grimmelikhuijsen, S., Jilke, S., Olsen, A. L., & Tummers, L. (2017). Behavioral public administration: Combining insights from public administration and psychology. Public Administration Review, 77(1), 45-56. Donohue, K., Özer, Ö., & Zheng, Y. (2020). Behavioral operations: Past, present, and future. Manufacturing & Service Operations Management, 22(1), 191-202.

184 Sunstein, C. R. (2022). Sludge audits. Behavioural Public Policy, 6(4), 654-673.

185 HM Treasury (2020) Magenta Book 2020: Supplementary Guide: Handling Complexity in Policy Evaluation.

186 Savona, N., Thompson, C., Rutter, H., & Cummins, S. (2017). Exposing complexity as a smokescreen: A qualitative analysis. The Lancet, 390, S3.

187 Hallsworth, M. & Kirkman, E. (2020), pp.185-186.

188 Rittel, H. W., & Webber, M. M. (1973). Dilemmas in a general theory of planning. Policy Sciences, 4(2), 155-169.

189 Bak-Coleman, J. B., et al. (2021). Stewardship of global collective behavior. Proceedings of the National Academy of Sciences, 118(27), e2025764118.

190 Krakauer, D. & West, G. (2021) The Complex Alternative: Complexity Scientists on the COVID-19 Pandemic. The Santa Fe Institute.

191 Angeli, F., Camporesi, S., & Dal Fabbro, G. (2021). The COVID-19 wicked problem in public health ethics: conflicting evidence, or incommensurable values? Humanities and Social Sciences Communications, 8(1), 1-8.

192 Bak-Coleman, J. B., et al. (2021).

193 Endo, A. (2020). Estimating the overdispersion in COVID-19 transmission using outbreak sizes outside China. Wellcome Open Research, 5.

194 Solé, R., & Elena, S. F. (2018). Viruses as complex adaptive systems (Vol. 15). Princeton University Press. https://stemacademicpress.com/stem-volumes-covid-19

195 Bak-Coleman, J. B., et al. (2021).

196 Turcotte-Tremblay, A. M., Gali, I. A. G., & Ridde, V. (2021). The unintended consequences of COVID-19 mitigation measures matter: practical guidance for investigating them. BMC Medical Research Methodology, 21(1), 1-17.

197 Lambe, F., Ran, Y., Jürisoo, M., Holmlid, S., Muhoza, C., Johnson, O., & Osborne, M. (2020). Embracing complexity: A transdisciplinary conceptual framework for understanding behavior change in the context of development-focused interventions. World Development, 126, 104703.

198 Ruth Schmidt and Katelyn Stenger sum up the challenges in terms of ‘contextual brittleness’, ‘systemic brittleness’, and ‘anticipatory brittleness’ in an excellent article. Schmidt, R., & Stenger, K. (2021). Behavioral brittleness: the case for strategic behavioral public policy. Behavioural Public Policy, 1-26.

199 Government Communication Service (2021). The Principles of Behaviour Change Communications, p.34. At: https://gcs.civilservice.gov.uk/publications/the-principles-of-behaviour-change-communications/

200 Center for Public Innovation (2020) Human Learning Systems Report, p.76.

201 Schmidt, R., & Stenger, K. (2021). Behavioral brittleness: the case for strategic behavioral public policy. Behavioural Public Policy, 1-26. Altmann, S., Grunewald, A., & Radbruch, J. (2021). Interventions and Cognitive Spillovers. The Review of Economic Studies, rdab087, https://doi.org/10.1093/restud/rdab087

202 Fisman, R. & Golden, M. (2017). How to fight corruption. Science, 356(6340): 803-804.

203 Taylor, R. L. (2019). Bag leakage: The effect of disposable carryout bag regulations on unregulated bags. Journal of Environmental Economics and Management, 93, 254-271.

204 Medina, P. C. (2021). Side effects of nudging: Evidence from a randomized intervention in the credit card market. The Review of Financial Studies, 34(5), 2580-2607.

205 Cook, R., & Rasmussen, J. (2005). “Going solid”: a model of system dynamics and consequences for patient safety. BMJ Quality & Safety, 14(2), 130-134.

206 Bak-Coleman, J. B., et al. (2021).

207 Boulton, J. G., Allen, P. M., & Bowman, C. (2015). Embracing complexity: Strategic perspectives for an age of turbulence. OUP Oxford.

208 Jones-Rooy, A., & Page, S. E. (2012). The complexity of system effects. Critical Review, 24(3), 313-342.

209 Dentoni, D., Bitzer, V., & Schouten, G. (2018). Harnessing wicked problems in multi-stakeholder partnerships. Journal of Business Ethics, 150(2), 333-356.

210 Barak‐Corren, N., & Kariv‐Teitelbaum, Y. (2021). Behavioral responsive regulation: Bringing together responsive regulation and behavioral public policy. Regulation & Governance, 15, S163-S182.

211 Hallsworth, M. (2012). How complexity economics can improve government: rethinking policy actors, institutions and structures. In: Complex New World: Translating New Economic Thinking into Public Policy. London: IPPR (Institute for Public Policy Research), 39-49.

212 Gras, D., Conger, M., Jenkins, A., & Gras, M. (2020). Wicked problems, reductive tendency, and the formation of (non-)opportunity beliefs. Journal of Business Venturing, 35(3).

213 Gras, D., Conger, M., Jenkins, A., & Gras, M. (2020).

214 Dunlop, C. A., & Radaelli, C. M. (2015). Overcoming Illusions of Control: How to Nudge and Teach Regulatory Humility. In A. Alemanno & A. L. Sibony (eds.) Nudge and the Law: A European Perspective (pp. 139-160). Oxford/London: Hart.

215 Scott, James C. (1998) Seeing Like a State. Yale University Press.

216 Boulton, J. G., Allen, P. M., & Bowman, C. (2015). Embracing complexity: Strategic perspectives for an age of turbulence. OUP Oxford.

217 I am aware of the argument that many social behaviors are ‘complex contagions’ and therefore require ‘contact with multiple sources of reinforcement in order to be transmitted’, as opposed to viruses, which are ‘simple contagions’ that may only require a single contact from a single source. Centola, D. (2018). How Behavior Spreads, p.37.

218 Macy, M., Deri, S., Ruch, A., & Tong, N. (2019). Opinion cascades and the unpredictability of partisan polarization. Science Advances, 5(8), eaax0754.

219 Schill, C., et al. (2019). A more dynamic understanding of human behaviour for the Anthropocene. Nature Sustainability, 2(12), 1075-1082. DiMaggio, P., & Markus, H. R. (2010). Culture and social psychology: Converging perspectives. Social Psychology Quarterly, 73(4), 347-352.

220 Chater, N., & Loewenstein, G. (2023).

221 Schill, C., et al. (2019).

222 There are similarities here to the idea of “Coleman’s boat” in sociology. Ramström, G. (2018). Coleman’s boat revisited: Causal sequences and the micro-macro link. Sociological Theory, 36(4), 368-391.

223 Abson, D. J., et al. (2017). Leverage points for sustainability transformation. Ambio, 46(1), 30-39.

224 Currin, C. B., Vera, S. V., & Khaledi-Nasab, A. (2022). Depolarization of echo chambers by random dynamical nudge. Scientific Reports, 12(1), 1-13.

225 Andreoni, J., Nikiforakis, N., & Siegenthaler, S. (2021). Predicting social tipping and norm change in controlled experiments. Proceedings of the National Academy of Sciences, 118(16), e2014893118.

226 Daniels, B. C., Krakauer, D. C., & Flack, J. C. (2017). Control of finite critical behaviour in a small-scale social system. Nature Communications, 8(1), 1-8.

227 Alexander, M., Forastiere, L., Gupta, S., & Christakis, N. A. (2022). Algorithms for seeding social networks can enhance the adoption of a public health intervention in urban India. Proceedings of the National Academy of Sciences, 119(30), e2120742119.

228 Hallsworth, M. (2017). Rethinking public health using behavioural science. Nature Human Behaviour, 1(9), 612-612.

229 https://ondrugs.substack.com/p/president-bidens-scheduling-directive
230 https://www.bbc.com/news/health-40444460

231 Jones-Rooy, A., & Page, S. E. (2012). The complexity of system effects. Critical Review, 24(3), 313-342.

232 Holland, J. H. (1996). Hidden order: How adaptation builds complexity. Addison Wesley Longman Publishing Co., Inc.; Holland, J. H. (2000). Emergence: From chaos to order. OUP Oxford.

233 Gigerenzer, G., & Gaissmaier, W. (2011). Heuristic decision making. Annual Review of Psychology, 62, 451-482.

234 Akerlof, G. A., & Shiller, R. J. (2010). Animal spirits: How human psychology drives the economy, and why it matters for global capitalism. Princeton University Press.

235 Asano, Y. M., Kolb, J. J., Heitzig, J., & Farmer, J. D. (2021). Emergent inequality and business cycles in a simple behavioral macroeconomic model. Proceedings of the National Academy of Sciences, 118(27).

236 Jones-Rooy, A., & Page, S. E. (2012). The complexity of system effects. Critical Review, 24(3), 313-342.

237 Bak-Coleman, J. B., et al. (2021).

238 Hallsworth, M. & Kirkman, E. (2020) Behavioral Insights. MIT Press.

239 https://tfl.gov.uk/info-for/media/press-releases/2022/february/new-tfl-data-shows-success-of-innovative-pedestrian-priority-traffic-signals

240 Hawe, P., Shiell, A., & Riley, T. (2009). Theorising interventions as events in systems. American Journal of Community Psychology, 43(3-4), 267-276.

241 Doctor, J. N., Wakker, P. P., & Wang, T. V. (2020). Economists’ views on the ergodicity problem. Nature Physics, 16(12), 1168-1168; Thomas, D. H., & Jona, L. (2018). ‘Good Nudge Lullaby’: Choice Architecture and Default Bias Reinforcement. The Economic Journal, 128(610), 1180-1206.

242 Hallsworth, M. (2011) System Stewardship. Institute for Government.

243 HM Treasury (2020). The Magenta Book: Supplementary Guide: Handling Complexity in Policy Evaluation.

244 Rates, C. A., Mulvey, B. K., Chiu, J. L., & Stenger, K. (2022). Examining ontological and self-monitoring scaffolding to improve complex systems thinking with a participatory simulation. Instructional Science, 1-23.

245 Fernandes, L., Morgado, L., Paredes, H., Coelho, A., & Richter, J. (2019). Immersive learning experiences for understanding complex systems. In iLRN 2019 London-Workshop, Long and Short Paper, Poster, Demos, and SSRiP Proceedings from the Fifth Immersive Learning Research Network Conference (pp. 107-113). Verlag der Technischen Universität Graz.

246 https://www.3ieimpact.org/sites/default/files/2021-07/complexity-blg-Annex1-Checklist_assessing_level_complexity.pdf

247 Hallsworth, M. & Kirkman, E. (2020). Behavioral Insights. MIT Press.

248 HM Treasury (2020) The Magenta Book: Supplementary Guide: Handling Complexity in Policy Evaluation.

249 Munafò, M. R., et al. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1(1), 1-9.

250 Deaton, A., & Cartwright, N. (2018). Understanding and misunderstanding randomized controlled trials. Social Science & Medicine, 210, 2-21.

251 HM Treasury (2020) The Magenta Book: Supplementary Guide: Handling Complexity in Policy Evaluation.

252 Boulton, J. G., Allen, P. M., & Bowman, C. (2015). Embracing complexity: Strategic perspectives for an age of turbulence. OUP Oxford, p.189.

253 Robinson, C. D., Chande, R., Burgess, S., & Rogers, T. (2021). Parent Engagement Interventions are Not Costless: Opportunity Cost and Crowd Out of Parental Investment. Educational Evaluation and Policy Analysis. DOI: 10.3102/01623737211030492

254 Note that this is a slightly different challenge from understanding why nudges backfire or fail in general, since that can happen in a relatively simple situation that someone has just misinterpreted. Osman, M., McLachlan, S., Fenton, N., Neil, M., Löfstedt, R., & Meder, B. (2020). Learning from behavioural changes that fail. Trends in Cognitive Sciences, 24(12).

255 Bonell, C., Jamal, F., Melendez-Torres, G. J., & Cummins, S. (2015). ‘Dark logic’: Theorising the harmful consequences of public health interventions. J Epidemiol Community Health, 69(1), 95-98.

256 Catlow, J., Bhardwaj-Gosling, R., Sharp, L., Rutter, M. D., & Sniehotta, F. F. (2022). Using a dark logic model to explore adverse effects in audit and feedback: A qualitative study of gaming in colonoscopy. BMJ Quality & Safety, 31(10), 704-715.

257 Schmidt, R., & Stenger, K. (2021). Behavioral brittleness: the case for strategic behavioral public policy. Behavioural Public Policy, 1-26.

258 Johnson, A. L., et al. (2020). Increasing the impact of randomized controlled trials: an example of a hybrid effectiveness–implementation design in psychotherapy research. Translational Behavioral Medicine, 10(3), 629-636.

259 Centola, D. (2018) How Behavior Spreads: The Science of Complex Contagions. Princeton University Press, p.63.

260 Kim, D. A., Hwong, A. R., Stafford, D., Hughes, D. A., O’Malley, A. J., Fowler, J. H., & Christakis, N. A. (2015). Social network targeting to maximise population behaviour change: a cluster randomised controlled trial. The Lancet, 386(9989), 145-153.

261 Beaman, L., BenYishay, A., Magruder, J., & Mobarak, A. M. (2021). Can network theory-based targeting increase technology adoption? American Economic Review, 111(6), 1918-43.

262 Banerjee, A., Chandrasekhar, A. G., Duflo, E., & Jackson, M. O. (2019). Using gossips to spread information: Theory and evidence from two randomized controlled trials. The Review of Economic Studies, 86(6), 2453-2490.

263 Education Endowment Foundation (2017) Evaluation of Complex Whole-School Interventions: Methodological and Practical Considerations.

264 Berry, D. A. (2006). Bayesian clinical trials. Nature Reviews Drug Discovery, 5(1), 27-36.

265 Boulton, J. G., Allen, P. M., & Bowman, C. (2015). Embracing complexity: Strategic perspectives for an age of turbulence. OUP Oxford, p.189. HM Treasury (2020) The Magenta Book: Supplementary Guide: Handling Complexity in Policy Evaluation.

266 https://www.bi.team/blogs/running-rcts-with-complex-interventions/

267 Collins, L. M., Murphy, S. A., & Strecher, V. (2007). The multiphase optimization strategy (MOST) and the sequential multiple assignment randomized trial (SMART): new methods for more potent eHealth interventions. American Journal of Preventive Medicine, 32(5), S112-S118.

268 Marinelli, H. A., Berlinski, S., & Busso, M. (2021). Remedial education: Evidence from a sequence of experiments in Colombia. Journal of Human Resources, 0320-10801R2.

269 Volpp, K. G., Terwiesch, C., Troxel, A. B., Mehta, S., & Asch, D. A. (2013, June). Making the RCT more useful for innovation with evidence-based evolutionary testing. Healthcare, 1(1-2), 4-7.

270 Kidwell, K. M., & Hyde, L. W. (2016). Adaptive interventions and SMART designs: application to child behavior research in a community setting. American Journal of Evaluation, 37(3), 344-363. Klasnja, P., Hekler, E. B., Shiffman, S., Boruvka, A., Almirall, D., Tewari, A., & Murphy, S. A. (2015). Microrandomized trials: An experimental design for developing just-in-time adaptive interventions. Health Psychology, 34(S), 1220.

271 Kasy, M., & Sautmann, A. (2021). Adaptive treatment assignment in experiments for policy choice. Econometrica, 89(1), 113-132.

272 Caria, S., Kasy, M., Quinn, S., Shami, S., & Teytelboym, A. (2020). An adaptive targeted field experiment: Job search assistance for refugees in Jordan. SSRN.

273 Mattos, D. I., Bosch, J., & Olsson, H. H. (2019). Multi-armed bandits in the wild: Pitfalls and strategies in online experiments. Information and Software Technology, 113, 68-81.

274 Hopkins, A., Breckon, J., & Lawrence, J. (2020). The experimenter’s inventory: a catalogue of experiments for decision-makers and professionals. The Alliance for Useful Evidence.

275 https://www.cecan.ac.uk/wp-content/uploads/2020/08/EPPN-No-03-Agent-Based-Modelling-for-Evaluation.pdf

276 HM Treasury (2020) The Magenta Book: Supplementary Guide: Handling Complexity in Policy Evaluation.

277 Calderoni, F., Campedelli, G. M., Szekely, A., Paolucci, M., & Andrighetto, G. (2022). Recruitment into organized crime: An agent-based approach testing the impact of different policies. Journal of Quantitative Criminology, 38(1), 197-237.

278 Jager, W. (2021). Using agent-based modelling to explore behavioural dynamics affecting our climate. Current Opinion in Psychology, 42, 133-139.

279 Schlüter, M., et al. (2017). A framework for mapping and comparing behavioural theories in models of social-ecological systems. Ecological Economics, 131, 21-35; Wijermans, N., Boonstra, W. J., Orach, K., Hentati-Sundberg, J., & Schlüter, M. (2020). Behavioural diversity in fishing—Towards a next generation of fishery models. Fish and Fisheries, 21(5), 872-890.

280 Schill, C., et al. (2019). A more dynamic understanding of human behaviour for the Anthropocene. Nature Sustainability, 2(12), 1075-1082.

281 Schlüter, M., et al. (2017).

282 Muelder, H., & Filatova, T. (2018). One theory-many formalizations: Testing different code implementations of the theory of planned behaviour in energy agent-based models. Journal of Artificial Societies and Social Simulation, 21(4).

283 Ewert, B., Loer, K., & Thomann, E. (2020). Beyond nudge: Advancing the state-of-the-art of behavioural public policy and administration. Policy & Politics.

284 Sanbonmatsu, D. M., Cooley, E. H., & Butner, J. E. (2021). The Impact of Complexity on Methods and Findings in Psychological Science. Frontiers in Psychology, 4028.

285 Shrout, P. E., & Rodgers, J. L. (2018). Psychology, science, and knowledge construction: Broadening perspectives from the replication crisis. Annual Review of Psychology, 69, 487-510.

286 Sanbonmatsu, D. M., Cooley, E. H., & Butner, J. E. (2021). The Impact of Complexity on Methods and Findings in Psychological Science. Frontiers in Psychology, 4028.

287 Stanley, T. D., Carter, E. C., & Doucouliagos, H. (2018). What meta-analyses reveal about the replicability of psychological research. Psychological Bulletin, 144(12), 1325.

288 Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. Camerer, C. F., Dreber, A., Holzmeister, F., Ho, T. H., Huber, J., Johannesson, M., ... & Wu, H. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour, 2(9), 637-644. Sanbonmatsu, D. M., Cooley, E. H., & Butner, J. E. (2021). The Impact of Complexity on Methods and Findings in Psychological Science. Frontiers in Psychology, 4028.

289 Shu, L. L., Mazar, N., Gino, F., Ariely, D., & Bazerman, M. H. (2012). Signing at the beginning makes ethics salient and decreases dishonest self-reports in comparison to signing at the end. Proceedings of the National Academy of Sciences, 109(38), 15197-15200.

290 Behavioural Insights Team (2012). Applying Behavioural Insights to Reduce Fraud, Error and Debt. Cabinet Office, London, pp. 185, 186. Kettle, S., Hernandez, M., Sanders, M., Hauser, O., & Ruda, S. (2017). Failure to CAPTCHA attention: Null results from an honesty priming experiment in Guatemala. Behavioral Sciences, 7(2), 28.

291 Kristal, A. S., Whillans, A. V., Bazerman, M. H., Gino, F., Shu, L. L., Mazar, N., & Ariely, D. (2020). Signing at the beginning versus at the end does not decrease dishonesty. Proceedings of the National Academy of Sciences, 117(13), 7103-7107. Martuza, J. B., Skard, S. R., Løvlie, L., & Thorbjørnsen, H. (2022). Do honesty-nudges really work? A large-scale field experiment in an insurance context. Journal of Consumer Behaviour.

292 Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science, 22(11), 1359-1366. John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524-532. Stanley, T. D., Carter, E. C., & Doucouliagos, H. (2018). What meta-analyses reveal about the replicability of psychological research. Psychological Bulletin, 144(12), 1325. Flake, J. K., & Fried, E. I. (2020). Measurement schmeasurement: Questionable measurement practices and how to avoid them. Advances in Methods and Practices in Psychological Science, 3(4), 456-465.

293 Nelson, L. D., Simmons, J., & Simonsohn, U. (2018). Psychology's renaissance. Annual Review of Psychology, 69, 511-534.

294 Frias-Navarro, D., Pascual-Llobell, J., Pascual-Soler, M., Perezgonzalez, J., & Berrios-Riquelme, J. (2020). Replication crisis or an opportunity to improve scientific production? European Journal of Education, 55(4), 618-631.

295 McShane, B. B., Böckenholt, U., & Hansen, K. T. (2022). Variation and Covariation in Large-scale Replication Projects: An Evaluation of Replicability. Journal of the American Statistical Association, (just accepted), 1-31.

296 Stanley, D. J., & Spence, J. R. (2014). Expectations for replications: Are yours realistic? Perspectives on Psychological Science, 9(3), 305-318.

297 Braver, S. L., Thoemmes, F. J., & Rosenthal, R. (2014). Continuously cumulating meta-analysis and replicability. Perspectives on Psychological Science, 9(3), 333-342.

298 Soman, D., & Mazar, N. (2022). The Science of Translation and Scaling. In: Mazar and Soman (eds.), Behavioral Science in the Wild, University of Toronto Press, pp. 5-19.

299 McDiarmid, A. D., Tullett, A. M., Whitt, C. M., Vazire, S., Smaldino, P. E., & Stephens, J. E. (2021). Psychologists update their beliefs about effect sizes after replication studies. Nature Human Behaviour, 5(12), 1663-1673.

300 Szaszi, B., Palinkas, A., Palfi, B., Szollosi, A., & Aczel, B. (2018). A systematic scoping review of the choice architecture movement: toward understanding when and why nudges work. Journal of Behavioral Decision Making, 31, 355-366. Hummel, D., & Maedche, A. (2019). How effective is nudging? A quantitative review on the effect sizes and limits of empirical nudging studies. Journal of Behavioral and Experimental Economics, 80, 47-58. DellaVigna, S., & Linos, E. (2022). RCTs to scale: Comprehensive evidence from two nudge units. Econometrica, 90(1), 81-116.

301 Asendorpf, J. B., et al. (2016). Recommendations for increasing replicability in psychology. Stanley, T. D., Carter, E. C., & Doucouliagos, H. (2018). What meta-analyses reveal about the replicability of psychological research. Psychological Bulletin, 144(12), 1325.

302 Stanley, T. D., Carter, E. C., & Doucouliagos, H. (2018). What meta-analyses reveal about the replicability of psychological research. Psychological Bulletin, 144(12), 1325.

303 Allcott, H. (2015). Site selection bias in program evaluation. The Quarterly Journal of Economics, 130(3), 1117-1165. Gerber, A., Huber, G., & Fang, A. (2018). Do subtle linguistic interventions priming a social identity as a voter have outsized effects on voter turnout? Evidence from a new replication experiment. Political Psychology, 39, 925-938. List, J. A. (2022). The voltage effect: How to make good ideas great and great ideas scale. Currency.

304 Bryan, C. J., Tipton, E., & Yeager, D. S. (2021). Behavioural science is unlikely to change the world without a heterogeneity revolution. Nature Human Behaviour, 5(8), 980-989.

305 Klein, R. A., et al. (2018). Many Labs 2: Investigating variation in replicability across samples and settings. Advances in Methods and Practices in Psychological Science, 1(4), 443-490.

306 McShane, B. B., Tackett, J. L., Böckenholt, U., & Gelman, A. (2019). Large-scale replication projects in contemporary psychological research. The American Statistician, 73(sup1), 99-105. Stanley, T. D., Carter, E. C., & Doucouliagos, H. (2018). What meta-analyses reveal about the replicability of psychological research. Psychological Bulletin, 144(12), 1325. Landy, J. F., et al. (2020). Crowdsourcing hypothesis tests: Making transparent how design choices shape research results. Psychological Bulletin, 146(5), 451.

307 Stanley, T. D., Carter, E. C., & Doucouliagos, H. (2018). What meta-analyses reveal about the replicability of psychological research. Psychological Bulletin, 144(12), 1325.

308 Sanbonmatsu, D. M., Cooley, E. H., & Butner, J. E. (2021). The Impact of Complexity on Methods and Findings in Psychological Science. Frontiers in Psychology, 4028. McShane, B. B., Tackett, J. L., Böckenholt, U., & Gelman, A. (2019). Large-scale replication projects in contemporary psychological research. The American Statistician, 73(sup1), 99-105.

309 Goodyear, L., Hossain, T., & Soman, D. (2022). Prescriptions for Successfully Scaling Behavioral Interventions. In: Mazar and Soman (eds.), Behavioral Science in the Wild, University of Toronto Press, pp. 28-41.

310 For an interesting practice, see: Chorpita, B. F., Becker, K. D., Daleiden, E. L., & Hamilton, J. D. (2007). Understanding the common elements of evidence-based practice: Misconceptions and clinical examples. Journal of the American Academy of Child and Adolescent Psychiatry, 46(5), 647-652.

311 Soman, D., & Mazar, N. (2022). The Science of Translation and Scaling. In: Mazar and Soman (eds.), Behavioral Science in the Wild, University of Toronto Press, pp. 5-19.

312 Fiedler, K. (2011). Voodoo correlations are everywhere—not only in neuroscience. Perspectives on Psychological Science, 6(2), 163-171.

313 Brenninkmeijer, J., Derksen, M., Rietzschel, E., Vazire, S., & Nuijten, M. (2019). Informal laboratory practices in psychology. Collabra: Psychology, 5(1).

314 McShane, B. B., Tackett, J. L., Böckenholt, U., & Gelman, A. (2019). Large-scale replication projects in contemporary psychological research. The American Statistician, 73(sup1), 99-105.

315 Van Bavel, J. J., Mende-Siedlecki, P., Brady, W. J., & Reinero, D. A. (2016). Contextual sensitivity in scientific reproducibility. Proceedings of the National Academy of Sciences, 113(23), 6454-6459.

316 Landy, J. F., & Crowdsourcing Hypothesis Tests Collaboration. (2020). Crowdsourcing hypothesis tests: Making transparent how design choices shape research results. Psychological Bulletin, 146(5), 451.

317 Lewin, K. (1936). Principles of Topological Psychology (trans. F. Heider & G. Heider). McGraw-Hill, New York. Camerer, C. F., Loewenstein, G., & Rabin, M. (eds.) (2011). Advances in Behavioral Economics. Princeton University Press, Princeton, NJ.

318 Sanbonmatsu, D. M., Cooley, E. H., & Butner, J. E. (2021). The Impact of Complexity on Methods and Findings in Psychological Science. Frontiers in Psychology, 4028.

319 Gravert, C. (2022). Reminders: Their value and hidden cost. In: Mazar and Soman (eds.), Behavioral Science in the Wild, University of Toronto Press, pp. 120-131.

320 Van Ryzin, G. G. (2021). Nudging and Muddling through. Perspectives on Public Management and Governance, 4(4), 339-345.

321 Jang, C., Saldanha, N. A., Singh, A., & Adhiambo, J. (2022). Implementing Behavioral Science Insights with Low-income Populations in the Global South. In: Mazar and Soman (eds.), Behavioral Science in the Wild, University of Toronto Press, pp. 277-283.

322 Stanley, T. D., Carter, E. C., & Doucouliagos, H. (2018). What meta-analyses reveal about the replicability of psychological research. Psychological Bulletin, 144(12), 1325.

323 Bryan, C. J., Tipton, E., & Yeager, D. S. (2021). Behavioural science is unlikely to change the world without a heterogeneity revolution. Nature Human Behaviour, 5(8), 980-989.

324 Choudhary, V., Shunko, M., Netessine, S., & Koo, S. (2021). Nudging drivers to safety: Evidence from a field experiment. Management Science, 68(6), 4196-4214.

325 Bryan, C. J., Tipton, E., & Yeager, D. S. (2021).

326 Bryan, C. J., Tipton, E., & Yeager, D. S. (2021).

327 Allcott, H. (2011). Social norms and energy conservation. Journal of Public Economics, 95(9-10), 1082-1095.

328 Allcott, H. (2015). Site selection bias in program evaluation. The Quarterly Journal of Economics, 130(3), 1117-1165.

329 Sanbonmatsu, D. M., Cooley, E. H., & Butner, J. E. (2021). The Impact of Complexity on Methods and Findings in Psychological Science. Frontiers in Psychology, 4028.

330 Prospect Theory, for example, seems to replicate quite well. Ruggeri, K., et al. (2020). Replicating patterns of prospect theory for decision under risk. Nature Human Behaviour, 4(6), 622-633.

331 Institute for Government and Cabinet Office (2010). MINDSPACE: Influencing behaviour through public policy.

332 Landy, J. F., et al. (2020). Crowdsourcing hypothesis tests: Making transparent how design choices shape research results. Psychological Bulletin, 146(5), 451. Bryan, C. J., Tipton, E., & Yeager, D. S. (2021). Behavioural science is unlikely to change the world without a heterogeneity revolution. Nature Human Behaviour, 5(8), 980-989.

333 Cartwright, N., & Hardie, J. (2012). Evidence-based policy: A practical guide to doing it better. Oxford University Press. Soman, D., & Mazar, N. (2022). The Science of Translation and Scaling. In: Mazar and Soman (eds.), Behavioral Science in the Wild, University of Toronto Press, pp. 5-19.

334 Gelman, A. (2014). The connection between varying treatment effects and the crisis of unreplicable research: A Bayesian perspective. Journal of Management, 41(2), 632-643.

335 https://osf.io/zuh93/

336 https://www.nesta.org.uk/blog/mind-gap-between-truth-and-data/

337 https://behavioralscientist.org/breaking-the-silence-can-behavioral-science-confront-structural-racism/

338 Bryan, C. J., Tipton, E., & Yeager, D. S. (2021).

339 McShane, B. B., Tackett, J. L., Böckenholt, U., & Gelman, A. (2019). Large-scale replication projects in contemporary psychological research. The American Statistician, 73(sup1), 99-105.

340 https://youthendowmentfund.org.uk/news/youth-endowment-fund-to-support-grassroots-organisations-to-take-part-in-research-to-find-out-what-works-to-keep-children-safe-from-violence/

341 This point comes from Alex Gyani, BIT's Director of Research & Methodology, APAC.

342 Landy, J. F., et al. (2020). Crowdsourcing hypothesis tests: Making transparent how design choices shape research results. Psychological Bulletin, 146(5), 451.

343 Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33(2-3), 61-83.

344 Ruggeri, K., et al. (2020). Replicating patterns of prospect theory for decision under risk. Nature Human Behaviour, 4(6), 622-633.

345 Bryan, C. J., Tipton, E., & Yeager, D. S. (2021). Behavioural science is unlikely to change the world without a heterogeneity revolution. Nature Human Behaviour, 5(8), 980-989.

346 Royer, H. (2021). Benefits of megastudies for testing behavioural interventions. Nature, 8 December 2021.

347 Van Bavel, J. J., Mende-Siedlecki, P., Brady, W. J., & Reinero, D. A. (2016). Contextual sensitivity in scientific reproducibility. Proceedings of the National Academy of Sciences, 113(23), 6454-6459.

348 Landy, J. F., et al. (2020). Crowdsourcing hypothesis tests: Making transparent how design choices shape research results. Psychological Bulletin, 146(5), 451.

349 Soman, D., & Mazar, N. (2022). The Science of Translation and Scaling. In: Mazar and Soman (eds.), Behavioral Science in the Wild, University of Toronto Press, pp. 5-19 (p. 13).

350 List, J. A. (2022). The voltage effect: How to make good ideas great and great ideas scale. Currency.

351 Damschroder, L. J., Aron, D. C., Keith, R. E., Kirsh, S. R., Alexander, J. A., & Lowery, J. C. (2009). Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implementation Science, 4(1), 1-15.

352 Goodyear, L., Hossain, T., & Soman, D. (2022). Prescriptions for Successfully Scaling Behavioral Interventions. In: Mazar and Soman (eds.), Behavioral Science in the Wild, University of Toronto Press, pp. 28-41.

353 Landy, J. F., et al. (2020). Crowdsourcing hypothesis tests: Making transparent how design choices shape research results. Psychological Bulletin, 146(5), 451.

354 Slemrod, J., Blumenthal, M., & Christian, C. (2001). Taxpayer response to an increased probability of audit: evidence from a controlled experiment in Minnesota. Journal of Public Economics, 79(3), 455-483.

355 'Betsy Levy Paluck, The Art of Psychology No. 1'. Brain Meets World by Behavioral Scientist.

356 De Martino, B., Kumaran, D., Seymour, B., & Dolan, R. J. (2006). Frames, biases, and rational decision-making in the human brain. Science, 313(5787), 684-687.

357 Damschroder, L. J., Aron, D. C., Keith, R. E., Kirsh, S. R., Alexander, J. A., & Lowery, J. C. (2009). Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implementation Science, 4(1), 1-15.

358 Mazar and Soman (eds.) (2022). Behavioral Science in the Wild. University of Toronto Press.

359 Brenninkmeijer, J., Derksen, M., Rietzschel, E., Vazire, S., & Nuijten, M. (2019). Informal laboratory practices in psychology. Collabra: Psychology, 5(1).

360 Kirsh, S. R., Lawrence, R. H., & Aron, D. C. (2008). Tailoring an intervention to the context and system redesign related to the intervention: A case study of implementing shared medical appointments for diabetes. Implementation Science, 3(1), 1-15. Goswami, I., & Urminsky, O. (2022). Why Many Behavioral Interventions Have Unpredictable Effects in the Wild: The Conflicting Consequences Problem. In: Mazar and Soman (eds.), Behavioral Science in the Wild, University of Toronto Press, pp. 65-81.

361 Hallsworth, M., Chadborn, T., Sallis, A., Sanders, M., Berry, D., Greaves, F., & Davies, S. C. (2016). Provision of social norm feedback to high prescribers of antibiotics in general practice: a pragmatic national randomised controlled trial. The Lancet, 387(10029), 1743-1752.

362 https://www1.racgp.org.au/newsgp/professional/nudge-letters-prompt-sustained-drop-in-antibiotic; Schwartz, K. L., et al. (2021). Effect of antibiotic-prescribing feedback to high-volume primary care physicians on number of antibiotic prescriptions: a randomized clinical trial. JAMA Internal Medicine, 181(9), 1165-1173. https://www.ars.toscana.it/2-articoli/4134-nudge-per-uso-prudente-antibiotici-in-toscana-intervento-su-medici-di-famiglia.html#; Bradley, D. T., Allen, S. E., Quinn, H., Bradley, B., & Dolan, M. (2019). Social norm feedback reduces primary care antibiotic prescribing in a regression discontinuity study. Journal of Antimicrobial Chemotherapy, 74(9), 2797-2802.

363 https://www.bi.team/blogs/amr-blog/

364 McShane, B. B., Tackett, J. L., Böckenholt, U., & Gelman, A. (2019). Large-scale replication projects in contemporary psychological research. The American Statistician, 73(sup1), 99-105.

365 Fried, E. I. (2020). Lack of theory building and testing impedes progress in the factor and network literature. Psychological Inquiry, 31(4), 271-288.

366 Bryan, C. J., Tipton, E., & Yeager, D. S. (2021). Behavioural science is unlikely to change the world without a heterogeneity revolution. Nature Human Behaviour, 5, 980-989.

367 McShane, B. B., Böckenholt, U., & Hansen, K. T. (2022). Variation and Covariation in Large-scale Replication Projects: An Evaluation of Replicability. Journal of the American Statistical Association, (just accepted), 1-31.

368 Oberauer, K., & Lewandowsky, S. (2019). Addressing the theory crisis in psychology. Psychonomic Bulletin & Review, 26(5), 1596-1618. Borsboom, D., van der Maas, H. L., Dalege, J., Kievit, R. A., & Haig, B. D. (2021). Theory construction methodology: A practical framework for building theories in psychology. Perspectives on Psychological Science, 16(4), 756-766.

369 Sanbonmatsu, D. M., & Johnston, W. A. (2019). Redefining science: The impact of complexity on theory development in social and behavioral research. Perspectives on Psychological Science, 14(4), 672-690.

370 Sanbonmatsu, D. M., Cooley, E. H., & Butner, J. E. (2021). The Impact of Complexity on Methods and Findings in Psychological Science. Frontiers in Psychology, 4028.

371 Oberauer, K., & Lewandowsky, S. (2019). Addressing the theory crisis in psychology. Psychonomic Bulletin & Review, 26(5), 1596-1618.

372 Fried, E. I. (2020). Theories and models: What they are, what they are for, and what they are about. Psychological Inquiry, 31(4), 336-344.

373 Maatman, F. O. (2021). Psychology's Theory Crisis, and Why Formal Modelling Cannot Solve It.

374 Oberauer, K., & Lewandowsky, S. (2019). Addressing the theory crisis in psychology. Psychonomic Bulletin & Review, 26(5), 1596-1618.

375 Hallsworth, M., & Kirkman, E. (2020). Behavioral Insights. MIT Press.

376 Pennycook, G. (2017). A perspective on the theoretical foundation of dual process models. In De Neys, W. (Ed.), Dual Process Theory 2.0 (pp. 13–35). De Neys, W. (2022). Advancing theorizing about fast-and-slow thinking. In press at Behavioral and Brain Sciences. Keren, G., & Schul, Y. (2009). Two is not always better than one: A critical evaluation of two-system theories. Perspectives on Psychological Science, 4(6), 533-550.

377 Chetty, R. (2015). Behavioral economics and public policy: A pragmatic perspective. American Economic Review, 105(5), 1–33. Kosters, M., & Van der Heijden, J. (2015). From mechanism to virtue: Evaluating Nudge theory. Evaluation, 21(3), 276-291. Callaway, F., Hardy, M., & Griffiths, T. (2022). Optimal nudging for cognitively bounded agents: A framework for modeling, predicting, and controlling the effects of choice architectures.

378 Muthukrishna, M., & Henrich, J. (2019). A problem in theory. Nature Human Behaviour, 3(3), 221-229.

379 https://www.psychologicalscience.org/observer/grand-challenges; Schlüter, M., et al. (2017). A framework for mapping and comparing behavioural theories in models of social-ecological systems. Ecological Economics, 131, 21-35.

380 Yarkoni, T., & Westfall, J. (2017). Choosing prediction over explanation in psychology: Lessons from machine learning. Perspectives on Psychological Science, 12(6), 1100-1122.

381 Muthukrishna, M., & Henrich, J. (2019). A problem in theory. Nature Human Behaviour, 3(3), 221-229.

382 Muthukrishna, M., & Henrich, J. (2019).

383 https://www.worksinprogress.co/issue/biases-the-wrong-model/

384 It may be possible to reconcile the two - i.e., that negativity bias is processed more as negative outcomes relating to others, rather than oneself - but this requires a deeper level of analysis than is usually present. Sharot, T. (2011). The optimism bias. Current Biology, 21(23), R941-R945. Rozin, P., & Royzman, E. B. (2001). Negativity bias, negativity dominance, and contagion. Personality and Social Psychology Review, 5(4), 296-320.

385 Rizzo, M. J., & Whitman, G. (2023). The unsolved Hayekian knowledge problem in behavioral economics. Behavioural Public Policy, 7(1), 199-211.

386 Schimmelpfennig, R., & Muthukrishna, M. (2023). Cultural evolutionary behavioural science in public policy. Behavioural Public Policy, 1-31. doi:10.1017/bpp.2022.40. Kwan, V. S., John, O. P., Kenny, D. A., Bond, M. H., & Robins, R. W. (2004). Reconceptualizing individual differences in self-enhancement bias: an interpersonal approach. Psychological Review, 111(1), 94. Mezulis, A. H., Abramson, L. Y., Hyde, J. S., & Hankin, B. L. (2004). Is there a universal positivity bias in attributions? A meta-analytic review of individual, developmental, and cultural differences in the self-serving attributional bias. Psychological Bulletin, 130(5), 711.

387 http://behavioralscientist.org/there-is-more-to-behavioral-science-than-biases-and-fallacies/

388 Muthukrishna, M., & Henrich, J. (2019). A problem in theory. Nature Human Behaviour, 3(3), 221-229.

389 Rand, D. G. (2016). Cooperation, fast and slow: Meta-analytic evidence for a theory of social heuristics and self-interested deliberation. Psychological Science, 27(9), 1192-1206. Gelfand, M. J. (2018). Rule makers, rule breakers: how culture wires our minds, shapes our nations and drives our differences. Robinson.

390 Manzi, J. (2012). Uncontrolled: The surprising payoff of trial-and-error for business, politics, and society. Basic Books.

391 https://www.theguardian.com/technology/2022/jan/09/are-we-witnessing-the-dawn-of-post-theory-science

392 Yarkoni, T., & Westfall, J. (2017). Choosing prediction over explanation in psychology: Lessons from machine learning. Perspectives on Psychological Science, 12(6), 1100-1122.

393 Van Ryzin, G. G. (2021). Nudging and Muddling through. Perspectives on Public Management and Governance, 4(4), 339-345.

394 Van Ryzin, G. G. (2021).

395 Schmidt, R., & Stenger, K. (2021). Behavioral brittleness: The case for strategic behavioral public policy. Behavioural Public Policy, 1-26.

396 Yeung, K. (2012). Nudge as fudge. Modern Law Review, 75, 122.

397 van Rooij, I., & Baggio, G. (2021). Theory before the test: How to build high-verisimilitude explanatory theories in psychological science. Perspectives on Psychological Science, 16(4), 682-697.

398 Muthukrishna, M., & Henrich, J. (2019). A problem in theory. Nature Human Behaviour, 3(3), 221-229.

399 Schimmelpfennig, R., & Muthukrishna, M. (2023). Cultural evolutionary behavioural science in public policy. Behavioural Public Policy, 1-31. doi:10.1017/bpp.2022.40. Smaldino, P. E. (2020). How to build a strong theoretical foundation. Psychological Inquiry, 31(4), 297-301.

400 Fried, E. I. (2020). Theories and models: What they are, what they are for, and what they are about. Psychological Inquiry, 31(4), 336-344.

401 Sanbonmatsu, D. M., & Johnston, W. A. (2019). Redefining science: The impact of complexity on theory development in social and behavioral research. Perspectives on Psychological Science, 14(4), 672-690. Smaldino, P. (2019). Better methods can't make up for mediocre theory. Nature, 575(7783), 9-10.

402 van Rooij, I., & Baggio, G. (2021). Theory before the test: How to build high-verisimilitude explanatory theories in psychological science. Perspectives on Psychological Science, 16(4), 682-697. Borsboom, D., van der Maas, H. L., Dalege, J., Kievit, R. A., & Haig, B. D. (2021). Theory construction methodology: A practical framework for building theories in psychology. Perspectives on Psychological Science, 16(4), 756-766. Smaldino, P. E. (2020). How to build a strong theoretical foundation. Psychological Inquiry, 31(4), 297-301.

403 Schimmelpfennig, R., & Muthukrishna, M. (2023).

404 This list brings together aspects of two separate but related ideas: "middle-range theories" and "practical theories" in a composite that we think is helpful. Merton, R. K., & Merton, R. C. (1968). Social theory and social structure. Simon and Schuster. Berkman, E. T., & Wilson, S. M. (2021). So useful as a good theory? The practicality crisis in (social) psychological theory. Perspectives on Psychological Science, 16(4), 864-874.

405 Abner, G. B., Kim, S. Y., & Perry, J. L. (2017). Building evidence for public human resource management: Using middle range theory to link theory and data. Review of Public Personnel Administration, 37(2), 139-159.

406 Moore, L. F., Johns, G., & Pinder, C. C. (1980). Toward middle range theory. Middle range theory and the study of organizations, 1-16.

407 Sanbonmatsu, D. M., Cooley, E. H., & Butner, J. E. (2021). The Impact of Complexity on Methods and Findings in Psychological Science. Frontiers in Psychology, 4028.

408 Berkman, E. T., & Wilson, S. M. (2021). So useful as a good theory? The practicality crisis in (social) psychological theory. Perspectives on Psychological Science, 16(4), 864-874.

409 Lieder, F., & Griffiths, T. L. (2020a). Resource-rational analysis: Understanding human cognition as the optimal use of limited computational resources. Behavioral and Brain Sciences, 43. doi:10.1017/S0140525X1900061X

410 Callaway, F., Hardy, M., & Griffiths, T. (2022). Optimal nudging for cognitively bounded agents: A framework for modeling, predicting, and controlling the effects of choice architectures. Working Paper.

411 Tversky, A., & Kahneman, D. (1974). Judgment under Uncertainty: Heuristics and Biases: Biases in judgments reveal some heuristics of thinking under uncertainty. Science, 185(4157), 1124-1131.

412 Lieder, F., Griffiths, T. L., Huys, Q. J. M., & Goodman, N. D. (2018). The anchoring bias reflects rational use of cognitive resources. Psychonomic Bulletin & Review, 25(1), 322-349.

413 Lieder, F., Griffiths, T. L., Huys, Q. J. M., & Goodman, N. D. (2018).

414 Lieder, F., & Griffiths, T. L. (2020a).

415 Lieder, F., & Griffiths, T. L. (2020a).

416 Lieder, F., & Griffiths, T. L. (2020a).

417 The following draws on Callaway, F., Hardy, M., & Griffiths, T. (2022).

418 Lieder, F., & Griffiths, T. L. (2020b). Advancing rational analysis to the algorithmic level. Behavioral and Brain Sciences, 43.

419 Atmanspacher, H., Basieva, I., Busemeyer, J. R., Khrennikov, A. Y., Pothos, E. M., Shiffrin, R. M., & Wang, Z. (2020). What are the appropriate axioms of rationality for reasoning under uncertainty with resource-constrained systems? Behavioral and Brain Sciences, 43.

426 Henriksen, K., & Kaplan, H. (2003). Hindsight bias, outcome knowledge and adaptive learning. BMJ Quality & Safety, 12(suppl 2), ii46-ii50. Bernstein, D. M., Aßfalg, A., Kumar, R., & Ackerman, R. (2016). Looking backward and forward on hindsight bias. In: J. Dunlosky & S. K. Tauber (eds.), The Oxford Handbook of Metamemory (pp. 289–304). Oxford University Press.

427 Bukszar, E., & Connolly, T. (1988). Hindsight bias and strategic choice: Some problems in learning from experience. Academy of Management Journal, 31(3), 628-641.

428 DellaVigna, S., Pope, D., & Vivalt, E. (2019). Predict science to improve science. Science, 366(6464), 428-429.

429 DellaVigna, S., Pope, D., & Vivalt, E. (2019). Predict science to improve science. Science, 366(6464), 428-429.

430 Munnich, E., & Ranney, M. A. (2019). Learning from surprise: Harnessing a metacognitive surprise signal to build and adapt belief networks. Topics in Cognitive Science, 11(1), 164-177.

431 Munnich, E., & Ranney, M. A. (2019). Learning from surprise: Harnessing a metacognitive surprise signal to build and adapt belief networks. Topics in Cognitive Science, 11(1), 164-177.

432 https://www.bi.team/blogs/how-government-can-predict-the-future/ ; https://www.bi.team/blogs/are-you-well-calibrated-results-from-a-survey-of-1154-bit-readers/

433 Deshpande, M., & Dizon-Ross, R. (2022). The (lack of) anticipatory effects of the social safety net on human capital investment. Working Paper.

434 DellaVigna, S., Pope, D., & Vivalt, E. (2019). Predict science to improve science. Science, 366(6464), 428-429.

435 DellaVigna, S., & Pope, D. (2018). Predicting experimental results: who knows what? Journal of Political Economy, 126(6), 2410-2456. DellaVigna, S., & Pope, D. (2018). What motivates effort? Evidence and expert forecasts. The Review of Economic Studies, 85(2), 1029-1069. Otis, N. G. (2021). Forecasting in the Field. Working Paper.

436 Abaluck, J., et al. (2021). Impact of community masking on COVID-19: A cluster-randomized trial in Bangladesh. Science, eabi9069. Dimant, E., Clemente, E. G., Pieper, D., Dreber, A., & Gelfand, M. (2022). Politicizing mask-wearing: predicting the success of behavioral interventions among republicans and democrats in the US. Scientific Reports, 12(1), 1-12. Bowen, D. (2022). Simple models predict behavior at least as well as behavioral scientists. Working Paper.

437 DellaVigna, S., & Pope, D. (2018). Predicting experimental results: who knows what? Journal of Political Economy, 126(6), 2410-2456. DellaVigna, S., & Pope, D. (2018). What motivates effort? Evidence and expert forecasts. Dimant, E., Clemente, E. G., Pieper, D., Dreber, A., & Gelfand, M. (2022). Politicizing mask-wearing: predicting the success of behavioral interventions among republicans and democrats in the US. Scientific Reports, 12(1), 1-12. Otis, N. G. (2021). Forecasting in the Field. Working Paper. Counterexample: Milkman, K., et al. (2022). A 680,000-person megastudy of nudges to encourage vaccination in pharmacies. Proceedings of the National Academy of Sciences, 119(6), e2115126119.

420 Fanelli, D. (2012). Negative results are disappearing from most disciplines
and countries. Scientometrics 90, 891–904. Uchino, B. N., Thoman, D., & Byerly, S. 438 DellaVigna, S., & Linos, E. (2022). RCTs to scale: Comprehensive evidence from
(2010). Inference patterns in theoretical social psychology: Looking back as we move two nudge units. Econometrica, 90(1), 81-116.
forward. Social and Personality Psychology Compass, 4(6), 417-427. Sanbonmatsu,
439 Otis, N. G. (2021). Forecasting in the Field. Working Paper.
D. M., Posavac, S. S., Behrends, A. A., Moore, S. M., & Uchino, B. N. (2015). Why a
confirmation strategy dominates psychological science. PloS One, 10(9), e0138197. 440 Pekrun, R., & Stephens, E. J. (2012). Academic emotions. In: APA Educational
Psychology Handbook, Vol 2: Individual differences and cultural and contextual
421 Roese, N. J., & Vohs, K. D. (2012). Hindsight bias. Perspectives on Psychological
factors. (pp. 3-31). American Psychological Association.
Science, 7(5), 411-426.
441 Ackerman, R., Bernstein, D. M., & Kumar, R. (2020). Metacognitive hindsight
422 Roese, N. J., & Vohs, K. D. (2012).
bias. Memory & Cognition, 48(5), 731-744.
423 Roese, N. J., & Vohs, K. D. (2012).

424 Munnich, E., & Ranney, M. A. (2019). Learning from surprise: Harnessing a
442 DellaVigna, S., Pope, D., & Vivalt, E. (2019). Predict science to improve science.
metacognitive surprise signal to build and adapt belief networks. Topics in Cognitive
Science, 366(6464), 428-429.
Science, 11(1), 164-177. Pezzo, M. (2003). Surprise, defence, or making sense: What
removes hindsight bias? Memory, 11(4-5), 421-441. Calvillo, D. P., & Gomes, D. M. 443 Nerantzaki, K., Efklides, A., & Metallidou, P. (2021). Epistemic emotions:
(2011). Surprise influences hindsight–foresight differences in temporal judgments of Cognitive underpinnings and relations with metacognitive feelings. New Ideas in
animated automobile accidents. Psychonomic Bulletin & Review, 18(2), 385-391. Psychology, 63, 100904.
425 Kahneman, D., & Klein, G. (2009). Conditions for intuitive expertise: a failure to 444 Munnich, E., & Ranney, M. A. (2019). Learning from surprise: Harnessing a
disagree. American Psychologist, 64(6), 515. Dunlosky, J., & Rawson, K. A. (2012). metacognitive surprise signal to build and adapt belief networks. Topics in Cognitive
Overconfidence produces underachievement: Inaccurate self-evaluations undermine Science, 11(1), 164-177.
students’ learning and retention. Learning and Instruction, 22(4), 271-280.
106 The Behavioural Insights Team / A Manifesto for Applying Behavioral Science

445 Speirs-Bridge, A., Fidler, F., McBride, M., Flander, L., Cumming, G., & Burgman, M. (2010). Reducing overconfidence in the interval judgments of experts. Risk Analysis, 30, 512-523. Angner, E. (2006). Economists as experts: Overconfidence in theory and practice. Journal of Economic Methodology, 13(1), 1-24.
446 Byrne, D., & Callaghan, G. (2013). Complexity theory and the social sciences: The state of the art. Routledge.
447 Liscow, Z., & Markovits, D. (2022). Democratizing Behavioral Economics. Working Paper.
448 Rizzo, M. J., & Whitman, G. (2020). Escaping paternalism: Rationality, behavioral economics, and public policy. Cambridge University Press.
449 Ruggeri, K., Panin, A., Vdovic, M., et al. (2022). The globalizability of temporal discounting. Nature Human Behaviour. https://doi.org/10.1038/s41562-022-01392-w
450 Dorison, C. A., & Heller, B. H. Observers penalize decision makers whose risk preferences are unaffected by loss-gain framing. Journal of Experimental Psychology: General. https://doi.org/10.1037/xge0001187
451 Hallsworth, M., & Kirkman, E. (2020). Behavioral Insights. MIT Press.
452 Smaldino, P. E. (2020). How to build a strong theoretical foundation. Psychological Inquiry, 31(4), 297-301. Lamont, M., Adler, L., Park, B. Y., & Xiang, X. (2017). Bridging cultural sociology and cognitive psychology in three contemporary research programmes. Nature Human Behaviour, 1(12), 866-872.
453 Schmidt, R., & Stenger, K. (2021). Behavioral brittleness: the case for strategic behavioral public policy. Behavioural Public Policy, 1-26.
454 Resource-rational analysis, which I discuss later, provides new evidence that ‘When people seem to act against their own interests, they may be doing so in light of priorities and constraints that aren’t obvious to outsiders – or even to themselves.’ Cowles, H. M., & Kreiner, J. (2020). Another claim for cognitive history. Behavioral and Brain Sciences, 43.
455 https://www.bi.team/blogs/green-means-go-how-to-help-patients-make-informed-choices-about-their-healthcare/
456 Elster, J. (2008). Reason and rationality. Princeton University Press. Pinker, S. (2021). Rationality.
457 Commentators like Steven Pinker argue that rationality is based on ‘epistemic humility’ because a commitment to rationality does not imply that you have access to objective truth, but rather that you support effective tools for getting closer to the truth. But dealing with disagreement is a crucial way that these tools work, and delegitimizing opposing stances as ‘irrational’ prevents that from happening.
458 Daniel Kahneman states that ‘I often cringe when my work with Amos [Tversky] is credited with demonstrating that human choices are irrational, when in fact our research only showed that Humans are not well described by the rational-agent model.’ Thinking, Fast and Slow, p.411.
459 https://behavioralscientist.org/epistemic-humility-coronavirus-knowing-your-limits-in-a-pandemic/ - see also Porter, T., Elnakouri, A., Meyers, E. A., Shibayama, T., Jayawickreme, E., & Grossmann, I. (2022). Predictors and consequences of intellectual humility. Nature Reviews Psychology, 1-13.
460 Mazzocchi, F. (2021). Drawing lessons from the COVID-19 pandemic: science and epistemic humility should go together. History and Philosophy of the Life Sciences, 43(3), 1-5.
461 Behavioural Insights Team (2018). Behavioural Government: Using behavioural science to improve how governments make decisions.
462 Van Bavel, R., & Dessart, F. J. (2018). The case for qualitative methods in behavioural studies for EU policy-making. Publications Office of the European Union: Luxembourg. https://www.povertyactionlab.org/blog/8-11-21/strengthening-randomized-evaluations-through-incorporating-qualitative-research-part-1
463 https://www.bi.team/blogs/citizens-jury/
464 Schmidt, R., & Stenger, K. (2021). Behavioral brittleness: the case for strategic behavioral public policy. Behavioural Public Policy, 1-26.
465 Walton, G. M., & Wilson, T. D. (2018). Wise interventions: Psychological remedies for social and personal problems. Psychological Review, 125(5), 617.
466 See the emphasis in Institute for Government and Cabinet Office (2010). MINDSPACE: Influencing behaviour through public policy.
467 Lewis Jr, N. A. (2021). What counts as good science? How the battle for methodological legitimacy affects public psychology. American Psychologist, 76.
468 https://behavioralscientist.org/breaking-the-silence-can-behavioral-science-confront-structural-racism/
469 https://behavioralscientist.org/breaking-the-silence-can-behavioral-science-confront-structural-racism/
470 Schill, C., et al. (2019). A more dynamic understanding of human behaviour for the Anthropocene. Nature Sustainability, 2(12), 1075-1082. Lewis Jr, N. A. (2021).
471 Hallsworth, M., & Kirkman, E. (2020), pp.176-179.
472 DiMaggio, P. (1997). Culture and cognition. Annual Review of Sociology, 23, 263–287. Nisbett, R. E., Peng, K., Choi, I., & Norenzayan, A. (2001). Culture and systems of thought: holistic versus analytic cognition. Psychological Review, 108(2), 291. Vaisey, S. (2009). Motivation and justification: A dual-process model of culture in action. American Journal of Sociology, 114(6), 1675-1715.
473 Lamont, M., Adler, L., Park, B. Y., & Xiang, X. (2017). Bridging cultural sociology and cognitive psychology in three contemporary research programmes. Nature Human Behaviour, 1(12), 866-872.
474 Wang, Q. (2021). The cultural foundation of human memory. Annual Review of Psychology, 72, 151-179.
475 Hogg, M. A., & Reid, S. A. (2006). Social identity, self-categorization, and the communication of group norms. Communication Theory, 16(1), 7-30.
476 Leggett, W. (2014). The politics of behaviour change: Nudge, neoliberalism and the state. Policy & Politics, 42(1), 3-19. DiMaggio, P., & Markus, H. R. (2010). Culture and social psychology: Converging perspectives. Social Psychology Quarterly, 73(4), 347-352.
477 Swidler, A. (1986). Culture in action: Symbols and strategies. American Sociological Review, 273-286.
478 Mullainathan, S., & Shafir, E. (2013). Scarcity: Why having too little means so much. Macmillan.
479 World Bank (2014). World Development Report 2015: Mind, Society, and Behavior. The World Bank.
480 We take this example and the argument below from Lamont, M., Adler, L., Park, B. Y., & Xiang, X. (2017). Bridging cultural sociology and cognitive psychology in three contemporary research programmes. Nature Human Behaviour, 1(12), 866-872.
481 Lamont, M., Adler, L., Park, B. Y., & Xiang, X. (2017).
482 Lamont, M., Adler, L., Park, B. Y., & Xiang, X. (2017).
483 Richardson, L., & John, P. (2021). Co-designing behavioural public policy: lessons from the field about how to ‘nudge plus’. Evidence & Policy, 17(3), 405-422.
484 Grüne-Yanoff, T. (2012). Old wine in new casks: libertarian paternalism still violates liberal principles. Social Choice and Welfare, 38(4), 635-645. Rizzo, M. J., & Whitman, G. (2020). Escaping paternalism: Rationality, behavioral economics, and public policy. Cambridge University Press.
485 See Chapter 5 of Hallsworth, M., & Kirkman, E. (2020).
486 John, P., et al. (2013). Nudge, nudge, think, think: Experimenting with ways to change civic behaviour. A&C Black.
487 Banerjee, S., & John, P. (2020). Nudge plus: incorporating reflection into behavioral public policy. Behavioural Public Policy, 1-16.
488 Reijula, S., & Hertwig, R. (2022). Self-nudging and the citizen choice architect. Behavioural Public Policy, 6(1), 119-149.
489 Hertwig, R., & Grüne-Yanoff, T. (2017). Nudging and boosting: Steering or empowering good decisions. Perspectives on Psychological Science, 12(6), 973-986.
490 https://www.thersa.org/reports/steer-the-report
491 In the figure, nudges are partly present in the ‘reflection’ row. This is because ‘many nudges do require citizens to think and indeed reflect’, and therefore overlap with nudge pluses. Banerjee, S., & John, P. (2020). Nudge plus: incorporating reflection into behavioral public policy. Behavioural Public Policy, 1-16.
492 Einfeld, C., & Blomkamp, E. (2021). Nudge and co-design: complementary or contradictory approaches to policy innovation? Policy Studies, 1-19.

493 Richardson, L., & John, P. (2021). Co-designing behavioural public policy: lessons from the field about how to ‘nudge plus’. Evidence & Policy, 17(3), 405-422.
494 Einfeld, C., & Blomkamp, E. (2021).
495 Hills, J. (2007). Pensions, public opinion and policy. In: Hills, J., Le Grand, J., & Piachaud, D. (eds.), Making Social Policy Work: CASE Studies on Poverty, Place and Policy. The Policy Press, Bristol, UK, pp. 221-243. ISBN 9781861349576.
496 This is similar to Cass Sunstein’s arguments about ‘choosing not to choose’. Sunstein, C. R. (2015). Choosing not to choose: Understanding the value of choice. Oxford University Press, USA.
497 Reijula, S., & Hertwig, R. (2022). Self-nudging and the citizen choice architect. Behavioural Public Policy, 6(1), 119-149.
498 Moreover, policy makers will still have decided to invest scarce resources in a boost intervention, which may mean other options are not available. Hertwig, R., & Grüne-Yanoff, T. (2017). Nudging and boosting: Steering or empowering good decisions. Perspectives on Psychological Science, 12(6), 973-986. Grüne-Yanoff, T., Marchionni, C., & Feufel, M. A. (2018). Toward a framework for selecting behavioural policies: How to choose between boosts and nudges. Economics & Philosophy, 34(2), 243-266.
499 Part of the vision of boosting is that people may develop the ability to transfer heuristics from the original site of learning to new areas.
500 Sims and Müller (2019) argue that claims such as ‘nudges are illiberal, but boosts are not’ are not valid. Sims, A., & Müller, T. M. (2019). Nudge versus boost: A distinction without a normative difference. Economics & Philosophy, 35(2), 195-222.
501 Miller, G. A. (1969). Psychology as a means of promoting human welfare. American Psychologist, 24(12), 1063.
502 Thaler, R., & Sunstein, C. (2008), p.252.
503 Banerjee, S., & John, P. (2020). Nudge plus: incorporating reflection into behavioral public policy. Behavioural Public Policy, 1-16.
504 Hallsworth, M., & Kirkman, E. (2020).
505 Einfeld, C., & Blomkamp, E. (2021). Nudge and co-design: complementary or contradictory approaches to policy innovation? Policy Studies, 1-19.
506 Bason, C. (2018). Leading public sector innovation: Co-creating for a better society. Policy Press, pp. 64–65.
507 Strassheim, H. (2020). De-biasing democracy. Behavioural public policy and the post-democratic turn. Democratization, 27(3), 461-476.
508 Rouyard, T., Engelen, B., Papanikitas, A., & Nakamura, R. (2022). Boosting healthier choices. BMJ, 376.
509 Ralph Hertwig offers a different set of criteria in Hertwig, R. (2017). When to consider boosting: some rules for policy-makers. Behavioural Public Policy, 1(2), 143-161. See also: Grüne-Yanoff, T., Marchionni, C., & Feufel, M. A. (2018). Toward a framework for selecting behavioural policies: How to choose between boosts and nudges. Economics & Philosophy, 34(2), 243-266.
510 Hertwig, R. (2017). When to consider boosting: some rules for policy-makers. Behavioural Public Policy, 1(2), 143-161.
511 Kristal and Santos usefully distinguish between ‘encapsulated’ and ‘attentional’ biases; the former are much more difficult to de-bias. Kristal, A. S., & Santos, L. R. (2021). GI Joe Phenomena: Understanding the Limits of Metacognitive Awareness on Debiasing.
512 Hertwig, R. (2017).
513 Weimer, D. L. (2020). When are nudges desirable? Benefit validity when preferences are not consistently revealed. Public Administration Review, 80(1), 118-126. Hertwig, R. (2017).
514 Note that this criterion deals with someone’s view on the choices available; the preceding one deals with whether they want to engage with the choice at all.
515 Hertwig, R. (2017).
516 Hertwig, R. (2017).
517 Hertwig, R. (2017).
518 Mrkva, K., Posner, N. A., Reeck, C., & Johnson, E. J. (2021). Do nudges reduce disparities? Choice architecture compensates for low consumer knowledge. Journal of Marketing, 0022242921993186.
519 Banerjee, S., & John, P. (2020). Nudge plus: incorporating reflection into behavioral public policy. Behavioural Public Policy, 1-16. Hertwig, R., & Grüne-Yanoff, T. (2017). Nudging and boosting: Steering or empowering good decisions. Perspectives on Psychological Science, 12(6), 973-986.
520 Bradt, J. (2019). Comparing the effects of behaviorally informed interventions on flood insurance demand: an experimental analysis of ‘boosts’ and ‘nudges’. Behavioural Public Policy, 1-31.
521 Folke, T., Bertoldo, G., D’Souza, D., Alì, S., Stablum, F., & Ruggeri, K. (2021). Boosting promotes advantageous risk-taking. Humanities and Social Sciences Communications, 8(1), 1-10.
522 Mühlböck, M., Kalleitner, F., Steiber, N., & Kittel, B. (2022). Information, reflection, and successful job search: A labor market policy experiment. Social Policy & Administration, 56(1), 48-72.
523 Matz, S. C., Kosinski, M., Nave, G., & Stillwell, D. J. (2017). Psychological targeting as an effective approach to digital mass persuasion. Proceedings of the National Academy of Sciences, 114(48), 12714-12719.
524 ‘Big-data studies of human behaviour need a common language.’ Nature, July 8, 2021. Yarkoni, T., & Westfall, J. (2017). Choosing prediction over explanation in psychology: Lessons from machine learning. Perspectives on Psychological Science, 12(6), 1100-1122.
525 Athey, S. (2018). The impact of machine learning on economics. In: The economics of artificial intelligence: An agenda (pp. 507-547). University of Chicago Press.
526 Smith, W. R. (1956). Product differentiation and market segmentation as alternative marketing strategies. Journal of Marketing, 21(1), 3-8.
527 Bryan, C. J., Tipton, E., & Yeager, D. S. (2021). Behavioural science is unlikely to change the world without a heterogeneity revolution. Nature Human Behaviour, 5(8), 980-989.
528 Wager, S., & Athey, S. (2018). Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523), 1228-1242. Künzel, S. R., Sekhon, J. S., Bickel, P. J., & Yu, B. (2019). Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 116(10), 4156-4165.
529 Todd-Blick, A., Spurlock, C. A., Jin, L., Cappers, P., Borgeson, S., Fredman, D., & Zuboy, J. (2020). Winners are not keepers: Characterizing household engagement, gains, and energy patterns in demand response using machine learning in the United States. Energy Research & Social Science, 70, 101595.
530 Ribers, M. A., & Ullrich, H. (2022). Machine learning and physician prescribing: a path to reduced antibiotic use. Working Paper.
531 Mills, S. (2022). Personalized nudging. Behavioural Public Policy, 6(1), 150-159.
532 Möhlmann, M. (2021). Algorithmic Nudges Don’t Have to Be Unethical. Harvard Business Review. Mills, S. (2022). Personalized nudging. Behavioural Public Policy, 6(1), 150-159.
533 Morozovaite, V. (2021). Two sides of the digital advertising coin: putting hypernudging into perspective. Mkt. & Competition L. Rev., 5, 105. Teeny, J. D., Siev, J. J., Briñol, P., & Petty, R. E. (2021). A review and conceptual framework for understanding personalized matching effects in persuasion. Journal of Consumer Psychology, 31(2), 382-414.
534 https://www.research-live.com/article/opinion/new-frontiers-the-growth-of-bespoke-nudging/id/5066852
535 Soman, D., & Mazar, N. (eds). Behavioral Science in the Wild. University of Toronto Press, pp. 284-291; https://www.aitimejournal.com/interview-with-ganna-pogrebna-lead-for-behavioral-data-science-alan-turing-institute
536 Teeny, J. D., Siev, J. J., Briñol, P., & Petty, R. E. (2021). A review and conceptual framework for understanding personalized matching effects in persuasion. Journal of Consumer Psychology, 31(2), 382-414.
537 Matz, S. C., Kosinski, M., Nave, G., & Stillwell, D. J. (2017). Psychological targeting as an effective approach to digital mass persuasion. Proceedings of the National Academy of Sciences, 114(48), 12714-12719.

538 Hullman, J., Kapoor, S., Nanayakkara, P., Gelman, A., & Narayanan, A. (2022). The worst of both worlds: A comparative analysis of errors in learning from data in psychology and machine learning. arXiv preprint arXiv:2203.06498. McDermott, M. B., Wang, S., Marinsek, N., Ranganath, R., Foschini, L., & Ghassemi, M. (2021). Reproducibility in machine learning for health research: Still a ways to go. Science Translational Medicine, 13(586), eabb1655.
539 https://www.nature.com/articles/d41586-018-05707-8
540 Salganik, M. J., et al. (2020). Measuring the predictability of life outcomes with a scientific mass collaboration. Proceedings of the National Academy of Sciences, 117(15), 8398-8403.
541 Peer, E., Egelman, S., Harbach, M., Malkin, N., Mathur, A., & Frik, A. (2020). Nudge me right: Personalizing online security nudges to people’s decision-making styles. Computers in Human Behavior, 109, 106347.
542 https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/; https://thenewstack.io/clean-data-is-the-foundation-of-effective-machine-learning/
543 Mills, S. (2022). Personalized nudging. Behavioural Public Policy, 6(1), 150-159.
544 https://hdsr.mitpress.mit.edu/pub/4cg8dhgr/release/3
545 Ruggeri, K., Benzerga, A., Verra, S., & Folke, T. (2020). A behavioral approach to personalizing public health. Behavioural Public Policy, 1-13.
546 Horkheimer, M., & Adorno, T. (1972). Dialectic of Enlightenment (orig. pub. 1947).
547 Mills, S. (2022). Personalized nudging. Behavioural Public Policy, 6(1), 150-159.
548 Möhlmann, M. (2021). Algorithmic Nudges Don’t Have to Be Unethical. Harvard Business Review.
549 Summers, C. A., Smith, R. W., & Reczek, R. W. (2016). An audience of one: Behaviorally targeted ads as implied social labels. Journal of Consumer Research, 43(1), 156-178.
550 Spencer, S. B. (2020). The problem of online manipulation. U. Ill. L. Rev., 959.
551 Susser, D., Roessler, B., & Nissenbaum, H. (2019). Online manipulation: Hidden influences in a digital world. Geo. L. Tech. Rev., 4, 1.
552 Morozovaite, V. (2021). Two sides of the digital advertising coin: putting hypernudging into perspective. Mkt. & Competition L. Rev., 5, 105.
553 Abbasi, M., Friedler, S. A., Scheidegger, C., & Venkatasubramanian, S. (2019). Fairness in representation: quantifying stereotyping as a representational harm. In: Proceedings of the 2019 SIAM International Conference on Data Mining. SIAM, 801–809.
554 Obermeyer, Z., Powers, B., Vogeli, C., & Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464), 447-453. Bigman, Y., Gray, K., Waytz, A., Arnestad, M., & Wilson, D. (2020). Algorithmic discrimination causes less moral outrage than human discrimination. In press at Journal of Experimental Psychology: General. Athey, S. (2018). The impact of machine learning on economics. In: The economics of artificial intelligence: An agenda (pp. 507-547). University of Chicago Press.
555 National Audit Office (2022). Report on Accounts: Department for Work and Pensions.
556 Eubanks, V. (2018). Automating inequality: How high-tech tools profile, police, and punish the poor. St. Martin’s Press.
557 Kozyreva, A., Lorenz-Spreen, P., Hertwig, R., Lewandowsky, S., & Herzog, S. M. (2021). Public attitudes towards algorithmic personalization and use of personal data online: Evidence from Germany, Great Britain, and the United States. Humanities and Social Sciences Communications, 8(1), 1-11.
558 Teeny, J. D., Siev, J. J., Briñol, P., & Petty, R. E. (2021). A review and conceptual framework for understanding personalized matching effects in persuasion. Journal of Consumer Psychology, 31(2), 382-414.
559 De Jonge, P., Verlegh, P., & Zeelenberg, M. (2022). If You Want People to Accept Your Intervention, Don’t Be Creepy. In: Soman, D., & Mazar, N. (eds), Behavioral Science in the Wild. University of Toronto Press, pp. 284-291.
560 van Doorn & Hoekstra, 2013; Kim et al., 2019a; White, T. B., Zahay, D. L., Thorbjørnsen, H., & Shavitt, S. (2008). Getting too personal: Reactance to highly personalized email solicitations. Marketing Letters, 19(1), 39-50.
561 Briñol, P., Rucker, D. D., & Petty, R. E. (2015). Naïve theories about persuasion: Implications for information processing and consumer attitude change. International Journal of Advertising, 34(1), 85-106. Reinhart, A. M., Marshall, H. M., Feeley, T. H., & Tutzauer, F. (2007). The persuasive effects of message framing in organ donation: The mediating role of psychological reactance. Communication Monographs, 74(2), 229-255.
562 Clark, J. K., Wegener, D. T., & Fabrigar, L. R. (2008). Attitude accessibility and message processing: The moderating role of message position. Journal of Experimental Social Psychology, 44(2), 354-361.
563 Derricks, V., & Earl, A. (2019). Information targeting increases the weight of stigma: Leveraging relevance backfires when people feel judged. Journal of Experimental Social Psychology, 82, 277-293. White, K., & Argo, J. J. (2009). Social identity threat and consumer preferences. Journal of Consumer Psychology, 19(3), 313-325.
564 Derricks, V., & Earl, A. (2019). Information targeting increases the weight of stigma: Leveraging relevance backfires when people feel judged. Journal of Experimental Social Psychology, 82, 277-293.
565 Matz, S. C., Kosinski, M., Nave, G., & Stillwell, D. J. (2017). Psychological targeting as an effective approach to digital mass persuasion. Proceedings of the National Academy of Sciences, 114(48), 12714-12719.
566 Pierson, E., Cutler, D. M., Leskovec, J., Mullainathan, S., & Obermeyer, Z. (2021). An algorithmic approach to reducing unexplained pain disparities in underserved populations. Nature Medicine, 27(1), 136-140.
567 Moseley, A., & Thomann, E. (2021). A behavioural model of heuristics and biases in frontline policy implementation. Policy & Politics, 49(1), 49-67.
568 Filmer, D., Nahata, V., & Sabarwal, S. (2021). Preparation, Practice, and Beliefs. World Bank Policy Research Paper 9847.
569 Neary, C., Naheed, S., McLernon, D. J., & Black, M. (2021). Predicting risk of postpartum haemorrhage: a systematic review. BJOG: An International Journal of Obstetrics & Gynaecology, 128(1), 46-53. Khan, M. S., Bates, D., & Kovacheva, V. P. (2021). The Quest for Equitable Health Care: The Potential for Artificial Intelligence. NEJM Catalyst Innovations in Care Delivery, 2(6).
570 Venkatesh, K. K., et al. (2020). Machine learning and statistical models to predict postpartum hemorrhage. Obstetrics and Gynecology, 135(4), 935.
571 https://towardsdatascience.com/the-ultimate-guide-to-starting-ai-d506255d7ea
572 Rahwan, I., et al. (2019). Machine Behaviour. Nature, 568(7753), 477-486.
573 https://medium.com/behavior-design-hub/what-is-behavioral-data-science-and-how-to-get-into-it-e389ed20751f; https://www.researchgate.net/publication/349963864_Anthropomorphic_Learning
574 Ruggeri, K., Benzerga, A., Verra, S., & Folke, T. (2020). A behavioral approach to personalizing public health. Behavioural Public Policy, 1-13.
575 O’Shea, L. (2019). Future Histories: What Ada Lovelace, Tom Paine, and the Paris Commune Can Teach Us About Digital Technology. Verso Books.
576 https://thedecisionlab.com/insights/technology/this-is-personal-the-dos-and-donts-of-personalization-in-tech
577 Kahneman, D., Knetsch, J. L., & Thaler, R. (1986). Fairness as a constraint on profit seeking: Entitlements in the market. The American Economic Review, 728-741.
578 Camerer, C. F. (2018). Artificial intelligence and behavioral economics. In: The economics of artificial intelligence: An agenda (pp. 587-608). University of Chicago Press.
579 Lorenz-Spreen, P., Geers, M., Pachur, T., Hertwig, R., Lewandowsky, S., & Herzog, S. M. (2021). Boosting people’s ability to detect microtargeted advertising. Scientific Reports, 11(1), 1-9.
580 See the discussion of “the gaze” in cultural theory. Sartre, J. (1943). Being and Nothingness. Foucault, M. (1975). Discipline and Punish: The Birth of the Prison. Mulvey, L. (1989). Visual pleasure and narrative cinema. In: Visual and other pleasures (pp. 14-26). Palgrave Macmillan, London. Nielsen, C. R. (2011). Resistance through re-narration: Fanon on de-constructing racialized subjectivities. African Identities, 9(4), 363-385.
581 Lewis Jr, N. A. (2021). What counts as good science? How the battle for methodological legitimacy affects public psychology. American Psychologist, 76(8), 1323.
582 Milner IV, H. R. (2007). Race, culture, and researcher positionality: Working through dangers seen, unseen, and unforeseen. Educational Researcher, 36(7), 388-400.

583 Nagel, T. (1986). The View from Nowhere. Robert Sugden uses this term to criticize behavioral economists because, he argues, they implicitly take this stance to make recommendations for society. But this is slightly different from my point, which is not aimed at rejecting paternalism as such, but instead is concerned with the situated nature of any activity, including just observation. Sugden, R. (2013). The behavioural economist and the social planner: to whom should behavioural welfare economics be addressed? Inquiry, 56(5), 519-538.
584 Liscow, Z., & Markovits, D. (2022). Democratizing Behavioral Economics. Yale Journal on Regulation, 39, 1217.
585 Bergman, P., Lasky-Fink, J., & Rogers, T. (2020). Simplification and defaults affect adoption and impact of technology, but decision makers do not realize it. Organizational Behavior and Human Decision Processes, 158, 66-79. Pereira, M. M. (2021). Understanding and reducing biases in elite beliefs about the electorate. American Political Science Review, 115(4), 1308-1324.
586 Furnas, A. C. (2022). The People Think What I Think: False Consensus and Elite Misperception of Public Opinion.
587 This is not always the case - see Broockman, D. E., & Skovron, C. (2018). Bias in perceptions of public opinion among political elites. American Political Science Review, 112(3), 542-563.
588 Behavioural Insights Team (2018). Behavioural Government, p.35.
589 https://chicagobeyond.org/researchequity/
590 Lewis Jr, N. A. (2021). What counts as good science? How the battle for methodological legitimacy affects public psychology. American Psychologist, 76(8), 1323.
591 https://gocommonthread.com/work/global-gavi-bi/
592 Blasi, D. E., Henrich, J., Adamou, E., Kemmerer, D., & Majid, A. (2022). Over-reliance on English hinders cognitive science. Trends in Cognitive Sciences. https://doi.org/10.1016/j.tics.2022.09.015
593 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5293198/; https://www.apa.org/workforce/data-tools/demographics; Callahan, J. L., Smotherman, J. M., Dziurzynski, K. E., Love, P. K., Kilmer, E. D., Niemann, Y. F., & Ruggero, C. J. (2018). Diversity in the professional psychology training-to-workforce pipeline: Results from doctoral psychology student population data. Training and Education in Professional Psychology, 12(4), 273.
594 Lepenies, R., & Małecka, M. (2019). Behaviour change: extralegal, apolitical, scientistic? In: Handbook of behavioural change and public policy (pp. 344-360). Edward Elgar Publishing.
595 Lepenies, R., & Małecka, M. (2019). https://www.psychologicalscience.org/observer/grand-challenges
596 Lewis Jr, N. A. (2021). What counts as good science? How the battle for methodological legitimacy affects public psychology. American Psychologist, 76(8), 1323.
597 Saini, A. (2020). Want to do better science? Admit you’re not objective. Nature, 579(7798), 175-176. Dupree, C. H., & Kraus, M. W. (2022). Psychological science is not race neutral. Perspectives on Psychological Science, 17(1), 270-275.
605 https://chicagobeyond.org/researchequity/
606 Eriksen, T. H. (2001). Small places, large issues: An introduction to social and cultural anthropology.
607 https://chicagobeyond.org/researchequity/
608 Pereira, M. M. (2021). Understanding and reducing biases in elite beliefs about the electorate. American Political Science Review, 115(4), 1308-1324.
609 https://www.ideas42.org/blog/street-smarts-in-the-field-insights-for-designing-public-housing-infrastructure/
610 Sutherland, R. (2019). Alchemy: The surprising power of ideas that don’t make sense. Random House.
611 Sulik, J., Bahrami, B., & Deroy, O. (2021). The Diversity Gap: when diversity matters for knowledge. Perspectives on Psychological Science, 17456916211006070.
612 https://www.poverty-action.org/blog/locally-grounded-research-strengthening-partnerships-advance-science-and-impact-development
613 https://phdproject.org/
614 Erosheva, E. A., Grant, S., Chen, M. C., Lindner, M. D., Nakamura, R. K., & Lee, C. J. (2020). NIH peer review: Criterion scores completely account for racial disparities in overall impact scores. Science Advances, 6(23), eaaz4868.
615 Ledgerwood, A., et al. (2022). The pandemic as a portal: Reimagining psychological science as truly open and inclusive. Perspectives on Psychological Science, 17456916211036654.

598 Cheon, B. K., Melani, I., & Hong, Y. Y. (2020). How USA-centric is psychology?
An archival study of implicit assumptions of generalizability of findings to human nature
based on origins of study samples. Social Psychological and Personality Science,
11(7), 928-937.

599 Adetula, A., et. al. (2022) Psychology should generalize from — not just to —
Africa. Nat Rev Psychol (2022). https://doi.org/10.1038/s44159-022-00070-y

600 Adorno, T. & Horkenheimer, M. (1972) Dialectic of Enlightenment. Harding, S.


(1998). Is science multicultural: Postcolonialisms, feminisms, and epistemologies.

601 Lepenies, R., & Małecka, M. (2019).

602 https://www.urban.org/urban-wire/equitable-research-requires-questioning-
status-quo

603 Sendhil Mullainathan, keynote address to the Society of Judgment and Decision
Making Annual Conference, 2022.

604 Amin, A. B., Bednarczyk, R. A., Ray, C. E., Melchiori, K. J., Graham, J.,
Huntsinger, J. R., & Omer, S. B. (2017). Association of moral values with vaccine
hesitancy. Nature Human Behaviour, 1(12), 873-880.
The Behavioural Insights Team uses
behavioral science to develop better
systems, policies, products and services.
Our goal is to help people, communities
and organizations thrive.

Michael Hallsworth