4. Esther Duflo (2017)

American Economic Review: Papers & Proceedings 2017, 107(5): 1–26
https://doi.org/10.1257/aer.p20171153
RICHARD T. ELY LECTURE
The Economist as Plumber †
By Esther Duflo*
Economists are increasingly getting the real world, not only do they need to give general
opportunity to help governments around the prescriptions, they must engage with the details.
world design new policies and regulations. This Second, details that we as economists might
gives them a responsibility to get the big pic- consider relatively uninteresting are in fact
ture, or the broad design, right. But in addition, extraordinarily important in determining the
as these designs actually get implemented in the final impact of a policy or a regulation, while
world, this gives them the responsibility to focus some of the theoretical issues we worry about
on many details about which their models and most may not be that relevant. This sentiment is
theories do not give much guidance. well summarized by Klemperer (2002, p. 170),
There are two reasons for this need to attend who presents his views on what matters for prac-
to details. First, it turns out that policymakers tical auction design, based on his own experi-
rarely have the time or inclination to focus on ence designing them and advising bidders: “in
them, and will tend to decide on how to address short,” he writes, “good auction design is mostly
them based on hunches, without much regard good elementary economics, [whereas] most of
for evidence. Figuring all of this out is therefore the extensive auction literature is of second-or-
not something that economists can just leave der importance for practical auction design.”
to policymakers after delivering their report: if It seems appropriate to open an essay on
they are taking on the challenge to influence the plumbing with an actual plumbing example that
illustrates the two points I made above (Devoto
et al. 2012). Many cities in the developing world
* Department of Economics, MIT, 77 Massachusetts seek to improve citizens’ access to home water
Avenue, Building E52-544, Cambridge, MA 02139 (e-mail: connections. Even when there are public taps,
eduflo@mit.edu). This essay is based on the Ely lecture, urban households without a connection at home
which I delivered at the AEA meeting in January 2017. spend several hours per week collecting water,
Many thanks to Alvin Roth, both for his invitation to give
this lecture, and for his license to get our hands dirty with
and this burden causes them considerable stress
policy work. I thank Abhijit Banerjee and Cass Sunstein, and tension. The typical policy for improving
master plumbers, for their encouragements, for several con- water access is to build the necessary infrastruc-
versations that shaped this lecture, and for detailed com- ture, and then to encourage “end of pipe” connec-
ments on a first draft. David Atkin, Robert Gibbons, Parag tions through subsidized tariffs and/or subsidized
Pathak, and Richard Thaler, provided useful comments.
Vestal McIntyre gave wonderful and detailed comments loans. In 2007 in Tangier, a firm called Amendis
on style; all awkwardness remains mine. Laura Stilwell (the local subsidiary of Veolia Environnement),
provided excellent research assistance. Plumbing is not a which was in charge of the water and sanitation
solitary activity, and I would like to acknowledge the many for the city, had spent considerable resources
people who have plumbed and discussed with me over the
years, notably: Abhijit Banerjee, Rukmini Banerji, James
building large pipes and installing toilets in each
Berry, Raghabendra Chattopadhyay, Iqbal Dhaliwal, Rachel house, and, in collaboration with the city, was
Glennerster, Michael Greenstone, Rema Hanna, Clement offering interest-free loans to poor households to
Imbert, Michael Kremer, Shobhini Mukherjee, Karthik make it possible to cover the marginal cost of new
Muralidharan, Nicholas Ryan, Rohini Pande, Benjamin water connections. But take-up of the subsidized
loan program was very low (less than 10 per-
Olken, Anna Schrimpf, and Michael Walton.
†
Go to https://doi.org/10.1257/aer.p20171153 to visit the
article page for additional materials and author disclosure cent); applying for the program required a trip to
statement. the municipal office with supporting documents,
1
2 AEA PAPERS AND PROCEEDINGS MAY 2017
and this proved a real barrier. When the research its own or does not run at all, we avoid
team randomly visited households and offered having to go looking for where the wheels
procedural assistance by photocopying the are getting caught and figuring out what
required documents at home and delivering them small adjustments it would take to get the
to the municipal office, take-up of the loan and machine to run properly. To say that we
need to move to a voucher system does
the water connections increased to 69 percent. not oblige us to figure out how to make it
As a result of this small additional expenditure, work—how to make sure that parents do
poor residents of Tangier gained access to water, not trade in the vouchers for cash (because
and therefore recovered a considerable amount they do not attach enough value to their
of time to do other things. They were ultimately children’s education) and that schools do
much happier and less stressed, despite a large not take parents for a ride (because par-
increase in their water bill. ents may not know what a good education
Improving access to private water connec- looks like). And how to get the private
tions was a sensible policy idea, and the entire schools to be more effective? After all,
effort was broadly well designed. But the lack of at least in India, even children who go to
private schools are nowhere near grade
attention to the very last step (the administrative level. And many other messy details that
steps to sign up) had been preventing this large every real program has to contend with.
investment in both physical and financial infra-
structure from paying off. When we are concerned with such details,
These kinds of practical design questions are ex ante, we will have some priors on what fea-
ubiquitous in policy design. When new front- tures will be important, and this guides our first-
line workers are hired to implement a program, pass attempts at design. But it is not clear that
will emphasizing wage and career prospects dis- either the policymakers or the scientists will
courage publicly minded individuals or encour- correctly identify the most important choices.
age the most talented to join? When thinking Those may not have been the focus of either
about an immunization policy, should policy- practical or theoretical attention, and thus may
makers assume that parents understand the full have been completely ignored. So an economist
costs and benefits and immunization and ratio- who cares about the details of policy implemen-
nally internalize them, or assume they may be ill tation will need to pay attention to many details
informed and/or present biased? When design- and complications, some of which may appear
ing the health exchanges for the Affordable Care to be far below their pay grade (e.g., the font size
Act, should the health-plan options be labeled on posters) or far beyond their competence level
with precious metal (platinum, gold, or bronze) (e.g., the intricacy of government budgeting in
or will that inadvertently bias the participant a federal system). It will sometimes appear that
choice toward one type of plan versus the other? the extensive training they received is underused
At the outset, it will be difficult to know the if, as Klemperer notes, the theoretical complex-
answer to these questions, especially when the ities turn out to be second order. On the other
policy problem is fairly new. hand, they will have a chance to apply their
Paying attention to the details of policy economist’s mind, since many of the details
requires a mindset that is slightly different from have implications for issues that are an econo-
that which graduate school instills in economist’s bread and butter: incentives, information,
mists. Banerjee (2007, p. 161–162) summarizes imperfect rationality, etc. They will also need to
the reluctance of economists to engage with be very observant, and keep a close eye on the
those details very well in his essay “Inside the impact of any change they recommend.
Machine.” Economists, he writes, tend to think in Inspired by this exhortation to go inside the
“machine mode”: they want to find out the button machine, Alvin Roth’s image of the economist
that will get the machine started, the root cause as engineer, and Banerjee’s (2002) image of the
of what makes the world go round. He writes: economist as an experienced craftsman,1 I label
The reason we like these buttons so much,

it seems to me, is that they save us the 1
Banerjee in fact mentioned plumbers, in defending the
trouble of stepping into the machine. By reputation that economists will provide good advice because,
assuming that the machine either runs on like plumbers, they care about their reputation.
VOL. 107 NO. 5 richard t. ely lecture 3
this detail-focused approach as the “plumbing” The engineer takes these general principles
mindset. The economist-plumber stands on the into account, but applies them to a specific sit-
shoulder of scientists and engineers, but does not uation. This requires careful attention to the
have the safety net of a bounded set of assump- details of the environment being studied, but
tions. She is more concerned about “how” to do also new tools: the economist-engineer cannot
things than about “what” to do. In the pursuit shrug off the fact that a particular situation is not
of good implementation of public policy, she is covered by the assumptions of the theorem, and
willing to tinker. Field experimentation is her cannot ask agents to change their preferences so
tool of choice. the assumptions hold. If the specific real-world
What I will try to argue in this lecture is problem at hand cannot be solved analytically,
not that all of us should be plumbers, or even then she will reach for other tools—in particular
that any of us should be plumbers all the time; computation and laboratory experiments—and
instead, I will try to show that there is value for will simulate the behavior of a market.
economists to take on some plumbing projects, For example, in the matching for doctors
in the interest of both society and our discipline. who are applying for their first residency to a
hospital, which is the focus of Roth (2002), the
I. Scientists, Engineers, Plumbers simple matching theory does not accommodate
the fact that some of these new doctors come as
A. Definitions married couples, and need to be assigned to the
same town. Roth refined the theory to accommo-
Alvin Roth’s seminal 1999 Fisher-Schultz lec- date married couples, but at that point, he could
ture (Roth 2002) invited economists to adopt an not solve the problem analytically: in particular,
“engineering” approach to their craft. Economists, theory suggests that, with couples, there could
he pointed out, are increasingly called upon, not be a situation without stable matching, and that
just to analyze real world institutions, but also to the sequencing of the decision could in principle
design them. Roth’s focus is the design of mar- affect results. So Roth and his colleagues used
kets, but economists are also called upon to help computation to design an algorithm, and exam-
design incentives schemes for firms and regula- ine the impact of different rules (using data from
tion and social policies for governments. In this previous years), including the potential impact
paper, I will consider the role of economists in the of sequencing. The computations suggested that
design of policies and regulation. the algorithm never failed to converge to a stable
For Roth (2002, p. 1341), intervening in the match, and that sequencing effects were small
real world should fundamentally alter the atti- and unsystematic. This allowed them to suggest
tude of the economist and her way of working. a matching algorithm that would work (and has
He sets the tone in the abstract of the paper worked in practice), even though the theory for
(emphasis added): why couples were probably not a big problem
after all had not been fully developed.
Market design involves a responsibility for The plumber goes one step further than the
detail, a need to deal with all of a mar- engineer: she installs the machine in the real
ket’s complications, not just its principle world, carefully watches what happens, and
features. Designers therefore cannot work then tinkers as needed. At the time she inherits
only with the simple conceptual models the machine, the broad goals are clear, but many
used for theoretical insights into the gen-
eral working of markets. Instead, market details still need to be worked out. The funda-
design calls for an engineering approach. mental difference between an engineer and a
plumber is that the engineer knows (or assume
The scientist provides the general framework she knows) what the important features of the
that guides the design. In Roth’s Fisher-Schultz environment are, and can design the machine to
lecture, which again is focused on market design, address these features—in the abstract, at least.
game theory provides the general principles. For There may not be a theory fully worked out to
all the other domains where economists will be accommodate these features, but she can use
called to provide inputs, there exists a relevant computation and lab experiments to simulate
body of theory (or at least general insight) that how they will play out. When the plumber fits
they can use to guide design. the machine, there are many gears and joints,
and many parameters of the world that are difficult What is more, we may not even know ex ante
to anticipate and will only become known once which of these decisions will in fact matter, and
the machine grinds into motion. The plumber will our models offer fairly limited guidance on what
use a number of things—the engineering design, to pay attention to. Pathak (forthcoming, p. 3)
his understanding of the context, prior e xperience, writes: “I will discuss a handful of issues exam-
and the science to date—to tune every feature of ined in the theoretical literature on matching
the policy as well as possible, keeping an eye on mechanisms that have proven to be first-order.
all the relevant details as best he can. But with It’s worth emphasizing that it is only with the
respect to some details, there will remain genuine benefit of several design case studies that we’re
uncertainty about the best way to proceed, because beginning to understand which issues are quan-
the solution depends on a host of factors he cannot titatively important.”
easily quantify, or sometimes even identify, in the This uncertainty—concerning what the true
abstract. (These are the “unknown unknowns”: all model is—has consequences for policy engi-
the issues we can’t predict but will arise anyway). neering very similar to the problems discussed
Thus, another difference between plumbers and in the macroeconomics literature on “robust”
engineers is that engineers will start from the out- policy (Cogley et al. 2008). Since we don’t
come they are seeking to attain and engineer the know what the true model is, we need to design
machine to reach it. Plumbers, on the other hand, policies that are as robust as possible to this
will have to adopt a more tentative approach, start- model uncertainty. For example, Chetty (2015)
ing from the machine’s characteristics and iden- argues that public policy that includes “nudges”
tifying their effect (compared to another possible (such as default) is more robust, in the Hansen
set of choices). and Sargent sense, to a specific model uncer-
For example, in the early 2000s the city of tainty (are people rational or not?): rational peo-
Boston decided to change how it assigned stu- ple will chose what they want, and “behavioral”
dents to schools. There was an important engi- agents will be in default that should largely work
neering part to choosing the mechanism that for them. A smart nudging policy is thus a fairly
would be used, initiated by Abdulkadiroglu and robust policy, in general. It will have an effect
Sönmez (2003) and followed by a considerable only if people suffer from behavioral biases, but
literature (see Pathak 2011 and Abdulkadiroglu have no effect otherwise.
and Sönmez 2013 for reviews). But once city There is no general theory of how to design
leaders settled on a mechanism, they still had to policy under this kind of model uncertainty, how-
make many decisions. How to communicate the ever, and in many cases, even the best educated
change to parents? How to persuade them that guess will still be just that, a guess. The econo-
they could reveal their true preferences when they mist-plumber will use all they know (including
ranked schools? Should there be a limit on how model uncertainty), to come up with the best
many schools parents can (or are required to) guess possible, and then pay careful attention to
rank? Should parents living near a school receive what happens in reality. The uncertainty in the
preferential treatment, and how should that be set environment creates a highly stochastic world:
up? And so on. According to Pathak (forthcom- the natural way to “pay attention” to what hap-
ing, p. 3), who reviewed his experience working pens, as I will argue below, is thus to analyze
in several cities and who modeled his conclusion natural experiments or set up field experiments
on Klemperer’s: to try out different plumbing possibilities.
What really matters for school choice mar- B. Designing the Taps, Laying Down the Pipes:
ket design are basic insights about straight- Two Examples
forward incentives, transparency, avoiding
inefficiency through coordination of offers Although I will often talk about details as a
and well-functioning aftermarkets, and
influencing inputs to the design, including general way to describe the handiwork of plumb-
applicant d ecision-making and the quality ers, there are two different kinds of plumbing
of schools. Some of the issues examined in policy design. First is the “design of the
in the extensive theoretical literature on tap” work: taking care of apparently irrelevant
school choice matching m arket design are details, such as the way the policy is communi-
less important for practical design. cated or the default options offered to customers.
Second is “layout of the pipes” work: important would have been up to the local leaders, and so
logistical decisions that are fundamental to the on. But there are good reasons to think that each
policy’s functioning but often treated as purely of these features matters. The price to be paid
mechanical, such as the way money flows from for rice is one piece of information that recip-
point A to point B, or which government worker ients themselves do not have, so local leaders
has sign-off on what decisions. can easily skim off the program by inflating the
Banerjee et al. (2015) worked with the copay. Making the list widely available will cre-
Indonesian government on a “design of the tap” ate communal knowledge, potentially changing
problem, in the context of the Raskin program, the nature of the bargaining game between the
a subsidized rice distribution scheme that, with local leader and the citizen. Distributing phys-
an annual budget of US$1.5 billion and a tar- ical cards may not even be necessary, if lists of
geted population of 17.5 million households, is those eligible are posted in the village.2
Indonesia’s largest targeted transfer program. The In practice, distributing cards made a large
Raskin program is plagued with corruption: the difference: on net, across all of the variations of
data in Banerjee et al. show that beneficiary house- the program, the cards led to a large increase in
holds pay more than they should and receive less subsidy received by eligible households. Eligible
than their full quota of rice, to the extent that they households in treatment villages received a
receive only about one-third of the subsidy they 26 percent increase in subsidy, stemming from
are entitled to, overall. Working with the gov- both an increase in quantity and a decrease in
ernment of Indonesia, the researchers designed the copay price. This occurred despite imperfect
a set of field experiments to provide information implementation: eligible households in treat-
directly to eligible households. In 378 villages ment villages were only 30 percentage points
(randomly selected from among 572 villages more likely to have received a card relative to
spread over 3 provinces), the central government the control. Moreover, several of the variations
mailed “Raskin identification cards” to eligible did in fact matter: gains were substantially larger
households to inform them of their eligibility and when the information was public and prominent,
the quantity of rice that they were entitled to. In when the prices were displayed on the card, and
addition, the team varied several features of the when people actually got a physical card. The
design of the cards and the way they were intro- coupons feature did not seem to be particularly
duced: whether the copay price was also listed on relevant. This piece of plumbing research was
the card, whether information about all the bene- influential: since the success of the cards was
ficiaries was posted in the village, whether cards blatant, the government decided to scale them
were sent to all eligible households or only to a up and immediately started distributing them
subset, and whether the cards came with coupons to everyone, for Raskin and other government
to acknowledge the delivery of the rice. transfer programs. In total more than 60 mil-
A first plumbing issue is whether or not lion beneficiaries got cards. The upside of the
households got a physical card. Traditionally, fact that plumbing is not a prime focus of gov-
households had not gotten one—the program ernments is that when you hand them a good
was designed to be implemented by local lead- plumbing solution, it stands a better chance of
ers, who were given considerable discretion, being widely adopted than when you present a
and any effort at information dissemination had brand new program, which is prone to come up
focused primarily on them, not on citizens. The against someone’s ideology.
optimistic policymaker who initially designed Banerjee, Duflo, Imbert, Mathew, and Pande
this machine clearly envisioned a beneficent (2016) focus on the hidden pipes. In many gov-
local administration dedicated to the good of the ernments worldwide, some programs (such as
people. The very specific design features of the local workfare programs or infrastructure build-
cards and their distribution are further plumbing ing) are financed centrally but implemented by
issues, and would normally have been ignored, local communities. This causes a basic plumbing
even after the government had been convinced
that sending cards was a good idea. Someone (a 2
In a different setting, the government of India made the
graphic designer maybe?) would have made a conscious decision not to necessarily issue an identity card
decision on whether or not to list on the card the when it established biomarker-linked unique identifiers for
price to be paid for rice. Distribution of the cards all Indians.
problem: how to transfer money from the center

State pool in central bank of India
to the outposts, which are normally strapped
for cash and unable to front the expenses for
3
the programs. Traditionally, due to slow trans-
Fund transfer
Fund request
CPSMS
District
mission both of financial orders and informa- access 4
2
tion, the only practical way to transfer funds is
to give the local community an advance and, Block
1 5
after they provided proof of how they used it, Panchayat
give them another advance. But this system is Panchayat savings acount
fraught with difficulties. First, there is a basic
financial management issue: some money lies Figure 1. Flow of Authorizations under
idle in one locale, whereas another might lack the Traditional System
money. The advances need to be large enough
to avoid paralyzing the program, but will add up for a specific activity (“we have hired Mrs.
to the government’s expenditure. Second, and Kumar for 14 days”), and receive a correspond-
perhaps more importantly, the ex post justifica- ing payment on a “zero balance account,” which
tion of expenditures is often delayed and patchy, they would immediately use to pay Mrs. Kumar.
making auditing and accounting difficult. This We had the opportunity to design and implement
creates opportunities for leakage and embezzle- a large-scale experiment to evaluate the impact of
ment, and the monitoring structure that is put in this reform. We worked in 120 districts, in 1,000
place to control it may become part of the prob- treatment and 2,000 control local governments,
lem, creating its own red tape, inefficiencies, covering a population of 30 million people.
and leakage opportunities (Banerjee 1997). The main benefit that Mathew foresaw in this
We had the chance to experiment with a sys- system was that it would remove the uncertainty
tem that was largely the brainchild of Santhosh on whether and when money would come, and
Mathew, one of the coauthors of the study. he believed that it would empower local gov-
Mathew is a rare creature—a b ureaucrat-plumber. ernments to be more proactive in implementing
A career civil servant with a Masters in econom- the program. As it turned out, unforeseen events
ics and a PhD in development studies, Mathew meant that this could not happen in this context.
is deeply concerned with the pragmatic issues of A few days after the program was launched,
any program he is involved with. In the context the central government cut Bihar’s MGNREGS
of the MGNREGS, a public workfare program funds, due to a dispute on data entry, and mas-
financed by the Federal government but imple- sive uncertainty ensued on when the money
mented at the village level, a unified financial would be available again. This took five months
platform was put in place in Mathew’s adopted to be resolved, and since it happened just after
state of Bihar, whereby the money transfer could the introduction of the system, many local offi-
be made directly to the local government from the cials (wrongly) believed the new uncertainty they
bank in the state capitol. This solved half of the experienced on whether money would come was
problem, that of the transmission of money. The the new system’s fault.
other half (transmission of information on who The plumbing reform, however, changed
should get money) was originally not solved. something else: the reformed system was con-
Figure 1 shows the flow of authorizations under siderably more transparent than the status quo.
the traditional system: the local government (pan- The local official was putting his signature on
chayat) transmits a request for funds to the next a document where he said that he had recruited
administrative level, the block, which ratifies it specific people. This made it easy for auditors
and transmits it to the next level up, the district, to go look for them, and punish the officials if
which ratifies it and enters it in the financial man- they did not exist or had not worked for the pro-
agement platform, which generates an invoice to gram (many of the workers in MGNREGS are
the Central Bank of India, leading to a transfer “ghost workers”). Although this was not the pri-
right back down onto the panchayat account. mary motivation of the reform, basic economics
Mathew designed a reform to this system. The suggested that this could change the behavior of
Panchayat now could log in directly into the the local officials. We looked for evidence of a
financial management platform, enter an invoice reduction in corruption, and we found plenty.
First, we saw a reduction of about 22 percent policies and institutions. There is nothing
in the expenditures on the program, without obvious about designing a school choice mecha-
apparent change in the number of workdays nism, an auction, or an exchange market for kid-
in an independent survey we conducted. Since ney. Economic theory has developed the tools
the survey data could be noisy and miss a real to think about those questions, and it stands to
decline in expenditure, we looked for smoking reason that policymakers would ask for econo-
guns, and found two: first, we matched all the mists’ guidance, very much like economists pro-
names reported in the public database for the vide advice on macroeconomic policy. It may
program to a census that was conducted at the however be less clear that policymakers need
same time. We estimated that a match was less plumbers, or that economists would be partic-
likely to be found for someone in the public data- ularly well suited to play this role. This section
base in the control group than in the treatment argues that policymakers do need plumbers. The
group, indicating that they were more likely to next section will argue that our discipline pre-
be a ghost worker. Second, we examined the pares us well to think about many of the plumb-
wealth declarations of the Panchayat officials, ing problem that they face.
and found a significant reduction in the median Abhijit Banerjee, Gita Gopinath, and I
wealth (though we find no impact at the top and recently spent a few hours with bureaucrats and
the average effect is insignificant). consultants from the health department in the
Most anticorruption drives focus on explicit state of Kerala, in the south of India. Kerala is
anticorruption tools (such as audits) or on incen- one of India’s most successful states, at least
tives, but not on the way that the program is as far as the social sector. It is starting to face
structured. This example showed that structure first-world problems: an aging population, high
is not only relevant, but quantitatively important: prevalence of hypertension, diabetes, and obe-
during the course of the experiment, the reduc- sity. The bureaucracy and the elected leaders are
tion in expenditure alone saved $6 million to the committed to try to do something about both
government. And as in the case of the Raskin rising health care costs and noncommunicable
program in Indonesia cited above, the experi- disease prevention. Their best idea is to reform
ment was actually instrumental in getting this the public primary health care system in order
program scaled up across all states, and then for to make it more attractive to customers (who for
the financial transfers in other types of program. the most part seek their health care in the private
Initially, before the corruption result came out, sector, like elsewhere in India), and to instate
the program was canceled in Bihar: faced with a better practices of prevention, management, and
reduction in expenditure and the objections of the treatment. They would try out a new organiza-
local officials, who for obvious reasons were not tion of the health care sector. Nurses, volunteers
fans of the program, Mathew’s successor decided from the local governments, and doctors would
to cancel it. But after it had become apparent that work seamlessly in a health team that would be
the reduction in expenditure stemmed mostly in charge of keeping the population healthy, with
from a reduction in corruption, the story changed: a heavy focus on encouraging lifestyle changes
When the Prime Minister was keen to show that and preventative activities. They are currently
his government was doing something to improve planning to try the system in 152 health centers.
the implementation of public programs, this par- Though the Additional Chief Secretary (top
ticular reform seemed uncontroversial enough bureaucrat) in charge of health had invited us to
for top bureaucrats—a “technical reform” they the meeting, he was called away to deal with a
could get behind relatively easily. The program doctors’ strike around the time the conversation
was scaled up to MGNREGS. It is now poised to turned to the specifics of the reform. He handed
be scaled up to other kinds of centrally sponsored us over to a retired professor and a retired doc-
schemes and has the potential to affect billions of tor, who have been charged with designing the
dollars in central government spending. specifics of the policy. This in itself is symptom-
atic: top policymakers usually have absolutely
II. Why Policymakers Need the Plumbers no time for figuring out the details of a policy
plan, and delegate it to “experts.” In our con-
It seems quite natural for policymakers to rely versation, we started to push on some specific
on economist-engineers to design c omplicated questions on the model that they had in mind:
why would patients pay attention to a nurse, of obesity to be built in communities, etc.).
given that until now they have only taken doctors Interestingly enough, it had nothing to do with
seriously? Were they really sure that if the nurse the health care reform that we had discussed that
started to take blood pressure and fill prescrip- morning. It appeared that the details would have
tions, this would give her the authority she would to wait for another day.
need to dispense advice? Or that doctors would I have since learned to avoid these kinds of
be willing and able to signal that nurses were to meetings in general, but this encounter reminded
be respected, in a system that has always been me of my early days as a plumber. It turns out
heavily hierarchical? For that matter, did the that most policymakers, and most bureaucrats,
planners really think it was going to be possible are not very good plumbers.
for health care professionals to spend a lot more Part of this is because, contrary to most econ-
time on public health and prevention when there omists’ generous beliefs, agents in general are
were only 2 doctors for every 30,000 people? not necessarily very skilled at what they do. This
What was striking was, not only did they is the central argument of Banerjee (2002) in
not have any answer to these questions, but favor of normative economics. He points out that
they showed no real interest in even entertain- agents have not necessarily had the opportunity
ing them. Whenever we asked them to spell out to experiment, even in their main line of trade.
what they thought their policy lever was (as For example, farmers may not have stumbled
opposed to their aspiration), the stock answer on the best mix of fertilizer, because once they
was that they did not really have one, that the have something that works well enough, it may
local governments and medical officers could be optimal to stop experimenting. Even if they
not be forced to do anything. Maybe a village are inclined to experiment, the results may not
committee would need to decide to organize be informative if the outcome data is very noisy,
yoga classes, but another one would not, so and learning from neighbors is difficult. This
there was really no way to find out what really leaves a lot of scope for specialized expertise,
worked. This was entirely beside the point, since including that of economists. Banerjee focuses
the presence (or absence) of yoga class would on the example of microcredit, and the fact that,
have been an outcome of what they could do at before the microfinance movement, bankers did
the central government (train local governments not figure out how they could use some basic
in the fundamental of public health, for exam- principles of mechanism design to make it pos-
ple). But they seemed to have no understanding sible to lend to the poor.
of a causal chain going from policy design to Hanna, Mullainathan, and Schwartzstein
implementation and final outcome. Their posi- (2014) emphasize another barrier to learning:
tion oscillated between presenting the illusion of the tendency, even for experienced agents, not to
the perfect system, and presenting the illusion notice some of the important features of the envi-
of complete powerlessness in the face of local ronment. They test a model of selective attention
power and initiative. with seaweed farmers in Indonesia, which could
We tried, and failed, to engage them on the just as well apply to the plight of policy practi-
details of policy. Not only did they have no tioners. Seaweed is farmed by attaching strands
understanding of plumbing issues, there was not (pods) to lines in the ocean. A large number of
even a realization that plumbing was an issue dimensions affect yield (e.g., length of the line,
at all. When the top bureaucrat popped in and optimal distance between pods, optimal length
was apprised of the conversation, it was decided of a cycle, size of the pod). The farmers have
that we would be shown some details. We went limited attention, and can only attend to a lim-
away to another meeting and came back after ited number of dimensions. They allocate their
three hours. They had set up the projector, a attention optimally, but a feedback loop may
sign that things were about to get serious. They occur: if they start with the prior that something
displayed a power point with each of the new is not important, then they don’t pay any atten-
UN Sustainable Development Goals, and a list tion to it, and they may never notice that it is
of proposals to achieve them in Kerala. These in fact important. The authors present experi-
amounted to a long, meritorious, and likely mental evidence that farmers not only ignore
totally vacuous wish list (30 minutes of e xercise some dimensions of farming, they continue to
per day mandatory in all schools, awareness ignore them even when they are presented data
suggesting that they actually matter, unless existing Rs500 and Rs1,000 notes would be
they are told very specifically to pay attention replaced by new notes, and would become ille-
to them. In this instance, they tend to ignore gal tender by the end of December. The initial
the size of the pod, and in interpreting the trial stated objective was to cut corruption by getting
data, they do not think to compare trial by pod rid of black money, though other goals were
size. Very experienced farmers can get it terribly mentioned later: making India cashless, fight-
wrong, even when experimenting is not partic- ing terrorists by removing their means of pay-
ularly costly, simply because they do not know ment, creating a windfall for the government in
what to look at. Very experienced policymakers the case that less money came back than was
often get it just as wrong. removed from circulation. What was absolutely
This inability to notice details is maybe remarkable is that this major move was made
even stronger for the economist-scientists and with zero preparation beforehand. The central
the policy practitioners than for farmers. The bank had not printed enough money, creating a
economist-scientist is effectively trained to
cash shortage and huge bank queues. The new
ignore them: the art of modeling requires sim- money was a different size and most ATMs had
plifying reality to work out the logic of some to be retrofitted. Cash deposits in poor people’s
essential assumptions. The spotlight is on “no frills” accounts skyrocketed, presumably
general principles, not on details that may be because someone had given them the money for
very specific to an environment—indeed, such safekeeping. Between the time the policy was
details are distractions. In order to focus on announced and the end of December, the gov-
the general principles, the economist-scientist ernment issued and changed more than 100 dif-
will tend to dismiss those nasty details (which ferent rules concerning when individuals could
in fact will make the policy work or not). This deposit their cash, how much they could deposit,
applies to theorists, but to us field experiment- what kind of justification was needed, when they
ers as well, when we wear our scientists’ hats. could withdraw, how much they would have to
At those times, we run experiments that are as pay in penalty if they could not prove they had
tightly controlled as possible to try to isolate a paid taxes and so on and so forth. The result
specific mechanism or test a theoretical princi- was a combination of a flabbergasted public, a
ple. This is not a shortcoming, just a feature of massive disruption in the economy and, in short,
the scientist’s work, yet it means that scientists complete confusion.
will not typically be the ones able to identify the The idea that you could remove 86 percent
key issues that hinder policy success. The pol- of the cash in an economy without first think-
icy practitioner is often not even willing to try to ing through the logistics, including what would
acknowledge details. replace it (considering that mobile money, credit
Many are, to one degree or another, afflicted cards, and account-to-account transfers were not
by the specific blindness that affects the “high particularly well developed in India) is of course
modernist” state described by James Scott in extreme. But it expresses the kind of magical
Seeing Like a State (Scott 1998). Policymakers thinking that many governments indulge in—
and bureaucrats have a tendency to simplify the and not just in developing countries—that if you
problems that face them until they fit some pre- have a generally good idea that makes sense in
conceived notion of what a human being should some abstract way, the details will work them-
want, or need, or be. Scott’s book is mainly con- selves out.3
cerned with episodes where this high modern- The failure to experiment, the failure to
ist approach went particularly awry (including notice, and the grip of ideology combine in
the Soviet period and the Villagization policy in what Banerjee and Duflo (2012) described as
Tanzania), but the temptation to regulate people the “ Three I’s” (ideology, ignorance, inertia).
into what they want them to be affects policy- Policymakers tend to design schemes based on
makers more generally. the ideology of the time, in complete ignorance
A striking recent example is the other unex-
pected and potentially calamitous event that 3
An interesting side question is why bureaucrats cannot
took place on November 8, 2016. On that day, be given incentives to get the detail of the policy right, since
the prime minister of India, Narendra Modi, unlike policymakers they are not elected and should not be
announced that, effective immediately, all wedded to a particular viewpoint and ideology.
Panel A Control plants Panel B Treatment plants

Audits Audits
20 Mass: 0.7297 20 Mass: 0.3913
15 15
Percent
Percent
10 10
5 5
0 0
0 100 200 300 400 0 100 200 300 400
Panel C Panel D
Backchecks
20 Mass: 0.1892 20
15 15
Percent
Percent
10 10
5 5
0 0
0 100 200 300 400 0 100 200 300 400
Figure 2. Audit Readings for Suspended Particulate Matter (SPM)
of the reality of the field, and once these policies we performed backchecks on the auditors by
are in place, they just stay in place. sending an independent team to take pollution
There are two implications of this attitude of measurements right after them. The results for
policymakers, a negative one and a more positive one particular pollutant (Suspended Particulate
one. On the negative side, it means that many Matter) are reproduced in Figure 2. The auditors’
well-intentioned policies fail in practice (or reports show considerable bunching right below
don’t succeed as well as they could have) even if the threshold for acceptable pollution, which is
their underlying principles were sound, because completely absent in the backcheck. Even low
the details are not gotten right. Examples are pollution measurements, which are apparent in
numerous. Despite the considerable amount of the backcheck, are absent in the auditors’ read-
political capital that was expended to launch the ings. It seems that, in many cases, they were not
Affordable Care Act in the United States, and even visiting the firms. This was reflected in the
the importance of what was at stake, the roll-out going rate for auditors’ report, which was below
of the health exchanges was plagued by entirely the minimum cost of running an actual pollution
avoidable software glitches. Another example test. The likely explanation is that the system
is the third-party audit system that the Supreme was plagued by conflicts of interest: since the
Court of Gujarat mandated in an attempt to firms were selecting and paying the auditors, the
improve the monitoring of industrial pollution auditors had every reason to give them the report
(Duflo et al. 2013). Under the scheme, each firm they wanted, if possible for very little money.
with high pollution potential had to hire a pri- Although the system was designed with worthy
vate firm to complete a yearly auditing report. general principles in mind, basic design flaws
Each firm selected and paid its own auditor. made it essentially useless.
Auditors completed three visits and submitted On the positive side, policymakers’ inatten-
the resulting report to the government regulat- tion to detail creates a space for policy improve-
ing body, the Gujarat Pollution Control Board. A ments even in environments where the policy
number of rules were in place to ensure quality discussion is paralyzed, or where institutions are
and limit corruption: auditing teams could not not good (Banerjee and Duflo 2012). When there
sign up more than a certain number of clients, are existing laws on the books, finding ways to
there were qualification requirements for audit- make them work better meets with little oppo-
ing team members, and dishonest auditors risked sition. In the United States, even as Congress
decertification. Yet, the system was a complete and the executive branch were fighting over
failure. As part of a randomized evaluation, every single line item in the budget, m illions
more children were certified between 2008 and example, if households receive a cash transfer,
2014 as part of the school lunch program by it may matter whether the woman or the man
successive improvements in direct certification of the household receives the money (Field et
for children of parents who benefited from other al. 2016; Benhassine et al. 2015; Almås et al.
programs: SNAP, TANF, or Medicaid (Moore, 2015). It may matter if the transfer is presented
Conway, and Kyler 2012; Moore et al. 2013). as something that can help children go to school
The Child Nutrition and WIC Reauthorization (Benhassine et al. 2015), or if households are
Act of 2004 mandated direct certification, but reminded that they can save it, and so on.
rate of enrollment accelerated greatly after the The success of the nudge agenda, inspired by
Healthy, Hunger-free Kids Act of 2010. The act the work of behavioral economists, and popu-
contained a set of “plumbing” reforms that led larized by Thaler and Sunstein (2008) has given
to significant increases in direct certification: prominence to “design of the tap” issues among
mandating name matching of different databases economists. Indeed, the last two Ely lectures
three times a year; removing the option of “let- centered on this issue. Campbell (2016) argued
ter certification,” where a letter was sent to the that the regulation of household finance must
parents which had to be forwarded to the school; take into account the fact that households do not
establishing awards for the best-performing understand finance very well, and also that they
states. In 2011–2012, 1.7 million children were are subject to behavioral biases. Chetty (2015)
directly certified (a 17 percent increase from the argued that the “pragmatist” public finance will
previous year), and 740,000 were added the fol- pay attention to those issues regardless of their
lowing year (Moore, Conway, and Kyler 2012; scientific foundations, because they do lead to
Moore et al. 2013). This did not raise eyebrows greater success in what the policymakers are try-
in Congress, and probably did more to help ing to do. It is not just economists advocating
kids in poverty than a more publicized measure subtle behavioral intervention. Policymakers are
would have. also aware of them, at least in the United States
and United Kingdom, where “Nudge Units”
III. Why Economists Make Good Plumbers have been created in the White House and at 10
Downing Street. The main idea behind “choice
All of these arguments suggest that we need architecture” is precisely that what may appear
someone smart to take care of the plumbing. But to be irrelevant details (default options, ease of
why does it need to be an economist? making a decision, the way some numbers are
The reason is that many plumbing issues are presented, etc.) actually influence final con-
actually fundamentally economists’ problems. sumer choices. Policymakers therefore need to
Not all economists need to be plumbers; we be thoughtful about all those choices. (It is worth
need scientists and engineers too. And not all nothing that many tap-design issues don’t have
plumbers need to be economists. Sometimes, anything to do with behavioral economics: the
what a policymaker needs is a good software issues explored by Banerjee et al. in Indonesia’s
engineer, lawyer, or subject expert. But econom- Raskin program have much more to do with
ics, as a discipline, provides domain expertise political economy and corruption, for example).
for some of the fundamental plumbing issues in The economist’s job does not end once poli-
public policy. cymakers are made aware that these tap-design
There are three broad classes of reasons why issues matter, however, particularly because
plumbing matters for public policy, and all three policymakers love a free lunch, and some of
involve economics. the behavioral ideas may stroke this disposi-
First, the citizens (or the supposed beneficia- tion. They are thus wont to treat new behavioral
ries of the policies) are humans, with conflict- ideas as general “solutions to every problem,”
ing objectives, limited information sets, limited rather than as tools in a plumber’s kit, perhaps
attention, and limited willpower. This means necessary to tune any kind of policy properly. A
that the specific way that policies are presented plumber’s mindset is required to carefully watch
and implemented will potentially have tremen- how these ideas play out in practice, and some-
dous influence on whether they will work or not. times an economist may be need to analyze how
Therefore, when a new policy is put in place, it works. We will see several examples of cases
it is essential to think about those details. For where “obvious” improvements did not turn out
to be improvements at all. Ben Handel’s and on how to hire, fire, and reward government
Jonathan Kolstad’s work on health insurance employees creates the basic environment in
illustrates the subtle economic analysis neces- which government work is happening.
sary to analyze the plumbing of a comprehen- Consider what is involved in the decision to
sive health reform such as the Affordable Care hire new front-line workers (community liaison
Act (a.k.a. Obamacare). They are concerned workers, health workers) to carry out a new pro-
with issues of practical relevance: in 2015, they gram. While the bureaucracy will generally be
wrote a policy document with two specific pro- quite good at determining how many new work-
posals for the health exchanges (Handel and ers are needed and may be considering how
Kolstad 2015), a consumer search tool that per- to set their wages, it will almost surely not be
sonalizes choice framing and recommendations thinking about what the job-announcement post-
based on an individual’s specific characteristics, ers should advertise. And yet, Ashraf, Bandiera,
and a “smart default” policy, targeted to partic- and Lee (2015) show that a small change in the
ular consumers, to which the government could framing of a front-line health worker job on the
default consumers during open enrollment when advertisement poster—namely, emphasizing the
considerable evidence indicates that they are career path that the job can lead to rather than
in the wrong plan. Those proposals are quite the opportunity to do good for the community—
detailed, and they draw from a substantial aca- makes a large difference on the observable and
demic literature (to which Handel and Kolstad unobservable characteristics of those who apply.
have contributed before). Because the propos- The “go-getters” who applied for the job in the
als are realistic, they need to tackle issues that career-path framing treatment conduct more
are beyond the simple observations that con- visits, accomplish more during visits, and ulti-
sumers are ill informed and tend to be charac- mately lead to better health outcomes than the
terized by inertia (although this clearly is part “do-gooders.” The overall goal of this new pro-
of the problem). For example, Handel (2013) gram was to create a new layer of health workers
observes that, due to inertia, many consumers close to people in the community; however, the
remain in dominated plans, where they lose a seemingly minor advertisement decision turns
substantial amount of money (consumers leave out to have very significant effects.
approximately $2,000 on the table, on aver- We are far from understanding how this plays
age). However, Handel also finds that, if con- out. In contrast to the results just discussed,
sumer inertia is reduced, adverse selection will Deserranno (2015) finds that financial incen-
likely increase in a marketplace with no insurer tives can lead to a less socially motivated appli-
risk-adjustment transfers: this means that any cant pool. She randomized the advertisements
proposal that helps consumer solve the iner- for a new position of health promoter for BRAC,
tia issue needs to address the adverse selection a large international NGO. In one treatment arm,
issue at the same time. the job advertisement mentioned the minimum
Second, those who implement policies, often amount that a health promoter was expected to
government workers, are humans too. They may earn (“low-pay” treatment) and, in another treat-
suffer from the same limitations as the final ment arm, the maximum (“high-pay” treatment).
subjects of the policy, and their incentives are A third treatment advertised the mean of the
not necessarily to work very hard or in the best expected earnings distribution (“medium-pay”
interests of the citizens they serve. In much the treatment). Note that the difference was only
same way as plumbing details create a choice in the framing: the actual pay was the same in
architecture for the final consumers, they also all treatments. The study finds that while the
create an “incentive architecture” for the gov- high-pay treatment attracted 30 percent more
ernment workers who implement the policies. applicants relative to the low-pay treatment, the
Despite the current vogue of catchphrases applicants had much less experience as health
such as “state capacity,” these problems have volunteers and were much more likely to state
received fairly little attention in economics until “earning money” as the most important feature
relatively recently. There are, however, several of the job. Applicants under the medium- and
recent interesting experiments on the “personnel high-pay treatments were also 24 percentage
economics” of the state (see a review in Finan, points less like to make a donation to a public
Olken, and Pande 2015). Indeed, the decisions health NGO in the context of a dictator game.
Some aspects of personnel economics of rules will, advertently or inadvertently, affect the
the state such as salary and incentives are very ability and willingness of the front-line workers
salient and politically charged, and I would not to implement the policy. Mindful incentive infra-
characterize them as plumbing.4 Some are much structure requires understanding the government
less salient, such as the framing of the posi- as an organization: any policy will necessarily
tions (as described above) or transfer policy. take place in an organization that has power
Transfers and posting to more or less desirable structures, and a culture that has large impacts
locations are routinely used by the governments on how the policy will play out. The specific
of developing countries to reward performance way it is drafted, even the parts of it that do not
or punish people who are not liked, but this is formally address incentives, will determine its
not done in a systematic way. Banerjee et al. ability to be implemented (Gibbons 2010). This
(2012a) find some positive effects of reducing gives particular salience to the piping issues,
the frequency of transfers of police officers on which, having not benefited from the behavioral
their performance. Banerjee et al. (2012b) find economics revival, are even more neglected than
that the promise of a transfer out of a punish- the tap-design issues.
ment posting induced police officers to work Between 2001 and 2014, a group of us
harder on conducting random sobriety check- (Banerjee et al. 2016) experienced firsthand the
points. Khan, Khwaja, and Olken (2016a) power of incentive architecture as we attempted
propose, implement, and test one particular to mainstream a remedial education program
system—performance-ranked serial dictator- through the government system. The basic idea of
ship—in which tax inspectors select their most the intervention was that children must be taught
favored location in an order that is determined at the level at which they currently are, rather
by performance. The authors show that the than at the level prescribed by the curriculum for
mechanism does in fact work, increasing tax their grade. Several RCTs showed that it could be
collection by 40 to 80 percent. Moreover, the implemented by volunteers or paid NGO work-
people who, ex ante, their model predicts will ers to very good effect, and it had been shown to
be the most sensitive to this incentive are in fact raise learning outcomes when run by Pratham,
those who increase their collection the most. a large NGO, outside of school. To mainstream
But note that none of these studies investigate it within schools, Pratham trained teachers in
an essential aspect of this policy in equilibrium: the pedagogy. We evaluated this attempt, as part
if transfers are used as rewards, or if postings of a large scale-up of the policy in the Indian
are more secure, the most corrupt people may states of Bihar and Uttarakhand over the two
work very hard to get transferred to where they school years of 2008–2009 and 2009–2010.
want to be. This could entirely change the wel- The results were disappointing: despite official
fare implication of the policy, so this is an area endorsement, teachers did not seem to apply
where more plumbing is needed. the methodology when they were back to their
But importantly, the incentive infrastructure classrooms, and the intervention did not result in
issues apply to the design of any policy and reg- any increase in test scores. Our process monitor-
ulation, not just incentives policy specifically ing and a set of qualitative interviews suggested
or even personnel economics more generally. that there was an incentive-architecture problem
Regardless of whether the policymaker is draft- in the way Pratham and its partner government
ing an education intervention, a banking reg- had implemented the program. While there was
ulation, a new tax, or anything else, there will endorsement of the approach at the top, nothing
be decisions to be made on who will carry it else in the teachers’ lives had really changed. In
out, under what supervision, with what level of particular, their stated responsibility remained to
autonomy, and how implementation will inter- complete the curriculum, and from their point of
act with their other duties. Any particular set of view, any time devoted to the remedial activities
was essentially wasted.
4
To have a chance of success, something
Although they are extremely important, and the subject needed to be change, not in the basic philosophy
of important field experiments: see, for example, Dal Bó,
Finan, and Rossi (2013) on front-line worker recruitment of the program or in the content of the training,
and Khan, Khwaja, and Olken (2016b) on financial incen- but in the plumbing. Working with the state of
tives for tax collectors. Haryana, Pratham implemented two key changes
to the way the intervention was run. First, the been put in place for checking account overdraft
program would be implemented during an extra coverage, credit card over-the-limit practices,
hour in the day (added in all schools, not just and some home-mortgage escrows. In particu-
those where the program was implemented, but lar, a 2010 regulation imposed on banks a pol-
earmarked to this in treatment schools). Since icy default regarding overdraft coverage. Under
this hour was new, it was easy to make salient this regulation, banks were not allowed to charge
that the responsibility of the teachers was to an overdraft fee for when an account went into
carry out the remedial activity during that time. a negative balance due to an ATM transaction or
Second, a cadre of supervisors (from within the a nonrecurrent debit card transaction unless the
government system) were recruited and trained consumer had opted out of the default and cho-
to act as both supporting people and monitors sen the bank overdraft protection plan. The idea
for the teachers as they were implementing the was that the overdraft coverage is basically a
program. Nothing changed in a teacher’s pay high-interest, low-risk loan charged by bank and
or her formal role, but these two adjustments is not in the interest of most customers. Banks get
made the difference between signaling that the away with high overdraft coverage fees because
program was an optional add-on or part of the customers do not expect to go into overdraft, and
teacher’s core duty. We evaluated this version thus pay no attention to this feature when they
of the program in an RCT in 400 schools (200 sign up for an account. Moreover, they don’t nec-
treatment, 200 control), and found significant essarily realize when they are going into overdraft
increases in test scores. (Stango and Zinman 2014).
Finally, any regulation that seeks to affect The 2010 regulation required banks to offer
the way that firms behave, or any policy that no overdraft protection by default, and prohib-
relies on firms for implementation at any point ited them from charging overdraft fees in the
must also take firm behavior into account. This default if the account ended up being overdrawn
behavior is one reason why “obvious” solu- anyway.5 In doing so, the regulators cited the
tions do not always work the way they would research on the stickiness of the default, and
be expected to. Willis (2013) makes this point expressed the hope that most consumers would
in the context of default enrollment in an over- stick to the new default of no overdraft protec-
draft protection plan (which Campbell 2016 tion. To enhance the stickiness of the default, the
describes as one of the signature attempts to regulator set some safeguards: to get out of the
incorporate behavioral economics into the reg- default, customer needed to take an “affirmative
ulation of household finance products). Defaults choice” (talk to someone or click a box), and
are settings or rules about how product, policies, before allowing them to do so, the bank needed
or legal relationships function that are in place to show them some documents.
unless users actively change them. Based on Thus, the regulator had taken care of the obvi-
their success in retirement savings in particular ous plumbing concerns they could think of. And
(Madrian and Shea 2001; Chetty et al. 2014), yet, the default did not work that well. Willis
default-in programs have been come popular (2013) cites data from the Consumer Finance
with policymakers. Since people tend to stick Protection Bureau suggesting that around one-
with the option they are presented with, policy- half of the bank’s clients had opted out of the
makers can get the outcome they want by set- default, and that those who switched were those
ting the default carefully, while leaving people more likely to then pay high overdraft fees.
the freedom of choice (this is what Sunstein and Banks achieved that by using any number of
Thaler have called “libertarian paternalism”). strategies that Willis describes in detail, such as
When drafting policy, picking a default is often minimizing the costs of opting out as much as
unavoidable, and hence, regardless of whether the law allowed them to, stationing employees
one is convinced of their stickiness, choosing at ATMs to convince people to opt out, using
the right default is always a plumber’s problem. forms for new clients that made sticking with
But in some cases, policymakers can attempt
to influence the relationship between consum-
ers and firms by mandating that the firm offers 5
This applied to ATM or one-time transactions only.
a particular default option. In the consumer Banks could still authorize and charge for overdraft occa-
finance area, legally required default terms have sioned by checks or recurring transactions.
the default just as complicated as opting out, and These two examples may give a somewhat too
using language that made the default scary and simplistic view of the problem: firms, like gov-
the alternative attractive. ernments, are organizations themselves and may
The bottom line is that, if the objective of thus not always respond optimally to a change in
the government was to protect customers from the environment. Taking into account how a reg-
the bank’s abusive use of overdraft policy, then ulation will affect firms as organizations further
the policy should have been drafted with an underscores the role of details rules and how
understanding of the economics of the bank’s they interact with the environment.
likely response. This would have probably led To summarize, economists have the dis-
to a different set of rules. The 2009 Credit Card ciplinary training to make good plumbers:
Accountability Responsibility and Disclosure economics trains us in behavioral science,
(CARD) Act was less respectful of the banks’ incentives issues, and firm behavior; it also
freedom. It contained some “nudges,” but also gives us an understanding of both governments
hard ceilings on the late fees that banks were and firms as organizations, though more work
allowed to charge. Agarwal et al. (2015) find probably remains to be done there. We econo-
important effects of the CARD Act on inter- mists are even equipped to think about market
est and fees paid by customers (as the ceilings equilibrium consequences of apparently small
were binding on some products and were not changes. This comparative advantage, along
offset by increases on other products), and rel- with the importance of getting these issues right,
atively little change of behavior related to the makes it a responsibility for our profession to
nudge. engage with the world on those terms.
Economists have some domain expertise to
understand those dynamics, and therefore can IV. Plumbing and Experimentation
help draft policies that take those incentives into
account. Duflo et al. (2013) not only studied the Roth (2002, p. 1342) made the point that “in
(faulty) third-party audit system for environment the service of design, [lab] experimental and
that the court had ordered in Gujarat, but set out computational economics are natural comple-
to improve it. Given what we know from eco- ments to game theory.” Lab experiments and
nomics, it was not rocket science to identify the computational work are essential tools for the
key problem: since the system was plagued by economist-engineer because sometimes real life
conflicts of interest, the performance of the audi- will not oblige with the nice assumptions that
tors could be expected to improve if the financial are necessary for elegant models with closed-
link between them and the firm was severed, and form solutions. If that is the case, simulations
if they were instead made liable to the regula- are needed to understand the behavior of the
tor, who would them reward them for accuracy. proposed systems under the more complicated
Similar proposals have been made for reforming set of assumptions.
financial audits in the United States and Europe, Similarly, in the service of fitting policies for
though they have not been implemented. The the real world, field experiments are a natural
Gujarat Pollution Control Board (the regulator in complement to theory and economic intuition.
charge) implemented a reformed system, where Evaluation in the field is necessary for the
auditors were randomly assigned to firms, paid plumbers because intuition, however sophis-
from a central pool, and independently moni- ticated and however well grounded in existing
tored in a randomized experiment in 473 firms theory and prior related evidence, is often a
(233 of which where assigned to the new sys- very poor guide of what will happen in reality.
tem). Compared to audits conducted under the Therefore, the economist-plumber is eminently
status quo (even with the same auditor operating fallible. We have already seen several interven-
in both systems), we found those “reformed” tions that did not initially succeed, despite being
audits to be significantly more accurate. This designed as well as possible, and with econo-
additional accuracy led to a reduction of pol- mists’ inputs (e.g., the 2010 default regulation
lution among the worst offending plants. This to prevent the banks from using overdraft pol-
work inspired a reform of the entire third party icy, and the first attempt to scale up Pratham’s
audit system in Gujarat, once again demonstrat- “teaching at the right level” intervention). My
ing the convincing power of plumbing. career is peppered with such examples, and
occasionally, the best-intentioned efforts have involved with, as rigorously as possible, and
backfired. help correct the course: the economist-plumber
In Banerjee, Duflo, and Glennerster (2008), needs to persistently experiment and repeat the
we helped the local government in Udaipur dis- cycle of trying something out, observing, tinker-
trict, Rajasthan, design an incentive program to ing, trying again.
get nurses to come to work more often (a base- This underscores the need for evaluation,
line survey had shown that they were absent but of course, randomized controlled trials
more than one-half of the days). Each nurse was are not the only way to evaluate policies. For
given a time-and-date stamp and asked to stamp example, Handel and Kolstad’s work, which I
a register affixed to the wall of the center sev- discussed, largely exploit nonrandomized situ-
eral times in the course of a day, one day a week ations, although they have some experimental
(on Monday) to prove her presence. The gov- papers as well. Conversely, most RCTs are not
ernment announced that those who didn’t show plumbing experiments—they are designed and
up at least 50 percent of the time on Mondays implemented carefully with as little disturbance
would get their wages docked. This policy was as possible, to learn a specific parameter or mea-
evaluated in an RCT. Initially, everything went sure the effect of one particular intervention.
according to plan: nurse attendance, which They are not concerned with the specifics of
was around 30 percent before the launch of the how such an intervention would be implemented
program, jumped to 60 percent after the pro- at scale.
gram was introduced in centers where it was in Still, randomized field experiments are an
place, while it remained unchanged elsewhere. economist-plumber’s natural tool, for several
However, a few months later, the tide turned. reasons.
Nurse attendance in the monitored centers First, there are the traditional identifica-
started to drop, and kept dropping. Eight months tion arguments: it is generally difficult to form
into the program, the monitored and unmon- a proper counterfactual when a policy was
itored centers were performing at an exactly launched in the world, and hence we may not
equal—and very poor—level. When we looked know the real effect of many policy attempts.
into what happened, we were struck by the fact Second, precisely because policymakers are
that recorded absence remained low even after not particularly interested in plumbing, they
the program fell apart. What went up sharply will in general make a large number of deci-
were “exempt days”—days when the nurses sions at the same time. Even when it is possible
claimed that they were not required to come in to evaluate one policy or one regulation with
for some reason (training and meetings were the a credible identification design, it may be dif-
most common reasons). There seems to have ficult to “unpack.” For example Agarwal et al.
been a strong collusion between the nurses and (2015) estimate the impact of the 2009 CARD
the doctors in charge of supervising them. At Act, which regulated the fees that banks could
some point, attendance in the monitored centers charge on consumer business cards, but not
actually fell below the unmonitored ones and business credit cards, and also had some nudg-
remained lower until the end of the study: by ing provisions. They compare fees and interest
the end, the nurses in monitored centers came paid before and after the acts on consumer cards
to work only 25 percent of the time. We had versus business cards. This gives us a convinc-
completely failed to foresee how this collusion ing estimate of the effect of the CARD Act taken
would undermine the program, and would in as a whole, but it is a little difficult to separate
fact make the situation worse than it had been out the impact of its different provisions. In
before, as it dawned on nurses that they could contrast, with a carefully designed experiment,
get away with anything. it is possible to manipulate one lever at a time.
For some people, this very fallibility is a Thus, randomized experiments are a very natu-
reason why we should not intervene at all. But ral complement to policy experimentation. This
everybody makes mistakes, so this is hardly an is nicely illustrated by the Banerjee et al. paper
excuse for inaction (Banerjee 2002). However, on the Raskin information card, to name just one
because the economist-plumber intervenes in example.
the real world, she has a responsibility to assess Third, an explicit experiment, with a start
the effects of whatever manipulation she was date and an end date by which a report must
be provided and a decision made, has import- it forces the policymakers to think about the
ant benefits for the process of government. levers that they actually have control over. In
Greenstone (2009) argues for a “culture of per- our conversation with partners from the Kerala
sistent regulatory experimentation and evalua- health department, which I recounted above, it
tion,” where every regulation should be subject was apparent that they were very confused as to
to experimentation, and an automatic sunset what they could and could not do as a policy.
clause (by which it must be reviewed to deter- On the one hand, they had an ideal vision of the
mine its costs and benefits during the exper- nurse and the village officials working together
imental period) and an expansion clause (by to be key dispensers of health advice. On the
which it will be expanded if it turns out to have other hand, they resisted any notion that they
large benefits and low costs). Greenstone argues could actually tell local governments what to do.
that, in the United States, there are opportuni- If they had to design an experiment, they would
ties to reform the system for evaluating, adopt- have needed to spell out what was in fact in their
ing, and getting rid of policies. Most evaluations control (training the panchayats? Training doc-
are done ex ante, with very little relevant data to tors and nurses in a set of new guidelines and
determine the cost and benefits, and very little roles and responsibilities?). This would actually
serious review ex post. Almost no federal regu- have forced them to write down a policy, rather
lation comes with a sunset clause, although they than keep it suspended at the purely aspirational
are used more frequently by states (Baugus and level.
Bose 2015). Partly inspired by this article and Learning happens in the course of the experi-
by Greenstone himself (who hung up his pro- ment itself as well. With Banerjee et al. (2012a),
fessor’s gown for a time to engage in full-time we conducted a first set of randomized exper-
plumbing), the Obama administration called iments, designed to improve the behavior of
for a “look back,” a comprehensive review of police officers with ordinary citizens in the field.
all the costs and benefits of regulations in place The Rajasthan police top brass were very com-
by all agencies (Executive Order 13563, 2011). mitted to the goal of improving the force, and
This was a bold move, but without proper data ready to immediately implement reforms that
available and without any counterfactual, the we would have suggested. We convinced them
exercise had a narrower scope than was origi- to start with an experiment to test out some of
nally envisioned. Moreover, inertia is deeply the main proposals made by the various police
ingrained in governments everywhere. Once commissions in India: training police officers,
an apparatus is installed to support a policy or introducing community observers, rationalizing
a regulation, it is unlikely to go away. The case the workweek with a day off and a duty roster,
is different if the apparatus is the explicit sub- and limiting the frequency of transfers. Even
ject of an experiment, with an anticipation of a though this was a personnel experiment carried
review to come. Of course, what matters for this out on a large scale (more than 100 police sta-
political economy process is that the process is tions, covering a population of over eight mil-
viewed as experimental, not necessarily that the lion people) and with a government, this was not
experiment be randomized. But once a regula- a plumbing experiment, in that we did not pay
tor agrees to try something out (perhaps differ- much attention to the details of how each inter-
ent variations of a program) on an experimental vention was going to be carried out. The reason
basis, the marginal cost of randomizing is often is that our partners in the police did not think
low, and the results of an RCT are more trans- that plumbing would ever be an issue at all. The
parent and easier to explain.6 police in India is structured like a military orga-
Fourth, the process of designing a random- nization, and the top brass assumed that if they
ized experiment in conjunction with a govern- ordered something, it would happen. So they
ment is an important learning process, because simply sent the order to carry out these differ-
ent activities down the hierarchy. Unfortunately,
this was not exactly what happened. Some of
the reforms (the transfer freeze and the training)
6
Greenstone’s agenda found some resonance in the
Obama administration. A 2013 memo from the Office of
Management and Budget (Burwell et al. 2013) encouraged could be more or less directly implemented from
all heads of agencies to use existing evidence to draft new the top, and were indeed carried out. But others
rules, but also to conduct randomized experiments. (the rationalization of the workweek, and the
community observer) relied on the station mas- squad would get them a ticket out of the dog-
ters for implementation. If they had been carried house. We fitted the police cars of this brigade
out, they would have caused a direct limitation with GPS, which allowed us to monitor whether
on the autonomy of those very station masters. they were in fact doing their job. Compared to
So, with the benefit of hindsight, it is not entirely regular police stations, officers in those squads
surprising that those measures were in fact never were in fact more likely to be at their checkpoint
actually implemented in the field: neat reports and arrested more drunks. Thus, the first exper-
made their way up, but when we sent monitors to iment gave the police enough understanding of
the field, they found that nothing had changed in what was under their control in their organiza-
practice. Collectively, we entirely neglected the tion and what was not, information that allowed
plumbing when we were designing the reforms them to better design a new policy.
and their implementation (and accordingly, we
probably tested out reforms that were unwork- V. The Plumber and the Scientist
able, although they had been recommended by
various reform commissions), because we had The knowledge that is generated in the course
a vision of the organization that turned out to be of a plumbing experiment is often quite con-
entirely different to what it was in practice. But text-specific (to a particular organization and
this project was carried out as an RCT, rather institutions). One concern is that it may not gen-
than just rolled out (which had been our police eralize beyond this setting. In this case, even if a
partner’s initial idea), so we were at least able randomized experiment is used to pick the ver-
to realize what we had gotten wrong: not only sion of the policy that actually works, is it still
because we saw no change in the final outcome, science, or is it just glorified consulting? After
but also because we had put in place a compre- all, firms also use randomized controlled trials
hensive data collection apparatus to see what to choose the price of their goods, the color of
was happening on the ground. their yogurt packets, or what advertisement to
While we never got to learn whether the feature on a landing page. The number and the
rationalization of the workweek or community scale of these experiments have exploded with
observers could have worked if they had been the advent of the Internet and A/B testing. Many
implemented, we learned a fair amount about of these marketing experiments are of use to the
what kind of interventions were possible to carry companies that run them, but are not treated as
out with the police. We took this lesson to the research, do not receive IRB approval, and are
next project we carried out together, when Nina never published. When they are (see Simester
Singh got the responsibility to design an anti- forthcoming for a review) it is because they
drunk-driving project (Banerjee et al. 2012b). manage to make a more general point on con-
Some of the issues that we were concerned with sumer behavior. Are plumbing experiments able
are central to policing, and related to quite subtle to help us draw similarly able to help us draw
questions of economics: should enforcement be general conclusions, valid in other contexts as
always in a fixed place or vary locations? Should well?
it follow a crackdown model or be at a lower level There are several answers to this criticism,
for a long period of time? We were interested in which sets apart policy plumbing from usual
setting up an experiment to learn that. Based on A/B testing.
the previous experiment, however, neither the First, plumbing experimentation can create
researchers nor the police officials were ready to the counterfactual that allows for the evaluation
assume that traffic checkpoints to catch offend- of the policy itself. Many policies are imple-
ers would regularly happen at the specified time mented nationwide, which makes it difficult to
and place. So we thought hard about the police evaluate them. Plumbing experiments can help
as an organization, with the constraints that it by creating variation in how well they are imple-
faces (in particular, the prohibition of merit pay) mented. This can generate the variation suffi-
in order to find a space where it was possible cient to evaluate the causal impact of the policy
to provide incentives. We formed drunk-driving itself, even when it has already been rolled out.
enforcement squads out of officers assigned in a Consider Devoto et al. (2012), the experiment
certain type of station perceived as a punishment on water connection. By fixing some “design of
posting. We promised that good behavior on this taps” issues, and removing the barriers to access
to the loan (and hence the water connections), particular improvement, we would only need to
they generated an exogenous increase in water be willing to assume that other reasons for inten-
access in a randomly selected set of households sive margin increase in program participation
(69 percent take-up for treatment households, would have a similar effect on wages in the pri-
versus about 0 in control households), which vate sector. That assumption is more palatable
gave them the opportunity to evaluate the impact that the assumption that smartcards themselves
of access to water at home on health and house- would work well in any context; this is a state-
hold welfare. ment about the functioning of the private labor
In addition, a particular feature of plumbing market rather than about the government as an
experiments is that, because they can be car- organization.
ried out with governments, and can therefore Second, the plumbing experiments have also
be done on a large scale, they give us a window generated insights that are useful to pure science.
to evaluate the equilibrium effect of policies, A first relationship between plumbing and
which is otherwise very difficult to assess and pure science is that plumbing experiments can be
model.7 For example, Muralidharan, Niehaus, designed to test mechanisms, precisely because
and Sukhtankar (2016a) evaluate the impact the researcher can operate very specific levers.
of biometrically authenticated payment in the Thus, to employ the terminology of Ludwig,
MGNREGS program in Andhra Pradesh. The Kling, and Mullainathan (2011) and Congdon
new system replaced the traditional method et al. (2017), plumbing experiments can be both
of paying beneficiaries through their postal mechanism and policy experiments. To be sure,
account, which required them to travel to the this is not a specific advantage of plumbers. The
post office and often pay a bribe to an agent most common use of field experiments these
if they wanted to get paid. In the new system, days is a careful test of a theoretical proposition,
they could withdraw money from an ATM in and this use of experiment as “science” usually
the village staffed by a local woman, using takes place in a better controlled environment
their fingerprint for identification. The rest of (see Banerjee and Duflo 2009 for a discussion
the implementation of the MGNREGS program of this use of experiments, and Glennerster 2017
(including how the money got to the Panchayat for practical trade-offs between control and rele-
account) was not changed. The authors found vance that arise when running experiments). But
that treatment improved the implementation of plumbing does not preclude science, and pres-
the program in many dimensions. In particular ents some advantages. First, the policy context
it reduced leakages and delays, leading to higher is usually not replicable outside of this policy
participation in the program. Importantly, the environment. Second, the relative indifference
randomization was carried out at the subdis- of governments to plumbing questions often
trict level (a group of 20–25 villages, covering translates into a willingness to give the plumb-
over 60,000 people). This allowed the authors, ers a relatively free reign on the design. Third,
in follow-up work (Muralidharan, Niehaus, and the scale also allows many subtreatments with-
Sukhtankar 2016b), to study the impact of the out losing power. The plumbers can then turn to
program on wages. They found that their inter- their scientist friends (or to the scientist within)
vention led to an increase in private sector wages to design quite subtle “mechanism experiments,”
without reduction in employment. This general moving one parameter at a time to test a very
equilibrium impact is quantitatively important: specific scientific theory, in the policy-relevant
90 percent of the increase in individual’s income context, at scale, and on a representative popu-
due to the intervention could be attributed to the lation. In the Banerjee et al. (2015) experiment
private wage increase. The question of the equi- on the Raskin card, for example, there were
librium impact of workfare programs like the several manipulations that permit quite precise
MGNREGS is of central interest for the design evaluation of the mechanism at play. One of the
of social safety nets. Of course, to generalize the manipulations involved varying the visibility of
finding beyond Andhra Pradesh, or beyond this the card distribution program to affect high-or-
der belief (not only do I know the benefits I’m
entitled to, but I know that the official knows,
7
Muralidharan and Niehaus (2016) make the same argu- and I know that others know that we both know).
ment in their discussion of experimentation at scale. This provides quite unique empirical evidence
on the impact of common knowledge, a concept measured by home visits and the organization
in many models of game theory and political of community meetings. In contrast, Deserranno
economy. (2015) provides evidence in support of the rela-
Another nice example is the Kling et al. tionship between prosocialness and job perfor-
(2012) study of comparison friction. We have mance. Compared to the high-pay treatment, the
seen that inertia is an important feature of con- health promoters recruited under the low-pay
sumer behavior. One reason may be that it is dif- treatment visited a larger number of households,
ficult for them to compare options, even when organized more public presentations in the vil-
the information is available. There are a number lage, and provided more antenatal checks. She
of studies of “comparison friction” contrasting also finds that they were more likely to target the
the Internet to other markets, suggesting that this most vulnerable households. Clearly, these two
is a general problem. Kling et al. (2012) set up studies do not settle the issue, as they have oppo-
an experiment that isolates comparison friction site conclusions, but they help us think about it
from other factors. In the treatment group, senior better.
citizens in Medicare Part D received personal- To be fair, it is not always possible to have
ized information on the performance of different very complex design in plumbing experiments:
plans—information that they could, in principle, in some instances, it places too much strain
have acquired very easily by going to a website on the implementing agency. For example, the
and entering the data. The intervention had large Muralidharan, Niehaus, and Sukhtankar (2016a)
impact on plan switching (which increased from experiment with smartcards in MGNREGS var-
27 percent to 38 percent). The design, which is ies many things at the same time (they give the
focused on the very narrow channel of providing smartcards, they move the collection point to the
the results of the comparison, isolates the role village, they move it from the post office to the
of comparison friction from other mechanisms, ATM, and they introduce local female “bank-
such as difficulty of acquiring information, pro- ing correspondents” who act as local agents for
crastination, etc. the bank). This makes it difficult in this setting
More generally, well-designed plumbing to isolate one particular mechanism. But when
experiments can sometimes introduce varia- possible, it makes the experiment particularly
tion that does not exist in natural conditions, valuable, and my sense is that such trials will
and thus generate a counterfactual to illumi- become increasingly feasible, especially as the
nate theoretical mechanisms that are not easily first experiments provide examples to follow.8
observable in nature. For example, it is a ques- A first relationship between plumbing and
tion of general interest whether people hired pure science is related to the “learning by
with prosocial motivations would generally noticing” concept that I referenced earlier.
make better employees, and specifically bet- Many scientists-turned-engineers and later,
ter public service employees. But since proso- turned-plumbers (Banerjee, Klemperer, Pathak)
cial tendencies are not randomly assigned, this have said that experience taught them to pay
is not something where evaluating causality attention to an entirely different set of notions
is easy. The Ashraf, Bandiera, and Lee (2015) than the ones they started with. The scientist’s
study and the Deserranno (2015) studies that I focus on specific models and assumptions
mentioned earlier provide examples of how a requires not noticing things that are unimportant,
plumbing experiment can help us make progress or that cannot be made sense of in their particu-
on difficult-to-settle empirical issues: in both lar paradigm. The plumber must notice anything
studies, the authors used the different recruit- if it will matter for implementation. The plumber
ment strategies to create variation in the type will share these observations with engineers and
of health promoters that were recruited. Once scientists (or with the engineer and scientist ver-
employed, they were all tasked with the same sion of themselves) and thus hopefully generate
responsibilities and faced the same incentives. new interesting and important questions to work
Thus, the experiments clearly identify selection,
controlling for incentive. Based on this design,
Ashraf, Bandiera, and Lee (2015) find health
8
For other examples of complex plumbing experiments
with many treatments that look specifically at mechanisms,
workers attracted by career incentives are much see Olken (2007); Alatas et al. (2012, 2016); Khan, Khwaja,
more effective at delivering health services, as and Olken (2016a); and Banerjee et al. (2012b).
on. Plumbing experiments shine the spotlight on p olicymakers, market participants, or research-
new problems, and new theories can be devel- ers before the research. In a working paper ver-
oped to address those problems. sion, Dur et al. (2013) report how their research
For example, Pathak (forthcoming, p. 32) was influenced by the policy debate and how the
concludes by reflecting on his many years of policy debate was influenced by their research,
helping cities design and implement school which ultimately led to the abandonment of the
assignment mechanisms, with which I opened “walk zone” in Boston on the grounds that the
this essay by saying: “Experiences from the field MIT-Boston College team had showed it made
highlight new issues that have not been the focus no difference, but it was perceived to unfairly
of this earlier literature such as nonconsequen- advantage some students. For a last example, in
tialist rationales for straightforward incentives, many cases, participants in school choice mech-
the importance of transparency and simplicity in anism have a different view than the scientists on
influencing designs, the value of single-offer sys- whether a particular school assignment system
tems over multiple-offer alternatives, the need can be “gamed” by being strategic about it. This
to streamline aftermarkets, and the importance is an example of a much more general problem,
of participation on the school and student side. where participants’ perceptions of the optimal
It is my hope that further interaction between strategy in a system may be quite different than
theory and practice will sharpen focus on these what the optimal strategy is. As scientists, we
and potentially other significant issues that have may have dismissed this, since agents are just
received relatively little attention from the theo- wrong. But as plumbers, it obviously matters,
retical literature.” At the end of the paper, Pathak since it will affect final outcomes. Motivated by
goes on to open yet another agenda on feedback these sorts of issues, Li (2016) actually tried to
between market-clearing algorithms, school address this issue formally by defining the con-
attributes, and student preferences. Those are, I cept of “obviously strategy proof.”
am guessing, difficult and interesting theoretical This ability of experiments to put the spot-
problems, but they would not have arisen with- light on new problems is important. Many new
out careful study of the property and behavior of fields in economics came about because some-
actual school choice mechanisms one decided that we could not continue to ignore
Indeed, the school choice literature has sev- the failure of assumptions that were staring us
eral examples of where questions that first arose in the face: informational economics, behavioral
as purely practical motivated further theoretical economics, and network economics all come to
work, advancing market design as a field. The mind. For this kind of innovation to happen, we
questions of how to break a tie in school assign- need to leave a space for surprising informa-
ment mechanisms originated as a practical question to come in. Of course, this requires being
tion that parents and school administrators were open to processing reality. Any field study, not
very worried about, and led to an active literature just plumbing experiments (and not just experi-
(see Pathak forthcoming for a review). Similarly, ments), can generate such insight. The plumbing
researchers had not originally paid much atten- studies may even have a particular handicap in
tion to a central feature of school assignment this regard because, since they focus on details,
in Boston, the “walk zone.” But when Mayor they may often be difficult to theorize, even
Menino promised that 50 percent of the stu- when they are important for policy (Banerjee
dents should go to a school they could walk to, 2002), beyond broad and rather simple conclu-
researchers started simulating what would hap- sions (“information matters,” “default matters,”
pen if the size of the “walk zone” was increased etc.). Plumbing experiments, since they are pri-
given the assignment mechanism that exists marily motivated by pragmatism, must focus on
in Boston, and they were surprised to see that what is important for the world, not necessarily
it did not lead to many more students assigned on the very subtle issues that theorists would
to their walk zone. This motivated theoretical find worth discussing.
work on what happens in an assignment system On the other hand, surprising information
where a student can qualify for a school on two is perhaps more interesting to theorists, and
separate lists. The results are actually not obvi- thus likely to prompt more thinking when it is
ous and are interesting theoretically. The basic generated in very high-stakes environments
mechanisms at play were not understood by and involving many people. The unexpected
b ehavior of a feature of a policy implemented programs as they are being deployed represents
at scale may motivate more interest than the a tremendous opportunity to generate evidence
result from a small experiment, however well and improve the effectiveness of money that is
controlled. Behavioral public finance is a good already being spent. Moreover, as we have seen,
example of such back-and-forth between theo- the lessons that are generated from these part-
retical insights and findings from the plumbing nerships are often actually acted upon, which
experiments. Madrian and Shea (2001) were means that gains from evaluation are not just
motivated by basic idea of procrastination to potential, they are actual.
look at the role of default options in retirement To take just one extreme example, the
savings. They found that employees enrolled in Affordable Care Act is a unique institution, and
the 401k by default were much more likely to perhaps a less-than-ideal result of a long and
stay enrolled in the long term than those who complicated political process. It is very unlikely
were not defaulted in, and this prompted a much that anybody designing a health insurance sys-
more systematic investigation of the preva- tem from scratch would adopt anything resem-
lence of the default effect in retirement savings. bling it, so in a sense there is very little external
The first wave of studies were empirical (see, validity to any work about the Affordable Care
e.g., Choi 2002, 2003; Choi et al. 2004, 2012; Act. Yet, when Handel, Kolstad, and others write
Chetty et al. 2014) and these, in turn, moti- papers that are concerned with detailed issues
vated theoretical work both on whether setting of implementation of a system like the ACA, it
defaults is desirable (e.g., Carroll et al. 2009) does not really occur to us to wonder whether
and more generally on how to set them, since this is perhaps a little too specific.
they had been shown to matter. The core theo- Similarly, most plumbing studies have
retical question that defaults raise is motivated first-order impacts. Getting the details right on
by welfare analysis: since economist-plumb- school choice directly affects the welfare of mil-
ers have demonstrated that defaults matter, the lions of school children. The work of Banerjee
economist-engineer wants to set the default at et al. (2015) on the Raskin program led to the
a “reasonable” place. What is the right reason- scale-up of cards to more than 60 million bene-
able rate? This is not an easy theoretical ques- ficiaries of public transfer programs. Obviously
tion to answer, since it is likely to depend both the scaled-up policy was adapted and slightly
on the underlying reasons why defaults matter different than what was tested in the experiment.
in the first place, and because welfare analysis If the effects of this scaled-up version were the
cannot rely on the standard tools when agents same as what they had estimated, they would
exhibit behavioral biases. Amador, Werning, translate in benefits anywhere between $80 mil-
and Angeletos (2006) start tackling the question lion and $170 million annually. The work of
of the optimal savings policy for an agent with Duflo et al. (2013) on environmental audits in
hyperbolic discounting preferences. Bernheim, Gujarat, which I discussed above, led to a reform
Fradkin, and Popov (2015) explicitly address of the regulation in the state, which has a pop-
the question of the optimal default policy, and ulation of 66 million. The evaluations that are
show both how this depends on the underlying described in Banerjee, Duflo, Imbert, Mathew,
behavioral model, and how some relatively gen- and Pande (2016) and Muralidharan, Niehaus,
eral results can nonetheless be obtained. and Sukhtankar (2016a) directly affected hun-
Last and perhaps most important, plumbing dreds of thousands of participants in the Indian
experiments, by their nature, are typically run on workfare program, and eliminated millions of
real policies and on a large scale. They are not dollars of leakage, just while the research was
just data points: they affect many real people and underway, in the areas they covered. These
(potentially) have real benefits as they are car- studies also played a part in the adoption of a
ried out. When economists work on understand- combination of those reforms as direct payment
ing how to design a policy that is going to affect to beneficiary from the center in August 2015.9
millions of people and cost millions of dollars,
the stakes are high enough that it is p erhaps less 9
Unusually, a briefing presented by the Ministry of
important to know whether the findings will Rural Development to the Cabinet meeting actually cited a
generalize beyond the particular work. Working research paper (Ministry of Rural Development 2015). “As
with governments to evaluate versions of these of today, 92 percent of Panchayats are brought onto e-FMS
This will affect the lives of 182 million benefi- Alatas, Vivi, Abhijit Banerjee, Rema Hanna,
ciaries, for a budget of $5.6 billion. Benjamin A. Olken, and Julia Tobias. 2012.
Many of us chose economics because, ulti- “Targeting the Poor: Evidence from a Field
mately, we thought science could be leveraged Experiment in Indonesia.” American Eco-
to make a positive change in the world. There nomic Review 102 (4): 1206–40.
are many different paths to get there. Scientists Almås, Ingvild, Alex Armand, Orazio Attanasio,
design general frames, engineers turn them into and Pedro Carneiro. 2015. “Measuring and
relevant machinery, and plumbers finally make Changing Control: Women’s Empowerment
them work in a complicated, messy policy envi- and Targeted Transfers.” National Bureau of
ronment. As a discipline, we are sometimes a Economic Research Working Paper 21717.
little overwhelmed by “physics envy,” searching Amador, Manuel, Iván Werning, and George-
for the ultimate scientific answer to all ques- Marios Angeletos. 2006. “Commitment vs.
tions—and this will lead us to question the legit- Flexibility.” Econometrica 74 (2): 365–96.
imacy of plumbing. This essay is an attempt to Ashraf, Nava, Oriana Bandiera, and Scott S. Lee.
argue that plumbing should be an inherent part 2015. “Do-Gooders and Go-Getters: Career
of our profession: we are well prepared for it, Incentives, Selection, and Performance in Pub-
reasonably good at it, and it is how we make lic Service Delivery.” Unpublished.
ourselves useful. A feature unique to econom- Banerjee, Abhijit V. 1997. “A Theory of Misgov-
ics is that scientists, engineers, and plumbers all ernance.” Quarterly Journal of Economics 112
talk to each other (and in fact are often talking to (4): 1289–332.
themselves—the same economist wearing dif- Banerjee, Abhijit V. 2002. “The Uses of Economic
ferent hats). This conversation should continue: Theory: Against a Purely Positive Interpreta-
it is what will keep us relevant and, possibly, tion of Theoretical Results.” Unpublished.
honest. Banerjee, Abhijit V. 2007. Making Aid Work.
Cambridge, MA: MIT Press.
References Banerjee, Abhijit, Rukmini Banerji, James
Berry, Esther Duflo, Harini Kannan, Shobhini
Abdulkadiroglu, Atila, and Tayfun Sönmez. Mukherji, Marc Shotland, and Michael Wal-
2003. “School Choice: A Mechanism Design ton. 2016. “From Proof of Concept to Scal-
Approach.” American Economic Review 93 able Policies: Challenges and Solutions, with
(3): 729–47. an Application.” National Bureau of Economic
Abdulkadiroglu, Atila, and Tayfun Sönmez. 2013. Research Working Paper 22931.
“Matching Markets: Theory and Practice.” In Banerjee, Abhijit, Raghabendra Chattopad-
Advances in Economics and Econometrics, hyay, Esther Duflo, Daniel Keniston, and Nina
Vol. 1, edited by D. Acemoglu, M. Arello, and Singh. 2012a. “Can Institutions be Reformed
E. Dekel, 3–47. Cambridge, MA: Cambridge from Within? Evidence from a Random-
University Press. ized Experiment with the Rajasthan Police.”
Agarwal, Sumit, Souphala Chomsisengphet, Neale Unpublished.
Mahoney, and Johannes Stroebel. 2015. “Regu- Banerjee, Abhijit, Raghabendra Chattopadhyay,
lating Consumer Financial Products: Evidence Esther Duflo, Daniel Keniston, and Nina Singh.
from Credit Cards.” Quarterly Journal of Eco- 2012b. “Improving Police Performance in
nomics 130 (1): 111–64. Rajasthan, India: Experimental Evidence on
Alatas, Vivi, Abhijit Banerjee, Rema Hanna, Ben- Incentives, Managerial Autonomy and Train-
jamin A. Olken, Ririn Purnamasari, and Mat- ing.” National Bureau of Economic Research
thew Wai-Poi. 2016. “Self-Targeting: Evidence Working Paper 17912.
from a Field Experiment in Indonesia.” Jour- Banerjee, Abhijit V., and Esther Duflo. 2009. “The
nal of Political Economy 124 (2): 371–427. Experimental Approach to Development Eco-
nomics.” Annual Review of Economics 1 (1):
151–78.
platform. The e-FMS has reduced the problem of parking Banerjee, Abhijit V., and Esther Duflo. 2012.
of funds and increased transparency. Studies on the effec-
tiveness of electronic fund flow in Bihar done by eminent Poor Economics: A Radical Rethinking of
economists (Banerjee et al. 2014) showed that e-governance the Way to Fight Global Poverty. New York:
reduced fund flow leakage by 25 percent.” PublicAffairs.
Banerjee, Abhijit V., Esther Duflo, and Rachel Chetty, Raj, John N. Friedman, Søren Leth-Pe-
Glennerster. 2008. “Putting a Band-Aid on tersen, Torben Helen Nielsen, and Tore Olsen.
a Corpse: Incentives for Nurses in the Indian 2014. “Active versus Passive Decisions and
Public Health Care System.” Journal of the Crowd-Out in Retirement Savings Accounts:
European Economic Association 6 (2–3): 487– Evidence from Denmark.” Quarterly Journal
500. of Economics 129 (3): 1141–219.
Banerjee, Abhijit, Esther Duflo, Clement Choi, James J. 2002. “Defined Contribution Pen-
Imbert, Santhosh Mathew, and Rohini Pande. sions: Plan Rules, Participant Choices, and the
2016. “E-governance, Accountability, and Path of Least Resistance.” In Tax Policy and
Leakage in Public Programs: Experimen- the Economy, Vol. 16, edited by James M.
tal Evidence from a Financial Management Poterba, 67–113. Cambridge, MA: MIT Press.
Reform in India.” National Bureau of Eco- Choi, James J. 2003. “Optimal Defaults.” Ameri-
nomic Research Working Paper 22803. can Economic Review 93 (2): 180–85.
Banerjee, Abhijit, Rema Hanna, Jordan C. Kyle, Choi, James J., John Beshears, David Laibson,
Benjamin A. Olken, and Sudarno Sumarto. and Brigitte C. Madrian. 2012. “Default Sticki-
2015. “The Power of Transparency: Informa- ness among Low-Income Individuals.” RAND
tion, Identification Cards and Food Subsidy Working Paper WR-926-SSA.
Programs in Indonesia.” National Bureau of Choi, James J., David Laibson, Brigitte C.
Economic Research Working Paper 20923. Madrian, and Andrew Metrick. 2004. “For
Baugus, Brian, and Feler Bose. 2015. “Sunset Leg- Better or Worse: Default Effects and 401(k)
islation in the States: Balancing the Legislature Savings Behavior.” In Perspectives on the Eco-
and the Executive.” https://www.mercatus.org/ nomics of Aging, edited by David A. Wise,
system/files/Baugus-Sunset-Legislation.pdf. 81–121. Chicago: University of Chicago Press.
Benhassine, Najy, Florencia Devoto, Esther Duflo, Cogley, Timothy, Riccardo Colacito, Lars Peter
Pascaline Dupas, and Victor Pouliquen. 2015. Hansen, and Thomas J. Sargent. 2008. “Robust-
“Turning a Shove into a Nudge? A ‘Labeled ness and U.S. Monetary Policy Experimenta-
Cash Transfer’ for Education.” American tion.” Journal of Money, Credit, and Banking
Economic Journal: Economic Policy 7 (3): 40 (8): 1599–623.
86–125. Congdon, William J., Jeffrey R. Kling, Jens Lud-
Bernheim, B. Douglas, Andrey Fradkin, and Igor wig, and Sendhil Mullainathan. 2017. “Social
Popov. 2015. “The Welfare Economics of Policy: Mechanism Experiments and Policy
Default Options in 401(k) Plans.” American Evaluations.” In Handbook of Field Experi-
Economic Review 105 (9): 2798–837. ments, Vol. 1, edited by Esther Duflo and Abhi-
Burwell, Sylvia, John Holdren, Alan Krueger, and jit Banerjee. Amsterdam: North Holland.
Cecilia Munoz. 2013. M-13-17 Memorandum Dal Bó, Ernesto, Frederico Finan, and Martin A.
to the Heads of Departments and Agencies: Rossi. 2013. “Strengthening State Capabilities:
Next Steps in the Evidence and Innovation The Role of Financial Incentives in the Call to
Agenda. Washington, DC: Executive Office of Public Service.” Quarterly Journal of Econom-
the President, Office of Management and Bud- ics 128 (3): 1169–218.
get. Deserranno, Erika. 2015. “Financial Incentives
Campbell, John Y. 2016. “Richard T. Ely Lecture: as Signals: Experimental Evidence from the
Restoring Rational Choice: The Challenge of Recruitment of Health Workers.” Unpublished.
Consumer Financial Regulation.” American Devoto, Florencia, Esther Duflo, Pascaline Dupas,
Economic Review 106 (5): 1–30. William Pariente, and Vincent Pons. 2012.
Carroll, Gabriel D., James J. Choi, David Laib- “Happiness on Tap: Piped Water Adoption in
son, Brigitte C. Madrian, and Andrew Metrick. Urban Morocco.” American Economic Jour-
2009. “Optimal Defaults and Active Deci- nal: Economic Policy 4 (4): 68–99.
sions.” Quarterly Journal of Economics 124 Duflo, Esther, Michael Greenstone, Rohini Pande,
(4): 1639–74. and Nicholas Ryan. 2013. “Truth-Telling by
Chetty, Raj. 2015. “Richard T. Ely Lecture: Third-Party Auditors and the Response of Pol-
Behavioral Economics and Public Policy: A luting Firms: Experimental Evidence from
Pragmatic Perspective.” American Economic India.” Quarterly Journal of Economics 128
Review 105 (5): 1–33. (4): 1499–545.
Dur, Umut M., Scott Duke Kominers, Parag Khan, Adnan Q., Asim I. Khwaja, and Benjamin
A. Pathak, and Tayfun Sönmez. 2013. “The A. Olken. 2016b. “Tax Farming Redux: Exper-
Demise of Walk Zones in Boston: Priorities imental Evidence on Performance Pay for Tax
vs. Precedence in School Choice.” National Collectors.” Quarterly Journal of Economics
Bureau of Economic Research Working Paper 131 (1): 219–71.
18981. Klemperer, Paul. 2002. “What Really Matters in
Executive Order 13563. 2011. “Improving Regu- Auction Design.” Journal of Economic Per-
lation and Regulatory Review.” Federal Regis- spectives 16 (1): 169–89.
ter 76: 3821–23. Kling, Jeffrey R., Sendhil Mullainathan, Eldar
Field, Erica, Rohini Pande, Natalia Rigol, Simone Shafir, Lee C. Vermeulen, and Marian V. Wro-
Schaner, and Charity Troyer Moore. 2016. “On bel. 2012. “Comparison Friction: Experimental
Her Account: Can Strengthening Women’s Evidence from Medicare Drug Plans.” Quar-
Financial Control Boost Female Labor Sup- terly Journal of Economics 127 (1): 199–235.
ply?” Unpublished. Li, Shengwu. 2016. “Obviously Strategy-Proof
Finan, Frederico, Benjamin A. Olken, and Rohini Mechanisms.” http://siepr.stanford.edu/sites/
Pande. 2015. “The Personnel Economics of the default/files/publications/16-015.pdf.
State.” National Bureau of Economic Research Ludwig, Jens, Jeffrey R. Kling, and Sendhil Mul-
Working Paper 21825. lainathan. 2011. “Mechanism Experiments
Gibbons, Robert. 2010. “Inside Organiza- and Policy Evaluations.” Journal of Economic
tions: Pricing, Politics, and Path Depen- Perspectives 25 (3): 17–38.
dence.” Annual Review of Economics 2 (1): Madrian, Brigitte C., and Dennis F. Shea. 2001.
337–65. “The Power of Suggestion: Inertia in 401(k)
Glennerster, Rachel. 2017. “The Practicalities of Participation and Savings Behavior.” Quar-
Running Randomized Evaluations: Partner- terly Journal of Economics 116 (4): 1149–87.
ships, Measurement, Ethics, and Transpar- Ministry of Rural Development (MoRD). 2015.
ency.” In Handbook of Field Experiments, Vol. “Note to Cabinet.” Government of India.
1, edited by Esther Duflo and Abhijit Banerjee. Moore, Quinn, Kevin Conway, and Bran-
Amsterdam: North Holland. don Kyler. 2012. “Direct Certification in the
Greenstone, Michael. 2009. “Toward a Culture National Lunch Program: State Implementa-
of Persistent Regulatory Experimentation and tion Progress School Year 2011–2012. Report
Evaluation.” In New Perspectives on Regu- to Congress: Summary.” Alexandria, VA: US
lation, edited by David Moss and John Cis- Department of Agriculture, Food and Nutrition
ternino, 111–25. Cambridge, MA: The Tobin Service, Office of Research and Analysis.
Project. Moore, Quinn, Kevin Conway, Brandon Kyler,
Handel, Benjamin R. 2013. “Adverse Selection and Andrew Gothro. 2013. “Direct Certifi-
and Inertia in Health Insurance Markets: When cation in the National Lunch Program: State
Nudging Hurts.” American Economic Review Implementation Progress, School Year 2012–
103 (7): 2643–82. 2013.” Report CN-13-DC. Alexandria, VA: US
Handel, Benjamin R., and Jonathan Kolstad. Department of Agriculture, Food and Nutrition
2015. “Getting the Most from Marketplaces: Service, Office of Research and Analysis.
Smart Policies on Health Insurance Choice.” Muralidharan, Karthik, and Paul Niehaus. 2016.
Brookings Hamilton Project Discussion Paper “Experimentation at Scale.” Unpublished.
2015–08. Muralidharan, Karthik, Paul Niehaus, and San-
Hanna, Rema, Sendhil Mullainathan, and Joshua dip Sukhtankar. 2016a. “Building State Capac-
Schwartzstein. 2014. “Learning through Notic- ity: Evidence from Biometric Smartcards in
ing: Theory and Evidence from a Field Exper- India.” American Economic Review 106 (10):
iment.” Quarterly Journal of Economics 129 2895–929.
(3): 1311–53. Muralidharan, Karthik, Paul Niehaus, and San-
Khan, Adnan Q., Asim I. Khwaja, and Benja- dip Sukhtankar. 2016b. “General Equilibrium
min A. Olken. 2016a. “Making Moves Mat- Effects of (Improving) Public Employment
ter: Experimental Evidence on Incentivizing Programs: Experimental Evidence from India.”
Bureaucrats through Performance-Based Post- Unpublished.
ings.” Unpublished. Olken, Benjamin A. 2007. “Monitoring
orruption: Evidence from a Field Experi-

C University Press.
ment in Indonesia.” Journal of Political Econ- Simester, Duncan. Forthcoming. “Field Exper-
omy 115 (2): 200–49. iments in Marketing.” In Handbook on Field
Pathak, Parag A. 2011. “The Mechanism Design Experiments, Vol. 1, edited by Esther Duflo
Approach to Student Assignment.” Annual and Abhijit Banerjee. Amsterdam: North Hol-
Review of Economics 3 (1): 513–36. land.
Pathak, Parag.Forthcoming. “What Really Mat- Stango, Victor, and Jonathan Zinman. 2014.
ters in Designing School Choice Mechanisms.” “Limited and Varying Consumer Attention:
In Advances in Economics and Economet- Evidence from Shocks to the Salience of
rics, 11th World Congress of the Econometric Bank Overdraft Fees.” Review of Financial
Society. Studies 27 (4): 990–1030.
Roth, Alvin E. 2002. “The Economist as Engi- Thaler, Richard H., and Cass R. Sunstein.
neer: Game Theory, Experimentation, and 2008. Nudge: Improving Decisions about
Computation as Tools for Design Econom- Health, Wealth, and Happiness. New Haven:
ics.” Econometrica 70 (4): 1341–78. Yale University Press.
Scott, James C. 1998. Seeing Like a State: How Willis, Lauren E. 2013. “When Nudges Fail:
Certain Schemes to Improve the Human Slippery Defaults.” University of Chicago
Condition Have Failed. New Haven: Yale
Law Review 80 (3): 1155–229.

4. Esther Duflo (2017)

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

4. Esther Duflo (2017)

Uploaded by

Copyright:

Available Formats

American Economic Review: Papers & Proceedings 2017, 107(5): 1–26

RICHARD T. ELY LECTURE

The Economist as Plumber †

The reason we like these buttons so much,

problem: how to transfer money from the center

Panel A Control plants Panel B Treatment plants

Figure 2. Audit Readings for Suspended Particulate Matter (SPM)

orruption: Evidence from a Field Experi-

You might also like

4. Esther Duflo (2017)

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

4. Esther Duflo (2017)

Uploaded by

Copyright:

Available Formats

American Economic Review: Papers & Proceedings 2017, 107(5): 1–26

RICHARD T. ELY LECTURE

The Economist as Plumber †

The reason we like these buttons so much,

problem: how to transfer money from the center

Panel A Control plants Panel B Treatment plants

Figure 2. Audit Readings for Suspended Particulate Matter (SPM)

­ orruption: Evidence from a Field Experi-

You might also like

orruption: Evidence from a Field Experi-