
Title: Structures Behind Numbers: Critically Examining the “Credibility Revolution” and “Evidence-based Policy”
Author(s): Vinayak Krishnan

Source: Economic and Political Weekly (Engage), Vol. 58, Issue No. 41, 14 Oct, 2023.

ISSN (Online): 2349-8846

Published by: Economic and Political Weekly (Engage)

Article URL: https://www.epw.in/engage/article/structures-behind-numbers-critically-examining

Author(s) Affiliation: Vinayak Krishnan (vinayak1994@gmail.com) is a PhD research scholar at the University of Sussex.

Articles published in EPW Engage are web exclusive.

Structures Behind Numbers: Critically Examining the “Credibility Revolution” and “Evidence-based Policy”
Vinayak Krishnan

Abstract: Statistical analysis has become a mainstay of contemporary social science research. This is particularly salient in the discipline of economics. While the mid-20th century saw theoretical work as the most fruitful form of research, shifts from the 1980s onward saw far greater attention being paid to empirical techniques and to applying the results of economics research to real-world policy problems. Most economists have welcomed this shift as a “Credibility Revolution” that has facilitated a more scientific and objective analysis of socio-economic phenomena. Moreover, much of this research is increasingly being used to analyse and effect policy change as well, giving rise to the much-vaunted practice of evidence-based policy. This article examines the historical antecedents of both these phenomena and the empirical foundations that form their bases. Further, it critically examines the premise of “scientific objectivity” that these methods of research promise. It argues that while statistical knowledge provides a range of insights to researchers and policymakers, the numbers and indicators that form the basis of this analysis are socially and politically constructed.

The 2019 Nobel Prize in Economics was awarded to Abhijit Banerjee, Esther Duflo, and
Michael Kremer “for their experimental approach to alleviating global poverty” (Royal Swedish
Academy of Sciences 2019). Following in quick succession, the 2021 prize was also conferred
on scholars who made methodological breakthroughs in the field. One half was awarded to
David Card for “empirical contributions to labour economics” and the other to Joshua Angrist
and Guido Imbens for “methodological contributions to the analysis of causal relationships” (Royal Swedish Academy of Sciences 2021). Both of these instances represent an interesting
moment in understanding the forms of knowledge that are receiving international recognition.
While empiricists have been honoured previously (among others, Daniel McFadden and James
Heckman received the prize in 2000 for their contributions to econometrics), the economics
Nobel Prize has been dominated by scholars who have furthered theoretical understanding in
various economic sub-disciplines. In this context, the (almost) consecutive cases of Nobel Prizes
being granted for methodological innovation mark an interesting historical event and merit
further examination.

The Credibility Revolution and the Subsequent Rise of Evidence-based Policy

The phenomenon of awarding Nobel Prizes for methodological innovations in economics research has been
closely associated with two important processes that have completely altered the landscape of
economics and allied social sciences. The first of these is what has widely come to be known
as the “credibility revolution” within the discipline of economics. Coined by economists Joshua
Angrist (one of the 2021 Nobel Prize Winners) and Jörn-Steffen Pischke in a paper published in
the Journal of Economic Perspectives in 2010, the term has come to define the raison d'etre for
much of economics research today.

Angrist and Pischke (2010) open their paper with a critique made by various well-known
economists during the 1980s about the lack of empirical rigour within the field. These scholars
had lamented the fact that economists of the time did not pay close attention to the quality
of data and econometric methods while conducting research. The authors then go on to argue
that contemporary economics research has effectively remedied this problem, and researchers
today pay far greater attention to empirical methods than was seen in earlier decades. This
change in approach, with an emphasis on strong research design and the use of scientific
techniques, is what is termed the “credibility revolution” in economics.

A crucial methodological innovation that has facilitated this revolution, according to Angrist and
Pischke (2010: 4), is the use of research designs that involve “random assignments.” The
foundational idea here is that the economic impact of a particular policy intervention or
politico-economic event, known as the “treatment” in the economics literature, cannot be
analysed through a simple comparison of those who received it and those who did not. Rather,
a causal connection can only be obtained when the treatment is given randomly to separate
groups of people. It would be instructive to understand this concept with an example.

Assume that researchers want to find out if a cash transfer programme implemented in a
particular country has led to improvements in health outcomes. A plain comparison of the
health indicators between groups of people who did and did not receive the cash transfer is not sufficient to determine its causal impact. This is primarily because of what economists refer
to as “selection bias.” Cash transfer schemes are normally availed of by people with lower incomes (oftentimes there is some threshold income above which individuals are not eligible for the scheme). Since low-income people are more likely to participate in
this scheme, it is possible that they will also possess worse-off health outcomes to start with as
compared to those who do not avail the scheme. Even if their health improves dramatically
after receiving income support, the average health indicators of this group (termed the
“treatment group”) may be lower than the average health indicators for the group that did not
receive assistance from the programme (termed the “control group”). This is because the
control group consists of people with high incomes who may already have better health
outcomes, as they are able to access better quality resources. Hence, we are in a situation
where people from lower income groups “select into the treatment” and therefore bias the
estimates from a general comparison.

Therefore, researchers need to remove selection bias to arrive at an accurate assessment of the
improvement in health outcomes due to the programme. In order to do this, it is necessary to
compare the difference between the health of people who did receive the cash transfer with
what their health levels would have been had they not received the cash transfer. This
situation, of what the outcome for treated individuals would have been had they not received
the treatment, is known as the “counterfactual.” The obvious problem in undertaking such an
evaluation is that the counterfactual, by definition, cannot be observed.
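
The problem can be stated compactly in the potential-outcomes notation that is standard in this literature (the notation is added here for clarity; it does not appear in the original article). Writing $Y_i(1)$ and $Y_i(0)$ for individual $i$'s health with and without the transfer, and $D_i$ for whether $i$ received it, a naive comparison of group averages decomposes as:

$$
E[Y_i(1) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]
= \underbrace{E[Y_i(1) - Y_i(0) \mid D_i = 1]}_{\text{causal effect on the treated}}
+ \underbrace{E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]}_{\text{selection bias}}
$$

In the cash-transfer example the selection-bias term is negative, since those who take up the transfer start from worse health, so the naive comparison understates the true effect and can even reverse its sign.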

To solve this problem, economists employ the concept of randomisation. Continuing with our
above example, the cash transfer is now randomly assigned to two groups of people. Unlike
the prior research design, which was handicapped by the fact that people with lower incomes
were more likely to avail of the programme (or had to because of some threshold income), in
this case the policy treatment is handed out in a random fashion and is independent of any
underlying characteristics of the individuals involved. In other words, all individuals, regardless
of their existing socio-economic position, are equally likely to receive the treatment. This
ensures that the baseline comparison is happening between equivalent groups. Further, the
control group in this case acts as a counterfactual; they represent what happens when an
equivalent group of people (not a richer set of people as in the original example) does not
receive the cash transfer. Hence, a comparison between health outcomes of these two groups
would yield an accurate estimate of the causal impact of the cash transfer programme on
health. Randomisation thus removes the problem of selection bias.
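
The logic can be seen in a small simulation. The sketch below (written in Python; every number is invented for illustration and drawn from no real programme) creates a population in which poorer individuals start with worse health, administers a transfer that genuinely raises health by five points, and compares a means-tested rollout with a randomised one.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Baseline: poorer individuals (higher poverty score) have worse health.
poverty = rng.uniform(0, 1, n)
health_baseline = 70 - 20 * poverty + rng.normal(0, 5, n)
true_effect = 5.0  # the transfer genuinely raises health by 5 points

# Means-tested rollout: only people above a poverty threshold receive the transfer.
treated_mt = poverty > 0.6
health_mt = health_baseline + true_effect * treated_mt
naive_gap = health_mt[treated_mt].mean() - health_mt[~treated_mt].mean()

# Randomised rollout: the same transfer assigned by coin flip,
# independent of any underlying characteristics.
treated_rct = rng.random(n) < 0.5
health_rct = health_baseline + true_effect * treated_rct
rct_gap = health_rct[treated_rct].mean() - health_rct[~treated_rct].mean()

print(f"True effect:             {true_effect:+.1f}")
print(f"Means-tested comparison: {naive_gap:+.1f}")  # biased by selection
print(f"Randomised comparison:   {rct_gap:+.1f}")    # close to the true effect
```

With these invented numbers the means-tested comparison actually comes out negative: the treated group's worse baseline health swamps the genuine five-point gain, which is precisely the selection-bias scenario described above, while the randomised comparison recovers the true effect.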

This principle of randomised assignment is based heavily on the research in medical science
where randomised clinical trials are conducted to estimate the effectiveness of drugs (Deaton
and Cartwright 2018). In economics research, randomisation is achieved in two ways. As
described in the above example, it could involve an actual experiment where the treatment,
which is usually some form of policy intervention, is randomly allocated to different groups and then the effects are studied. Such studies are known as randomised controlled trials (RCTs). The
winners of the 2019 Nobel Prize were pioneers of this method and have applied RCTs to a
wide variety of economic and political questions.

However, conducting a randomised experiment is an expensive and time-consuming endeavour, which is not feasible for studying all research questions. Therefore, the second method involves
exploiting randomness created through a pre-existing policy change or a sudden socio-political
event, such as the raising of minimum wages or an unexpected influx of migrants, to create
situations where treatment groups and their equivalent counterfactuals can be compared. This
allows researchers to estimate the causal impacts of such events on various economic
parameters of interest. This latter approach is known as the “quasi-experiment.” Quasi-experiments are similar to randomised experiments but, unlike RCTs, do not involve experiments actually conducted in the field. Instead, they rely on real-life occurrences, which are called “natural
experiments,” to create random assignments. There are different forms of quasi-experimental
techniques, with the most widely used ones in the economics literature being difference-in-
differences, instrumental variables, and regression discontinuity design. Each of these methods
utilises different empirical innovations to ensure that the control group represents an actual
counterfactual for the treatment group. The three winners of the 2021 Nobel Prize were at the
forefront of developing and applying such quasi-experimental research designs to important
questions in areas such as labour economics, health, and education.
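
As a concrete illustration of one of these designs, the sketch below implements a textbook two-period difference-in-differences estimate on invented data (the set-up loosely echoes the minimum-wage comparisons in this literature; all numbers are hypothetical). The control group's change over time stands in for the treated group's unobserved counterfactual.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000  # workers sampled per state per period

# Hypothetical set-up: a "treated" state raises its minimum wage; a
# neighbouring "control" state does not. All parameters are invented.
true_policy_effect = 2.0   # causal effect on the employment measure
common_trend = -1.5        # shock hitting both states between periods
state_gap = 4.0            # fixed pre-existing difference between states

treated_before = 20 + state_gap + rng.normal(0, 3, n)
treated_after  = 20 + state_gap + common_trend + true_policy_effect + rng.normal(0, 3, n)
control_before = 20 + rng.normal(0, 3, n)
control_after  = 20 + common_trend + rng.normal(0, 3, n)

# Difference-in-differences: each state's before/after change, then the
# difference between those changes. Differencing removes both the fixed
# state gap and the time trend common to the two states.
did = (treated_after.mean() - treated_before.mean()) \
    - (control_after.mean() - control_before.mean())

print(f"DiD estimate: {did:.2f}  (true effect: {true_policy_effect})")
```

The estimate is only as credible as the “parallel trends” assumption, that absent the policy the two states would have moved together, which is something the researcher asserts rather than observes.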

The use of randomisation, whether in the form of RCTs or quasi-experimental techniques, has
hugely contributed to the “credibility revolution” in economics. While RCTs began only in the
late 1990s and the early 2000s, quasi-experimental methods have been utilised extensively since
the 1980s in economics research. Moreover, the credibility revolution has now moved beyond
economics and sees widespread application in other social science disciplines. Both political
science and public policy research, particularly within American academic institutions, are
heavily quantitative and frequently employ either RCTs or quasi-experimental methods.

This research has also transitioned from being purely academic to influencing policy design.
Through applying randomisation techniques, the researchers claim that they can rigorously
“evaluate” the impact of various policy proposals and help decision-makers choose the most
appropriate intervention for a particular problem. This has given rise to the second major
phenomenon that has deeply influenced social science and development research over the last
few decades: that of “evidence-based policy.” According to this framework, only those policies
are to be implemented and scaled for which a statistical impact on a particular socio-economic
indicator (or group of indicators) can be clearly demonstrated. The most rigorous way to show
this is by conducting a field experiment in the form of an RCT and then implementing the
results of such trials through the state’s administrative machinery. A large number of
international development agencies (such as the World Bank) and policy consultancies are
primarily engaged in this work of evidence-based policy research today. Their clients largely tend to be national or sub-national governments in developing countries of South Asia or
Africa. Various governments, including the central and state governments in India, have
become increasingly receptive to these forms of research and include statistical inputs from such
organisations in their policy process. For example, J-PAL South Asia, one of the most well-known development research organisations in the world, has been engaged by the Government of Tamil Nadu “to institutionalise the use of evidence in its policy decisions.”

A Deeper Look at “Evidence” and the Statistical Analysis of Society

The credibility revolution, as discussed by Angrist and Pischke (2010), has had an incredible
impact on economics and allied disciplines. Researchers now pay immense attention to data
quality and the econometric methods used to analyse such data. In addition, there is a strong
focus on applying economics research to real-world policy problems. As economic historians
Roger Backhouse and Beatrice Cherrier (2017: 7) have argued, the contours of the economics
discipline have changed from one where “being a theorist was the most prestigious activity for
an economist to engage in” to one “in which economists take pride in being applied, whether
applied theorists or empirical economists who tackle problems of policy.” This profound shift
has been the most prevalent in the field of development economics and has subsequently
spilled over to related social sciences such as political science and public policy.

There can be no argument that this shift has led to an explosion in new forms of knowledge.
The myriad papers and research reports that use causal inference techniques have generated a
great number of novel insights for both social scientists and policymakers. Yet, this
development needs to be critically analysed. The credibility revolution and evidence-based policy have led to a situation where statistical data is automatically viewed as representing an
unbiased and objective portrayal of socio-economic reality. This is a problematic and oftentimes
incorrect point of view. Anthropologist Sally Merry (2011), in a provocative article titled
“Measuring the World,” argues that while statistical indicators have a sense of scientific
objectivity attached to them, they “typically conceal their political and theoretical origins and
underlying theories of social change and activism.” Statistical analysis requires converting social,
economic, and political phenomena into numbers. Merry’s argument essentially seeks to
highlight that this conversion process involves assumptions that are inherently political and
ideological. This applies both to field experiments (RCTs) and to studies that use quasi-experimental methods, and hence the results of such studies must be scrutinised more closely.

Let us begin with RCTs, which have become the gold standard of empirical research in social
science. Studies using RCTs have been critiqued on various grounds, often by fellow
economists and researchers themselves. As Jean Drèze (2019) notes, RCTs assume a very
technocratic and scientific approach to policy formulation: similar to a lab experiment, an RCT is designed to find a specific policy “fix” to socio-economic problems. However, policy
decisions often involve questions of redistribution and power that are intrinsically political and
do not lend themselves to solely technocratic solutions. RCTs have been criticised for ignoring
these realities. Drèze (2019), for instance, argues that policy decisions ultimately involve value
judgements “that no RCT, or for that matter no evidence, can settle on its own.” In a similar
vein, Angus Deaton and Nancy Cartwright (2018: 10) state that “the widespread and largely
uncritical belief that RCTs give the right answer permits them to be used as dispute-
reconciliation mechanisms for resolving political conflicts.”

In addition to RCTs, it is also necessary to analyse quasi-experimental methods, which rely on natural randomisation caused by external events, with a critical lens. There is an assumption
that quasi-experimental methods, because they are less interventionist and involve large
numbers of data points, have more applicability and accuracy than the results of RCT studies.
However, this need not always be the case. A case in point is the influential study on labour
regulation by Timothy Besley and Robin Burgess in 2004. Besley and Burgess, both faculty members at the London School of Economics and Political Science's Department of Economics,
published a paper that sought to understand the impact of labour regulation on output,
employment, investment, and productivity. Using data from the National Sample Survey, the Annual Survey of Industries, and an instrumental variable methodology (one of the three
frequently used quasi-experimental methods mentioned above), they find that “pro-worker
labour regulation resulted in lower output, employment, investment and productivity in the
formal manufacturing sector” (Besley and Burgess 2004: 92). The paper essentially argues that
regulations that seek to protect the welfare of workers raise the cost of doing business for firms
and employers, which then leads to poor economic outcomes in terms of employment, output,
and investment. This is evident from the concluding paragraphs of the paper where the authors
state the following: “Our finding that regulating in a pro-worker direction was associated with
increases in urban poverty are particularly striking as they suggest that attempts to redress the
balance of power between capital and labour can end up hurting the poor.”

It is instructive to carry out a deeper analysis of Besley and Burgess's econometric methodology. As mentioned earlier, statistical analysis requires the translation of abstract
concepts into some kind of numerical metric. In the case of Besley and Burgess (2004), the
theoretical variable of “labour regulation” had to be converted into a measurable indicator to
statistically analyse how it affects the other outcomes mentioned above. To do this, they create
an index based on amendments to the Industrial Disputes (ID) Act, the key legislation that
governs industrial relations in India. Labour laws can be amended by both Parliament and state
assemblies as they belong to the Concurrent List of the Constitution. To build their numerical
index of labour regulation, the two authors analyse individual state-level amendments made to
the ID Act and “code” them as “neutral, pro-worker or pro-employer” (Besley and Burgess
2004: 98). These are respectively coded as +1 (for a pro-worker amendment), 0 (for a neutral
amendment), and −1 (for a pro-employer amendment). These scores are then aggregated over years “to give a quantitative picture of how the regulatory environment evolved over time”
(Besley and Burgess 2004: 98). Once this numerical index of labour regulation is created, outcomes such as employment and output are regressed on it. A negative causal relationship is reported, which means that higher labour regulation (as measured by this index) is found to have caused lower levels of employment and manufacturing output.
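
To make the mechanics concrete, the sketch below reproduces the structure of such an index on invented codings (the scores are illustrative placeholders, not Besley and Burgess's actual classifications or data). Each state-year amendment is scored +1, 0, or −1, the scores are cumulated over years, and the cumulated index then enters a regression.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical codings of state-level ID Act amendments, one score per
# year: +1 pro-worker, 0 neutral, -1 pro-employer. Invented for
# illustration; not Besley and Burgess's actual classifications.
amendments = {
    "State A": [0, +1, 0, +1, 0, 0, +1],
    "State B": [0, 0, -1, 0, -1, 0, 0],
}

# Cumulating the scores over years yields the regulatory index.
index = {state: np.cumsum(scores) for state, scores in amendments.items()}
print(index["State A"])  # [0 1 1 2 2 2 3]        -> read as increasingly pro-worker
print(index["State B"])  # [ 0  0 -1 -1 -2 -2 -2] -> read as increasingly pro-employer

# An outcome (say, log manufacturing output) is then regressed on the
# index. A minimal one-regressor OLS version with invented outcome data:
x = np.concatenate([index["State A"], index["State B"]]).astype(float)
y = 4.6 - 0.05 * x + rng.normal(0, 0.1, x.size)
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"Estimated coefficient on the index: {beta[1]:.3f}")
```

Nothing in this arithmetic constrains the coding step itself: whether a given amendment is a +1 or a −1, and whether such scores can meaningfully be summed across very different legal changes, are exactly the judgements that the critiques discussed below call into question.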

The paper by Besley and Burgess has subsequently been heavily critiqued on methodological
grounds. Scholars have pointed out that their regulatory index is constructed only using the
Industrial Disputes Act, while ignoring a whole set of other labour laws that comprise the
regulatory landscape of labour in India (Storm 2019). Moreover, as Aditya Bhattacharjea’s
(2006) comprehensive rebuttal to Besley and Burgess shows, their numerical index for labour
regulation is based on various flawed assumptions. Having reviewed each of the legal
amendments that Besley and Burgess analyse and code into their index, Bhattacharjea (2006: 15)
finds various instances of “inappropriate classification of individual amendments, summary
coding of incommensurable changes as +1 or −1, and misleading cumulation over time.” The
critique about “incommensurability” is an important one. Statistical analysis requires the
construction of numerical indicators, whose values can be placed on an ordered scale (such as an index measuring “higher and lower” labour
regulation). Bhattacharjea argues that in this case, such an indicator cannot be created because
various amendments to the ID Act are fundamentally different from each other and thus
cannot be numerically compared to each other on a common scale. Given all these problems
with their index, it seems quite clear that the empirical foundations for Besley and Burgess's core claim, that labour regulation causes poor economic outcomes, are built on incredibly shaky ground.

Further, the paper and its subsequent critiques reveal something more striking about the
process of statistically analysing society. Merry’s argument, mentioned previously, is that
statistical work on society presents a veneer of objectivity and scientific inquiry, despite being
based on underlying ideological predilections about society itself. This is completely evident in
the case of Besley and Burgess (2004). Their index of labour regulation, which is core to their
statistical framework, is entirely based on value judgements and subjective opinions. They
arbitrarily convert complex legal changes, all of which were carried out in specific social and
political contexts over decades, into simple numerical forms (+1, 0, and −1) that entirely hide
all of these nuances. Moreover, many of these value judgements are clearly linked to a
neoclassical ideological framework in which the market mechanism for allocating returns to
labour and capital is considered the most efficient, and any intervention by the state must be
minimal. Yet, despite these strong judgements and ideological opinions, the statistical results of
the paper are interpreted as objective and representing a true economic reality. The political
values that remain foundational to the statistical analysis are forgotten. These arbitrary decisions,
however, need to be taken into account to arrive at a more critical understanding of socio-
economic statistics and the inferences that can be drawn from their analysis.

Economists would defend themselves by arguing that Besley and Burgess (2004) represents an
outlier and that subsequent research has comprehensively debunked their results about labour
regulation. The paper has been followed by a whole range of studies that clearly show, through
various statistical techniques, that greater labour regulation does not cause reduced employment
and output (Bhattacharjea 2006; Karak and Basu 2019; Sood and Nath 2020). This is, however,
not an acceptable position. Despite its fundamental flaws, the results of the paper continue to
be cited in various influential policy publications. As recently as 2019, 15 years after Besley and Burgess's questionable findings, the Government of India stated in its Economic Survey that restrictive labour laws caused firms to remain small and hire fewer workers, worsening unemployment (Ministry of Finance 2019). Moreover, based on this flawed analysis, the
central government and various state governments have gone on to make legislative
amendments to labour laws, which include diluting a significant number of safeguards for
workers (Ministry of Finance 2021). This policy of deregulation will have serious material consequences for millions of India's workers (Storm 2019). The policy life of Besley and
Burgess (2004) is thus unconnected with its academic credentials. Although it was thoroughly
critiqued, the paper precipitated a range of policies that can potentially hurt India’s working
class. The real-world impacts of Besley and Burgess's research thus go well beyond the confines of academia and cannot be dismissed as simply a case of poorly conducted research.

This entire saga also illustrates the politics of “evidence-based policy.” Even
though the results of Besley and Burgess (2004) were strongly contested, decision makers went
ahead and formulated policy on the basis of this highly ambiguous “evidence.” Notwithstanding
their claim to objectivity, policymakers and research organisations that work with them (mostly
staffed by individuals with PhDs in economics who most certainly had the technical expertise to
understand the pitfalls in Besley and Burgess's research) chose selective evidence that was convenient for them at a given point of time. How else does one explain the continued policy relevance of a flawed paper like Besley and Burgess (2004), even though there exists equally (if not more) empirically rigorous evidence from heterodox economists showing that labour regulation is not associated with significantly lower employment and output?

Conclusions
There cannot be any doubt that the credibility revolution in economics and the broader social sciences has provided useful knowledge on a variety of socio-economic questions. The focus on
high-quality statistical data and methodological rigour has had major implications for social
scientific analysis. However, an incessant reliance on the objectivity of statistical information on
socio-economic phenomena needs to be questioned. Statistical tools are necessary to understand
society, but they are not the only legitimate forms of social knowledge. Non-statistical
information in the form of qualitative interviews or ethnographic evidence can generate as
much insight about the social, political, and economic realities as numerical indicators can.
Numbers about society are ultimately connected to the structures of political economy and social stratification, as well as to the ideological frameworks that underlie social life itself. These
structures behind numbers must be taken into account when undertaking and interpreting
statistical analysis of socio-economic events.

Given these realities, the “evidence” that goes into “evidence-based policy” needs to be viewed with more caution than is currently the case. As the Besley and Burgess example
highlights, state officials and organisations working with them (international development
agencies and private policy consultancies) are all operating within a hierarchy of power
relations. In such a situation, particular forms of evidence that are critical of these hierarchies
often fall by the wayside. Moreover, “evidence” cannot be restricted entirely to statistical data
and analyses, particularly those that conform to a neoclassical understanding of political
economy. Other forms of knowledge, based on differing ideological frameworks, need to enter
public policy discourse. Only then can there be wider dialogues and policy be truly formulated
on the basis of evidence that represents actual economic realities.

References:

Angrist, Joshua D and Jörn-Steffen Pischke (2010): “The Credibility Revolution in Empirical Economics: How Better Research Design is Taking the Con out of Econometrics,” Journal of Economic Perspectives, Vol 24, No 2, pp 3-30, https://doi.org/10.1257/jep.24.2.3.

Backhouse, Roger and Beatrice Cherrier (2017): “The Age of the Applied Economist: The
Transformation of Economics since the 1970s,” History of Political Economy, Vol 49, pp
1-33, https://doi.org/10.1215/00182702-4166239.

Besley, Timothy and Robin Burgess (2004): “Can Labour Regulation Hinder Economic
Performance? Evidence from India,” The Quarterly Journal of Economics, Vol 119, No 1,
pp 91-134, https://doi.org/10.1162/003355304772839533.

Bhattacharjea, Aditya (2006): “Labour Market Regulation and Industrial Performance in India: A Critical Review of the Empirical Evidence,” Working Paper No 141, Centre for Development Economics, Delhi School of Economics, Delhi, http://www.cdedse.org/pdf/work141.pdf.

Deaton, Angus and Nancy Cartwright (2018): “Understanding and Misunderstanding Randomized Controlled Trials,” Social Science & Medicine, Vol 210, pp 2-21, https://doi.org/10.1016/j.socscimed.2017.12.005.

Drèze, Jean (2019): “Some Questions Around the Use of ‘Evidence-Based’ Policy,” The Wire, 15 October, https://thewire.in/economy/some-questions-around-the-use-of-evidence-based-policy.

Karak, Anirban and Deepankar Basu (2019): “Profitability or Industrial Relations: What
Explains Manufacturing Performance across Indian States?” Development and Change,
Vol 51, No 3, pp 817-42, https://doi.org/10.1111/dech.12493.

Merry, Sally Engle (2011): “Measuring the World: Indicators, Human Rights and Global
Governance,” Current Anthropology, Vol 52, No S3, pp S83-S95.

Ministry of Finance (2019): “Economic Survey 2018-19 Volume 1,” Government of India,
New Delhi.

Ministry of Finance (2021): “Economic Survey 2020-21 Volume 1,” Government of India,
New Delhi.

Royal Swedish Academy of Sciences (2019): “The Prize in Economic Sciences 2019,” 14 October, https://www.nobelprize.org/uploads/2019/10/press-economicsciences2019-2.pdf.

Royal Swedish Academy of Sciences (2021): “The Prize in Economic Sciences 2021,” 11 October, https://www.nobelprize.org/uploads/2021/10/press-economicsciencesprize2021-2.pdf.

Sood, Atul and Paaritosh Nath (2020): “Labour Law Changes: Innocuous Mistakes or Sleight of Hand?” Economic & Political Weekly, Vol 55, No 22, pp 33-37.

Storm, Servaas (2019): “The Bogus Paper that Gutted Workers’ Rights,” Institute for New Economic Thinking, 6 February, https://www.ineteconomics.org/perspectives/blog/the-bogus-paper-that-gutted-workers-rights.
