https://doi.org/10.1257/jel.20191450
392 Journal of Economic Literature, Vol. LIX (June 2021)
In a recent Journal of Economic Perspectives survey on the econometrics of policy evaluation, Susan Athey and Guido Imbens describe synthetic controls as “arguably the most important innovation in the policy evaluation literature in the last 15 years” (Athey and Imbens 2017). In the last few years, synthetic controls have been applied to study the effects of right-to-carry laws (Donohue, Aneja, and Weber 2019), legalized prostitution (Cunningham and Shah 2018), immigration policy (Bohn, Lofstrom, and Raphael 2014), corporate political connections (Acemoglu et al. 2016), taxation (Kleven, Landais, and Saez 2013), organized crime (Pinotti 2015), and many other key policy issues. They have also been adopted as the main tool for data analysis across different sides of the issues in recent prominent debates on the effects of immigration (Borjas 2017, Peri and Yasenov 2019) and minimum wages (Allegretto et al. 2017, Jardim et al. 2017, Neumark and Wascher 2017, Reich et al. 2017). Synthetic controls are also applied outside economics: in the social sciences, biomedical disciplines, engineering, etc. (see, e.g., Heersink, Peterson, and Jenkins 2017; Pieters et al. 2017). Outside academia, synthetic controls have found considerable coverage in the popular press (see, e.g., Guo 2015, Douglas 2018) and have been widely adopted by multilateral organizations, think tanks, business analytics units, governmental agencies, and consulting firms. For example, the synthetic control method plays a prominent role in the official evaluation of the effects of the massive Bill & Melinda Gates Foundation’s Intensive Partnerships for Effective Teaching program (Gutierrez, Weinberger, and Engberg 2016).

characterizing the practical settings where synthetic controls may be useful and those where they may fail.

Section 2 briefly introduces the ideas behind the synthetic control methodology in the context of comparative case studies. Section 3 discusses some of the formal aspects of the synthetic control methodology that are of particular interest for empirical applications. Readers who are already familiar with the synthetic control methodology may only need to read subsections 3.3 to 3.5 in detail, and skim through section 2 and the rest of section 3 in order to acquaint themselves with terms and notation that will be employed in later sections. Sections 4 through 6 comprise the core of the article. Section 4 discusses the practical advantages of synthetic control estimators. Sections 5 and 6 discuss contextual and data requirements for synthetic control empirical studies. I discuss the validity of these requirements in applied settings and potential ways to adapt the research design when the requirements do not hold in practice. Section 7 describes robustness and diagnostic checks to evaluate the credibility of a synthetic control counterfactual and to measure the extent to which results are sensitive to changes in the study design. Section 8 discusses extensions and recent proposals. The final section contains conclusions and describes open areas for research on synthetic controls.

2. A Primer on Synthetic Control Estimators

Synthetic control methods were originally proposed in Abadie and Gardeazabal (2003) and Abadie, Diamond, and Hainmueller (2010) with the aim to estimate the effects of aggregate interventions, that is, interventions that are implemented at an aggregate level affecting a small number of large units (such as cities, regions, or countries), on some aggregate outcome of interest. More recently, synthetic control methods have been applied to settings with a large number of units.¹ We will discuss this and other extensions in section 8.

¹ See, for example, Acemoglu et al. (2016), Kreif et al. (2016), Abadie and L’Hour (2019), and Dube and Zipperer (2015).
Abadie: Using Synthetic Controls 393
Consider a setting where one aggregate unit, such as a state or a school district, is exposed to an event or intervention of interest. For example, Abadie, Diamond, and Hainmueller (2010) study the effect of a large tobacco control program adopted in California in 1988, and Bifulco, Rubenstein, and Sohn (2017) evaluate the effects of an educational program adopted in the Syracuse, New York, school district in 2008. In accordance with the program evaluation literature in economics, the terms “treated” and “untreated” will refer to units exposed and not exposed to the event or intervention of interest, respectively. I will use the terms “event,” “intervention,” and “treatment” interchangeably. Traditional regression analysis techniques require large samples and many observed instances of the event or intervention of interest and, as a result, they are often ill-suited to estimate the effects of infrequent events, such as policy interventions, on aggregate units. Economists have approached the estimation of the effects of large-scale but infrequent interventions using time-series analysis and comparative case studies. Single-unit time-series analysis is an effective tool to study the short-term effects of policy interventions in cases when we expect short-term effects to be of a substantial magnitude.² However, the use of time-series techniques to estimate medium and long-term effects of policy intervention is complicated by the presence of shocks to the outcome of interest, aside from the effect of the intervention. Comparative case studies are based on the idea that the effect of an intervention can be inferred by comparing the evolution of the outcome variables of interest between the unit exposed to treatment and a group of units that are similar to the exposed unit but were not affected by the treatment. This can be achieved whenever the evolution of the outcomes for the unit affected by the intervention and the comparison units is driven by common factors that induce a substantial amount of co-movement.

Comparative case studies have long been applied to the evaluation of large-scale events or aggregate interventions. For example, to estimate the effects of the massive arrival of Cuban expatriates to Miami during the 1980 Mariel boatlift on native unemployment in Miami, Card (1990) compares the evolution of native unemployment in Miami at the time of the boatlift to the average evolution of native unemployment in four other cities in the United States. Similarly, Card and Krueger (1994) use Pennsylvania as a comparison to estimate the effects of an increase in the New Jersey minimum wage on employment in fast food restaurants in New Jersey. A drawback of comparative case studies of this type is that the selection of the comparison units is not formalized and often relies on informal statements of affinity between the units affected by the event or intervention of interest and a set of comparison units. Moreover, when the units of observation are a small number of aggregate entities, like countries or regions, no single unit alone may provide a good comparison for the unit affected by the intervention.

The synthetic control method is based on the idea that, when the units of observation are a small number of aggregate entities, a combination of unaffected units often provides a more appropriate comparison than any single unaffected unit alone. The synthetic control methodology formalizes the selection of the comparison units using a data driven procedure. As we will discuss later, this formalization also opens the door to a mode of quantitative inference for comparative case studies.

² The literature on “interrupted time-series” is particularly relevant in the context of policy evaluation. See, for example, Cook and Campbell (1979), which discusses the limitations of this methodology if interventions are gradual rather than abrupt and/or if the causal effect of an intervention is delayed in time. Interrupted time-series methods are closely related to regression-discontinuity design techniques (see, e.g., Thistlethwaite and Campbell 1960).
3. Formal Aspects of the Synthetic Control Method

Suppose that we obtain data for $J + 1$ units: $j = 1, 2, \ldots, J + 1$. Without loss of generality, we assume that the first unit ($j = 1$) is the treated unit, that is, the unit affected by the policy intervention of interest.³ The “donor pool,” that is, the set of potential comparisons, $j = 2, \ldots, J + 1$, is a collection of untreated units not affected by the intervention. We assume also that our data span $T$ periods and that the first $T_0$ periods are before the intervention. For each unit, $j$, and time, $t$, we observe the outcome of interest, $Y_{jt}$. For each unit, $j$, we also observe a set of $k$ predictors of the outcome, $X_{1j}, \ldots, X_{kj}$, which may include pre-intervention values of $Y_{jt}$ and which are themselves unaffected by the intervention. The $k \times 1$ vectors $X_1, \ldots, X_{J+1}$ contain the values of the predictors for units $j = 1, \ldots, J + 1$, respectively. The $k \times J$ matrix, $X_0 = [X_2 \cdots X_{J+1}]$, collects the values of the predictors for the $J$ untreated units. For each unit, $j$, and time period, $t$, we will define $Y^N_{jt}$ to be the potential response without intervention. For the unit affected by the intervention, $j = 1$, and a post-intervention period, $t > T_0$, we will define $Y^I_{1t}$ to be the potential response under the intervention.⁴ Then, the effect of the intervention of interest for the affected unit in period $t$ (with $t > T_0$) is:

$$\tau_{1t} = Y^I_{1t} - Y^N_{1t}. \tag{1}$$

Because unit “one” is exposed to the intervention after period $T_0$, it follows that for $t > T_0$ we have $Y_{1t} = Y^I_{1t}$. Simply put, for the unit affected by the intervention and a post-intervention period we observe the potential outcome under the intervention. The great policy evaluation challenge is to estimate $Y^N_{1t}$ for $t > T_0$: how the outcome of interest would have evolved for the affected unit in the absence of the intervention. This is a counterfactual outcome, as the affected unit was, by definition, exposed to the intervention of interest after $t = T_0$. As equation (1) makes clear, given that $Y^I_{1t}$ is observed, the problem of estimating the effect of a policy intervention is equivalent to the problem of estimating $Y^N_{1t}$. Notice also that equation (1) allows the effect of the intervention to change over time. This is crucial because intervention effects may not be instantaneous and may accumulate or dissipate as time after the intervention passes.

3.2 Estimation

Comparative case studies aim to reproduce $Y^N_{1t}$, that is, the value of the outcome variable that would have been observed for the affected unit in the absence of the intervention, using one unaffected unit or a small number of unaffected units that have similar characteristics as the affected unit at the time of the intervention. When the data consist of a few aggregate entities, such as regions or countries, it is often difficult to find a single unaffected unit that provides a suitable comparison for the unit affected by the policy intervention of interest. As mentioned above, the synthetic control method is based on the observation that a combination of units in the donor pool may approximate the characteristics of the

³ The synthetic control framework can easily accommodate estimation with multiple treated units by fitting separate synthetic controls for each of the treated units. In practice, however, estimation with several treated units may carry some practical complications that are discussed in section 8.

⁴ $Y^I_{1t}$ and $Y^N_{jt}$ are the potential outcomes of Rubin’s model for causal inference (see, e.g., Rubin 1974, Holland 1986). To simplify notation, I exclude the start time of the intervention from the notation for $Y^I_{1t}$. Notice, however, that the value of $Y^I_{1t}$ depends in general not only on when the intervention starts, but also other features of the intervention that are fixed in our analysis and, therefore, excluded from the notation.
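To make the notation of this section concrete, the following is a minimal sketch on simulated data. The dimensions, the random draws, and the constant effect of 2.0 are assumptions chosen purely for illustration; they do not come from any application discussed in the article. Outcomes are stored as a $(J+1) \times T$ array with the treated unit in the first row, and the effect in equation (1) can be computed here only because the simulation, unlike real data, reveals $Y^N_{1t}$ after $T_0$.

```python
import numpy as np

rng = np.random.default_rng(0)
J, T, T0, k = 5, 10, 6, 3   # hypothetical sizes: J donors, T periods, T0 pre-periods, k predictors

# Potential outcomes without intervention, Y^N_jt, for units j = 1..J+1 (row 0 is the treated unit).
Y_N = rng.normal(size=(J + 1, T)).cumsum(axis=1)

# Potential outcome under the intervention for unit 1: an assumed constant effect after T0.
true_effect = 2.0
Y_I_1 = Y_N[0].copy()
Y_I_1[T0:] += true_effect

# Observed outcomes: unit 1 reveals Y^I after T0; every other cell reveals Y^N.
Y = Y_N.copy()
Y[0] = Y_I_1

# Predictors: X1 holds the k predictors of the treated unit, X0 is k x J for the donor pool.
X = rng.normal(size=(k, J + 1))
X1, X0 = X[:, 0], X[:, 1:]

# Equation (1): tau_1t = Y^I_1t - Y^N_1t for t > T0 (known here only because Y_N is simulated).
tau = Y[0, T0:] - Y_N[0, T0:]
```

In an actual study the array `Y_N[0, T0:]` is exactly what is unobserved, which is why the rest of the section is devoted to estimating it.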
$$\|X_1 - X_0 W\| = \left( \sum_{h=1}^{k} v_h \left( X_{h1} - w_2 X_{h2} - \cdots - w_{J+1} X_{hJ+1} \right)^2 \right)^{1/2} \tag{7}$$

subject to the restriction that $w_2, \ldots, w_{J+1}$ are nonnegative and sum to one.⁶ Then, the estimated treatment effect for the treated unit at time $t = T_0 + 1, \ldots, T$ is

$$\hat{\tau}_{1t} = Y_{1t} - \sum_{j=2}^{J+1} w_j^{*} Y_{jt}. \tag{8}$$

The positive constants $v_1, \ldots, v_k$ in (7) reflect the relative importance of the synthetic control reproducing the values of each of the $k$ predictors for the treated unit, $X_{11}, \ldots, X_{k1}$. For a given set of weights, $v_1, \ldots, v_k$, minimizing equation (7) can be easily accomplished using constrained quadratic optimization. That is, each potential choice of $V = (v_1, \ldots, v_k)$ produces a synthetic control, $W(V) = (w_2(V), \ldots, w_{J+1}(V))'$, which

for some set $\mathcal{T}_0 \subseteq \{1, 2, \ldots, T_0\}$ of pre-intervention periods. Abadie, Diamond, and Hainmueller (2015) propose a related method to choose $v_1, \ldots, v_k$ via out-of-sample validation. The ideas behind out-of-sample validation selection of $v_1, \ldots, v_k$ are described next. The goal of the synthetic control is to approximate the trajectory that would have been observed for $Y_{1t}$ and $t > T_0$ in the absence of the intervention. For that purpose, the synthetic control method selects a set of weights $W$ such that the resulting synthetic control resembles the affected unit before the intervention along the values of the variables $X_{11}, \ldots, X_{k1}$. The question of choosing $V = (v_1, \ldots, v_k)$ boils down to assessing the relative importance of each of $X_{11}, \ldots, X_{k1}$ as a predictor of $Y^N_{1t}$. That is, the value $v_h$ aims to reflect the relative importance of approximating the value of $X_{h1}$ for predicting $Y^N_{1t}$ in the post-intervention period, $t = T_0 + 1, \ldots, T$. Because $Y^N_{1t}$ is not observed for $t = T_0 + 1, \ldots, T$, we cannot directly evaluate the relative importance of fitting each predictor to approximate $Y^N_{1t}$ in the post-intervention period. However, $Y^N_{1t}$ is observed for the pre-intervention periods $t = 1, 2, \ldots, T_0$, so it is possible to use pre-intervention data to assess the predictive

⁶ For the sake of expositional simplicity, I discuss only the normalized Euclidean norm in equation (7). Of course, other norms are possible. Also, to avoid notational clutter, dependence of the norm in equation (7) from the weights $v_1, \ldots, v_k$ is left implicit in the notation.
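The constrained minimization of equation (7) for a given $V$ can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the cited papers: it minimizes the squared norm (which has the same minimizer) by projected gradient descent onto the simplex, a simple stand-in for a dedicated quadratic programming solver. The function names `project_to_simplex` and `synth_weights` and the toy data are hypothetical.

```python
import numpy as np

def project_to_simplex(u):
    """Euclidean projection of u onto {w : w >= 0, sum(w) = 1}."""
    s = np.sort(u)[::-1]                      # entries in decreasing order
    css = np.cumsum(s) - 1.0
    idx = np.arange(1, len(u) + 1)
    rho = np.nonzero(s - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(u - theta, 0.0)

def synth_weights(X1, X0, v, n_iter=2000):
    """Minimize sum_h v_h (X1_h - (X0 w)_h)^2 over nonnegative w that sum to one."""
    A = X0 * np.sqrt(v)[:, None]              # rows scaled by sqrt(v_h)
    b = X1 * np.sqrt(v)
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)   # 1 / Lipschitz constant of the gradient
    w = np.full(X0.shape[1], 1.0 / X0.shape[1])      # start from equal weights
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ w - b)
        w = project_to_simplex(w - step * grad)
    return w

# Toy data (assumed): k = 4 predictors, J = 3 donors; X1 is an exact convex combination.
X0 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0],
               [1.0, 1.0, 1.0]])
w_true = np.array([0.6, 0.4, 0.0])
X1 = X0 @ w_true
w_star = synth_weights(X1, X0, v=np.ones(4))
```

With these toy inputs the minimizer is unique, so `w_star` should recover `w_true` up to numerical tolerance, including the zero weight on the third donor. Given `w_star`, the estimate in equation (8) is `Y[0, t] - w_star @ Y[1:, t]` for an outcome array laid out with the treated unit in row 0.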
power on $Y^N_{1t}$ of the variables $X_{1j}, \ldots, X_{kj}$. This can be accomplished in the following manner.

1. Divide the pre-intervention periods into an initial training period and a subsequent validation period. For simplicity and concreteness, we will assume that $T_0$ is even and the training and validation periods span $t = 1, \ldots, t_0$ and $t = t_0 + 1, \ldots, T_0$, respectively, with $t_0 = T_0/2$. In practice, the lengths of the training and validation periods may depend on application-specific factors, such as the extent of data availability on outcomes in the pre-intervention and post-intervention periods, and the specific times when the predictors are measured in the data.

2. For every value $V$, let $\tilde{w}_2(V), \ldots, \tilde{w}_{J+1}(V)$ be the synthetic control weights computed with training period data on the predictors. The MSPE of this synthetic control with respect to $Y^N_{1t}$ in the validation period is

$$\sum_{t=t_0+1}^{T_0} \left( Y_{1t} - \tilde{w}_2(V) Y_{2t} - \cdots - \tilde{w}_{J+1}(V) Y_{J+1t} \right)^2. \tag{9}$$

3. Select a value $V^{*} \in \mathcal{V}$ such that the MSPE in equation (9) is small, where $\mathcal{V}$ is a set of potential values for $V$.

4. Use the resulting $V^{*}$ and data on the predictors for the last $t_0$ periods before the intervention, $t = T_0 - t_0 + 1, \ldots, T_0$, to calculate $W^{*} = W(V^{*})$.⁷

This is a heuristic procedure, and one that is useful only as long as it produces $V^{*}$, such that $Y_{1t} \approx \tilde{w}_2(V^{*}) Y_{2t} + \cdots + \tilde{w}_{J+1}(V^{*}) Y_{J+1t}$ for $t = t_0 + 1, \ldots, T_0$, and $X_1 \approx X_0 W^{*}$ for the set of predictors used to calculate $W^{*}$.

To give sharpness to the discussion of the properties and practical implementation of synthetic control estimators I will refer, as a running example, to an application in Abadie, Diamond, and Hainmueller (2015), which estimates the effect of the 1990 German reunification on per capita GDP in West Germany. In this application, the intervention is the 1990 German reunification and the treated unit is the former West Germany. The donor pool consists of a set of industrialized countries, and $X_1$ and $X_0$ collect prereunification values of predictors of economic growth. Figure 1, panel A, compares the trajectory of per capita GDP before and after the reunification for West Germany and a simple average of the countries in the donor pool, for the years 1960–2003. This is the comparison in equation (4). Average per capita GDP among the countries in the donor pool fails to reproduce the trajectory of per capita GDP for West Germany even before the reunification takes place in 1990. Moreover, the restriction of parallel trends required for difference-in-differences models (see, e.g., Abadie 2005, Angrist and Pischke 2009) fails to hold in the pre-intervention data. Figure 1, panel B, reports the trajectory of per capita GDP for West Germany and for a synthetic control calculated in the manner explained in this section. This figure shows that a weighted average of the countries in the donor pool is able to closely approximate the trajectory of per capita GDP for West Germany before the German reunification.

Moreover, the synthetic control of figure 1, panel B, closely reproduces the

⁷ As discussed in Klößner et al. (2018), cross-validation weights are not always unique. That is, minimization of equation (9) may not have a unique solution. In principle, this could be dealt with via penalization (e.g., adding a term $\gamma \sum_{h=1}^{k} v_h^2$ for some $\gamma > 0$ to equation (9), which favors dense sets of weights). In practice, however, researchers should aim to demonstrate that their results are not overly sensitive to particular choices of $V$.
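The training and validation procedure in steps 1 to 4 can be sketched as below. This is a minimal illustration, not the authors' code: the candidate set for $V$ is a small hypothetical grid, the helper names (`synth_weights`, `choose_v`) and the toy arrays are assumptions, and the inner problem is solved by projected gradient descent rather than a dedicated quadratic programming routine.

```python
import numpy as np

def project_to_simplex(u):
    """Euclidean projection of u onto {w : w >= 0, sum(w) = 1}."""
    s = np.sort(u)[::-1]
    css = np.cumsum(s) - 1.0
    idx = np.arange(1, len(u) + 1)
    rho = np.nonzero(s - css / idx > 0)[0][-1]
    return np.maximum(u - css[rho] / (rho + 1), 0.0)

def synth_weights(X1, X0, v, n_iter=5000):
    """Synthetic control weights for a given V (squared version of the weighted norm)."""
    A, b = X0 * np.sqrt(v)[:, None], X1 * np.sqrt(v)
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    w = np.full(X0.shape[1], 1.0 / X0.shape[1])
    for _ in range(n_iter):
        w = project_to_simplex(w - step * 2.0 * A.T @ (A @ w - b))
    return w

def choose_v(X1_train, X0_train, Y1_val, Y0_val, candidates):
    """Steps 2-3: return the candidate V with the smallest validation error in equation (9)."""
    best_v, best_err = None, np.inf
    for v in candidates:
        w = synth_weights(X1_train, X0_train, np.asarray(v, dtype=float))
        err = float(np.sum((Y1_val - w @ Y0_val) ** 2))
        if err < best_err:
            best_v, best_err = v, err
    return best_v, best_err

# Toy inputs (assumed): 2 predictors, 2 donors, 3 validation periods.
X0_train = np.eye(2)                       # training-period predictors of the donors (k x J)
X1_train = np.array([1.0, 1.0])            # training-period predictors of the treated unit
Y0_val = np.array([[1.0, 2.0, 3.0],        # validation-period outcomes of donor A
                   [3.0, 2.0, 5.0]])       # validation-period outcomes of donor B
Y1_val = Y0_val[0]                         # the treated unit tracks donor A exactly
candidates = [(0.99, 0.01), (0.01, 0.99)]  # hypothetical grid of V values

v_star, err_star = choose_v(X1_train, X0_train, Y1_val, Y0_val, candidates)

# Step 4: recompute weights with V* and predictors from the last t0 pre-intervention periods
# (the toy example reuses the training predictors for brevity).
W_star = synth_weights(X1_train, X0_train, np.asarray(v_star))
```

In this toy setup the first predictor is the one that identifies donor A, so the candidate that puts most weight on it yields the smaller validation error. In applied work the grid would be much richer, and, in line with footnote 7, it is worth checking how much the final weights move across near-optimal values of $V$.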
[Figure 1. Panel A: West Germany and OECD average. Panel B: West Germany and synthetic West Germany. Vertical axis: per capita GDP (PPP 2002 USD). Horizontal axis: year, 1960 to 2000.]
Notes: Panel A compares the evolution of per capita GDP in West Germany to the evolution of per capita GDP for a simple average of OECD countries. In panel B the comparison is with a synthetic control calculated in the manner explained in subsection 3.2. See Abadie, Diamond, and Hainmueller (2015) for details.
TABLE 1
Economic Growth Predictor Means before the German Reunification

                   West Germany   Synthetic West Germany   OECD average   Austria (nearest neighbor)
                   (1)            (2)                      (3)            (4)
GDP per capita     15,808.9       15,802.2                 13,669.4       14,817.0
Trade openness     56.8           56.9                     59.8           74.6
Inflation rate     2.6            3.5                      7.6            3.5
Industry share     34.5           34.4                     33.8           35.5
Schooling          55.5           55.2                     38.7           60.9
Investment rate    27.0           27.0                     25.9           26.6

Note: The first column reports $X_1$, the second column reports $X_0 W^{*}$, the third column reports a simple average of $X_j$ for the 16 OECD countries in the donor pool, and the last column reports the value of $X_j$ for the nearest neighbor of West Germany in terms of predictor values. GDP per capita, inflation rate, and trade openness are averages for the 1981–90 period. Industry share (of value added) is the average for 1981–89. Schooling is the average for 1980 and 1985. Investment rate is averaged over 1980–84. See Abadie, Diamond, and Hainmueller (2015) for variable definitions and sources. The nearest neighbor in column 4 minimizes the Euclidean norm of the pairwise differences between the values of the predictors for West Germany and for each of the countries in the donor pool, after rescaling the predictors to have unit variance.
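As a worked reading of Table 1, the short sketch below computes, predictor by predictor, the absolute gap between West Germany and each of the three comparisons. The numbers are copied from the table; the dictionary layout and the helper name `abs_gaps` are illustrative assumptions.

```python
# Predictor means from Table 1: (West Germany, synthetic West Germany, OECD average, Austria).
table1 = {
    "GDP per capita":  (15808.9, 15802.2, 13669.4, 14817.0),
    "Trade openness":  (56.8, 56.9, 59.8, 74.6),
    "Inflation rate":  (2.6, 3.5, 7.6, 3.5),
    "Industry share":  (34.5, 34.4, 33.8, 35.5),
    "Schooling":       (55.5, 55.2, 38.7, 60.9),
    "Investment rate": (27.0, 27.0, 25.9, 26.6),
}

def abs_gaps(table):
    """Absolute gap to West Germany for each comparison (synthetic, OECD, Austria), per predictor."""
    return {name: tuple(abs(c - wg) for c in comps)
            for name, (wg, *comps) in table.items()}

gaps = abs_gaps(table1)
for name, (synth, oecd, austria) in gaps.items():
    print(f"{name:16s} synthetic={synth:8.1f}  OECD={oecd:8.1f}  Austria={austria:8.1f}")
```

On every predictor in the table the synthetic control is at least as close to West Germany as the OECD average or the nearest neighbor, with a tie against Austria on the inflation rate.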
TABLE 2
Synthetic Control Weights for West Germany
Australia —
Austria 0.42
Belgium —
Denmark —
France —
Greece —
Italy —
Japan 0.16
Netherlands 0.09
New Zealand —
Norway —
Portugal —
Spain —
Switzerland 0.11
United Kingdom —
United States 0.22
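The weights in Table 2 can be checked against the restrictions imposed on the synthetic control (nonnegative weights that sum to one). The sketch below copies the nonzero entries from the table and treats the dashes as zero weights, which is an assumption about how the table should be read.

```python
import math

# Nonzero synthetic control weights from Table 2; the other 11 donors are taken to have weight zero.
weights = {
    "Austria": 0.42,
    "Japan": 0.16,
    "Netherlands": 0.09,
    "Switzerland": 0.11,
    "United States": 0.22,
}
n_donors = 16                                   # countries listed in Table 2

total = sum(weights.values())
sums_to_one = math.isclose(total, 1.0, abs_tol=1e-9)
all_nonnegative = all(w >= 0 for w in weights.values())
share_nonzero = len(weights) / n_donors         # fraction of donors that receive positive weight
```

Only 5 of the 16 donor countries contribute to the synthetic West Germany, which illustrates how sparse the resulting weights can be.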
controls estimators for the cases when $Y^N_{1t}$ is generated by (i) a linear factor model, or (ii) a vector autoregressive model.⁸ They show that, under some conditions, the synthetic control estimator is unbiased for a vector autoregressive model, and provide a bias bound for a linear factor model. Here, I will restrict the exposition to the linear factor

⁸ Notice that the assumptions on the data-generating process involve $Y^N_{jt}$, but not $Y^I_{1t}$. Since $Y^I_{1t} = Y_{1t}$ is observed, estimation of $\tau_{1t}$ for $t > T_0$ requires no assumptions on the process that generates $Y^I_{1t}$.
model, which can be seen as a generalization of difference in differences. Consider the following linear factor model for $Y^N_{jt}$,

$$Y^N_{jt} = \delta_t + \theta_t Z_j + \lambda_t \mu_j + \varepsilon_{jt}, \tag{10}$$

where $\delta_t$ is a time trend, $Z_j$ and $\mu_j$ are vectors of observed and unobserved predictors of $Y^N_{jt}$, respectively, with coefficients $\theta_t$ and $\lambda_t$, and $\varepsilon_{jt}$ are zero-mean individual transitory shocks. In the time-series literature in econometrics, $\theta_t$ and $\lambda_t$ are referred to as common factors, and $Z_j$ and $\mu_j$ as factor loadings. The term $\delta_t$ is a common factor with constant loadings across units, while $\lambda_t$ represents a set of common factors with varying loadings across units. A difference-in-differences/fixed effects panel model can be obtained from equation (10) by restricting $\lambda_t$ to be time invariant, so $\lambda_t = \lambda$ (see Bai 2009). This has the effect of restricting the mean outcomes of units with the same values for the observed predictors, $Z_j = z$, to follow parallel trends, $\delta_t + \theta_t z + \lambda \mu_j$. A linear factor model provides a useful extension to the difference-in-differences/fixed effects panel data models by allowing $Y^N_{jt}$ to depend on multiple unobserved components, $\mu_j$, with coefficients, $\lambda_t$, that change in time. In contrast to difference in differences, the linear factor model does not impose parallel mean outcome trends for units with the same values for $Z_j$.

Abadie, Diamond, and Hainmueller (2010) provide a characterization of the bias of the synthetic control estimator for the case when the synthetic control reproduces the characteristics of the treated unit. Let $X_1$ be the vector that includes $Z_1$ and the pre-intervention outcomes for the treated unit, and let $X_0$ be the matrix that collects the same variables for the untreated units. Suppose that $X_1 = X_0 W^{*}$, that is, the synthetic control represented by $W^{*}$ is able to reproduce the characteristics of the treated unit (including the values of the pre-intervention outcomes). Then the bias of $\hat{\tau}_{1t}$ is controlled by the ratio between the scale of the individual transitory shocks, $\varepsilon_{it}$, and the number of pre-intervention periods, $T_0$. The intuition behind this result is rather immediate. Under the factor model in equation (10), a synthetic control that reproduces the values $Z_1$ and $\mu_1$ would provide an unbiased estimator of the treatment effect for the treated. If $X_1 = X_0 W^{*}$, then the synthetic control matches the value of $Z_1$. On the other hand, $\mu_1$ is not observed, so it cannot be matched directly in the data. However, a synthetic control that reproduces the values of $Z_1$ but fails to reproduce the values of $\mu_1$ can only provide a close match for the pretreatment outcomes if differences in the values of the individual transitory shocks between the treated and the synthetic controls compensate for the differences in unobserved factor loadings. This is unlikely to happen when the scale of the transitory shocks, $\varepsilon_{it}$, is small or the number of pretreatment periods, $T_0$, is large. In contrast, a small number of pre-intervention periods combined with enough variation in the unobserved transitory shocks may result in a close match for pretreatment outcomes even if the synthetic control does not closely match the values of $\mu_1$. This is a form of over-fitting and a potential source of bias.

In practice, the condition $X_1 = X_0 W^{*}$ is replaced by the approximate version $X_1 \approx X_0 W^{*}$. It is important to notice, however, that for any particular data set there are no ex ante guarantees on the size of the difference $X_1 - X_0 W^{*}$. When this difference is large, Abadie, Diamond, and Hainmueller (2010) recommend against the use of synthetic controls because of the potential for substantial biases. For the factor model in equation (10), obtaining a good fit $X_1 \approx X_0 W^{*}$ when $X_1$ and $X_0$ include pre-intervention outcomes typically requires that the variance of the transitory shock is small (see Ferman and Pinto 2019). Moreover, because
the bias bound depends inversely on $T_0$, one could erroneously conclude that under the factor model in equation (10), the synthetic control estimator is unbiased as $T_0$ goes to infinity. However, the bias bound in Abadie, Diamond, and Hainmueller (2010) is derived under $X_1 = X_0 W^{*}$, and its practical relevance depends on the ability of the synthetic control to reproduce the trajectory of the outcome for the treated unit. Sizable biases may persist as $T_0 \to \infty$, unless the quality of the fit, $X_1 - X_0 W^{*}$, is good. That is, the ability of a synthetic control to reproduce the trajectory of the outcome variable for the treated unit over an extended period of time, as in figure 1 panel B, provides an indication of low bias. However, a large $T_0$ cannot drive down the bias if the fit is bad. In practice, synthetic controls may not perfectly fit the characteristics of the treated units. Section 7 discusses a backdating exercise that can often be used to obtain an indication of the size and direction of the bias arising from imperfect fit.

The risk of over-fitting may also increase with the size of the donor pool, especially when $T_0$ is small. For any fixed $T_0$, a larger $J$ makes it easier to fit pretreatment outcomes even when there are substantial discrepancies in factor loadings between the treated unit and the synthetic control. Consistent with this argument, the bias bound for $\hat{\tau}_{1t}$ derived in Abadie, Diamond, and Hainmueller (2010) depends positively on $J$. Under a factor model for $Y^N_{it}$, a large number of units in the donor pool may create or exacerbate the bias of the synthetic control estimator, especially if the values of $\mu_j$ in the donor pool greatly differ from $\mu_1$.⁹ Moreover, the factor model in equation (10) should be interpreted only as an approximation to a more general (nonlinear) process for $Y^N_{it}$. If the process that determines $Y^N_{it}$ is nonlinear in the attributes of the units, even a close fit by a synthetic control, which is a weighted average, could potentially result in large interpolation biases.

A practical implication of the discussion in the previous paragraph is that each of the units in the donor pool has to be chosen judiciously to provide a reasonable control for the treated unit. Including in the donor pool units that are regarded by the analyst to be unsuitable controls (because of large discrepancies in the values of their observed attributes $Z_j$ or because of suspected large differences in the values of the unobserved attributes $\mu_j$ relative to the treated unit) is a recipe for bias.

There are other factors that contribute to the bias bound in Abadie, Diamond, and Hainmueller (2010). In particular, the value of the bound increases with the number of unobserved factors, that is, the number of components in $\mu_j$. The dependence of the bias bound on the number of unobserved factors is relevant for the discussion on the choice of predictors for the synthetic control method in the next subsection.

3.4 Variable Selection

A synthetic control provides a predictor of $Y^N_{1t}$ for $t > T_0$, the potential outcome without the intervention for the treated units in a post-intervention period. Like for any other prediction procedure, the choice of predictors (in $X_1$ and $X_0$ for synthetic control estimators) is a fundamental part of the estimation task. This subsection discusses variable selection in the synthetic control method. To aid the discussion of the different issues involved in variable selection for synthetic controls, I will employ the concepts and notation of the linear factor model framework of subsection 3.3. Predictor variables in $X_1$ and $X_0$ typically include both pre-intervention values of the outcome variable as well as other predictors, $Z_j$.

⁹ A large $J$ may be beneficial in high-dimensional settings, as demonstrated in Ferman (2019), who shows that under certain conditions synthetic control estimators may be asymptotically unbiased as $T_0 \to \infty$ and $J \to \infty$.
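The difference between the linear factor model in equation (10) and its difference-in-differences special case can be illustrated with a small simulation. The sketch below is an assumption-laden toy: the function name `simulate_factor_model`, the dimensions, and all parameter values are made up for illustration, and the noise is switched off so that the mean trends are visible directly.

```python
import numpy as np

def simulate_factor_model(Z, mu, delta, theta, lam, sigma, rng):
    """Draw Y^N_jt = delta_t + theta_t Z_j + lambda_t mu_j + eps_jt.

    Shapes: Z (n, r), mu (n, F), delta (T,), theta (T, r), lam (T, F); returns (n, T).
    """
    eps = sigma * rng.standard_normal((Z.shape[0], delta.shape[0]))
    return delta[None, :] + Z @ theta.T + mu @ lam.T + eps

rng = np.random.default_rng(0)
T = 5
Z = np.array([[1.0], [1.0]])          # two units with the same observed predictor
mu = np.array([[0.0], [2.0]])         # but different unobserved loadings
delta = np.arange(T, dtype=float)
theta = np.ones((T, 1))

# Difference-in-differences case: lambda_t is constant, so the two units move in parallel.
lam_const = np.full((T, 1), 0.5)
Y_did = simulate_factor_model(Z, mu, delta, theta, lam_const, sigma=0.0, rng=rng)
gap_did = Y_did[1] - Y_did[0]

# General factor model: lambda_t varies over time, so the gap between the units changes.
lam_var = np.linspace(0.0, 1.0, T).reshape(T, 1)
Y_fac = simulate_factor_model(Z, mu, delta, theta, lam_var, sigma=0.0, rng=rng)
gap_fac = Y_fac[1] - Y_fac[0]
```

With constant `lam_const` the gap between the two units is the same in every period (parallel trends), whereas with time-varying `lam_var` it drifts, which is the kind of divergence a synthetic control has to absorb by matching the unobserved loadings.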
exploited to provide guarantees against the use of results to guide specification searches. This is because synthetic control weights can be calculated using pre-intervention data only in the design phase of the study, before post-intervention outcomes are observed or realized. Section 4 discusses this issue in more detail.

3.5 Inference

Abadie, Diamond, and Hainmueller (2010) propose a mode of inference for the synthetic control framework that is based on permutation methods. In its simpler version, the effect of the intervention is estimated separately for each of the units in the sample. Consider the case with a single treated unit, as in subsection 3.1. A permutation distribution can be obtained by iteratively reassigning the treatment to the units in the donor pool and estimating “placebo effects” in each iteration. Then, the permutation distribution is constructed by pooling the effect estimated for the treated unit together with placebo effects estimated for the units in the donor pool. The effect of the treatment on the unit affected by the intervention is deemed significant when its magnitude is extreme relative to the permutation distribution.

One potential complication with this procedure is that, even if a synthetic control is able to closely fit the trajectory of the outcome variable for the treated unit before the intervention, the same may not be true for all the units in the donor pool. For this reason, Abadie, Diamond, and Hainmueller (2010) propose a test statistic that measures the ratio of the post-intervention fit relative to the pre-intervention fit. For $0 \le t_1 \le t_2 \le T$ and $j = 1, \ldots, J + 1$, let

$$R_j(t_1, t_2) = \left( \frac{1}{t_2 - t_1 + 1} \sum_{t=t_1}^{t_2} \left( Y_{jt} - \hat{Y}^N_{jt} \right)^2 \right)^{1/2}, \tag{11}$$

where $\hat{Y}^N_{jt}$ is the outcome on period $t$ produced by a synthetic control when unit $j$ is coded as treated and using all other $J$ units to construct the donor pool. This is the root mean squared prediction error (RMSPE) of the synthetic control estimator for unit $j$ and time periods $t_1, \ldots, t_2$. The ratio between the post-intervention RMSPE and pre-intervention RMSPE for unit $j$ is

$$r_j = \frac{R_j(T_0 + 1, T)}{R_j(1, T_0)}. \tag{12}$$

That is, $r_j$ measures the quality of the fit of a synthetic control for unit $j$ in the posttreatment period, relative to the quality of the fit in the pretreatment period. Abadie, Diamond, and Hainmueller (2010) use the permutation distribution of $r_j$ for inference. An alternative solution to the problem of poor pretreatment fit in the donor pool is to base inference on the distribution $R_j(T_0 + 1, T)$ after discarding those placebo runs with $R_j(1, T_0)$ substantially larger than $R_1(1, T_0)$ (see Abadie, Diamond, and Hainmueller 2010).

A p-value for the inferential procedure based on the permutation distribution of $r_j$, as described above, is given by

$$p = \frac{1}{J + 1} \sum_{j=1}^{J+1} I_{+}(r_j - r_1),$$

where $I_{+}(\cdot)$ is an indicator function that returns one for nonnegative arguments and zero otherwise. While p-values are often used to summarize the results of testing procedures, the permutation distribution of the test statistics, $r_j$, or of the placebo gaps, $Y_{jt} - \hat{Y}^N_{jt}$, are easy to report/visualize and provide additional information (e.g., on the magnitude of the differences between the estimated treatment effect on the treated
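The RMSPE ratio and the permutation p-value described in this subsection can be sketched as follows. The sketch assumes that the placebo predictions $\hat{Y}^N_{jt}$ have already been computed for every unit and are supplied as an array; the function names, the 1-indexed period convention, and the toy numbers are illustrative assumptions.

```python
import numpy as np

def rmspe(y, y_hat, t1, t2):
    """Root mean squared prediction error over periods t1..t2 (1-indexed, inclusive)."""
    gap = np.asarray(y, dtype=float)[t1 - 1:t2] - np.asarray(y_hat, dtype=float)[t1 - 1:t2]
    return float(np.sqrt(np.mean(gap ** 2)))

def rmspe_ratios(Y, Y_hat, T0):
    """r_j = R_j(T0 + 1, T) / R_j(1, T0) for each row j; row 0 is the treated unit."""
    T = Y.shape[1]
    return np.array([rmspe(Y[j], Y_hat[j], T0 + 1, T) / rmspe(Y[j], Y_hat[j], 1, T0)
                     for j in range(Y.shape[0])])

def permutation_p_value(r):
    """Share of units whose ratio is at least as large as that of the treated unit, r[0]."""
    r = np.asarray(r, dtype=float)
    return float(np.mean(r - r[0] >= 0))

# Toy inputs (assumed): 2 units, 4 periods, T0 = 2; predictions are all zero for simplicity.
Y = np.array([[1.0, 1.0, 4.0, 4.0],     # treated unit: fit worsens after T0
              [1.0, 1.0, 1.0, 1.0]])    # placebo unit: fit is unchanged
Y_hat = np.zeros_like(Y)
r = rmspe_ratios(Y, Y_hat, T0=2)
p = permutation_p_value(r)
```

Because the treated unit is always counted in the sum, the smallest attainable p-value is $1/(J+1)$, so small donor pools limit how extreme the evidence can look.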
test inversion (see, e.g., Firpo and Possebom 2018).
Replacing $Y_{jt} - \hat{Y}_{jt}^N$ in $R_j(T_0 + 1, T)$ with their positive or negative parts, $(Y_{jt} - \hat{Y}_{jt}^N)^+$ or $(Y_{jt} - \hat{Y}_{jt}^N)^-$, leads to one-sided inference. One-sided inference may result in a substantial gain of power.10 This is an important consideration in many comparative case study settings, where samples are considerably small. Alternative test statistics (see, e.g., Firpo and Possebom 2018) could potentially be used to direct power to specific sets of alternatives.
As discussed in Abadie, Diamond, and Hainmueller (2010), this mode of inference reduces to classical randomization inference (Fisher 1935) when the intervention is randomly assigned, a rather improbable setting, especially in contexts with aggregate units. More generally, this mode of inference evaluates significance relative to a benchmark distribution for the assignment process, one that is implemented directly in the data. Abadie, Diamond, and Hainmueller (2010) use a uniform benchmark, but one could easily depart from the uniform case. Firpo and Possebom (2018) propose a sensitivity analysis procedure that considers deviations from the uniform benchmark.
Because in most observational settings assignment to the intervention is not randomized, one could, in principle, adopt permutation schemes that incorporate information in the data on the assignment probabilities for the different units in the sample (as in, e.g., Rosenbaum 1984). However, in many comparative case studies it is difficult to articulate the nature of a plausible assignment mechanism or even the specific nature of a placebo intervention. Consider, for example, the 1990 German reunification application in Abadie, Diamond, and Hainmueller (2015). In that context, it would be difficult to articulate the nature of the assignment mechanism or even describe placebo interventions. (France would reunify with whom?) Moreover, even if a plausible assignment mechanism exists, estimation of the assignment mechanism is often hopeless because many comparative case studies feature a single or a small number of treated units.
It is important to note that the availability of a well-defined procedure to select the comparison unit, like the one provided by the synthetic control method, makes the estimation of the effects of placebo interventions feasible. Without a formal description of the procedure used to choose the comparison for the treated unit, it would be difficult to reapply the same estimation procedure to the units in the donor pool. In this sense, the formalization of the choice of the comparison unit provided by the synthetic control method opens the door to a mode of quantitative inference in the context of comparative case studies.
Another important point to notice is that the permutation method described in this subsection does not attempt to approximate the sampling distributions of test statistics. Sampling-based statistical tests employ restrictions on the sampling mechanism (data-generating process) to derive a distribution of a test statistic in a thought experiment where alternative samples could have been obtained from the sampling mechanism that generated the data. In a comparative case study framework, however, sampling-based inference is complicated, sometimes because of the absence of a well-defined sampling mechanism or data-generating process, and sometimes because the sample is the same as the population. For example, in their study of the effect of terrorism on
10 Notice that, in the presence of a treatment effect on the treated unit, permutations in which the treated unit contributes to the placebo synthetic control will tend to produce effects of the opposite sign to the effect on the treated unit, increasing the power of the one-sided test.
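The permutation p-value described in this subsection can be sketched in a few lines. The block below is only an illustration: the function name `permutation_p_value` and the numbers in `example_ratios` are hypothetical, and the ratios $r_j$ are taken as given inputs rather than computed from data.

```python
def permutation_p_value(ratios):
    """Share of units whose ratio r_j is at least as large as the treated unit's r_1.

    `ratios` is a hypothetical list: ratios[0] is r_1 (treated unit) and
    ratios[1:] are the placebo ratios r_2, ..., r_{J+1} from the donor pool.
    """
    if not ratios:
        raise ValueError("need at least the treated unit's ratio")
    r_treated = ratios[0]
    # I_+(r_j - r_1) equals one when r_j - r_1 >= 0, so the treated unit always counts.
    exceed = sum(1 for r_j in ratios if r_j - r_treated >= 0)
    return exceed / len(ratios)


# Made-up ratios for one treated unit and four placebo runs (J = 4).
example_ratios = [5.2, 1.1, 0.8, 2.3, 1.7]
print(permutation_p_value(example_ratios))  # 0.2
```

Because the treated unit is always counted, the p-value cannot fall below $1/(J + 1)$. With four donor units, as in this made-up example, 0.2 is the smallest attainable value, which is one reason small donor pools limit what this mode of inference can show.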
Abadie: Using Synthetic Controls 405
economic outcomes in Spain, Abadie and Gardeazabal (2003) employ a sample consisting of all Spanish regions. Here, sampling is not done at random from a well-defined super-population. As in classical randomization tests (Fisher 1935), design-based inference takes care of these complications by conditioning on the sample and considering only the variation in the test statistic that is induced by the assignment mechanism (see, e.g., Abadie et al. 2020).11
4. Why Use Synthetic Controls?
In this section, I will describe some advantages of synthetic control estimators relative to alternative methods. For the sake of concreteness and because linear regression is arguably the most widely applied tool in empirical research in economics, I emphasize the differences between synthetic control estimators and linear regression estimators. However, much of the discussion applies more generally to other estimators of treatment effects.
A linear regression estimator of the effect of the treatment can easily be constructed using the panel data structure described in subsection 3.1. Let $Y_0$ be the $(T - T_0) \times J$ matrix of post-intervention outcomes for the units in the donor pool with $(t, j)$-element equal to $Y_{T_0+t, j+1}$. Let $\bar{X}_1$ and $\bar{X}_0$ be the result of augmenting $X_1$ and $X_0$, respectively, with a row of ones. For non-singular $\bar{X}_0 \bar{X}_0'$, a regression-based estimator of the counterfactual $Y_{1t}^N$ for $t > T_0$ is $\hat{B}' \bar{X}_1$, where $\hat{B} = (\bar{X}_0 \bar{X}_0')^{-1} \bar{X}_0 Y_0'$. That is, the regression-based estimator is akin to a synthetic control, as it uses a linear combination, $Y_0 W^{reg}$, of the outcomes in the donor pool, with $W^{reg} = \bar{X}_0' (\bar{X}_0 \bar{X}_0')^{-1} \bar{X}_1$, to reproduce the outcome of the treated unit in the absence of the intervention. Some advantages of synthetic controls relative to regression-based counterfactuals are listed next.
No Extrapolation. Synthetic control estimators preclude extrapolation, because synthetic control weights are nonnegative and sum to one. It is easy to check that, like their synthetic control counterparts, the regression weights in $W^{reg}$ sum to one. Unlike the synthetic control weights, however, regression weights may be outside the $[0, 1]$ interval, allowing extrapolation outside of the support of the data (see Abadie, Diamond, and Hainmueller 2015 for details).12 Table 3 reports regression weights for the German reunification example. In this application, the regression counterfactual utilizes negative values for four countries.
Transparency of the Fit. Linear regression uses extrapolation to guarantee a perfect fit of the characteristics of the treated unit, $\bar{X}_0 W^{reg} = \bar{X}_1$ (and, therefore, $X_0 W^{reg} = X_1$) even when the untreated units are completely dissimilar in their characteristics to the treated unit. In contrast, synthetic controls make transparent the actual discrepancy between the treated unit and the convex combination of untreated units that provides the counterfactual of interest, $X_1 - X_0 W^*$. This discrepancy is equal to the difference between columns 1 and 2 in table 1. In addition, figure 1, panel B, brings to light the fit of a synthetic control in terms of pre-intervention outcomes. That is, the information in table 1 and figure 1 makes clear the extent to which the observations in the donor pool can approximate the characteristics of the treated units by interpolation only. In some applications, comparisons
11 In particular, this is in contrast to the bias bound calculations in Abadie, Diamond, and Hainmueller (2010), which are performed over the distribution of the individual transitory shocks, $\varepsilon_{it}$.
12 See King and Zeng (2006) on the dangers of relying on extrapolation to estimate counterfactuals.
TABLE 3
Regression Weights for West Germany

Country           Weight
Australia           0.12
Austria             0.26
Belgium             0.00
Denmark             0.08
France              0.04
Greece             −0.09
Italy              −0.05
Japan               0.19
Netherlands         0.14
New Zealand         0.12
Norway              0.04
Portugal           −0.08
Spain              −0.01
Switzerland         0.05
United Kingdom      0.06
United States       0.13
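The two properties of regression weights noted in section 4 (they sum to one, but may fall outside the [0, 1] interval) can be checked numerically. The sketch below makes two assumptions that are not in the article: it uses a single predictor (k = 1) so that the 2 × 2 inverse can be written by hand, and the toy donor values are invented. The function name `regression_weights` is hypothetical. The last lines apply the same checks to the weights reported in table 3.

```python
def regression_weights(x_donors, x_treated):
    """W^reg = Xbar0' (Xbar0 Xbar0')^{-1} Xbar1 for a single predictor (k = 1).

    Xbar0 stacks a row of ones on top of the donor predictor values, so
    Xbar0 Xbar0' is the 2x2 matrix [[J, sum x], [sum x, sum x^2]].
    """
    n = len(x_donors)
    s1 = sum(x_donors)
    s2 = sum(x * x for x in x_donors)
    det = n * s2 - s1 * s1
    if det == 0:
        raise ValueError("Xbar0 Xbar0' is singular")
    # (Xbar0 Xbar0')^{-1} Xbar1, with Xbar1 = (1, x_treated)'.
    a = (s2 - s1 * x_treated) / det
    b = (n * x_treated - s1) / det
    return [a + b * x for x in x_donors]


# Toy data (invented): the treated value 4 lies outside the donor range [1, 3].
w = regression_weights([1.0, 2.0, 3.0], 4.0)
print([round(v, 3) for v in w])  # [-0.667, 0.333, 1.333]
print(round(sum(w), 10))         # 1.0

# Regression weights as reported in table 3.
table3 = {
    "Australia": 0.12, "Austria": 0.26, "Belgium": 0.00, "Denmark": 0.08,
    "France": 0.04, "Greece": -0.09, "Italy": -0.05, "Japan": 0.19,
    "Netherlands": 0.14, "New Zealand": 0.12, "Norway": 0.04,
    "Portugal": -0.08, "Spain": -0.01, "Switzerland": 0.05,
    "United Kingdom": 0.06, "United States": 0.13,
}
print(round(sum(table3.values()), 2))                 # 1.0
print(sorted(c for c, v in table3.items() if v < 0))  # ['Greece', 'Italy', 'Portugal', 'Spain']
```

In the toy case the weights reproduce the treated value exactly (−2/3 · 1 + 1/3 · 2 + 4/3 · 3 = 4) only by assigning one negative weight and one weight above one, which is the extrapolation the text warns about.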
like that of columns 1 and 2 of table 1 may reveal that it is not possible to approximate the characteristics of the treated unit(s) using a weighted average of the units in the donor pool. In that case, Abadie, Diamond, and Hainmueller (2010, 2015) advise against using synthetic controls.
Safeguard against Specification Searches. In contrast to regression but similar to classical matching methods, synthetic controls do not require access to posttreatment outcomes in the design phase of the study, when synthetic controls are calculated. This implies that all the data analysis on design decisions like the identity of the units in the donor pool or the predictors in $X_1$ and $X_0$ can be made without knowing how they affect the conclusions of the study (see Rubin 2007 for a related discussion regarding matching estimators). Moreover, synthetic control weights can be calculated and preregistered/publicized before the posttreatment outcomes are realized, or before the actual intervention takes place. That is, preregistration of synthetic control weights can play a role similar to pre-analysis plans in randomized control trials (see, e.g., Olken 2015), providing a safeguard against specification searches and p-hacking.
Transparency of the Counterfactual. Synthetic controls make explicit the contribution of each comparison unit to the counterfactual of interest. Moreover, because the synthetic control coefficients are proper weights and are sparse (more on sparsity below), they allow a simple and precise interpretation of the nature of the estimate of the counterfactual of interest. For the application to the effects of the German reunification in table 2, the counterfactual for West Germany is given by a weighted average of Austria (0.42), Japan (0.16), the Netherlands (0.09), Switzerland (0.11), and the United States (0.22) with weights in parentheses. Simplicity and transparency of the counterfactual allow the use of expert knowledge to evaluate the validity of a synthetic control and the directions of
[Figure 2. Projecting $X_1$ on the Convex Hull of $X_0$. Labeled points: $X_1$ and its projection $X_0 W^*$.]
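The geometry in figure 2 can be reproduced on a small invented example. The sketch below assumes k = 2 predictors, so that, for a treated unit outside the convex hull, the projection lies on a segment joining two donor units and a search over all pairs is enough. This brute-force search is an illustration of the sparsity property only, not the constrained optimization used to compute synthetic controls in applied work. The function names and the donor coordinates are hypothetical.

```python
from itertools import combinations


def project_on_hull_2d(donors, x1):
    """Closest point to x1 on the convex hull of `donors` (points in the plane).

    Illustration only: assumes x1 lies outside the hull, so the projection is
    on a segment joining two donors. Returns (weights, projected_point).
    """
    best = None
    for i, j in combinations(range(len(donors)), 2):
        (ax, ay), (bx, by) = donors[i], donors[j]
        dx, dy = bx - ax, by - ay
        # Position of the closest point along the segment, clamped to [0, 1].
        t = ((x1[0] - ax) * dx + (x1[1] - ay) * dy) / (dx * dx + dy * dy)
        t = max(0.0, min(1.0, t))
        p = (ax + t * dx, ay + t * dy)
        dist2 = (x1[0] - p[0]) ** 2 + (x1[1] - p[1]) ** 2
        if best is None or dist2 < best[0]:
            weights = [0.0] * len(donors)
            weights[i], weights[j] = 1.0 - t, t
            best = (dist2, weights, p)
    return best[1], best[2]


def is_projection(donors, x1, p, tol=1e-9):
    """Check (x_j - p) . (x1 - p) <= 0 for every donor x_j.

    For a point p in the hull, this condition characterizes the projection of x1.
    """
    return all(
        (xj[0] - p[0]) * (x1[0] - p[0]) + (xj[1] - p[1]) * (x1[1] - p[1]) <= tol
        for xj in donors
    )


# Invented columns of X0 (five donors, two predictors) and treated unit X1.
donors = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0), (3.0, 2.0), (2.0, 1.0)]
x1 = (4.0, 1.5)
w, p = project_on_hull_2d(donors, x1)
print([round(v, 3) for v in w])      # [0.0, 0.4, 0.0, 0.6, 0.0]
print(is_projection(donors, x1, p))  # True
```

Only two of the five donors receive positive weight, in line with the bound of k nonzero weights stated in the text for a treated unit outside the convex hull with donors in general position.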
potential biases. For instance, smaller neighboring countries to West Germany, such as Austria, the Netherlands, and Switzerland, have a substantial weight on the composition of the synthetic control of table 2. If economic growth in these countries were negatively affected by the German reunification during the 1990–2003 period (perhaps because West Germany diverted demand and investment from these countries to East Germany), this would imply that figure 1, panel B, estimates a lower bound on the magnitude (absolute value) of the negative effect of the German reunification on per capita GDP in West Germany.
Sparsity. As evidenced in the results of tables 2 and 3, synthetic controls are sparse, but regression weights are not. As discussed above, sparsity plays an important role for the interpretation and evaluation of the estimated counterfactual. The sparsity of synthetic control weights has an immediate geometric interpretation. Assume, for now, that $X_1$ falls outside the convex hull of the columns of $X_0$. This is typical in empirical practice and a consequence of the curse of dimensionality. Assume also that the columns of $X_0$ are in general position (that is, there is no set of $m$ columns, with $2 \le m \le k + 1$, that fall into an $(m - 2)$-dimensional hyperplane). Then, the synthetic control is unique and sparse, with the number of nonzero weights bounded by $k$, as it is the projection of $X_1$ on the convex hull of the columns of $X_0$. Figure 2 provides a visual representation of the geometric interpretation of the sparsity property of synthetic control estimators.
Only the control observations marked in red contribute to the synthetic control.
Notice that table 1 indicates that the synthetic control for West Germany falls close to but outside the convex hull of the values of economic growth predictors in the donor pool (otherwise, columns 1 and 2 would be identical). As a result, the number of nonzero weights in table 2 is not larger than the number of variables in table 1. If desired, sparsity can be increased by imposing a bound on the density (number of nonzero weights) of $W^*$ in the calculation of synthetic controls (see Abadie, Diamond, and Hainmueller 2015).
In some cases, especially in applications with many treated units, the values of the predictors for some of the treated units may fall in the convex hull of the columns of $X_0$. Then, synthetic controls are not necessarily unique or sparse. That is, a minimizer of equation (7) may not be unique or sparse, although sparse solutions with no more than $k + 1$ nonzero weights always exist. A question is then how to choose among the typically infinite number of solutions to the minimization of equation (7). A modification of the synthetic control estimator in Abadie and L’Hour (2019) discussed in section 8 addresses this problem and produces synthetic controls that are unique and sparse (provided that untreated observations are in general quadratic position, see Abadie and L’Hour 2019 for details). In contrast, as shown in table 3, regression estimators are typically not sparse.
It is important to notice that the role of sparsity in the context of synthetic control methods differs from the usual role that sparsity plays in other statistical methods like the lasso, where a sparsity-inducing regularization is employed to prevent over-fitting, and where the interpretation of the lasso coefficients is often not at issue. As with the lasso, the goal of synthetic controls is out-of-sample prediction; in particular, prediction of $Y_{1t}^N$ for $t > T_0$. In contrast to the lasso, however, the identity and magnitude of nonzero coefficients constitute important information to interpret the nature of the estimate and evaluate its validity and the potential for biases.
One of the greatest appeals of the synthetic control method resides, in my opinion, in the interpretability of the estimated counterfactuals, which results from the weighted average nature of synthetic control estimators and from the sparsity of the weights.
Despite the practical advantages of synthetic control methods, successful application of synthetic control estimators crucially depends on important contextual and data requirements, which are discussed in the next two sections.
5. Contextual Requirements
This section will discuss contextual requirements, that is, the conditions on the context of the investigation under which synthetic controls are appropriate tools for policy evaluation, as well as suitable ways to modify the analysis when these conditions do not perfectly hold. It is important, however, to point out that most of the requirements listed in this section pertain not only to synthetic control methods, but also to any other type of comparative case study research design.
Size of the Effect and Volatility of the Outcome. As previously discussed, the goal of comparative case studies is to estimate the effect of a policy intervention on the unit (e.g., state or region) exposed to an intervention of interest. That is, comparative case studies typically estimate the effect of an intervention on a single treated unit or on a small number of treated units. The nature of this exercise indicates that small effects will be indistinguishable from other shocks to the outcome of the affected unit, especially if the outcome variable of interest is highly
volatile.13 As a result, the impact of “small” interventions with effects of a magnitude similar to the volatility of the outcome is difficult to detect. Even a large effect may be difficult to detect if the volatility of the outcome is also large. Outcome variables that include substantial random noise elevate the risk of over-fitting, as explained in subsection 3.3. In cases where substantial volatility is present in the outcome of interest it is advisable to remove it via filtering, in both the exposed unit as well as in the units in the donor pool, before applying synthetic control techniques.14 Notice, however, that the challenge posed by volatility comes only from the fraction of it that is generated by unit-specific factors (e.g., the individual-specific transitory shocks, $\varepsilon_{jt}$, in equation (10)). Volatility generated by common factors affecting other units (e.g., the common factors $\lambda_t$ in equation (10)) can be differenced out by choosing an appropriate synthetic control.
Availability of a Comparison Group. The very nature of comparative case studies implies that inference based on these methods will be faulty in the absence of a suitable comparison group. First and foremost, in order to have units available for the donor pool, it is important that not all units adopt interventions similar to the one under investigation during the period of the study. Units that adopt an intervention similar to the one adopted by the unit of interest should not be included in the donor pool because they are affected by the intervention, very much like the unit of interest. It is also important to eliminate from the donor pool any units that may have suffered large idiosyncratic shocks to the outcome of interest during the study period, if it is judged that such shocks would not have affected the outcome of the unit of interest in the absence of the intervention.15 Moreover, it is important to restrict the donor pool to units with characteristics that are similar to the affected unit. The reason is that, while the restrictions placed on the weights, $W$, do not allow extrapolation, interpolation biases may still be important if the synthetic control matches the characteristics of the affected unit by averaging away large discrepancies between the characteristics of the affected unit and the characteristics of the units in the synthetic control. For the German reunification example, Abadie, Diamond, and Hainmueller (2015) restrict the donor pool to a set of OECD economies. Related to this point, Abadie and L’Hour (2019) propose adding to the objective function in equation (7) a set of penalty terms that depend on the discrepancies between the characteristics of the affected unit and the characteristics of the individual units included in the synthetic control (see section 8 for details).
No Anticipation. As in any research design that exploits time variation in the outcome variable to estimate the effect of an intervention, synthetic control estimators may be biased if forward-looking economic agents react in advance of the policy intervention under investigation, or if certain components of the intervention are put in place in advance of the formal implementation/enactment of the intervention. If there
13 In studies that seek to estimate the average effect of an intervention that is observed in a large number of instances, the volatility of the outcome variable can often be reduced by averaging. In contrast, as explained above, comparative case studies often focus on the effect of a single event or intervention.
14 For example, Amjad, Shah, and Shen (2018) propose singular value thresholding to de-noise data for synthetic controls.
15 As an example, in their study of the effect of California’s tobacco control legislation, Abadie, Diamond, and Hainmueller (2010) discard from the donor pool several states that adopted large-scale tobacco programs or substantially increased taxes on tobacco during the sample period of the study.
are signs of anticipation, it is advisable to backdate the intervention in the data set to a period before any anticipation effect can be expected, so the full extent of the effect of the intervention can be estimated. Notice that backdating the intervention in the data does not mechanically bias the estimator of the effect of the intervention even if some periods before the intervention are mistakenly recorded as post-intervention periods. The reason is that, as shown in equations (2) and (3), the synthetic control estimator does not restrict the time variation in the effect of the intervention. Therefore, periods barely affected by the intervention may show small or zero effects, while subsequent periods may produce a large estimated effect. This is in contrast with much of the practice using panel data models, where in many instances the effect of an intervention is restricted to be constant across post-intervention periods.
No Interference. In the setup of subsection 3.1, we defined the potential outcomes $Y_{1t}^I$ and $Y_{it}^N$ only in terms of the treatment status for unit 1 and unit $i$, respectively, at time $t$. This is the stable unit treatment value assumption in Rubin (1980), which implies that there is no interference across units. That is, units’ outcomes are invariant to other units’ treatments. In some instances, however, an intervention may have spillover effects on units that are not directly targeted by it. Assuming that such spillover effects do not exist is a strong restriction that must often be enforced in the design of the study or accounted for in the analysis of the results.
The assumption of no interference can be enforced in the design of a study by discarding from the donor pool those units with outcomes possibly affected by the intervention on the treated unit. Notice that there is a potential tension between this practice and the issues discussed in Availability of a Comparison Group. On the one hand, it is advisable to select for the donor pool units that are affected by the same regional economic shocks as the unit where the intervention happens. On the other hand, if spillover effects are substantial and affect units in close geographical proximity, those units may provide a biased estimate of the counterfactual outcome without intervention for the unit affected by the intervention. In cases when units potentially affected by spillover effects are discarded from the donor pool, the transparency of the fit of synthetic controls allows researchers to evaluate the reduction in the quality of the match between the characteristics of the treated unit and the characteristics of the synthetic control.
Potential spillover effects can also be accounted for in the analysis phase of a synthetic control study. If units affected by spillover effects are included in the synthetic control, the researcher should be aware of the potential direction of the bias of the resulting estimator. For example, Abadie, Diamond, and Hainmueller (2015) estimate the economic impact of the 1990 German reunification using a synthetic control of other OECD countries to approximate the trajectory of the counterfactual per capita GDP for West Germany in the absence of the unification. As explained above, if countries that compose the synthetic control for West Germany, like Austria, suffered from the negative effects of the German reunification, then we would expect the synthetic control estimator to be attenuated. That is, in this case, the synthetic control estimate would provide a lower bound on the magnitude of the causal effect of the German reunification on GDP per capita in West Germany. Notice that it is the transparency of the counterfactual and sparsity of the synthetic control counterfactual estimate that make this exercise possible. In regression settings, like the one in section 4, the weight of each unit in the counterfactual estimate is rarely computed in empirical practice, and
group can reproduce the changes in the outcome variable for the unit of interest even if the level of the outcome variable cannot be reproduced. In other cases, however, credible counterfactuals require reproducing not only the trend of the outcome variable for the treated but also the level. For example, some formulations of the convergence hypothesis in economic growth imply that countries with different levels of per capita GDP will tend to experience different growth rates on average, in the absence of an intervention. Similarly, nonlinearities in labor earnings profiles over the life cycle imply that differences in the age distribution across populations will typically result in differences in the growth of labor earnings.17 More generally, there may not exist a combination of untreated units that provides a credible approximation to the treated units, and the conventional synthetic control estimator should not be used in that case.
It should also be noted that differencing the dependent variable may result in a substantial increase in the part of the variance of the outcome that is attributable to noise, potentially inducing an increase in bias. As an example, consider the linear factor model in equation (10). Differencing equation (10) we obtain
$$\Delta Y_{jt}^N = \Delta\delta_t + \Delta\theta_t Z_j + \Delta\lambda_t \mu_j + \Delta\varepsilon_{jt},$$
where $\Delta Y_{jt}^N = Y_{jt}^N - Y_{jt-1}^N$ with analogous expressions for $\Delta\delta_t$, $\Delta\theta_t$, $\Delta\lambda_t$, and $\Delta\varepsilon_{jt}$. Notice first that the differenced equation retains the linear factor structure. Notice also that differencing the outcome may help control the bias when the vectors of common factors $\theta_t$ and $\lambda_t$, or at least some of their components, vary little in time. In that case, the magnitudes of $\Delta\theta_t$ and $\Delta\lambda_t$ may be small even if the magnitudes of $\theta_t$ and $\lambda_t$ are large. This is the usual rationale for working with differenced outcomes and the basis for difference-in-differences estimators. There may be opposing forces at play, however. Suppose, in particular, that the idiosyncratic shocks, $\varepsilon_{jt}$, are independent or roughly independent in time. Then, the variance of $\Delta\varepsilon_{jt}$ is larger than the variance of $\varepsilon_{jt}$. Now, following the characterization of the bias in subsection 3.3, a larger residual variance may result in a higher risk of over-fitting and an increase in the bias of the synthetic control estimator.
Time Horizon. The effect of some interventions may take time to emerge or to be of sufficient magnitude to be quantitatively detected in the data. An obvious but unsatisfying approach to this problem is to wait until the effects of the intervention run their course. A more proactive approach is to use surrogate outcomes or leading indicators of the outcome variable of interest.
6. Data Requirements
This section discusses data requirements for credible applications of synthetic controls. Like many of the contextual requirements in the previous section, the data requirements discussed here apply not only to synthetic control estimation but, more generally, to comparative case study methods.
Aggregate Data on Predictors and Outcomes. From the previous discussion, it can be seen that the synthetic control method requires the availability of data on outcomes and predictors of the outcome for the unit or units exposed to the intervention of interest and a set of comparison units. Predictors and outcomes are often series reported by
17 Notice that nonlinearities in the process that generates $Y_{it}^N$ may require that, for each unit $j$ contributing to the synthetic control, $X_j$ is reasonably close to $X_1$. Section 8 describes a synthetic control estimator with weights that penalize $\| X_1 - X_j \|$.
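The claim that differencing can inflate the noise variance has a short worked form. The derivation below adds an assumption for illustration only: the shocks $\varepsilon_{jt}$ have a constant variance $\sigma^2$ and are uncorrelated over time (the symbol $\sigma^2$ is introduced here and is not part of the article's notation).

```latex
\[
\operatorname{Var}(\Delta\varepsilon_{jt})
  = \operatorname{Var}(\varepsilon_{jt}) + \operatorname{Var}(\varepsilon_{jt-1})
    - 2\operatorname{Cov}(\varepsilon_{jt}, \varepsilon_{jt-1})
  = \sigma^2 + \sigma^2 - 0
  = 2\sigma^2 .
\]
```

Under this assumption the idiosyncratic noise variance doubles after differencing. Positive serial correlation in $\varepsilon_{jt}$ would shrink the increase through the covariance term, which is why the outcome of the trade-off depends on the application.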
government agencies, multilateral organizations, and private entities. Examples of these types of outcomes are state-level crime rates in the United States (Donohue, Aneja, and Weber 2019), country-level per capita GDP (Abadie, Diamond, and Hainmueller 2015), and state-level cigarette consumption statistics in the United States (Abadie, Diamond, and Hainmueller 2010), which are routinely reported in publications produced or commissioned by the Federal Bureau of Investigation, the World Bank, and tobacco industry groups, respectively. Sometimes, when aggregate data do not exist, aggregates of micro-data are employed in comparative case studies. For example, in his study of the labor market effects of the Mariel Boatlift in Miami, Card (1990) uses micro-data from the Current Population Survey (CPS) to estimate aggregate values for wage rates and unemployment for workers in Miami and a set of four comparison cities before and after the Mariel Boatlift. Similarly, Bohn, Lofstrom, and Raphael (2014) use data from the CPS to estimate the fraction of the population composed of Hispanic noncitizens by state in the United States.
Sufficient Pre-intervention Information. The credibility of a synthetic control estimator depends in great part on its ability to steadily track the trajectory of the outcome variable for the affected unit before the intervention. As discussed in subsection 3.3, Abadie, Diamond, and Hainmueller (2010) show that if the data-generating process follows a linear factor model, then the bias of the synthetic control estimator is bounded by a function that is inversely proportional to the number of pre-intervention periods (provided that the synthetic control closely tracks the trajectory of the outcome variable for the affected unit during the pre-intervention periods). Therefore, when designing a synthetic control study, it is of crucial importance to collect information on the affected unit and the donor pool for a large pre-intervention window.
A caveat to the preference for a large number of pre-intervention periods is given by the possibility of structural breaks. Consider the linear factor model of equation (10). In this model, structural stability is represented by the restriction of constant factor loadings. Even if the model is a good representation of the distribution of the data at a relatively short time scale, its accuracy may suffer once we allow the number of periods to be large enough. Choosing $v_1, \ldots, v_k$ to up-weight the most recent measures (relative to the prediction window) included in $X_1$ and $X_0$ helps alleviate structural instability concerns.
With a small number of pre-intervention periods, close or even perfect fit of the predictor values for the treated unit may be spuriously attained, in which case the resulting synthetic control may fail to reproduce the trajectory of the outcome for the treated unit in the absence of the intervention. The severity of this problem can be diminished if powerful predictors of post-intervention values of $Y_{jt}^N$, aside from pre-intervention values of the outcome, are included in $X_j$, reducing the residual variance and, as a result, the risk of over-fitting.
Sufficient Post-intervention Information. This data requirement derives partly from the Time Horizon contextual requirement in section 5. The evaluation data must include outcome measures that are possibly affected by the intervention and are relevant for the policy decision or scientific inquiry that is the object of the study. This may be problematic if the effect of an intervention is expected to arise gradually over time and if no forward-looking measures of the outcome are available. Conversely, in some practical instances, the effect of an intervention may dissipate rapidly after showing substantial effectiveness for a few initial periods.
treated unit before the intervention occurs. Second, a gap between per capita GDP for West Germany and its synthetic control counterpart appears around the time of the German reunification, as in figure 1, panel B. This is the case even when the intervention is ten-year backdated in the data and the procedure uses no information on the timing of the actual intervention. The shape and direction of the gap in figure 3 are similar to those of figure 1, panel B, albeit of a somewhat smaller magnitude. The fact that the estimated effect of the German reunification appears shortly after 1990 even when the intervention is artificially ten-year backdated in the data provides credibility to the synthetic control estimator of the 1990 German reunification.

Robustness Tests. Regardless of the estimation method employed in the analysis, the main conclusions of an empirical study should display some level of robustness with respect to changes in the study design. In the context of synthetic controls, two important ways the design of a study may influence results are (i) the choice of units in the donor pool, and (ii) the choice of predictors of the outcome variable. The first choice corresponds to the columns in $X_0$, and the second one corresponds to the rows in $[X_1 : X_0]$. As an example of a robustness test, figure 4 reports the results of a leave-one-out re-analysis of the German reunification data in Abadie, Diamond, and Hainmueller (2015), removing from the sample, one at a time, each of the countries that contribute to the synthetic control in table 2. All leave-one-out estimates closely track the per capita GDP series for West Germany before 1990. The resulting estimates for the years after the reunification are all negative and centered around the result produced using the entire donor pool. The main conclusion of a negative estimate of the effect of the German reunification on per capita GDP is robust to the exclusion of any particular country. In other examples, however, results may not be as robust as those in figure 4, and the scientific significance of the estimates should be evaluated with that information in mind. If the exclusion of a unit from the donor pool has a large effect on results without a discernible change in pre-intervention fit, this may warrant investigating whether the change in the magnitude of the estimate is caused by the effects of other interventions or by particularly large idiosyncratic shocks on the outcome of the excluded untreated unit.

8. Extensions and Related Methods

As the literature on synthetic control methods and related methods has greatly expanded in recent years, it has become increasingly difficult for researchers interested in applying these methods to figure out what is available where. In this section, I provide a brief guide to the recent contributions in the area. This represents only an incomplete snapshot of a literature that is rapidly evolving.

Multiple Treated Units. Several recent articles consider estimation and inference with synthetic controls for the case where there are multiple treated units. Notice that the presence of multiple treated units does not give rise to additional conceptual challenges for the estimation of synthetic controls. Treatment effects can be estimated for each treated unit separately and aggregated in a second step if desired. However, the presence of multiple treated units creates some practical problems for estimation, as well as new challenges and opportunities for inference.

A potential complication with synthetic control estimation is that the minimizer of equation (7) subject to the weight constraints may not be unique, especially if the values of the predictors for a treated unit fall
[Figure 4. Plotted series: West Germany; Synthetic West Germany; Synthetic West Germany (leave-one-out). Vertical axis: Per Capita GDP (PPP 2002 USD). Horizontal axis: Year, 1960 to 2000.]
inside the convex hull of the values of the predictors for the donor pool. Suppose, for now, that only the first unit is treated. If $X_1$ belongs to the convex hull of the columns of $X_0$, this implies that we can find $W^*$ such that $X_1 = X_0 W^*$. Moreover, the number of minimizers to equation (7) may be (and will typically be) infinite.18 That is, there may exist an infinite number of solutions to the problem of minimizing equation (7), subject to the weight constraints, that perfectly reproduce $X_1$. An algorithm minimizing equation (7) subject to the weight constraints may select a solution, $W^*$, with positive entries for units that are far away from the treated unit in the space of the predictors, even when an alternative solution exists based only on units with predictor values similar to $X_1$. This may, in turn, lead to large interpolation biases that remain unchecked under the illusion of perfect fit, $X_1 = X_0 W^*$.19

Even in moderate dimensions, $k$, the curse of dimensionality works to keep treated observations outside of the convex hull of the units in the donor pool. Therefore, in settings with one treated unit, multiplicity of solutions is rarely an issue, and if it arises it can often be easily addressed by restricting the donor pool to units with predictor values most similar to the values of the predictors for the unit exposed to the treatment.
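
The multiplicity of solutions discussed above can be made concrete with a toy example. The sketch below is only an illustration under stated assumptions: the four donor units, their two predictor values, and the helper names (`combine`, `is_valid`, `fit_error`, `pairwise_discrepancy`) are hypothetical and do not come from the article. The treated unit sits inside the convex hull of the donors, so that one weight vector built from nearby donors and another built from distant donors both reproduce $X_1$ exactly, while the weighted distance between the treated unit and its contributing donors differs greatly.

```python
# Toy illustration with hypothetical data: two valid weight vectors give a
# perfect fit X1 = X0 W, but one relies on donors far from the treated unit.

# Columns of X0 are donor units; rows are k = 2 predictors.
X0 = [
    [0.25, 0.75, 0.0, 1.0],  # predictor 1 for donors 1..4
    [0.50, 0.50, 0.0, 1.0],  # predictor 2 for donors 1..4
]
X1 = [0.5, 0.5]  # predictor values of the treated unit


def combine(X0, w):
    """Return X0 w, the predictor values of the synthetic control."""
    return [sum(x * wj for x, wj in zip(row, w)) for row in X0]


def is_valid(w, tol=1e-12):
    """Weight constraints: nonnegative entries that sum to one."""
    return all(wj >= 0 for wj in w) and abs(sum(w) - 1.0) < tol


def fit_error(X0, X1, w):
    """Aggregate discrepancy ||X1 - X0 w||^2."""
    return sum((a - b) ** 2 for a, b in zip(X1, combine(X0, w)))


def pairwise_discrepancy(X0, X1, w):
    """Sum over donors of w_j * ||X1 - Xj||^2."""
    total = 0.0
    for j, wj in enumerate(w):
        dist2 = sum((X1[r] - X0[r][j]) ** 2 for r in range(len(X1)))
        total += wj * dist2
    return total


w_near = [0.5, 0.5, 0.0, 0.0]  # uses only the two nearby donors
w_far = [0.0, 0.0, 0.5, 0.5]   # uses only the two distant donors

for name, w in [("near", w_near), ("far", w_far)]:
    print(name, is_valid(w), fit_error(X0, X1, w),
          pairwise_discrepancy(X0, X1, w))
```

Both weight vectors satisfy the constraints and attain zero aggregate discrepancy, and so does any convex combination of the two, which is why the set of minimizers is infinite here. Only the weighted pairwise distances distinguish the solution based on nearby donors from the one based on distant donors.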
However, in settings with many treated and untreated units, multiplicity of solutions and how to choose among them become important issues for estimation. Moreover, large interpolation biases may also arise in settings where the predictor values for treated units fall outside the convex hull of the predictor values for the units in the donor pool, especially when the units contributing to synthetic controls are far away from the treated units in the space of predictors. That is, there may be cases such that $X_1 \approx X_0 W^*$ but where $X_j$ greatly differs from $X_1$, for some unit $j$ contributing to the synthetic control.

To address these challenges, Abadie and L’Hour (2019) propose a synthetic control estimator that incorporates a penalty for pairwise matching discrepancies between the treated units and each of the units that contribute to their synthetic controls. Consider a setting with $I$ treated units and $J$ untreated units. We will index observations so that the treated units come first. That is, units $j = 1, \ldots, I$ are treated and units $j = I + 1, \ldots, I + J$ are untreated, with $I + J$ units in total. As in previous sections, $X_j$ is the vector of predictor values for unit $j$, and $X_0$ is the matrix of the predictor values for the units in the donor pool. For $\lambda > 0$, the estimator in Abadie and L’Hour (2019) minimizes

(13)   $\|X_i - X_0 W\|^2 + \lambda \sum_{j=I+1}^{I+J} w_j \|X_i - X_j\|^2$

with respect to $W = (w_{I+1}, \ldots, w_{I+J})'$, for each treated unit, $i = 1, \ldots, I$, subject to the constraints that the weights $w_{I+1}, \ldots, w_{I+J}$ are nonnegative and sum to one.20 The first term in equation (13) is the aggregate discrepancy between the predictor values for treated unit $i$ and its synthetic control. The second term in equation (13) penalizes pairwise matching discrepancies between the predictor values for unit $i$ and each of the units that contribute to its synthetic control, weighted by the magnitudes of their contributions. The penalty term is added to equation (13) with the aim of reducing interpolation biases. As $\lambda \to \infty$, the penalized estimator converges to one-to-one matching. As $\lambda \to 0$, the estimator uses an aggregate of pairwise matching discrepancies weighted by $W$ to select among all synthetic controls that attain the minimal value for $\|X_i - X_0 W\|$. Values of $\lambda$ between zero and infinity trade off aggregate fit of the synthetic control and pairwise fit of each of the units that contribute to it.

Abadie and L’Hour (2019) show that if $\lambda > 0$, then the minimizer of equation (13) is unique and sparse (provided that the columns of $X_0$ are in general quadratic position, see Abadie and L’Hour 2019 for details). They also provide cross-validation techniques to select $\lambda$.

Let $W_i^* = (w_{i,I+1}^*, \ldots, w_{i,I+J}^*)'$ be the solution to the minimization problem in equation (13). Then, the estimated treatment effect for $i = 1, \ldots, I$ and $t = T_0 + 1, \ldots, T$ is as in (8),

(14)   $\hat{\tau}_{it} = Y_{it} - \sum_{j=I+1}^{I+J} w_{ij}^* Y_{jt},$

with average treatment effect given by

$\hat{\tau}_t = \frac{1}{I} \sum_{i=1}^{I} \hat{\tau}_{it}.$

In many instances, especially where the sample units are aggregates like regions or countries, a weighted average (e.g., population-weighted, or GDP-weighted) treatment effect may be most relevant.

Dube and Zipperer (2015) and Abadie and L’Hour (2019) propose extensions of the permutation methods in Abadie, Diamond, and Hainmueller (2010) to the case with multiple

20 Although this is not reflected in the notation in equation (13), $\lambda$ may depend on $i$.
treated units. They employ rank-based statistics on the permutation distribution of treatment effects, where the identity of the treated units is permuted at random in the data. In particular, Abadie and L’Hour (2019) propose the following simple generalization of the permutation test in Abadie, Diamond, and Hainmueller (2010). They consider a setting with $I$ treated units and $J$ untreated units. In each permutation $b = 1, \ldots, B$, the identities of the $I$ treated units are reassigned in the data among the $I + J$ units in the sample, and statistics $r_{b,1}, \ldots, r_{b,I}$ are calculated for the units coded as treated in the permutation. These statistics could be (bias-corrected) synthetic control estimates of treatment effects, or rescaled versions that take into account the pre-intervention fit as in equation (12), or their absolute values, positive parts, or negative parts, depending on the context. Notice that when $I$ is small relative to $I + J$, it may be possible to consider all possible treatment reassignments, in which case $B$ is equal to $(I + J)$-choose-$I$. If considering all possible treatment reassignments is computationally expensive, inference can be based on $B$ random draws from all subsets of $I$ units in the sample. Let $r_{0,1}, \ldots, r_{0,I}$ be the same statistics calculated for the actual treated units. Then, $B$ permutation repetitions, in addition to the original sample values for treatment, produce $I \times (B + 1)$ statistics, $r_{0,1}, \ldots, r_{0,I}, \ldots, r_{B,1}, \ldots, r_{B,I}$. Now, for each $b = 0, \ldots, B$, one can calculate $t_b$ equal to the sum of the ranks of $r_{b,1}, \ldots, r_{b,I}$ within $r_{0,1}, \ldots, r_{0,I}, \ldots, r_{B,1}, \ldots, r_{B,I}$. The permutation inference in Abadie and L’Hour (2019) is based on the “extremeness” of the statistic $t_0$ within the permutation distribution $t_0, t_1, \ldots, t_B$. Notice that, for $I = 1$, this mode of inference amounts to the permutation test in Abadie, Diamond, and Hainmueller (2010).

Hainmueller (2012) and Robbins, Saunders, and Kilmer (2017) also consider settings with multiple treated units. Instead of producing a separate synthetic control for each treated unit, they calculate a single synthetic control to match aggregate values of the predictors between the treated and nontreated samples. As in the usual synthetic control estimator, the weights in Hainmueller (2012) and Robbins, Saunders, and Kilmer (2017) are nonnegative and sum to a predetermined constant (typically equal to one, or to the number of treated units, depending on the scaling of the variables in the data set). These estimators require that there is at least one convex combination of units in the donor pool that exactly matches a prespecified set of moments of the predictors for the treated units. Among the sets of weights that perfectly reproduce the moments for the treated sample, Hainmueller (2012) and Robbins, Saunders, and Kilmer (2017) choose the one that minimizes a measure of discrepancy with respect to constant weights.

Bias Correction. Another practical complication in settings with many treated units is that, even with a moderate $k$, the predictor values for some of the treated units may not be closely reproduced by a synthetic control, or may be closely reproduced only by combinations of units with large pairwise matching discrepancies in predictor values with respect to the treated unit. At the same time, including those ill-fitted units in the calculation of the aggregate effect may be important for the desired interpretation of the estimate (e.g., as an estimate of the average effect of the treatment on the treated). In that case, one could be concerned about the potential biases produced by matching discrepancies between the values of the predictors for the treated units and those for the respective synthetic controls.21 Bias corrections

21 Related to this problem, Ferman and Pinto (2019) and Botosaru and Ferman (2019) study the properties of synthetic control estimators for cases where the value of the predictors for a treated unit cannot be closely matched by a synthetic control.
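
The rank-sum permutation procedure of Abadie and L’Hour (2019) summarized earlier in this section can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors’ implementation: the function names, the toy outcomes `Y`, and the stand-in statistic `gap_stat` (each unit’s outcome minus the mean outcome of units not coded as treated) are hypothetical, and in practice the statistics would be synthetic control estimates. The one-sided p-value, defined as the share of rank sums $t_b$ that are at least as large as $t_0$, is also an assumption made here to give “extremeness” a concrete form.

```python
import random
from bisect import bisect_left, bisect_right
from itertools import combinations


def rank_sum_permutation_test(stat_fn, n_units, treated, n_draws=None, seed=0):
    """Return (t_0, p_value) for a rank-sum permutation test.

    stat_fn(units) returns one statistic per unit in `units`, computed as if
    those units were treated. With n_draws=None every subset of size I is
    used; otherwise n_draws random subsets are drawn.
    """
    n_treated = len(treated)
    if n_draws is None:
        reassigned = [list(c) for c in combinations(range(n_units), n_treated)]
    else:
        rng = random.Random(seed)
        reassigned = [rng.sample(range(n_units), n_treated)
                      for _ in range(n_draws)]

    # Block b = 0 holds the statistics for the actual treated units.
    blocks = [stat_fn(list(treated))] + [stat_fn(u) for u in reassigned]
    pooled = sorted(r for block in blocks for r in block)

    def rank(r):
        # Average rank of r within the pooled statistics (handles ties).
        return (bisect_left(pooled, r) + 1 + bisect_right(pooled, r)) / 2.0

    t = [sum(rank(r) for r in block) for block in blocks]
    # Assumed one-sided notion of extremeness: large rank sums are extreme.
    p_value = sum(tb >= t[0] for tb in t) / len(t)
    return t[0], p_value


# Hypothetical post-intervention outcomes; units 0 and 1 are treated.
Y = [15.0, 14.0, 10.0, 9.5, 10.5, 9.0, 11.0, 10.2, 9.8, 10.4]


def gap_stat(units):
    """Stand-in statistic: outcome minus the mean of units not in `units`."""
    controls = [Y[j] for j in range(len(Y)) if j not in units]
    baseline = sum(controls) / len(controls)
    return [Y[i] - baseline for i in units]


t0, p = rank_sum_permutation_test(gap_stat, len(Y), treated=[0, 1])
print(t0, p)
```

With ten units and two treated units, the default setting enumerates all 45 possible reassignments, which corresponds to the case where $B$ equals $(I + J)$-choose-$I$. Passing `n_draws` switches to random draws for larger samples.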
Abadie: Using Synthetic Controls 419
also play an important role in reducing regularization biases in inferential methods for regression-based variants of synthetic controls (see, e.g., Arkhangelsky et al. 2019, Chernozhukov, Wüthrich, and Zhu 2019b).

Abadie and L’Hour (2019) and Ben-Michael et al. (2020) propose modifications of the synthetic control estimator along the lines of the bias-correction techniques of Rubin (1973), Quade (1982), and Abadie and Imbens (2011). They use regression adjustments to attenuate the bias of synthetic control estimators in settings where the synthetic control counterfactual is constructed using untreated units with values of the predictors that do not closely reproduce the predictor values for the treated unit or units. For $t = T_0 + 1, \ldots, T$, let $\hat{\mu}_{0t}$ be a sample regression function (parametric or nonparametric) estimated by regressing the untreated outcomes, $Y_{I+1,t}, \ldots, Y_{I+J,t}$, on the values of the predictors for the untreated units, $X_{I+1}, \ldots, X_{I+J}$. The bias-corrected synthetic control estimator for unit $i$ is

(15)   $\hat{\tau}_{it} = \Big( Y_{it} - \sum_{j=I+1}^{I+J} w_{ij}^* Y_{jt} \Big) - \sum_{j=I+1}^{I+J} w_{ij}^* \big( \hat{\mu}_{0t}(X_i) - \hat{\mu}_{0t}(X_j) \big).$

The first term on the right-hand side of (15) is the synthetic control estimator in (14). The second term uses a regression adjustment to correct for discrepancies between the predictor values for the treated unit and the predictor values for the units that contribute to the synthetic control. Alternatively, the estimator in (15) can be expressed as

(16)   $\hat{\tau}_{it} = \big( Y_{it} - \hat{\mu}_{0t}(X_i) \big) - \sum_{j=I+1}^{I+J} w_{ij}^* \big( Y_{jt} - \hat{\mu}_{0t}(X_j) \big).$

Equation (16) provides an interpretation of the bias-corrected synthetic control estimator as a synthetic control estimator applied to regression residuals. The bias correction in equation (16) is related to the proposal in Doudchenko and Imbens (2016) to residualize the outcomes with respect to covariates before calculating synthetic controls.

A different avenue to evaluate the bias of synthetic control estimators, which was discussed in section 7, is given by the availability of pre-intervention periods, when the effect of the treatment is not yet realized. In the absence of anticipation effects, estimates of treatment effects before the intervention are reflective of estimation biases. If biases are stable in time, estimates of those biases could be used to correct synthetic control estimates. Bias adjustments of this type, which are closely related to difference-in-differences methods, are proposed in Arkhangelsky et al. (2019) and Chernozhukov, Wüthrich, and Zhu (2019b).

Regression-Based Methods and Extrapolation. Several articles have contributed regression-based estimators for synthetic controls. These procedures allow extrapolation by considering synthetic controls that are not convex combinations of the units in the donor pool. Doudchenko and Imbens (2016) consider an estimator that fits all pretreatment outcomes for the treated, with weights that may be negative and may not sum to one, and allow for a constant shift in the level of the synthetic control estimator. They propose to use an elastic net, that is, a combination of lasso ($L_1$) and ridge ($L_2$) penalties, to regularize the weights. The counterfactual estimates for $t = T_0 + 1, \ldots, T$ in Doudchenko and Imbens (2016) are

(17)   $\hat{Y}_{1t}^N = \hat{\alpha} + \sum_{j=2}^{J+1} \hat{w}_j Y_{jt},$
where $\hat{\alpha}, \hat{w}_2, \ldots, \hat{w}_{J+1}$ minimize

(18)   $\sum_{t=1}^{T_0} \Big( Y_{1t} - \alpha - \sum_{j=2}^{J+1} w_j Y_{jt} \Big)^2 + \lambda_1 \Big( \frac{1 - \lambda_2}{2} \sum_{j=2}^{J+1} w_j^2 + \lambda_2 \sum_{j=2}^{J+1} |w_j| \Big),$

with respect to $(\alpha, w_2, \ldots, w_{J+1}) \in \mathbb{R}^{J+1}$, where $\lambda_1 \geq 0$ and $0 \leq \lambda_2 \leq 1$ are regularization parameters selected by cross-validation. To incorporate additional predictors in their estimation procedure, Doudchenko and Imbens (2016) propose to use least squares in a first step to residualize the outcomes $Y_{jt}$ for $j = 1, \ldots, J + 1$ in equation (18) with respect to any other covariates. Chernozhukov, Wüthrich, and Zhu (2019a) consider different penalty terms, including lasso regularization (i.e., $\lambda_2 = 1$), in the context of an inferential procedure for synthetic controls. Regression estimators of this type are also related to the panel data approach to the program evaluation estimator in Hsiao, Ching, and Wan (2012), where $\lambda_1 = 0$ and the parameters in equation (17) are estimated by unpenalized least squares. Li (2019) considers the same estimator as in Hsiao, Ching, and Wan (2012), but regularizes the weights $\hat{w}_2, \ldots, \hat{w}_{J+1}$ to be nonnegative.

Arkhangelsky et al. (2019) introduce a synthetic control estimator that weights not only the units in the control group, but also the pre-intervention time periods, to approximate the counterfactual of interest. The time weights in Arkhangelsky et al. (2019) play a role similar to that of the predictor weights, $v_1, \ldots, v_k$, of subsection 3.2. They reflect the importance of each of the individual predictors, which in the leading version of the estimator of Arkhangelsky et al. (2019) are past outcome values.

Matrix Completion/Estimation Methods. Amjad, Shah, and Shen (2018); Amjad et al. (2019); and Athey et al. (2020) propose related methods that use tools from the matrix completion/matrix estimation literature. Suppose, as before, that unit $j = 1$ is the treated unit, and units $j = 2, \ldots, J + 1$ are not treated. Amjad, Shah, and Shen (2018) posit a nonlinear factor-structure model for the untreated, $Y_{jt}^N = f(\mu_j, \lambda_t) + \varepsilon_{jt}$, with $j = 2, \ldots, J + 1$ and $t = 1, \ldots, T$, where $\varepsilon_{jt}$ is random noise. Their framework allows for the presence of missing values in the matrix $\{Y_{jt}^N\}$, with $j = 1, \ldots, J + 1$, and $t = 1, \ldots, T$. Using matrix estimation methods, in particular singular value thresholding (see Chatterjee 2015), Amjad, Shah, and Shen (2018) estimate a low-rank approximation, $\{\hat{M}_{jt}\}$, to the matrix $\{M_{jt}\} = \{f(\mu_j, \lambda_t)\}$. The objects $\hat{M}_{jt}$ are used to de-noise the outcomes $Y_{jt}^N$ and to impute missing values, if any. Then, synthetic controls are obtained as linear combinations of $\hat{M}_{jt}$, with coefficients estimated by ridge regression of $Y_{1t}$ on $\hat{M}_{2t}, \ldots, \hat{M}_{J+1,t}$ in the pre-intervention periods. The estimator in Amjad, Shah, and Shen (2018) does not incorporate covariates, using data on outcomes, $Y_{jt}$, only. Amjad et al. (2019) modify the estimator in Amjad, Shah, and Shen (2018) to incorporate additional variables aside from the outcome of interest, under the assumption that all variables depend on common latent factors. Athey et al. (2020) postulate the model $Y_{jt}^N = M_{jt} + \varepsilon_{jt}$, for $j = 1, \ldots, J + 1$ and $t = 1, \ldots, T$, where $\varepsilon_{jt}$ is again random noise. In their framework, missing entries in the matrix $\{Y_{jt}^N\}$, with $j = 1, \ldots, J + 1$ and $t = 1, \ldots, T$, arise naturally for the treated observation (or treated observations, if multiple units are treated) in the posttreatment periods. Athey et al. (2020) assume that the matrix $\{M_{jt}\}$, with $j = 1, \ldots, J + 1$ and $t = 1, \ldots, T$, is low-rank, which allows them to obtain an estimate, $\{\hat{M}_{jt}\}$, via matrix
completion techniques. The estimated counterfactual outcomes without the treatment for the treated are the values of $\hat{M}_{jt}$ such that $Y_{jt}^N$ is missing. Extensions allow for models with covariates and the inclusion of time fixed effects and unit fixed effects (separate from the low-rank matrix, $M_{jt}$).

Inference. Several studies have proposed inferential tools for synthetic controls as alternatives to the permutation test in subsection 3.5. Firpo and Possebom (2018) propose several generalizations of the permutation test in subsection 3.5 and contribute confidence sets based on inverting the results of these tests. In a repeated sampling framework for stationary data and large $T_0$, Hahn and Shi (2017) propose to apply the end-of-sample instability test of Andrews (2003) to obtain an inferential procedure for synthetic control estimators. In the context of synthetic control estimators, the end-of-sample instability test of Andrews (2003) is related to the backdating ideas of section 7. It compares the values of treatment effects computed for the $T - T_0$ post-intervention periods to the distribution of the same values computed for every subset of $T - T_0$ consecutive pre-intervention periods. Related also to Andrews’s end-of-sample instability test, Chernozhukov, Wüthrich, and Zhu (2019a) devise a sampling-based inferential procedure for synthetic controls and related methods that employs permutations of regression residuals in the time dimension. In particular, Chernozhukov, Wüthrich, and Zhu (2019a) assume $Y_{1t}^N = P_t^N + u_t$, where $u_1, \ldots, u_T$ are stationary and weakly dependent with mean zero. Let $\tau_{T_0+1}, \ldots, \tau_T$ be the effects of the treatment on the treated unit (unit one) at times $t = T_0 + 1, \ldots, T$.22 The potential outcome under the intervention is $Y_{1t}^I = P_t^N + \tau_t + u_t$ for $t > T_0$. To simplify the exposition, assume that $u_1, \ldots, u_T$ are i.i.d. Then, the distribution of a function, $S(u_{T_0+1}, \ldots, u_T)$, of the post-intervention values of $u_t$ should be the same as the distribution of $S(u_{\pi(T_0+1)}, \ldots, u_{\pi(T)})$, where $\pi(1), \ldots, \pi(T)$ is a random permutation of $1, \ldots, T$. Suppose for now that $P_t^N$ is known. Then, under a null hypothesis, $\tau_{T_0+1} = a_{T_0+1}, \ldots, \tau_T = a_T$, we can compute $u_t = Y_{1t} - P_t^N - a_t$, where $a_t = 0$ for $1 \leq t \leq T_0$. As a result, we can test the null hypothesis by comparing the value of $S(u_{T_0+1}, \ldots, u_T)$ to its permutation distribution, that is, the distribution of $S(u_{\pi(T_0+1)}, \ldots, u_{\pi(T)})$, which can be directly computed in the data. A feasible implementation of the test requires estimation of the residuals, $u_1, \ldots, u_T$. In the context of the synthetic control method, Chernozhukov, Wüthrich, and Zhu (2019a) adopt the model $P_t^N = \sum_{j=2}^{J+1} w_j Y_{jt}$ with nonnegative weights that sum to one, and $E[u_t Y_{jt}] = 0$ for $j = 2, \ldots, J + 1$, and implement their test on constrained least squares residuals, $\hat{u}_1, \ldots, \hat{u}_T$. The proposal in Chernozhukov, Wüthrich, and Zhu (2019a) differs from other synthetic control procedures in two important respects. First, while much of the literature on synthetic controls has adopted the linear factor model of subsection 3.3 as a working model to understand the properties of synthetic control estimators, Chernozhukov, Wüthrich, and Zhu (2019a) adopt instead the restriction $E[u_t Y_{jt}] = 0$ for $j = 2, \ldots, J + 1$ to estimate $P_t^N$.23 They

22 To be consistent with the notation for $P_t^N$ and $u_t$ and because only unit one is treated, here I drop the subscript indicating the identity of the treated unit from the notation for treatment effect.

23 To understand the differences between these two frameworks, notice that when the data are generated by the linear factor model of subsection 3.3, and there is an unbiased synthetic control (that is, a synthetic control with weights, $w_2, \ldots, w_{J+1}$, that exactly reproduces $Z_1$ and $\mu_1$), then the restriction $E[u_t Y_{jt}] = 0$ for $j = 2, \ldots, J + 1$ and $u_t = Y_{1t}^N - \sum_{j=2}^{J+1} w_j Y_{jt}$ does not hold in general (see Ferman and Pinto 2019). One exception is given by the results in Ferman (2019), which imply that $E[u_t Y_{jt}] = 0$ for $j = 2, \ldots, J + 1$ will approximately hold as
show, however, that regardless of the validity of the model, their testing procedure remains valid as long as the estimated residuals, $\hat{u}_1, \ldots, \hat{u}_T$, are exchangeable under the null hypothesis. Second, in contrast to other synthetic control procedures that compute the weights $w_2, \ldots, w_{J+1}$ using pre-intervention data only, in the inferential procedure of Chernozhukov, Wüthrich, and Zhu (2019a) the synthetic control weights are estimated under the null hypothesis, $\tau_{T_0+1} = a_{T_0+1}, \ldots, \tau_T = a_T$, using data on $Y_{it}^N = Y_{it} - a_t$ for all periods, including the periods after the intervention. For a similar set of models, Chernozhukov, Wüthrich, and Zhu (2019b) propose bias-corrected synthetic control estimation and confidence intervals for the mean value of the treatment effect over the post-intervention period, $(\tau_{T_0+1} + \cdots + \tau_T) / (T - T_0)$, in settings when both $T_0$ and $T - T_0$ are large. Similar to difference in differences, the bias-correction procedure of Chernozhukov, Wüthrich, and Zhu (2019b) adjusts for differences in pre-intervention outcomes between the treated unit and the synthetic control. Confidence intervals are based on an asymptotically pivotal t-statistic and centered on the average of K-fold cross-fitted versions of the bias-corrected synthetic control estimate. Cattaneo, Feng, and Titiunik (2021) propose predictive intervals for synthetic control estimators and related methods. They adopt a predictive model $Y_{1t}^N = P_t^N + u_t$, where $P_t^N$ depends on observed predictors and unknown parameters, and $u_t$ is an unobserved random error. Their predictive intervals for $\hat{\tau}_{1t} = Y_{1t} - \hat{Y}_{1t}^N$ (with $t > T_0$) take into account estimation uncertainty about the values of the parameters in $P_t^N$ as well as irreducible uncertainty about the value of $u_t$.

Other Contributions. In this article, I have provided a brief description of selected strands of the literature on synthetic controls and related methods, starting with the canonical estimator in sections 2 and 3, and describing some extensions and related methods in the current section. The literature is vast in its totality, however, and there are many noteworthy contributions I did not cover. They include Bai and Ng (2019); Brodersen et al. (2015); Gobillon and Magnac (2016); Gunsilius (2020); Kennedy-Shaffer, de Gruttola, and Lipsitch (2020); Viviano and Bradic (2019); and Xu (2017), among many others. Samartsidis et al. (2019) study the performance of the canonical synthetic control estimator and related methods in the context of the German reunification example of subsection 3.2. As the set of methods on synthetic controls keeps expanding and enriching the applied econometrics toolkit, this is still a young literature and much remains to be done. I mention some open areas in the final section of this article.

9. Conclusions

Synthetic controls provide many practical advantages for the estimation of the effects of policy interventions and other events of interest. However, as with any other statistical procedure (and especially for those aimed at estimating causal effects), the credibility of the results depends crucially on the level of diligence exerted in the application of the method and on whether contextual and data requirements are met in the empirical application at hand. In this article, I emphasize the notion that mechanical applications of synthetic controls that do not take into account the context of the investigation or the nature of the data are risky enterprises. To this end, the article discusses the methodological underpinnings of synthetic control estimators and the conditions under which they provide suitable estimates of

$J \to \infty$ if there are weights, $w_2, \ldots, w_{J+1}$, that asymptotically recover $Z_1$ and $\mu_1$ and are increasingly diluted among the units in the donor pool.
causal effects. It also describes how the analysis may be modified in the cases when those conditions do not hold. Finally, the article discusses some recent extensions that widen the applicability, robustness, and flexibility of the method.

Open areas of related research abound, both methodological and empirical. Results on sampling-based inference, external validity, sensitivity to model restrictions, estimation with multiple interventions, and the identification of the channels through which the effect of an event or intervention operates, to mention a few, are scant or absent in the synthetic controls literature. An area of recent heightened interest regarding the use of synthetic controls is the design of experimental interventions in settings where the intervention of interest can only be applied to one or a small number of aggregate units. In addition, existing results on robust and efficient computation of synthetic controls are scarce, and more research is needed on the computational aspects of this methodology. On the empirical side, many of the events and policy interventions economists care about take place at an aggregate level, affecting entire aggregate units like school districts, cities, regions, or countries. This is exactly the setting synthetic controls were designed for, and potential applications of synthetic controls in economics are many.

References

Abadie, Alberto. 2005. “Semiparametric Difference-in-Differences.” Review of Economic Studies 72 (1): 1–19.
Abadie, Alberto, Susan Athey, Guido W. Imbens, and Jeffrey M. Wooldridge. 2020. “Sampling-Based versus Design-Based Uncertainty in Regression Analysis.” Econometrica 88 (1): 265–96.
Abadie, Alberto, Alexis Diamond, and Jens Hainmueller. 2010. “Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Program.” Journal of the American Statistical Association 105 (490): 493–505.
Abadie, Alberto, Alexis Diamond, and Jens Hainmueller. 2015. “Comparative Politics and the Synthetic Control Method.” American Journal of Political Science 59 (2): 495–510.
Abadie, Alberto, and Javier Gardeazabal. 2003. “The Economic Costs of Conflict: A Case Study of the Basque Country.” American Economic Review 93 (1): 113–32.
Abadie, Alberto, and Guido W. Imbens. 2011. “Bias-Corrected Matching Estimators for Average Treatment Effects.” Journal of Business & Economic Statistics 29 (1): 1–11.
Abadie, Alberto, and Jérémy L’Hour. 2019. “A Penalized Synthetic Control Estimator for Disaggregated Data.” Unpublished. https://sites.google.com/site/jeremylhour/research.
Acemoglu, Daron, Simon Johnson, Amir Kermani, James Kwak, and Todd Mitton. 2016. “The Value of Connections in Turbulent Times: Evidence from the United States.” Journal of Financial Economics 121 (2): 368–91.
Allegretto, Sylvia, Arindrajit Dube, Michael Reich, and Ben Zipperer. 2017. “Credible Research Designs for Minimum Wage Studies: A Response to Neumark, Salas, and Wascher.” ILR Review 70 (3): 559–92.
Amjad, Muhammad, Devavrat Shah, and Dennis Shen. 2018. “Robust Synthetic Control.” Journal of Machine Learning Research 19 (22): 1–51.
Amjad, Muhammad J., Vishal Misra, Devavrat Shah, and Dennis Shen. 2019. “mRSC: Multidimensional Robust Synthetic Control.” Proceedings of the ACM on Measurement and Analysis of Computing Systems 3 (2). https://doi.org/10.1145/3341617.3326152.
Andrews, D. W. K. 2003. “End-of-Sample Instability Tests.” Econometrica 71 (6): 1661–94.
Angrist, Joshua D., and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton and Oxford: Princeton University Press.
Arkhangelsky, Dmitry, Susan Athey, David A. Hirshberg, Guido W. Imbens, and Stefan Wager. 2019. “Synthetic Difference in Differences.” NBER Working Paper 25532.
Athey, Susan, Mohsen Bayati, Nikolay Doudchenko, Guido Imbens, and Khashayar Khosravi. 2020. “Matrix Completion Methods for Causal Panel Data Models.” Available on arXiv at 1710.10251.
Athey, Susan, and Guido W. Imbens. 2017. “The State of Applied Econometrics: Causality and Policy Evaluation.” Journal of Economic Perspectives 31 (2): 3–32.
Bai, Jushan. 2009. “Panel Data Models with Interactive Fixed Effects.” Econometrica 77 (4): 1229–79.
Bai, Jushan, and Serena Ng. 2019. “Matrix Completion, Counterfactuals, and Factor Analysis of Missing