
University of Utah

Case Selection Techniques in Case Study Research: A Menu of Qualitative and Quantitative
Options
Author(s): Jason Seawright and John Gerring
Source: Political Research Quarterly, Vol. 61, No. 2 (Jun., 2008), pp. 294-308
Published by: Sage Publications, Inc. on behalf of the University of Utah
Stable URL: http://www.jstor.org/stable/20299733
Accessed: 26/02/2010 15:06

University of Utah and Sage Publications, Inc. are collaborating with JSTOR to digitize, preserve and extend
access to Political Research Quarterly.

Political Research Quarterly
Volume 61 Number 2
June 2008 294-308

© 2008 University of Utah
10.1177/1065912907313077
http://prq.sagepub.com
hosted at
http://online.sagepub.com

Case Selection Techniques in Case Study Research

A Menu of Qualitative and Quantitative Options


Jason Seawright
Northwestern University, Evanston, Illinois

John Gerring
Boston University, Massachusetts

How can scholars select cases from a large universe for in-depth case study analysis? Random sampling is not typically
a viable approach when the total number of cases to be selected is small. Hence attention to purposive modes of
sampling is needed. Yet, while the existing qualitative literature on case selection offers a wide range of suggestions
for case selection, most techniques discussed require in-depth familiarity of each case. Seven case selection procedures
are considered, each of which facilitates a different strategy for within-case analysis. The case selection procedures
considered focus on typical, diverse, extreme, deviant, influential, most similar, and most different cases. For
each case selection procedure, quantitative approaches are discussed that meet the goals of the approach, while still
requiring information that can reasonably be gathered for a large number of cases.

Keywords: case study; case selection; qualitative methods; multimethod research

Case selection is the primordial task of the case study researcher, for in choosing cases, one also sets out an agenda for studying those cases. This means that case selection and case analysis are intertwined to a much greater extent in case study research than in large-N cross-case analysis. Indeed, the method of choosing cases and analyzing those cases can scarcely be separated when the focus of a work is on one or a few instances of some broader phenomenon.

Yet choosing good cases for extremely small samples is a challenging endeavor (Gerring 2007, chaps. 2 and 4). Consider that most case studies seek to elucidate the features of a broader population. They are about something larger than the case itself, even if the resulting generalization is issued in a tentative fashion (Gerring 2004). In case studies of this sort, the chosen case is asked to perform a heroic role: to stand for (represent) a population of cases that is often much larger than the case itself. If cases consist of countries, for example, the population might be understood as a region (e.g., Latin America), a particular type of country (e.g., oil exporters), or the entire world (over some period of time). Evidently, the problem of representativeness cannot be ignored if the ambition of the case study is to reflect on a broader population of cases. At the same time, a truly representative case is by no means easy to identify. Additionally, chosen cases must also achieve variation on relevant dimensions, a requirement that is often unrecognized. A third difficulty is that background cases often play a key role in case study analysis. They are not cases per se, but they are nonetheless integrated into the analysis in an informal manner. This means that the distinction between the case and the population that surrounds it is never as clear in case study work as it is in the typical large-N cross-case study.

Despite the importance of the subject, and its evident complexities, the question of case selection has received relatively little attention from scholars since the pioneering work of Eckstein (1975), Lijphart (1971, 1975), and Przeworski and Teune (1970). To be sure, recent work has noted the problem of sample bias and debated its sources and impact at great length (Achen and Snidal 1989; Collier and Mahoney 1996;
Geddes 1990; King, Keohane, and Verba 1994; Rohlfing 2008; Sekhon 2004), but no solutions to this problem have been proffered beyond those implicit in work by Eckstein, Lijphart, and Przeworski and Teune.

In the absence of detailed, formal treatments, scholars continue to lean primarily on pragmatic considerations such as time, money, expertise, and access. They may also be influenced by the theoretical prominence of a given case. Of course, these are perfectly legitimate factors in case selection. Yet they do not provide a methodological justification for why case A might be preferred over case B. Indeed, they may lead to highly misleading results, as suggested by the literature on sample bias (cited previously). Thus, even if cases are initially chosen for pragmatic reasons, it is essential that researchers understand retroactively how the properties of the selected cases comport with the rest of the population.

To be sure, methodological arguments for small-N case selection are not entirely lacking. These are characteristically summarized as case study types: extreme, deviant, crucial, most similar, and so forth; however, these commonly invoked terms are poorly understood and often misapplied. The techniques we discuss subsequently thus offer the possibility for small-N scholars to develop more rigorous and detailed explanations of how their cases relate to the others in a broader universe. Moreover, existing discussions of case selection for case studies offer little practical direction in circumstances where the potential cases are numerous. How are we to know which cases are deviant (or most deviant) if the population numbers in the hundreds or thousands? Finally, and perhaps most important, the usual menu of options derived from Eckstein and colleagues is notably incomplete.

In this article, we clarify the methodological issues involved in case selection, where the scholar's objective is to build and test general causal theories about the social world on the basis of one or a few cases. We also attempt to provide a more comprehensive menu of options for case selection in case study work. Our final objective is to offer new techniques for case selection in situations where data for key variables are available across a large sample. In these situations, we show that standard statistical techniques may be profitably employed to clarify and systematize the process of case selection. Of course, this sort of large-N analysis is not practicable in all instances, but where it is (that is, where data and modeling techniques are propitious), we suggest that it has a lot to offer to case study research. To the extent that these techniques are successful, they may provide a concrete and fruitful integration of quantitative and qualitative techniques, a line of inquiry pursued by a number of recent studies (e.g., George and Bennett 2005; Brady and Collier 2004; Gerring 2001, 2007; Goertz 2006; King, Keohane, and Verba 1994; Ragin 2000).

Why Not Choose Cases Randomly?

Before exploring specific techniques for case selection in case study research, it is worth asking at the outset whether such approaches are, in fact, necessary. Given the dangers of selection bias introduced whenever researchers choose their cases in a purposive fashion, perhaps case study researchers should choose cases randomly. This is the counsel one might intuit from quantitative methodological quarters (e.g., Sekhon 2004).

Yet serious problems are likely to develop if one chooses a very small sample in a completely random fashion (i.e., without any prior stratification). These may be illustrated through two simple Monte Carlo experiments, each involving a sample of cases and a single variable of interest, ranging from 0 to 1, with a mean of 0.5, in the population. In the first experiment, a computer generates five hundred random samples, each consisting of one thousand cases. In the second experiment, the computer generates five hundred random samples, each consisting of only five cases.

How representative are the random samples in these two experiments? Both produce unbiased samples. The average across the means drawn from the first experiment is 0.499, while the result for the second experiment is 0.508, both figures being very close to the true population mean; however, the means in the second experiment are more spread out than the means in the first experiment. When sample sizes are large (N = 1,000), the standard deviation is about 0.009; when sample sizes are small (N = 5), it is about 0.128. This result shows that for a comparative case study composed of five cases (or less), randomized case selection procedures will often produce a sample that is substantially unrepresentative of the population.

Given the insufficiencies of randomization as well as the problems posed by a purely pragmatic selection of cases, the argument for some form of purposive case selection seems strong. It is true that purposive methods cannot entirely overcome the inherent unreliability of generalizing from small-N samples, but they can nonetheless make an important contribution to the inferential process by enabling researchers to choose the most appropriate cases for
a given research strategy, which may be either quantitative or qualitative.

Techniques of Case Selection

How, then, are we to choose a sample for case study analysis? Note that case selection in case study research has the same twin objectives as random sampling; that is, one desires (1) a representative sample and (2) useful variation on the dimensions of theoretical interest.1 One's choice of cases is therefore driven by the way a case is situated along these dimensions within the population of interest. It is from such cross-case characteristics that we derive the seven case study types presented in Table 1: typical, diverse, extreme, deviant, influential, most similar, and most different. Most of these terms will be familiar to the reader from studies published over the past century (e.g., Mill 1872; Eckstein 1975; Lijphart 1971; Przeworski and Teune 1970). What bears emphasis is the variety of methodological purposes that these case selection techniques presume.

Before beginning, several caveats and clarifications must be issued. First, the case selection procedures discussed in this article properly apply to some case studies, but not all. As is well recognized, the key term case study is ambiguous, referring to a heterogeneous set of research designs (Gerring 2004, 2007). In this study, we insist on a fairly narrow definition: the intensive (qualitative or quantitative) analysis of a single unit or a small number of units (the cases), where the researcher's goal is to understand a larger class of similar units (a population of cases). There is thus an inherent problem of inference from the sample (of one or several) to a larger population. By contrast, a very different style of case study (so-called) aims to elucidate features specific to a particular case. Here the problem of case selection does not exist (or is at any rate minimized), for the case of primary concern has been identified a priori. This style of case study work is discussed in a companion piece (Gerring 2006).

A second matter of definition concerns the goals undertaken by a researcher. In this study, we are concerned primarily with causal inference, rather than with inferences that are descriptive or predictive in nature. The reader should keep in mind that case studies that are largely descriptive may not follow similar procedures of case selection.

A third matter of clarification concerns the population of the (causal) inference. In perusing the different techniques discussed in this article, it will be apparent that most of these depend on a clear idea of what the breadth of the chief inference is. It is only by reference to this larger set of cases that one can begin to think about which cases might be most appropriate for in-depth analysis. If nothing (or very little) is known about the population, the methods described in this study cannot be implemented or will have to be reimplemented once the true population becomes apparent. Thus a case study whose primary purpose is casing, that is, establishing what constitutes a case and, by extension, what constitutes the population (Ragin 1992), will not be able to make use of the techniques discussed here.

Several caveats pertain specifically to the use of statistical reasoning in the selection of cases. First, the population of the inference must be reasonably large; otherwise, statistical techniques are inapplicable. Second, relevant data must be available for that population, or a sizable sample of that population, on all of the key variables, and the researcher must feel reasonably confident in the accuracy and conceptual validity of these variables. Third, all the standard assumptions of statistical research (e.g., identification, specification, robustness, measurement error) must be carefully considered. Often, a central goal of the case study is to clarify these assumptions or correct errors in statistical analysis, so the process of in-depth study and case selection may be an interactive one. We shall not dilate further on these matters, except to warn the researcher against the unthinking use of statistical techniques.

Finally, it is important to underline the fact that our discussion disregards two important considerations pertaining to case selection: (1) pragmatic, logistical issues, including the theoretical prominence of a case in the literature on a topic, and (2) the within-case characteristics of a case. The first set of factors, which we have already mentioned, is not methodological in character; as such, it does not bear on the validity of an inference stemming from a case study. Moreover, we suspect that there is not much that can be said about these issues that is not already self-evident to the researcher. The second factor is methodological, properly speaking, and there is a great deal to be said about it (Gerring and McDermott 2007). In this study, however, we focus on factors of case selection that depend on the cross-case characteristics of a case: how the case fits into the theoretically specified population. This is how the term case selection is typically understood, so we are simply following convention by dividing up the subject in this manner.2
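
The two Monte Carlo experiments described under "Why Not Choose Cases Randomly?" can be approximated with a few lines of code. What follows is only a sketch under stated assumptions: the article does not say which distribution was used, so the variable is assumed here to be uniform on the interval from 0 to 1 (which has the required mean of 0.5), and the function name and seed are illustrative choices, not the authors' procedure. The exact figures will differ from those reported in the text, but the pattern (similar averages, very different spread) should be the same.

```python
import random
import statistics

def simulate_sample_means(sample_size, n_samples=500, seed=0):
    """Draw n_samples random samples and return the mean of each sample."""
    rng = random.Random(seed)
    # Assumption: the variable is uniform on [0, 1], so the population mean is 0.5.
    return [
        statistics.fmean([rng.random() for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

large = simulate_sample_means(sample_size=1000)  # first experiment
small = simulate_sample_means(sample_size=5)     # second experiment

for label, means in (("N = 1,000", large), ("N = 5", small)):
    print(
        label,
        "| average of sample means:", round(statistics.fmean(means), 3),
        "| standard deviation of sample means:", round(statistics.stdev(means), 3),
    )
```

Both experiments are unbiased in the sense that the average of the sample means sits near 0.5, but any single five-case sample can land far from the population mean, which is the point the text draws from the comparison.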
Table 1
Cross-Case Methods of Case Selection and Analysis

Method: Typical
Definition: Cases (one or more) are typical examples of some cross-case relationship.
Large-N technique: A low-residual case (on-lier).
Use: Confirmatory; to probe causal mechanisms that may either confirm or disconfirm a given theory.
Representativeness: By definition, the typical case is representative, given the specified relationship.

Method: Diverse
Definition: Cases (two or more) exemplify diverse values of X, Y, or X/Y.
Large-N technique: Diversity may be calculated by (1) categorical values of X or Y (e.g., Jewish, Catholic, Protestant), (2) standard deviations of X or Y (if continuous), or (3) combinations of values (e.g., based on cross tabulations, factor analysis, or discriminant analysis).
Use: Exploratory or confirmatory; illuminates the full range of variation on X, Y, or X/Y.
Representativeness: Diverse cases are likely to be representative in the minimal sense of representing the full variation of the population. (Of course, they may not mirror the distribution of that variation in the population.)

Method: Extreme
Definition: Cases (one or more) exemplify extreme or unusual values of X or Y relative to some univariate distribution.
Large-N technique: A case lying many standard deviations away from the mean of X or Y.
Use: Exploratory; open-ended probe of X or Y.
Representativeness: Achievable only in comparison with a larger sample of cases.

Method: Deviant
Definition: Cases (one or more) deviate from some cross-case relationship.
Large-N technique: A high-residual case (outlier).
Use: Exploratory or confirmatory; to probe new explanations for Y, to disconfirm a deterministic argument, or to confirm an existing explanation (rare).
Representativeness: After the case study is conducted, it may be corroborated by a cross-case test, which includes a general hypothesis (a new variable) based on the case study research. If the case is now an on-lier, it may be considered representative of the new relationship.

Method: Influential
Definition: Cases (one or more) with influential configurations of the independent variables.
Large-N technique: Hat matrix or Cook's distance.
Use: Confirmatory; to double-check cases that influence the results of a cross-case analysis.
Representativeness: An influential case is typically not representative. If it were typical of the sample as a whole, it would not have unusual influence on estimates of the overall relationship.

Method: Most similar
Definition: Cases (two or more) are similar on specified variables other than X1 and/or Y.
Large-N technique: Matching.
Use: Exploratory if the hypothesis is X- or Y-centered; confirmatory if X/Y-centered.
Representativeness: Most similar cases that are broadly representative of the population will provide the strongest basis for generalization.

Method: Most different
Definition: Cases (two or more) are different on specified variables other than X1 and Y.
Large-N technique: The inverse of the most similar method of large-N case selection.
Use: Exploratory or confirmatory; to (1) eliminate necessary causes (definitively) or (2) provide weak evidence of the existence of a causal relationship.
Representativeness: Most different cases that are broadly representative of the population will provide the strongest basis for generalization.

Note: X1 refers to the causal factor of theoretical interest.
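
The large-N technique that Table 1 lists for the extreme case (a case lying many standard deviations away from the mean) amounts to computing the absolute value of a Z-score for each case. The sketch below is illustrative only: the function name is hypothetical, and the inputs are the democracy figures the article reports in its extreme-case discussion (a mean of 2.76 and a standard deviation of 6.92 on the Polity measure).

```python
def extremity(value, mean, sd):
    """Absolute Z-score: distance from the mean, measured in standard deviations."""
    return abs(value - mean) / sd

# A country graded -10 on Polity, given the reported mean and standard deviation.
most_autocratic = extremity(-10, mean=2.76, sd=6.92)
# A country graded +10, the opposite end of the scale.
most_democratic = extremity(10, mean=2.76, sd=6.92)

print(round(most_autocratic, 2))  # 1.84
print(round(most_democratic, 2))  # 1.05
```

Because the mean lies above zero, the most autocratic score is further from the mean than the most democratic one, so it receives the larger extremeness score.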

The exposition will be guided by an ongoing example, the (presumably causal) relationship between economic development, as measured by per capita gross domestic product (GDP; Summers and Heston 1991), and democracy, as operationalized by the Polity2 variable drawn from the Polity IV data set (Marshall and Jaggers 2005). Figure 1 displays the classical result in the form of a bivariate scatterplot. Consistent with most work on the subject, wealthy countries are almost exclusively democratic (Boix and Stokes 2003; Lipset 1959). For heuristic purposes, certain unrealistic simplifying assumptions will be adopted in the subsequent discussion. We shall assume, for example, that the Polity measure of democracy is continuous and unbounded. We shall assume, more importantly, that the true relationship between economic development and democracy is log-linear, positive, and causally asymmetric, with economic development treated as exogenous and democracy as endogenous (but see Gerring et al. 2005; Przeworski et al. 2000).

[Figure 1. Democracy and Wealth in 1995. Horizontal axis: Logged 1995 Per Capita GDP.]

Our discussion of various techniques will be fairly straightforward: we will briefly state an idea about case selection from the tradition of case study research, we will specify the central issue involved in that approach to case selection, and then we will review available statistical tools for addressing this issue in a large-N context. It should be clear that the goal of this article is not to develop new quantitative estimators, but rather to show how existing estimators can be put to use in new contexts.

Typical Case

The typical case study focuses on a case that exemplifies a stable, cross-case relationship. By construction, the typical case may also be considered a representative case, according to the terms of whatever cross-case model is employed. Indeed, the latter term is often employed in the psychological literature (e.g., Hersen and Barlow 1976, 24).

Because the typical case is well explained by an existing model, the puzzle of interest to the researcher lies within that case. Specifically, the researcher wants to find a typical case of some phenomenon so that he or she can better explore the causal mechanisms at work in a general, cross-case relationship. This exploration of causal mechanisms may lead toward several different conclusions. If the existing theory suggests a specific causal pathway, then the researcher may perform a pattern-matching investigation, in which the evidence at hand (in the case) is judged according to whether it validates the stipulated causal mechanisms or not. Otherwise, the researcher may try to show that the causal mechanisms are different than those that had been previously stipulated. Or he or she may argue that there are no plausible causal mechanisms connecting this independent variable with this particular outcome. In the latter case, a typical case research design may provide disconfirming evidence of a general causal proposition.

Large-N analysis. One may identify a typical case from a large population of potential cases by looking for the smallest possible residual, that is, the distance between the predicted value and the actual (measured) value, for all cases in a multivariate analysis. In a large sample, there will often be many cases with almost identical near-zero residuals. In such situations, estimates may not be accurate enough to distinguish among several almost-identical cases. Thus researchers may randomly select from the set of cases with very high typicality (a stratified random-sampling procedure) or choose from among these cases according to nonmethodological criteria, as discussed.

As an example, let us return to the example introduced previously, involving the relationship between per capita GDP and level of democracy. Recall that the outcome (Y) is simply the Polity democracy score, and there is only one independent variable: logged per capita GDP. Hence a very simple model of the relationship may be represented as

$$E(\text{Polity}_i) = \beta_0 + \beta_1 \text{GDP}_i \qquad (1)$$

Scholars may also wish to include other nonlinear transformations of the logged per capita GDP variable
to allow a more flexible functional form. In the current example, we will add a quadratic term. Hence the model to be considered is

$$E(\text{Polity}_i) = \beta_0 + \beta_1 \text{GDP}_i + \beta_2 \text{GDP}_i^2 \qquad (2)$$

For the purposes of selecting typical cases, the specific coefficient estimates are relatively unimportant, but we will report them, to two digits after the decimal, for the sake of completeness:

$$E(\text{Polity}_i) = 10.52 - 4.59\, \text{GDP}_i + 0.45\, \text{GDP}_i^2 \qquad (3)$$

Much more important are the residuals for each case. Figure 2 shows a histogram of these residuals. Apparently, a fairly large number of cases have quite low residuals and may therefore be considered typical. (A higher proportion of cases fall far below the regression line than far above it, suggesting either that the model may be incomplete or that the error term does not have a normal distribution. It is hoped that within-case analysis will be able to shed light on the reasons for the asymmetry.) Indeed, twenty-six cases have a typicality score between 0 and -1. Any or all of these might reasonably be selected for in-depth analysis on account of their typicality in this general model.

[Figure 2. Residuals from a Regression of Democracy on Wealth. Histogram; horizontal axis: Residual from Robust Regression.]

Conclusion. Typicality responds to the first desideratum of case selection, that the chosen case be representative of a population of cases. Even so, it is important to remind ourselves that the single-minded pursuit of representativeness does not ensure that it will be achieved. Note that the test of typicality introduced here, the size of a case's residual, can be misleading if the statistical model is misspecified. Thus a case may lie directly on the regression line but still be, in some important respects, atypical.

Diverse Cases

A second case selection strategy has as its primary objective the achievement of maximum variance along relevant dimensions. We refer to this as a diverse case method.3 It requires the selection of a set of cases (at minimum, two), which are intended to represent the full range of values characterizing X, Y, or some particular X/Y relationship. The investigation is understood to be exploratory (hypothesis seeking) when the researcher focuses on X or Y and confirmatory (hypothesis testing) when he or she focuses on a particular X/Y relationship.

Where the individual variable of interest is categorical (on/off, red/black/blue, Jewish/Protestant/Catholic), the identification of diversity is readily apparent. The investigator simply chooses one case from each category. For a continuous variable, the researcher usually chooses both extreme values (high and low), and perhaps the mean or median as well. The researcher may also look for natural break points in the distribution that seem to correspond to categorical differences among cases. Where the causal factor of interest is a vector of variables, and where these factors can be measured, the researcher may simply combine various causal factors into a series of cells, based on cross tabulations of factors deemed to have an effect on Y. Things become slightly more complicated when one or more of these factors is continuous, rather than dichotomous, since the researcher will have to arbitrarily redefine that variable as a categorical variable (as previously).

Diversity may also be understood in terms of various causal paths, running from exogenous factors to a particular outcome. Perhaps three different independent variables (X1, X2, and X3) all cause Y, but they do so independently of each other and in different ways. Each is a sufficient cause of Y.4 George and Smoke (1974), for example, wish to explore different types of deterrence failure: by fait accompli, by limited probe, and by controlled pressure. Consequently, they wish to find cases that exemplify each type of causal mechanism. This may be identified by a traditional form of path analysis, by qualitative comparative analysis (Ragin 2000), by sequence analysis (Abbott and Tsay 2000), or by qualitative typologies (Collier, LaPorte, and Seawright 2007; Elman 2005).
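
The residual-based test of typicality described in the Typical Case section can be sketched in code. This is a minimal illustration, not the authors' implementation: the fitted values use the coefficients the article reports for its quadratic model, the four cases are hypothetical placeholders rather than real countries, and typicality is assumed here to be the negative absolute residual, a reading suggested by the text's reference to typicality scores "between 0 and -1".

```python
def predicted_polity(log_gdp):
    """Fitted value from the quadratic model, using the reported coefficients."""
    return 10.52 - 4.59 * log_gdp + 0.45 * log_gdp ** 2

def typicality(log_gdp, polity):
    """Assumed definition: negative absolute residual (0 = perfectly typical)."""
    return -abs(polity - predicted_polity(log_gdp))

# Hypothetical cases, not real data: (name, logged per capita GDP, Polity score).
cases = [
    ("Country A", 7.0, 0.0),
    ("Country B", 8.0, 3.0),
    ("Country C", 9.0, -7.0),
    ("Country D", 10.0, 10.0),
]

# Rank from most to least typical, then keep cases scoring between 0 and -1.
ranked = sorted(cases, key=lambda c: typicality(c[1], c[2]), reverse=True)
typical_cases = [name for name, gdp, polity in ranked if typicality(gdp, polity) > -1.0]

print(typical_cases)
```

Here Country C falls far below the fitted curve and is excluded, while the other three cases lie within one point of their predicted values. As the text cautions, such a ranking is only as good as the model that generates the residuals.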
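
The two simplest forms of diverse case selection described above (one case from each category of a categorical variable; the low, median, and high cases on a continuous variable) can be sketched as follows. The data, function names, and the use of a random draw within each category are all illustrative assumptions rather than part of the article's procedure.

```python
import random
from collections import defaultdict

# Hypothetical cases for illustration only: (name, category, continuous score).
cases = [
    ("Case 1", "Jewish", 0.2),
    ("Case 2", "Protestant", 0.9),
    ("Case 3", "Catholic", 0.5),
    ("Case 4", "Protestant", 0.1),
    ("Case 5", "Catholic", 0.7),
]

def one_per_category(cases, seed=0):
    """Pick one case from each category (at random, as an assumed tiebreaker)."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for name, category, _ in cases:
        groups[category].append(name)
    return {category: rng.choice(names) for category, names in groups.items()}

def low_median_high(cases):
    """Pick the cases with the lowest, median, and highest continuous score."""
    ordered = sorted(cases, key=lambda case: case[2])
    return [ordered[0][0], ordered[len(ordered) // 2][0], ordered[-1][0]]

categorical_picks = one_per_category(cases)
continuous_picks = low_median_high(cases)

print(categorical_picks)
print(continuous_picks)
```

Choosing at random within a category presumes that the researcher is indifferent among the cases it contains; where that is doubtful, a within-category check of typicality would be the more cautious choice.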

Large-N analysis. Where causal variables are continuous and the outcome is dichotomous, the researcher may employ discriminant analysis to identify diverse cases. Diverse case selection for categorical variables is also easily accommodated in a large-N context by using some version of stratified random sampling. In this approach, the researcher identifies the different substantive categories of interest as well as the number of cases to be chosen from each category. Then, the needed cases may be randomly chosen from among those available in each category (Cochran 1977).

One assumes that the identification of diverse categories of cases will, at the same time, identify categories that are internally homogenous (in all respects that might affect the causal relationship of interest). Because of the small number of cases to be chosen, the cases selected are not guaranteed to be representative of each category. Nevertheless, if the categories are carefully constructed, the researcher should, in principle, be indifferent among cases within a given category. Hence random sampling is a sensible tiebreaker; however, if there is suspected diversity within each category, then measures should be taken to ensure that the chosen cases are typical of each category. A case study should not focus on an atypical member of a subgroup.

Conclusions. Encompassing a full range of variation is likely to enhance the representativeness of the sample of cases chosen by the researcher. This is a distinct advantage. Of course, the inclusion of a full range of variation may distort the actual distribution of cases across this spectrum. If there are more high cases than low cases in a population, and the researcher chooses only one high case and one low case, the resulting sample of two is not perfectly representative. Even so, the diverse case method probably has stronger claims to representativeness than any other small-N sample (including the typical case).

Extreme Case

The extreme case method selects a case because of its extreme value on the independent (X) or dependent (Y) variable of interest. An extreme value is understood here as an observation that lies far away from the mean of a given distribution; that is to say, it is unusual. If most cases are positive along a given dimension, then a negative case constitutes an extreme case. If most cases are negative, then a positive case constitutes an extreme case. For case study analysis, it is the rareness of the value that makes a case valuable, not its positive or negative value (cf. Emigh 1997; Mahoney and Goertz 2004; Ragin 2000, 60; Ragin 2004, 126).

Large-N analysis. Extremity (E) for the ith case can be defined in terms of the sample mean ($\bar{X}$) and the standard deviation (s) for that variable:

$$E_i = \left| \frac{X_i - \bar{X}}{s} \right|$$

This definition of extremity is the absolute value of the Z-score (Stone 1996, 340) for the ith case. This may be understood as a matter of degrees, rather than as a (necessarily arbitrary) threshold.

Since extremeness is a unidimensional concept, it may be applied with reference to any dimension of a problem, a choice that is dependent on the scholar's research interest. Let us say that we are principally interested in countries' level of democracy, the dependent variable in the exemplary model that we have been exploring. The mean of our democracy measure is 2.76, suggesting that, on average, the countries in the 1995 data set tend to be somewhat more democratic than autocratic (by Polity's definition). The standard deviation is 6.92, implying that there is a fair amount of scatter around the mean in these data. Extremeness scores for this variable, understood as deviation from the mean, can then be graphed for all countries according to the previous formula. These are displayed in Figure 3. As it happens, two countries share the largest extremeness scores (1.84): Qatar and Saudi Arabia. Both are graded as -10 on Polity's twenty-one-point system (which ranges from -10 to +10). These are the most extreme cases in the population and, as such, pose natural subjects of investigation wherever the researcher's principal question of interest is in regime type.

Conclusion. The extreme case method appears to violate the social science folk wisdom warning us not to "select on the dependent variable" (Geddes 1990; King, Keohane, and Verba 1994; see also discussion in Brady and Collier 2004; Collier and Mahoney 1996). Selecting cases on the dependent variable is indeed problematic if the researcher treats the resulting sample (the extreme case) as if it were representative of a population.5 However, this is not the proper use of the extreme case method. Note that the extreme case method refers back to a larger sample of cases lying in the background of the analysis. These cases provide a full range of variation as well as a more representative picture of the population. So long as these background cases are not forgotten (i.e., retained
in the subsequent analysis as points of reference), the analysis is not likely to be subject to problems of sample bias. The extreme case approach to case study analysis is therefore a conscious attempt to maximize variance on the dimension of interest, not to minimize it.

Figure 3
Extremeness Scores on Democracy

Note also that the extreme case method is a purely exploratory method, a way of probing possible causes of Y, or possible effects of X, in an open-ended fashion. If the researcher has some notion of what additional factors might affect the outcome of interest, or of what relationship the causal factor of interest might have on Y, then he or she ought to pursue one of the other methods explored in this article. It follows that an extreme case method may morph into a different kind of approach as a study evolves, that is, as a more specific hypothesis comes to light. Indeed, the extreme case method often serves as an entrée into a subject, a subject which is subsequently interrogated with a more determinate (less open-ended) method.

Deviant Case

The deviant case method selects that case that, by reference to some general understanding of a topic (either a specific theory or common sense), demonstrates a surprising value. The deviant case is therefore closely linked to the investigation of theoretical anomalies. To say deviant is to imply anomalous.6 Thus, while extreme cases are judged relative to the mean of a single distribution (the distribution of values along a single variable), deviant cases are judged relative to some general model of causal relations. The deviant case method selects cases that, by reference to some general cross-case relationship, demonstrate a surprising value; they are poorly explained. The important point is that deviantness can only be assessed relative to the general (quantitative or qualitative) model employed.7 This means, of course, that the relative deviantness of a case is likely to change whenever the general model is altered.

The purpose of a deviant case analysis is usually to probe for new (but as yet unspecified) explanations. In this circumstance, the deviant case method is only slightly more bounded than the extreme case method. It, too, is an exploratory form of research. The researcher hopes that causal processes within the deviant case will illustrate some causal factor that is applicable to other (deviant) cases. This means that in most circumstances, a deviant case study culminates in a general proposition, one that may be applied to other cases in the population. As a consequence, one deviant case study may lead to a new cross-case model that identifies an entirely different set of deviant cases; however, there is also a second, less common reason for choosing a deviant case. If the researcher is interested in disconfirming a deterministic proposition, then any deviant case will do, so long as it lies within the specified population of the inference (Dion 1998).

Large-N analysis. In statistical terms, deviant-case selection is the opposite of typical-case selection. Where a typical case lies as close as possible to the prediction of a formal, mathematical representation of the hypothesis at hand, a deviant case stands as far as possible from that prediction. Hence, referring back to the model developed in equation (1), we can define the extent to which a case deviates from the predicted relationship as follows:

\[
\mathrm{Deviantness}(i) = \operatorname{abs}\bigl[y_i - E(y_i \mid x_{1,i}, \ldots, x_{K,i})\bigr] = \operatorname{abs}\bigl[y_i - (b_0 + b_1 x_{1,i} + \cdots + b_K x_{K,i})\bigr].
\]

Deviantness ranges from 0, for cases exactly on the regression line, to a theoretical limit of positive infinity. Researchers will be interested in selecting from the cases with the highest overall estimated deviantness. In our running example, the most deviant cases fall below the regression line, as can be seen in Figure 4. In fact, all eight of the cases with a deviantness score of more than 10 (Croatia, Cuba, Indonesia, Iran, Morocco, Singapore, Syria, and Uzbekistan) are below the regression line. An analysis focused on deviant cases might well select a subset of these.
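
As a rough illustration of the deviantness formula above, the following Python sketch fits a bivariate least-squares line and ranks cases by the absolute value of their residuals. The function name `deviantness_scores`, the case labels, and the toy numbers are hypothetical and are not drawn from the article's data set; the sketch also assumes a single predictor (K = 1), whereas the formula allows any number.

```python
def deviantness_scores(x, y):
    """Return abs(y_i - (b0 + b1 * x_i)) for each case, with b0, b1 fit by OLS."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Ordinary least squares slope and intercept for one predictor.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Deviantness is the absolute distance from the fitted line.
    return [abs(yi - (b0 + b1 * xi)) for xi, yi in zip(x, y)]


# Hypothetical toy data: a wealth measure (x) and a democracy score (y).
cases = ["A", "B", "C", "D", "E", "F"]
wealth = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
democracy = [-6.0, -2.0, 2.0, -9.0, 8.0, 10.0]

scores = deviantness_scores(wealth, democracy)

# Rank cases from most to least deviant.
ranking = [case for _, case in sorted(zip(scores, cases), reverse=True)]
print(ranking[0], round(max(scores), 2))
```

Sorting in descending order puts the most poorly explained cases first. Whether a case lies above or below the line can be recovered from the sign of the raw residual, which this sketch discards.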
Figure 4
Influence Scores from a Regression of Democracy on Wealth

Conclusion. As we have noted, the deviant case method is usually an exploratory form of analysis. As soon as a researcher's exploration of a particular case has identified a factor to explain that case, it is no longer (by definition) deviant. If the new explanation can be accurately measured as a single variable (or set of variables) across a larger sample of cases, then a new cross-case model is in order. In this fashion, a case study initially framed as a deviant case may transform into some other sort of analysis.

This feature of the deviant case study also helps to resolve questions about its representativeness. The representativeness of a deviant case is problematic since the case in question is, by construction, atypical. However, doubts about representativeness are addressed if the researcher generalizes whatever proposition is provided by the case study to other cases; that is, a new variable is added to the benchmark model. The modified cross-case analysis should pull the deviant case toward the expected value, mitigating an initial problem of unrepresentativeness. The deviant case, one hopes, is now more or less typical.

Influential Case

Sometimes, the choice of a case is motivated solely by the need to check the assumptions behind some general model of causal relations. In this circumstance, the extent to which a case fits the overall model is important only insofar as it might affect the overall set of findings for the whole population. Once cases that do influence overall findings have been identified, it is important to decide whether or not they genuinely fit in the sample (and whether they might give clues about important missing variables). Because the techniques for identifying this sort of case are different than those used to identify the deviant case, we apply a new term to this method: the influential case. The goal of this style of case study is to explore cases that may be influential vis-à-vis some larger cross-case theory, not to propose new theoretical formulations (though this may be the unintended by-product of an influential case analysis).

Large-N analysis. Influential cases in regression are those cases that, if counterfactually assigned a different value on the dependent variable, would most substantially change the resulting estimates. Two quantitative measures of influence are commonly applied in statistical analysis. The first, often referred to as the leverage of a case, derives from what is called the hat matrix. An interesting feature of the hat matrix is that it does not depend on the values of the dependent variable. This means that the measure of leverage derived from the hat matrix is, in effect, a measure of potential influence. It tells us how much difference the case would make in the final estimate if it were to have an unusual score on the dependent variable, but it does not tell us how much difference each case actually made in the final estimate. Analysts involved in selecting influential cases will sometimes be interested in measures of potential influence because such measures are relevant in selecting cases when there may be some a priori uncertainty about scores on the dependent variable. Much of the information in such case studies comes from a careful, in-depth measurement of the dependent variable, which may sometimes be unknown, or only approximately known, before the case study begins. The measure of leverage derived from the hat matrix is appropriate for such situations because it does not require actual scores for the dependent variable.

A second commonly discussed measure of influence in statistics is Cook's distance. This statistic is a measure of the extent to which the estimates of the β_i parameters would change if a given case were omitted from the analysis. This, in turn, depends primarily on two quantities: the size of the regression residual for that case and the leverage for that case. The most influential cases are those with substantial leverage that lie significantly off the regression line. These cases contribute quite a lot to the inferences drawn from the analysis. Cook's distance thus provides a measure of how much actual (and not potential) influence each case has on the overall regression. In the examples that follow, Cook's distance
will be used as the primary measure of influence because our interest is in whether any particular cases might be influencing the coefficient estimates in our democracy-and-development regression.

Figure 4 shows the Cook's distance scores for each of the countries in the 1995 per capita GDP and democracy data set. Most countries have quite low Cook's distances. The three most serious exceptions to this generalization are the numbered lines in the figure: Jamaica (74), Japan (75), and Nepal (105). Of these three, Nepal is clearly the most influential by a wide margin. Hence any case study of influential cases with respect to the relationship modeled in equation (4) would probably start with an in-depth consideration of Nepal.

Conclusions. The use of an influential case strategy of case selection is limited to instances in which a researcher has reason to be concerned that his or her results are being driven by one or a few cases. This is most likely to be true in small- to moderate-sized samples. Where N is very large (greater than 1,000, let us say), it is unlikely that a small set of cases (much less an individual case) will play a dramatically influential role. Of course, there may be influential sets of cases, for example, countries within a particular continent or cultural region, or persons of Irish extraction. Sets of influential observations are often problematic in a time-series cross-section data set, where each unit (e.g., country) contains multiple observations (through time) and hence may have a strong influence on aggregate results.

Most Similar/Most Different Cases

The most similar method, like the diverse case method, employs a minimum of two cases (Lijphart 1971, 1975; Meckstroth 1975; Przeworski and Teune 1970; Skocpol and Somers 1980).8 In its purest form, the chosen pair of cases is similar on all the measured independent variables, except the independent variable of interest. Table 2 offers a stylized example of the simplest sort of most similar analysis, with only two cases and with all variables measured dichotomously. Here the two cases are similar across all background conditions that might be relevant to the outcome of interest, as signified by X2, the vector of control variables. The cases differ, however, on one dimension (X1) and on the outcome, Y. It may be presumed from this pattern of covariation across cases that the presence or absence of X1 is what causes variation on Y.

Large-N analysis. Having outlined the most similar research design as it is employed in qualitative contexts, we turn to the question of how to identify such cases within a large-N cross-case data set. For heuristic purposes, we focus on two-case comparisons. Readers should be aware that this can, and often should, be adapted to more complex comparisons.

The most useful statistical tool for identifying cases for in-depth analysis in a most similar setting is probably some variety of matching strategy.9 Statistical estimates of causal effects based on matching techniques have been a major topic in quantitative methodology over the last twenty-five years, first in statistics (Rosenbaum 2004; Rosenbaum and Rubin 1983), and subsequently, in econometrics (Hahn 1998; Hirano, Imbens, and Ridder 2003) and political science (Ho et al. 2007; Imai 2005). This family of techniques is based on an extension of experimental logic. In a randomized experiment, elaborate statistical models are unnecessary for causal inference because for a large enough selection of cases, the treatment group and the control group have a high probability of being quite similar, on both measured and unmeasured variables (other than the independent variable and its effects). Hence very simple statistical treatments (e.g., a difference of means test) may be sufficient to demonstrate a causal inference.

In observational studies, by contrast, it is quite unusual to find situations in which the cases with a high score on the independent variable (which roughly correspond to the treatment group in an experiment) are similar across all background factors to the cases with a lower score on the independent variable (corresponding to the control group). Typically, the treatment group in an observational study will differ in many ways from the control group, a fact that is likely to confound the correct estimation of X1's effect on Y.

One common approach to this identification problem is to introduce a variable for each potential confounder in a general analysis of causal relationships (e.g., a regression model). Matching techniques have been developed as an explicit alternative to this control-variable approach. This approach begins by identifying a set of variables (other than the dependent variable or the main independent variable) on which the cases are to be matched. Then, for each case in the treatment group, the researcher tries to identify cases from the control group with the exact same scores on the matching variables (the covariates). Finally, the scholar looks at the difference on the dependent variable between the cases in the treatment group and the matching cases in the control group. If the set of matching variables is broad
enough to include all confounders, the average difference between the treatment group and the matching control cases should provide a good estimate of the causal effect.

Table 2
Most Similar Analysis with Two Cases

        Variable
Case    X1    X2    Y
1       +     +     +
2       -     +     -

Note: Plusses and minuses represent the score demonstrated by a case on a particular dimension (variable), coded dichotomously. X1 = the variable of theoretical interest; X2 = the background/control variable or vector; Y = the outcome.

Unfortunately, in most observational studies, the matching procedure described previously (known as exact matching) is impossible. This procedure almost always fails for continuous variables, such as wealth, age, or distance, since there are generally no two cases with precisely the same score on these scalar dimensions. Additionally, the larger the number of matching variables employed (either dichotomous or continuous), the lower the likelihood of finding exact matches.

In situations where exact matching is infeasible, researchers may employ approximate matching, in which cases from the control group that are close enough to matching cases from the treatment group are accepted as matches. One implementation is called propensity-score matching, a technique that focuses on finding cases that share a similar estimated probability of having been in the treatment group, conditional on the matching variables. In other words, when looking for a match for a specific case in the treatment group, researchers look for cases in the control group that (before the score on the independent variable was known) would have been as likely to be in the treatment group as the other case. This is accomplished by a two-stage analysis, the first stage of which approaches the key independent variable, X1 (understood as the treatment), as a dependent variable and the matching variables as independent variables. Once this model has been estimated, the second stage of the analysis employs the fitted values for each case, which tell us the probability of that case being assigned to the treatment group, conditional on its scores on the matching variables. These fitted values are referred to as propensity scores. The final step in the process is to choose cases from the control group with similar propensity scores to the treatment cases. The end result of this propensity-score procedure is a set of matched cases that can be compared in whatever way the researcher deems appropriate. These are the most similar cases, returning to the qualitative terminology.

Suppose that to study the relationship between wealth and democracy, the researcher wishes to select cases that are as similar as possible to India and Costa Rica in background variables, while being as different as possible on per capita GDP. To select most similar cases for the study of the relationship between wealth and democracy, we will need a statistical model of the causes of a country's wealth. Obviously, such a proposition is complex. Since this is simply an illustrative example, we shall be content with a cartoon model that includes only two independent variables. Specifically, a country's wealth will be assumed to be a function of the origin of its legal system (i.e., British, French, German, Scandinavian, or socialist) and a variable measuring the latitude of the country's capital.

The first step in selecting most similar cases is to regress per capita GDP (the independent variable of theoretical interest) on these variables. The fitted values from this regression serve as propensity scores, and cases with similar propensity scores are interpreted as matching. It is important to keep in mind that the quality of the match depends on the quality of the statistical model used to generate the propensity scores; a superficial model, like the one used here, obviously produces superficial matches. Even so, they are illustrative of the power of this method to select useful case comparisons.

The analysis identifies propensity scores for our two focus cases: Costa Rica (7.63) and India (8.02). Examining the propensity score data for other cases, we see that Benin has a propensity score of 7.58 (quite similar to Costa Rica's) and a per capita GDP of US$1,163, which is substantially different from Costa Rica's US$5,486. Hence Benin and Costa Rica may be seen as most similar cases for testing the relationship between wealth and democracy. Similarly, Singapore's propensity score of 7.99 is a close match for India's, in spite of a noticeable difference between Singapore's per capita GDP of US$27,020 and India's US$2,066. These two pairs of cases thus meet the criteria for most similar case comparison and can be pursued according to the logic expressed in Table 2.

Conclusion. The most similar method is one of the oldest recognized techniques of qualitative analysis, harking back to J. S. Mill's (1872) classic study
System of Logic. By contrast, matching statistics are a relatively new technique in the arsenal of the social sciences and have rarely been employed for the purpose of selecting cases for in-depth analysis. Yet we believe that there may be a fruitful interchange between the two approaches. Indeed, the current popularity of matching among statisticians rests on what qualitative researchers would recognize as a case-based approach to causal analysis.

The most different method of case selection is the reverse image of the previous research design. Rather than looking for cases that are most similar, one looks for cases that are most different. Specifically, the researcher tries to identify cases where just one independent variable as well as the dependent variable covary, and all other plausible independent variables show different values. These are deemed most different cases, though they are similar in two essential respects: the causal variable of interest (X1) and the outcome (Y). Analysts have usually taken the position that this research design is a weaker tool for causal inference than the most similar method, a matter addressed elsewhere (Gerring 2007). For present purposes, it is sufficient to note the utility of large-N statistical analysis as a technique for choosing cases in small-N comparisons.

Complications

The seven case selection strategies listed in Table 1 are intended to provide a menu of options for researchers seeking to identify useful cases for in-depth research, a means of implementing these options in large-N settings, and useful advice for how to maximize variation on key dimensions, while maintaining claims to case representativeness within a broader population. In this final section, we address several complications that may arise in the course of implementing these procedures.

Some case studies follow only one strategy of case selection; however, it is important to recognize that many case studies also mix and match case selection strategies. There is not much that we can say about combinations of strategies, except that where the cases allow for a variety of empirical strategies, there is no reason not to pursue them.

The second complication that deserves emphasis is the changing status of a case during the course of a researcher's investigation. Often, a researcher begins in an exploratory mode and proceeds to a confirmatory mode; that is, she develops a specific X/Y hypothesis. Unfortunately, research strategies that are ideal for exploration are not always ideal for confirmation. Once a specific hypothesis is adopted, the researcher must shift to a different research design.

There are three ways to handle this. One can explain, straightforwardly, that the initial research was undertaken in an exploratory fashion and therefore was not constructed to test the specific hypothesis that is (now) the primary argument. Alternatively, one can try to redesign the study after the new (or revised) hypothesis has been formulated. This may require additional field research, or perhaps the integration of additional cases or variables, which can be obtained through secondary sources or through consultation of experts. A final approach is to simply jettison, or deemphasize, the portion of research that no longer addresses the (revised) key hypothesis. In the event, practical considerations will probably determine which of these three strategies, or combinations of strategies, is to be followed. (They are not mutually exclusive.) The point to remember is that revision of one's cross-case research design is entirely normal and perhaps to be expected.

A final complication, which we have noted in each section of the article, is that of representativeness. There is only one situation in which a case study researcher need not be concerned with the representativeness of his or her chosen case: this is the influential case research design, where a case is chosen because of its possible influence on a cross-case model and hence is not expected to be representative of a larger sample. In all other circumstances, cases must be representative of the population of interest in whatever ways might be relevant to the proposition in question. This is not an easy matter to test. However, in a large-N context, the residual for that case (in whatever model the researcher has greatest confidence) is a reasonable place to start. Of course, this test is only as good as the model at hand. Any incorrect specifications or incorrect modeling procedures will likely bias the results and give an incorrect assessment of each case's so-called typicality. Given the explanatory weight that individual cases are asked to bear in a case study analysis, it is wise to consider more than just the residual test of representativeness. Deductive logic (expectations about the causal relationships of interest and the case of choice) is sometimes more useful than purely inductive tests.

In any case, there is no dispensing with the question. Case studies (with the two exceptions already noted) rest on an assumed synecdoche: the case should stand for a population. If this is not true, or if
there is reason to doubt this assumption, then the utility of the case study is brought severely into question.

Notes

1. Where multiple cases are chosen, the researcher must also be aware of problems of case independence; however, these problems are in no sense unique to case study work (Gerring 2001, 178-81).
2. It may be worthwhile to recall that case selection is often an iterative process; within-case research may suggest revisions to the statistical techniques used to select cases, potentially leading to a new sample and new opportunities for within-case analysis. Nonetheless, the distinction between within-case and cross-case analysis seems indispensable.
3. This method has not received much attention on the part of qualitative methodologists, hence the absence of a generally recognized name. It bears some resemblance to J. S. Mill's joint method of agreement and difference (Mill 1872), which is to say, a mixture of most similar and most different analysis, as discussed subsequently. Patton (2002, 234) employs the concept of maximum variation (heterogeneity) sampling.
4. This is sometimes referred to as causal equifinality (Elman 2005; George and Bennett 2005).
5. The exception would be a circumstance in which the researcher intends to disprove a deterministic argument (Dion 1998).
6. For discussions of the important role of anomalies in the development of scientific theorizing, see Elman (2003) and Lakatos (1978). For examples of deviant case research designs in the social sciences, see Amenta (1991), Eckstein (1975), Emigh (1997), Kazancigil (1994), and Kendall and Wolf (1955).
7. We use the somewhat awkward term deviantness, rather than the more natural deviance, because deviance already has a somewhat different meaning in statistics.
8. Sometimes the most similar method is known as the method of difference (Mill 1872).
9. For good introductions, see Ho et al. (2007), Morgan and Harding (2005), Rosenbaum (2004), and Rosenbaum and Silber (2001).

References

Abbott, Andrew, and Angela Tsay. 2000. Sequence analysis and optimal matching methods in sociology. Sociological Methods and Research 29:3-33.
Achen, Christopher H., and Duncan Snidal. 1989. Rational deterrence theory and comparative case studies. World Politics 41 (January): 143-69.
Amenta, Edwin. 1991. Making the most of a case study: Theories of the welfare state and the American experience. In Issues and alternatives in comparative social research, ed. Charles C. Ragin, 172-94. Leiden: E. J. Brill.
Boix, Charles, and Susan C. Stokes. 2003. Endogenous democratization. World Politics 55(4): 517-49.
Brady, Henry, and David Collier, eds. 2004. Rethinking social inquiry: Diverse tools, shared standards. Lanham, MD: Rowman and Littlefield.
Cochran, William G. 1977. Sampling techniques. New York: John Wiley.
Collier, David, Jody LaPorte, and Jason Seawright. 2007. Putting typologies to work: Tools for comparative analysis. Unpublished manuscript, Department of Political Science, University of California at Berkeley.
Collier, David, and James Mahoney. 1996. Insights and pitfalls: Selection bias in qualitative research. World Politics 49 (October): 56-91.
Dion, Douglas. 1998. Evidence and inference in the comparative case study. Comparative Politics 30(2): 127-45.
Eckstein, Harry. 1975. Case studies and theory in political science. In Handbook of political science. Vol. 7 of Political science: Scope and theory, ed. Fred I. Greenstein and Nelson W. Polsby, 79-138. Reading, MA: Addison-Wesley.
Elman, Colin. 2003. Lessons from Lakatos. In Progress in international relations theory: Appraising the field, ed. Colin Elman and Miriam Fendius Elman, 21-68. Cambridge, MA: MIT Press.
Elman, Colin. 2005. Explanatory typologies in qualitative studies of international politics. International Organization 59(2): 293-326.
Emigh, Rebecca. 1997. The power of negative thinking: The use of negative case methodology in the development of sociological theory. Theory and Society 26:649-84.
Geddes, Barbara. 1990. How the cases you choose affect the answers you get: Selection bias in comparative politics. In Political analysis, vol. 2, ed. James A. Stimson, 131-50. Ann Arbor: University of Michigan Press.
George, Alexander L., and Andrew Bennett. 2005. Case studies and theory development. Cambridge, MA: MIT Press.
George, Alexander L., and Richard Smoke. 1974. Deterrence in American foreign policy: Theory and practice. New York: Columbia University Press.
Gerring, John. 2001. Social science methodology: A criterial framework. Cambridge, UK: Cambridge University Press.
Gerring, John. 2004. What is a case study and what is it good for? American Political Science Review 98(2): 341-54.
Gerring, John. 2006. Single-outcome studies: A methodological primer. International Sociology 21(5): 707-34.
Gerring, John. 2007. Case study research: Principles and practices. Cambridge, UK: Cambridge University Press.
Gerring, John, Philip Bond, William Barndt, and Carola Moreno. 2005. Democracy and growth: A historical perspective. World Politics 57(3): 323-64.
Gerring, John, and Rose McDermott. 2007. An experimental template for case-study research. American Journal of Political Science 51(3): 688-701.
Goertz, Gary. 2006. Social science concepts: A user's guide. Princeton, NJ: Princeton University Press.
Hahn, Jinyong. 1998. On the role of the propensity score in efficient semiparametric estimation of average treatment effects. Econometrica 66(2): 315-32.
Hersen, Michel, and David H. Barlow. 1976. Single-case experimental designs: Strategies for studying behavior change. Oxford, UK: Pergamon Press.
Hirano, Keisuke, Guido Imbens, and Geert Ridder. 2003. Efficient estimation of average treatment effects using the estimated propensity score. Econometrica 71(4): 1161-89.
Ho, Daniel E., Kosuke Imai, Gary King, and Elizabeth A. Stuart. 2007. Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Political Analysis 15(3): 199-236.
Imai, Kosuke. 2005. Do get-out-the-vote calls reduce turnout? The importance of statistical methods for field experiments. American Political Science Review 99:283-300.
308 Political Research Quarterly

Kazancigil, Ali. 1994. The deviant case in comparative analysis. In Comparing nations: Concepts, strategies, substance, ed. Mattei Dogan and Ali Kazancigil, 213-38. Cambridge, UK: Blackwell.
Kendall, Patricia L., and Katherine M. Wolf. 1955. The analysis of deviant cases in communications research. In The language of social research, ed. Paul F. Lazarsfeld and Morris Rosenberg, 167-70. New York: Free Press. First published 1949 by Harper and Brothers.
King, Gary, Robert O. Keohane, and Sidney Verba. 1994. Designing social inquiry: Scientific inference in qualitative research. Princeton, NJ: Princeton University Press.
Lakatos, Imre. 1978. The methodology of scientific research programmes. Cambridge, UK: Cambridge University Press.
Lijphart, Arend. 1971. Comparative politics and the comparative method. American Political Science Review 65(3): 682-93.
-. 1975. The comparable cases strategy in comparative research. Comparative Political Studies 8:158-77.
Lipset, Seymour Martin. 1959. Some social requisites of democracy: Economic development and political development. American Political Science Review 53 (March): 69-105.
Mahoney, James, and Gary Goertz. 2004. The possibility principle: Choosing negative cases in comparative research. American Political Science Review 98(4): 653-69.
Marshall, Monty G., and Keith Jaggers. 2005. Polity IV Project: Political regime characteristics and transitions, 1800-1999. Center for International Development and Conflict Management, http://www.cidcm.umd.edu/polity/.
Meckstroth, Theodore. 1975. "Most different systems" and "most similar systems": A study in the logic of comparative inquiry. Comparative Political Studies 8(2): 133-77.
Mill, John Stuart. 1872. System of logic. 8th ed. London: Longmans, Green. First published 1843.
Morgan, Stephen L., and David J. Harding. 2005. Matching estimators of causal effects: From stratification and weighting to practical data analysis routines. Unpublished manuscript, Department of Sociology, Cornell University.
Patton, Michael Quinn. 2002. Qualitative research and evaluation methods. Thousand Oaks, CA: Sage.
Przeworski, Adam, Michael Alvarez, Jose Antonio Cheibub, and Fernando Limongi. 2000. Democracy and development: Political institutions and material well-being in the world, 1950-1990. Cambridge, UK: Cambridge University Press.
Przeworski, Adam, and Henry Teune. 1970. The logic of comparative social inquiry. New York: John Wiley.
Ragin, Charles C. 1992. "Casing" and the process of social inquiry. In What is a case? Exploring the foundations of social inquiry, ed. Charles C. Ragin and Howard S. Becker, 217-26. Cambridge, UK: Cambridge University Press.
-. 2000. Fuzzy-set social science. Chicago: University of Chicago Press.
-. 2004. Turning the tables. In Rethinking social inquiry: Diverse tools, shared standards, ed. Henry E. Brady and David Collier, 123-38. Lanham, MD: Rowman and Littlefield.
Rohlfing, Ingo. 2008. What you see and what you get: Pitfalls and principles of nested analysis in comparative research. Comparative Political Studies, doi: 10.1177/0010414007308019, published online November 27, 2007, http://cps.sagepub.com/cgi/content/abstract/0010414007308019v1.
Rosenbaum, Paul R. 2004. Matching in observational studies. In Applied Bayesian modeling and causal inference from an incomplete-data perspective, ed. A. Gelman and X.-L. Meng, 15-24. New York: John Wiley.
Rosenbaum, Paul R., and Donald B. Rubin. 1983. The central role of the propensity score in observational studies for causal effects. Biometrika 70:40-51.
Rosenbaum, Paul R., and Jeffrey H. Silber. 2001. Matching and thick description in an observational study of mortality after surgery. Biostatistics 2(2): 217-32.
Sekhon, Jasjeet S. 2004. Quality meets quantity: Case studies, conditional probability and counterfactuals. Perspectives in Politics 2(2): 281-93.
Skocpol, Theda, and Margaret Somers. 1980. The uses of comparative history in macrosocial inquiry. Comparative Studies in Society and History 22(2): 147-97.
Stone, Charles J. 1996. A course in probability and statistics. Belmont, CA: Duxbury Press.
Summers, Robert, and Alan Heston. 1991. The Penn world table (mark 5): An expanded set of international comparisons, 1950-1988. Quarterly Journal of Economics 106(2): 327-68.
