Abstract
There are situations where the data or the theory suggest or require,
respectively, that one estimate the boundary lines that separate regions of
observations from regions of no observations. Of particular interest are ceil-
ing or floor lines. For example, many theories use terms such as veto player,
constraint, only if, and so on, which suggest ceilings. Ceiling hypotheses have
a nonstandard form claiming the probability of Y will be zero for all values of
Y greater than the ceiling value of Yc for a given value of X. Conversely, ceiling
hypotheses make no specific prediction about the value of Y for a given value
of X except that it will be less than the ceiling value. Floors work by guaran-
teeing minimum levels. The article gives numerous examples of theories that
imply ceiling or floor hypotheses and numerous examples of data that fit such
hypotheses. The article proposes quantile regression as a means of estimat-
ing the boundaries of the no-data zone as well as criteria for evaluating the
importance of the boundary variable. These techniques are illustrated for
ceiling and floor hypotheses relating gross domestic product/capita and
democracy.
1 University of Notre Dame, IN, USA
2 Rotterdam School of Management, Erasmus University, Rotterdam, The Netherlands
Corresponding Author:
Gary Goertz, Kroc Institute, University of Notre Dame, Notre Dame, IN, 46556, USA.
Email: ggoertz@nd.edu
4 Sociological Methods & Research 42(1)
Keywords
QCA, necessary conditions, quantile regression, GDP/capita, democracy,
veto player
Introduction
Of constant concern to social scientists is fitting empirical data analysis, for
example, statistics, to theories. There has been progress in matching game
theoretic models with appropriate statistical methods (e.g., EITM Project
in political science). In this article, we explore a mismatch between theories
and statistical data analysis. We show that there exists a large class of theories,
models, and hypotheses that postulate ‘‘floors’’ and ‘‘ceilings.’’ To test and
evaluate these theories, we need methodologies that can estimate these
quantities of theoretical interest.
The examples in the tables below illustrate that a wide range of theoretical
language implies that ceilings are what the hypothesis is about. A ceiling is a
value Yc for a given value of X that observations rarely if ever exceed. A
‘‘glass ceiling’’ for women means that there are professional levels that are
very difficult to attain. Conversely, the ceiling hypothesis makes no specific
claim about the exact value of Y in the range [0, Yc] for a given value of X (for
convenience in much of our presentation we will assume that all observations
lie in [0,1]).1
‘‘Floors’’ work in the opposite manner. The value of Yf for a given value of
X is the minimum which we will find for that value of X. In other words, vir-
tually all of the observations will lie above the floor, [Yf,1].
The core argument of our article is that (1) many theories predict ceiling
or floor data patterns, (2) many descriptive scatterplots have ceiling or floor
no-data patterns, (3) the quantity of theoretical interest is not a line through the
middle of the data but the ceiling or floor line, and (4) the importance of the
ceiling or floor is the relative size of the no-observation zone created by the
floor or the ceiling.
We work from both directions: we provide examples of theories that pre-
dict ceiling or floor patterns, and at the same time we illustrate that data scat-
terplots with large no-data zones are not uncommon, that is, that ceiling or
Goertz et al. 5
floor theories would fit or explain these data. In particular, we think that ceil-
ing or floor scatterplots arise quite frequently, particularly in large cross-
national studies, as well as in research focusing on institutions (domestic
or international).
A variety of theories or causal mechanisms can produce ceiling or floor
hypotheses. For example, many theories of institutions invoke them as ‘‘con-
straints’’ on behavior, which suggests a ceiling effect. There are multiple
causal mechanisms that generate ceiling or floor hypotheses. We focus in
particular on an important class, those hypotheses and theories formulated
in terms of necessary and sufficient conditions. It must be stressed that our
methodology is not limited to these kinds of hypotheses, but extends to any
theory that explicitly or implicitly invokes constraints, floors, ceilings, and so on.
Ceilings and floors often produce what we will call ‘‘triangular no-data’’
patterns or scatterplots.2 While the zone of no data can take many forms there
are good theoretical and empirical reasons to focus on zones that are ‘‘trian-
gular’’ in shape. In particular, necessary and sufficient conditions by defini-
tion produce triangular no-data zones. But we shall also see that game theory
produces hypotheses about triangular no-data zones as well.
So while we think our methodology is particularly well suited to necessary
and sufficient condition hypotheses, it is not limited to them. Nothing in the
methodology requires fuzzy logic variables or requires the use of necessary
and sufficient condition language. For example, the researcher might prefer
the language of constraints, for example, veto players, to the logic of neces-
sary and sufficient conditions.
The first half of our article focuses on different ways in which hypotheses
about no-data zones can arise. While we focus much of our attention on nec-
essary and/or sufficient condition theories and hypotheses as a core example,
this is by no means the only way that theories focus on no-data zones. For
example, in our example involving democracy and wealth, we suggest that
Przeworski et al.’s (2000) hypothesis about this relationship postulates a
floor below which we should see no cases. We also discuss game theoretic
models as another large class of examples.
The second half of the article provides our methodological solution. The
key idea is to draw a line dividing the zone of data from the zone of no data,
that is, the floor or the ceiling. The ‘‘importance’’ of the constraint, floor or
ceiling, then is determined by how large this region is. Here we use directly
the meaning of constraints: the more important a constraint, the larger the set
of options that is eliminated.
There exist statistical methods for estimating boundaries of data.3 We
focus in particular on quantile regression as one such technique. While
eggs, in our analysis which starts out at the highest level with ideas of con-
straint; the next level down focuses on ceiling and floors as common kinds of
constraints; the next level is the particular specification of floors and ceilings
in terms of necessary and sufficient conditions.
Geometrically, we focus particular attention on no-data regions of trian-
gular shape. Constraints can potentially be of any shape, but if we think in
standard Cartesian coordinate terms, the regions of no data often lie at the
corners. This means that they are triangular. Continuous necessary condi-
tions (i.e., fuzzy logic) by definition have empty triangular data zones.
Finally, quantile regression is a natural way to draw boundary lines around
triangular no-data regions that lie in corners.4
Hence, it is natural to look for what we call ‘‘triangular data or theories.’’
If constraint, ceiling, and floor hypotheses predict empty zones in the data,
then one possible interpretation of triangular data is via ceiling, floor, or con-
straint theories. As we shall see, game theory models often predict triangular
no-data zones. Since economic and game theory models often involve con-
straints of various sorts, they might naturally generate predictions of no-
data in certain regions.
[Figure: scatterplot of Y (vertical axis, 0.0 to 1.0) against X (horizontal axis, 0.0 to 1.0), with a dotted diagonal boundary line.]
Necessary Conditions
         X = 0   X = 1
Y = 1      0       X
Y = 0      X       X

Sufficient Conditions
         X = 0   X = 1
Y = 1      X       X
Y = 0      X       0
[Figure: scatterplot of Y (vertical axis, 0.0 to 1.0) against X (horizontal axis, 0.0 to 1.0), with a dotted diagonal boundary line.]
Schimmelfennig, F. 2001. The community trap: liberal norms, rhetorical action, and the
eastern enlargement of the European Union.
‘‘In the rationalist perspective, however, a community of basic political values and
norms is at best a necessary condition of enlargement [of the European Union] . . . .
By contrast, in the sociological perspective, sharing a community of values and norms
with outside states is both necessary and sufficient for their admission to the organi-
zation.’’ (p. 61)
Lupia, A. and M. McCubbins. 1998. The democratic dilemma: can citizens learn what they
need to know?
‘‘Theorem 4.1: Communication leads to enlightenment if and only if: 1. the speaker is per-
suasive, 2. only the speaker initially possesses the knowledge that the principal needs,
and 3. neither common interests nor external forces induce the speaker to reveal
what he knows.’’ (p. 69)
Duverger, M. 1954. Political parties: their organization and activity in the modern state.
‘‘The introduction of universal suffrage led almost everywhere (the United States
excepted) to the development of Socialist parties.’’ (p. 66)
Skocpol, T. 1979. States and social revolutions: a comparative analysis of France, Russia,
and China.
‘‘I have argued that (1) state organizations susceptible to administrative and military
collapse when subject to intensified pressures from more developed countries
abroad and (2) agrarian sociopolitical structures that facilitated widespread peasant
revolts against landlords were, taken together, the sufficient distinctive causes of
social revolutionary situations commencing in France, 1789, Russia, 1917, and China,
1911.’’ (p. 154)
Huth, P. 1996. Standing your ground: territorial disputes and international conflict.
‘‘The presence of strategic territory, then, was relatively close to being a sufficient
condition for a dispute to exist.’’ (p. 75)
Triangular Theories
Necessary or sufficient condition theories directly produce ceiling and
floor hypotheses. There are other theories that explain or predict that all
observations are in one zone, and hence that there are regions where no
observations occur. Since the no-data zone is often triangular in nature, we
can call these ‘‘triangular’’ theories.
Table 3 gives a few examples that we have uncovered in our reading. It is
worth noting the presence of game theoretic models (e.g., Acemoglu and
Robinson, Bueno de Mesquita, Gartzke, etc.) in this list. While it is beyond the
scope of this article, we suggest that a triangular data pattern is one of the
more common empirical implications of game theoretic models.
Figure 3 provides an example of what we have been calling triangular the-
ories taken from Acemoglu and Robinson’s Economic origins of dictatorship
and democracy. This is a good example, given our interest in theories relat-
ing economic variables with democracy and also a good example of how
game theoretic models can easily produce a prediction of triangular data pat-
terns. Acemoglu and Robinson’s theory of democratic consolidation predicts
a region determined by the ‘‘costs of coup’’—a constraint variable—and
‘‘inequality’’ where we should see democratic consolidations; of course, this
means that there is also a zone of no democratic consolidations.
Acemoglu, D., and J. Robinson. 2006. Economic origins of dictatorship and democracy.
Fig. 7.2, Costs of coups versus inequality and the consolidation of democracy, see
figure 3 here.
Bueno de Mesquita, B. 1985. The war trap revisited: a revised expected utility model.
Fig. 4, Two sides view of the situation the probability of war.
Gartzke, E. 1998. Kant we all just get along? Opportunity, willingness, and the origins
of the democratic peace.
Fig. 1, Relationship between opportunity, willingness, and the democratic peace.
Langlois, C., and J.-P. Langlois. 2006. Bargaining and the failure of asymmetric deter-
rence: trading off the risk of war for the promise of a better deal.
Fig. 2, Defender’s offer versus Challenger’s demand in equilibrium.
[Figure 3: costs of coups ϕ (vertical axis) versus inequality θ (horizontal axis, marked at 0, δ, and 1), divided into regions labeled ‘‘Fully consolidated democracy,’’ ‘‘Semiconsolidated democracy,’’ and ‘‘Coups or Unconsolidated democracy.’’]
More generally, the Acemoglu and Robinson book has quite a few exam-
ples of where the game theoretic model predicts that we should see data con-
centrated in various regions. Not all these involve triangular shapes. For
example, sometimes the region is rectangular (see Figure 4 below where
we discuss such regions using Geddes’s data on economic growth and labor
repression). The key point is not necessarily the shape of the region. The key
point is that there are lines (or curves) that separate a region of data from one
of no-data.
Cingranelli, D., and Pasquarello, T. 1985. Human rights practices and the distribution
of foreign aid to Latin American countries.
Fig. 1, USA economic assistance versus level of respect for human rights.
Clark, Gilligan, and Golder. 2006. A simple multivariate test for asymmetric hypotheses.
Fig. 3, Number of effective legislative parties versus median district magnitude.
Geddes, B. 2003. Paradigms and sand castles: theory building and research design in
comparative politics.
Fig. 3.6, Growth in GDP/capita versus labor repression, (see figure 4).
Hoddie, M., and C. Hartzell. 2003. Civil war settlements and the implementation of
military power-sharing arrangements.
Fig. 1, Postwar life expectancy versus implementation of military power-sharing or
power dividing provisions to civil war settlements.
Noël, A. and Thérien, P. 1995. From domestic to international justice: the welfare
state and foreign aid.
Fig. 5, ODA (percentage of GNP) versus social transfers (percentage of GNP).
Table 4. (continued)
Schoultz, L. 1981. U.S. foreign policy and human rights violations in Latin America.
Fig. 1, Level of human rights violations versus USA aid.
De Soysa, I., and E. Neumayer. 2007. Resource wealth and the risk of civil war onset:
results from a new dataset of natural resource rents, 1970–1999.
Fig. 1., Mean primary commodity exports/GDP versus mean energy rents/gross
national income.
Diener, E., and M. Seligman. 2004. Beyond money: toward an economy of well-being
Fig. 2, Satisfaction with life versus GDP/capita.
Elkins, Z., and J. Sides. 2007. Can institutions build unity in multiethnic states.
Fig. 1B, Proportion among majorities versus proportion among minorities identify-
ing with state only (World Values Survey).
[Figure 4: % GDP/capita growth (vertical axis, −3 to 10) versus labor repression (horizontal axis, 0 to 5), with ceiling zones A, B, and C, the scope, and the regression line marked.]
[L]et us pose the key question in slightly different form: What are the neces-
sary and sufficient conditions for maximizing democracy in the real world?
(Dahl 1956:64, see also 75)
2 (Democracy)            0    0    2   17
3–4                      1    0    8   16
5–6                      0    6    7    3
7–9                      3    6   10    6
10                       3    5    3    0
11–12                   13    7    6    3
13–14 (Authoritarian)   11    2    7    2
a Freedom House democracy scale.
Source: Diamond 1992.
It has been argued by Max Weber among others that the factors making for
democracy in this area are a historically unique concatenation of elements, The
basic argument runs that capitalist economic development created the burgher
class whose existence was both a catalyst and a necessary condition for democ-
racy. (Lipset 1959:85)
[Figure 5 scatterplot: level of democracy (vertical axis) versus logged GDP/capita, 1995 (horizontal axis, 6 to 11), with dotted boundary lines.]
Figure 5. Estimating floors and ceilings: GDP/capita and the level of democracy.
The key principle to note is that ceiling and floor hypotheses involve a
fundamentally different orientation to hypotheses and data analysis:
Ceiling and floor hypotheses are about drawing boundary lines
between zones of data and zones of no data; they are not about drawing
lines through the middle of data.
Most statistical methodologists today see causation and causal models in terms
of estimating the average treatment effect (ATE). Thinking about causal relationships in terms of constraints,
floors, and ceilings means that there are other causal effects worth looking at.
this section, but the same principles apply to floors. Since ceilings and floors,
along with necessary and sufficient conditions, are quite common in qualita-
tive methods this section provides most of the principles and basic methodol-
ogy that qualitative scholars need for their own research. It also serves as an
intuitive and nontechnical introduction to the material in the next section
which provides statistical techniques and more developed criteria for esti-
mating boundary lines and their importance.
We have seen that there are close ties between theories invoking con-
straints, ceilings, and necessary conditions. Here we illustrate these linkages
in a simple, but real-life, example involving the relationship between labor
repression and economic growth.
There is a large qualitative case-study literature on the causes of high
economic growth that arose in an attempt to analyze the rapid growth of some
economies in the 1970s and 1980s. Most obvious were the Asian tiger econo-
mies such as South Korea, Singapore, and Taiwan. Many qualitative analysts
based their analyses on these countries and argued that their rapid growth rate
depended on a disciplined and quiescent labor force and, therefore, on
government’s extensive control over labor (labor repression). Repressed labor
meant lower labor costs, increased international competitiveness, and so on.
Many of the arguments were about the constraints that free organized
labor put on the rate of economic growth (e.g., Deyo 1989; Haggard
1990). This can then be converted into a necessary condition hypothesis:
Hypothesis (necessary condition): High levels of labor repression are
necessary for high levels of economic growth.
We can frame this in terms of ceilings:
Hypothesis (ceiling): There should be no observations in the zone of
low labor repression and high economic growth.
Here we see a nice concrete example of the natural relationship between
constraints, causal mechanisms, necessary conditions, and ceiling hypotheses.
The first principle of ceiling (or floor) analysis is to ask:
Where, according to the ceiling or necessary condition hypothesis, is the
region of no observations?
The simple, qualitative, but very useful test is to examine the data to see if
there are any observations in the predicted no-data zone.
Figure 4 reproduces Geddes’s data on all 32 developing countries whose
GDP per capita in 1970 was greater than that of South Korea (Geddes
2003:104; we follow Geddes in choosing this set of countries). The ceiling
zone is 8.55. Sticking with the rectangle restriction, then we would prefer the
A+B ceiling zone. Abandoning the rectangle restriction, we could enlarge
the zone by combining zones A, B, and C. We then have a new, larger ceiling
zone in the form of an indented rectangle.
The choice of the ceiling boundary can have major implications for the
substantive interpretation of the results. For example, drawing the horizontal
line at 5 percent growth versus 7 percent growth means that there is more
room for economic growth that could be achieved without increasing labor
repression. This could have major policy implications; most governments
would be very happy with 7 percent growth so there would be less argument
for labor repression. Similarly, if we move the vertical line to the right, then
it means you have to pay for significantly more labor repression to get high
growth.
A natural question is how important are these constraints on economic
growth? Are these ceilings and constraints important or minor?
Answering this question leads to the next step in our proposed methodology.
The previous steps gave us some idea of the size of the no-data zone. We
now need to compare that to something in order to get some idea of how
important such constraints are. In order to do this we must first establish what
we call the empirical or theoretical scope of the analysis. This is the next crit-
ical step in the methodology of ceilings (or floors).
In Figure 4, we need to fix the scope for labor repression as well as GDP/
capita growth. Ideally, the researcher should have good theoretical and/or
empirical reasons for setting the scope. However, since explicit scope deci-
sions are rare we suspect that the most popular option will be to use the
maximum and minimum of the empirical data to fix the scope. The range
of the labor repression data is from zero to about 5 (4.4, Iraq, is the maxi-
mum). Hence, one might fix the scope of labor repression to be [0,5].
In many cases, there are reasons to think that values significantly higher
than those in the data are reasonable (particularly in modest N settings).
Conversely, extreme outliers might suggest using something like the
95–99 percentiles.
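The two data-driven options just described, the observed range and a percentile cap, can be sketched as follows. This is only an illustration under stated assumptions: the function name `scope_from_data`, its `upper_pct` parameter, and the listed scores are hypothetical (only the maximum of 4.4 comes from the text), and capping just the upper limit is one choice among several.

```python
from statistics import quantiles


def scope_from_data(values, upper_pct=None):
    """Return (low, high) scope limits for one variable.

    upper_pct=None uses the observed minimum and maximum.
    upper_pct=95 (a hypothetical option) caps the upper limit at the
    95th percentile so that one extreme outlier does not inflate the scope.
    """
    low, high = min(values), max(values)
    if upper_pct is not None:
        # quantiles(..., n=100) returns the 1st through 99th percentile cut points.
        cuts = quantiles(values, n=100, method="inclusive")
        high = cuts[upper_pct - 1]
    return low, high


# Hypothetical labor-repression scores; only the maximum (4.4) is from the text.
repression = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.4]

print(scope_from_data(repression))                 # observed range
print(scope_from_data(repression, upper_pct=95))   # upper limit capped at the 95th percentile
```

A researcher with theoretical reasons to expect values beyond the observed range would instead widen the limits by hand, as with the rounding of the labor repression scope to [0,5].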
The literature that Geddes was reacting to focused on the conditions for
‘‘high economic growth.’’ So in our calculations, this should enter into the
determination of the size of the scope. Scope is thus high economic growth,
not the whole range, positive and negative, of economic growth. If we look at
the usual understanding of ‘‘high’’ economic growth in the post–World War
II period (‘‘high’’ growth would be significantly lower in the 19th century), it
ranges from about 4 percent to about 10 percent. To make our calculations
easier, we choose the scope of [5,10].
Now that we have the scope limits we can proceed to estimating the
importance of the ceiling. The basic principle is simple:
The importance of the ceiling is the size of the no-data zone compared
with the size of the scope zone, i.e., the ratio of the two.
In Figure 4, if we take the largest rectangle, A+B, then we have an estimated
constraint of 11.25/25 = .45. We think this constitutes considerable
limits on high economic growth, since the labor repression variable excludes
almost half of the scope. We think that it will take much more experience
with estimating constraints in this manner to get a feel for what is ‘‘large’’
and what is not. However, as a rough first proposal, we think constraints
above 15–20 percent would clearly be important.
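Written out, the calculation uses the scope limits fixed earlier, labor repression in [0,5] and high economic growth in [5,10], together with the reported no-data area of 11.25. The label "importance" is shorthand introduced here for the ratio.

```latex
\[
\text{importance}
  = \frac{\text{area of no-data zone}}{\text{area of scope}}
  = \frac{11.25}{(5-0)\times(10-5)}
  = \frac{11.25}{25}
  = .45
\]
```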
This example illustrates quite dramatically the difference between statistical
procedures that estimate lines through data versus our procedure which esti-
mates lines that separate the region of data from the one of no data. Because
Geddes was not looking for regions of no data, she did not see them; once you
are looking for them, they jump out at you. Using the data on labor repression
and economic growth, we have found support for the hypothesis that labor
repression is a strong constraint on high economic growth. In the next section,
we abandon our restrictions on rectangular shape and perfect fit. As many of our
examples above illustrate, we want to estimate triangular regions of no data, and
we typically want to allow a few observations into the region of no data.
In summary, the key steps in the methodology are the following:
1. Explicitly formulate the constraint as a ceiling (or floor) hypothesis.
2. Estimate the size of the no-data zone.
3. Determine the scope and its size.
4. Calculate the ratio of the no-data zone size to the scope size.
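The four steps can be sketched in code for the simplest case, a rectangular ceiling zone in the upper-left corner of the scatterplot. This is a minimal illustration, not the analysis of Figure 4: the `Rect` class, the `ceiling_importance` function, the zone coordinates, and the observations are all hypothetical. Only the scope limits, [0,5] for labor repression and [5,10] for growth, are taken from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle: x in [x_lo, x_hi], y in [y_lo, y_hi]."""
    x_lo: float
    x_hi: float
    y_lo: float
    y_hi: float

    def area(self) -> float:
        return (self.x_hi - self.x_lo) * (self.y_hi - self.y_lo)

    def contains(self, x: float, y: float) -> bool:
        # For an upper-left ceiling zone the right and bottom edges are the
        # boundary itself, so points lying exactly on them are not counted.
        return self.x_lo <= x < self.x_hi and self.y_lo < y <= self.y_hi


def ceiling_importance(points, zone: Rect, scope: Rect):
    """Return (number of counterexamples in the zone, zone area / scope area)."""
    counterexamples = sum(1 for x, y in points if zone.contains(x, y))
    return counterexamples, zone.area() / scope.area()


# Step 1: ceiling hypothesis: no high growth at low labor repression.
# Step 2: a hypothetical ceiling zone (coordinates chosen for illustration only).
zone = Rect(x_lo=0.0, x_hi=2.0, y_lo=7.0, y_hi=10.0)
# Step 3: the scope from the text: repression [0, 5], growth [5, 10].
scope = Rect(x_lo=0.0, x_hi=5.0, y_lo=5.0, y_hi=10.0)
# Hypothetical (repression, growth) observations, not Geddes's data.
points = [(0.5, 2.0), (1.0, 4.5), (2.0, 3.0), (3.0, 6.5), (4.0, 9.0), (4.4, 1.0)]

# Step 4: ratio of the no-data zone to the scope.
n_counter, importance = ceiling_importance(points, zone, scope)
print(n_counter, importance)
```

Reporting the number of counterexamples alongside the ratio keeps the qualitative test (is the zone really empty?) separate from the importance estimate.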
In this section, we have started from more or less clear hypotheses that X is a
constraint on Y. It is clear from Figure 4 that one can work backward from the
data to hypotheses. The empty zone in a scatterplot can be interpreted as a con-
straint and/or a necessary condition. Since the relationship between X and Y is
often not specified, empty zones can help the researcher think about the causal
relationship in terms of constraints. Of course, whether such an interpretation
makes sense depends on the empirical and theoretical context.
for drawing the boundaries, (2) to allow some counterexamples in the no-data
zone, and (3) to provide criteria for choosing among alternative boundary
lines.
In this section, we introduce quantile regression as a methodology which
allows us to systematically draw lines bounding no-data zones. We focus on
triangular zones because as we have seen they are probably the most com-
mon and simplest kind of geometric shape. Using quantile regression allows
us to vary the number of counterexamples that we allow (on average) into the
‘‘no-data’’ zone, which now becomes the ‘‘almost no-data’’ zone.
Once we allow counterexamples into the analysis, we are faced with a
fundamental trade-off. On one hand, our principle is to maximize the size
of the no-data zone. We can enlarge this by including more and more
counterexamples. However, we have an opposing principle, which is that we would like
as few counterexamples as possible. We shall propose a formula, a criterion,
that balances these competing goals allowing one to calculate what we call
the optimal boundary line (OBL).
As such, this section is more technical since we briefly describe what
quantile regression is. Also we discuss the technical details and logic behind
our OBL formula. We encourage readers not interested in the technical details
to skip to our analysis of the GDP/capita–democracy relationships.
Most of the key points of this section are made in the discussion of this example,
and most of the discussion is understandable with the material from the
preceding section in hand.
We need a systematic way to allow for some error rate in drawing the line,
say, .01, .05, or .10. First, social science data are not perfect: there are
conceptual and measurement problems, and so on. Second, one might also
consider that no observations is too high a standard. If the zone is ‘‘virtually’’
empty, then one might consider that the ceiling hypothesis is supported by
the data. Quantile regression is designed exactly to do this since we can ask
for the .99, .95, or .90 quantile regression line. Since quantile regression has
almost never been used in sociology and political science (according to our
JSTOR search) and rarely in economics (though see Heckman, Ichimura, and
Todd 1997; Abadie, Angrist, and Imbens 2002; for nice and relatively
nontechnical introductions see Angrist and Pischke 2009; Cade and Noon 2003), it
is useful to give a basic description of the technique.8
Quantile regression was developed in the late 1970s largely by Roger
Koenker and colleagues (e.g., Koenker and Bassett 1978). This was a period
when statisticians were very interested in developing robust statistical tech-
niques. It was equally motivated by common problems of heteroscedasticity
in data and its implications for the estimation of confidence intervals and the
like. This literature often mentions an early remark by Mosteller and Tukey
(1977) that one could easily investigate estimated changes in things other
than the mean of the response variable, and that focusing just on the mean
might give an incomplete view of the relationship between the Y and
X variables. Of course, that is what we have been arguing here: we are not
always so interested in the mean effect of the treatment on Y but rather the
impact of X on the boundary of Y.
The basic idea behind quantile regression is quite simple: instead of focusing
on the mean one looks at quantiles. So the quantile regression analogue
of least squares regression is a median regression. As such, a quantile regression
looks very similar to an ordinary regression:
$Q_Y(\tau \mid X) = \beta_0(\tau)X_0 + \beta_1(\tau)X_1 + \ldots + \beta_n(\tau)X_n + \epsilon$,
where $\tau$ is the quantile of interest. So $Q_Y(.50 \mid X)$ is a median regression.
The $\tau$s attached to the $\beta$s indicate that the
relationship between X and Y is changing depending on the quantile.
The conditional quantiles denoted by $Q_Y(\tau \mid X)$ are the inverse of the
conditional cumulative distribution function of the response variable, $F_Y^{-1}(\tau \mid X)$.
So $Q_Y(.95 \mid X)$ is a function where on average 95 percent of the values of Y are
less than the estimated function of X. Koenker’s insight was that quantile
regression could be estimated by an optimization function minimizing a sum
of the weighted absolute deviations, where the weights are asymmetric functions
of $\tau$ (Koenker 2005). The use of absolute, as opposed to squared, deviations
again signals quantile regression’s origins in robust statistics.
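The weighted absolute deviation can be made concrete with the usual ‘‘check’’ loss, which weights positive residuals by τ and negative residuals by 1 − τ. The sketch below is an illustration only: the function names and the eleven data values are hypothetical, and it covers the simplest case of fitting a constant, where the minimizer is a sample quantile.

```python
def check_loss(residual: float, tau: float) -> float:
    """Asymmetric absolute loss: weight tau above the fit, 1 - tau below it."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def total_loss(ys, q, tau):
    """Sum of check losses when every observation is predicted by the constant q."""
    return sum(check_loss(y - q, tau) for y in ys)


def best_constant(ys, tau):
    # For a constant fit, a minimizer can always be found among the observed values.
    return min(sorted(ys), key=lambda q: total_loss(ys, q, tau))


ys = list(range(1, 12))  # hypothetical observations 1, 2, ..., 11

print(best_constant(ys, 0.5))  # 6, the median
print(best_constant(ys, 0.9))  # 10, close to the top of the data
```

Raising τ makes observations above the fit more expensive than observations below it, which is what pushes the estimate toward the upper boundary of the data.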
When choosing large quantiles, for example, .90 or .95, or small ones, for example, .10 or
.05, one estimates lines at the boundaries, top or bottom respectively,
of the data. This immediately gives us the possibility of allowing some
observations into the ceiling or floor zones. If we choose a .95 quantile regression,
then on average we will find about 5 of every 100 observations in the
no-observation zone.9
A key insight of the quantile regression methodology is that there may be
no relationship between X and Y when looking at the mean treatment effect,
but the regression line for the .95 quantile might show an important relation-
ship. The labor repression–high economic growth example we discussed
above illustrates this: The regression line is flat but there is a clear no-
observation zone, and we find that the importance of labor repression for
high economic growth is large. In terms of the equation above,
$\beta_1(.50)$ might not be significantly different from zero, but
$\beta_1(.95)$ might suggest an enormous impact of X on Y.
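The contrast between a flat median line and a steep upper boundary can be sketched on synthetic data. The example below is a minimal illustration under stated assumptions: the wedge-shaped data are simulated (they are not the labor repression data), the helper names are hypothetical, and the brute-force search over lines through pairs of points is a simple exact solver for the bivariate case rather than the linear programming routines used by standard software.

```python
import random


def check_loss(residual, tau):
    """Asymmetric absolute loss: weight tau above the line, 1 - tau below."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def fit_quantile_line(xs, ys, tau):
    """Exact bivariate quantile regression by brute force.

    An optimal line can always be found among the lines passing through
    two data points, so every such pair is tried and the line with the
    lowest total check loss is kept. Cost grows as n**3, so this is only
    suitable for small samples.
    """
    best = None
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            if xs[i] == xs[j]:
                continue  # a vertical line is not a regression line
            slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
            intercept = ys[i] - slope * xs[i]
            loss = sum(check_loss(y - (intercept + slope * x), tau)
                       for x, y in zip(xs, ys))
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    return best[1], best[2]


# Simulated wedge: the spread of Y grows with X, symmetrically around zero.
random.seed(42)
xs = [random.uniform(1, 10) for _ in range(100)]
ys = [x * random.uniform(-1, 1) for x in xs]

a50, b50 = fit_quantile_line(xs, ys, 0.50)
a95, b95 = fit_quantile_line(xs, ys, 0.95)

# Count observations strictly above the .95 line (small tolerance for rounding).
above_95 = sum(1 for x, y in zip(xs, ys) if y - (a95 + b95 * x) > 1e-9)
print(f"median slope: {b50:.2f}, .95 slope: {b95:.2f}, above .95 line: {above_95}")
```

By construction, the center of these data is flat while the upper edge rises with X, so the .95 slope should be clearly larger than the median slope, and only a handful of the 100 observations should lie above the .95 line.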
So while quantile regression was originally developed more as a robust tech-
nique for regression (focusing on the median and no distributional assumptions)
it has found perhaps its most important applications in areas where boundaries
are of key empirical and theoretical importance. For example, a major area of
application is ecology, where often one wants to know about the carrying capac-
ity of environments. Cade and Noon in their introduction to quantile regression
for ecologists make this argument: ‘‘The ecological concept of limiting factors
as constraints on organisms often focuses on rates of change in quantiles near the
maximum response, when only a subset of limiting factors are measured’’
(2003:413). This quote uses the terms we have often seen where the focus is
on the zone of no observations, such as ‘‘constraints’’ and ‘‘limiting factors.’’
It is perhaps not an accident that five of the six scatterplots Cade and Noon chose
to illustrate quantile regression have triangular no-data regions.
As we will see in Tables 7 and 8, one typically estimates a number of
quantile regression lines. In part, this is because quantile regression is sensitive to outliers,
particularly at extreme percentiles, but also because the researcher may be
interested in the changing relationship between X and Y at different percentiles.
As our example of using quantile regression, we continue with the exam-
ple of the relationship between wealth and democracy. We borrow some data
from Gerring (2007), who looks at GDP/capita and polity democracy scores
for 1995, excluding countries with high GDP/capita from oil revenues, for
example, oil monarchies.
Przeworski et al. (2000) have provocatively argued that the wealth–democ-
racy relationship is not the one proposed by modernization or endogenous
growth models. What wealth does is to prevent democracies from lapsing back
into authoritarianism. We can express their proposition in terms of sufficient
conditions, hence a hypothesis about floors: Democracy and a high level of
GDP/capita are sufficient for no transition to authoritarianism. In this formula-
tion, we have a theory that predicts a floor pattern in the data. Figure 5 shows
that in fact we do see a floor pattern (the extreme outlier is Singapore).
The first key principle in using quantile regression for our purposes is to
estimate boundary lines for a range of quantiles. Table 7 illustrates this for
the floor of the wealth–democracy data, where we have calculated lines for
.01–.20 quantiles. This is important because in any given situation we do not
know how many counterexamples are best to allow into the floor zone.
It would be useful to have a method for determining which of the various
quantile regression lines is the ‘‘best’’ according to some reasonable criteria.
In determining the OBL, we have several criteria. In Table 7, we have the
following key variables in the columns, where S = scope size10:
The OBL formula allows us to balance the costs of allowing in more coun-
terexamples against the benefits of increasing the exclusion zone. A decision
rule would be to take the maximum OBL score to determine the ‘‘best bound-
ary line’’ for a given floor or ceiling.
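Note 13 motivates combining the criteria by multiplication. The sketch below illustrates, with purely hypothetical scores on a 0 to 1 scale (these are not the OBL components or any values from Table 7), why a product rewards balance across criteria where a sum does not, and how a maximum-score decision rule would then be applied. The candidate labels and variable names are assumptions made for illustration.

```python
import math

# Hypothetical criterion scores for three candidate boundary lines.
# These are made-up numbers for illustration only.
candidates = {
    "A": (0.9, 0.9, 0.1),  # strong on two criteria, very weak on the third
    "B": (0.6, 0.6, 0.6),  # balanced across all three
    "C": (0.2, 0.8, 0.8),
}

# Multiplicative combination (Cobb-Douglas form with exponents of one).
product_scores = {name: math.prod(scores) for name, scores in candidates.items()}
# Additive combination, shown only for contrast.
sum_scores = {name: sum(scores) for name, scores in candidates.items()}

# Decision rule: take the candidate with the maximum combined score.
best_by_product = max(product_scores, key=product_scores.get)
best_by_sum = max(sum_scores, key=sum_scores.get)
print(best_by_product, best_by_sum)  # prints: B A
```

Under the additive rule the lopsided candidate A wins because a very low score on one criterion can be offset by high scores elsewhere. Under the multiplicative rule a near-zero score on any criterion drags the whole product down, so the balanced candidate B is preferred.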
We think that the OBL formula is quite useful for choosing the best line
within a data set or population. However, we eventually want to be able to
make some comparisons across studies. One way to do this is to take a fixed
standard. For many reasons, an obvious choice is the .95 quantile regression.
Using the .95 quantile regression means that there are, on average, 5 percent
counterexamples. This means we find a 5 percent error rate acceptable, and
reflects the fact that we take measurement error into account. For example,
Braumoeller and Goertz (2000) use this standard. Obviously 0.05 is the com-
mon standard for type I error in statistical studies. Using the .95 quantile
regression line means that we will always have roughly the same proportion
convention the range 7–10. This means that out of a range of 21 (i.e., –10 to
10), democracy is only a relatively small part of the whole scale, i.e., 4/21 =
.19. The second argument for the .15 line looks at the theoretical context.
Przeworski’s central argument was about a floor for democracy. While the
.15 line produces many counterexamples, they are located clearly in the
authoritarian region; there are virtually no counterexamples in the democracy
zone.14
The key thing is that what we are really interested in is the boundary, not
the line through the middle of the data. This boundary is implying that if a
country has a given GDP/capita level it is not going to slip below a certain
level of democracy. The implication is that it will not transition to levels
of democracy–authoritarianism below that floor.
As we have noted above (for example, in Table 6), many scholars have noted
necessary condition relationships in these data. The ceiling hypothesis is:
high GDP/capita is a necessary condition for democracy. Hence, it is useful
to look at the ceiling boundary for the data in Figure 5. The procedure for
ceilings is the same as for floors except that one uses the .80–.99 quantiles
instead of .01–.20.
Here the data are much better behaved and have a much clearer triangular
shape. Unlike the floor data, we are clearly in the zone of democracy in the
upper left corner. The OBL scores in Table 8 show once again that we have a
choice for the optimal line. The actual maximum OBL value is for the .87
quantile, but we get quite good scores for the .95 quantile. Given that the data
and their scales are not problematic for the ceiling, we think that following
the .95 rule makes a lot of sense. We have six observations above the ceiling
line, which is 3.5 percent of the whole data set.
As Table 8 reports, the size of the ceiling zone is much smaller than the
floor zone (in Table 7). So if we consider the scope of all the data, one
might be tempted to conclude that the floor is more important than the ceil-
ing, but when we look at the ceiling we are no longer really looking at the
scope of all the data, so one needs to take into account the changing nature
of scope.
This ceiling illustrates another key point of boundary line analysis: often
we are interested in regions of the scope. In our particular case, scholars have
been very interested in high-quality democracy or democracy in general. As
we have stressed in our brief literature review, many have thought about the
wealth–democracy relationship in terms of necessary conditions. Perhaps the
most important and common version of this is that wealth is a necessary
condition for democracy (e.g., tested in Braumoeller and Goertz 2000). If this is the
proposition of interest, then we limit ourselves (as we did analogously for high
growth in the Geddes example) to the 7–10 region of polity scores. Taking the
standard .95 we have an important constraint at almost 25 percent of scope.
All of a sudden, what was a relatively unimportant ceiling in general
becomes a significant one in the context of a specific hypothesis. The
data indicate that it is very difficult for a poor country to become a
democracy and even more difficult to be a high-quality democracy, that
is, polity = 10.
Our discussion of floors and ceilings illustrates that the substantive inter-
pretation of the ceiling and floor zones is critical in many cases. High-quality
democracy is only a small region of the polity authoritarianism to democracy
scale; it is only one level of a possible 21 levels on the polity scale. But sub-
stantively we have a great interest in the causes and consequences of good
democracy. This example illustrates how important the definition of the
scope is in evaluating ceilings and floors. We think that one of the more
novel aspects of our boundary methodology is its explicit inclusion of scope
considerations into the calculations.
Our very brief analysis of ceilings and floors in the wealth–democracy
relationship illustrates the strength of the quantile regression methodology
and the usefulness of asking about regions of no observations. Our brief anal-
ysis has produced four important results:
Notice that our looking for no-data zones means that we have potentially a
variety of conclusions and results even though it is just a bivariate scatterplot.
A typical statistical analysis would estimate the line through all the data and
one would have one parameter estimate of interest. Here we see that in fact
we have a series of conclusions depending on the region of the data we are
looking at. Often these are a combination of very strong results about ceilings
or floors, combined with very weak results in areas where the scatterplot
looks pretty random. Thus, looking for ceilings or floors is an interesting way
to dissect data for strong relationships.
Conclusions
Our focus on ceilings and floors allowed us to integrate many of the disparate
findings in the literature relating wealth to democracy. Instead of a set of iso-
lated empirical findings, we have a consistent set of relationships. Instead of
looking at one line through the middle of the data, we have seen that there are
multiple regions where there are few data points; these correspond to well-
known claims.
We have suggested that ceiling–floor hypotheses, theories, and data are
not uncommon in political science and sociology. We have also suggested
that some fields are more likely to formulate these than others. One area in
particular we think is quite full of these hypotheses is the wide variety of lit-
eratures on the causes and consequences of political institutions. To get a feel
for the extent of floor and ceiling hypotheses, we examined a prominent
anthology on comparative institutions, Steinmo, Thelen, and Longstreth
(1992) which has seven substantive chapters. Three of those chapters clearly
deal with ceiling issues: Weir’s chapter ‘‘Ideas and the politics of bounded
innovation,’’ Immergut’s (1992) chapter on veto players, which has a
triangular theory (figure 3.1), and Rothstein’s (1992) chapter on labor-market
institutions, which has some nice rectangular data (table 2.1).
Within the special topic of the causes, or at least correlates, of democracy,
Acemoglu and Robinson’s (2006) chapter 3 is quite useful in getting a feel
for the prevalence of triangular data. They provide a number of scatterplots
of democracy versus various popular independent variables, such as inequal-
ity, education, tax revenue, along with GDP/capita. Three of these four vari-
ables show a clear triangular relationship with democracy (tax revenue is the
exception).
Another area worthy of future work is triangular theories that arise from
game theoretic models. Triangular theories seem to arise naturally in game
theory settings; this potential linkage needs exploration. More generally,
Amartya Sen has stressed that the size of the choice set, in contrast to the
actual choice, is critical in understanding development and inequality
(1992:51–52).
We have only looked at bivariate relationships involving ceilings and
floors. An obvious question is how do control and confounding variables fit
into this analysis? One of the most important concerns in statistical and cau-
sal analysis is confounding variables. How do the floor and ceiling factors
interact with other causal variables? While it goes beyond the confines of
a single article, it is likely that things will look much different than in tradi-
tional statistical analyses. To get a sense of how things can be different,
Acknowledgments
We thank Jan Box-Steffensmeier, Bear Braumoeller, Rick Doner, Alex Hicks, Gary
King, and SMR reviewers for comments on earlier drafts of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or pub-
lication of this article.
Notes
1. We exclude cases where regions are excluded by definition; we only include
those where the ceiling or floor is determined by some causal mechanism.
2. There is no reason why the floor or ceiling must be a straight line or that one side must
lie on the X- or Y-axis. We shall see examples of this in our empirical analyses below.
3. Ragin’s fuzzy set methodology (2000, 2008) is another alternative which we do
not explore here.
4. As we noted above, and as illustrated by Figures 1 and 2, by definition necessary
or sufficient conditions produce no-data triangular scatterplots. One of the most
common responses to previous versions of this article was ‘‘I can produce trian-
gular scatterplots using other functional forms than necessary conditions.’’
Hence, we stress the following caveat: there are multiple theories and
mathematical functions that can produce triangular data distributions. Our posi-
tion is that necessary and sufficient condition and game theories are common and
popular theoretical approaches which produce such predictions. As a general
point, there are almost always multiple theories or data-generating procedures
that are consistent with the data. For example, one of the first things that one
learns in a mathematical statistics class is variance stabilizing transformations.
To produce a triangular scatterplot, one needs to come up with a variance
destabilizing transformation. One reader cleverly proposed the Stata code
Y = exp(X + X*invnorm(uniform())) where X is random uniform. Another way to get
triangular scatterplots is via interaction terms. For example, $Y = \beta XZ$
could produce a bivariate triangular scatterplot of X versus Y.
5. While we discuss this at greater length below, the logical language of sufficiency
is very close to that of probability or statistics; ‘‘If X then Y’’ becomes
‘‘$\mathrm{Prob}(Y \mid X)$ is very high.’’ Sometimes it is hard to tell if probability language is
being used when the author feels that a bald statement of sufficiency will be con-
tradicted by a few counterexamples.
6. Tufte (1969) gives another nice example where two scatterplots have the same $R^2$
but where one is quite triangular and the other is not.
7. Some might suggest that labor repression is an ordinal variable. But by conduct-
ing a regression analysis Geddes is treating it as an interval variable.
8. While we have suggested that quantile regression is a good technique for
estimating ceilings and floors, it is not the only one available. Data envelopment analysis and
stochastic frontier analysis are other options. As with statistical techniques in
general, there are various options that it would be worth exploring for their rela-
tive advantages and disadvantages.
9. We continue to use the term ‘‘no-observation zone’’ even though we now allow a
small number of counterexamples into this zone.
10. Note that within a given data set this will be a constant.
11. If one were focusing on comparisons across populations or data sets, one might
consider standardizing the number of counterexamples by the size of the
population, that is, ACE = Z/(C/N).
12. For the ceiling we use just $\tau$.
13. Why multiplication? One way to think of combining criteria is via utility or pro-
duction functions. One can think of the three factors in the OBL procedure as labor,
capital, and so on in the classic Cobb-Douglas production function with expo-
nents of one. In our context, one property of this function is that it prefers situa-
tions of balance between the two criteria, which we think is a desirable one in this
case (see the discussion of ‘‘compromise’’ in Goertz 2004).
14. In fact, if we wanted to go with more complex boundaries, we could go to Figure
5 and draw a rectangle for the bottom half and a triangle for the top. This example
might be typical of the problems one will generally face in drawing the boundary
line. One balances the criteria we mentioned above, simplicity of shape, number
of observations left in the region, and maximizing the region size.
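The variance-destabilizing recipe mentioned in note 4 can be sketched in Python. This is a loose translation under assumptions: the proposed Stata expression is read here as Y = exp(X + X*invnorm(uniform())), X is taken to be uniform on (0, 1), and random.gauss(0, 1) stands in for invnorm(uniform()). The sample size, the seed, and the split at X = 0.5 are arbitrary choices made for illustration.

```python
import math
import random
import statistics

random.seed(0)
n = 2000

# X is uniform on (0, 1); Z is a standard normal draw.
xs = [random.random() for _ in range(n)]
# log(Y) = X + X * Z, so the spread of log(Y) grows with X.
ys = [math.exp(x + x * random.gauss(0.0, 1.0)) for x in xs]

# Compare the spread of log(Y) in the lower and upper halves of X.
low = [math.log(y) for x, y in zip(xs, ys) if x < 0.5]
high = [math.log(y) for x, y in zip(xs, ys) if x >= 0.5]
var_low = statistics.pvariance(low)
var_high = statistics.pvariance(high)
print(f"variance of log(Y): low X {var_low:.3f}, high X {var_high:.3f}")
```

Because the noise term is multiplied by X, observations are tightly clustered at small X and fan out at large X, which is the kind of triangular scatter the note describes arising without any necessary condition in the data-generating process.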
References
Abadie, A., J. Angrist, and G. Imbens. 2002. ‘‘Instrumental Variables Estimates of the
Effect of Subsidized Training on the Quantiles of Trainee Earnings.’’ Econome-
trica 70:91-117.
Acemoglu, D., and J. Robinson. 2006. Economic Origins of Dictatorship and Democ-
racy. Cambridge: Cambridge University Press.
Angrist, J., and J.-S. Pischke. 2009. Mostly Harmless Econometrics. Princeton, NJ:
Princeton University Press.
Apodaca, C. 1998. ‘‘Measuring Women’s Economic and Social Rights Achieve-
ments.’’ Human Rights Quarterly 20:139-72.
Barro, R. 1991. ‘‘Economic Growth in a Cross-section of Countries.’’ Quarterly Jour-
nal of Economics 106:407-43.
Braumoeller, B., and G. Goertz. 2000. ‘‘The Methodology of Necessary Conditions.’’
American Journal of Political Science 44:844-58.
Bueno de Mesquita, B. 1985. ‘‘The War Trap Revisited: A Revised Expected Utility
Model.’’ American Political Science Review 79:156-77.
Cade, B., and B. Noon. 2003. ‘‘A Gentle Introduction to Quantile Regression for
Ecologists.’’ Frontiers in Ecology and the Environment 1:412-20.
Camerer, C. 2003. Behavioral Game Theory: Experiments on Strategic Interaction.
Princeton: Princeton University Press.
Caruso, R. 2006. ‘‘A Trade Institution as a Peaceful Institution? A Contribution to
Integrative Theory.’’ Conflict Management and Peace Science 23:53-72.
Cingranelli, D., and T. Pasquarello. 1985. ‘‘Human Rights Practices and the
Distribution of Foreign Aid to Latin American Countries.’’ American Journal of Political
Science 29:539-63.
Cioffi-Revilla, C. 1983. ‘‘A Probability Model of Credibility: Analyzing Strategic
Nuclear Deterrence Systems.’’ Journal of Conflict Resolution 27:73-108.
Cioffi-Revilla, C., and H. Starr. 2003. ‘‘Opportunity, Willingness, and Political
Uncertainty: Theoretical Foundations of Politics.’’ Pp. 225-48 in Necessary Con-
ditions: Theory, Methodology, and Applications, edited by G. Goertz and H. Starr.
New York: Rowman & Littlefield.
Clark, W., M. Gilligan, and M. Golder. 2006. ‘‘A Simple Multivariate Test for Asym-
metric Hypotheses.’’ Political Analysis 14:311-31.
Dahl, R. 1956. A Preface to Democratic Theory. Chicago, IL: University of Chicago
Press.
De Soysa, I., and E. Neumayer. 2007. ‘‘Resource Wealth and the Risk of Civil War
Onset: Results from a New Dataset of Natural Resource Rents, 1970–1999.’’ Con-
flict Management and Peace Science 24:201-18.
Deyo, F. 1989. Beneath the Miracle: Labor Subordination in the New Asian Industri-
alism. Berkeley: University of California Press.
Diamond, L. 1992. ‘‘Economic Development and Democracy Reconsidered.’’ In
Reexamining Democracy: Essays in Honor of Seymour Martin Lipset, edited by
G. Marks and L. Diamond. Newbury Park, CA: Sage.
Diener, E., and M. Seligman. 2004. ‘‘Beyond Money: Toward an Economy of Well-
being.’’ Psychological Science in the Public Interest 5:1-31.
Doner, R., et al. 2005. ‘‘Systemic Vulnerability and the Origins of Developmental
States: Northeast and Southeast Asia in Comparative Perspective.’’ International
Organization 59:327-61.
Drezner, D. 2007. All Politics is Global: Explaining International Regulatory
Regimes. Princeton, NJ: Princeton University Press.
Dul, J., T. Hak, G. Goertz, and C. Voss. 2010. ‘‘Necessary Condition Hypotheses in
Operations Management.’’ International Journal of Operations & Production
Management 30:1170-90.
Duverger, M. 1954. Political Parties: Their Organization and Activity in the Modern
State. London, UK: Methuen.
Elkins, Z., and J. Sides. 2007. ‘‘Can Institutions Build Unity in Multiethnic States?’’
American Political Science Review 101:693-708.
Gartzke, E. 1998. ‘‘Kant We All Just Get Along? Opportunity, Willingness, and the
Origins of the Democratic Peace.’’ American Journal of Political Science 42:
1-27.
Geddes, B. 2003. Paradigms and Sand Castles: Theory Building and Research
Design in Comparative Politics. Ann Arbor, MI: University of Michigan
Press.
Gerring, J. 2007. Case Study Research: Principles and Practices. Cambridge: Cam-
bridge University Press.
Goertz, G. 2003. ‘‘The Substantive Importance of Necessary Condition Hypotheses.’’
Pp. 65-94 in Necessary Conditions: Theory, Methodology, and Applications, edi-
ted by G. Goertz and H. Starr. New York: Rowman & Littlefield.
Goertz, G. 2004. ‘‘Constraints, Compromises, and Decision Making.’’ Journal of
Conflict Resolution 48:14-38.
Goertz, G. 2012. ‘‘Descriptive–Causal Generalizations: ‘Empirical Laws’ in the
Social Sciences?’’ Pp. 85-108 in Oxford Handbook of the Philosophy of the Social
Sciences, edited by H. Kincaid. Oxford: Oxford University Press.
Goertz, G., and H. Starr (eds.). 2003. Necessary Conditions: Theory, Methodology,
and Applications. New York: Rowman & Littlefield.
Gordon, S., and A. Smith. 2004. ‘‘Quantitative Leverage Through Qualitative Knowl-
edge: Augmenting the Statistical Analysis of Complex Causes.’’ Political Analy-
sis 12:233-55.
Haggard, S. 1990. Pathways From the Periphery: The Politics of Growth in the Newly
Industrializing Countries. Ithaca: Cornell University Press.
Harvey, F. 2003. ‘‘Practicing Coercion: Revisiting Successes and Failures Using
Boolean Logic and Comparative Methods.’’ Pp. 147-78 in Necessary Conditions:
Theory, Methodology, and Applications, edited by G. Goertz and H. Starr. New
York: Rowman & Littlefield.
Heckman, J., H. Ichimura, and P. E. Todd. 1997. ‘‘Matching as an Econometric Eva-
luation Estimator: Evidence from Evaluating a Job Training Programme.’’ Review
of Economic Studies 64:605-54.
Hechter, M. 2000. ‘‘Nationalism and Rationality.’’ Studies in Comparative Interna-
tional Development 35:3-19.
Hibbs, D. 1977. ‘‘Political Parties and Macroeconomic Policy.’’ American Political
Science Review 71:1467-87.
Hoddie, M., and C. Hartzell. 2003. ‘‘Civil War Settlements and the Implementation
of Military Power-Sharing Arrangements.’’ Journal of Peace Research 40:
303-20.
Huth, P. 1996. Standing Your Ground: Territorial Disputes and International Con-
flict. Ann Arbor, MI: University of Michigan Press.
Immergut, E. 1992. ‘‘The Rules of the Game: The Logic of Health Policy-Making in
France, Switzerland, and Sweden.’’ Pp. 57-89 in Historical Institutionalism in
Comparative Analysis, edited by S. Steinmo, K. Thelen, and F. Longstreth. Cam-
bridge: Cambridge University Press.
Kaufmann, D., A. Kraay, and P. Zoido-Lobaton. 1999. ‘‘Governance Matters.’’
Manuscript. Policy Research Working Paper. World Bank.
Kenworthy, L. 2002. ‘‘Corporatism and Unemployment in the 1980s and 1990s.’’
American Sociological Review 67:367-88.
Koenker, R. 2005. Quantile Regression. Cambridge: Cambridge University Press.
Koenker, R., and G. Bassett. 1978. ‘‘Regression Quantiles.’’ Econometrica 46:33-50.
Langlois, C., and J.-P. Langlois. 2006. ‘‘Bargaining and the Failure of Asymmetric
Deterrence: Trading off the Risk of War for the Promise of a Better Deal.’’ Con-
flict Management and Peace Science 23:159-80.
Lijphart, A. 1990. ‘‘The Political Consequences of Electoral Laws, 1945–85.’’ Amer-
ican Political Science Review 84:481-96.
Lipset, S. 1959. ‘‘Some Social Requisites of Democracy: Economic Development and
Political Legitimacy.’’ American Political Science Review 53:69-105.
Lipset, S. 1992. ‘‘Social Requisites of Democracy Revisited.’’ American Sociological
Review 59:1-22.
Author Biographies
Gary Goertz is professor at the Kroc Institute for International Peace Studies at the
University of Notre Dame. He is the author or editor of nine books and over 40
articles on issues of methodology, international institutions, and conflict studies,
including ‘‘Necessary Conditions: Theory, Methodology, and Applications’’ (2003,
Rowman & Littlefield), ‘‘Social Science Concepts: A User’s Guide’’ (2006,
Princeton University Press), ‘‘Explaining War and Peace: Case Studies and Necessary
Condition Counterfactuals’’ (2007, Routledge), ‘‘Politics, Gender, and Concepts:
Theory and Methodology’’ (2008, Cambridge University Press), and ‘‘A Tale of Two
Cultures: Qualitative and Quantitative Research in the Social Sciences’’ (2012,
Princeton University Press).
Tony Hak is an associate professor of research methodology at the Rotterdam School
of Management, Erasmus University, the Netherlands. His research interests include
necessary condition analysis, case study methodology, the challenges of academic
business surveys, and the obstacles to the application of the ‘‘new’’ statistics (with
a focus on effect sizes and meta-analysis) in business research. He is author (with oth-
ers) of several publications on the methodology of discourse analysis, conversation
analysis, the principles of coding and coder training, survey interviewing, cognitive
interviewing, and necessary condition analysis.
Jan Dul is a professor of technology and human factors at the Rotterdam School of
Management, Erasmus University, the Netherlands. His research interests include
effects of social-organisational and physical work environments on employee perfor-
mance, and business research methodology. He is author (with others) of several pub-
lications on (case study) research methodology and necessary condition analysis.