Simpson's Paradox
First published Mon Feb 2, 2004; substantive revision Thu Aug 6, 2009
Further, the arithmetical structures that invalidate such arguments pose deep problems for
inferences from statistical regularities to conclusions about causal relations. Robust
associations between variables can mask underlying causal structures that, when made
explicit, expose the associations to be causally spurious. In the example above, higher
recovery rates in each subpopulation are not sufficient to establish that a proposed
treatment is causally effective in promoting recovery. Provided that the sample space is
large enough to support causal inferences, different partitions of the population will
exhibit different regularities that can appear to support incompatible conclusions about
whether a treatment is causally effective. However, once the arithmetical structures that
underlie arguments like the one above are made explicit, the structures provide a rich
resource for providing causal models for actual and possible causal systems that are
initially puzzling and can appear to be impossible. These include causal models for the
evolution of traits such as altruism in a setting in which natural selection disadvantages
Simpson's Paradox (Stanford Encyclopedia of Philosophy) http://plato.stanford.edu/entries/paradox-simpson/
Section 1 provides a brief history of Simpson's Paradox, a statement and diagnosis of the
arithmetical structures that give rise to it, and the boundary conditions for its occurrence.
Section 2 examines patterns of invalid reasoning that have their sources in Simpson's
Paradox and possible ways of countering its effects. A particularly important case where
Simpson's Paradox has been invalidly employed is discussed in Section 3. It has been
mooted that paradoxical data provide counter-examples to the Sure Thing Principle in
theories of rational choice. Why such data appear to provide counter-examples to the
Sure Thing Principle is explained, and the appearance that they do so is dispelled. Section
4 discusses the roles and implications of paradoxical data for theories of causal inference
and for analyses of causal relations in terms of probabilities. While the conclusions of this
section are largely negative, Section 5 illustrates how apparently paradoxical data can
support causal models for the evolution of traits that at first appear to be incompatible
with a setting in which natural selection disadvantages individuals that exhibit the traits.
The death rate for African Americans was lower in Richmond than in New
York.
The death rate for Caucasians was lower in Richmond than in New York.
The death rate for the total combined population of African Americans and
Caucasians was higher in Richmond than in New York.
They next posed two questions about the data concerning mortality rates: “Does it follow
that tuberculosis caused [italics added] a greater mortality in Richmond than in New
York…” and “…are the two populations that are compared really comparable, that is,
homogeneous?” (Cohen & Nagel 1934). After posing the questions, they left it as an
exercise for the reader to answer them. Following the publication of Simpson's paper,
statisticians initiated a lively debate about the significance of facts like those that are
verified by the tables Cohen and Nagel cited. The debate sought constraints on statistical
practice that would avoid conundrums arising from actual and possible paradoxical data.
However, this debate did not address the first question posed by Cohen and Nagel
concerning causal inference. As Judea Pearl notes in his survey of the statistical literature
on Simpson's paradox, statisticians had an aversion to talk of causal relations and causal
inference that was based on the belief that the concept of causation was unsuited to and
unnecessary for scientific methods of inquiry and theory construction (Pearl 2000,
173–181).
The following interpretation of the structure illustrates why it can give rise to perplexity.
The example is loosely based on a discrimination suit that was brought against the
University of California, Berkeley (see Bickel et al., 1975).
              Men          Women
History       1/5     <    2/8
Geography     6/8     <    4/5
University    7/13    >    6/13
How can it be that each Department favours women applicants, and yet overall men fare
better than women? There is a ‘bias in the sampling’, but it is not easy to see exactly
where this bias arises. There were 13 male and 13 female applicants: equal sample sizes
for both groups. Geography and History had 13 applicants each: equal sample sizes
again. Nor does the trouble lie in the fact that the samples are small: multiply all the
numbers by 1000 and the puzzle remains. Then the reversal of inequalities becomes fairly
robust: you can add or subtract quite a few from each of those thousands without
disturbing the Simpson's Reversal.
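The arithmetic of the table can be checked directly. The following Python sketch is only an illustration: the dictionary layout and the helper names pooled, rates_reverse and scaled are assumptions introduced here, not part of the original example. It confirms that women have the higher success rate in each department, that men have the higher rate overall, and that multiplying every count by 1000 leaves the pattern unchanged.

```python
from fractions import Fraction

# (successes, applicants) per department, taken from the table above.
men = {"History": (1, 5), "Geography": (6, 8)}
women = {"History": (2, 8), "Geography": (4, 5)}


def pooled(group):
    """Overall success rate: total successes over total applicants."""
    successes = sum(s for s, _ in group.values())
    applicants = sum(n for _, n in group.values())
    return Fraction(successes, applicants)


def rates_reverse(men, women):
    """True when women lead in every department but men lead overall."""
    women_lead_each = all(
        Fraction(*men[dept]) < Fraction(*women[dept]) for dept in men
    )
    return women_lead_each and pooled(men) > pooled(women)


def scaled(group, k):
    """Multiply every numerator and denominator by the same factor k."""
    return {dept: (s * k, n * k) for dept, (s, n) in group.items()}


print("pooled rates:", pooled(men), pooled(women))
print("reversal in original table:", rates_reverse(men, women))
print("reversal after scaling by 1000:",
      rates_reverse(scaled(men, 1000), scaled(women, 1000)))
```

Exact fractions are used rather than floating-point numbers so that the comparisons are not affected by rounding.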
The key to this puzzling example lies in the fact that more women are applying for jobs
that are harder to get. It is harder to make your way into History than into Geography.
(To get into Geography you just have to be born; to get into History you have to do
something memorable.) Of the women applying for jobs, more are applying for jobs in
History than in Geography, and the reverse is true for men. History hired only 3 out of 13
applicants, whereas Geography hired 10 out of 13 applicants. Hence the success rate was
much higher in Geography, where there were more male applicants.
Simpson's Reversal of Inequalities occurs for a wide range of values that can be
substituted for a, b, c, d, A, B, C, D in the above schema. The values fall within a broad
band that lies between two extremes:
On one extreme, slightly more women are applying for jobs that are much harder to get.
              Men           Women
History       1/45     <    5/55
Geography     50/55    <    45/45
University    51/100   >    50/100
On the other extreme, many more women are applying for jobs that are slightly harder to
get.
              Men           Women
History       4/5      <    90/95
Geography     94/95    <    5/5
University    98/100   >    95/100
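The two extreme tables can be compared side by side. The Python sketch below is illustrative only; the case labels and the helper names pooled and describe are assumptions made for this example. For each table it checks the reversal and reports how many of the women applied to History and how hard each department was to get into, which is what distinguishes "slightly more women, much harder jobs" from "many more women, slightly harder jobs".

```python
from fractions import Fraction

# Each entry lists (successes, applicants) as [History, Geography],
# copied from the two tables above.
cases = {
    "slightly more women, much harder jobs": {
        "men": [(1, 45), (50, 55)],
        "women": [(5, 55), (45, 45)],
    },
    "many more women, slightly harder jobs": {
        "men": [(4, 5), (94, 95)],
        "women": [(90, 95), (5, 5)],
    },
}


def pooled(pairs):
    """Add numerators and denominators, then form the ratio."""
    return Fraction(sum(s for s, _ in pairs), sum(n for _, n in pairs))


def describe(case):
    men, women = case["men"], case["women"]
    women_lead_each = all(
        Fraction(*m) < Fraction(*w) for m, w in zip(men, women)
    )
    men_lead_overall = pooled(men) > pooled(women)
    # Share of all female applicants who applied to History (index 0).
    women_in_history = Fraction(women[0][1], sum(n for _, n in women))
    # Hiring rate of each department, both sexes combined.
    history_rate = pooled([men[0], women[0]])
    geography_rate = pooled([men[1], women[1]])
    return (women_lead_each, men_lead_overall,
            women_in_history, history_rate, geography_rate)


for label, case in cases.items():
    print(label, describe(case))
```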
Further, the numerators and denominators of fractions that instantiate the schematic
pattern can be uniformly multiplied by any positive number without perturbing the
relations between the fractions. Fractions that exhibit these patterns correspond to
percentages and probabilities. In their probabilistic form, Colin Blyth provides the
following boundary conditions for Simpson's Reversals (Blyth 1972). Let ‘P’ represent a
probability function, and take conditional probabilities to be ratios of unconditional
probabilities in accordance with their orthodox definition; i.e., reading the ‘/’ in the
context P(--/..) as ‘given that’,
P(A/B&C) ≥ δ . P(A/~B&C)
P(A/B&~C) ≥ δ . P(A/~B&~C)
On the assumption that the propositions of arithmetic are necessary, these possibilities are
tantamount to existence conditions in arithmetic. The schema:
is valid in a large family of modal logics. The boundary conditions for Simpson's
Reversals allow that any probabilistic association between A and B can be inverted in
some further partition of B. From the standpoint of arithmetic there is a partition {C,~C}
within which associations between A and B are inverted. An important related
consequence is that it is always mathematically possible to provide some condition or
factor C that renders A probabilistically independent of B when C is conjoined with B as
a condition on A and with ~B as a condition on A. These facts of arithmetic carry no
empirical significance by themselves. However, they do have methodological significance
insofar as substantive empirical assumptions are required to identify salient partitions for
making inferences from statistical and probability relationships.
The need for substantive empirical assumptions arises in settings where there are
instances of arithmetical possibilities that are marked out by Simpson's Reversals in urn
models and in possible and actual empirical settings. For example, consider an urn model
for our story about the success rates for job applicants. The model consists of twenty-six
balls. Each ball is labeled with one of the elements from the sets {M, ~M}, {H, ~H}, and
{S, ~S}, e.g., a given ball might be labeled [~M, H, ~S]. Assume that the labels are
distributed to correspond to the distributions of job applicants. In trials of drawing balls
from the urn with replacement, the associations between the M's, H's, and S's in the
sub-populations, and the reverse association between M's and S's in the overall
population, are resilient. The resilient associations are due only to the structure of the
model and do not have any causal significance. By way of contrast, substantive
assumptions are required to draw inferences in other cases.
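The urn model can be simulated. The sketch below is a hedged illustration in Python: it assumes the twenty-six balls are labeled to match the first job-applicant table (men 1/5 and 6/8, women 2/8 and 4/5), and the function names, the fixed seed and the number of draws are choices made here rather than anything specified in the text. Drawing with replacement, the relative frequencies settle near the proportions fixed by the labels, so the reversal shows up without any causal story behind it.

```python
import random


def make_urn():
    """Build 26 balls labeled (is_male, is_history, is_success)."""
    cells = [
        # (male, history, successes, applicants), from the first table
        (True, True, 1, 5),
        (True, False, 6, 8),
        (False, True, 2, 8),
        (False, False, 4, 5),
    ]
    urn = []
    for male, history, successes, applicants in cells:
        urn += [(male, history, True)] * successes
        urn += [(male, history, False)] * (applicants - successes)
    return urn


def success_rate(draws, male, history=None):
    """Relative frequency of S among draws matching the given labels."""
    hits = [s for m, h, s in draws
            if m == male and (history is None or h == history)]
    return sum(hits) / len(hits)


rng = random.Random(0)               # fixed seed so runs are repeatable
urn = make_urn()
draws = rng.choices(urn, k=200_000)  # sampling with replacement

for label, history in (("History", True), ("Geography", False)):
    print(label,
          "men:", round(success_rate(draws, True, history), 3),
          "women:", round(success_rate(draws, False, history), 3))
print("Overall",
      "men:", round(success_rate(draws, True), 3),
      "women:", round(success_rate(draws, False), 3))
```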
Patterns in data that fall within the boundary conditions for Simpson's Reversals of
Inequalities can raise problems for testing and evaluating empirical hypotheses, e.g.,
testing the effectiveness and safety of medical procedures. A course of treatment for a
malady that affects the staff of History and Geography can be correlated with a lower
death rate for treated compared with untreated patients in History, and a lower death rate
for treated compared with untreated patients in Geography; yet, the course of treatment
may nevertheless correlate with a higher death rate when treated patients are compared
with untreated patients overall. Conversely, a treatment can be correlated with higher
mortality rates in each sub-population, while it is correlated with a lower mortality rate in
the total population. In such cases it is far from clear what, if anything, to conclude from
the correlations about the effectiveness and safety of the treatment.[2] Moreover, with
patterns like those surmised for this example, different ways of partitioning the same data
can produce different correlations that appear to be incompatible with the correlations
under the initial way of partitioning the data. E.g., under a partition by academic
discipline, patients appear to fare worse when treated, even though there can be a
positive correlation in the total population between treatments and recoveries. This is
consistent with a positive correlation between treatments and recoveries when the
population is partitioned by gender. While Historians and Geographers each fare worse
given the treatment, males and females from the two Departments can each fare better
given the treatment, and these facts are consistent with the combined population faring
better, or with the combined population faring worse.[3]
The aforementioned possibilities are due to the fact that the following formulae are
collectively consistent. Take ‘P’ to be a probability function. Probability models can be
provided that verify the consistency of the set consisting of the following formulae:
Similar inequalities are possible with signs reversed, and equalities that represent
probabilistic independence are consistent with positive and/or negative associations in
partitions of the populations. These facts are not paradoxical from an arithmetical point of
view. However, regularities that can be represented by them cannot all be assigned causal
significance, and probabilistic equalities that are sufficient for probabilistic independence
cannot all be taken to represent causal independence.
Standard statistical methods for significance testing offer no insurance against conflicting
results when data are partitioned or consolidated. In a setting where the effectiveness of a
new medical treatment is under test, the following data support rejecting the null
hypothesis, at the .05 level, that treatment (T) makes no difference to recovery (R),
where the alternative to the null hypothesis is that treatment is favorable for recovery.
        R      ~R
T       369    340
~T      152    176
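The significance claim for the consolidated table can be reproduced approximately. The text does not say which test was used, so the Python sketch below assumes a one-sided two-proportion z-test with a pooled standard error; the function name and the choice of test are assumptions made for illustration. Under that assumption the treated group's recovery rate exceeds the untreated group's by enough to reject the null hypothesis at the .05 level.

```python
import math


def one_sided_two_proportion_z(r_treated, n_treated, r_control, n_control):
    """z statistic and upper-tail p-value for H1: treated rate > control rate."""
    p_treated = r_treated / n_treated
    p_control = r_control / n_control
    pooled = (r_treated + r_control) / (n_treated + n_control)
    std_err = math.sqrt(
        pooled * (1 - pooled) * (1 / n_treated + 1 / n_control)
    )
    z = (p_treated - p_control) / std_err
    # Upper-tail probability of a standard normal variable: P(Z >= z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Counts from the consolidated table: T row (369, 340), ~T row (152, 176).
z, p = one_sided_two_proportion_z(369, 369 + 340, 152, 152 + 176)
print("recovery rate, treated:  ", round(369 / (369 + 340), 3))
print("recovery rate, untreated:", round(152 / (152 + 176), 3))
print("z =", round(z, 2), " one-sided p < .05:", p < 0.05)
```

A different test, such as a chi-squared test or Fisher's exact test, could give a slightly different p-value for the same counts.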
However, in this model, when the population is further partitioned by gender, the opposite
recommendation for males and for females is supported at the .05 level of significance.
~T 73 145 79 31
Take the null hypothesis to be that there is no association between treatments and
recoveries, and the alternative to the null hypothesis that treatment is less favorable for
recovery than no treatment. Rejecting the null hypothesis falls within the .05 level of
significance for both the M-tables and the ~M-tables. So, when the consolidated data are
considered, treatment is favored, but when the population is partitioned by gender, no
treatment is favored for both males and females. A further partition, e.g., a partition by
age groups, can reverse the associations within partitions by gender. So treatments can be
positively correlated with recoveries in the total population, negatively correlated with
recoveries when the population is partitioned by gender, and positively correlated with
recoveries when the population is partitioned by age. The generality of the boundary
conditions for Simpson's reversals of inequalities guarantees that there always are models
in arithmetic that accommodate data and support conflicting recommendations. Arithmetic
is silent on which partitions to take as the basis for evaluating conflicts between
hypotheses given data and the ways data can be partitioned.
Now, treating terms as proper fractions, we can have a/b = 2a/2b, and A/B = 5A/5B; c/d
= 3c/3d, and C/D = 4C/4D. However, when these equivalent representations are pooled,
the resulting relations between fractions will often differ from the original relations. E.g.,
(2a + 3c)/(2b + 3d) can be more or less than (a + c)/(b + d). Hence, it is invalid to
conclude that relations between percentages or ratios when data are pooled will conform
to the regularities that are exhibited by the sets that comprise partitions of the data.
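The effect of pooling equivalent representations can be seen in a few lines of Python. In the sketch below, the values chosen for a/b and c/d are borrowed from the job-applicant table purely for illustration, and the helper names pool and rescale are assumptions. Each rescaled pair denotes the same ratio as before, yet the pooled result moves above or below (a + c)/(b + d) depending on which ratio is given the larger multiplier.

```python
from fractions import Fraction


def pool(parts):
    """Pool (numerator, denominator) pairs by adding tops and bottoms."""
    return Fraction(sum(n for n, _ in parts), sum(d for _, d in parts))


def rescale(pair, k):
    """Equivalent representation of the same ratio: a/b becomes ka/kb."""
    return (pair[0] * k, pair[1] * k)


a_b = (1, 5)  # illustrative value for a/b
c_d = (6, 8)  # illustrative value for c/d

# Rescaling does not change the individual ratios.
print(Fraction(*a_b) == Fraction(*rescale(a_b, 2)))
print(Fraction(*c_d) == Fraction(*rescale(c_d, 3)))

original = pool([a_b, c_d])                                # (a + c)/(b + d)
heavier_cd = pool([rescale(a_b, 2), rescale(c_d, 3)])      # (2a + 3c)/(2b + 3d)
heavier_ab = pool([rescale(a_b, 3), rescale(c_d, 2)])      # (3a + 2c)/(3b + 2d)

print(original, heavier_cd, heavier_ab)
print(heavier_ab < original < heavier_cd)
```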
Equivalent representations of ratios make different contributions when data are pooled.
A related point comes out even more vividly when fractions are interpreted as
probabilities. It was noted above that a Simpson's Reversal can take the following
probabilistic form: It is possible to have
One way for intuitive reasoning to overlook this possibility is by overlooking the so-called
law of total probability and its relevance to this setting. From the probability calculus we
have the following equivalences that represent probabilities as weighted averages.
Skewed weights for P(B/C), P(B/~C), P(~B/C), and P(~B/~C) create the range of
possibilities that are marked out by the boundary conditions for Simpson's Reversals.
E.g., let P(A/B) = .54 and P(A/~B) = .44. So, B is positively relevant to A. Let the
weights that feature in the representation of these probabilities in terms of a factor C be
as follows: P(B/C) = .28, P(~B/C) = .72, P(B/~C) = .66, and P(~B/~C) = .34. Given
these weightings, B will be positively relevant to A, but it will be negatively relevant to A
in each of the cells provided by the partition {C, ~C}. I.e., P(A/B&C) = .27 and
P(A/B&~C) = .33, and P(A/~B&C) = .64, and P(A/~B&~C) = .66.[4] If, as a matter of
habit, intuitive reasoners tend to ignore the effects of such skewing, they will be taken
aback when Simpson's Reversals turn up in actual and possible data. Of course, it is an
empirical question whether such oversight is the source of invalid reasoning, or whether
another hypothesis better explains why many people find Simpson's Reversals to be
impossible at first, and why the reversals continue to be surprising even after their source
In theories of rational choice in which preferences are ordered by the rule of maximizing
expected utility, STP is a consequence of the fact that the expected utility of an option
can be represented as a probabilistically weighted average of the expected utilities of
mutually exclusive and collectively exhaustive ways the world could be on the assumption
that the option is chosen. E.g., with ‘EU’ representing a function that assigns expected
utilities and ‘P’ a probability function,
When you know that B holds, it becomes a parameter for the expected utility of A, and
similarly when you know that ~B holds. So if the expected value that is assigned to C is
less than that assigned to A on the assumption that you know that B obtains, and similarly
on the assumption that B does not obtain, then the expected value of C is unconditionally
less than the expected value of A.
Now suppose that you are offered bets on applicants gaining jobs in the example
concerning the two departments. Your options are to bet on a randomly drawn successful
applicant being male, or to bet on a randomly drawn successful applicant being female.
Let C be the event of applying for a job in History, and ~C be the event of applying for a
job in Geography. (Every person in the relevant domain applies for exactly one position.)
Given that the success rates for females were greater than those for males in both
departments, does the STP recommend that you should back females as the bettor's
choice? One might (invalidly) reason as follows: given that females have a greater chance
of success in their applications given C and given ~C, STP recommends a preference for
bets on females in a lottery in which you are betting on the gender of successful
applicants. Of course, this would be bad advice in the setting of the example, as the
success rate for males was greater overall. Given a suitably large number of bets, a clever
bookie could be assured of a handsome profit if bettors backed females in the
competitions for jobs. Their success rate was lower than their male competitors’ success
rate overall despite being higher in each department.
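The bookie's advantage can be computed from the first job-applicant table. The Python sketch below is illustrative; the even-money stake and the helper names are assumptions that the text does not specify. It counts the successful applicants in the pooled mixture and gives the chance that a random draw from that mixture is male or female.

```python
from fractions import Fraction

# Successful applicants by gender and department, from the first table
# (men 1/5 and 6/8, women 2/8 and 4/5).
successes = {
    ("male", "History"): 1,
    ("male", "Geography"): 6,
    ("female", "History"): 2,
    ("female", "Geography"): 4,
}

total = sum(successes.values())


def chance_of_drawing(gender):
    """Chance that a random successful applicant has the given gender."""
    count = sum(n for (g, _), n in successes.items() if g == gender)
    return Fraction(count, total)


def expected_gain_even_money(gender):
    """Expected gain per unit staked, assuming an even-money bet."""
    p = chance_of_drawing(gender)
    return p - (1 - p)


print("P(male):", chance_of_drawing("male"))
print("P(female):", chance_of_drawing("female"))
print("expected gain backing females:", expected_gain_even_money("female"))
```

On these figures a bettor who backs females at even money expects to lose a little on every bet, even though females have the higher success rate in each department.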
To see what has gone awry in the attempt to apply STP in this setting it suffices to note
that a random draw from successful applicants is made from the mixture that contains
males and females, and there are more males in the mixture. (Recall that females were
applying in greater numbers for jobs that were harder to get.) It is insufficient for the
applicability of the Principle that probabilities line up with females having a greater
chance of success in each department. The Principle applies to preferences, taken as
weighted averages of utilities with probabilities supplying the weights. The presented
options are
To be told that a selected applicant applied for a position in History (C) or in Geography
(~C) does not affect the probabilities of success in the mixture. This is evident when the
expected utilities of the options are explicitly represented as weighted averages. Using
‘M’ for male, ‘~M’ for female, ‘S’ for successful, and ‘C’ and ‘~C’ as above, the
expected utilities for the options are as follows.
Option 1: EU(~M&S) =
EU(~M&S&C)P(C/S&~M) + EU(~M&S&~C)P(~C/S&~M)
Option 2: EU(M&S) =
EU(M&S&C)P(C/S&M) + EU(M&S&~C)P(~C/S&M)
Given the figures that were used in the example, the probability relations between the
weightings are as follows:
It is these relations that are the source of the illusion that STP selects Option 1. The
probability of a successful female applicant having applied for a position in History is
greater than that of her male competitor among the applicants in History, and similarly for
females in Geography. If the candidates had been sorted by their applications to the
respective departments, where females had higher success rates, and the drawing was
done from a randomly chosen department (with repeated draws and replacement until a
successful applicant is drawn) rather than from the mixture of successful applicants, then
the best choice would be for the gender with the higher success rates in the respective
departments, i.e., females. Such an arrangement would not be affected by the fact that
more women applied for jobs that were harder to get. But that is not the arrangement that
has been stipulated for the bets where selection is made from the pooled successful
applicants. The chances of selecting a male (or a female) from that mixture are
independent of the department to which the successful applicants had applied.
Accordingly, rational bettors will find STP to be inapplicable in the setting, because they
will not have the preferences that its application requires, i.e., a preference for females,
given that they applied for a job in History (C), and a preference for females, given that
they applied for a job in Geography (~C). For rational bettors, EU(~M&S) =
EU(~M&S&C) = EU(~M&S&~C), and similarly for M's, while, on the figures provided
in the example, EU(~M&S) < EU(M&S).
Simpson's Reversal of Inequalities illustrates that from an arithmetical point of view, there
always is a factor or proposition C that ‘screens off’ any correlation. The existence of
such a factor cannot be sufficient for a correlation to be spurious. For example, suppose
that the probability of A given B is greater than without B. The following diagram
illustrates this possibility with probabilities corresponding to the proportional sizes of
enclosed spaces with all of A represented by the enclosed rectangle that is intersected by
the line dividing B from ~B.
The boundary conditions for Simpson's Reversals guarantee that there is a C that
intersects equal parts of A&B and A&~B. In Section 1 it was noted that arithmetical
possibilities are tantamount to existence conditions for arithmetical facts. Provided that a
sample space can be partitioned sufficiently finely, the probabilistic relevance between A
and B can be “washed out” by some arbitrary factor C within which the probabilities of
A&B and A&~B are equal. The following diagram illustrates this arithmetical possibility:
P(A/B&C) = P(A/~B&C)
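A finite example shows how such a factor C can be manufactured by arithmetic alone. The Python sketch below uses hypothetical cell counts, and the construction in washing_out_factor is just one arbitrary way of choosing C (take the same number of members from every cell); both are assumptions made for illustration. In the whole population B is positively relevant to A, but within C the two conditional probabilities coincide.

```python
from fractions import Fraction

# Hypothetical counts for the four cells, keyed by (a, b),
# chosen so that P(A/B) > P(A/~B).
counts = {
    (True, True): 6,    # A & B
    (False, True): 4,   # ~A & B
    (True, False): 3,   # A & ~B
    (False, False): 7,  # ~A & ~B
}


def prob_a_given(cells, b):
    """P(A / B = b), computed from cell counts keyed by (a, b)."""
    return Fraction(cells[(True, b)], cells[(True, b)] + cells[(False, b)])


def washing_out_factor(cells):
    """One arbitrary C: the same number of members from every cell."""
    k = min(cells.values())
    return {cell: k for cell in cells}


c = washing_out_factor(counts)

print("P(A/B)    =", prob_a_given(counts, True))
print("P(A/~B)   =", prob_a_given(counts, False))
print("P(A/B&C)  =", prob_a_given(c, True))
print("P(A/~B&C) =", prob_a_given(c, False))
```

Nothing about this C has causal significance; it is picked out only by counting.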
The inference that lawfully correlated variables are causally independent of each other if
the correlation is due to a common cause is a special case of a more general view that
causes increase the chances of their effects.[5] When there is a common cause C of a
correlation between variables B and A, B does not cause A; the raising of A's chances is
due to C, and while B might be a symptom of A, it is so by virtue of being a separate
effect of C that precedes A. The following diagram illustrates these relationships. (Arrows
represent the directions of causal connections.)
Given C, B does not raise A's chances. The underlying idea behind analyses of causation
in terms of chance raising is that causes promote their effects. In deterministic settings,
chances take only extreme values, and causes do not ‘raise’ an effect's chances of
occurring except in the degenerate sense that they raise the chances of their effects from
zero without them to one with them (excluding cases of deterministic overdetermination).
However, it is a contingent matter whether the world we inhabit is deterministic or
E.g., if smoking increases the chances of heart disease by 25%, but also increases the
chances of regular exercise by 40% while exercise decreases the chances of disease by
70%, smokers will on balance benefit from their habit with respect to cardiovascular
health. In this set-up, there could be a Simpson's Reversal where smokers who exercise
fare worse than non-smokers who exercise, and similarly for smokers who do not
exercise compared with non-smokers, while the smokers’ rates of disease are lower
overall. The net causal effect of smoking on health is positive in the example due to the
contribution of a third variable, exercising, that is an effect of smoking. It is the causal
contributions of further variables that are the sources of Simpson's Reversals in other
causal set-ups where the effects of direct causal links are modified by the additional
variables’ contributions. These include cases where direct effects are nullified by
inhibitory effects of an accompanying factor, e.g., substances that are separately
poisonous, acid and alkali, can interact to have no deleterious effect when they are taken
together. Each acts as an antidote for the other.[6] Further entanglements include cases
where a cause that promotes an effect is accompanied by an inhibitory cause of the effect
and they are both effects of a common cause. E.g.,
A common cause
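The smoking and exercise example given earlier can be made concrete with a small calculation. The percentages 25%, 40% and 70% come from the text; everything else in the Python sketch below is an assumption made for illustration: the baseline disease rate, the baseline exercise rate, reading each percentage as a relative change, and treating the two effects on disease as multiplying together. Under those assumptions smokers fare worse than non-smokers within each exercise group and yet have the lower disease rate overall.

```python
from fractions import Fraction as F

base_disease = F(20, 100)     # assumed: non-smoker who does not exercise
base_exercise = F(50, 100)    # assumed: share of non-smokers who exercise
smoking_factor = F(125, 100)  # disease chance up 25% (from the text)
exercise_boost = F(140, 100)  # exercise chance up 40%, read as relative
exercise_factor = F(30, 100)  # disease chance down 70% (from the text)


def disease_rate(smoker, exercises):
    """Disease chance in one cell, assuming the factors multiply."""
    rate = base_disease
    if smoker:
        rate *= smoking_factor
    if exercises:
        rate *= exercise_factor
    return rate


def overall_rate(smoker):
    """Disease chance averaged over exercisers and non-exercisers."""
    p_exercise = base_exercise * (exercise_boost if smoker else 1)
    return (p_exercise * disease_rate(smoker, True)
            + (1 - p_exercise) * disease_rate(smoker, False))


for exercises in (True, False):
    print("exercises:", exercises,
          "smoker:", disease_rate(True, exercises),
          "non-smoker:", disease_rate(False, exercises))
print("overall smoker:", overall_rate(True),
      "overall non-smoker:", overall_rate(False))
```

With other baselines, or with the 40% read differently, the overall comparison can go the other way, which is why the example only shows that a reversal could occur in such a set-up.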
Is Cartwright's observation cause for pessimism about the program of analyzing causation
and causal relevance in probabilistic terms? Not necessarily. It sets a problem about
causal entanglements that are not tracked by probability relations and probabilistic
entanglements that are not due to causal relations. The program of providing probabilistic
representations of causal relations needs to provide conditions that disentangle causal
networks. What is required is a way of locating the right partitions of populations, where
the right ones are the ones whose probability relations do track causal connections while
holding relevant background factors fixed. A number of different proposals have been put
forward in the literature on probabilistic causation that aim to provide criteria for locating
the right partitions of data for the purpose of identifying causal connections.
The proposals fall into two broad categories: (1) Reductive proposals: these do not appeal
to causal concepts and they aim to provide a filter on correlations that identifies which
correlations are spurious. Correlations that are not spurious are meant to conform to
intuitions about causal relations and to implement the roles that are intuitively assigned to
causal relations.[7] (2) Non-reductive proposals: these are unabashed about using causal
concepts to distinguish between spurious and causal correlations. Proposals from this
second group are generally skeptical about the Humean program that motivates reductive
proposals, and set-ups that are instances of Simpson's Reversals are one of their main
critical scalpels (Cartwright 1979, and especially Dupré & Cartwright 1988).
Nevertheless, they too face the problem of providing a filter on correlations that marks
out which of them are spurious, but they do not feel constrained to avoid reference to
causal relations in providing criteria for selecting partitions that provide reliable data for
causal inferences. In sum, both reductionists and anti-reductionists who endorse the
program of representing causal relations in terms of probability relations propose that
C causes E if and only if the probability of E is greater given C than given not
C, provided that …X….
The proviso is needed to filter cases where probability relations between C-type events
and E-type events do not track causal relations. Their opinions divide on whether causal
concepts need to or can be used without vicious circularity in spelling out the content of
the proviso …X…. Reductionists seek ways of spelling out the proviso in terms of
homogenous reference classes, where homogeneity is spelled out in terms of robust
correlations conditional on a set of factors that are held fixed. Anti-reductionists are quick
to ask: which factors? To take all possible factors to be relevant is not only
epistemologically intractable, but it can lead to silly conclusions insofar as all but
absolutely fundamental causal processes can be manipulated by introducing some
intervening factors. E.g., the probability of death given a heart attack is greater than
without the heart attack, but the contribution of the heart attack is ‘screened off’ in cases
where the heart attack coincides with being run down by a truck. In this example, the
chances of death are overdetermined. Cases of causal overdetermination are extreme
examples of causal networks in which probabilistic relevance is washed out or inverted
by the causal contributions of an exogenous variable. In the experimental sciences,
attempts at isolating interactions between factors from intervening variables are standard
procedure. However, what is achievable even in the best laboratory conditions will fall
short of the ideal of showing that there are no intervening factors on which a correlation
is dependent. To show the latter would require showing that a negative existential
proposition is true.
Anti-reductionists have a ready answer to the question of which factors have to be held
fixed when evaluating probabilistic dependencies and probabilistic independence. They
want all potentially causally relevant factors that are of interest to be held fixed for the
purposes of identifying the probability relations between C and E that are due to and are
apt for representing causal connections. According to this approach, reference classes
that are causally homogenous provide the proper basis for evaluating probability relations.
One then looks to background scientific theories and other knowledge of causal relations
to determine whether reference classes are causally homogenous.[8] In many cases,
however, our curiosity about causal relations outstrips our current knowledge of causally
relevant variables that need to be held fixed. Then, inferences to causal relations from
statistical data that can always be counter-posed with reversed regularities in different
partitions of the data can lead to inconsistent claims concerning causal relations.
Next, map the History Department onto Norway during a very severe winter in Norway,
and suppose there are more rats than lemmings in Norway. Then life is tough for
everyone in Norway, and it is even tougher for lemmings than for rats. Map the
Geography Department onto Sweden which is in the midst of a very mild winter, and
suppose there to be more lemmings than rats in Sweden. Then life is easier for everyone
in Sweden, though it is even easier for free-riding and opportunistic rats than it is for
lemmings. Finally, consider the reproductive rates for rats and lemmings in the total land
mass of the two countries. (Or, if these ‘rats’ and ‘lemmings’ were businesses, consider
their relative bankruptcy rates.) The numbers might then display the same pattern that we
described for hiring rates of men and women at the University of California:
               Lemmings                    Rats
Norway         (1×10^9)/(5×10^9)     <     (2×10^9)/(8×10^9)
Sweden         (6×10^9)/(8×10^9)     <     (4×10^9)/(5×10^9)
Scandinavia    (7×10^9)/(13×10^9)    >     (6×10^9)/(13×10^9)
Lemmings are losing ground in Norway, and they are losing ground in Sweden; yet they
are gaining ground in the combined area of the two countries.
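The arithmetic behind this reversal can be checked directly. The following is a minimal sketch in Python, assuming that each fraction in the table is read as (successes)/(population) in units of 10^9; the dictionary layout and the names counts, rate and pooled_rate are illustrative choices, not part of the original example.

```python
from fractions import Fraction

# Table entries as (successes, population), in units of 10^9.
# Reading numerator/denominator this way is an assumption of this sketch.
counts = {
    "Norway": {"lemmings": (1, 5), "rats": (2, 8)},
    "Sweden": {"lemmings": (6, 8), "rats": (4, 5)},
}

def rate(successes, population):
    """Exact rate for one species in one region."""
    return Fraction(successes, population)

def pooled_rate(species):
    """Rate for a species over both regions: add the counts, then divide."""
    total_successes = sum(counts[region][species][0] for region in counts)
    total_population = sum(counts[region][species][1] for region in counts)
    return Fraction(total_successes, total_population)

# Within each region, lemmings do worse than rats.
lemmings_lose_everywhere = all(
    rate(*by_species["lemmings"]) < rate(*by_species["rats"])
    for by_species in counts.values()
)

# Over the combined area, lemmings do better than rats.
lemmings_win_overall = pooled_rate("lemmings") > pooled_rate("rats")

print(lemmings_lose_everywhere, lemmings_win_overall)  # True True
print(pooled_rate("lemmings"), pooled_rate("rats"))    # 7/13 6/13
```

Exact fractions are used rather than floating-point numbers so that the comparisons match the table term by term.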
The reason that lemmings are gaining ground in the combined area of the two countries is
that more of the lemmings are living where the survival rate is higher. Note that the
survival rate is higher there precisely because that is where more of the lemmings are
living. Thus, if rats congregate together, the selfish efficiency of each rat will be bad not
only for the poor lemmings in the neighborhood but also for other rats. Even if only
slightly more of the rats are living in one region rather than another, if the benefits they
gain at their neighbors’ expense become too extreme then this will reduce the survival
rate of everyone in that neighborhood, rats included; this will precipitate a Simpson's
Reversal, and the number of rats will begin to go down globally when compared with
lemmings.
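One way to see why the distribution of the two populations matters is to write each overall rate as a weighted average of the regional rates, with weights equal to the share of the species living in each region. The sketch below uses the figures from the table; the function weighted_rate and the counterfactual at the end (lemmings distributed across the two countries as the rats are) are illustrative assumptions added here, not part of the original example.

```python
from fractions import Fraction

def weighted_rate(shares, rates):
    """Overall rate as a share-weighted average of regional rates."""
    assert sum(shares) == 1, "population shares must sum to 1"
    return sum(share * r for share, r in zip(shares, rates))

# Order of regions: Norway, Sweden. Figures taken from the table.
lemming_shares = [Fraction(5, 13), Fraction(8, 13)]  # most lemmings are in Sweden
lemming_rates = [Fraction(1, 5), Fraction(6, 8)]
rat_shares = [Fraction(8, 13), Fraction(5, 13)]      # most rats are in Norway
rat_rates = [Fraction(2, 8), Fraction(4, 5)]

lemming_overall = weighted_rate(lemming_shares, lemming_rates)
rat_overall = weighted_rate(rat_shares, rat_rates)
print(lemming_overall, rat_overall)  # 7/13 6/13

# Hypothetical counterfactual: keep the lemmings' regional rates, but
# distribute them across the two countries the way the rats are.
lemming_counterfactual = weighted_rate(rat_shares, lemming_rates)
print(lemming_counterfactual)                # 107/260
print(lemming_counterfactual < rat_overall)  # True: no reversal
```

On these numbers the lemmings' overall advantage depends entirely on the weights: with the rats' distribution, the lemmings' overall rate would fall below that of the rats, in line with their lower rate in each country.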
In both Darwinian evolutionary theory and much of economic theory, it is hard to see
how ‘altruism’ (or, for that matter, systematic inefficiency) could evolve, or be sustained
over the long term. That is, it is hard to see how a population could sustain heritable
patterns of behaviour that benefit the competitors of an individual business or organism
at the expense of the long-term chances of survival or reproductive success for those
individuals and others with the same dispositions. For this reason it is of considerable
theoretical significance to explore the applications of Simpson's Paradox, to see whether
this might help to explain not only the altruism but also the irrationality, inefficiency,
laziness and other vices that may prevail in populations, and that can cause a population
to fall short of the economic rationalist's or Darwinian's ideal of the ruthlessly efficient
pursuit by each individual of its own profits or long-term reproductive success.
Sam Butchart has devised two games that illustrate the dynamics of survival and
reproduction in settings where Simpson's Reversals occur. (See the link in the Other
Internet Resources section below.) One, Sharks and Suckers, is modeled on John
Conway's ‘Game of Life’. The other, Rats and Lemmings, is modeled on Axelrod's
tournaments of iterated rounds of ‘Prisoner's Dilemma’. In these games, it is a surprising
result that populations robustly sustain a proportion of Suckers or Lemmings in the long
term. Sharks and Rats never disappear completely, but nor do they ever take over
completely. Thus, Simpson's Paradox places a constraint on how selfish, how efficient
and how rational businesses or organisms can become. On balance, this is probably
cheerful news.
Bibliography
Axelrod, R., 1984, The Evolution of Cooperation, New York: Basic Books.
Bickel, P. J., Hammel, E. A., and O'Connell, J. W., 1975, “Sex Bias in Graduate
Admissions: Data From Berkeley”, Science, 187: 398–404.
Blyth, C. R., 1972, “On Simpson's Paradox and the Sure Thing Principle”, Journal
of the American Statistical Association, 67: 364–366.
Cartwright, N., 1979, “Causal laws and effective strategies”, Noûs, 13 (4):
419–437.
Cartwright, N., 2001, “What is wrong with Bayes Nets?”, The Monist, 84 (2):
242–265. Reprinted in Probability is the Very Guide of Life, H. E. Kyburg, Jr. and
M. Thalos (eds.), Chicago and La Salle, IL: Open Court, 2003, 253–275.
Cohen, M. R., and Nagel, E., 1934, An Introduction to Logic and Scientific
Method, New York: Harcourt, Brace and Co.
Dawid, A. P., 1979, “Conditional independence in statistical theory”, Journal of the
Royal Statistical Society (Series B), 41: 1–15.
Dupré, J. and Cartwright, N., 1988, “Probability and causality: Why Hume and
indeterminism don't mix”, Noûs, 22: 521–536.
Eells, E., 1987, “Cartwright and Otte on Simpson's Paradox”, Philosophy of
Science, 54: 233–243.
Glymour, C. and Meek, C., 1994, “Conditioning and Intervening”, British Journal
for the Philosophy of Science, 45: 1001–1021.
Hardcastle, V. G., 1991, “Partitions, probabilistic causal laws, and Simpson's
Paradox”, Synthese, 86: 209–228.
Hesslow, G., 1976, “Discussion: Two notes on the probabilistic approach to
causality”, Philosophy of Science, 43: 290–292.
Lindley, D. V., and Novick, M. R., 1981, “The role of exchangeability in inference”,
Annals of Statistics, 9: 45–58.
Malinas, G., 1997, “Simpson's Paradox and the wayward researcher”, Australasian
Journal of Philosophy, 75: 343–359.
Malinas, G., 2001, “Simpson's Paradox: A logically benign, empirically treacherous
hydra”, The Monist, 84 (2): 265–284. Reprinted in Probability Is the Very Guide of
Life, Henry E. Kyburg, Jr. and Mariam Thalos (eds.), Chicago and La Salle, IL:
Open Court, 2003, 165–182.
Mittal, Y., 1991, “Homogeneity of subpopulations and Simpson's Paradox”,
Journal of the American Statistical Association, 86: 167–172.
Otte, R., 1985, “Probabilistic causality and Simpson's Paradox”, Philosophy of
Science, 52: 110–125.
Pearl, J., 1988, Probabilistic Reasoning in Intelligent Systems, San Mateo, CA:
Morgan Kaufmann.
Pearl, J., 2000, Causality: Models, Reasoning, and Inference, New York,
Cambridge: Cambridge University Press.
Reichenbach, H., 1971, The Direction of Time, Berkeley: University of California
Press.
Savage, L. J., 1954, The Foundations of Statistics, New York: John Wiley and
Sons.
Simpson, E.H., 1951, “The interpretation of interaction in contingency tables”,
Journal of the Royal Statistical Society (Series B), 13: 238–241.
Skyrms, B., 1980, Causal Necessity, New Haven: Yale University Press.
Sober, E., 1993, The Nature of Selection, Chicago: University of Chicago Press.
Sober, E., 1993, Philosophy of Biology, Oxford: Oxford University Press.
Sober, E. and D. S. Wilson, 1998, Unto Others: The Evolution and Psychology of
Unselfish Behaviour, Cambridge, MA: Harvard University Press.
Spohn, W., 2001, “Bayesian nets are all there is to causality”, in Stochastic
Dependence and Causality, D. Constantini, M. C. Galavotti, and P. Suppes (eds.),
Stanford: CSLI Publications.
Sunder, S., 1983, “Simpson's reversal paradox and cost allocations”, Journal of
Accounting Research, 21: 222–233.
Suppes, P., 1970, A Probabilistic Theory of Causality, Amsterdam: North-Holland
Publishing Co.
Thalos, M., 2003, “The Reduction of Causation”, in H. Kyburg and M. Thalos
(eds.), Probability is the Very Guide of Life: The Philosophical Uses of Chance,
Chicago: Open Court.
Thornton, R. J., and Innes, J. T., 1985, “On Simpson's Paradox in economic
statistics”, Oxford Bulletin of Economics and Statistics, 47: 387–394.
van Fraassen, B. C., 1989, Laws and Symmetry, Oxford: Clarendon.
Yule, G. H., 1903, “Notes on the theory of association of attributes in Statistics”,
Biometrika, 2: 121–134.
Other Internet Resources
Butchart, Sam and John Bigelow, Sharks and Suckers and Rats and Lemmings, two
computer games designed to illustrate Simpson's Paradox.
Simpson's Paradox, by Alan Crowe
Simpson's Paradox — When Big Data Sets Go Bad, in Amazing Applications of
Probability and Statistics at www.intuitor.com.
Online paper by Nick Chater, Ivo Vlaev and Maurice Grinberg, “A new
consequence of Simpson's Paradox: Stable co-operation in one-shot Prisoner's
Dilemma from populations of individualistic learning agents,” University College
London/New Bulgarian University.
Related Entries
causation: probabilistic | game theory: evolutionary | physics: Reichenbach's common
cause principle | prisoner's dilemma
Acknowledgments
The authors would like to thank Paul Oppenheimer for spotting an incorrectly specified
statistic and probability in Section 1.3 and Section 2, respectively.
Copyright © 2009 by
Gary Malinas <g.malinas@uq.edu.au>
John Bigelow <john.bigelow@arts.monash.edu.au>