You are on page 1of 9

Evolution. 39(4), 1985, pp.

783-791

CONFIDENCE LIMITS ON PHYLOGENIES: AN APPROACH


USING THE BOOTSTRAP

JOSEPH FELSENSTEIN
Department of Genetics SK-50, University of Washington, Seattle, WA 98195

Abstract. - The recently-developed statistical method known as the "bootstrap" can be used
to place confidence intervals on phylogenies. It involves resampling points from one's own
data, with replacement, to create a series ofbootstrap samples ofthe same size as the original
data. Each of these is analyzed, and the variation among the resulting estimates taken to
indicate the size of the error involved in making estimates from the original data. In the
case of phylogenies, it is argued that the proper method of resampling is to keep all of the
original species while sampling characters with replacement, under the assumption that the
characters have been independently drawn by the systematist and have evolved indepen-
dently. Majority-rule consensus trees can be used to construct a phylogeny showing all of
the inferred monophyletic groups that occurred in a majority of the bootstrap samples. If
a group shows up 95% of the time or more, the evidence for it is taken to be statistically
significant. Existing computer programs can be used to analyze different bootstrap samples
by using weights on the characters, the weight of a character being how many times it was
drawn in bootstrap sampling. When all characters are perfectly compatible, as envisioned
by Hennig, bootstrap sampling becomes unnecessary; the bootstrap method would show
significant evidence for a group if it is defined by three or more characters.

Received July 12, 1984. Accepted April 12, 1985

It is rare that any attempt is made to a confidence interval by finding all trees
put a confidence interval on an estimate that cannot be rejected in comparison
of a phylogeny. Most methods for infer- with the best supported tree. I have re-
ring phylogenies yield one or a few trees, cently extended Cavender's analysis to
and their users rarely go beyond exam- the case of a molecular clock with three
ining the variation among trees that are species, obtaining, in that case, confi-
tied with the best tree under whatever dence limits that were somewhat smaller
criterion is being employed. There is no than Cavender's (Felsenstein, 1985). I
reason to believe that this practice con- have also recently reviewed the appli-
stitutes an adequate exploration of the cation of statistics to inferring phyloge-
size of the confidence limits on the esti- nies (Felsenstein, 1983a); that paper may
mate. be consulted for earlier references on sta-
A few authors have explored the ques- tistical estimation of phylogenies.
tion of confidence limits on phylogenies. An important recent statistical method
The pioneer in doing so is Cavender is the bootstrap (Efron, 1979), a relative
(1978, 1981) who examined the confi- of the jackknife. Like the jackknife, it is
dence limits for a four-species case, in a method of resampling one's own data
terms of how many steps worse a tree to infer the variability of the estimate.
must be than the most parsimonious tree This paper will explore the use of the
to be significantly worse. His results were bootstrap in inferring phylogenies, where
a bit disconcerting: when inferences were it leads to a practical method for placing
based on twenty characters, a tree would confidence intervals on the estimates.
have to be 9 steps worse to be signifi-
cantly worse. This implies that the con- The Bootstrap
fidence intervals would be quite large. A straightforward statistical exposition
Templeton (1983) has constructed a of the bootstrap is given by Efron and
test of whether one tree is significantly Gong (1983), and a readable elementary
better supported than another. In prin- treatment is that by Diaconis and Efron
ciple such a test could be used to delimit (1983). The basic idea of the bootstrap
783
784 JOSEPH FELSENSTEIN

involves inferring the variability in an can be inferred by computing the vari-


unknown distribution from which your ance of this collection of t* values, and
data were drawn by resampling from the the confidence limits on the parameter
data. Suppose that you had data points can be approximated by using the appro-
Xl> X 2 , ••• , x.; which you are willing to priate upper and/or lower percentiles of
assume were drawn independently from the observed distribution ofthe t* values.
the same distribution. From these, ap- The justification for this resampling is
plying some method T of statistical es- that, if the original sample size n is large,
timation, we obtain an estimate each possible value of X will be repre-
sented in the same proportion as in the
t = T(xl> X2' ••• , x n) (1)
underlying distribution, and resampling
ofa parameter we are interested in. Ifwe from the data points with replacement
knew the exact distribution from which will be the same as sampling from the
the Xi were drawn, and if the function T underlying distribution. For smaller
were sufficiently tractable algebraically, sample sizes, the process is an approxi-
we could obtain a formula for the stan- mation but frequently is a very good one.
dard error ofthe estimate t, and also con- The monograph by Efron (1982) can be
struct confidence intervals for t. consulted for further details on the prop-
The bootstrap procedure is most useful erties of the bootstrap.
when we either do not know the distri-
bution of the X;, or when T is so com- Bootstrapping Phylogenies
plicated that its standard error is difficult How can the bootstrap be applied to
to compute. It suggests that we resample phylogenies? Instead ofsample points Xl>
our data to construct a series of fictional X 2 , ••• , x; we usually have a table of
sets of data. Each of these is constructed species x characters (or species x sites
by sampling n points from the Xi' sam- for molecular sequences). It is not im-
pling with replacement. Each such fic- mediately obvious how resampling can
tional data set consists of n points, x 1*, be done in the data table. I will argue that
... , x;* where each point x i* is drawn at a justifiable procedure is to bootstrap
random from among the n original data across the characters, that is, to sample
points. It is quite likely that, in this re- characters (or sites) from the data table
sampling process, some of the original with replacement. Thus, each bootstrap
data points are represented more than sample consists of a new data table with
once, and others are omitted. the same set of species, but with some of
For each fictional set of data, we com- the original characters duplicated and
pute the estimate others dropped by the process of sam-
pling n characters from the original set
t* = T(x 1*, x 2*, ... , x;"). (2)
with replacement.
The resampling process is done many The justification for this is that we can
times (say r times), each time producing view each character as having evolved
a fictional sample of n points by sampling independently from the others according
with replacement from the original n data to a stochastic process that has among its
points. For each the estimate t* is com- parameters the topology and branch
puted. We are then in possession of a lengths ofthe underlying phylogeny. Each
collection ofr estimates ofthe parameter. character is then a random sample from
The essential idea ofthe bootstrap is that a distribution of all possible configura-
this set of estimates has a distribution tions of characters. For example, if we
that approximates the distribution of the are considering nucleic acid sequence data
actual estimate t. A bias-corrected esti- with p species, there are 4P possible out-
mate of the parameter can be computed comes at each site, not counting the pos-
by averaging the r different t* values (Ef- sibilities of deletion and insertion. To a
ron and Gong, 1983). The variance of t first approximation we can consider each
BOOTSTRAPPING PHYLOGENIES 785

site to be independently drawn from a characters. I have recently discussed (Fel-


distribution with 4P possibilities, whose senstein, 1983a) some of the statistical
probabilities depend on the phylogeny we issues involved in such a random-sam-
are trying to estimate. pling model of inference of phylogenies.
Given this independence ofevolution- A more serious difficulty is lack of in-
ary processes in different characters, the dependence of the evolutionary process-
configurations in the characters are seen es in different characters. Ifthe characters
to be drawn independently and identi- are correlated (as measurement charac-
cally distributed (i.i.d.), a necessary con- ters often are), then, in effect, we have
dition for the bootstrap method to be val- fewer characters in the study than we be-
id. In fact, in the case ofdiscrete character lieve. If correlations mean that what ap-
states (such as nucleic acids), the under- pear to be 50 independent characters are
lying distribution is multinomial, since really more like 30, the variability that
there are 4P possibilities each of which we infer for our estimate will be too small,
has some probability of occurring. De- producing overconfidence in the result; a
spite the complexity ofthe structure being bootstrap involving sampling 30 char-
inferred (the phylogeny) the statistical acters at random from among the 50
model is a very straightforward one-in- would have been more appropriate,
dependent samples from a multinomial though there is no way to know this in
distribution. advance. For the purposes of this paper,
It might be argued that this presup- we will ignore these correlations and as-
poses that the same probabilistic evolu- sume that they cause no problems; in
tionary process is operating in all of the practice, they pose the most serious chal-
characters, which is extremely unrealis- lenge to the use of bootstrap methods.
tic. Such an assumption is not necessary. A similar problem can arise when mul-
If instead we had a variety of different tistate characters have been recoded into
kinds ofcharacters evolving according to binary "factors" that are then treated as
different processes, we need only imagine if they were independent two-state char-
that there is an additional stage in the acters. These cannot be completely in-
process of random sampling, one occur- dependent, as they would have to be if
ring in the mind of the systematist. We the bootstrap sampled them indepen-
imagine, as part ofthe stochastic process, dently. Walter Fitch (pers. comm.) has
a step in which the systematist randomly suggested that this problem can be avoid-
draws each character from a pool of dif- ed by retaining a record of which binary
ferent kinds ofcharacters, each kind hav- factors are associated with which of the
ing a different evolutionary process that original characters, and having the boot-
applies to it. Once drawn, each character strap sample the original characters and
then has its actual configuration deter- keep all of the binary factors of a char-
mined by the appropriate stochastic evo- acter together. Thus, if there were nine
lutionary process. The resulting distri- characters that had been expanded to 20
bution of character configurations is a binary factors, we would construct the
mixture of multinomial distributions, bootstrap sample by drawing nine times
and, as such, is still a multinomial dis- from the nine characters, and whenever
tribution and is still i.i.d. a character was drawn we would take care
In practice the systematist may not to put all of its binary factors into the
have sampled the characters at random. bootstrap sample.
Systematists frequently include charac-
ters in the study in groups (such as groups Confidence Limits on Phylogenies
of measurements on the skull). We are An interesting problem arises when we
then not justified in regarding the process begin to consider how to construct con-
of choice of characters as a series of ran- fidence limits on the phylogenies. Each
dom samples from a pool of possible bootstrap sample is a data set that must
786 JOSEPH FELSENSTEIN

be analyzed to obtain an estimate of the they are both present in the true tree. But
phylogeny. We then have r phylogenies. at least they cannot be contradictory: each
Each of these is a complicated multi- being present on at least 95% of the
variate entity that has a tree topology and bootstrap estimated trees, they must co-
may also have branch lengths. Defining occur on at least one ofthe trees and must
a confidence interval and summarizing it thus be either nested or disjoint.
in a useable form is far from a simple The same argument has been used by
matter. Margush and McMorris (1981) to define
In bootstrapping, confidence limits on "majority rule" consensus trees. These
a statistic are frequently constructed by are trees composed of all those subsets
the percentile method, which involves that appear in a majority of a collection
simply taking (for a 95% confidence in- oftrees. By the argument just given, these
terval) the empirical upper and lower subsets must define a tree, since no two
2.5% points of the distribution of boot- of them can conflict. If we take the set of
strap estimates of the statistic. Consider phylogenies that result from analyzing a
testing whether the probability of heads series of bootstrap samples and make a
of a tossed coin exceeds 0.50. If we did majority-rule consensus tree, recording
not know about the binomial distribu- on it how often each subset appears, we
tion and decided instead to use the boot- will obtain a tree that can be used to de-
strap, a one-sided confidence interval on fine at a glance confidence sets for any
the probability of heads could be con- rejection probability below 50%. The
structed by finding the empirical lower majority-rule consensus tree itself can be
5% point ofthe distribution of bootstrap considered to be an overall bootstrap es-
estimates. The set ofvalues less than 0.50 timate of the phylogeny.
would therefore be rejected if values of In cases where we are using a statisti-
the estimated probability of heads that cally well-founded method, such as max-
small or smaller occurred less than 5% imum likelihood estimation, we would
of the time among the bootstrap esti- hope that the bootstrap method and the
mates. curvature ofthe likelihood surface would
The approach used here starts with the give similar indications of which parts of
assumption that the systematist is pri- the phylogeny were well estimated and
marily interested in whether some par- which not. Where the method ofinferring
ticular group is monophyletic. A rooted phylogenies is one with undesirable sta-
tree is a series of statements asserting tistical properties such as inconsistency,
monophyly of a series of nested or dis- the bootstrap does not correct for these.
joint sets of species. Suppose that we are For example, clustering by overall sim-
interested in a subset S of species and ilarity makes an inconsistent estimate of
wish to know whether there is significant the phylogeny if rates ofevolution in dif-
support in the data for the assertion that ferent lineages differ by more than a cer-
this group is monophyletic. We can reject tain amount. Parsimony methods are
the alternatives to the subset S if they subject to the same problem, but require
occur in less than 5% of the bootstrap greater inequalities of evolutionary rate
estimates. to be inconsistent. For an elementary dis-
We thus wish to search for all subsets cussion of these phenomena, see my re-
S of species that occur on 95% or more cent review article (Felsenstein, 1983b).
of the bootstrap estimates. Each of these Bootstrapping provides us with a confi-
subsets may be considered to be sup- dence interval within which is contained
ported (in the sense that its alternatives not the true phylogeny, but the phylogeny
are rejected), although those confidence that would be estimated on repeated
statements are not joint confidence state- sampling of many characters from the
ments: if two subsets are each supported underlying pool of characters. As such it
at the 95% level, we might have as little may be misleading if the method used to
as 90% confidence in the statement that infer phylogenies is inconsistent.
BOOTSTRAPPING PHYLOGENIES 787

TABLE l. Fossil horse data of Cam in and Sokal (1965). The states of each character are in a linear series.
-1, 0, 1, 2, ... , with the ancestral state being O. The data are also shown in binary recoded form in
which the nine multistate characters have been recoded into 20 binary factors. The first line ofthat table
indicates the correspondence between the original and recoded characters. Bootstrap sampling ofcharacters
should be done before any recoding into binary factors.
Binary Factors
Name Characters 11112 22333 44566 77889

Mesohippus 0 0 0 0 0 0 0 0 0 00000 00000 00000 00000


Hypohippus -1 3 3 0 0 0 0 0 1 00011 11111 00000 00001
Archaeohippus 1 0 0 0 0 0 0 1 1 10000 00000 00000 00101
Parahippus 1 1 1 1 0 0 0 0 1 10001 00100 10000 00001
Merychippus 2 2 2 2 1 1 1 2 1 11001 10110 11110 10111
Merych. secundus 2 2 2 2 1 -1 -1 2 1 11001 10110 11101 01111
Nannippus 2 2 1 2 1 1 1 2 1 11001 10100 11110 10111
Neohipparion 2 3 3 2 1 1 1 2 1 11001 11111 11110 10111
Calippus 2 2 1 2 1 -1 -1 2 1 11001 10100 11101 01111
Pliohippus 3 3 3 2 1 -1 -1 2 1 11101 11111 11101 01111

One difficulty in the interpretation of one would have to engage in an extrap-


the result is that we may not have decided olation to make their variance larger. The
which subset of species interests us until difficulty in envisaging a procedure like
after the bootstrap result is examined. this is that the space of possible phylog-
This raises the "multiple tests" problem: enies does not lend itself readily to ex-
if we have 20 statistical tests, on average trapolation: once a branch length has
one should show significance at the 95% shrunk to zero it is not immediately ob-
level purely at random. There are ways vious what to do next. Unlike normal
of making simple corrections ifthenum- means, phylogenies do not live in a flat
ber ofindependent tests is known, but in Euclidean space. One way to make the
this case the different tests (the different jackknife vary as much as the bootstrap
subsets that show up on the majority-rule would be to drop not one observation,
consensus tree) are probably correlated, but half the observations chosen at ran-
so that it is not easy to see how to com- dom. This possibility is worth exploring,
pute the number of independent tests so but for the moment it is not obvious what
as to correct for it. I have simply taken advantage there would be to using the
the 95% level as correct, as if we had jackknife rather than the bootstrap.
chosen the test of interest a priori.
One might wonder whether the jack- Using Existing Computer Programs
knife would be a viable alternative to the The process of generating many boot-
bootstrap. If we make a set of estimates strap samples from a data set is a tedious
by dropping one character at a time and one. One might think that it would re-
then estimating the phylogeny, the re- quire special programs to rewrite the data
sulting phylogenies will vary far less than matrix, leaving out some characters and
the bootstrap estimates do. In the simple duplicating others. Fortunately, much of
test case of sample averages estimating that work can be avoided by making use
the mean of a normal distribution, it of differential character weights, which
turns out that the jackknife estimates are allowed in most computer programs
of the mean will have a variance only for inferring phylogenies, particularly
n 2/(n - 1)3 times as large as that of the programs using parsimony methods.
corresponding bootstrap estimates (Ef- These programs usually allow integer
ron and Gong, 1983). To make the vari- weights for the characters, weights that
ance among the jackknife estimates as can be 0, 1, 2, .... A weight of zero
large as that among bootstrap estimates, means that the character is in effect
788 JOSEPH FELSENSTEIN

dropped from the analysis. A weight of gram package PHYLIP, available free
w means that the character is counted as from me (see the Appendix below).
if present w times, so that each change of
state in the character is counted as if it An Example
were w changes of state. Table 1 shows the fossil horse data giv-
This automatically accomplishes the en by Camin and Sokal (1965 pp. 321-
duplication and deletion of characters 322) as a computational example. The
without the necessity of recopying the full list of species and references for the
data matrix. Different bootstrap samples original data are given by Camin and 80-
could be fed into the programs by doing kal (1965). The data set has ten species
computer runs with different weights. The and nine multistate characters. Mesohip-
weights are generated by starting with pus has been taken as the outgroup, as it
weights ofzero for all characters. We then was in Camin and Sokal's paper.
sample n characters at random with re- Figure 1 shows the results of running
placement (using a table ofrandom num- a branch-and-bound program that finds
bers, for example). Each time a character all most parsimonious trees according to
is drawn, its weight is increased by one, the Wagner parsimony criterion. There
so that in the end its weight counts the are ten most parsimonious trees. The left
number of times it was sampled. Here tree in Figure 1 shows nine of them: the
are five of the weight vectors that were two empty circles with three descendants
generated when bootstrap sampling was represent not trifurcations, but points at
done on 20 characters: which the tree can be resolved into any
of three bifurcating topologies. All nine
21100212120121012010
01001510031020011211 possible combinations of these are in the
list of most parsimonious trees. The tenth
01031211201012100121 tree is the one shown at the right ofFigure
12130421030000100101
1. All of these trees require 29 changes
20010100211211121211
of character state.
To do bootstrap sampling, one would Ifwe were to take the variation among
generate a vector ofweights, run the phy- the most parsimonious trees as providing
logeny estimation program with those an adequate indication ofthe uncertainty
weights, generate another vector of in our estimate of the phylogeny, we
weights, run the program with those, and would conclude that four monophyletic
so on. The process is fairly tedious, al- groups were defined, as these groups show
though with microcomputers it need not up in all ten of the most parsimonious
be expensive. In the fossil horse example trees. They are:
below, I have used 50 bootstrap samples.
(Plioh ipp us, Merychippus secundus,
This might strike a statistician as too few,
Calippus)
but a systematist as too many. The more
(Nannippus, Neohipparion, Merychip-
samples are taken, the more accurate an pus)
idea we will have of which groups are
(Plioh ipp us, Merychippus secundus,
likely to be monophyletic. Even with a
Calippus, Na nnippus, Neohippa-
small number of bootstrap samples we
rion, Merychippus)
will quickly get a feel for which parts of (all but Mesohippus and Archaeohip-
our estimate of the phylogeny are well pus)
supported and which not.
A computer program that carries out When we carry out bootstrap sampling
bootstrap sampling and computes the ofcolumns from the left-hand part ofTa-
majority rule consensus tree is available ble 1 and analyze 50 bootstrap replicates,
for the case of discrete characters ana- we get the results shown in Figure 2. Next
lyzed by the parsimony and compatibil- to each branch of the tree is shown the
ity methods. It is contained in the pro- number of times that the bootstrap es-
BOOTSTRAPPING PHYLOGENIES 789

FIG. 1. All most parsimonious trees for the fossil horse data in Table 1 when phylogenies are evaluated
by the Wagner parsimony criterion. There are ten most parsimonious trees in all. Nine of these can be
generated by resolving each of the trifurcations in the left tree into all three possible bifurcations. The
tenth is shown in the right tree. All abbreviations are first three letters of names in Table 1, except MSE =
Merychippus secundus.

timate contained the corresponding formation or other method ofrooting the


monophyletic group. The tree shown is tree, we could still carry out bootstrap
the majority-rule consensus tree. The sampling and construct an unrooted ma-
consensus tree turns out in this case not jority-rule consensus tree. To do that, we
to be one ofthe most parsimonious trees. would only need to note that each branch
All four of the monophyletic groups list- of one of the replicate bootstrap esti-
ed above occur on it, but only one of mates divides the species into two groups,
these (the six-species group consisting of at least one of which would be mono-
Pliohippus, Merychippus secundus, Ca- phyletic if we could root the tree. The
lippus, Nannippus, Neohipparion, and unrooted majority-rule consensus tree is
Merychippus) comes close to occurring defined by finding those partitions that
95% of the time in the bootstrap sam- occur in a majority of the replicate trees.
pling (it occurs 47/50 or 94% ofthe time). A simple way of doing this is to choose
The others only occur about two-thirds an arbitrary species as an outgroup, make
of the time. It is apparent that taking the a majority-rule consensus tree of the re-
set of most parsimonious trees as defin- sulting rooted trees, and then present the
ing the confidence interval would result result as an unrooted tree without indi-
in far too narrow an interval. cating which species was the outgroup.
The example given here has had the
tree rooted by use of an outgroup. If we Perfectly Hennigian Data
were using a method that produced an Occasionally, though rather rarely, a
unrooted tree and had no outgroup in- data set will arise that has no internal
790 JOSEPH FELSENSTEIN

characters is less than 0.05. This is easily


computed, given nand c.
The probability of leaving out all c
characters in drawing n characters with-
out replacement is (1 - c/ny'. The value
of c that is necessary to make this less
than 0.05 is the same for all relevant val-
ues of n: it is c = 3. We can thus conclude
that, if the data are perfectly Hennigian,
three characters are enough for the boot-
strap to indicate significant support for a
monophyletic group at the 95% level. Any
group supported by fewer characters will
not be in the bootstrap confidence inter-
val. Of course, we are assuming that the
evolutionary processes and the inclusion
of the characters by the systematist are
independent across characters.
Although three characters are enough
to guarantee inclusion of a group, if the
data are perfectly Hennigian, one will
never encounter any character that con-
tradicts the group. Sometimes we have
FIG. 2. Bootstrap estimate ofthe phylogeny for great confidence that our characters are
the data ofTable 1 when phylogenies are evaluated "clean" ones, that reversals and paral-
by the Wagner parsimony criterion. Fifty bootstrap
samples were analyzed. The groups shown in this
lelisms would be so rarely seen that we
tree are those that occurred in a majority of the can have confidence in a group even if it
resulting trees, plus the most frequently occurring is supported by only one character. The
groups that were compatible with these. Next to present "rule of three" would then seem
each branch is shown the number of times that the to be a conservative one.
monophyletic group it defines occurred. Further ex-
planation is given in the text. It may be doubted that the rule is really
always conservative. I have recently
studied, by exact enumeration methods,
the problem of placing confidence limits
on phylogenies using parsimony meth-
conflict, all characters being perfectly ods when there are only three species and
compatible. This sort of data, which is an evoutionary process for which an evo-
that envisioned by Hennig when he sug- lutionary clock may be assumed (Felsen-
gested using derived character states to stein, 1985). It turns out that in the worst
define monophyletic groups, allows us to case, when the characters are equally
avoid entirely the bootstrap sampling likely to resolve a trifurcation in any of
process. The argument is quite simple. the three possible ways, if we have three
Suppose that we want to construct a 95% characters all of which support the same
confidence interval by bootstrap sam- resolution, this is not statistically signif-
pling. Suppose that c characters out of n icant at the 95% level. Four characters
define the same monophyletic group. would be. (I am indebted to Alan Tem-
That group will show up in the bootstrap pleton for pointing out the connection
estimate if anyone of those c characters between the two calculations.)
is drawn in the sampling of n characters. In many cases, strong conclusions have
The monophyletic group will be part of been drawn from the existence of groups
the 95% confidence interval if and only defined by as little as one character. The
if the probability of omitting all c of the great advantage of the present approach
BOOTSTRAPPING PHYLOGENIES 791
is that it provides a practical method, - - . 1982. The jackknife, the bootstrap, and
albeit a flawed one, for assessing the un- other resampling plans. CBMS-NSF Regional
Conference Series in Applied Mathematics No.
certainty inherent in such conclusions. I 38. Society for Industrial and Applied Mathe-
suspect that the levels of uncertainty matics. Philadelphia, PA.
found in practice will be so great as to EFRON, B., AND G. GONG. 1983. A leisurely look
give pause to all but the firmest expo- at the bootstrap, the jackknife, and cross-vali-
dation. Amer. Statist. 37:36-48.
nents of nonstatistical hypothetico-de- FELSENSTEIN, J. 1983a. Statistical inference of
ductive approaches to inferring phylog- phylogenies. J. Roy. Statist. Soc. A 146:246-
enies. 272.
--. 1983b. Parsimony in systematics: Bio-
ACKNOWLEDGMENTS logical and statistical issues. Ann. Rev. Ecol.
Syst. 14:313-333.
I am grateful to Kent Fiala of the De- - - . 1985. Confidence limits on phylogenies
partment ofEcology and Evolution, State with a molecular clock. Systematic Zoology 34:
University of New York at Stony Brook, 152-161.
for providing me with the fossil horse MARGUSH, T., ANDF. R. McMORRIS. 1981. Con-
sensus notrees. Bull. Mathemat. BioI. 43:239-
data ofCamin and Sokal in recoded form. 244.
I wish to thank Walter Fitch, Monty Slat- TEMPLETON, A. R. 1983. Phylogenetic inference
kin, Alan Templeton, Bill Engels, Ruth from restriction endonuclease cleavage site maps
Shaw, and an anonymous statistical re- with particular reference to the evolution ofhu-
viewer for suggestions for improvement mans and the apes. Evolution 37:221-224.
of the manuscript. This work was sup- Corresponding Editor: W. R. Engels
ported by task agreement number DE-
AT06-76EV71005 of contract number
DE-AM06-76RL02225 between the U.S. ApPENDIX
Department of Energy and the Univer- Availability of the PHYLIP
sity of Washington. Program Package
PHYLIP, the Phylogeny Inference Package, is a
LITERATURE CITED free package of computer programs, written in Pas-
CAMIN, J. H., AND R. R. SOKAL. 1965. A method cal, for inferring phylogenies. It includes parsimony
for deducing branching sequences in phylogeny. methods, compatibility methods, distance matrix
Evolution 19:311-326. methods, and maximum likelihood methods. The
CAVENDER, J. A. 1978. Taxonomy with confi- Pascal source code is provided (compiled object
dence. Math. Biosci. 40:271-280 (Erratum: Vol. code is not). PHYLIP will be written in a standard
44, p. 308, 1979). format on a magnetic tape provided by the recip-
- - . 1981. Tests of phylogenetic hypotheses ient. It will also be provided on 51f4-inch diskettes
under generalized models. Math. Biosci. 54:217- if 6 double density diskettes are sent. A variety of
229. soft-sectored MSDOS, CP/M-80, and CP/M-86
DIACONIS, P., AND B. EFRON. 1983. Computer- formats can be written; double-sided, hard-sec-
intensive methods in statistics. Sci. Amer. 249: tored, or 3.5-inch formats cannot, nor can any Ap-
116-130. ple formats. For information on formats supported
EFRON, B. 1979. Bootstrap methods: Anotherlook and restrictions on countries to which distribution
at the jackknife. Ann. Statist. 7:1-26. and support are available, please write the author.

You might also like