1055-7903/00 $35.00
Copyright © 2000 by Academic Press
All rights of reproduction in any form reserved.
402 DOLPHIN ET AL.
The principle of the test is that if characters are more congruent within than between partitions, then homoplasy, and therefore tree lengths, will be minimized when the characters are maintained in their original partitions and the ILD measure will be large.

Applications of the ILD Test

Many studies have used the ILD test to infer the probability of evolutionary events using incongruence between data sets. Recent examples include tests for hybridization (Doyle et al., 1999), horizontal gene transfer (Cho et al., 1998; Lecointre et al., 1998), recombination (Geiser et al., 1998), large-scale convergence (Muona, 1995; Quicke and Belshaw, 1999), and hybrid origins of parthenogenesis (Normark and Lanteri, 1998). For these purposes, the generation of significant results in the test by noise would be positively misleading.

Another common application of the test is during the “conditional combination” or “prior agreement” approach to phylogeny reconstruction (Bull et al., 1993), in which the analysis of two data sets is carried out separately if they are found to be significantly incongruent (by some measure) and simultaneously if they are not (see, for example, Johnson and Sorensen, 1998; Vidal and Lecointre, 1998; Carbone et al., 1999; Hoot et al., 1999; Spangler and Olmstead, 1999). Leaving aside the debate over separate versus combined analysis (e.g., Chippindale and Wiens, 1994; Nixon and Carpenter, 1996), there is currently an unresolved issue concerning the precise effect that combining noisy and less noisy data sets has on phylogeny reconstruction (Wenzel and Siddall, 1999; Källersjö et al., 1999). This means that a test which is used to decide whether or not to combine data sets should not confound both noise and incongruent signal into a single result.

The Influence of Noise on the ILD Test

There is no consensus in the literature on how noise will affect the ILD test. The test was published with the stated expectation that noise would not affect the result (Farris et al., 1994), and this expectation has entered standard texts on phylogenetic methodology (Kitching et al., 1998). Nonetheless, authors differ in their opinion as to the relationship between noise or homoplasy and incongruence. For example, Swofford (1991), quoted in Farris et al. (1994), states that the ILD test is not influenced by different levels of homoplasy within the matrices but that it should be; Vidal and Lecointre (1998) agree that it is not influenced by noise, but they follow Farris et al. (1994) in believing that it should not be; Graham et al. (1998) come to the conclusion that it is in fact influenced by noise and offer advice on how to solve this problem (by excluding poorly supported branches). Other authors make the observation during their work that some significant results in the test may be due to a noisy or small data set, or one which lacks the ability to fully resolve trees, without addressing the issue further (for example Cannatella et al., 1998; Piercey-Normore et al., 1998). This is clearly an unsatisfactory position.

We consider that the problem needs urgent attention, given the inherently noisy nature of molecular data, combined with the trend toward multiple gene analyses and the availability of incongruence tests in standard phylogeny reconstruction software. We believe that our study is the first to formally investigate the relationship between noise and incongruence in the ILD test, although several previous studies have touched upon this. Some studies (e.g., Graham et al., 1998) have shown that incongruence can exist between real and shuffled partitions; others (Cunningham, 1997; Messenger and McGuire, 1998; Stanger-Hall and Cunningham, 1998) show that the reduction or exclusion of noise from matrices can reduce levels of incongruence. Quicke and Belshaw (1999) showed that the addition of noise to a small, simulated matrix reduced incongruence with a simulated matrix of incongruent characters. We therefore investigated, using simulations, how and why noise can affect the ILD test.

MATERIALS AND METHODS

We created two data matrices, called Comb and Bush, each of 16 taxa with 13 characters forming a perfectly pectinate or symmetrical cladogram, respectively. No characters were homoplastic and each node was supported by one character (Fig. 1). All analyses were conducted separately for each matrix to reveal any effects of topology on the noise–incongruence relationship.

Effect of Noise on ILD Test Results

We assessed the incongruence between each structured matrix and a range of data partitions containing varied amounts of noise. ILD tests were conducted between each matrix and a duplicate of that matrix which had been modified by shuffling character states, within an increasing number (from 1 to 13) of randomly selected characters, using a macro in Microsoft Excel. Examples of shuffled matrices are presented in Fig. 1. The analyses were carried out on 30 shuffled replicate partitions for each level of noise. Options on PAUP* (version 64) were set to 100 replicates of heuristic search using simple addition, MULPARS, and an unlimited maxtree setting.

Exploring the Response of the ILD Test to Noise

To interpret the results of the previous section, we proceeded to investigate the relationship between the amount of noise in a matrix and the length of the most-parsimonious tree (MPT) which it would produce. Using replicate matrices with varying amounts of noise generated as above, we searched for MPTs using
NOISE AND INCONGRUENCE 403
FIG. 1. The data matrices used in this study, their respective cladograms, and examples showing replicates with four characters shuffled
(boldface).
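The shuffling step described under "Effect of Noise on ILD Test Results" (character states permuted among taxa within a chosen number of randomly selected characters) can be sketched as follows. This is only an illustration: the study used a Microsoft Excel macro, and the function names `comb_matrix`, `shuffle_characters`, and `noise_series` are hypothetical. The pectinate matrix built here is one possible homoplasy-free coding for 16 taxa and 13 characters, and is not claimed to match the Comb matrix of Fig. 1.

```python
import random


def comb_matrix(n_taxa=16):
    """Binary, homoplasy-free matrix for a pectinate tree (assumed coding).

    Rows are taxa, columns are characters. Character c has state 1 in
    taxa c+2 .. n_taxa-1, so the 1-sets are nested and each of the
    n_taxa - 3 internal branches is supported by exactly one character.
    """
    n_chars = n_taxa - 3
    return [[1 if taxon >= char + 2 else 0 for char in range(n_chars)]
            for taxon in range(n_taxa)]


def shuffle_characters(matrix, n_shuffled, rng):
    """Return (copy, chosen): states permuted among taxa in n_shuffled columns.

    Shuffling within a column keeps the frequency of each state in that
    character but breaks its association with the taxa.
    """
    n_taxa, n_chars = len(matrix), len(matrix[0])
    chosen = sorted(rng.sample(range(n_chars), n_shuffled))
    shuffled = [row[:] for row in matrix]
    for c in chosen:
        column = [row[c] for row in matrix]
        rng.shuffle(column)  # in-place permutation of this character's states
        for t in range(n_taxa):
            shuffled[t][c] = column[t]
    return shuffled, chosen


def noise_series(matrix, n_replicates=30, seed=0):
    """Map each noise level k (1 .. n_chars) to n_replicates shuffled copies."""
    rng = random.Random(seed)
    n_chars = len(matrix[0])
    return {k: [shuffle_characters(matrix, k, rng)[0]
                for _ in range(n_replicates)]
            for k in range(1, n_chars + 1)}
```

Each replicate produced this way would then be paired with the unshuffled matrix and passed to an ILD test in a phylogenetics package; that step is not shown here.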
RESULTS
DISCUSSION
We propose that an alternative null model for the ILD test, which makes allowance for the major effect of noise demonstrated in our study, would be useful in some cases. To detect incongruence that cannot be attributed to noise alone we recommend the following procedure: (1) perform an ILD test on the two original data partitions (A and B); (2) shuffle the character states within the characters of one matrix (A) repeatedly to generate a series of pure noise partitions; (3) carry out ILD tests between these shuffled replicates and the original matrix B; (4) repeat the procedure comparing shuffled replicates of B against unshuffled A; (5) only if the original level of conflict in step 1 is significantly higher than the levels observed in both steps 3 and 4 can we say that the partitions are significantly incongruent.

Shuffling character states within characters retains features of a matrix such as frequency and between-character distribution of 0’s and 1’s while replacing signal with noise. Any difference in the ILD test results obtained before and after shuffling one matrix in this way indicates features of agreement or incongruence that cannot be attributed to noise in the initial partitions (although other effects, such as base composition bias, cannot be discounted).

Two examples of this approach are the following. (1) We further analyzed the mitochondrial data examined in Cunningham (1997) in which partitions represent 1st, 2nd, and 3rd codon positions with the 3rd codon positions saturated by multiple substitutions (a good example of noise). The initial ILD test using branch-and-bound option finds the 2nd and 3rd codon partitions to be highly incongruent (P < 0.001), a striking result, given the putatively identical phylogenetic history of the two partitions. Shuffling the 3rd codon character states tends to reduce the level of incongruence, with the mean difference in tree lengths between original and replicate partitions being 3.18 S.D. (n = 30) compared to 6.93 S.D. when the unshuffled data are analyzed. However, the value of 6.93 S.D. lies within the 95% confidence limit calculated for the 30 shuffled partitions. The importance of distinguishing between noise and signal in the case of codon position characters has been mentioned above, and this example highlights the need for an unambiguous incongruence measure. (2) In their extensive study of leptodactylid frogs, Cannatella et al. (1998) examine phylogenetic matrices whose characters represent a range of morphological, behavioral, and molecular sources. Multiple pairwise ILD tests between the partitions revealed that the only instance of significant incongruence was when a matrix comprising advertisement call characters (which they named CALLS) was compared with a morphological matrix.

We created a series of shuffled replicates of the CALLS matrix and performed ILD tests comparing these replicates with the original morphological matrix (branch-and-bound search, 10,000 replicates). In 10 of 15 instances, the significance of the shuffled replicate ILD test was greater than that of the original partition. Thus, we agree with the authors’ hypothesis that the incongruence is due to the small and noisy nature of the CALLS matrix.

ACKNOWLEDGMENTS

David Swofford kindly allowed us to publish results obtained from a test version of his program PAUP*. K.D. was supported by a BBSRC Grant. R.B. and C.D.L.O. were supported by NERC Grants.

REFERENCES

Bull, J. J., Huelsenbeck, J. P., Cunningham, C. W., Swofford, D. L., and Waddell, P. J. (1993). Partitioning and combining data in phylogenetic analysis. Syst. Biol. 42: 384–397.
Cannatella, D. C., Hillis, D. M., Chippindale, P. T., Weigt, L., Rand, A. S., and Ryan, M. J. (1998). Phylogeny of frogs of the Physalaemus pustulosus species group, with an examination of data incongruence. Syst. Biol. 47: 311–335.
Carbone, I., Anderson, J. B., and Kohn, L. M. (1999). Patterns of descent in clonal lineages and their multilocus fingerprints are resolved with combined gene genealogies. Evolution 16: 354–362.
Chippindale, P. T., and Wiens, J. J. (1994). Weighting, partitioning, and combining characters in phylogenetic analysis. Syst. Biol. 43: 278–287.
Cho, Y., Qiu, Y. L., Kuhlman, P., and Palmer, J. D. (1998). Explosive invasion of plant mitochondria by a group I intron. Proc. Natl. Acad. Sci. USA 95: 14244–14249.
Cunningham, C. W. (1997). Is congruence between data partitions a reliable predictor of phylogenetic accuracy? Empirically testing an iterative procedure for choosing among phylogenetic methods. Syst. Biol. 46: 464–478.
Doyle, J. J., Doyle, J. L., and Brown, A. H. D. (1999). Incongruence in the diploid B-genome species complex of Glycine (Leguminosae) revisited: Histone H3-D alleles versus chloroplast haplotypes. Mol. Biol. Evol. 16: 354–362.
Farris, J. S. (1991). “Arn”. Program and documentation. Molekularsystematiska laboratoriet, Naturhistoriska riksmuseet, Box 50007 S 104–105, Stockholm, Sweden.
Farris, J. S., Källersjö, M., Kluge, A. G., and Bult, C. (1994). Testing significance of incongruence. Cladistics 10: 315–319.
Geiser, D. M., Pitt, J. I., and Taylor, J. W. (1998). Cryptic speciation and recombination in the aflatoxin-producing fungus Aspergillus flavus. Proc. Natl. Acad. Sci. USA 95: 388–393.
Graham, S. W., Kohn, J. R., Morton, B. R., Eckenwalder, J. E., and Barrett, S. C. H. (1998). Phylogenetic congruence and discordance among one morphological and three molecular data sets from Pontederiaceae. Syst. Biol. 47: 545–567.
Hoot, S. B., Magallon, S., and Crane, P. R. (1999). Phylogeny of basal eudicots based on three molecular data sets: atpB, rbcL and 18S nuclear ribosomal DNA sequences. Ann. Missouri Bot. Garden 86: 1–32.
Johnson, K. P., and Sorensen, M. D. (1998). Comparing molecular evolution in two mitochondrial protein coding genes (cytochrome b and ND2) in the dabbling ducks (Tribe: Anatini). Mol. Phylogenet. Evol. 10: 82–94.
Källersjö, M., Albert, V. A., and Farris, J. S. (1999). Homoplasy increases phylogenetic structure. Cladistics 15: 91–93.
Kitching, I. J., Forey, P. L., Humphries, C. J., and Williams, D. M.
(1998). “Cladistics: The Theory and Practice of Parsimony Analysis,” 2nd ed., Oxford Univ. Press, Oxford.
Lecointre, G., Rachdi, L., Darlu, P., and Denamur, E. (1998). Escherichia coli molecular phylogeny using the incongruence length difference test. Mol. Biol. Evol. 15: 1685–1695.
Messenger, S. L., and McGuire, J. A. (1998). Morphology, molecules, and the phylogenetics of cetaceans. Syst. Biol. 47: 90–124.
Mickevich, M. F., and Farris, J. S. (1981). The implications of congruence in Menidia. Syst. Zool. 30: 351–370.
Muona, J. (1995). The phylogeny of Elateroidea (Coleoptera), or which tree is best today? Cladistics 11: 317–341.
Nixon, K. C., and Carpenter, J. M. (1996). On simultaneous analysis. Cladistics 12: 221–241.
Normark, B. B., and Lanteri, A. A. (1998). Incongruence between morphological and mitochondrial DNA characters suggests hybrid origins of parthenogenetic weevil lineages (genus Aramigus). Syst. Biol. 47: 475–494.
Page, R. D. M., and Holmes, E. C. (1998). “Molecular Evolution: A Phylogenetic Approach,” Blackwell Sci., Oxford.
Piercey-Normore, M. D., Egger, K. N., and Bérubé, J. A. (1998). Molecular phylogeny and evolutionary divergence of North American species of Armillaria. Mol. Phylogenet. Evol. 10: 49–66.
Quicke, D. L. J., and Belshaw, R. (1999). Incongruence between morphological data sets: An example from the evolution of endoparasitism among parasitic wasps (Hymenoptera: Braconidae). Syst. Biol. 48: 436–454.
Spangler, R. E., and Olmstead, R. G. (1999). Phylogenetic analysis of Bignoniaceae based on the cpDNA gene sequences rbcL and ndhF. Ann. Missouri Bot. Garden 86: 33–46.
Stanger-Hall, I. K., and Cunningham, C. W. (1998). Support for a monophyletic Lemuriformes: Overcoming incongruence between data partitions. Mol. Biol. Evol. 15: 1572–1577.
Swofford, D. L. (1991). When are phylogeny estimates from molecular and morphological data incongruent? In “Phylogenetic Analysis of DNA Sequences” (M. Miyamoto and J. Cracraft, Eds.), pp. 295–333. Oxford Univ. Press, Oxford.
Swofford, D. L. (1998). PAUP*. Phylogenetic Analysis Using Parsimony (*and other methods). Version 4. Sinauer, Sunderland, MA.
Swofford, D. L., Olsen, G. J., Waddell, P. J., and Hillis, D. M. (1996). Phylogenetic inference. In “Molecular Systematics” (D. M. Hillis, C. Moritz, and B. K. Mable, Eds.), pp. 407–514. Sinauer, Sunderland, MA.
Vidal, N., and Lecointre, G. (1998). Weighting and congruence: A case study based on three mitochondrial genes in pitvipers. Mol. Phylogenet. Evol. 9: 366–374.
Wenzel, J. W., and Siddall, M. E. (1999). Noise. Cladistics 15: 51–64.