
Molecular Phylogenetics and Evolution

Vol. 17, No. 3, December, pp. 401–406, 2000


doi:10.1006/mpev.2000.0845, available online at http://www.idealibrary.com

Noise and Incongruence: Interpreting Results of the Incongruence Length Difference Test
Konrad Dolphin,* Robert Belshaw,* C. David L. Orme,* and Donald L. J. Quicke*,†
*Department of Biology, Imperial College at Silwood Park, Ascot, Berkshire SL5 7PY, United Kingdom; and †Department of
Entomology, The Natural History Museum, London SW7 5BD, United Kingdom
Received December 15, 1999; revised August 4, 2000; published online November 3, 2000

Incongruence between data sets is an important concept in molecular phylogenetics and is commonly measured by the incongruence length difference (ILD) test (J. S. Farris et al., Cladistics 10, 315–319). The ILD test has been used to infer specific evolutionary events and to determine whether to combine data sets for phylogenetic analysis. However, the interpretation in the literature of the test’s results varies because authors have conflicting expectations of the effect that noise will have. Using simulations we demonstrate that noise can by itself generate highly significant results in the ILD test and demonstrate why this is the case. To clarify the interpretation of test results, we suggest an additional procedure in which the result is compared against a frequency distribution generated from completely shuffled data. As examples, we apply this approach to two previous studies that have reported incongruence. © 2000 Academic Press

INTRODUCTION

In a phylogenetic context, congruence is the extent to which estimates of phylogeny based on different data sets are in mutual agreement (Page and Holmes, 1998; Swofford et al., 1996). The concept plays an important role in phylogenetic research, especially with molecular data sets in which several biological processes, e.g., lineage sorting, gene duplication, and introgression (Page and Holmes, 1998), can lead to the estimation of different phylogenies of the same taxa from different genes.

Congruence may take the form of taxonomic congruence, in which tree topologies are compared; character congruence, in which the data sets themselves are directly compared in some way; or a combination of these two forms. One of the most intuitively appealing and widespread of the character congruence measures is the incongruence length difference (ILD) test of Farris et al. (1994). (Note that this paper is cited in the literature as both Farris et al. (1994) and Farris et al. (1995): this is due to the late publication of the 1994 volume of Cladistics, which, although published in 1995, bears the date 1994). As we describe below in more detail, the ILD test is commonly used to test hypotheses of evolutionary events, and it is used when considering whether to analyze multiple data sets separately or simultaneously. We believe that both of these uses are compromised by the current lack of a clear understanding of the relationship between the ILD test and noise, defined here as random data (after Wenzel and Siddall, 1999) which by definition cannot reveal any features of shared phyletic history.

The ILD Test

Mickevich and Farris (1981) introduced the ILD measure and it was developed (and also named) by Farris et al. (1994).

“For matrices X and Y the incongruence length difference $D_{xy}$ is given by:

$D_{xy} = L_{(x+y)} - (L_x + L_y)$

$L_x$, $L_y$ and $L_{(x+y)}$ denote the lengths of most parsimonious trees calculated for each matrix separately and for the combined matrix, that including all the characters.”

To obtain a measure of the significance of this length difference, a null length distribution is required, and in practice the test works as follows. (1) Calculate the sum of most-parsimonious tree (MPT) lengths from the two matrices (which we will refer to as partitions in the context of an ILD test). (2) Create replicate partitions by pooling all characters and randomly allocating them to two partitions equal in size to that of the originals. (3) Calculate the sum of the MPT lengths from each of these replicate partitions to form a tree-length distribution. (4) Calculate the probability that the sum of lengths from the original partitions (step 1) lies within this distribution: a low probability implies incongruence.

The test is implemented in this way in software applications such as arn (Farris, 1991) and PAUP* (Swofford, 1998); in the latter, the test is known as the “partition homogeneity” test.
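The four steps of the test described above can be illustrated with a minimal Python sketch. It is not the implementation used in arn or PAUP*. To keep the MPT search exact and short, it assumes only four taxa and binary characters, so that the MPT length can be found by scoring all three unrooted quartet topologies. The function names (`char_steps`, `mpt_length`, `ild`, `ild_test`) are hypothetical, and the P value convention used here, (count + 1) / (replicates + 1), is an assumption that may differ from the convention of a given program.

```python
import random

# Assumption: four taxa (indexed 0..3). Each of the three unrooted quartet
# topologies is identified by the pair of taxa that contains taxon 0.
QUARTET_SPLITS = [frozenset({0, 1}), frozenset({0, 2}), frozenset({0, 3})]


def char_steps(column, split):
    """Parsimony steps for one binary character on one quartet topology."""
    ones = frozenset(i for i, state in enumerate(column) if state == 1)
    if len(ones) in (0, 4):   # constant character
        return 0
    if len(ones) in (1, 3):   # one taxon differs: one step on any tree
        return 1
    # 2-2 pattern: one step if it matches the tree's split, otherwise two
    side_with_taxon0 = ones if 0 in ones else frozenset(range(4)) - ones
    return 1 if side_with_taxon0 == split else 2


def mpt_length(columns):
    """Exact MPT length: the minimum total steps over the three topologies."""
    return min(sum(char_steps(c, s) for c in columns) for s in QUARTET_SPLITS)


def ild(part_x, part_y):
    """Incongruence length difference D_xy = L_(x+y) - (L_x + L_y)."""
    combined = list(part_x) + list(part_y)
    return mpt_length(combined) - (mpt_length(part_x) + mpt_length(part_y))


def ild_test(part_x, part_y, replicates=999, seed=0):
    """Steps 1-4: compare the observed sum of MPT lengths with random repartitions."""
    rng = random.Random(seed)
    observed = mpt_length(part_x) + mpt_length(part_y)      # step 1
    pooled = list(part_x) + list(part_y)
    n_x = len(part_x)
    as_short = 0
    for _ in range(replicates):
        rng.shuffle(pooled)                                  # step 2
        rep = mpt_length(pooled[:n_x]) + mpt_length(pooled[n_x:])  # step 3
        if rep <= observed:
            as_short += 1
    p_value = (as_short + 1) / (replicates + 1)              # step 4 (assumed convention)
    return observed, p_value


# Toy partitions: five characters supporting (0,1) and five supporting (0,2)
part_x = [(1, 1, 0, 0)] * 5
part_y = [(1, 0, 1, 0)] * 5
print(ild(part_x, part_y), ild_test(part_x, part_y))
```

For these toy partitions the separate trees have length 5 each and the combined tree has length 15, so `ild(part_x, part_y)` is 5. Real data sets require a proper tree search in place of the three-topology enumeration.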

1055-7903/00 $35.00
Copyright © 2000 by Academic Press
All rights of reproduction in any form reserved.

The principle of the test is that if characters are more congruent within than between partitions, then homoplasy, and therefore tree lengths, will be minimized when the characters are maintained in their original partitions and the ILD measure will be large.

Applications of the ILD Test

Many studies have used the ILD test to infer the probability of evolutionary events using incongruence between data sets. Recent examples include tests for hybridization (Doyle et al., 1999), horizontal gene transfer (Cho et al., 1998; Lecointre et al., 1998), recombination (Geiser et al., 1998), large-scale convergence (Muona, 1995; Quicke and Belshaw, 1999), and hybrid origins of parthenogenesis (Normark and Lanteri, 1998). For these purposes, the generation of significant results in the test by noise would be positively misleading.

Another common application of the test is during the “conditional combination” or “prior agreement” approach to phylogeny reconstruction (Bull et al., 1993), in which the analysis of two data sets is carried out separately if they are found to be significantly incongruent (by some measure) and simultaneously if they are not (see, for example, Johnson and Sorensen, 1998; Vidal and Lecointre, 1998; Carbone et al., 1999; Hoot et al., 1999; Spangler and Olmstead, 1999). Leaving aside the debate over separate versus combined analysis (e.g., Chippindale and Wiens, 1994; Nixon and Carpenter, 1996), there is currently an unresolved issue concerning the precise effect that combining noisy and less noisy data sets has on phylogeny reconstruction (Wenzel and Siddall, 1999; Källersjö et al., 1999). This means that a test which is used to decide whether or not to combine data sets should not confound both noise and incongruent signal into a single result.

The Influence of Noise on the ILD Test

There is no consensus in the literature on how noise will affect the ILD test. The test was published with the stated expectation that noise would not affect the result (Farris et al., 1994), and this expectation has entered standard texts on phylogenetic methodology (Kitching et al., 1998). Nonetheless, authors differ in their opinion as to the relationship between noise or homoplasy and incongruence. For example, Swofford (1991), quoted in Farris et al. (1994), states that the ILD test is not influenced by different levels of homoplasy within the matrices but that it should be; Vidal and Lecointre (1998) agree that it is not influenced by noise, but they follow Farris et al. (1994) in believing that it should not be; Graham et al. (1998) come to the conclusion that it is in fact influenced by noise and offer advice on how to solve this problem (by excluding poorly supported branches). Other authors make the observation during their work that some significant results in the test may be due to a noisy or small data set, or one which lacks the ability to fully resolve trees, without addressing the issue further (for example Cannatella et al., 1998; Piercey-Normore et al., 1998). This is clearly an unsatisfactory position.

We consider that the problem needs urgent attention, given the inherently noisy nature of molecular data, combined with the trend toward multiple gene analyses and the availability of incongruence tests in standard phylogeny reconstruction software. We believe that our study is the first to formally investigate the relationship between noise and incongruence in the ILD test, although several previous studies have touched upon this. Some studies (e.g., Graham et al., 1998) have shown that incongruence can exist between real and shuffled partitions; others (Cunningham, 1997; Messenger and McGuire, 1998; Stanger-Hall and Cunningham, 1998) show that the reduction or exclusion of noise from matrices can reduce levels of incongruence. Quicke and Belshaw (1999) showed that the addition of noise to a small, simulated matrix reduced incongruence with a simulated matrix of incongruent characters. We therefore investigated, using simulations, how and why noise can affect the ILD test.

MATERIALS AND METHODS

We created two data matrices, called Comb and Bush, each of 16 taxa with 13 characters forming a perfectly pectinate or symmetrical cladogram, respectively. No characters were homoplastic and each node was supported by one character (Fig. 1). All analyses were conducted separately for each matrix to reveal any effects of topology on the noise–incongruence relationship.

FIG. 1. The data matrices used in this study, their respective cladograms, and examples showing replicates with four characters shuffled (boldface).

Effect of Noise on ILD Test Results

We assessed the incongruence between each structured matrix and a range of data partitions containing varied amounts of noise. ILD tests were conducted between each matrix and a duplicate of that matrix which had been modified by shuffling character states, within an increasing number (from 1 to 13) of randomly selected characters, using a macro in Microsoft Excel. Examples of shuffled matrices are presented in Fig. 1. The analyses were carried out on 30 shuffled replicate partitions for each level of noise. Options on PAUP* (version 64) were set to 100 replicates of heuristic search using simple addition, MULPARS, and an unlimited maxtree setting.

Exploring the Response of the ILD Test to Noise

To interpret the results of the previous section, we proceeded to investigate the relationship between the amount of noise in a matrix and the length of the most-parsimonious tree (MPT) which it would produce. Using replicate matrices with varying amounts of noise generated as above, we searched for MPTs using PAUP* branch and bound option with an unlimited maxtree setting. The analyses were carried out on 30 shuffled replicates for each level of noise.
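The noise-generation step described in the methods above (shuffling character states within a chosen number of randomly selected characters) can be sketched in Python as follows. This is an illustration only, not the Excel macro used in the study. The helper names `comb_matrix` and `shuffle_characters` are hypothetical, and the 0/1 coding of the pectinate matrix is one possible homoplasy-free coding, assumed here because the matrices of Fig. 1 are not reproduced in the text.

```python
import random


def comb_matrix(n_taxa=16):
    """Assumed coding of a homoplasy-free pectinate (comb) matrix.

    Character j (0-based) is scored 1 for taxa j+2 .. n_taxa-1, giving one
    character for each of the n_taxa - 3 internal branches.
    """
    n_chars = n_taxa - 3
    return [[1 if taxon >= j + 2 else 0 for j in range(n_chars)]
            for taxon in range(n_taxa)]


def shuffle_characters(matrix, n_noisy, rng):
    """Permute states among taxa within n_noisy randomly chosen characters.

    Returns the modified copy and the sorted indices of the shuffled
    characters. State frequencies within each character are preserved.
    """
    noisy = [row[:] for row in matrix]            # leave the original intact
    chosen = rng.sample(range(len(matrix[0])), n_noisy)
    for c in chosen:
        column = [row[c] for row in noisy]
        rng.shuffle(column)                       # reassign states to taxa
        for taxon, state in enumerate(column):
            noisy[taxon][c] = state
    return noisy, sorted(chosen)


rng = random.Random(1)                            # fixed seed for repeatability
comb = comb_matrix()
# 30 replicates for each noise level from 1 to 13 shuffled characters
replicates = {k: [shuffle_characters(comb, k, rng)[0] for _ in range(30)]
              for k in range(1, 14)}
```

Each replicate would then be paired with the unshuffled matrix and passed to an ILD test in a phylogenetics package. Note that a shuffled column can by chance reproduce the original column, so the number of shuffled characters is an upper bound on the number of characters that actually changed.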

RESULTS

Effect of Noise on ILD Test Results


Both data matrices show increasing incongruence with their duplicate as the proportion of shuffled characters in the duplicated set increases (Fig. 2). Significant results appear when 50–60% of the characters represent noise. Incongruence can therefore exist between one data matrix and another in which nearly half the characters are perfectly congruent with the first matrix and there is no additional systematic signal above that expected by chance alone.

We stress that this result cannot be predicted simply from the test’s design. We would expect, as is suggested by Farris et al. (1994), that the test should act as the equivalent of the Mann–Whitney U test and should assess a difference in mean (= signal) without being affected by a difference in variance (= noise) in the two samples. Our suggestion as to the cause of this ILD significance is described below.

FIG. 2. Percentage of ILD tests showing significant incongruence at the 5% level when comparing a homoplasy-free matrix and a duplicate with an increasing number of characters shuffled. Results from both the bush matrix (triangles, solid line) and the comb matrix (circles, dotted line) are shown.

Exploring the Response of the ILD Test to Noise

Figure 3a illustrates how MPT length grows with the increasing amount of noise in a matrix. This relationship is nonlinear: a linear relationship would hold if each additional shuffled character contributed the same number of extra steps to the tree length. In fact, as we progressively increase the number of noisy characters, the average length of each shuffled character on the MPT decreases and the average length of each nonshuffled character increases until their average lengths become identical at extremely high noise levels. This means that at the noisy end of the spectrum it is less costly in terms of tree length to shuffle each additional character, and the resulting length to noise relationship is curvilinear.

As a consequence of this relationship, matrices with intermediate levels of noise produce longer trees than predicted by a linear model. Figure 3b shows the effects that this has on the ILD test. Matrix A is 20% noise and gives a tree length of 22 steps, matrix B is of equal size, but contains 80% noise and gives a tree length of 33 steps: a total initial partition tree length of 55 steps. When the characters are repartitioned, each matrix will contain on average 50% noise and give a replicate sum of tree lengths totalling (29 × 2) = 58 steps, which is 3 steps longer than the original. Although the numbers are arbitrary in this example, it is generally true that replicate partitions with an intermediate amount of noise will produce a longer sum of tree lengths than an original partition comprising one noisy and one less noisy matrix. When this difference reaches a certain level, ILD test results will be significant even in the absence of systematic incongruence.

FIG. 3. (a) Most-parsimonious tree length grows in a nonlinear fashion with an increasing proportion of noise in each matrix. Results from both the bush matrix (triangles) and the comb matrix (circles) are shown. Standard error bars are too small to be visible. (b) The effect of this curvilinear relationship on the ILD test. Points A and B represent two equally sized matrices; A is relatively noise-free and produces a shorter tree than the noisier B. When the characters (and noise) are repartitioned during the ILD test, the length of the average replicate is closer to B than to A.

DISCUSSION

Our findings that significance in the ILD test can be increased by a difference in levels of noise in two partitions have several implications for phylogenetic research.

(a) Where the ILD test is used to test for particular events during molecular evolution (recombination, introgression, etc.) by assessing the extent of incongruence between two sequences, the unknown amount of significance generated by different levels of homoplasy within the data sets will render the interpretation of P values generated by the test ambiguous.

(b) The ILD test is increasingly being used to determine whether combined or separate analysis of partitions is favorable. The results of this study show that the ILD test could lead to a separate analysis of two matrices representing a similar or identical underlying topology but whose characters had evolved at different rates and thus displayed different amounts of noise. Whether this consequence is advantageous is open to debate (but see Wenzel and Siddall, 1999 for a discussion of combining noisy and less noisy data sets). Nonetheless, with currently unquantifiable contributions to the significance of the test coming from both incongruent signal and noise, the precise meaning of the ILD test result will be ambiguous.

(c) On a related topic, some of the recent debate concerning the utility or otherwise of 3rd codon positions in phylogeny reconstruction has been based on the interpretation of incongruence inferred from the ILD test. The opposing views regarding the treatment of 3rd codon characters are summarized by Vidal and Lecointre (1998). Some researchers believe that 3rd codon characters can create actual conflict deep within the tree and should thus be down-weighted to minimize incongruence. Others believe that 3rd codon positions should be included in analyses since they are informative at subterminal nodes and simply noisy at deeper levels (and so are not destructive of the signal from more conservative characters). A discussion of the positive consequences of including these characters can be found in work by Källersjö et al. (1999). Vidal and Lecointre (1998) demonstrate that there is more incongruence between codon positions than between separate genes using the ILD test and cite this as evidence that 3rd codon positions are sometimes in significant conflict with other positions rather than being simply noisy. Thus, they refute the view that 3rd codon characters should always be included in analyses because of their interpretation of the ILD test results.
We propose that an alternative null model for the ILD test, which makes allowance for the major effect of noise demonstrated in our study, would be useful in some cases. To detect incongruence that cannot be attributed to noise alone we recommend the following procedure: (1) perform an ILD test on the two original data partitions (A and B); (2) shuffle the character states within the characters of one matrix (A) repeatedly to generate a series of pure noise partitions; (3) carry out ILD tests between these shuffled replicates and the original matrix B; (4) repeat the procedure comparing shuffled replicates of B against unshuffled A; (5) only if the original level of conflict in step 1 is significantly higher than the levels observed in both steps 3 and 4 can we say that the partitions are significantly incongruent.

Shuffling character states within characters retains features of a matrix such as frequency and between-character distribution of 0’s and 1’s while replacing signal with noise. Any difference in the ILD test results obtained before and after shuffling one matrix in this way indicates features of agreement or incongruence that cannot be attributed to noise in the initial partitions (although other effects, such as base composition bias, cannot be discounted).

Two examples of this approach are the following. (1) We further analyzed the mitochondrial data examined in Cunningham (1997) in which partitions represent 1st, 2nd, and 3rd codon positions with the 3rd codon positions saturated by multiple substitutions (a good example of noise). The initial ILD test using branch-and-bound option finds the 2nd and 3rd codon partitions to be highly incongruent (P < 0.001), a striking result, given the putatively identical phylogenetic history of the two partitions. Shuffling the 3rd codon character states tends to reduce the level of incongruence, with the mean difference in tree lengths between original and replicate partitions being 3.18 S.D. (n = 30) compared to 6.93 S.D. when the unshuffled data are analyzed. However, the value of 6.93 S.D. lies within the 95% confidence limit calculated for the 30 shuffled partitions. The importance of distinguishing between noise and signal in the case of codon position characters has been mentioned above, and this example highlights the need for an unambiguous incongruence measure. (2) In their extensive study of leptodactylid frogs, Cannatella et al. (1998) examine phylogenetic matrices whose characters represent a range of morphological, behavioral, and molecular sources. Multiple pairwise ILD tests between the partitions revealed that the only instance of significant incongruence was when a matrix comprising advertisement call characters (which they named CALLS) was compared with a morphological matrix.

We created a series of shuffled replicates of the CALLS matrix and performed ILD tests comparing these replicates with the original morphological matrix (branch-and-bound search, 10,000 replicates). In 10 of 15 instances, the significance of the shuffled replicate ILD test was greater than that of the original partition. Thus, we agree with the authors’ hypothesis that the incongruence is due to the small and noisy nature of the CALLS matrix.

ACKNOWLEDGMENTS

David Swofford kindly allowed us to publish results obtained from a test version of his program PAUP*. K.D. was supported by a BBSRC Grant. R.B. and C.D.L.O. were supported by NERC Grants.

REFERENCES

Bull, J. J., Huelsenbeck, J. P., Cunningham, C. W., Swofford, D. L., and Waddell, P. J. (1993). Partitioning and combining data in phylogenetic analysis. Syst. Biol. 42: 384–397.
Cannatella, D. C., Hillis, D. M., Chippindale, P. T., Weigt, L., Rand, A. S., and Ryan, M. J. (1998). Phylogeny of frogs of the Physalaemus pustulosus species group, with an examination of data incongruence. Syst. Biol. 47: 311–335.
Carbone, I., Anderson, J. B., and Kohn, L. M. (1999). Patterns of descent in clonal lineages and their multilocus fingerprints are resolved with combined gene genealogies. Evolution 16: 354–362.
Chippindale, P. T., and Wiens, J. J. (1994). Weighting, partitioning, and combining characters in phylogenetic analysis. Syst. Biol. 43: 278–287.
Cho, Y., Qiu, Y. L., Kuhlman, P., and Palmer, J. D. (1998). Explosive invasion of plant mitochondria by a group I intron. Proc. Natl. Acad. Sci. USA 95: 14244–14249.
Cunningham, C. W. (1997). Is congruence between data partitions a reliable predictor of phylogenetic accuracy? Empirically testing an iterative procedure for choosing among phylogenetic methods. Syst. Biol. 46: 464–478.
Doyle, J. J., Doyle, J. L., and Brown, A. H. D. (1999). Incongruence in the diploid B-genome species complex of Glycine (Leguminosae) revisited: Histone H3-D alleles versus chloroplast haplotypes. Mol. Biol. Evol. 16: 354–362.
Farris, J. S. (1991). “Arn”. Program and documentation. Molekularsystematiska laboratoriet, Naturhistoriska riksmuseet, Box 50007 S 104 –105, Stockholm, Sweden.
Farris, J. S., Källersjö, M., Kluge, A. G., and Bult, C. (1994). Testing significance of incongruence. Cladistics 10: 315–319.
Geiser, D. M., Pitt, J. I., and Taylor, J. W. (1998). Cryptic speciation and recombination in the aflatoxin-producing fungus Aspergillus flavus. Proc. Natl. Acad. Sci. USA 95: 388–393.
Graham, S. W., Kohn, J. R., Morton, B. R., Eckenwalder, J. E., and Barrett, S. C. H. (1998). Phylogenetic congruence and discordance among one morphological and three molecular data sets from Pontederiaceae. Syst. Biol. 47: 545–567.
Hoot, S. B., Magallon, S., and Crane, P. R. (1999). Phylogeny of basal eudicots based on three molecular data sets: atpB, rbcL and 18S nuclear ribosomal DNA sequences. Ann. Missouri Bot. Garden 86: 1–32.
Johnson, K. P., and Sorensen, M. D. (1998). Comparing molecular evolution in two mitochondrial protein coding genes (cytochrome b and ND2) in the dabbling ducks (Tribe: Anatini). Mol. Phylogenet. Evol. 10: 82–94.
Källersjö, M., Albert, V. A., and Farris, J. S. (1999). Homoplasy increases phylogenetic structure. Cladistics 15: 91–93.
Kitching, I. J., Forey, P. L., Humphries, C. J., and Williams, D. M. (1998). “Cladistics: The Theory and Practice of Parsimony Analysis,” 2nd ed., Oxford Univ. Press, Oxford.
Lecointre, G., Rachdi, L., Darlu, P., and Denamur, E. (1998). Escherichia coli molecular phylogeny using the incongruence length difference test. Mol. Biol. Evol. 15: 1685–1695.
Messenger, S. L., and McGuire, J. A. (1998). Morphology, molecules, and the phylogenetics of cetaceans. Syst. Biol. 47: 90–124.
Mickevich, M. F., and Farris, J. S. (1981). The implications of congruence in Menidia. Syst. Zool. 30: 351–370.
Muona, J. (1995). The phylogeny of Elateroidea (Coleoptera), or which tree is best today? Cladistics 11: 317–341.
Nixon, K. C., and Carpenter, J. M. (1996). On simultaneous analysis. Cladistics 12: 221–241.
Normark, B. B., and Lanteri, A. A. (1998). Incongruence between morphological and mitochondrial DNA characters suggests hybrid origins of parthenogenetic weevil lineages (genus Aramigus). Syst. Biol. 47: 475–494.
Page, R. D. M., and Holmes, E. C. (1998). “Molecular Evolution: A Phylogenetic Approach,” Blackwell Sci., Oxford.
Piercey-Normore, M. D., Egger, K. N., and Bérubé, J. A. (1998). Molecular phylogeny and evolutionary divergence of North American species of Armillaria. Mol. Phylogenet. Evol. 10: 49–66.
Quicke, D. L. J., and Belshaw, R. (1999). Incongruence between morphological data sets: An example from the evolution of endoparasitism among parasitic wasps (Hymenoptera: Braconidae). Syst. Biol. 48: 436–454.
Spangler, R. E., and Olmstead, R. G. (1999). Phylogenetic analysis of Bignoniaceae based on the cpDNA gene sequences rbcL and ndhF. Ann. Missouri Bot. Garden 86: 33–46.
Stanger-Hall, I. K., and Cunningham, C. W. (1998). Support for a monophyletic Lemuriformes: Overcoming incongruence between data partitions. Mol. Biol. Evol. 15: 1572–1577.
Swofford, D. L. (1991). When are phylogeny estimates from molecular and morphological data incongruent? In “Phylogenetic Analysis of DNA Sequences” (M. Miyamoto and J. Cracraft, Eds.), pp. 295–333. Oxford Univ. Press, Oxford.
Swofford, D. L. (1998). PAUP*. Phylogenetic Analysis Using Parsimony (*and other methods). Version 4. Sinauer, Sunderland, MA.
Swofford, D. L., Olsen, G. J., Waddell, P. J., and Hillis, D. M. (1996). Phylogenetic inference. In “Molecular Systematics” (D. M. Hillis, C. Moritz, and B. K. Mable, Eds.), pp. 407–514. Sinauer, Sunderland, MA.
Vidal, N., and Lecointre, G. (1998). Weighting and congruence: A case study based on three mitochondrial genes in pitvipers. Mol. Phylogenet. Evol. 9: 366–374.
Wenzel, J. W., and Siddall, M. E. (1999). Noise. Cladistics 15: 51–64.
