Methods of determining cranial and postcranial character congruence

By Ross Mounce* and Matthew Wills
* ross.mounce@gmail.com

1. Why compare subsets of cladistic data?
Optimal estimates of phylogenetic inference work on the basis of Total Evidence [1]. We do not contest this. However, using methods of 'data exploration' to examine subsignals within datasets can be both a justified and useful means of gaining quantitative support for more detailed evolutionary explanations [2]. Molecular systematists routinely compare and contrast data sourced from different genes, e.g. nuclear, mitochondrial and plastid markers. We think morphologists could gain much insight from similar statistically explicit comparisons of different anatomical regions, e.g. Do vertebrae make 'good' characters? Does my soft tissue data agree with my osteological data? What influence do dental characters have on the cladogram? Have different parts evolved at different rates?

3. Partitioned Goodman-Bremer Support
Another method relatively unused in palaeontological systematics is that of partitioned Goodman-Bremer support (GBS) [6,7]. Using this method one can assess not just the number of characters that support each node in a strict consensus cladogram, but also which partition those supporting characters came from. Negative support indicates otherwise 'hidden' character conflict between partitions.
[Fig. 2 cladogram. Terminal taxa: Eoraptor_lunensis, Herrerasaurus_ischigualastensis, Ceratosaurus_nasicornis, Carnotaurus_sastrei, Torvosaurus_tanneri, Baryonyx_walkeri, Allosaurus_fragilis, Dilophosaurus_wetherilli, Liliensternus_liliensterni, Lophostropheus_airelensis, Syntarsus_kayentakatae, Coelophysis_rhodesiensis, Coelophysis_bauri. Node labels (cranial,postcranial GBS) in extraction order: -1.0,4; 4.5,4.5; 0.5,3.5; 0,8; 0,9; 1,0; 0,5; 0,1; 0,1; 0,2]

Postcranial elements from UCMP 77270 (Dilophosaurus)

Skull of D. wetherilli

2. What methods should one use?
One would not recommend the consistency index [3], despite its popular usage, as it is known [4] to be a poor measure of homoplasy: it is affected by the number of taxa, the number of characters, the number of character states, and the rate of evolution of the characters in the cladistic matrix.
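For readers unfamiliar with the statistic being criticised, here is a minimal sketch of the ensemble consistency index, assuming its usual definition: the summed minimum possible number of steps over all characters, divided by the summed number of steps observed on the tree. The function name and the step counts are hypothetical and are not taken from any matrix on this poster.

```python
def ensemble_ci(min_steps, observed_steps):
    """Ensemble consistency index: total minimum steps / total observed steps."""
    return sum(min_steps) / sum(observed_steps)


# Hypothetical per-character values for a four-character matrix.
min_steps = [1, 1, 2, 1]        # fewest steps each character could need
observed_steps = [1, 3, 2, 4]   # steps each character needs on the chosen tree

ci = ensemble_ci(min_steps, observed_steps)
print(ci)  # 0.5
```

A value of 1 means no homoplasy; the value falls towards 0 as characters need more steps than their minimum.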
[Fig. 1A plot: Ensemble Consistency Index (y-axis, 0.000 to 1.000) against an x-axis running from 0 to 450]

Fig. 2 Re-analysing data from Ezcurra & Cuny 2007, JVP, to compare support contributed by cranial & postcranial character partitions (cranial,postcranial GBS values on each node).


Note in figure 2 (above) the conflict in GBS from cranial characters at the strongly postcranially-supported node (indicated by the arrow): the distribution of states in one cranial character is incongruent with the topology supported by postcranial characters. There is also strongly 'lop-sided' partition support for this cladogram: most nodes are supported only by postcranial characters, despite there being 68 cranial characters in this matrix relative to 77 postcranial.
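The arithmetic behind a pair of node labels such as -1.0,4 can be sketched as follows. This assumes the usual definition of partitioned support: for each partition, its length on the shortest tree(s) lacking the node minus its length on the most parsimonious tree(s), averaged when several trees tie. The function name and all tree lengths below are hypothetical; they are chosen only so that the result matches one labelled node, and are not the lengths from the re-analysis.

```python
from statistics import mean


def partitioned_support(optimal, constrained):
    """Per-partition support for one node.

    optimal:     partition name -> partition lengths on each MP tree
    constrained: partition name -> partition lengths on each shortest
                 tree that does NOT contain the node
    """
    return {name: mean(constrained[name]) - mean(optimal[name])
            for name in optimal}


# Hypothetical lengths (steps), one tree in each set.
optimal = {"cranial": [120], "postcranial": [200]}
constrained = {"cranial": [119], "postcranial": [204]}

support = partitioned_support(optimal, constrained)
total_support = sum(support.values())
print(support, total_support)
```

Here the cranial partition is one step shorter on the tree without the node, so its support is negative, while the summed support for the node stays positive.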

[Fig. 1 trend line] f(x) = -0.0006274738x + 0.6109492765, R² = 0.1635139213

4. The Incongruence Length Difference Test
The ILD test [8,9] is perhaps one of the most routinely used methods for comparing sequence data in molecular phylogenetics, with at least 2500 citations to the paper describing it, most of which do use the test [10]. Given its long history, we feel this test is under-utilised in palaeontology, but see some recent uses [11,12].
Whole dataset

Taxon  Part A     Part B
Out    000000000  000000000
A      001110011  000000011
B      001110000  000001100
C      001100011  000111111
D      110000000  001111100
E      110001101  111111101
F      110001100  111111100

MP, Length=25

[Fig. 1B plot: y-axis 0.000 to 1.000, x-axis 0 to 100]

[Axis labels: Number of replicates (Fig. 5); Number of characters in dataset (Fig. 1A)]


[Fig. 5 annotation] Only 3 random reps were as short; ILD p-value = 0.004 (4/1000)
[Fig. 4 replicate] Calc. ILD: ILD=1

Part A (only): MP, L=11

[Fig. 1 trend line] f(x) = -0.0041252873x + 0.6669527033, R² = 0.4040262399

[Fig. 4 replicate] Calc. ILD: ILD=3

[Fig. 5 x-axis: Summed length of the cranial,postcranial partitions]

[Fig. 1 y-axis: Ensemble Consistency Index]

Fig. 3 A toy matrix example of ILD value [8] calculation. In this instance the length difference between these partitions is 2 (25-23).

Fig. 4 Example randomised partition replicates, which one can use to determine the significance of the length difference between the partitions you are interested in.

Part B (only): MP, L=12
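The three lengths in the toy example of Fig. 3 can be checked by brute force. The sketch below assumes unordered (Fitch) parsimony with equal weights and simply scores every possible tree, which is only feasible for a matrix this small. All function names are illustrative; this is not the software used for the analyses on this poster.

```python
def insertions(tree, leaf):
    """Yield every rooted tree made by attaching `leaf` to one branch of `tree`."""
    yield (tree, leaf)
    if isinstance(tree, tuple):
        left, right = tree
        for sub in insertions(left, leaf):
            yield (sub, right)
        for sub in insertions(right, leaf):
            yield (left, sub)


def all_rooted_trees(taxa):
    """All rooted binary trees on `taxa`, as nested 2-tuples of taxon names."""
    trees = [taxa[0]]
    for leaf in taxa[1:]:
        trees = [new for tree in trees for new in insertions(tree, leaf)]
    return trees


def fitch_length(tree, states):
    """Parsimony length (Fitch, unordered states) of `states` on `tree`."""
    steps = 0

    def state_sets(node):
        nonlocal steps
        if not isinstance(node, tuple):          # a terminal taxon
            return [{s} for s in states[node]]
        merged = []
        for a, b in zip(state_sets(node[0]), state_sets(node[1])):
            if a & b:
                merged.append(a & b)
            else:                                # no shared state: one step
                merged.append(a | b)
                steps += 1
        return merged

    state_sets(tree)
    return steps


def mp_length(matrix, columns, outgroup="Out"):
    """Shortest length over all trees for the chosen character columns."""
    sub = {taxon: [row[i] for i in columns] for taxon, row in matrix.items()}
    ingroup = [taxon for taxon in matrix if taxon != outgroup]
    return min(fitch_length((outgroup, tree), sub)
               for tree in all_rooted_trees(ingroup))


part_a = {"Out": "000000000", "A": "001110011", "B": "001110000",
          "C": "001100011", "D": "110000000", "E": "110001101",
          "F": "110001100"}
part_b = {"Out": "000000000", "A": "000000011", "B": "000001100",
          "C": "000111111", "D": "001111100", "E": "111111101",
          "F": "111111100"}
whole = {taxon: part_a[taxon] + part_b[taxon] for taxon in part_a}

len_whole = mp_length(whole, range(18))
len_a = mp_length(whole, range(0, 9))
len_b = mp_length(whole, range(9, 18))
ild = len_whole - (len_a + len_b)
print(len_whole, len_a, len_b, ild)  # 25 11 12 2
```

Attaching the outgroup to the root of each of the 945 ingroup trees covers every unrooted topology for seven taxa, and Fitch length does not depend on where the tree is rooted. Real matrices need heuristic searches in dedicated parsimony software instead.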

… at least 999 times, and compare with the ILD you originally got, to get an ILD p-value
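The randomisation step can be sketched as below, assuming the p-value is the proportion of partitions (the random replicates plus the original) whose summed length is at least as short as the original's. That convention is inferred from the "4/1000" annotation on Fig. 5. The function names and the replicate values are hypothetical; in a real test each replicate's summed length would come from two parsimony searches.

```python
import random


def random_partition(n_chars, size_a, rng):
    """Split character indices 0..n_chars-1 at random, keeping the original sizes."""
    cols_a = sorted(rng.sample(range(n_chars), size_a))
    chosen = set(cols_a)
    cols_b = [i for i in range(n_chars) if i not in chosen]
    return cols_a, cols_b


def ild_p_value(observed_sum, replicate_sums):
    """Share of partitions (replicates + original) with a summed length <= observed."""
    as_short = sum(1 for s in replicate_sums if s <= observed_sum)
    return (as_short + 1) / (len(replicate_sums) + 1)


# One random partition with the sizes of the cranial (68) and
# postcranial (77) character sets.
rng = random.Random(0)
cols_a, cols_b = random_partition(145, 68, rng)

# Hypothetical summed lengths: 3 of 999 replicates are as short as the original.
observed = 400
replicates = [400, 399, 400] + [405] * 996
p = ild_p_value(observed, replicates)
print(p)  # 0.004
```

Counting the original partition in both numerator and denominator means the p-value can never be exactly zero, which is why 999 replicates give a minimum of 0.001.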

[Axis label: Length]

Fig. 5 Re-analysing Ezcurra & Cuny 2007 using the ILD test to compare cranial and postcranial partitions

[Fig. 1B x-axis: Number of taxa in dataset]

Fig. 1 Data from re-analysis of 163 vertebrate-only cladistic matrices published 2000-2011 (Mounce & Damary-Homan, unpublished); further details in ESM. A) The inverse relationship between CI and number of characters. B) The inverse relationship between CI and number of taxa. Admittedly, taxon and character numbers are highly correlated.

Having performed the ILD test, and others, on 63 vertebrate data matrices to compare cranial and postcranial partitions, we find that many datasets, like that in figure 5, appear to have unexplained significant incongruence (figure 6; p-values < 0.005).
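[Fig. 6 pie charts A) and B). Legend for A): 'Fish', Amphibia, Mammals, Birds, Dinosaurs, Reptiles (other). Slice labels as extracted, not matched to categories: 6, 10, 13, 16]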

Figure 1 (above) empirically demonstrates some of the problems of using CI as a comparative statistic. We agree with Hoyal Cuthill et al. [4] that multivariate approaches are needed to adequately control for covariates such as these in comparative analyses. A better comparative measure of homoplasy may well be Archie's Homoplasy Excess Ratio (HER) [5], which is more computationally demanding. However, if there is a high proportion of non-randomly distributed missing data in the matrix it can lead to negative HER values.

Fig. 6 A) Group composition of the datasets analysed B) ILD p-values: red=0.001-0.01 (highly significant), grey=0.011-0.1 (significant or borderline), blue=0.101-1.0 (not significant)

[Fig. 6 slice labels as extracted, not matched to categories: 14, 7, 40, 16, 4]

Supplementary materials inc. code + data + more refs: http://bit.ly/palassposter

Acknowledgements: Many thanks to all the Macroevolution group @UoBath

We conclude that modularity might explain this phenomenon, allowing observable differences in the rates of morphological evolution between anatomical regions.

References:
1. Grant & Kluge, 2003, Cladistics
2. Kluge, 1989, Syst. Zool.
3. Kluge & Farris, 1969, Syst. Zool.
4. Hoyal Cuthill et al., 2010, Cladistics
5. Archie, 1989, Syst. Zool.
6. Grant & Kluge, 2008, MPE
7. Bremer, 1988, Evol.
8. Mickevich & Farris, 1981, Syst. Zool.
9. Farris et al., 1994, Cladistics
10. Mounce, 2011, http://bit.ly/ILDreview
11. Ketchum & Benson, 2010, Biol. Rev.
12. Smith, N.D., 2010, PLoS ONE

*Many of the images displayed on this poster are not my creative works, and may not be compatibly licensed.

Except where otherwise noted* this work is licensed under http://creativecommons.org/licenses/by/3.0/

@RMounce http://about.me/rossmounce
