Computational Phylogenetics Tools
By: Meghana Devineni and Darya Orgil

COMP 342: BIOINFORMATICS FINAL PROJECT


PHYLOGENETIC INFERENCE
Phylogenetic inference is a part of evolutionary biology
○ provides insights into the patterns and processes of evolution
○ commonly compared among groups of organisms or genes.

Some reasons why we want to study this type of inference:


o It allows us to understand the evolutionary history of life on Earth and how different
groups of organisms are related to each other.
o It helps us to identify the origins and diversification of key features of organisms,
such as complex behaviors or physiological adaptations.
o Evolutionary relationships can inform disease diagnosis and species conservation
among other areas.
PHYLOGENETIC INFERENCE

Methods of Analyzing Phylogenetic Similarities/Differences between organisms:


➢ DNA sequences
➢ Anatomical structures
➢ Behavioral traits
This is where the computational tools can be useful.
Different computational algorithms and programs can be applied to this larger problem
of phylogenetic analysis.
The goal of all of these tools is to assemble a phylogenetic tree that describes
the evolutionary ancestry of a group of genes or species.
Bayesian Inference

● Bayesian inference takes a view of the phylogeny problem that makes analysis of
large data sets more tractable.

● Bayesian Modeling in Bioinformatics discusses the development and application of
Bayesian statistical methods for the analysis of high-throughput bioinformatics data

Example: A sample can be used to construct a consensus tree, with the posterior
probability of the individual clades indicated on the tree.

Bayesian analysis allows researchers to take into account data as well as prior
beliefs to calculate the probability that an alternative is superior.
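
The consensus-tree example above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each sampled tree is represented as a set of clades (a hypothetical representation, not the file format of any particular program), and the four-tree sample is made-up data. The estimated posterior probability of a clade is simply the fraction of sampled trees that contain it.

```python
from collections import Counter

def clade_support(tree_sample):
    """Return the fraction of sampled trees that contain each clade.

    Each tree is a set of clades; each clade is a frozenset of taxon names.
    """
    counts = Counter()
    for tree in tree_sample:
        counts.update(set(tree))  # count each clade at most once per tree
    n = len(tree_sample)
    return {clade: c / n for clade, c in counts.items()}

def majority_rule_clades(support, threshold=0.5):
    """Keep only clades found in more than `threshold` of the sampled trees."""
    return {clade: p for clade, p in support.items() if p > threshold}

# Toy sample of four trees over taxa A-D (made-up data).
AB, CD = frozenset("AB"), frozenset("CD")
AC, BD = frozenset("AC"), frozenset("BD")
sample = [{AB, CD}, {AB, CD}, {AB, CD}, {AC, BD}]

support = clade_support(sample)
consensus = majority_rule_clades(support)
for clade, p in sorted(consensus.items(), key=lambda item: sorted(item[0])):
    print(sorted(clade), p)
```

Clades that appear in most sampled trees survive into the majority-rule consensus, and their support values are what would be written on the tree.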
Maximum Likelihood

● Maximum likelihood methods are used to estimate the phylogenetic trees for a set
of species.
● J. Felsenstein introduced this method of finding an estimate for the
maximum likelihood phylogenetic tree.
● The probabilities of DNA base substitutions are modeled by continuous-
time Markov chains. We use these probabilities to estimate which DNA
bases would produce the data that we observe.
● The topology of the tree is also determined using base substitution
probabilities and conditional likelihoods.
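
As a rough sketch of how substitution probabilities and conditional likelihoods combine, the Python below computes the likelihood of a small alignment on a fixed tree. It assumes the Jukes-Cantor model (the simplest continuous-time Markov chain for DNA, chosen here for brevity and not necessarily the model any given program uses) and a uniform base distribution at the root. The tree encoding, sequences and branch lengths are all made up for illustration.

```python
import math

BASES = "ACGT"

def jc69_prob(i, j, t):
    """Jukes-Cantor probability that base i is replaced by base j over a
    branch of length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

def conditional_likelihoods(node, site):
    """Return {base: P(data below node | node has that base)} for one site.

    A leaf is a sequence string. An internal node is a tuple of
    (child, branch_length) pairs.
    """
    if isinstance(node, str):
        return {b: 1.0 if node[site] == b else 0.0 for b in BASES}
    result = {b: 1.0 for b in BASES}
    for child, t in node:
        below = conditional_likelihoods(child, site)
        for b in BASES:
            result[b] *= sum(jc69_prob(b, c, t) * below[c] for c in BASES)
    return result

def log_likelihood(tree, n_sites):
    """Sum of per-site log-likelihoods, assuming sites evolve independently."""
    total = 0.0
    for site in range(n_sites):
        cond = conditional_likelihoods(tree, site)
        total += math.log(sum(0.25 * cond[b] for b in BASES))
    return total

# Toy rooted tree ((seq1, seq2), seq3) with made-up branch lengths.
inner = (("ACGT", 0.1), ("ACGA", 0.1))
tree = ((inner, 0.05), ("TCGA", 0.2))
print(log_likelihood(tree, 4))
```

A maximum likelihood search would repeat this calculation for many candidate topologies and branch lengths and keep the tree with the highest value.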
Our project will focus on 3 different computational phylogenetics software packages:

01 BEAST: Bayesian Evolutionary Analysis Sampling Trees
Methods: Bayesian Inference, relaxed molecular clock, demographic history

02 BayesTraits
Analyses trait evolution among groups of species for which a phylogeny or sample of
phylogenies is available

03 FastDNAml
Optimized maximum likelihood (nucleotides only)
01. Bayesian Evolutionary Analysis Sampling Trees
Independent project led by the University of Auckland
Authors: A. J. Drummond, M. A. Suchard, D Xie & A. Rambaut
BEAST
Summary of the features

❑ Evolutionary parameter estimation and hypothesis testing

❑ Combines a large number of complementary evolutionary models (substitution
models, insertion-deletion models, demographic models, tree shape
priors, relaxed clock models, node calibration models) into a single
coherent framework for evolutionary inference

❑ Faster and more flexible codon-based substitution models

❑ Focuses on calibrated phylogenies and genealogies, which are rooted
trees incorporating a time-scale. This is achieved by explicitly
modeling the rate of molecular evolution on each branch in the tree.
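
To make the idea of a per-branch rate concrete, here is a minimal Python sketch (not BEAST code, and not its input format). It assumes that the expected number of substitutions per site on a branch is the branch's rate multiplied by its duration. The branch names, durations and rates are hypothetical values chosen for illustration.

```python
def branch_substitutions(durations, rates):
    """Expected substitutions per site on each branch: rate * elapsed time."""
    return {branch: rates[branch] * durations[branch] for branch in durations}

def root_to_tip(lengths, path):
    """Total expected substitutions along a path of branches from the root."""
    return sum(lengths[branch] for branch in path)

# Hypothetical time tree ((A, B), C): durations in millions of years.
durations = {"root-X": 2.0, "X-A": 3.0, "X-B": 3.0, "root-C": 5.0}

# Strict clock: one rate everywhere. Relaxed clock: each branch has its own rate.
strict_rates = {branch: 0.01 for branch in durations}
relaxed_rates = {"root-X": 0.01, "X-A": 0.02, "X-B": 0.005, "root-C": 0.01}

strict = branch_substitutions(durations, strict_rates)
relaxed = branch_substitutions(durations, relaxed_rates)

for name, lengths in (("strict", strict), ("relaxed", relaxed)):
    print(name,
          root_to_tip(lengths, ["root-X", "X-A"]),
          root_to_tip(lengths, ["root-X", "X-B"]),
          root_to_tip(lengths, ["root-C"]))
```

With a single shared rate every tip ends up the same genetic distance from the root. Letting the rate vary by branch breaks that equality while the time-scale of the tree stays the same.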
Our Attempt at Running BEAST
How Researchers have used BEAST
02. BayesTraits
Statistical model commissioned by the University of Reading
Authors: M. Pagel, A. Meade
BayesTraits
Summary of the features
❑ BayesTraits is a computer package for performing analyses of trait
evolution among groups of species for which a phylogeny or
sample of phylogenies is available.

❑ Incorporates their earlier and separate programs Multistate,
Discrete and Continuous. BayesTraits can be applied to the
analysis of traits that adopt a finite number of discrete states, or
to the analysis of continuously varying traits.

❑ Hypotheses can be tested about models of evolution, about
ancestral states and about correlations among pairs of traits.
Our Attempt at Running BayesTraits

To run BayesTraits using the Artiodactyl tree, data and input file use
the following command:

./BayesTraitsV4 Artiodactyl.trees Artiodactyl.txt < ArtiodactylMLIn.txt
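
The same invocation can be scripted. The sketch below is one possible Python wrapper around the command shown above. It assumes the executable and the three files sit in the working directory under the names used in that command, and the helper names `bayestraits_command` and `run_bayestraits` are hypothetical. The commands file is passed on standard input, which is what `<` does in the shell.

```python
import subprocess

def bayestraits_command(executable, tree_file, data_file):
    """Build the argument list: executable, then tree file, then data file."""
    return [str(executable), str(tree_file), str(data_file)]

def run_bayestraits(executable, tree_file, data_file, input_file):
    """Run the program, feeding the commands file on standard input
    (the equivalent of `< input_file` in the shell)."""
    with open(input_file, "r") as commands:
        return subprocess.run(
            bayestraits_command(executable, tree_file, data_file),
            stdin=commands,
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError on a non-zero exit status
        )

# Example call (requires the executable and files to be present):
# result = run_bayestraits("./BayesTraitsV4", "Artiodactyl.trees",
#                          "Artiodactyl.txt", "ArtiodactylMLIn.txt")
# print(result.stdout)
```

Wrapping the call this way makes it easier to repeat the run over several trait files or input files.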
How Researchers have used BayesTraits
03. FastDNAml
Program funded by the University of Illinois Urbana-Champaign
Authors: G J Olsen, H Matsuda, R Hagstrom, R Overbeek
FastDNAml
Summary of the features
❑ A tool for construction of phylogenetic trees of DNA sequences using maximum
likelihood
❑ Trees containing as many as 40-100 taxa have been easily generated
❑ Phylogenetic estimates are possible even when hundreds of sequences exist.
❑ At the time of the 1994 paper, it was being used to construct a phylogenetic
tree based on 473 small subunit rRNA sequences from prokaryotes.

Until recently, costs have limited the use of maximum likelihood techniques to
trees of under ~20 taxa.

The process of finding a phylogenetic tree using maximum likelihood involves
finding the topology and branch lengths of the tree that will give us the greatest
probability of observing the DNA sequences in our data.
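
One reason the search over topologies becomes costly is how quickly the number of possible trees grows. As a rough illustration, the Python sketch below counts distinct unrooted bifurcating topologies using the standard formula (2n - 5)!! for n taxa. It says nothing about how FastDNAml itself searches, which is not covered here.

```python
def unrooted_tree_count(n_taxa):
    """Number of distinct unrooted bifurcating topologies for n_taxa labeled
    taxa: (2n - 5)!! = 3 * 5 * ... * (2n - 5) for n >= 3."""
    if n_taxa < 3:
        return 1
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

for n in (4, 5, 10, 20):
    print(n, unrooted_tree_count(n))
```

Because scoring every topology is out of reach well before 20 taxa, practical maximum likelihood programs rely on heuristic searches over a small part of tree space.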
Note: Computational Tool Documentation

We were unable to run FastDNAml due to the lack of clear documentation.

Software without documentation can be difficult to navigate

Of the computational tools we used, the following documentation features
helped us use them more effectively:
■ Reference Manual
■ Quick Start
■ ReadMe File
Tools: PRO / CON / BEST FIT FOR

BEAST
PRO: Need to specify a prior distribution over parameter values. BEAST provides
considerable flexibility in the specification of an evolutionary model. Performs
well on large data.
CON: Need to specify a prior distribution over parameter values. There is a
trade-off between a program's flexibility and its computational performance.
BEST FIT FOR: Estimation and hypothesis testing of evolutionary models from
molecular sequence data

BayesTraits
PRO: Allows for tracking of specific traits. Performs well on large data.
Correlated evolution between pairs of discrete binary traits.
CON: The user can only select either standard, conventional MCMC or
reversible-jump MCMC. Does not include clock models and needs a phylogeny input.
BEST FIT FOR: Evaluating evolutionary correlation, ancestral state reconstruction
in discrete morphological traits

FastDNAml
PRO: One of very few computational phylogenetics codes that scale well.
CON: Lack of clear documentation. Only works with two DNA sequence inputs, which
it compares.
BEST FIT FOR: Maximum likelihood inference of phylogenetic trees from DNA
sequence data.
CITATION
Drummond, A J, and M A Suchard. “Beast Software - Bayesian Evolutionary Analysis Sampling
Trees: Beast Documentation.” BEAST Software - Bayesian Evolutionary Analysis Sampling
Trees | BEAST Documentation, https://beast.community/.

Gary J. Olsen, Hideo Matsuda, Ray Hagstrom, Ross Overbeek, fastDNAml: a tool for
construction of phylogenetic trees of DNA sequences using maximum likelihood, Bioinformatics,
Volume 10, Issue 1, February 1994, Pages 41–48, https://doi.org/10.1093/bioinformatics/10.1.41

Pagel, Mark. BayesTraits V4.0.1 Feb 2023 Documentation,
http://www.evolution.reading.ac.uk/BayesTraitsV4.0.1/BayesTraitsV4.0.1.html.

Elizabeth Martínez-Bautista, Abduction as Phylogenetic Inference: Epistemological Perspectives
in Scientific Practices, Handbook of Abductive Cognition, 10.1007/978-3-031-10135-9_56,
(1651-1679), (2023).

Wikimedia Foundation. (2023, January 15). List of phylogenetics software. Wikipedia. Retrieved
April 23, 2023, from https://en.wikipedia.org/wiki/List_of_phylogenetics_software
CITATION (continued…)
Drummond, A.J., Rambaut, A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC
Evol Biol 7, 214 (2007). https://doi.org/10.1186/1471-2148-7-214

Imke Schmitt, Ruth del Prado, Martin Grube, H. Thorsten Lumbsch, Repeated evolution of
closed fruiting bodies is linked to ascoma development in the largest group of lichenized fungi
(Lecanoromycetes, Ascomycota), Molecular Phylogenetics and Evolution, Volume 52, Issue 1,
2009, Pages 34-44, ISSN 1055-7903, https://doi.org/10.1016/j.ympev.2009.03.017.

Marco Fabio Ortiz-Ramírez, Luis A. Sánchez-González, Gabriela Castellanos-Morales, Juan
Francisco Ornelas, Adolfo G. Navarro-Sigüenza, Concerted Pleistocene dispersal and genetic
differentiation in passerine birds from the Tres Marías Archipelago, Mexico, The Auk, Volume
135, Issue 3, 1 July 2018, Pages 716–732, https://doi.org/10.1642/AUK-17-190.1
Thank You!
