Phylogeny
Lars Arvestad
arve@nada.su.se, lars.arvestad@scilifelab.se
Goals of this lecture
● Interpret results
Features
● Numbers: Reliability
Pretty ≠ good
Quick tree at NCBI
● Bad method?
● Doubtful alignment
● Good start?
Visualizing evolution
(with Dendroscope)
Branch lengths
● Length b is proportional to the average number of mutations per site
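As an illustration of "mutations per site", here is a minimal sketch that estimates such a length for a pair of aligned sequences. It assumes the Jukes-Cantor (JC69) model, which the slides do not name; the function names are made up for this example. The observed fraction of differing sites underestimates the true number of changes, because repeated changes at one site are invisible, and the correction compensates for that.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned sites that differ, skipping sites with a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc69_distance(p: float) -> float:
    """Estimated substitutions per site under JC69 (an assumed model)."""
    if p >= 0.75:
        # Beyond 75% differences the sequences are saturated under JC69.
        raise ValueError("p >= 0.75: distance undefined under JC69")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


p = p_distance("ACGTACGTAC", "ACGTACGTAA")  # one difference in ten sites
d = jc69_distance(p)                        # slightly larger than p
```

The corrected distance is always at least as large as the observed fraction, and the gap grows as sequences diverge.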
Molecular clock?
● Dubious assumption: evolution has a constant rate
● Works for some data
– Pseudogenes?
– Introns?
– Reasonable in 3rd codon position
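One consequence of a strict clock is that, in a rooted tree, every leaf lies at the same distance from the root. The sketch below checks that property on toy trees. The tree encoding (a leaf is a string, an inner node is a list of (child, branch length) pairs) and the branch lengths are assumptions made for this example only. On real data the distances are never exactly equal, so a statistical test is needed instead of a fixed tolerance.

```python
def root_to_tip(node, dist=0.0):
    """Return {leaf: distance from the root} for a rooted tree.

    A node is a leaf name (str) or a list of (child, branch_length) pairs.
    """
    if isinstance(node, str):
        return {node: dist}
    depths = {}
    for child, length in node:
        depths.update(root_to_tip(child, dist + length))
    return depths


def is_clocklike(tree, tol=1e-9):
    """True if all root-to-tip distances agree within tol."""
    depths = root_to_tip(tree).values()
    return max(depths) - min(depths) <= tol


# Toy trees with made-up branch lengths.
clock_tree = [([("A", 0.1), ("B", 0.1)], 0.2), ("C", 0.3)]
nonclock_tree = [([("A", 0.1), ("B", 0.4)], 0.2), ("C", 0.3)]
```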
Methods for inferring evolution
● Parsimony
≈ simplicity, frugality. Which is the simplest tree?
● Distance based
Two steps: estimate pairwise distances, then puzzle together the most reasonable tree.
● Probability based
Find the most likely tree.
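To make the parsimony idea concrete, here is a minimal sketch of Fitch's algorithm, one standard way to count the fewest changes a given tree needs to explain an alignment. The slides do not say which algorithm is meant, so this is an assumption; the tree encoding (nested pairs of leaf names) and the toy alignment are also invented for the example. The "simplest" tree is the one with the lowest score.

```python
def fitch(node, states):
    """Return (possible_states, changes) for one alignment column.

    node: a leaf name (str) or a pair (left, right).
    states: {leaf name: character in this column}.
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    lset, lcost = fitch(left, states)
    rset, rcost = fitch(right, states)
    common = lset & rset
    if common:
        return common, lcost + rcost
    # No shared state: at least one change is needed at this node.
    return lset | rset, lcost + rcost + 1


def parsimony_score(tree, alignment):
    """Sum the minimum number of changes over all columns."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


# Toy alignment, made up for illustration.
alignment = {"A": "AAG", "B": "AAG", "C": "CAT", "D": "CAT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
score_1 = parsimony_score(tree_1, alignment)
score_2 = parsimony_score(tree_2, alignment)
```

Here tree_1 groups the identical sequences and needs fewer changes than tree_2, so parsimony prefers it.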
State of the art
Probability-based methods
Disadvantages
● Slow?
● How much do you need to compute? (MCMC in particular)
Reliability
● For “common” variables: confidence intervals, standard deviation, etc.
● Bootstrap evaluates edges, not clades!
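The bootstrap point can be sketched as follows. An edge of an unrooted tree corresponds to a split of the leaves into two groups, and support for an edge is the fraction of bootstrap replicates whose tree contains the same split. The code shows only the resampling and the counting; inferring a tree from each replicate is left out and would be done with whatever method is being evaluated. All function names are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """One replicate: resample alignment columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in cols)
            for name, seq in alignment.items()}


def canonical_split(side, all_leaves):
    """Represent an edge by the side of its split that lacks a fixed
    reference leaf, so both sides of one edge give the same key."""
    side = frozenset(side)
    reference = min(all_leaves)
    if reference in side:
        return frozenset(all_leaves) - side
    return side


def edge_support(split, replicate_splits):
    """Fraction of replicate trees (each a set of splits) with this edge."""
    return sum(split in splits for splits in replicate_splits) / len(replicate_splits)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicate = bootstrap_alignment({"A": "ACGT", "B": "ACGA"}, rng)
```

Because a split has no direction, the value says nothing about where the root is, which is why it describes an edge and not a clade.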
Rooting of phylogenies
● A reconciliation explains how one tree (e.g. a gene tree) depends on another (e.g. a species tree).
● Reconciliations decide which gene nodes are speciations.
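A minimal sketch of one common way to reconcile a gene tree with a species tree, the LCA mapping. The slides do not name a method, so this choice is an assumption, as are the tree encoding (nested pairs of leaf names) and the toy gene names. Each gene node is mapped to the smallest species-tree node covering the species below it; a gene node that maps to the same place as one of its children is called a duplication, otherwise a speciation.

```python
def clusters(tree):
    """Leaf sets of all nodes in a nested-pair tree; the root's set is last."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    left, right = tree
    lc, rc = clusters(left), clusters(right)
    return lc + rc + [lc[-1] | rc[-1]]


def lca(species_clusters, species_set):
    """Smallest species-tree node (as a leaf set) containing species_set."""
    return min((c for c in species_clusters if species_set <= c), key=len)


def reconcile(gene_tree, gene_to_species, species_clusters, events):
    """Map a gene node to a species node; append (node, event) to events."""
    if isinstance(gene_tree, str):
        return frozenset([gene_to_species[gene_tree]])
    left, right = gene_tree
    ml = reconcile(left, gene_to_species, species_clusters, events)
    mr = reconcile(right, gene_to_species, species_clusters, events)
    m = lca(species_clusters, ml | mr)
    events.append((gene_tree, "duplication" if m in (ml, mr) else "speciation"))
    return m


# Toy data: gene names and their species are invented for illustration.
species_tree = (("human", "mouse"), "chicken")
gene_tree = ((("h1", "m1"), ("h2", "m2")), "c1")
gene_to_species = {"h1": "human", "h2": "human",
                   "m1": "mouse", "m2": "mouse", "c1": "chicken"}
events = []
reconcile(gene_tree, gene_to_species, clusters(species_tree), events)
labels = [event for _, event in events]
```

In this toy case the node joining the two human/mouse pairs is labelled a duplication and the other inner nodes speciations. Genes whose last common ancestor is a speciation node are orthologs; those separated by a duplication are paralogs.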
Orthology and other terms
http://code.google.com/p/jprime/
Phylogeny quality
Final comments
● Homologs are a requirement!
● Alignments are important: Look at them!
● It is OK to remove "noise" from an alignment. Use domains if needed.
● It is good to use complementary methods
● We discussed models of evolution. They can be compared; maximum likelihood (ML) is good for this.
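One way to compare models once each has a maximized likelihood is the Akaike information criterion, AIC = 2k - 2 ln L, where k is the number of free parameters. The slides do not specify a criterion, so AIC is an assumption here, and the log-likelihood values below are placeholders invented for the example, not results from any data set. Only the free substitution-model parameters are counted, since branch lengths are shared when all models are fitted on the same tree.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion; lower is better."""
    return 2 * n_params - 2 * log_likelihood


# model: (maximized log-likelihood, free model parameters)
# The log-likelihoods are hypothetical placeholders, NOT real results.
fits = {
    "JC69": (-2450.0, 0),  # no free parameters
    "K80": (-2420.0, 1),   # transition/transversion ratio
    "GTR": (-2419.5, 8),   # 5 relative rates + 3 base frequencies
}
scores = {model: aic(lnl, k) for model, (lnl, k) in fits.items()}
best = min(scores, key=scores.get)
```

With these placeholder numbers the richest model fits only marginally better than K80, so the penalty for its extra parameters makes K80 the preferred choice.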