maximum parsimony example
[Figure: three panels labelled I, II and III]
    1 2 3
A : A A A
B : A C A
C : C A C
D : C C C
steps of maximum parsimony
• Construct all possible trees
• Determine the length of each tree
• Find the shortest tree
• If more than one tree has the shortest length, those trees are equally parsimonious
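The steps above can be sketched for the four-taxon example alignment. This is a minimal illustration, not the slides' own procedure: it assumes the three candidate trees are the three unrooted topologies for taxa A-D, and it scores each site by brute force over the two internal-node states instead of using a faster algorithm. The names `ALIGNMENT`, `TREES`, `site_length` and `tree_length` are hypothetical.

```python
from itertools import product

# Example alignment: taxon -> states at sites 1, 2, 3.
ALIGNMENT = {"A": "AAA", "B": "ACA", "C": "CAC", "D": "CCC"}

# Step 1: all possible unrooted trees for four taxa, each written as
# two pairs of taxa joined by a single internal branch (assumed layout).
TREES = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]


def site_length(tree, site):
    """Minimum number of changes at one site, trying every pair of internal states."""
    (a, b), (c, d) = tree
    sa, sb, sc, sd = (ALIGNMENT[taxon][site] for taxon in (a, b, c, d))
    return min(
        # Four tip branches plus the internal branch between u and v.
        (u != sa) + (u != sb) + (v != sc) + (v != sd) + (u != v)
        for u, v in product("ACGT", repeat=2)
    )


def tree_length(tree):
    """Step 2: tree length is the sum of the site lengths."""
    n_sites = len(next(iter(ALIGNMENT.values())))
    return sum(site_length(tree, site) for site in range(n_sites))


lengths = {tree: tree_length(tree) for tree in TREES}

# Steps 3 and 4: keep every tree that reaches the minimum length.
shortest = min(lengths.values())
equally_parsimonious = [t for t, n in lengths.items() if n == shortest]

for tree, n in lengths.items():
    print(tree, "length =", n)
print("most parsimonious:", equally_parsimonious)
```

Under these assumptions the tree grouping A with B has length 4, while the other two have lengths 5 and 6, so there is a single most parsimonious tree.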
maximum parsimony - challenges
find tree length - Fitch's Algorithm
[Figure: Fitch's algorithm applied to one tree in successive steps. Node state sets appear in the order (A,C), (A,G), (C,A,G) and, last, (A,C); the running length is 1, 2, 3 and stays 3 at the final step.]
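A minimal sketch of the bottom-up pass of Fitch's algorithm may help here. Each node receives the intersection of its children's state sets when that intersection is non-empty; otherwise it receives their union and the length increases by one. The leaf arrangement in `example` is an assumption chosen so that the node sets match those listed on the slides; the original tree figure is not available, and the function name `fitch` is hypothetical.

```python
def fitch(tree):
    """Return (state_set, length) for a rooted binary tree.

    A tree is either a single-letter leaf state or a 2-tuple of subtrees.
    """
    if isinstance(tree, str):
        return {tree}, 0

    left, right = tree
    left_set, left_len = fitch(left)
    right_set, right_len = fitch(right)

    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_len + right_len
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_len + right_len + 1


# Hypothetical leaf layout (an assumption, not taken from the slides).
example = (("A", "C"), (("A", "G"), "C"))

root_set, length = fitch(example)
print(sorted(root_set), "length =", length)  # ['A', 'C'] length = 3
```

With this layout the internal sets are {A,C}, {A,G} and {A,C,G}, the root set is {A,C}, and the tree length is 3.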
maximum likelihood
• Choose the tree that makes the data most probable
• The likelihood of a set of data, D, is the probability of the data, given a
hypothesis, θ. The hypothesis will usually come in the form of different
parameters. We denote the likelihood, L, of a set of data, D, as
L = P(D | θ)
maximum likelihood - coin toss
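As an illustration of L = P(D | θ), the sketch below treats θ as the probability p of heads and D as the number of heads in a series of tosses. The data (7 heads in 10 tosses) and the grid search are assumptions made for this example, not values from the slides; `likelihood` is a hypothetical helper.

```python
from math import comb


def likelihood(p, heads, tosses):
    """Binomial probability of the data given the parameter p = P(heads)."""
    return comb(tosses, heads) * p**heads * (1 - p) ** (tosses - heads)


heads, tosses = 7, 10  # hypothetical data

# Evaluate the likelihood on a grid of candidate parameter values
# and keep the value that makes the data most probable.
grid = [i / 100 for i in range(101)]
p_hat = max(grid, key=lambda p: likelihood(p, heads, tosses))

print("L(p = 0.5) =", likelihood(0.5, heads, tosses))
print("maximum likelihood estimate of p:", p_hat)
```

On this grid the likelihood peaks at p = 0.7, the observed fraction of heads.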
• Model:
A model of how one ancestral sequence has evolved into the three sequences above
maximum likelihood in phylogeny
• Model parameters
⚬ Tree topology and branch lengths
⚬ Nucleotide frequencies (π)
⚬ Nucleotide substitution rates
computing probability of one column
• Assume that there were ancestral states that evolved to give these
nucleotides
• Assume a tree topology, branch lengths and other parameters, and start
computing at any node.
computing probability of one column
• We assumed that the ancestral nucleotides were both A, but we don't know that for sure
• Redo the computation for every possible combination of ancestral nucleotides
• Sum the probabilities over all of these combinations
• With 2 internal nodes and 4 nucleotides, there are 4 × 4 = 16 possible combinations
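The summation over the 16 ancestral combinations can be sketched as follows. The slides do not name a substitution model, so this example assumes the Jukes-Cantor model (equal nucleotide frequencies, one substitution rate) and a tree with four tips and two internal nodes u and v. The branch lengths and the names `jc_prob` and `column_likelihood` are hypothetical.

```python
from itertools import product
from math import exp

NUCLEOTIDES = "ACGT"
PI = 0.25  # equal nucleotide frequencies (Jukes-Cantor assumption)


def jc_prob(x, y, t):
    """Jukes-Cantor probability that state x becomes y along a branch of length t."""
    e = exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if x == y else 0.25 - 0.25 * e


def column_likelihood(column, branch_lengths):
    """Likelihood of one alignment column on the tree ((a,b)u,(c,d)v).

    column: tip states (a, b, c, d)
    branch_lengths: (u-a, u-b, v-c, v-d, u-v)
    """
    a, b, c, d = column
    t_a, t_b, t_c, t_d, t_uv = branch_lengths
    total = 0.0
    # Sum over all 4 x 4 = 16 combinations of the two ancestral states.
    for u, v in product(NUCLEOTIDES, repeat=2):
        total += (
            PI
            * jc_prob(u, a, t_a) * jc_prob(u, b, t_b)
            * jc_prob(u, v, t_uv)
            * jc_prob(v, c, t_c) * jc_prob(v, d, t_d)
        )
    return total


branches = (0.1, 0.1, 0.1, 0.1, 0.1)  # hypothetical branch lengths
print(column_likelihood(("A", "A", "C", "C"), branches))
```

Because the assumed model is reversible, the computation can start at either internal node and give the same value, which matches the remark that one may start computing at any node.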
computing probability of an entire alignment
• The probabilities of the individual columns are multiplied to give the probability of the entire alignment
L = L1 × L2 × ... × Ln
• Phylogeny software sums the logs of the column likelihoods instead, to prevent numerical underflow
ln(L) = ln(L1) + ln(L2) + ... + ln(Ln)
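A short sketch of why the log form is used: with many columns, the direct product falls below the smallest representable floating-point number and becomes zero, while the sum of logs stays finite. The per-column likelihoods here (2000 columns, each 0.001) are made-up values for illustration.

```python
import math

# Hypothetical per-column likelihoods for a 2000-column alignment.
site_likelihoods = [1e-3] * 2000

# Direct product L = L1 × L2 × ... × Ln underflows to 0.0.
product_l = 1.0
for l_i in site_likelihoods:
    product_l *= l_i

# Sum of logs ln(L) = ln(L1) + ln(L2) + ... + ln(Ln) stays finite.
log_l = sum(math.log(l_i) for l_i in site_likelihoods)

print("product:", product_l)
print("log-likelihood:", log_l)
```

The product prints as 0.0, so trees could no longer be compared, whereas the log-likelihood is about -13815.5 and remains usable.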
maximum likelihood - advantages
• The method is well suited to analyzing a simple data set containing genetic information.
• When the degree of variance among the genetic data is low, the maximum likelihood scores are reliable.
• The results generated through maximum likelihood can further confirm the maximum parsimony scores of a particular phylogenetic relationship.
• Maximum likelihood analysis can therefore act as a confirmatory test.
maximum likelihood - disadvantages