Phylogenetic Trees (BIOINFORMATICS)

Aayudh Das
Phylogenetic Trees
Phylogeny - the evolutionary relationships among organisms, based on a common ancestor.
Phylogenetics - the area of research concerned with finding the genetic relationships between species.
Dendrogram - used in a broader sense for any tree-like diagram, e.g. to describe the appearance of a cladogram.
Cladogram - a tree of relationships drawn using the cladistic method.
Phylogram - a phylogenetic tree whose branch lengths are proportional to the amount of character change.
Chronogram - a phylogenetic tree whose branch lengths explicitly represent evolutionary time.
Steps of a molecular phylogenetic analysis

1. Decide which sequences to examine.
2. Determine the evolutionary distances between the sequences and build a distance matrix.
3. Construct the phylogenetic tree.
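The distance-matrix step can be sketched in a few lines of Python. This is only an illustration: it assumes the sequences are already aligned and uses the simple p-distance (proportion of differing sites), which the notes do not prescribe. The sequences and the names `p_distance` and `distance_matrix` are hypothetical.

```python
from itertools import combinations

def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore alignment columns where either sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def distance_matrix(seqs):
    """Return {(name_i, name_j): distance} for every unordered pair of names."""
    return {
        (i, j): p_distance(seqs[i], seqs[j])
        for i, j in combinations(sorted(seqs), 2)
    }

# Hypothetical aligned sequences, for illustration only.
seqs = {"s1": "ACGTACGTAC", "s2": "ACGTACGTTC", "s3": "ACGAACGTTG"}
dist = distance_matrix(seqs)
for pair, value in dist.items():
    print(pair, value)
```

The resulting dictionary plays the role of the distance matrix used by the distance-based methods described next.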
PHYLOGENETIC METHODS- Distance

The main distance-based methods include the unweighted pair group method with arithmetic mean (UPGMA) and neighbor joining (NJ).
Unweighted pair group method with arithmetic mean (UPGMA)

UPGMA is a distance-based method that needs a complete distance matrix. It is a simple sequential clustering algorithm: it repeatedly joins the two nearest clusters (species) until only one cluster is left.
The observed distance between any two sequences i and j can be denoted d_ij. The sum of the branch lengths on the tree between taxa i and j can be denoted d'_ij. Ideally, these two distance measures are the same, but phenomena such as the occurrence of multiple substitutions at a single position typically cause d_ij and d'_ij to differ.
Pick the smallest entry D_ij in the matrix. Join the two corresponding species through a new node and assign a branch length of D_ij/2 to each of the two branches.
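Compute new distances from the joined cluster to the other species using arithmetic means. For example, once B and R have been joined into the cluster (BR), the distances from W and S to that cluster are:

$$D_{W(BR)} = \frac{D_{WB} + D_{WR}}{2} = \frac{0.34 + 0.42}{2} = 0.38$$

$$D_{S(BR)} = \frac{D_{SB} + D_{SR}}{2} = \frac{0.29 + 0.44}{2} = 0.365$$

Repeating these steps until one cluster remains gives the tree.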
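The arithmetic-mean update can be checked with a small Python helper. The function name `upgma_merge_distance` and its cluster-size parameters are assumptions made for this sketch. The sizes matter only once clusters contain more than one species, where UPGMA weights each cluster by the number of taxa it holds.

```python
def upgma_merge_distance(d_ki, d_kj, n_i=1, n_j=1):
    """Distance from cluster k to the cluster formed by joining i and j.

    n_i and n_j are the numbers of taxa in clusters i and j; with single
    species (the default) this is the plain arithmetic mean.
    """
    return (n_i * d_ki + n_j * d_kj) / (n_i + n_j)

# Values from the example: B and R are joined, W and S are the other species.
d_w_br = upgma_merge_distance(0.34, 0.42)
d_s_br = upgma_merge_distance(0.29, 0.44)
print(round(d_w_br, 3), round(d_s_br, 3))
```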
Steps to create UPGMA

1. We begin with a distance matrix and identify the least dissimilar pair, i.e. the two OTUs i and j that are most closely related. In the example matrix, OTUs 1 and 2 have the smallest distance, so sequences 1 and 2 are connected. This allows us to name an internal node (node 6, on the right in panel (b) of the figure).
2. We then identify the next closest sequences (4 and 5) and connect them by a new node, 7.
3. The next smallest distance (value 0.3, shaded red in the matrix) corresponds to the union of taxon 3 with cluster (4,5), so sequence 3 is joined to cluster (4,5).
4. This newly formed group is represented on the emerging tree by new node 8. Finally, the two remaining clusters are joined, so that all sequences are connected in a rooted tree.
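The four steps above can be sketched as a short Python routine. This is a minimal illustration, not a reference implementation. The original matrix is not reproduced in these notes, so the distances below are hypothetical values chosen only so that the joins happen in the same order (1 with 2, then 4 with 5, then 3 with (4,5) at distance 0.3). The function name `upgma` and the nested-tuple tree format are also assumptions.

```python
from itertools import combinations

def upgma(names, dist):
    """Cluster taxa by UPGMA.

    names: list of taxon labels.
    dist:  dict mapping (a, b) -> distance for every unordered pair.
    Returns (tree as nested tuples, list of (left, right, node_height)).
    """
    size = {name: 1 for name in names}  # active clusters and their sizes
    d = {frozenset(pair): value for pair, value in dist.items()}
    merges = []
    while len(size) > 1:
        # Pick the closest pair of active clusters.
        a, b = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))] / 2  # new node sits at D_ab / 2
        new = (a, b)
        # Size-weighted arithmetic mean of distances to the joined cluster.
        for k in size:
            if k != a and k != b:
                d[frozenset((k, new))] = (
                    size[a] * d[frozenset((k, a))]
                    + size[b] * d[frozenset((k, b))]
                ) / (size[a] + size[b])
        size[new] = size.pop(a) + size.pop(b)
        merges.append((a, b, height))
    (root,) = size
    return root, merges

# Hypothetical distances for five OTUs, for illustration only.
names = ["1", "2", "3", "4", "5"]
dist = {
    ("1", "2"): 0.1, ("4", "5"): 0.2, ("3", "4"): 0.3, ("3", "5"): 0.3,
    ("1", "3"): 0.6, ("1", "4"): 0.6, ("1", "5"): 0.6,
    ("2", "3"): 0.6, ("2", "4"): 0.6, ("2", "5"): 0.6,
}
tree, merges = upgma(names, dist)
print(tree)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

Each entry in `merges` corresponds to one new internal node (nodes 6, 7, 8 and the root in the walkthrough). The branch length from a node to a child is the node height minus the child's height.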
Neighbour-joining Method (Saitou & Nei, 1987)

This is a method for reconstructing phylogenetic trees and computing the lengths of their branches. At each stage, the two nearest nodes of the tree are chosen and defined as neighbours. This is repeated until all of the nodes are paired together. Neighbours are defined as a pair of OTUs (operational taxonomic units, in other words leaves of the tree) that have only one interior node connecting them. For instance, in an unrooted tree where A and B share one interior node and C and D share another, A and B are neighbours and so are C and D, but A and D or B and C are not.
1. The process starts with a star-like tree; finding and joining neighbours is continued until the topology of the tree is completed.
2. A distance matrix is built. Its rows and columns represent nodes, and entry (i, j) is the distance between node i and node j.
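3. Pick the pair of nodes to join. In the method of Saitou & Nei this is the pair whose joining gives the smallest sum of branch lengths S_ij for the whole tree, computed from the matrix in step 2; this is not always the pair with the smallest raw distance. These two nodes are defined as neighbours. For example, if nodes A and B are chosen, we define them as neighbours and connect them through a new node, X.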
4. We compute the branch lengths for the branches that have been joined, A-X and B-X. We then repeat the process from step 2, identifying the next pair of neighbours, and so on.
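The sum of branch lengths for the tree in which OTUs 1 and 2 are joined, with N OTUs in total, is

$$S_{12} = \frac{1}{2(N-2)} \sum_{k=3}^{N} (d_{1k} + d_{2k}) + \frac{1}{2} d_{12} + \frac{1}{N-2} \sum_{3 \le i < j \le N} d_{ij}$$

Here i and j represent all sequences except 1 and 2, with i < j. For the example with five OTUs (A to E), joining A and B gives

$$S_{AB} = \frac{d_{AC} + d_{AD} + d_{AE} + d_{BC} + d_{BD} + d_{BE}}{6} + \frac{d_{AB}}{2} + \frac{d_{CD} + d_{CE} + d_{DE}}{3} = \frac{244}{6} + \frac{22}{2} + \frac{48}{3} \approx 67.7$$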
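The sum-of-branch-lengths criterion can be sketched in Python as follows. The full distance matrix of the example is not given in these notes, so the individual entries below are an assumption: they are chosen only so that the three partial sums equal the 244, 22 and 48 quoted above. The function name `nj_sum_of_branch_lengths` is hypothetical.

```python
from itertools import combinations

def nj_sum_of_branch_lengths(dist, names, a, b):
    """S_ab of Saitou & Nei (1987): total branch length if a and b are joined."""
    n = len(names)
    others = [x for x in names if x not in (a, b)]

    def d(x, y):
        return dist[frozenset((x, y))]

    to_others = sum(d(a, k) + d(b, k) for k in others)
    among_others = sum(d(i, j) for i, j in combinations(others, 2))
    return to_others / (2 * (n - 2)) + d(a, b) / 2 + among_others / (n - 2)

# Illustrative distances (assumed); only their sums are taken from the text.
names = ["A", "B", "C", "D", "E"]
raw = {
    ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
    ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
    ("C", "D"): 18, ("C", "E"): 20, ("D", "E"): 10,
}
dist = {frozenset(pair): value for pair, value in raw.items()}

s_ab = nj_sum_of_branch_lengths(dist, names, "A", "B")
best_pair = min(
    combinations(names, 2),
    key=lambda p: nj_sum_of_branch_lengths(dist, names, *p),
)
print(round(s_ab, 1), best_pair)
```

Under these assumed distances, D and E have the smallest raw distance, yet A and B give the smallest sum of branch lengths, which illustrates why neighbour joining does not simply join the closest pair.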
Maximum Parsimony
The best tree is the one with the shortest total branch length, i.e. the one that requires the fewest changes. Parsimony-based phylogeny can be based on morphological characters as well as on sequences. According to maximum parsimony theory, having fewer changes to account for the way a group of sequences evolved is preferable to more complicated explanations of molecular evolution.
In this method, the tree is chosen to minimize the number of changes required to explain the data. The trees are analyzed by searching for the ancestral sequences and by counting the number of mutations required to explain the respective trees.
[Figures: one candidate tree in which 4 mutations are postulated, and another in which 7 mutations are postulated.]
Difficulties:
In many cases, several trees may postulate the same number of mutations, fewer than any other tree. For such cases, this method does not give a unique answer. Considering quantitative probabilities, rather than merely counting mutational events, could therefore improve the approach to drawing a tree.
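Counting the mutations a given tree requires can be sketched with Fitch's algorithm, a standard technique that these notes do not name and that is used here as an assumption. The alignment, the two candidate trees, and the function names below are hypothetical and serve only to show how two trees are compared.

```python
def fitch_score(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    tree:   a leaf name (str) or a pair (left_subtree, right_subtree).
    states: dict mapping leaf name -> character at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_score(left, states)
    right_set, right_cost = fitch_score(right, states)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra change needed here.
        return common, left_cost + right_cost
    # The children disagree: one change is required on this part of the tree.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_length(tree, alignment):
    """Total number of changes over all sites of an alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_score(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    )

# Hypothetical alignment and candidate trees, for illustration only.
alignment = {"a": "AGT", "b": "AGT", "c": "CGA", "d": "CTA"}
tree_1 = (("a", "b"), ("c", "d"))
tree_2 = (("a", "c"), ("b", "d"))
len_1 = parsimony_length(tree_1, alignment)
len_2 = parsimony_length(tree_2, alignment)
print(len_1, len_2)
```

Under maximum parsimony the tree with the smaller count is preferred; when two trees tie, the method cannot choose between them, which is the difficulty noted above.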
Maximum Likelihood
Maximum likelihood is an approach designed to determine the tree topology and branch lengths that have the greatest likelihood of producing the observed data set. A likelihood is calculated for each residue (column) in an alignment, using some model of the nucleotide or amino acid substitution process. It is among the most computationally intensive but most flexible methods available. Maximum parsimony methods sometimes fail when there are large amounts of evolutionary change in different branches of a tree. Maximum likelihood, in contrast, provides a statistical model for evolutionary change that can vary across branches.
Given character data D and a model M, we want to find the tree T that maximizes the expression Pr[D|T,M].
Assumptions
Different characters evolve independently.
After species have diverged, they evolve independently.
Thus, if D_i is the data for the i-th character, then

$$\Pr[D \mid T, M] = \prod_{i} \Pr[D_i \mid T, M]$$
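The independence assumption turns the likelihood into a product over sites, or a sum of logs. The sketch below illustrates this for the smallest possible case, two sequences joined by a single branch of length t. It assumes the Jukes-Cantor (JC69) substitution model as the model M, which the notes do not specify, and the sequences and function names are hypothetical. A real analysis would also search over tree topologies.

```python
import math

def jc69_site_prob(x, y, t):
    """Probability of observing bases x and y at one site, branch length t.

    Assumes the JC69 model with equal base frequencies (1/4 each).
    """
    decay = math.exp(-4.0 * t / 3.0)
    p_xy = 0.25 + 0.75 * decay if x == y else 0.25 - 0.25 * decay
    return 0.25 * p_xy

def log_likelihood(seq_a, seq_b, t):
    """Sum of per-site log-probabilities (sites are treated as independent)."""
    return sum(math.log(jc69_site_prob(x, y, t)) for x, y in zip(seq_a, seq_b))

# Hypothetical aligned sequences differing at 1 of 10 sites.
seq_a = "ACGTACGTAC"
seq_b = "ACGTACGTTC"

# Crude grid search for the branch length that maximizes the likelihood.
grid = [i / 1000 for i in range(1, 2001)]
best_t = max(grid, key=lambda t: log_likelihood(seq_a, seq_b, t))

# Closed-form JC69 estimate for comparison, from the proportion of differences.
p = sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)
analytic_t = -0.75 * math.log(1 - 4 * p / 3)
print(best_t, round(analytic_t, 4))
```

The grid search and the closed-form estimate should agree to within the grid spacing, which is a useful sanity check on the per-site product.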
