MOLECULAR PHYLOGENETIC ANALYSIS
LECTURE 3

Dr RABAR MANTIK
OBJECTIVES
1. Decide which organisms and sequences to use in the analysis.
2. Obtain the required sequences experimentally or from databases.
3. Assemble these sequences in a multiple-sequence alignment.
4. Use this alignment to generate phylogenetic trees.
• Molecular phylogenetic analysis is the use of macromolecular structure (usually nucleotide or amino acid sequences) to reconstruct the phylogenetic relationships between organisms. The extent of difference between homologous DNA, RNA, or protein sequences in different organisms is used as a measure of how much these organisms have diverged from one another in evolutionary history.
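One simple way to put a number on the "extent of difference" between two homologous sequences is the proportion of aligned sites at which they differ (often called the p-distance). The sketch below assumes the two sequences are already aligned to the same length and that "-" marks a gap; the function name and the toy sequences are invented for illustration and are not real data.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ("-") are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return differences / compared


# Toy aligned fragments (hypothetical, for illustration only)
organism_1 = "ACGTTGCAAC"
organism_2 = "ACGTCGCAAT"
print(p_distance(organism_1, organism_2))  # 2 differences over 10 sites
```

Real analyses usually correct this raw proportion for multiple substitutions at the same site; the uncorrected value is used here only to keep the idea visible.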
SELECTING AN ORGANISM AND ITS SEQUENCE
• The sequences of genes, RNAs, or proteins contain two very different kinds of information: structural/functional information and historical information.
• Historical information: a protein could often be built in several alternative ways, but only one is actually used, because that is the version inherited from a successful ancestor.
• Some sequences carry more phylogenetic information than others; these sequences can be called "molecular clocks."
GOOD MOLECULAR CLOCKS SHOULD HAVE
• Clock-like behavior
• Over time, the sequences of genes, RNAs, and proteins change.
• These changes can be entirely random.
• The amount of divergence between any particular sequence in two organisms should therefore be a measure of how long ago these organisms diverged from their common ancestor.
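The link between divergence and time can be made concrete under a strict-clock assumption: if substitutions accumulate at a constant rate r per site per unit time in each lineage, two lineages separated for time T differ by roughly d = 2rT, so T = d / (2r). The sketch below only illustrates that arithmetic; the function name and the rate value are hypothetical, and real rates have to be calibrated from independent evidence.

```python
def divergence_time(distance: float, rate_per_lineage: float) -> float:
    """Estimate time since the common ancestor under a strict molecular clock.

    distance          -- substitutions per site separating the two sequences
    rate_per_lineage  -- substitutions per site per unit time, per lineage
    The factor 2 accounts for change accumulating along both lineages.
    """
    if distance < 0:
        raise ValueError("distance cannot be negative")
    if rate_per_lineage <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate_per_lineage)


# Hypothetical numbers: 0.2 substitutions per site, and an assumed rate of
# 0.01 substitutions per site per million years in each lineage.
print(divergence_time(0.2, 0.01), "million years (illustrative only)")
```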
CLOCK-LIKE BEHAVIOR DEPENDS ON
• Functional properties of the sequence: a sequence whose function changes undergoes large, selected (nonrandom) sequence changes, which disrupt the clock.
• Length: the sequence should be long enough to give statistically significant information.
• Independent parts: a long sequence should be made up of several parts, so that random change in one part does not influence the others.
• A limited amount of variation in the sequence:
• Too little variation gives little information and is not statistically significant.
• Too much variation makes alignment difficult or impossible and decreases the reliability of the treeing algorithm.

• Phylogenetic range: the sequence must be present and identifiable in all of the organisms to be analyzed and must exhibit clock-like behavior within this range.
• Absence of horizontal transfer: the gene must be acquired only by inheritance from parent to offspring, not by transfer from one organism to another.
• Availability of sequence information: it is of great pragmatic importance to choose a sequence for which much of the required sequence data is already available, annotated, and perhaps already aligned.
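The "limited amount of variation" criterion can be checked roughly before building a tree. The sketch below counts the fraction of alignment columns that contain more than one residue type and labels the result. The function names and the thresholds (0.05 and 0.60) are arbitrary assumptions chosen for illustration, not values from the lecture.

```python
from typing import Sequence


def variable_site_fraction(alignment: Sequence[str]) -> float:
    """Fraction of alignment columns containing more than one residue type."""
    if len(alignment) < 2:
        raise ValueError("need at least two aligned sequences")
    n_cols = len(alignment[0])
    if n_cols == 0 or any(len(seq) != n_cols for seq in alignment):
        raise ValueError("sequences must be non-empty and of equal length")
    variable = 0
    for column in zip(*alignment):
        # Gaps ("-") are ignored when deciding whether a column varies.
        residues = {c.upper() for c in column if c != "-"}
        if len(residues) > 1:
            variable += 1
    return variable / n_cols


def screen_variation(alignment: Sequence[str],
                     low: float = 0.05, high: float = 0.60) -> str:
    """Label an alignment by its variability (thresholds are hypothetical)."""
    fraction = variable_site_fraction(alignment)
    if fraction < low:
        return "too little variation"
    if fraction > high:
        return "too much variation"
    return "usable"


# Toy alignment (invented): two of the four columns vary.
print(screen_variation(["ACGT", "ACGA", "ACTA"]))
```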
THE STANDARD: SMALL-SUBUNIT RIBOSOMAL RNA
• In most cases, the best molecular clock for phylogenetic analysis is the small-subunit ribosomal RNA (SSU rRNA).
• This sequence is almost always the best starting point.
• The SSU rRNA is so often the sequence of choice for the following reasons:
1. It is present in all living cells.
2. It has the same function in all cells.
3. It comprises 1,500 to 2,000 residues: large enough to be statistically useful but not so large as to be onerous to sequence.
4. It is made up of ca. 50 independently evolving helices and ca. 500 independently evolving base pairs.
5. It is conserved highly enough in sequence and structure to be easily and accurately aligned.
6. It contains both rapidly and slowly evolving regions: the rapidly evolving regions are useful for determining close relationships, whereas the slowly evolving regions are useful for determining distant relationships.
7. Horizontal transfer of rRNA genes is exceedingly rare (most genes of the central information-processing pathways of the cell are also resistant to horizontal transfer).
8. Huge data sets of sequences, alignments, and analysis tools are available.
DECIDING WHICH ORGANISMS TO INCLUDE
• Most often, you start by generating a tree with representatives from a wide range of organisms scattered around the tree, in order to identify in very general terms what kind of organism yours is.
• Then you replace most of these disparate representatives with representatives that you now know are likely to be closely related.
• The resulting tree gives you more specific information about the group to which your organism belongs, which can be used again to choose even closer relatives, and so on until you are satisfied with the representation of the tree.
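The narrowing-down step described above can be sketched in a simplified form. Instead of reading relationships from a tree, this example ranks reference sequences by their pairwise distance to the query and keeps the closest ones as candidates for the next round. That ranking is a stand-in for the tree-based procedure, and all names and sequences here are hypothetical.

```python
from typing import Dict, List


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two equal-length aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def closest_references(query: str, references: Dict[str, str], keep: int) -> List[str]:
    """Return the names of the `keep` references most similar to the query."""
    ranked = sorted(references, key=lambda name: p_distance(query, references[name]))
    return ranked[:keep]


# Round 1: a broad, invented sample of reference sequences (already aligned).
broad_sample = {
    "bacterium_A": "ACGTACGTAC",
    "bacterium_B": "ACGTACCTTT",
    "archaeon_C": "TTGTACCTTT",
    "eukaryote_D": "GGGTTCCTTA",
}
unknown = "ACGTACGTAT"

# The closest references suggest which group to sample more densely next.
print(closest_references(unknown, broad_sample, keep=2))
```

In a later round, the dictionary would be refilled with more members of the group suggested by the first round, and the same ranking repeated.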
