Obayori
1. CONCEPTUAL CLARIFICATIONS
iii. Nomenclature – the part of taxonomy that deals with assigning names to taxa at the
various taxonomic ranks
Importance of Taxonomy
Homo sapiens
Escherichia coli
Pseudomonas aeruginosa
Lysinibacillus fusiformis
Kingdom
Phylum
Class
Order
Family
Genus
Species
Remember!
In the modern system:
i. Domain is the highest level, and there are 3 Domains
ii. Species is the basic taxonomic unit of a classification system
• Family – a group of related genera
• Genus – a group of related species
OR
A well-defined group of one or more species that is clearly separated from other
genera
• Species – a group of organisms of the same kind
OR
A collection of strains that share many stable properties and differ significantly from
other groups of strains
OR
A collection of strains with similar G+C composition and at least 70% similarity as
judged by DNA-DNA hybridization.
• Strains – variants of the same species
iv. Systematics – the study of organisms with the ultimate objective of characterizing
and arranging them in an orderly manner.
OR
It can also be defined as the comparative study of the diversity of organisms, with the
aim of establishing a logical system within which organisms can be described and
classified (Atlas, 1995)
2. APPROACHES TO CLASSIFICATION:
ii. Phylogenetic:
Emphasises evolutionary relationships and is based on a collection of evolutionary
evidence
Groups reflect genetic similarity and evolutionary relatedness
B. Molecular characteristics
i. Percentage G+C content
ii. Nucleic acid sequence
iii. Nucleic acid hybridization
iv. Protein comparison
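The first molecular characteristic listed above, percentage G+C content, is simple arithmetic on a nucleotide sequence. The sketch below is only an illustration: the function name `gc_percent` and the example sequence are hypothetical, and ambiguous symbols such as N are assumed to be ignored rather than counted.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C among the unambiguous bases of seq."""
    # Assumption: only A, C, G, T and U are counted; anything else is skipped.
    counted = [base for base in seq.upper() if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no A, C, G, T or U")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Hypothetical six-base fragment: 4 of its 6 bases are G or C.
example = "ATGCGC"
print(f"%G+C = {gc_percent(example):.1f}")
```

In practice the value is computed over a whole genome rather than a short fragment.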
i. The function of ribosomes has not changed for 3.8 billion years
ii. 16S rRNA genes are universally present among all cellular forms
iii. The size of about 1540 nucleotides makes them easy to analyse
iv. The primary structure is an alternating sequence of invariant, more or
less conserved, and highly variable regions
v. Lateral gene transfer is either totally absent or exceedingly rare
(Philp et al., 2005)
Limitations of the 16S rRNA gene: sequence similarity between 16S rRNA genes may not
reflect relatedness
Other markers gaining relevance include: 23S rRNA gene, gyrB, rpoB, dnaK,
dsrAB and 16S-23S rDNA ISR
Advantages
• Numerical taxonomy has the power to integrate data from diverse sources, such as
morphology, physiology, chemistry and molecular biology.
• Automation makes for greater efficiency
• Being quantitative, the methods provide greater discrimination along the spectrum of
taxonomic differences and are more sensitive in delimiting taxa.
• Numerical taxonomy has led to the reinterpretation of a number of biological concepts
and to the posing of new biological and evolutionary questions.
An association coefficient is used to estimate the degree of similarity between taxonomic units
Simple Matching Coefficient (Ssm)
$$S_{sm} = \frac{(++) + (--)}{(++) + (--) + (+-) + (-+)}$$
++ = Positive matches
-- = Negative matches
+-; -+ = Mismatches
Jaccard Coefficient (Sj)
$$S_{j} = \frac{(++)}{(++) + (+-) + (-+)}$$
• Counting negative matches (--) in Ssm can make organisms that are not similar appear
similar, simply because they lack the same characters
• The Jaccard coefficient eliminates this by ignoring negative matches
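The two coefficients can be compared on a small worked case. This is a minimal sketch, assuming each organism is scored as a list of 1 (character present) and 0 (character absent). The function names and the two strain profiles are hypothetical and chosen only to show the effect of negative matches.

```python
def match_counts(x, y):
    """Count (++), (--), (+-) and (-+) between two presence/absence profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must score the same characters")
    pp = sum(1 for a, b in zip(x, y) if a and b)          # ++
    nn = sum(1 for a, b in zip(x, y) if not a and not b)  # --
    pn = sum(1 for a, b in zip(x, y) if a and not b)      # +-
    np_ = sum(1 for a, b in zip(x, y) if not a and b)     # -+
    return pp, nn, pn, np_


def simple_matching(x, y):
    """Ssm: all matches divided by all characters compared."""
    pp, nn, pn, np_ = match_counts(x, y)
    return (pp + nn) / (pp + nn + pn + np_)


def jaccard(x, y):
    """Sj: positive matches divided by characters present in at least one."""
    pp, _, pn, np_ = match_counts(x, y)
    if pp + pn + np_ == 0:
        raise ValueError("no character is present in either profile")
    return pp / (pp + pn + np_)


# Hypothetical profiles: ten characters, mostly absent in both strains.
strain_x = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
strain_y = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print("Ssm =", simple_matching(strain_x, strain_y))
print("Sj  =", round(jaccard(strain_x, strain_y), 2))
```

Here the strains share one positive character and seven absences, so Ssm is high while Sj is low.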
A  1.0
B  0.92  1.0
C  0.80  0.72  1.0
D  0.22  0.32  0.28  1.0
E  0.46  0.43  0.47  0.30  1.0
F  0.35  0.45  0.46  0.32  0.32  1.0
   A     B     C     D     E     F
iv. Polyphasic Approach:
Collectively, the genotypic, chemotaxonomic and phenotypic methods for determining the
taxonomic position of microbes constitute what is known as the polyphasic approach to
bacterial systematics (Prakash et al., 2007)
OR
The use of all available data, viz., genotypic and phenotypic, to determine phylogeny. The
data used depend on the aim of the study.
Techniques and markers used in modern polyphasic approaches for
resolving bacterial hierarchy (Prakash et al., 2007)
1. Chemotaxonomic markers (polyamines, quinones, polar lipids, fatty acids): up to genus level
2. DNA-DNA hybridization, %G+C, tDNA-PCR: up to species level
3. DNA probes, DNA sequencing: up to strain level
4. RNA gene sequencing: up to species level
5. Cell wall structure (teichoic acids, peptidoglycans): up to genus level
6. Restriction Fragment Length Polymorphism (RFLP), Pulsed-Field Gel Electrophoresis (PFGE),
ribotyping, DNA amplification, phage and bacteriocin typing, serological techniques: up to strain level
3. PHYLOGENETIC TREES
What is a phylogenetic tree?
A phylogenetic tree is an estimate of the relationships among taxa (or sequences) and their
hypothetical common ancestors (Hall, 2013).
OR
A phylogenetic tree is a statement about the evolutionary relationship between a set of
homologous characters of organisms.
OR
A tree-like structure that shows the evolutionary relationships among a set of organisms or
biomolecules
Figure 2: Parts of a phylogenetic tree
•Branch: defines the relationship between the taxa in terms of descent and ancestry
•Branch length (scaled trees only): represents the number of changes that have
occurred in the branch
•Clade: a group of two or more taxa or DNA sequences that includes both their
common ancestor and all of its descendants
Figure 3: Equivalent trees
An outgroup is a taxon outside the group of interest. An outgroup is useful in constructing
an evolutionary tree, typically to place its root.
Figure: Unrooted and rooted trees for taxa A, B and C
In a Phylogenetic tree:
4. BUILDING PHYLOGENETIC TREES
The data used for building a phylogenetic tree can be either
1. Molecular data, i.e. gene sequences or protein sequences, or
2. Distance data derived from morphological data, amino acid or nucleotide
substitutions, or phenotypic features
The methods for building phylogenetic trees fall into two groups:
Distance based methods (phenetic)
Character based methods (cladistic)
Distance based methods are more rapid and less computationally intensive. There is loss
of information because the characters are discarded once the distance matrix is computed.
Properties of a good tree building method
Efficiency - the faster, the more efficient.
Power - a powerful method produces a reasonable result with limited data.
Consistency - the method always converges on the right answer given enough data.
Robustness - violation of the method's assumptions does not necessarily result in poor
phylogenies.
Falsifiability - a good method should be able to reveal when its assumptions are violated.
Present that tree in such a way as to clearly convey the relevant information to others.
Figure: Phylogenetic tree building methods, including the Weighted Pair Group Method
using Arithmetic Mean (WPGMA)
UPGMA
This is the simplest tree building method. Strictly speaking, the algorithm is phenetic.
It is a sequential clustering algorithm. The clustering procedure:
• Initially, treat each species as a cluster on its own.
• Join the two closest clusters and recalculate the distance from the new cluster to
every remaining cluster by taking the average.
• Repeat this process until all species are connected in a single cluster.
Merits - it outputs a rooted tree and is very fast
Demerit - it assumes a constant rate of evolution of the sequences in all branches of the tree
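The clustering procedure can be sketched in a few lines. This is an illustrative sketch, not a reference implementation: the function name `upgma`, the nested-tuple tree format and the four-taxon distance matrix are all assumptions made for the example. The new distances are averages weighted by cluster size, which is the "unweighted" rule that gives UPGMA its name.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns the tree as nested 2-tuples and a list of (cluster, height) merges.
    """
    clusters = {name: 1 for name in names}  # cluster -> number of taxa in it
    d = dict(dist)
    merges = []
    while len(clusters) > 1:
        # Step 1: find the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        new = (a, b)
        merges.append((new, d[frozenset((a, b))] / 2))  # node height = half the distance
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # Step 2: distance from the new cluster to each remaining cluster is the
        # average over all member pairs, i.e. weighted by cluster size.
        for c in clusters:
            d[frozenset((new, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[new] = size_a + size_b
    (tree,) = clusters
    return tree, merges


# Hypothetical distances between four taxa.
names = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 2, frozenset(("A", "C")): 6, frozenset(("B", "C")): 6,
    frozenset(("A", "D")): 10, frozenset(("B", "D")): 10, frozenset(("C", "D")): 10,
}
tree, merges = upgma(names, dist)
print(tree)
for cluster, height in merges:
    print(cluster, "joined at height", height)
```

The example distances are ultrametric, meaning they are consistent with a constant rate of evolution. When real data violate that assumption, the same procedure still returns a rooted tree, but the tree may be wrong.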
MAXIMUM PARSIMONY (MP)
The method involves computing the minimum number of substitutions over all sites for
each topology
Merit - it is good with very distantly related sequences
Demerit - it is time consuming
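Counting the minimum number of substitutions for one topology can be sketched with Fitch's small-parsimony algorithm. The source does not name this algorithm, so it is used here as an assumption. The function names, the nested-tuple tree format and the four-sequence alignment are likewise hypothetical.

```python
def fitch_site_cost(tree, states):
    """Minimum substitutions at one site for a rooted binary tree (Fitch).

    tree:   nested 2-tuples whose leaves are taxon names (strings).
    states: dict mapping taxon name to its character at this site.
    """
    def visit(node):
        if isinstance(node, str):               # leaf: its observed state, no cost
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        shared = left_set & right_set
        if shared:                              # children agree: no extra change
            return shared, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def parsimony_length(tree, alignment):
    """Sum the per-site minimum over all sites of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site_cost(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_sites)
    )


# Hypothetical alignment of four taxa, three sites each.
alignment = {"A": "AAG", "B": "AAG", "C": "CAT", "D": "CGT"}
topologies = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]
lengths = [parsimony_length(t, alignment) for t in topologies]
best = topologies[lengths.index(min(lengths))]
print(lengths, best)
```

Every topology has to be scored this way before the shortest is chosen, and the number of possible topologies grows very quickly with the number of taxa. That is why the method is time consuming.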
MAXIMUM LIKELIHOOD (ML)
In this method, the likelihood of observing a given set of sequence data for a specific
substitution model is maximized for each topology and the topology that gives the
highest maximum likelihood is chosen as the ML tree.
- The method corrects for multiple mutational events at the same site. This makes it
suitable for reconstructing the relationships between sequences that have been
separated for a long time or are evolving rapidly.
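A small worked example shows what correcting for multiple mutational events at the same site means. It is not a maximum likelihood tree search. It only applies the Jukes-Cantor model, which the source does not name and which is assumed here purely for illustration, to convert the observed proportion of differing sites into an estimated number of substitutions per site. The function names are hypothetical.

```python
import math


def p_distance(seq1, seq2):
    """Observed proportion of sites that differ between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq1, seq2) if a != b)
    return differences / len(seq1)


def jukes_cantor_distance(p):
    """Estimated substitutions per site under the Jukes-Cantor model.

    The correction grows with p because repeated changes at one site are
    observed as at most one difference.
    """
    if not 0 <= p < 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# Observed differences versus corrected distance for increasing divergence.
for p in (0.05, 0.30, 0.60):
    print(f"observed {p:.2f} -> corrected {jukes_cantor_distance(p):.3f}")
```

The gap between the observed and corrected values widens as divergence increases. This is why a model-based method suits sequences that have been separated for a long time or are evolving rapidly.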
See online materials for phylogenetic tree building algorithms using UPGMA and NJ
5. BIOINFORMATICS
In bioinformatics, the kinds of data biologists work with include: sequence data (RNA, DNA,
protein); proteins; metabolites and metabolic pathways; enzymes; taxonomic information, etc.
Highlights on Bioinformatics
✓ Is essentially concerned with organizing data in databases such that researchers can
access current data and also submit new data
✓ It is concerned with building tools and resources to analyze data
✓ It helps to interpret data in a biologically useful manner, such that a global
analysis of data reveals common principles that apply across systems
(Benedik, 2010; S3)
Entrez - Molecular Biology Database System. It provides integrated access to nucleotide and
protein sequence data, gene-centered and genomic mapping information, 3D structure data, and
PubMed MEDLINE
RefSeq (NCBI) - Reference Sequence Database, an open access, annotated and curated
collection of publicly available sequences (DNA, RNA and their protein products).
www.ncbi.nlm.nih.gov/Entrez/
www.ncbi.nlm.nih.gov/
BLAST – Basic Local Alignment Search Tool - Explain BLASTn and BLASTp