Multiple Sequence Alignment (MSA)

Published by Aayudh Das
Published by: Aayudh Das on Sep 07, 2013
Copyright: Attribution Non-commercial

©Aayudh Das
MULTIPLE SEQUENCE ALIGNMENT
A multiple sequence alignment is a collection of three or more protein (or nucleic acid) sequences that are partially or completely aligned.
 
Sum of Pair (SP) method-
 
Methods for applying multiple sequence alignment
Three important methods are:
1. Profiles.
2. PSI-BLAST.
3. Hidden Markov Models (HMMs).
Profiles
Profiles express the patterns inherent in a multiple sequence alignment of a set of homologous sequences. They have several applications.
Advantages-
1. They permit greater accuracy in alignments of distantly related sequences.
2. Sets of residues that are highly conserved are likely to be part of the active site, and give clues to function.
3. The conservation patterns facilitate identification of other homologous sequences.
4. Patterns from the sequences are useful in classifying subfamilies within a set of homologues.
5. Sets of residues that show little conservation, and are subject to insertion and deletion, are likely to be in surface loops. This information has been applied to vaccine design, because such regions are likely to elicit antibodies that will cross-react well with the native structure.
Working procedure-
The basic idea in using profile patterns to identify homologues is to match each query sequence from the database against the sequences in the alignment table, giving higher weight to positions that are conserved than to those that are variable. But the matching must not be too strict, because demanding exact agreement at every conserved position risks missing interesting distant relatives.
 
A quantitative measure of conservation-
For each position in the table of aligned sequences, take inventory of the distribution of amino acids.
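Taking inventory can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to build the table discussed in these notes: the function name `column_inventory` and the four-sequence `toy_alignment` are hypothetical, and gap characters (if any) would simply be counted like any other symbol.

```python
from collections import Counter


def column_inventory(alignment):
    """Return one Counter per alignment column, counting each residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) yields one tuple of residues per column.
    return [Counter(column) for column in zip(*alignment)]


# Hypothetical toy alignment, made up for illustration only.
toy_alignment = [
    "VDFSA",
    "VDFWA",
    "IDFSA",
    "VDFTA",
]

inventory = column_inventory(toy_alignment)
for position, counts in enumerate(inventory, start=1):
    print(position, dict(counts))
```

In this toy data, columns 2, 3 and 5 are fully conserved (a single residue with count 4), while column 4 is variable, which is the kind of distinction the inventory is meant to expose.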
It is evident that agreement at the highly conserved positions 26, 27 and 29 contributes a very high score, and disagreement at these positions contributes a very low score.

For moderately conserved positions, such as position 28, we want a modest positive contribution to the score if the query sequence has an S or a W at this position, and a smaller contribution if it has T or Y.
 
So the general idea is to score each residue from the query sequence based on the amino acid distribution at that position in the multiple sequence alignment table.
 
A simple approach would be to use the inventories as scores directly.
 
The sequence VDFSAE would score 13 + 16 + 16 + 7 + 16 + 4 = 72.
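The arithmetic above can be reproduced with a short Python sketch. Only the six counts quoted in the sum are taken from these notes; the counts of the other residues in each column are not given here, so the `partial_inventory` below leaves them out, and the function name `inventory_score` is hypothetical.

```python
def inventory_score(query, inventory):
    """Sum, over positions, how often the query's residue occurs in that column."""
    if len(query) != len(inventory):
        raise ValueError("query and inventory must cover the same positions")
    # A residue never seen in a column contributes 0.
    return sum(column.get(residue, 0) for residue, column in zip(query, inventory))


# Partial inventory: only the counts quoted in the text (13, 16, 16, 7, 16, 4).
# Counts for other residues at these positions are unknown and omitted.
partial_inventory = [
    {"V": 13},
    {"D": 16},
    {"F": 16},
    {"S": 7},
    {"A": 16},
    {"E": 4},
]

print(inventory_score("VDFSAE", partial_inventory))  # 72
```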
 
           
Thus, for each query sequence, we have to compute this inventory score for all possible alignments with the multiple alignment table, and take the largest total score.
It is obvious from these discussions that if the table contained a large and unbiased sample of sequences, then the inventory would provide the correct picture of the potential distribution of residues at each position.
With similar arguments we can say that if our sample were small, the pattern derived would be unlikely to reflect the complete repertoire.
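The search over possible alignments of a query with the table can be sketched as follows. This is a simplified illustration under a stated assumption: only ungapped placements are tried, that is, the table is slid along the query without insertions or deletions. The function name `best_ungapped_placement` and the query string are hypothetical, and the inventory again contains only the six counts quoted earlier in these notes.

```python
def best_ungapped_placement(query, inventory):
    """Try every ungapped offset of the table along the query.

    Returns (best_score, best_offset); ties keep the earliest offset.
    """
    width = len(inventory)
    if len(query) < width:
        raise ValueError("query is shorter than the alignment table")
    best_score, best_offset = -1, -1
    for offset in range(len(query) - width + 1):
        window = query[offset:offset + width]
        score = sum(col.get(res, 0) for res, col in zip(window, inventory))
        if score > best_score:
            best_score, best_offset = score, offset
    return best_score, best_offset


# Only the counts quoted in the text; other residue counts are unknown.
partial_inventory = [
    {"V": 13}, {"D": 16}, {"F": 16}, {"S": 7}, {"A": 16}, {"E": 4},
]

# Hypothetical query that contains VDFSAE starting at offset 2.
print(best_ungapped_placement("MKVDFSAEGT", partial_inventory))  # (72, 2)
```

Allowing gaps would require a dynamic programming alignment of the query against the table instead of this simple sliding window.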
