Predictive method using protein sequence

[Flowchart]
1. Protein sequence -> database similarity search.
2. Does the sequence align with a protein of known structure?
   YES -> 3-D comparative modeling -> predicted 3-D structure model.
   NO  -> protein family analysis.
3. Relationship to known structure?
   YES -> 3-D comparative modeling -> predicted 3-D structure model.
   NO  -> structural analysis.
4. Is there a predicted structure?
   YES -> predicted 3-D structure model.
   NO  -> 3-D structure analysis in lab.
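The decision flow in the chart above can be sketched as a small function. This is only an illustration: the function name, the three boolean inputs, and the exact reading of the arrows are assumptions made here, not part of the original slides.

```python
def prediction_path(aligns_with_known_structure,
                    family_related_to_known_structure,
                    structure_predicted):
    """Return the steps visited in the flowchart (hypothetical encoding)."""
    path = ["database similarity search"]
    if aligns_with_known_structure:
        path.append("3-D comparative modeling")
        return path
    # No direct match to a protein of known structure.
    path.append("protein family analysis")
    if family_related_to_known_structure:
        path.append("3-D comparative modeling")
        return path
    # No family-level link to a known structure either.
    path.append("structural analysis")
    if structure_predicted:
        path.append("predicted 3-D structure model")
    else:
        path.append("3-D structure analysis in lab")
    return path


print(prediction_path(False, True, False))
```

Each later branch is reached only when the earlier, cheaper check fails, which mirrors the order of the slides that follow.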
Protein sequence

• Amino acid sequences of proteins are derived by translating cDNA
sequences or predicted gene structures in genomic DNA sequences.
• The amino acid sequence constitutes the primary structure of a
protein.
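As a minimal sketch of the translation step, the snippet below turns a cDNA coding sequence into an amino acid sequence with the standard genetic code. It assumes the reading frame starts at the first base and stops at the first stop codon; the function name `translate` is chosen here for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cdna):
    """Translate a coding sequence, assuming the frame starts at base 0."""
    cdna = cdna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        # Unknown or ambiguous codons are reported as "X".
        aa = CODON_TABLE.get(cdna[i:i + 3], "X")
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCAAGTAA"))  # MAK
```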
Database similarity search
• The amino acid sequence is used as a query in a database similarity
search against the proteins in the Protein Data Bank (PDB), all of
which have a known 3-D structure.

• A significant alignment of the query sequence with a PDB sequence is
evidence that the query sequence has a similar 3-D structure.

• If no relationship with a PDB protein is found, a second database
similarity search can be performed against a protein sequence database
such as Swiss-Prot. The matching sequences, both closely and more
distantly related, can then be used in a search against PDB sequences.

• The PSI-BLAST tool is used to find related sequences in a protein
database. The goal is to discover one or more database sequences that
are related both to the query and to a PDB sequence.
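The two-stage search described above can be sketched as follows. This is a toy under stated assumptions: `difflib.SequenceMatcher` stands in for a real alignment score, the threshold of 0.4 is arbitrary, and the databases are plain dictionaries with invented names. A real search would use a tool such as PSI-BLAST and judge hits by statistical significance.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Crude stand-in for an alignment score, in the range 0..1."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def hits(query, database, threshold):
    """Names of database sequences scoring at or above the threshold."""
    return [name for name, seq in database.items()
            if similarity(query, seq) >= threshold]


def find_structure_links(query, pdb, sequence_db, threshold=0.4):
    """Search PDB directly, then via intermediate sequences."""
    direct = hits(query, pdb, threshold)
    if direct:
        return [("query", name) for name in direct]
    links = []
    for intermediate in hits(query, sequence_db, threshold):
        # Re-use each related sequence as a new query against PDB.
        for name in hits(sequence_db[intermediate], pdb, threshold):
            links.append((intermediate, name))
    return links


# Hypothetical toy data: the query shares one half with sp_1,
# and sp_1 shares its other half with the PDB entry.
pdb = {"pdb_A": "TTTTTGGGGG"}
sequence_db = {"sp_1": "MKVLAGGGGG", "sp_2": "WWWWWWWWWW"}
print(find_structure_links("MKVLAHHHHH", pdb, sequence_db))
```

The intermediate sequence is what connects a query to a structure when the direct comparison shows no similarity at all.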
In case similarity is observed, proceed to 3-D comparative modeling


• If the database similarity search reveals a significant alignment
between the query sequence and a PDB sequence, the alignment can be
used to position the amino acids of the query sequence in the same
approximate 3-D structure.

• Sequence alignment of a new protein with a protein of known
structure provides a starting 3-D model of the protein. Using
computer graphics and protein modeling software, the amino acids can
then be positioned to accommodate available space and interactions
with neighboring amino acids.
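A minimal sketch of how an alignment positions query residues on a template: each query residue aligned to a template residue inherits that residue's coordinates. The function name, the gap character "-", and the use of one coordinate triple per residue are assumptions for illustration; real modeling software also builds loops and side chains and refines the result.

```python
def transfer_coordinates(query_aln, template_aln, template_coords):
    """Map query residue index -> template coordinates via an alignment."""
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    model = {}
    q_idx = t_idx = 0
    for q_res, t_res in zip(query_aln, template_aln):
        if q_res != "-" and t_res != "-":
            # Aligned pair: the query residue takes the template position.
            model[q_idx] = template_coords[t_idx]
        if q_res != "-":
            q_idx += 1
        if t_res != "-":
            t_idx += 1
    return model


# Hypothetical template with four residues and made-up coordinates.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = transfer_coordinates("MKGLV", "MK-LV", coords)
print(model)
```

Query residue 2 (the inserted G) has no entry in the result, which marks it as a position that still has to be modeled by other means.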
In case similarity is not observed, proceed to protein family analysis


• Proteins have been classified into families on the basis of sequence
similarity. The relationships are shown in multiple sequence alignments
of the proteins.

• Proteins of known 3-D structure have also been classified into fold
families on the basis of a common arrangement of secondary
structures.

• Sequences of proteins in the same fold family are often not similar
enough to be aligned.

• However, the individual proteins in a particular fold family are often
members of families based on sequence similarity. Hence these
similar sequences are also predicted to have the same structural fold
as the fold family.
If the new protein is a member of a protein family due to sequence similarity
• The new sequence is analyzed by testing it for patterns that
represent each family, using PSSM and profile HMM software.

• Websites such as InterPro include a large collection of patterns and
will search a new sequence for matches. 3-D PSSM includes a
powerful set of scoring matrices based on structural alignments, for
use in 3-D structure prediction.

• This website also provides links to related fold families, thus
identifying a predicted structure for the new protein. Other websites
employ a cluster analysis of proteins based on pairwise alignment
scores of all of the proteins in the Swiss-Prot database.

• These sites offer an alternative method of finding a relationship
between a new sequence and all of the other sequences in Swiss-Prot,
and thus of discovering a link to a known protein structure.
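To make the PSSM idea concrete, here is a minimal sketch that builds a log-odds position-specific scoring matrix from an ungapped block of family sequences and slides it along a new sequence. The uniform background frequency, the pseudocount of 1, and the toy block are assumptions for illustration; production tools use observed background frequencies and sequence weighting.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned_block, pseudocount=1.0):
    """Log-odds PSSM from equal-length, ungapped family sequences."""
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    n = len(aligned_block)
    pssm = []
    for column in zip(*aligned_block):
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (column.count(aa) + pseudocount) / (
                n + pseudocount * len(AMINO_ACIDS))
            scores[aa] = math.log2(freq / background)
        pssm.append(scores)
    return pssm


def best_match(pssm, sequence):
    """Return (start, score) of the best-scoring window in the sequence."""
    width = len(pssm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Residues outside the 20 standard letters score 0.
        score = sum(col.get(aa, 0.0) for col, aa in zip(pssm, window))
        if best is None or score > best[1]:
            best = (start, score)
    return best


family_block = ["GKST", "GKSS", "GKTT"]  # hypothetical ungapped block
start, score = best_match(build_pssm(family_block), "MAAGKSTLL")
print(start, round(score, 2))
```

A high-scoring window suggests the new sequence carries the family pattern; where to set the cutoff is a separate statistical question not addressed here.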
If the new protein is predicted to have only a structural fold similar to that of a family
A multiple sequence alignment of this protein can be used for
structural modeling.

• Structural analysis:
1. The presence of small amino acid motifs in a protein, which can be an
indicator of biochemical function, is analyzed. The Prosite catalogue can
be used to search a new protein sequence for motifs.

2. The spacing and arrangement of specific amino acids, e.g. hydrophobic
amino acids, provide important structural clues that can be used for
modeling.

3. The tendency of certain amino acid combinations to occur in a given type
of secondary structure provides methods for predicting where these
structures are likely to occur in new sequences.

4. The new protein sequence can be aligned with models of protein fold
families, again using PSSMs and HMMs, to determine whether it matches
any family. This procedure is known as threading a sequence into a
structure.
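The motif search in step 1 can be sketched by converting a Prosite-style pattern into a regular expression. The converter below handles only the common syntax elements (x, [..], {..}, repeat counts, and the < and > anchors) and is an illustrative assumption, not the Prosite scanning software. The pattern shown is the N-glycosylation site pattern N-{P}-[ST]-{P}.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simple Prosite pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # {P} means "any residue except P".
        element = element.replace("{", "[^").replace("}", "]")
        # x(2,4) means "2 to 4 of any residue".
        element = element.replace("(", "{").replace(")", "}")
        element = element.replace("x", ".")
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)


def find_motifs(sequence, pattern):
    """Return (start, matched_text) for non-overlapping motif matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


print(prosite_to_regex("N-{P}-[ST]-{P}."))          # N[^P][ST][^P]
print(find_motifs("MNGSAANPTA", "N-{P}-[ST]-{P}."))  # [(1, 'NGSA')]
```

The second asparagine in the example is followed by proline, so it is correctly rejected. Short motifs like this one match by chance often, so a hit is a clue rather than proof of function.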
Structural analysis
• The structural analysis from the previous step provides clues as to
the presence of active sites, regions of secondary and 3-D
structure, and the order of predicted secondary structures.

• If these predictions are convincing enough, it may be possible
to identify the new protein as a member of a known structural
class.

• Structural alignment of the new protein with a protein of known
structure provides a starting 3-D model of the protein. Using
computer graphics and protein modeling software, the amino acids
can be positioned to accommodate available space and
interactions with neighboring amino acids.
If no match is observed
• Proteins that fail to show any relationship to proteins of
known structure are candidates for structural analysis.

• There are approximately 500 to 600 known fold families, and new
structures are frequently found to have an already known
structural fold.

• Accordingly, proteins with no known relatives of known structure
may represent a novel structural fold.
Primary database
PIR         Family and superfamily classification based on sequence alignment
Swiss-Prot  Protein sequence database
GenBank     DNA sequence database
DDBJ        DNA sequence database
EMBL        DNA sequence database


Secondary database
SCOP        Structural classification of proteins
PRINTS      Protein fingerprints (aligned motifs)
HSSP        Sequences similar to proteins of known structure
3-D PSSM    Uses scoring matrices based on structural similarity
Prosite     Groups of proteins of similar biochemical function on the basis of amino acid patterns
BLOCKS      Ungapped blocks in families
Structural database
SCOP                      Structural classification of proteins
PDB (Protein Data Bank)   Structural details with links to other sites for analysis
When we can do comparative modeling
Swiss modeling
Energy minimization
