
Different Levels of Protein Structure

METHODS FOR PROTEIN STRUCTURE
PREDICTION
1. EXPERIMENTAL METHODS

2. COMPUTATIONAL METHODS

Experimental Protein Structure Determination

• X-ray crystallography
– most accurate
– in vitro
– needs crystals
– ~$100-200K per structure
– time consuming and expensive
• NMR
– fairly accurate
– in solution, close to in vivo conditions
– no need for crystals
– limited to very small proteins
– time consuming and difficult
• Electron microscopy
– imaging technology
– low resolution
– fine structural detail often not observable
Computational methods
• Major Techniques
– Template-Based Modeling
• Homology Modeling
• Threading
• Both use known protein structures

– Template-Free Modeling
• ab initio Methods
– Physics-Based
– Knowledge-Based
– without using a known protein structure as a template

Homology Modelling

• It is also called comparative modeling.

• Predicts protein structures based on
sequence homology with a known structure.
• Principle:
• If two proteins share a high enough
sequence similarity, they are likely to have
very similar three-dimensional structures.
• Modeling servers: ModBase, SWISS-MODEL,
etc.
• Fails in the absence of homology.

Homology Modelling Software
1. Template selection (BLAST and FASTA)
2. Sequence alignment (T-Coffee and PRALINE)
3. Model building (CODA)
(a) backbone model building
(b) loop modeling
4. Side chain refinement (SCWRL)
5. Model refinement using an energy function (GROMOS)
6. Model evaluation (PROCHECK and WHAT IF)
• It has been observed that even proteins with 30% sequence
identity fold into similar structures.
• Does not work for remote homologs (< 30% pairwise identity).
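The 30% rule of thumb above can be sketched as a small check on a pairwise alignment. This is only an illustration: the function names are hypothetical, the gap character "-" is an assumption, and identity is computed here over aligned (non-gap) columns, which is one of several conventions in use.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over the non-gap columns of a pairwise alignment."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    aligned = matches = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # gap columns are not counted (assumed convention)
        aligned += 1
        if a == b:
            matches += 1
    return 100.0 * matches / aligned if aligned else 0.0


def suggest_method(identity: float, threshold: float = 30.0) -> str:
    """Apply the ~30% identity rule of thumb from the slide."""
    if identity >= threshold:
        return "homology modeling"
    return "threading / fold recognition"


if __name__ == "__main__":
    ident = percent_identity("ACDE-G", "ACDFWG")
    print(ident, suggest_method(ident))
```

The threshold is a guideline, not a hard limit; alignment length and quality also matter when choosing a template.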
Homology Modeling: How it works

o Find template

o Align target sequence with template

o Generate model:
- add loops
- add side chains

o Refine model
Template recognition and initial alignment:
• Compare the sequence of the unknown protein with all the sequences of
known structures stored in the Protein Data Bank (PDB).
• BLAST this sequence against the PDB sequences to obtain a list of known protein
structures that match the sequence.
• BLAST uses a residue exchange scoring matrix. Residues that are easily
exchanged get a better score than residues that have different properties.
• Function-specific conserved residues get the best score.
• BLAST provides a list of possible templates for the unknown structure. To
make the best initial alignment, BLAST uses an alignment matrix based on
the residue exchange matrix and adds extra penalties for opening and
extending a gap between residues.
• The target sequence is sent to a BLAST server, which searches the PDB to
obtain a list of possible templates and their alignments.
• The best hit has to be chosen, which is not necessarily the first one.
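How an exchange matrix and gap penalties combine into one alignment score can be sketched as follows. The scores below are toy values chosen for illustration (not BLOSUM or PAM values), the residue groups are a simplification, and the function names are hypothetical. BLAST itself uses published matrices and a more elaborate search.

```python
# Toy residue groups (assumption): members of one group exchange easily.
GROUPS = ["AVLIMFW", "STNQ", "KRH", "DE"]


def exchange_score(a: str, b: str) -> int:
    """Toy exchange score: identical > similar properties > different."""
    if a == b:
        return 4
    for group in GROUPS:
        if a in group and b in group:
            return 1
    return -2


def alignment_score(s1: str, s2: str, gap_open: int = -5, gap_extend: int = -1) -> int:
    """Score a gapped pairwise alignment with separate penalties for
    opening a gap and for extending it."""
    if len(s1) != len(s2):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    prev_gap = None  # which sequence held the gap in the previous column
    for a, b in zip(s1, s2):
        if a == "-" or b == "-":
            gap_in = "first" if a == "-" else "second"
            # First column of a gap pays gap_open, later columns gap_extend.
            score += gap_extend if gap_in == prev_gap else gap_open
            prev_gap = gap_in
        else:
            score += exchange_score(a, b)
            prev_gap = None
    return score


if __name__ == "__main__":
    print(alignment_score("AL--KD", "ALGGKD"))
```

Because opening a gap costs more than extending one, a single long gap scores better than several short ones of the same total length.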
Alignment correction
• Fine-tune and adjust the BLAST alignments.
• Example: Ala → Glu is possible but unlikely in a
hydrophobic core, so these residues should not be
aligned.
• Examine the template structure to check which residues
are in the core; these are
less likely to change than residues on the surface.
• Insertions and deletions can be made in those parts of the
sequence that are highly variable.
• Shift gaps after insertions and deletions so that the remaining residues are aligned properly.
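The Ala → Glu example can be turned into a simple check that flags alignment columns worth inspecting by hand. This is a sketch under stated assumptions: the residue sets are simplified, the per-column core flags are assumed to come from inspecting the template structure, and the function name is hypothetical.

```python
HYDROPHOBIC = set("AVLIMFW")  # simplified hydrophobic set (assumption)
CHARGED = set("DEKR")         # simplified charged set (assumption)


def suspicious_core_columns(aligned_target, aligned_template, template_in_core):
    """Return alignment columns where a buried hydrophobic template residue
    is aligned with a charged target residue.

    template_in_core holds one bool per alignment column (False for gaps).
    """
    if not (len(aligned_target) == len(aligned_template) == len(template_in_core)):
        raise ValueError("all inputs must cover the same alignment columns")
    flagged = []
    columns = zip(aligned_target, aligned_template, template_in_core)
    for col, (target_res, template_res, in_core) in enumerate(columns):
        if in_core and template_res in HYDROPHOBIC and target_res in CHARGED:
            flagged.append(col)
    return flagged


if __name__ == "__main__":
    print(suspicious_core_columns("AEKL", "AALL", [False, True, False, True]))
```

A flagged column does not prove the alignment is wrong; it only marks a place where shifting a gap may give a more plausible core.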
Backbone generation and loop modelling
• The coordinates of the template backbone are copied from the PDB file to the target
structure.
• When the residues are identical, the side chain coordinates are also
copied.
• Note that a PDB file may contain small offsets or errors, so try to use
multiple similar templates.
• When the target sequence contains a gap, one option is to delete the
corresponding residues in the template, but this creates a break in the
backbone.
• When the template sequence contains a gap, no backbone
coordinates are known for these residues in the model. The backbone
has to be cut to insert the new residues.
• Such major changes cannot be modeled within secondary structure
elements, so they are placed in loops. Surface
loops, however, are flexible and difficult to predict.
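The copy step and the two gap cases can be sketched with one coordinate per residue. This is a simplified illustration: real model building copies all backbone atoms, while here each template residue is represented by a single (x, y, z) tuple, and the function name is hypothetical.

```python
def copy_backbone(aligned_target, aligned_template, template_coords):
    """Build a list of (target residue, coordinate or None).

    template_coords holds one coordinate per template residue (gaps excluded).
    None marks target residues with no template coordinates, which would
    have to be built by loop modeling.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    model = []
    t_idx = 0  # index of the next unused template residue
    for target_res, template_res in zip(aligned_target, aligned_template):
        coord = None
        if template_res != "-":
            coord = template_coords[t_idx]
            t_idx += 1
        if target_res == "-":
            continue  # gap in target: the template residue is dropped
        model.append((target_res, coord))
    return model


if __name__ == "__main__":
    coords = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
    for residue, coord in copy_backbone("AC-DE", "ACG-E", coords):
        print(residue, coord)
```

In the example, the template residue G is deleted (leaving a break to close) and the target residue D has no coordinates yet (an insertion to be modeled).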
Side chain modeling
• Note that the side chains of conserved residues were already copied; now we just need to
place the remaining side chains.
• Copy the torsion angles about the Cα-Cβ bond to the target.
• Rotamers tend to be conserved in homologous proteins and can be predicted, as a
backbone configuration strongly prefers specific rotamers.
• Moreover, libraries built from flanking or neighboring residues can also help to
estimate the side chain positioning.
• For example, a given tyrosine backbone may strongly prefer two rotamers, and the real side chain
may fit one of them.
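A rotamer lookup of this kind can be sketched as below. The library entries and probabilities are invented for illustration only (they are not taken from SCWRL or any published rotamer library), and the function name and backbone classes are hypothetical.

```python
# Toy backbone-dependent rotamer library (hypothetical values):
# (residue, backbone class) -> list of (chi1 angle in degrees, probability)
TOY_ROTAMERS = {
    ("TYR", "helix"): [(-60.0, 0.60), (180.0, 0.35), (60.0, 0.05)],
    ("TYR", "strand"): [(180.0, 0.50), (-60.0, 0.45), (60.0, 0.05)],
}


def choose_chi1(target_res, template_res, template_chi1, backbone_class):
    """Pick a chi1 angle for one target residue.

    Conserved residue: reuse the template torsion angle.
    Otherwise: take the most probable rotamer for this backbone class.
    """
    if target_res == template_res and template_chi1 is not None:
        return template_chi1
    candidates = TOY_ROTAMERS[(target_res, backbone_class)]
    best_angle, _ = max(candidates, key=lambda entry: entry[1])
    return best_angle


if __name__ == "__main__":
    print(choose_chi1("TYR", "TYR", -65.0, "helix"))  # conserved: copied
    print(choose_chi1("TYR", "PHE", -65.0, "helix"))  # mutated: library lookup
```

Real side chain programs also test the candidates for clashes with neighboring residues instead of simply taking the most probable one.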
Model optimization:
• Why is further optimization needed?
• Because the updated side chains can affect the backbone, and this can affect
the structure prediction.
Model validation:
• The model should be checked again for normal ranges of bumps, bond angles,
torsion angles and bond lengths. Other properties, like the distribution of polar and
apolar residues, can be compared with real structures.
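One geometric check of this kind can be sketched using the distance between consecutive Cα atoms, which is about 3.8 Å for the usual trans peptide bond. The tolerance of 0.2 Å and the function name are assumptions for illustration; this is a stand-in for the idea of range checks, not a description of how PROCHECK or WHAT IF work.

```python
import math

EXPECTED_CA_CA = 3.8  # Å, typical distance between consecutive Cα atoms


def flag_ca_distance_outliers(ca_coords, tolerance=0.2):
    """Return (index, distance) for each consecutive Cα pair whose distance
    deviates from the expected value by more than the tolerance."""
    outliers = []
    for i in range(len(ca_coords) - 1):
        distance = math.dist(ca_coords[i], ca_coords[i + 1])
        if abs(distance - EXPECTED_CA_CA) > tolerance:
            outliers.append((i, round(distance, 2)))
    return outliers


if __name__ == "__main__":
    model = [(0, 0, 0), (3.8, 0, 0), (3.8, 3.8, 0), (3.8, 3.8, 6.0)]
    print(flag_ca_distance_outliers(model))
```

An outlier usually points to a badly closed loop or a chain break left over from backbone generation.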
Limitation

• Limited to the structure of the template.

• Cannot study conformational
changes.
Threading model
• In homology modeling we
take the sequence of the target protein and search a
protein database for a matching sequence of known structure, then
build a structure similar to that of the template protein.
• What if we do not get a matching sequence from the protein
database?
• The second method to follow when this problem arises is the
threading method, or fold recognition method.
• Here we recognize motifs, that is, combinations of
secondary structures of the protein, and we search a
database of secondary structure arrangements (folds).
• A protein fold is defined by the way the secondary
structure elements of the protein structure are arranged
relative to each other in space.
• The secondary structure elements include alpha helices, beta-pleated
sheets, turns,
coils, etc.
• You may be surprised to know that only about 5000 stable
protein folds are thought to exist in nature.
• A database of folds therefore helps us in protein
structure
recognition.
• Fold recognition means finding the best fit of a sequence to
a set of candidate folds.
Threading

• Given:
– Sequence of protein 'P' with unknown structure
– Database of known folds
• Find:
– Most plausible fold for 'P'
– Evaluate quality of such arrangement
• Places the residues of unknown 'P' along the
backbone of a known structure and determines the
stability of the side chains in that arrangement
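The "given / find" formulation can be sketched as an ungapped threading over a tiny fold library. Here each fold is reduced to a per-position buried/exposed flag, and a position counts as compatible when hydrophobic residues are buried and other residues are exposed. The scoring rule, residue set and function names are assumptions for illustration; real threading programs allow gaps and use much richer scoring.

```python
HYDROPHOBIC = set("AVLIMFWC")  # simplified hydrophobic set (assumption)


def thread_score(sequence, burial):
    """Score one sequence placed position by position on one fold.

    burial: one bool per fold position, True if that position is buried.
    +1 for a compatible position, -1 otherwise.
    """
    if len(sequence) != len(burial):
        raise ValueError("ungapped threading needs equal lengths")
    score = 0
    for residue, buried in zip(sequence, burial):
        score += 1 if (residue in HYDROPHOBIC) == buried else -1
    return score


def best_fold(sequence, fold_library):
    """Return (best fold name, all scores) over folds of matching length."""
    scores = {
        name: thread_score(sequence, burial)
        for name, burial in fold_library.items()
        if len(burial) == len(sequence)
    }
    if not scores:
        raise ValueError("no fold of matching length in the library")
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    library = {
        "foldA": [True, False, True, False],
        "foldB": [False, True, False, True],
    }
    print(best_fold("LKVE", library))
```

The score of the best fold relative to the others gives a rough measure of how convincing the match is.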

Threading and fold recognition

• Predicts the structural fold of unknown protein
sequences by fitting the sequence into a
structural database and selecting the best-fitting
fold.
• We can identify structurally similar proteins
even without detectable sequence similarity.
• Two algorithms:
1. Pairwise energy based method
2. Profile based method
1. Pairwise energy based method (threading)
• Searches a structural fold database to find the best
matching structural fold using energy-based criteria.
• Uses dynamic programming and heuristic approaches.
• Calculates the energy of the raw model.
• The lowest-energy fold corresponds to the structurally
most compatible fold.
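The energy calculation for a raw model can be sketched with a toy contact potential. All energy values, the 8 Å contact cutoff, the minimum sequence separation and the function names are assumptions for illustration; real methods use statistically derived pair potentials and search over gapped alignments with dynamic programming.

```python
import math

HYDROPHOBIC = set("AVLIMFW")
POSITIVE = set("KR")
NEGATIVE = set("DE")


def pair_energy(a, b):
    """Toy contact energy (hypothetical values): favorable is negative."""
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return -1
    if (a in POSITIVE and b in NEGATIVE) or (a in NEGATIVE and b in POSITIVE):
        return -1
    if (a in POSITIVE and b in POSITIVE) or (a in NEGATIVE and b in NEGATIVE):
        return 1
    return 0


def threading_energy(sequence, ca_coords, cutoff=8.0, min_separation=3):
    """Sum pair energies over residue pairs in contact in the given fold."""
    if len(sequence) != len(ca_coords):
        raise ValueError("one coordinate per residue is required")
    energy = 0
    n = len(sequence)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                energy += pair_energy(sequence[i], sequence[j])
    return energy


def lowest_energy_fold(sequence, folds):
    """Return the name of the fold giving the lowest energy."""
    return min(folds, key=lambda name: threading_energy(sequence, folds[name]))


if __name__ == "__main__":
    folds = {
        "compact": [(0, 0, 0), (3.8, 0, 0), (3.8, 3.8, 0), (0, 3.8, 0)],
        "extended": [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (11.4, 0, 0)],
    }
    print(lowest_energy_fold("LGGV", folds))
```

With this toy potential, a sequence with hydrophobic ends prefers the compact fold, while one with like charges at both ends prefers the extended fold.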
2. Profile based method (fold recognition)
• A profile is constructed for a group of related protein structures.
• It is generated by superimposition of the structures to
expose corresponding residues.
• The profile captures secondary structure type, polarity and hydrophobicity.
• If the protein fold to be predicted does not exist in the
fold library, the method will fail.
• Example program: 3D-PSSM
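The idea of scoring a sequence against a profile can be sketched with a plain residue-frequency profile. Note the simplification: the profile described above also encodes secondary structure type, polarity and hydrophobicity, whereas this sketch only counts which residues occur at each superimposed position. The function names are hypothetical, and this is not the 3D-PSSM algorithm.

```python
from collections import Counter


def build_profile(aligned_family):
    """Per-position residue frequencies from equal-length aligned sequences
    of structurally superimposed family members ('-' marks a gap)."""
    length = len(aligned_family[0])
    if any(len(seq) != length for seq in aligned_family):
        raise ValueError("all aligned sequences must have equal length")
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_family if seq[col] != "-")
        total = sum(counts.values())
        profile.append({res: n / total for res, n in counts.items()} if total else {})
    return profile


def profile_score(sequence, profile):
    """Sum, over positions, the frequency of the query residue in the profile."""
    if len(sequence) != len(profile):
        raise ValueError("sequence and profile must have equal length")
    return sum(column.get(res, 0.0) for res, column in zip(sequence, profile))


if __name__ == "__main__":
    family_profile = build_profile(["ACD", "ACE", "AGD"])
    print(profile_score("ACD", family_profile))
    print(profile_score("WWW", family_profile))
```

A query is scored against the profile of every fold in the library; if no library fold resembles the true fold, even the best score is meaningless, which is the failure case noted above.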

Application of NMR and X-ray in
proteomics

4/6/2019 SRINIVAS COLLEGE OF PHARMACY
NMR APPLICATION

• About 17% of the structures deposited in the Protein Data Bank were solved by NMR
spectroscopy, and most of these do not
have corresponding crystal structures.
• It is the basis for a wide range of experiments to determine structure-function
relationships.
• To investigate the dynamics of proteins
• To distinguish multiple conformations
• To compare apo and holo forms of proteins and map the binding sites of
their cofactors
• Weakly binding ligands can be detected by NMR spectroscopy
