
Protein Tertiary Structure Prediction

Dr. Zoya Khalid


Zoya.khalid@nu.edu.pk
Protein Tertiary Structure Prediction

Structure:
Traditional experimental methods:
X-ray crystallography or NMR to solve structures;
generate only a few structures per day worldwide;
cannot keep pace with new protein sequences.

Strong demand for structure prediction:


more than 30,000 human genes;
10,000 genomes will be sequenced in the next 10 years.

Unsolved problem after efforts of three decades.


Structure-Function Relationship

A structure is key to understanding the detailed mechanism.

A predicted structure is a powerful tool for function inference.

Trp repressor as a function switch


Protein Folding
• Protein folding is the process by which a protein structure assumes its
functional shape or conformation.
• All protein molecules are heterogeneous unbranched chains of amino
acids.
• By coiling and folding into a specific three-dimensional shape they are
able to perform their biological function.
Malfunctions
• Proteins can malfunction for several reasons. When a protein is
misfolded, it can lead to denaturation of the protein.
• Denaturation is the loss of protein structure and function. Misfolding
does not always lead to a complete lack of function; it may cause only
a partial loss of functionality.
• The malfunctioning of proteins can sometimes lead to diseases in
the human body.
• When proteins don't fold correctly (misfold), diseases can occur.
Structure Prediction Methods
Any given protein sequence

Compare the sequence with proteins that have solved structures

Sequence identity > 30%: Homology Modeling
Sequence identity < 30%: Fold Recognition
Sequence identity < 30%: ab initio Folding

Structure selection

Structure refinement

Final Structure
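The branching in the flow above can be sketched as a small decision function. This is only an illustration: the function name, its inputs, and the reading of the 30% threshold as sequence identity are assumptions, and the slide does not say what happens at exactly 30%.

```python
def choose_method(best_identity_percent, fold_template_found):
    """Pick a prediction route following the flow on the slide.

    best_identity_percent: highest sequence identity (0-100) between the
        target and any protein with a solved structure (hypothetical input).
    fold_template_found: True if fold recognition found a compatible fold
        (hypothetical input).
    """
    if best_identity_percent > 30.0:
        return "homology modeling"
    if fold_template_found:
        return "fold recognition"
    # No usable template at all: fall back to ab initio folding.
    return "ab initio folding"


route_high = choose_method(45.0, fold_template_found=False)
route_low_with_fold = choose_method(18.0, fold_template_found=True)
route_low_no_fold = choose_method(18.0, fold_template_found=False)
```

Whichever route is taken, the resulting candidates still go through structure selection and refinement before a final structure is reported.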
Homology Modeling
3. Structure Modeling
• Model the main chain conformation using the template structure
• Model the side chain conformation
• When the alignment is correct, the backbone of the target can be
created.
• The coordinates of the template-backbone are copied to the target.
• When the residues are identical, the side-chain coordinates are also
copied.
4. Model Optimization
5. Model Validation
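The structure modeling step (copy the template backbone, and copy side chains only where residues are identical) might look like the following minimal sketch. The data layout (gapped alignment strings, one coordinate dictionary per template residue) and the function name are assumptions made for illustration, not the format of any particular modeling tool.

```python
def build_initial_model(target_aln, template_aln, template_coords):
    """Build an initial model from a target-template alignment.

    target_aln, template_aln: aligned sequences of equal length, '-' = gap.
    template_coords: one dict per template residue (ungapped order) with
        "backbone" and "sidechain" coordinate lists (hypothetical layout).
    Returns one dict per target residue; coordinates that could not be
    copied are left as None and must be modeled in a later step.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")

    model = []
    template_index = 0
    for target_res, template_res in zip(target_aln, template_aln):
        coords = None
        if template_res != "-":
            coords = template_coords[template_index]
            template_index += 1
        if target_res == "-":
            continue  # template residue with no counterpart in the target

        entry = {"residue": target_res, "backbone": None, "sidechain": None}
        if coords is not None:
            # Backbone coordinates are copied for every aligned position.
            entry["backbone"] = list(coords["backbone"])
            # Side-chain coordinates are copied only for identical residues.
            if target_res == template_res:
                entry["sidechain"] = list(coords["sidechain"])
        model.append(entry)
    return model


# Toy template with made-up coordinates for the residues A, Q, L, G.
template_coords = [
    {"backbone": [(0.0, 0.0, 0.0)], "sidechain": [(0.5, 0.5, 0.0)]},
    {"backbone": [(3.8, 0.0, 0.0)], "sidechain": [(4.3, 0.5, 0.0)]},
    {"backbone": [(7.6, 0.0, 0.0)], "sidechain": [(8.1, 0.5, 0.0)]},
    {"backbone": [(11.4, 0.0, 0.0)], "sidechain": []},
]
model = build_initial_model("AV-G", "AQLG", template_coords)
```

In this toy case the valine aligned to glutamine receives a backbone but no side chain, which is the kind of position where the side-chain conformation has to be modeled separately.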
Example Homology Modeling

[Figure: Template Structure → amino acid substitution (valine, glutamine) → Initial Model → change in rotamer, energy minimisation → Output Model(s)]


Fold Recognition
(Protein Threading)
Fold recognition
• Goal: to find the protein with a known structure that best
matches a given sequence
• Since the similarity between the target and its closest
template is not high, sequence-sequence alignment
methods fail
• Solution: threading, a sequence-structure alignment
method
Protein Threading
• Aligning each amino acid in the target
sequence to a position in the template
structure, and evaluating how well the target
fits the template.
• After the best-fit template is selected, the
structural model of the sequence is built based
on the alignment with the chosen template.
• Protein threading is based on two basic
observations:
  • the number of different folds in nature is fairly
  small (approximately 1300);
  • 90% of the new structures submitted to the
  PDB in the past three years have structural folds
  similar to ones already in the PDB.
Scoring Function

Total Energy = E_p + E_g + E_s

• E_p: how preferable it is to put two particular residues nearby
• E_g: alignment gap penalty
• E_s: how well a residue fits a structural environment
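As a rough illustration of how the three terms combine, the sketch below scores one fixed sequence-structure alignment. The energy tables, the environment labels, and the simple linear gap penalty are all hypothetical; real threading programs use statistically derived potentials.

```python
def threading_energy(target, alignment, environments, contacts,
                     pair_energy, env_energy, gap_penalty=1.0):
    """Score one alignment as E_p + E_g + E_s (toy version).

    target: target sequence.
    alignment: for each target residue, the template position it is
        placed on, or None if the residue is left unaligned.
    environments: structural environment label per template position.
    contacts: pairs (i, j) of template positions that are close in space.
    pair_energy: energy per alphabetically sorted residue pair.
    env_energy: energy per (residue, environment) pair.
    """
    placed = {pos: res for res, pos in zip(target, alignment)
              if pos is not None}

    # E_s: how well each residue fits its structural environment.
    e_s = 0.0
    for pos, res in placed.items():
        e_s += env_energy.get((res, environments[pos]), 0.0)

    # E_p: preference for the residue pairs brought into contact.
    e_p = 0.0
    for i, j in contacts:
        if i in placed and j in placed:
            pair = tuple(sorted((placed[i], placed[j])))
            e_p += pair_energy.get(pair, 0.0)

    # E_g: unaligned target residues plus empty template positions.
    gaps = alignment.count(None) + (len(environments) - len(placed))
    e_g = gap_penalty * gaps

    return e_p, e_g, e_s, e_p + e_g + e_s


e_p, e_g, e_s, total = threading_energy(
    target="LKE",
    alignment=[0, 2, None],
    environments=["buried", "exposed", "exposed"],
    contacts=[(0, 2)],
    pair_energy={("K", "L"): -1.0},
    env_energy={("L", "buried"): -2.0, ("K", "exposed"): -1.0},
)
```

A threading search would evaluate a function like this for many candidate alignments and templates and keep the lowest total.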
Threading- Function
Threading an NP-Hard Problem
• Finding the optimal alignment is NP-hard
in the general case, where there are
variable-length gaps between the
core segments.

• Interactions between the amino
acids

• Amino acid preferences


Components of Threading
• Template library
  • Use structures from DB classification categories (PDB)
• Scoring function
  • Single and pairwise energy terms
• Alignment
  • Consideration of pairwise terms leads to NP-hardness
  • Heuristics
• Confidence assessment
  • Z-score, P-value similar to sequence alignment statistics
• Improvements
  • Local threading, multi-structure threading
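For the confidence assessment item, a Z-score compares the score of the real alignment with a background distribution. The sketch below assumes the background comes from threading shuffled versions of the sequence against the same template; the slides do not say how the background is built, so treat that as an assumption.

```python
import statistics


def z_score(observed_score, background_scores):
    """Number of standard deviations between the observed score and the
    mean of the background scores (sample standard deviation)."""
    mean = statistics.mean(background_scores)
    spread = statistics.stdev(background_scores)
    if spread == 0:
        raise ValueError("background scores have no spread")
    return (observed_score - mean) / spread


# Made-up energies: lower is better, so a strongly negative Z-score
# suggests the match is unlikely to be due to chance.
background = [-1.0, 0.0, 1.0]
z = z_score(-3.0, background)
```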
Ab Initio Prediction
Ab initio Prediction
• Ab initio means ‘from the beginning’
• Predicts 3D structure from 1D sequence data
• Does not use structural information from homologous
sequences, either by design or because such structures are not available
• Target function (Free energy minimization)
Ab initio Prediction
• Assumption
  • The structure that a protein folds into is the structure with the
  lowest global free energy (or a structure ensemble whose energies are
  very close to the global minimum energy)
• Finding native-like protein conformations requires
developing:
  • An accurate potential function that permits calculation of the free
  energy given a structure
  • An efficient method for searching for energy minima
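The two ingredients listed above, a potential function and a search method, can be illustrated with a toy example. The energy function here is invented purely for demonstration (it is not a physical force field), and simulated annealing is used as one possible search strategy, not as the method the slides prescribe.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical potential over backbone torsion angles (radians).

    Each angle prefers -60 degrees and neighbouring angles prefer to be
    similar. Illustrative only; not a real free-energy function.
    """
    preferred = math.radians(-60.0)
    local = sum(1.0 - math.cos(a - preferred) for a in angles)
    coupling = sum(1.0 - math.cos(a - b)
                   for a, b in zip(angles, angles[1:]))
    return local + 0.5 * coupling


def minimise(energy, n_angles, steps=5000, start_temp=2.0, seed=0):
    """Simulated annealing with Metropolis acceptance.

    Returns the best conformation found, its energy, and the energy of
    the random starting conformation.
    """
    rng = random.Random(seed)
    current = [rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
    current_e = energy(current)
    start_e = current_e
    best, best_e = list(current), current_e

    for step in range(steps):
        # Linear cooling schedule; the small offset avoids dividing by 0.
        temp = start_temp * (1.0 - step / steps) + 1e-3
        trial = list(current)
        i = rng.randrange(n_angles)
        trial[i] += rng.gauss(0.0, 0.3)  # perturb one torsion angle
        trial_e = energy(trial)
        delta = trial_e - current_e
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability so the search can leave local minima.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e, start_e


best_angles, best_energy, start_energy = minimise(toy_energy, n_angles=10)
```

With a realistic potential the energy landscape is far rougher than this toy one, which is why the quality of both the potential and the search method matters.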
Why are ab initio structure prediction
calculations hard?
• Many degrees of freedom/residue
• Remote non-covalent interactions
• Nature does not go through all conformations
• Applicable to short sequences only
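A back-of-the-envelope calculation shows why exhaustive search is hopeless and why nature cannot be trying every conformation. The numbers used here (3 states per residue, 10^13 conformations sampled per second) are illustrative assumptions, not values from the slides.

```python
def conformations(n_residues, states_per_residue=3):
    """Number of conformations if each residue independently adopts one
    of a few discrete states (assumed value: 3)."""
    return states_per_residue ** n_residues


def exhaustive_search_years(n_residues, states_per_residue=3,
                            rate_per_second=1e13):
    """Years needed to visit every conformation at an assumed rate."""
    seconds = conformations(n_residues, states_per_residue) / rate_per_second
    return seconds / (3600 * 24 * 365)


count_100 = conformations(100)
years_100 = exhaustive_search_years(100)
```

Because the count grows exponentially with chain length, even these modest assumptions give a search time for a 100-residue protein that exceeds any realistic timescale by many orders of magnitude, which is consistent with ab initio methods being applicable to short sequences only.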
Ab initio Prediction
• The goal: to provide an approach that relies more on
physical principles than on information from known
proteins.

• The problem can be formulated as a global minimization
problem, as it is assumed that the tertiary structure occurs
at the global minimum of the free energy function of the
primary sequence.
