
Protein folding

• Protein folding is the physical process by which a protein chain acquires its
native three-dimensional structure, typically a "folded" conformation, in which the
protein becomes biologically functional.
• Primary structure of a protein, its linear amino-acid sequence, determines its native
conformation.
• Formation of a secondary structure is the first step in the folding process that a protein
takes to assume its native structure.
• Formation of intramolecular hydrogen bonds provides another important contribution to
protein stability. α-helices are formed by hydrogen bonding of the backbone to form a
spiral shape. The β pleated sheet is a structure that forms with the backbone bending over
itself to form the hydrogen bonds.
• Secondary structure hierarchically gives way to tertiary structure formation. There may
also be covalent bonding in the form of disulfide bridges formed between
two cysteine residues.
• Tertiary structure of a protein involves a single polypeptide chain; however, additional
interactions of folded polypeptide chains give rise to quaternary structure formation.
Driving forces of protein folding
• Folding is a spontaneous process that is mainly guided by hydrophobic interactions,
formation of intramolecular hydrogen bonds, van der Waals forces, and it is opposed
by conformational entropy.
• Hydrophobic interactions within the protein are another type of interaction that drives
folding: nonpolar side chains are buried in the protein core, away from water.
• Globular proteins acquire distinct compact native conformations in water as a result of
the hydrophobic effect.
• These macromolecules may be regarded as "folding themselves", but the process also
depends on the solvent (water or lipid bilayer), the concentration of salts, the pH,
the temperature, and the possible presence of cofactors and of molecular chaperones.
• Protein folding must be thermodynamically favorable within a cell in order for it to be a
spontaneous reaction. Because protein folding is a spontaneous reaction, it must have a
negative Gibbs free energy change (ΔG < 0). Gibbs free energy in protein folding is
directly related to enthalpy and entropy.
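The relation between Gibbs free energy, enthalpy and entropy is ΔG = ΔH − TΔS. A minimal sketch of the sign logic follows; the ΔH and ΔS values are assumed for illustration only and are not measurements of any real protein.

```python
def gibbs_free_energy(delta_h, delta_s, temperature):
    """Return delta G = delta H - T * delta S.

    delta_h in kJ/mol, delta_s in kJ/(mol*K), temperature in kelvin.
    """
    return delta_h - temperature * delta_s


def is_spontaneous(delta_g):
    """A process is thermodynamically favorable when delta G is negative."""
    return delta_g < 0


# Hypothetical values, chosen only to show how the sign of delta G can change.
delta_h_folding = -200.0  # kJ/mol, favorable enthalpy change (assumed)
delta_s_folding = -0.6    # kJ/(mol*K), net entropy loss on folding (assumed)

dg_cool = gibbs_free_energy(delta_h_folding, delta_s_folding, 298.15)
dg_hot = gibbs_free_energy(delta_h_folding, delta_s_folding, 350.0)

print(is_spontaneous(dg_cool))  # folding favorable at the lower temperature
print(is_spontaneous(dg_hot))   # folding unfavorable at the higher temperature
```

With these assumed numbers the entropy term outweighs the enthalpy term at the higher temperature, so ΔG changes sign and folding stops being favorable.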
Molecular chaperones
• Molecular chaperones are a class of proteins that aid in the correct folding
of other proteins in vivo.
• Chaperones may assist in folding even when the nascent polypeptide is
being synthesized by the ribosome.
Misfolding and malfunction
• Proteins can malfunction for several reasons. When a protein is
misfolded, it can lead to denaturation of the protein. Denaturation is the
loss of protein structure and function. Misfolding does not always lead
to a complete lack of function; it may cause only a partial loss of
functionality. The malfunctioning of proteins can sometimes lead to
diseases in the human body.
• Examples: Alzheimer's disease, cystic fibrosis
Protein function prediction
Simple Function Prediction
• The easiest way to infer the molecular function of an uncharacterized
sequence is by finding an obvious and well-characterized homologue.
• BLAST (sequence-sequence local alignment tool), used e.g. by Blast2GO
• PSI-BLAST (profile-sequence local alignment tool)
• Problem: many proteins do not have obvious homologs
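Homology-based annotation transfer can be sketched as follows. This is not BLAST itself. BLAST is a fast heuristic for local alignment, whereas the sketch computes the exact Smith-Waterman local alignment score with an assumed match/mismatch/linear-gap scheme instead of a substitution matrix. The sequences, labels and the `min_score` cutoff are hypothetical.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def transfer_annotation(query, annotated, min_score):
    """Return the label of the best-scoring annotated sequence, or None."""
    best_label, best_score = None, 0
    for sequence, label in annotated:
        score = local_alignment_score(query, sequence)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= min_score else None


# Made-up sequences and labels, for illustration only.
annotated = [
    ("MKTAYIAKQRQISFVKSHFSRQ", "hypothetical function A"),
    ("GSHMLEDPVDAFKWW", "hypothetical function B"),
]

print(transfer_annotation("TAYIAKQRQ", annotated, min_score=10))
print(transfer_annotation("PPPP", annotated, min_score=10))
```

The second query has no convincing hit, so nothing is transferred. This is the situation described in the last bullet above.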
Integrative Approaches
• Similarity grouping
• Phylogenomics
• Sequence patterns
• Sequence clustering
• Machine learning
• Network approach
Pattern-Based Methods
• Classify proteins by locally conserved sequence patterns, which often
indicate the functions of the whole protein (e.g. active site motifs).
• InterPro: the best gateway to pattern-based functional annotations, which
collates patterns at all levels into hierarchically arranged database entries.
• InterProScan server is a meta-tool, which scans the query sequence against
ten core member databases, from which the output is collected and
presented in a simple, non-redundant manner.
• PROSITE scans query sequences against short, position-specific residue
profiles that are characteristic of individual protein families.
• PRINTS follows a similar principle but uses discontinuous profiles
(“fingerprints”).
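Besides profiles, PROSITE also provides short patterns written in a compact syntax, for example `N-{P}-[ST]-{P}` for the N-glycosylation site. The sketch below translates a subset of that syntax into a Python regular expression and scans a sequence with it. The function names and the toy sequence are assumptions, and `re.finditer` reports non-overlapping matches only.

```python
import re

# One pattern element: optional '<', a residue spec, optional (n) or (n,m), optional '>'.
ELEMENT = re.compile(
    r"(<?)(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?(>?)"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern (subset of the syntax) into a regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        m = ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        start, residues, low, high, end = m.groups()
        if residues == "x":
            core = "."                          # any residue
        elif residues.startswith("{"):
            core = "[^" + residues[1:-1] + "]"  # any residue except these
        else:
            core = residues                     # one residue or an [ABC] class
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if start else "") + core + ("$" if end else ""))
    return "".join(parts)


def find_motifs(sequence, pattern):
    """Return (start index, matched text) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


# Toy sequence, made up for illustration.
print(prosite_to_regex("N-{P}-[ST]-{P}"))
print(find_motifs("MKNGSAPNPSA", "N-{P}-[ST]-{P}"))
```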
InterPro members
• Pfam, SUPERFAMILY, PRODOM, SMART, Gene3D
• PRODOM automatically clusters evolutionarily conserved sequence
segments, based on recursive PSI-BLAST searches of UniProtKB.
• All others use hidden Markov models (HMMs), generated from
multiple sequence alignments, to represent sequence families
Pfam and SMART
• Pfam focuses on the functional aspect of the “domain” definition,
classifying sequences into a large number of relatively small
(functionally conserved) families.
• SMART consists of a considerably smaller but completely manually
curated set of families.
SUPERFAMILY and Gene3D

• Based on structural classifications, assigning sequences to the domain
families defined in the Structural Classification of Proteins (SCOP)
and CATH databases.
• These families are usually much bigger (less functionally conserved)
than those in Pfam; they often contain very remote homologues, only
detectable by patterns of structural conservation.
Machine Learning Methods
• Learn a relationship between characteristic combinations of sequence features
and function categories in a training set of known sequences.
• Support vector machines
• Neural networks
Data Driven Machine Learning Approach

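[Figure: data-driven machine learning workflow]
• Input: sequence features. Output: function category.
• Split: the protein function data are split into training data and test data.
• Training: build a classifier on the training data.
• Test: test the model on the test data.
• Prediction: the classifier maps a new protein (input) to a function (output).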

Key idea: Learn from known data and Generalize to unseen data
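The split, train, test and predict steps can be sketched in a few lines. As an assumption, the sketch uses amino-acid composition as the sequence features and a nearest-centroid classifier in place of a support vector machine or neural network. All sequences and category labels are made up for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Feature vector: fraction of each of the 20 standard amino acids."""
    return [sequence.count(aa) / len(sequence) for aa in AMINO_ACIDS]


def split_data(examples, test_fraction, seed):
    """Shuffle labeled examples and split them into training and test sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train(training_set):
    """Build a nearest-centroid model: one mean feature vector per label."""
    grouped = {}
    for sequence, label in training_set:
        grouped.setdefault(label, []).append(composition(sequence))
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(model, sequence):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    features = composition(sequence)

    def distance(label):
        return sum((f - c) ** 2 for f, c in zip(features, model[label]))

    return min(model, key=distance)


def accuracy(model, test_set):
    """Fraction of test examples whose predicted label equals the true label."""
    hits = sum(predict(model, seq) == label for seq, label in test_set)
    return hits / len(test_set)


# Toy, made-up sequences and category labels; not real proteins or annotations.
EXAMPLES = [
    ("AVLIFMAVLIFM", "class_hydrophobic"), ("AAAVVVLIFMLI", "class_hydrophobic"),
    ("LLLIIIFFAVMM", "class_hydrophobic"), ("MFMFAVLLIIVA", "class_hydrophobic"),
    ("DEKRDEKRDEKR", "class_charged"), ("KKKKRRRRDDEE", "class_charged"),
    ("DDDDEEEEKKRR", "class_charged"), ("KRKRKRDEDEDE", "class_charged"),
]

training_set, test_set = split_data(EXAMPLES, test_fraction=0.25, seed=0)
model = train(training_set)             # Training: build a classifier
test_accuracy = accuracy(model, test_set)  # Test: evaluate on held-out data
print(test_accuracy, predict(model, "VVLLIIAAFFMM"))  # Prediction on a new input
```

The two toy classes are deliberately easy to separate. On real data the held-out accuracy is the estimate of how well the classifier generalizes to unseen proteins.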
Tools and Resources
Computer-aided (in silico) drug design
Drug & Drug design
• Drug: a substance used in the diagnosis, treatment, or prevention of a
disease or as a component of a medication.
• The drug is most commonly an organic small molecule that activates
or inhibits the function of a biomolecule such as a protein, which in
turn results in a therapeutic benefit to the patient.
• Drug design, often referred to as rational drug design or
simply rational design, is the inventive process of finding
new medications based on the knowledge of a biological target.
Structure & Ligand-based drug design
• Structure-based drug design relies on knowledge of
the three-dimensional structure of the biomolecular target.
• Ligand-based drug design (or indirect drug design) relies on knowledge of
other molecules that bind to the biological target of interest.
• These other molecules may be used to derive a pharmacophore model that
defines the minimum necessary structural characteristics a molecule must
possess in order to bind to the target.
• Alternatively, a quantitative structure-activity relationship (QSAR), that is, a
correlation between calculated properties of molecules and their experimentally
determined biological activity, may be derived. These QSAR relationships
in turn may be used to predict the activity of new analogs.
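In its simplest form, a QSAR is a regression of measured activity on a calculated property. The sketch below fits a straight line by ordinary least squares to a single descriptor; the descriptor and activity values are made up, and practical QSAR models typically use several descriptors and separate validation data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_activity(model, descriptor):
    """Predict the activity of a new analog from its descriptor value."""
    slope, intercept = model
    return slope * descriptor + intercept


# Made-up training set: one calculated descriptor per molecule and a
# made-up "measured" activity for each; not experimental data.
descriptors = [1.0, 2.0, 3.0, 4.0]
activities = [5.1, 5.9, 7.1, 7.9]

model = fit_line(descriptors, activities)
predicted = predict_activity(model, 5.0)  # activity estimate for a new analog
print(model, predicted)
```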
Virtual screening
• Virtual screening (VS) is a computational technique used in drug discovery
to search libraries of small molecules in order to identify those structures
which are most likely to bind to a drug target, typically a protein receptor
or enzyme.
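One simple way to rank a library is ligand-based: score each molecule by its similarity to a known active and keep the top hits. The sketch below uses the Tanimoto coefficient on fingerprints represented as sets of integers. The molecule names and fingerprints are hypothetical, and structure-based screening (docking) would use a different scoring function.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def screen(library, reference, top_n):
    """Rank library molecules by similarity to a known active; keep the top hits."""
    scored = [(tanimoto(bits, reference), name) for name, bits in library.items()]
    # Highest similarity first; ties broken by name for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, score) for score, name in scored[:top_n]]


# Hypothetical fingerprints: each integer marks one structural feature as present.
reference_active = {1, 2, 3, 4, 5}
library = {
    "mol_A": {1, 2, 3, 4, 6},  # shares 4 of 6 distinct features
    "mol_B": {7, 8, 9},        # shares none
    "mol_C": {1, 2, 3, 4, 5},  # identical fingerprint
    "mol_D": {1, 2, 8, 9},     # shares 2 of 7 distinct features
}

hits = screen(library, reference_active, top_n=2)
print(hits)
```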
