Protein Engineering - Edited

Protein Engineering
Student’s Name
Institution
Instructor
Assignment Due Date

Protein Structure and Function
Proteins are essential macromolecules needed by every cell in the body. They are
made up of polypeptide chains consisting of amino acids joined together by peptide bonds.
The gene that encodes a polypeptide determines its exact amino acid sequence (Alley,
2019). Newly synthesized polypeptides must fold in order to attain stability. Overall, the
protein structure attained by folding is described at four different levels. The primary
structure is the simplest level: the amino acid sequence that forms a polypeptide chain.
This sequence is dictated by the DNA sequence of the gene. In secondary structure,
hydrogen bonds form between groups of the peptide backbone, resulting in regular, recurring
arrangements of amino acids such as α-helices and β-sheets (Alley, 2019). The secondary
structure elements may then fold again into a three-dimensional conformation known as
tertiary structure. Quaternary structure, the fourth level, arises when two or more folded
polypeptide chains assemble into a single complex.

After synthesis, a polypeptide folds into its tertiary form, and the conformation it
adopts depends on the amino acid sequence. During protein folding, the major stabilizing
forces include hydrophobic interactions, electrostatic attractions, van der Waals forces,
and covalent linkages such as disulfide bonds.
Problem statement
The properties of natural proteins are not always optimized for efficient use in applications.
Factors such as low stability, a non-optimal amino acid sequence, and undesired substrate
specificity pose challenges for the industrial use of proteins. In vivo, proteins are
synthesized and folded to function optimally (Alley, 2019). However, they have not evolved
for use in vitro in pharmaceutical, medical, commercial, research, and industrial settings
(Alley, 2019). Proteins used in biotechnology need high stability so that they can be stored
for longer periods, and their expression levels need to be high to support large-scale
manufacturing. These properties can be obtained through protein engineering.
Protein engineering
Protein engineering is the synthesis of novel proteins, or the modification of an existing
protein's sequence or structure, to achieve desired results such as novel functionality,
altered substrate specificity, or improved stability (Alley, 2019). It aims to improve the
efficacy and pharmacokinetics of recombinant proteins by modifying their structure. An
example of a recombinant protein engineered to improve its efficacy is insulin (Alley, 2019).
It was discovered that insulin stored in its native form associates into dimers and hexamers
(Alley, 2019). At the injection site, these higher-order species persist for a longer period,
resulting in slower therapeutic onset. To mitigate this challenge, the amino acid sequence of
insulin was altered to generate fast-acting insulin.
Protein engineering improves the stability and solubility of proteins. The stability of a
protein derives from its structure and bonds (Alley, 2019), both of which protein engineering
can modify. For example, disulfide bonds can be introduced to modify folding (about +22 °C
with three S-S bridges), α-helix dipoles can be stabilized, and hydrophobic cavities can be
filled. An example of a protein stabilized through engineering is phage T4 lysozyme
(Singh, 2018). Two cysteines were introduced into the lysozyme in close proximity, which
allowed them to react spontaneously and form a disulfide bridge between residues 21 and 142.
This increased the melting temperature by 11 °C and greatly stabilized the protein without
making the structure more rigid or changing its functional properties. Further, the
triple-disulfide variant unfolds at a temperature 23.4 °C higher than the wild-type lysozyme.
Protein engineering also modifies protein structures to overcome limitations in their
expression. New functional proteins, such as antibodies and biosensors, can be generated, and
protein engineering can be used to study protein function (Singh, 2018). Enzymes and enzyme
inhibitors can be made more stable (e.g., the thermostable Taq polymerase used in PCR) and
more efficient by improving their catalytic function. New functional proteins can be built by
combining domains from different known proteins, and these can be used in biotechnology or as
therapeutic drugs (Singh, 2018). Examples of therapeutic proteins include conjugated
antibodies, e.g., antibodies whose Fc region binds Fc receptors to recruit macrophages, and
radioactively tagged immunoglobulins (Singh, 2018). The human monoclonal antibody adalimumab,
marketed as Humira, is the best-selling engineered drug worldwide. Adalimumab is a
disease-modifying antirheumatic drug (DMARD) used to treat autoimmune diseases such as
rheumatoid arthritis. It acts by inactivating TNF-α. It was the first human monoclonal
antibody to be engineered in vitro (Singh, 2018).

Protein engineering is supported by other disciplines and techniques, such as bioinformatics
and PCR in molecular biology. Genomics and proteomics generate vast amounts of biological
data that must be stored and easily retrieved for analysis (Singh, 2018). Bioinformatics is
the scientific discipline concerned with storing, retrieving, and analyzing these data. Raw
biological data such as gene and protein sequences can be stored in a computer database and
then analyzed and interrogated using computer programs (Singh, 2018). Protein-focused
bioinformatics provides the following resources: protein structure databases, which store and
organize proteins by their three-dimensional structure; sequence databases; protein family
databases; and protein function databases.
Polymerase chain reaction (PCR) allows DNA sequences to be amplified.
Amplification aids in the engineering of proteins, as seen in the section on approaches for
protein engineering (Singh, 2018). The process involves three steps: denaturation, annealing,
and extension. During denaturation, a sample containing dsDNA is heated to 94 °C to separate
the duplex into single strands (Singh, 2018). The reaction temperature is then reduced to
50-56 °C so that primers can anneal to their complementary strands (Singh, 2018). After
annealing, the temperature is raised to 72 °C, and a thermostable DNA polymerase known as Taq
polymerase extends the new strands from the 3' end of each primer, synthesizing in the 5' to
3' direction. This completes the first cycle and doubles the initial amount of DNA. To obtain
millions to hundreds of millions of copies, the cycle is repeated 25-30 times.
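The doubling arithmetic behind PCR can be sketched in a few lines. This is a minimal illustration, not a model of a real reaction: the function name `pcr_copies` and the `efficiency` parameter (the fraction of templates copied in each cycle) are assumptions introduced here.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Return the expected number of DNA copies after a number of PCR cycles.

    efficiency is the assumed fraction of templates copied per cycle:
    1.0 means perfect doubling, 0.0 means no amplification at all.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Idealized amplification of a single template molecule.
    for n in (1, 25, 30):
        print(f"{n:>2} cycles: {pcr_copies(1, n):,.0f} copies")
```

With perfect doubling, a single template gives 2^25 (about 3.4 × 10^7) copies after 25 cycles and 2^30 (about 1.1 × 10^9) after 30. Real reactions fall short of this ideal because the efficiency is below 1 and reagents become limiting in later cycles.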
In recombinant DNA technology, coding sequences can be re-engineered to optimize codons for
recombinant expression. The proteins are often expressed as fusion proteins to enable the
addition of affinity tags. Fusion with affinity tags such as glutathione S-transferase (GST)
improves protein solubility (Singh, 2018). Other affinity tags that can be used are
hexahistidine, calmodulin-binding peptide, and maltose-binding protein (MBP) (Singh, 2018).
Short peptides can also be used to tag proteins; because they are small, they are unlikely to
affect the structure and function of the protein. Antibodies against these short peptide tags
already exist and can be used as handles to purify the protein (Singh, 2018). Affinity
purification ensures that contaminants are washed away while the desired proteins are
retained (Singh, 2018). Green fluorescent protein (GFP) can be used to confirm expression,
because cells that express it fluoresce under UV light. GFP variants fluoresce in different
colours, allowing a variety of proteins to be visualized simultaneously (Singh, 2018).
GFP can also be used to construct biosensors.
Approaches for Protein Engineering
Rational Design Approach
The rational design approach is the most classical protein engineering method and
involves 'site-directed mutagenesis' of proteins. In rational design, the structure and
function of a protein must be known before a rational gene mutation is carried out
(Singh, 2018). Because it is targeted, this design generally requires less screening effort.
First, rationally designed changes are made in the cloned DNA construct of the protein to be
expressed (Singh, 2018). The protein is then expressed, purified, and assessed to determine
whether the desired changes have been achieved. The alterations or mutations in the DNA
construct are site-directed.
Site-directed mutagenesis includes the insertion, deletion, or replacement of one or
more amino acids. This approach is useful when a residue change is expected to produce the
desired change in protein structure or function (Singh, 2018). There are two methods of
site-directed mutagenesis: 'overlap extension' and 'whole plasmid single round PCR'. In
'overlap extension', two primer pairs are used, and one primer of each pair contains the
mutant codon as a mismatched sequence (Singh, 2018). Two rounds of PCR are performed. Both
primer pairs are used in the first round, which produces two double-stranded DNA molecules
that overlap at the mutated site (Singh, 2018). Denaturation and annealing of these products
give two heteroduplexes, each with a strand carrying the desired mutagenic codon. DNA
polymerase then extends the recessed 3' ends of the overlapping strands to fill in the
full-length product. The nonmutated primer set is then used to amplify the mutagenic DNA in
the second round of PCR.
The 'whole plasmid single round PCR' method is the basis of Stratagene's QuikChange
site-directed mutagenesis kit. The method requires two primers that carry the desired
mutation and are complementary to opposite strands of the dsDNA plasmid template. PCR is
performed, and a mutated plasmid is formed. The newly synthesized strands of this plasmid
contain breaks (nicks), but the nicks are staggered and do not coincide. Selective digestion
with DpnI, an endonuclease specific for methylated DNA, removes the parental template and
leaves the mutated plasmid as a circular nicked vector. The nicked vector is then
transformed into competent cells, where the nicks are repaired. The result is a mutated
circular plasmid.
https://www.eurekaselect.com/images/graphical-abstract/cpps/19/1/001.jpg
A case study of the rational design approach is the protein engineering of the enzyme
subtilisin. Subtilisin is a protease added to detergents used in washing machines to
improve their efficiency (Singh, 2018). Protein engineering modifies the enzyme so that it is
thermostable and active at high temperatures and high pH (Singh, 2018). The enzyme's
half-life is five minutes at 65 °C, while the half-life of a homologous protease from
Thermoactinomyces vulgaris is 17 hours at the same temperature. Although the two proteins are
57% identical, their amino acid differences account for this difference in thermal stability.
The short half-life of subtilisin is due to the oxidation of the methionine at
position 222 (Singh, 2018). This oxidation inactivates the enzyme. The methionine lies
adjacent to a catalytic serine residue, which makes the enzyme sensitive to its modification
(Singh, 2018). Oxidation increases the bulk of the side chain and introduces an
electronegative oxygen atom, both of which impair catalytic activity. To make the enzyme
thermostable, site-directed mutagenesis of the subtilisin gene is carried out
(Singh, 2018). The methionine is replaced with alanine, and the engineered subtilisin,
expressed in E. coli, has high stability and activity at high temperatures. This engineered
subtilisin is now widely used in the manufacture of laundry detergent.
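At the sequence level, a site-directed substitution such as methionine to alanine amounts to swapping one codon in the gene. The sketch below illustrates this on a toy five-codon fragment; the fragment, the function names `translate` and `substitute_codon`, and the choice of GCT as the alanine codon are assumptions made for illustration and are not taken from the real subtilisin gene or from the cited sources.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence (length divisible by 3) into one-letter codes."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def substitute_codon(dna: str, position: int, new_codon: str) -> str:
    """Replace the codon at a 1-based amino acid position with new_codon."""
    new_codon = new_codon.upper()
    if new_codon not in CODON_TABLE:
        raise ValueError("new_codon must be a valid DNA codon")
    start = (position - 1) * 3
    if position < 1 or start + 3 > len(dna):
        raise ValueError("position is outside the coding sequence")
    return dna[:start] + new_codon + dna[start + 3:]


if __name__ == "__main__":
    # Hypothetical toy fragment encoding Gly-Thr-Ser-Met-Ala.
    wild_type = "GGTACTTCTATGGCT"
    mutant = substitute_codon(wild_type, 4, "GCT")  # Met -> Ala at position 4
    print(translate(wild_type), "->", translate(mutant))
```

In the laboratory, the same change is introduced through mutagenic primers rather than string editing; the code only shows which bases differ between the wild-type and mutant constructs.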
Directed Evolution Approach
In most cases of protein engineering, the structure and function of the protein of
interest are not well known, so the rational approach becomes difficult. An alternative
approach known as directed evolution, which involves random mutagenesis, can then be used
(Singh, 2018). In this method, the protein structure and mechanism do not have to be known;
the only requirement is that a suitable selection scheme is available for the desired
protein properties (Singh, 2018). A library of DNA constructs is generated, sometimes with
the help of bioinformatics, and all the constructs are then expressed as proteins
(Singh, 2018). The expressed proteins are put through high-throughput screening to find
variants with the desired characteristics (hits). The DNA construct of each hit is then
expressed on a large scale.
One simple technique used in random mutagenesis is 'saturation mutagenesis'.
Saturation mutagenesis involves substituting a single amino acid with every other possible
amino acid at that position (Singh, 2018). The substitution yields a library of all possible
point-mutant proteins for that site. Another technique is 'region-specific mutagenesis',
which combines the rational and random approaches of protein engineering (Singh, 2018).
Several residues within a specific region are replaced simultaneously to express a protein
with new specificities.
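The library produced by saturation mutagenesis at one site can be enumerated directly. The sketch below is a minimal illustration at the protein level; the function name `saturation_library` and the toy peptide are assumptions, and real experiments build the library with degenerate codons at the DNA level rather than by listing protein sequences.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def saturation_library(protein: str, position: int) -> list[str]:
    """Return every single-site variant at a 1-based position.

    The wild-type residue is excluded, so a standard residue gives 19 variants.
    """
    if not 1 <= position <= len(protein):
        raise ValueError("position is outside the protein sequence")
    index = position - 1
    wild_type = protein[index]
    return [
        protein[:index] + aa + protein[index + 1:]
        for aa in AMINO_ACIDS
        if aa != wild_type
    ]


if __name__ == "__main__":
    # Hypothetical toy peptide, saturated at position 4.
    library = saturation_library("GTSMA", 4)
    print(len(library), "variants, for example", library[:3])
```

Saturating a single site gives only 19 variants, which is small enough to screen exhaustively; saturating several sites at once multiplies the library size by 20 for each additional position.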
https://www.researchgate.net/profile/Frances_Arnold/publication/51436919/figure/fig1/AS:305735431933966@1449904415671/Schematic-outline-of-a-typical-directed-evolution-experiment-The-researcher-begins-with_W640.jpg
Phage Display as a Selection Strategy

After a library of protein mutants has been generated, the mutant with the desired
characteristics is identified or isolated by screening assays and selection strategies such
as phage display. Phage display was invented in 1985 by G. Smith to present foreign peptides
on filamentous bacteriophages (Allison, 2018). Since then, the method has been widely
adopted in protein engineering to produce peptides and antibodies. The technique rests on
the physical linkage between phage phenotype and genotype (Allison, 2018). This linkage makes
it possible to obtain identical phage particles from the same clone of E. coli
(Allison, 2018). Libraries of up to 10^10 different variants can be created and used for
affinity screening to study and characterize protein-ligand interactions.
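To put a library of 10^10 variants in perspective, it helps to compare it with the number of possible random peptides of a given length. The sketch below is a back-of-the-envelope calculation under idealized assumptions (20 amino acids per position and every clone unique); the function names are illustrative.

```python
def peptide_diversity(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct peptides of the given length."""
    return alphabet_size ** length


def max_fully_covered_length(library_size: int, alphabet_size: int = 20) -> int:
    """Longest random peptide whose sequence space fits within the library.

    Assumes, optimistically, that every clone in the library is unique.
    """
    length = 0
    while peptide_diversity(length + 1, alphabet_size) <= library_size:
        length += 1
    return length


if __name__ == "__main__":
    library_size = 10 ** 10  # upper bound quoted in the text
    n = max_fully_covered_length(library_size)
    print(f"fully covered length: {n} residues")
    print(f"sequences of length {n}: {peptide_diversity(n):,}")
    print(f"sequences of length {n + 1}: {peptide_diversity(n + 1):,}")
```

Under these assumptions, a library of this size can contain every possible 7-residue peptide but only a fraction of the 8-residue ones, which is why longer randomized regions are always sampled sparsely.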
The filamentous bacteriophages of E. coli (f1, fd, and M13) can all be used in phage
display. The method described here uses M13. The bacteriophage is a virus that infects the
bacterium (Allison, 2018). The phage contains a circular, single-stranded DNA genome. The
genome encodes ten proteins, five of which are structural proteins of the virion
(Allison, 2018). The protein coat that encloses the genome consists of about 2,700 copies of
gpVIII. Gene fragments encoding a library of peptides or polypeptides are inserted into the
M13 coat protein gene for pVIII (Allison, 2018). The encoded peptides are fused to the coat
protein and are displayed as part of the capsid.
A major limitation of the phage display selection strategy is the loss of function of the
coat protein. To overcome this problem, hybrid phages were developed, and modifications were
made to the coat protein. Drugs derived from phage display include adalimumab, which targets
tumour necrosis factor-alpha; raxibacumab, which targets the protective antigen of Bacillus
anthracis; romiplostim, which targets the thrombopoietin receptor; and ranibizumab, which
targets vascular endothelial growth factor A.

Guided selection starting from a mouse monoclonal antibody (mAb) was used in phage display
to discover adalimumab. Briefly, the heavy chain of a murine antibody was combined with a
library of light chains, and the combinations were selected for binding to human TNF (hTNF).
DNA Shuffling (Stemmer)
https://bitesizebio.com/wp-content/uploads/2016/07/shuffling.jpg
DNA shuffling, also known as 'sexual PCR', is an in vitro technique for recombining
homologous genes (Kikuchi, 2017). First, PCR, which may be error-prone PCR, is used to
amplify and prepare a pool of closely related genes with varying point mutations
(Kikuchi, 2017). The PCR products are the same size as the template. DNase I is then used to
cut the molecules at random positions, breaking them into random fragments (Kikuchi, 2017).
After the random fragmentation, agarose gel electrophoresis is used to purify fragments of
the desired size. These fragments are then denatured, annealed, and extended to reassemble
them (Kikuchi, 2017). No primers are added during this PCR. Denaturation involves heating
the double-stranded fragments to separate them into single strands. The temperature is then
lowered so that single-stranded fragments can anneal to one another wherever they share a
complementary overlapping region (Kikuchi, 2017). The annealed homologous fragments prime
each other. Annealing produces duplexes with 5' and 3' overhangs.

Extension is carried out at a higher temperature that is optimal for the PCR DNA
polymerase. The polymerase extends the recessed 3' end of each strand, using the 5' overhang
of the annealed partner strand as a template. Because DNA polymerase can only add nucleotides
to a 3' end, 3' overhangs are not filled in. Multiple cycles of this PCR are performed to
reassemble full-length recombined genes, which are amplified to obtain millions of copies.
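The outcome of shuffling, a chimeric gene whose segments come from different parents, can be imitated with a very simplified model. The sketch below does not simulate DNase I digestion or homology-dependent annealing; it assumes aligned parent genes of equal length and simply switches template at random crossover points. The function name `shuffle_genes` and the toy parent sequences are illustrative assumptions.

```python
import random


def shuffle_genes(parents: list[str], n_crossovers: int, rng: random.Random) -> str:
    """Build one chimeric gene from aligned parent genes of equal length.

    Crossover points are drawn at random, and each segment between two
    crossovers is copied from a randomly chosen parent. This mimics template
    switching during primerless reassembly in a highly simplified way.
    """
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned and of equal length")
    if not 0 <= n_crossovers < length:
        raise ValueError("n_crossovers must be between 0 and length - 1")
    cuts = sorted(rng.sample(range(1, length), n_crossovers))
    boundaries = [0] + cuts + [length]
    segments = []
    for start, end in zip(boundaries, boundaries[1:]):
        parent = rng.choice(parents)  # template used for this segment
        segments.append(parent[start:end])
    return "".join(segments)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is reproducible
    # Toy parents written in different letters so that the origin of each
    # segment is visible in the output; real parents would be homologous genes.
    parents = ["A" * 30, "C" * 30, "G" * 30]
    for _ in range(3):
        print(shuffle_genes(parents, n_crossovers=4, rng=rng))
```

Each printed chimera has the same length as its parents, and every position carries the base that one of the parents has at that position, which is the defining property of a shuffled library.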
Error-Prone PCR
Error-prone PCR is a variant of standard PCR that is used to generate randomized
libraries of mutant genes. Tiny amounts of a parent DNA molecule can be amplified to produce
a large number of mutated genes. The working principle of the technique is the tendency of
the thermostable Taq polymerase to incorporate mismatched bases and extend them under
imperfect reaction conditions (Kikuchi, 2017). These conditions cause the polymerase to make
mistakes while pairing bases, so that the newly synthesized DNA strands contain errors
(Kikuchi, 2017). Taq polymerase also lacks proofreading ability, so the errors are not
corrected, which makes the error-prone PCR technique efficient.
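The way errors accumulate over cycles can be illustrated with a small simulation. This is a sketch under simplifying assumptions: each molecule is copied as the same strand (complementarity is ignored), only substitutions occur, and every base is miscopied independently with a fixed probability. The names `error_prone_copy`, `error_prone_pcr`, and `error_rate`, as well as the rate used in the example, are illustrative and not measured values.

```python
import random

BASES = "ACGT"


def error_prone_copy(template: str, error_rate: float, rng: random.Random) -> str:
    """Copy a DNA strand, substituting each base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # Miscopy: pick one of the three other bases at random.
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def error_prone_pcr(
    template: str, cycles: int, error_rate: float, rng: random.Random
) -> list[str]:
    """Return the pool of molecules after the given number of cycles.

    Each cycle keeps every existing molecule and adds one error-prone copy
    of it, so the pool doubles and mutations accumulate along lineages.
    """
    pool = [template]
    for _ in range(cycles):
        pool += [error_prone_copy(molecule, error_rate, rng) for molecule in pool]
    return pool


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed so the run is reproducible
    parent = "ATGGCTAAAGGTACTTCTATGGCTTAA"  # hypothetical toy gene
    pool = error_prone_pcr(parent, cycles=8, error_rate=0.01, rng=rng)
    mutants = sum(1 for molecule in pool if molecule != parent)
    print(f"{len(pool)} molecules, {mutants} differ from the parent")
```

Raising `error_rate` or the number of cycles increases the share of mutated molecules in the pool, which mirrors how reaction conditions are tuned to control the mutation load of a library.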
Conclusion
The properties of natural proteins are not always optimized for efficient use in applications.
Factors such as low stability, a non-optimal amino acid sequence, and undesired substrate
specificity pose challenges for the industrial use of proteins. Protein engineering is the
synthesis of novel proteins, or the modification of an existing protein's sequence or
structure, to achieve desired results such as novel functionality, altered substrate
specificity, or improved stability. Protein engineering is supported by other disciplines
and techniques, such as bioinformatics and PCR in molecular biology. The rational design
approach is the most classical protein engineering method and involves 'site-directed
mutagenesis' of proteins. In rational design, the structure and function of a protein must
be known before a rational gene mutation is carried out. However, in most cases of protein
engineering, the structure and function of the protein of interest are not well known, so
the rational approach becomes difficult. An alternative approach known as directed
evolution, which involves random mutagenesis, can then be used.
References
Alley, E. C., Khimulya, G., Biswas, S., et al. (2019). Unified rational protein engineering
with sequence-based deep representation learning. Nature Methods, 16, 1315–1322.
https://doi.org/10.1038/s41592-019-0598-1
Allison, J. (2018). Phage display helps study the interaction of proteins.
Whatisbiotechnology.org.
https://www.whatisbiotechnology.org/index.php/science/summary/phage/phage-display-helps-study-the-interaction-of-proteins
Kikuchi, M., & Harayama, S. (2017). DNA shuffling and family shuffling for in vitro gene
evolution. In J. Braman (Ed.), In Vitro Mutagenesis Protocols. Methods in Molecular
Biology, vol. 182. Humana Press, Totowa, NJ. https://doi.org/10.1385/1-59259-194-9:243
Singh, R. K., Lee, J. K., Selvaraj, C., Singh, R., Li, J., Kim, S. Y., & Kalia, V. C. (2018).
Protein engineering approaches in the post-genomic era. Current Protein & Peptide
Science, 19(1), 5–15. https://doi.org/10.2174/1389203718666161117114243
