Protein Engineering

Student’s Name

Institution

Instructor

Assignment Due Date


Protein Structure and Function

Proteins are essential macromolecules needed by every cell in the body. They are made up of
polypeptide chains consisting of amino acids joined by peptide bonds. The gene encoding a
polypeptide determines its exact amino acid sequence (Alley, 2019). Newly synthesized
polypeptides must fold in order to attain stability. The structure attained by folding is
described at four levels. The primary structure is the simplest: the amino acid sequence that
forms the polypeptide chain, which is determined by the DNA sequence. In the secondary
structure, hydrogen bonds form between groups of the peptide backbone, resulting in regular,
recurring arrangements of amino acids (Alley, 2019). The protein may then fold further into a
three-dimensional conformation known as the tertiary structure. The fourth level, quaternary
structure, arises when several folded polypeptide chains assemble into one protein.

After protein synthesis, a polypeptide folds into its tertiary form, and the conformation
adopted depends on the amino acid sequence. During protein folding, the major stabilizing
forces include hydrophobic interactions, noncovalent attractions such as electrostatic and
van der Waals forces, and covalent linkages such as disulfide bonds.
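
The dependence of the primary structure on the DNA sequence can be sketched in a few lines
of Python. This is a minimal illustration, assuming the standard genetic code and a coding
strand read from its first base; the function name `translate` and the example sequence are
hypothetical and not taken from the cited sources.

```python
from itertools import product

# Standard genetic code, with codons enumerated in the base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence into one-letter amino acid codes.

    Translation stops at the first stop codon or at the end of the sequence.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical reading frame: Met-Ala-Cys followed by a stop codon.
print(translate("ATGGCTTGTTAA"))
```

Changing a single base in the input changes at most one residue in the output, which is the
handle that the mutagenesis methods discussed later rely on.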

Problem statement

The properties of proteins are not always optimized for efficient use in applications.
Factors such as low stability, a non-optimal amino acid sequence, and undesired substrate
specificity pose a challenge to the industrial use of proteins. In vivo, proteins are
synthesized and folded to function optimally (Alley, 2019). However, they have not evolved
for in vitro use in pharmaceutical, commercial, medical, research, and industrial settings
(Alley, 2019). Proteins used in biotechnology need high stability so that they can be stored
for longer periods, and high expression levels to support large-scale manufacturing. These
properties can be obtained through protein engineering.

Protein engineering

Protein engineering is the synthesis of novel proteins, or the modification of an existing
protein's sequence or structure, to achieve desired results such as novel functionality,
altered substrate specificity, and improved stability (Alley, 2019). It aims to improve the
efficacy and pharmacokinetics of recombinant proteins by modifying their structure. An
example of a recombinant protein engineered to improve its efficacy is insulin (Alley, 2019).
It was discovered that native insulin molecules form dimers and hexamers during storage
(Alley, 2019). At the injection site, these higher-order species persist for a longer period,
resulting in slower therapeutic onset. To mitigate this, the amino acid sequence of insulin
was altered to generate fast-acting insulin.

Protein engineering improves the stability and solubility of proteins. The stability of a
protein derives from its structure and bonds (Alley, 2019), and protein engineering modifies
both. For example, disulphide bonds can be introduced to modify folding (about +22 °C with
three S-S bridges), α-helix dipoles can be stabilized, and hydrophobic cavities can be filled.
An example of a protein stabilized through engineering is phage T4 lysozyme (Singh, 2018).
Two cysteines were introduced into the lysozyme in close proximity, which allowed a
disulphide bridge to form spontaneously between residues 21 and 142. This increased the
melting temperature by 11 °C and greatly stabilized the protein without making the structure
more rigid or changing its functional properties. Further, the triple-disulphide variant
unfolds at a temperature 23.4 °C higher than the wild-type lysozyme.

Protein engineering also modifies protein structures to overcome limitations in their
expression. New functional proteins, such as antibodies and biosensors, can be generated, and
protein engineering can be used to study protein function (Singh, 2018). Enzymes and enzyme
inhibitors can be made more stable, as with the thermostable Taq polymerase used in PCR, and
more efficient by improving their catalytic function. New functional proteins can be
fabricated by combining known domains from different proteins, and these can be used in
biotechnology or as therapeutic drugs (Singh, 2018). Examples of therapeutic proteins include
conjugated antibodies, e.g., ones whose Fc region attracts macrophages through Fc receptors,
and radioactively tagged immunoglobulins (Singh, 2018). The human monoclonal antibody
adalimumab, marketed as Humira, is the best-selling engineered drug worldwide. Adalimumab is
a disease-modifying antirheumatic drug (DMARD) used to treat autoimmune diseases such as
rheumatoid arthritis. It acts by inactivating TNF-α. It was the first human monoclonal
antibody to be engineered in vitro (Singh, 2018).


Protein engineering is facilitated by other fields, such as bioinformatics, and by molecular
biology techniques such as PCR. Genomics and proteomics generate vast amounts of biological
data that must be stored and easily retrieved for analysis (Singh, 2018). Bioinformatics is
the scientific discipline concerned with storing, retrieving, and analyzing these data. Raw
biological data such as gene and protein sequences can be stored in a computer database, and
analyzed and queried using computer programs (Singh, 2018). Protein-focused bioinformatics
provides the following resources: protein structure databases, which store and organize
proteins by their three-dimensional structure; sequence databases; protein family databases;
and protein function databases.
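
As a small illustration of storing and querying sequence data, the sketch below uses
Python's built-in `sqlite3` module with an in-memory database. The table layout, the record
identifiers, and the sequences are all made up for this example; real sequence databases are
far larger and use their own formats and query tools.

```python
import sqlite3

# Toy records; the identifiers and sequences are invented for illustration.
RECORDS = [
    ("demo_001", "protein", "MKTAYIAKQR"),
    ("demo_002", "protein", "MACW"),
    ("demo_003", "gene", "ATGGCTTGTTGGTAA"),
]


def build_database(records):
    """Create an in-memory database holding (id, kind, sequence) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences (id TEXT PRIMARY KEY, kind TEXT, seq TEXT)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


def find_containing(conn, kind, motif):
    """Return the ids of sequences of one kind that contain the motif."""
    rows = conn.execute(
        "SELECT id FROM sequences "
        "WHERE kind = ? AND instr(seq, ?) > 0 ORDER BY id",
        (kind, motif),
    )
    return [row[0] for row in rows]


conn = build_database(RECORDS)
print(find_containing(conn, "protein", "MA"))
```

The same pattern of storing once and querying by content is what the structure, sequence,
family, and function databases offer at scale.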

The polymerase chain reaction (PCR) allows DNA sequences to be amplified. Amplification aids
in the engineering of proteins, as seen in the section on approaches for protein engineering
(Singh, 2018). Each cycle involves three steps: denaturation, annealing, and extension.
During denaturation, a sample containing dsDNA is heated to 94 °C to separate it into single
strands (Singh, 2018). The reaction temperature is then reduced to 50-56 °C so that primers
anneal to their complementary strands (Singh, 2018). After annealing, the temperature is
raised to 72 °C, and a thermostable DNA polymerase known as Taq polymerase extends the new
strands from the 3' end of each primer, synthesizing in the 5' to 3' direction. The first
cycle is then complete, doubling the initial amount of DNA. To obtain millions to hundreds of
millions of copies, the cycle is repeated 25 to 30 times.
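
The doubling per cycle can be turned into a quick calculation. The sketch below assumes
idealized exponential amplification; the `efficiency` parameter (1.0 for perfect doubling)
and the function name are assumptions added for illustration. With perfect doubling, one
template yields about 3.4 × 10^7 copies after 25 cycles and about 1.1 × 10^9 after 30; real
reactions fall short of this because efficiency drops in later cycles.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Copies after a number of cycles; efficiency 1.0 means perfect doubling."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One starting template, ideal doubling in every cycle.
for n in (1, 25, 30):
    print(n, f"{pcr_copies(1, n):,.0f}")
```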

In recombinant DNA technology, protein-coding sequences can be re-engineered to optimize
codons for recombinant expression. The proteins are expressed as fusion proteins to enable
the addition of affinity tags. Fusion with an affinity tag such as glutathione S-transferase
(GST) improves protein solubility (Singh, 2018). Other affinity tags that can be used are
hexahistidine, calmodulin-binding peptide, and maltose-binding protein (MBP) (Singh, 2018).
Short peptides can also be used as tags; because they are small, they would not affect the
structure and function of the protein. Short peptide tags also have existing antibodies that
serve as handles for purifying the protein (Singh, 2018). Affinity purification washes away
contaminants while retaining the desired protein (Singh, 2018). Green fluorescent protein
(GFP) can be used to confirm expression in cells by visualizing their fluorescence under UV
light. GFP variants make cells fluoresce in different colours, allowing a variety of proteins
to be visualized simultaneously (Singh, 2018). GFP can also be used to build biosensors.

Approaches for Protein Engineering

Rational Design Approach

The rational design approach is the most classical protein engineering method and involves
'site-directed mutagenesis' of proteins. In rational design, the structure and function of a
protein must be known before a rational gene mutation is carried out (Singh, 2018). Because
it is targeted, this approach generally requires less screening effort. First, rationally
designed changes are made in the cloned DNA construct of the protein to be expressed
(Singh, 2018). The protein is then expressed, purified, and assessed to determine whether the
desired changes have been achieved. The alterations (mutations) in the DNA construct are
site-directed.

Site-directed mutagenesis includes the insertion, deletion, or replacement of one or more
amino acids. This approach is useful if the residue change proves to produce the desired
change in the protein's structure or function (Singh, 2018). There are two methods of
site-directed mutagenesis: 'overlap extension' and 'whole plasmid single round PCR'. In
'overlap extension', two primer pairs are used, with one primer of each pair containing the
mutant codon as a mismatched sequence (Singh, 2018). Two rounds of PCR are carried out. Both
primer pairs are used in the first round, which produces two double-stranded DNA molecules
(Singh, 2018). Denaturation and annealing of these products give two heteroduplexes, with one
strand of each carrying the desired mutagenic codon. Extension by DNA polymerase then fills
in the recessed ends of the overlapping strands. The non-mutated primer set is then used to
amplify the mutagenic DNA in the second round of PCR.

The 'whole plasmid single round PCR' method is the basis of the Stratagene 'QuikChange'
site-directed mutagenesis kit. The method requires two primers that carry the desired
mutation and are complementary to opposite strands of the dsDNA plasmid template. PCR is
performed, and a mutated plasmid is formed. The strands of this plasmid contain nicks, but
the nicks are staggered and do not overlap. The methylated parental template is then
selectively digested with the DpnI endonuclease, leaving the mutated plasmid as a circular
nicked vector. The nicked vector is transformed into competent cells, where the nicks in the
DNA are repaired. The result is a mutated circular plasmid.
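
The idea of a primer pair that carries the mutation and is otherwise complementary to the
template can be sketched as follows. This is an illustrative sketch only: the function names,
the template, and the flank length are hypothetical, and real primer design also weighs
length, melting temperature, and GC content, which are ignored here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def mutagenic_primers(template: str, codon_index: int, new_codon: str, flank: int = 9):
    """Return (forward, reverse) primers that replace one codon.

    codon_index counts codons from 0 in the reading frame starting at base 0.
    Each primer carries the new codon flanked on both sides by `flank` bases
    that match the template.
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must have three bases")
    start = codon_index * 3
    if start - flank < 0 or start + 3 + flank > len(template):
        raise ValueError("codon is too close to the end of the template")
    forward = (
        template[start - flank:start]
        + new_codon
        + template[start + 3:start + 3 + flank]
    )
    return forward, reverse_complement(forward)


# Hypothetical template; codon 4 (ATG, Met) is changed to GCG (Ala).
template = "GGTACCGAAGTTATGCTGAAAGGTCAC"
fwd, rev = mutagenic_primers(template, 4, "GCG")
print(fwd)
print(rev)
```

The two primers are reverse complements of each other, so each anneals to one strand of the
plasmid and both introduce the same codon change.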

https://www.eurekaselect.com/images/graphical-abstract/cpps/19/1/001.jpg

A case study of the rational design approach is the engineering of the enzyme subtilisin.
Subtilisin is a protease added to laundry detergents to improve their cleaning efficiency
(Singh, 2018). Protein engineering modifies the enzyme to be thermostable and active at high
temperature and high pH (Singh, 2018). The enzyme's half-life is five minutes at 65 °C, while
the half-life of a homologous protein from Thermoactinomyces vulgaris is 17 hours at the same
temperature. Differences in amino acid sequence account for this discrepancy, even though the
two proteins are 57% identical.
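
To see what these half-lives mean in practice, the sketch below assumes simple first-order
(exponential) inactivation, which is an assumption made here for illustration and is not
stated in the source. Under that assumption, after one hour at 65 °C subtilisin retains about
0.02% of its activity (twelve half-lives), while the homologue retains about 96%.

```python
def fraction_active(minutes: float, half_life_minutes: float) -> float:
    """Fraction of activity left after first-order decay for `minutes`."""
    return 0.5 ** (minutes / half_life_minutes)


SUBTILISIN_HALF_LIFE = 5.0       # minutes at 65 °C, from the text
HOMOLOGUE_HALF_LIFE = 17 * 60.0  # 17 hours at 65 °C, converted to minutes

for label, half_life in (
    ("subtilisin", SUBTILISIN_HALF_LIFE),
    ("homologue", HOMOLOGUE_HALF_LIFE),
):
    print(label, f"{fraction_active(60, half_life):.4f}")
```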

The short half-life of subtilisin is due to oxidation of the methionine at position 222
(Singh, 2018). The oxidation inactivates the enzyme. The methionine is adjacent to a
catalytic serine residue, which makes the enzyme sensitive to its modification (Singh, 2018).
Oxidation increases the bulk of the side chain and introduces an electronegative oxygen atom,
both of which impair catalytic activity. To make the enzyme more stable, site-directed
mutagenesis of the subtilisin gene is carried out (Singh, 2018). The methionine is replaced
with alanine, and the engineered subtilisin, expressed in E. coli, has high stability and
activity at high temperatures. This engineered subtilisin is now widely used in the
manufacture of laundry detergent.

Directed Evolution Approach

In most cases of protein engineering, the structure and function of the protein of interest
are not well known, so the rational approach becomes difficult. An alternative approach,
directed evolution, which involves random mutagenesis, can then be used (Singh, 2018). In
this method, the protein's structure and mechanism do not have to be known; the only
requirement is a selection scheme suited to the desired protein properties (Singh, 2018). A
library of DNA constructs is generated, sometimes with the help of bioinformatics, and all
the constructs are then expressed as proteins (Singh, 2018). The expressed proteins are
screened at high throughput for the desired characteristics ('hits'). The DNA construct of a
protein with the desired characteristics is then expressed on a large scale.
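
The cycle of library generation, screening, and selection can be sketched as a simple loop.
Everything specific here is a stand-in chosen for illustration: the scoring function simply
counts matches to an arbitrary target string, whereas a real screen measures a property such
as activity or binding, and the names `mutate`, `evolve`, and `score` are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq: str, rng: random.Random) -> str:
    """Return a copy of seq with one randomly chosen residue replaced."""
    pos = rng.randrange(len(seq))
    new = rng.choice([aa for aa in AMINO_ACIDS if aa != seq[pos]])
    return seq[:pos] + new + seq[pos + 1:]


def evolve(parent: str, score, rounds: int = 20, library_size: int = 50, seed: int = 0) -> str:
    """Repeat: build a mutant library, screen it, keep the best hit."""
    rng = random.Random(seed)
    best = parent
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        # The current best stays in the pool, so the score never decreases.
        best = max(library + [best], key=score)
    return best


# Stand-in screen (an assumption): count matches to an arbitrary target.
TARGET = "MKVLAAGW"


def score(seq: str) -> int:
    return sum(a == b for a, b in zip(seq, TARGET))


start = "AAAAAAAA"
result = evolve(start, score)
print(start, score(start), "->", result, score(result))
```

Note that the loop never inspects the structure of the sequence; it needs only a way to rank
variants, which mirrors the requirement for a suitable selection scheme.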

A simple random mutagenesis technique is 'saturation mutagenesis', in which a single amino
acid is substituted with every other possible amino acid at that position (Singh, 2018). The
result is a library of proteins containing all possible point mutations at that position.
Another technique is 'region-specific mutagenesis', which combines the rational and random
approaches to protein engineering (Singh, 2018). Several residues in a specific region are
replaced simultaneously to express a protein with new specificities.
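
For a single position, saturation mutagenesis yields 19 variants, one for each of the other
standard amino acids. The sketch below enumerates them at the protein level; the function
name, the mutation naming (original residue, position, new residue), and the example peptide
are illustrative assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def saturation_library(protein: str, position: int) -> dict:
    """Map mutation names (e.g. 'M3A') to variants at one 1-based position."""
    if not 1 <= position <= len(protein):
        raise ValueError("position is outside the protein")
    original = protein[position - 1]
    library = {}
    for aa in AMINO_ACIDS:
        if aa == original:
            continue  # skip the wild-type residue
        variant = protein[:position - 1] + aa + protein[position:]
        library[f"{original}{position}{aa}"] = variant
    return library


# Hypothetical six-residue peptide; saturate position 3 (M).
library = saturation_library("GSMKLV", 3)
print(len(library))
print(library["M3A"])
```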

https://www.researchgate.net/profile/Frances_Arnold/publication/51436919/figure/fig1/AS:305735431933966@1449904415671/Schematic-outline-of-a-typical-directed-evolution-experiment-The-researcher-begins-with_W640.jpg

Phage Display as a Selection Strategy


After a library of protein mutants has been generated, the mutant with the desired
characteristics is identified or isolated by screening assays and selection strategies such
as phage display. Phage display was invented in 1985 by G. Smith to present foreign peptides
on filamentous bacteriophages (Allison, 2018). Since then, the method has been widely adopted
in protein engineering to produce large peptides and antibodies. The physical linkage between
the phage phenotype and genotype forms the basis of the technique (Allison, 2018). The
linkage aids in obtaining identical phage particles from the same clone of E. coli
(Allison, 2018). Libraries of up to 10^10 different variants can be created and used for
affinity screening to study and characterize protein-ligand interactions.
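
A library of 10^10 variants is large, but the space of possible peptides grows faster. The
sketch below compares the two under the idealized assumption that every library member is a
distinct random peptide built from the 20 standard amino acids; the function names are
hypothetical. On that assumption, a library of this size could in principle contain every
7-residue peptide (20^7 = 1.28 × 10^9) but not every 8-residue peptide (20^8 = 2.56 × 10^10).

```python
def sequence_space(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct peptides of a given length."""
    return alphabet_size ** length


def longest_fully_covered(library_size: int, alphabet_size: int = 20) -> int:
    """Longest peptide length whose full sequence space fits in the library."""
    length = 0
    while sequence_space(length + 1, alphabet_size) <= library_size:
        length += 1
    return length


LIBRARY_SIZE = 10 ** 10  # upper library size quoted in the text
print(sequence_space(7), sequence_space(8))
print(longest_fully_covered(LIBRARY_SIZE))
```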

The filamentous bacteriophages of E. coli (f1, fd, and M13) can all be used in phage display.
The method described here uses M13. The bacteriophage is a virus that infects the bacterium
(Allison, 2018). The phage contains a circular, single-stranded DNA genome. The genome
encodes ten proteins, five of which are structural proteins of the virion (Allison, 2018).
The protein coat that encloses the genome is made up of 2700 copies of gpVIII. Gene fragments
encoding a library of peptides or polypeptides are inserted into the gene for the M13 coat
protein pVIII (Allison, 2018). The encoded peptides are fused to the coat protein and so form
part of the capsid.

A major limitation of phage display as a selection strategy is the loss of function of the
coat protein. To overcome this problem, hybrid phages were developed and the coat protein was
modified. Drugs derived from phage display include adalimumab, which targets tumour necrosis
factor-alpha; raxibacumab, which targets the protective antigen of Bacillus anthracis;
romiplostim, which targets the thrombopoietin receptor; and ranibizumab, which targets
vascular endothelial growth factor A.

Adalimumab was discovered by phage display using guided selection with a mouse mAb. Briefly,
a heavy chain from the murine antibody was combined with a set of light chains, and the
combinations were selected for binding to hTNF.

DNA Shuffling (Stemmer)

https://bitesizebio.com/wp-content/uploads/2016/07/shuffling.jpg

DNA shuffling, also known as 'sexual PCR', is an in vitro technique for recombining
homologous genes that builds on error-prone PCR (Kikuchi, 2017). First, PCR is used to
amplify and prepare a pool of closely related genes with varying point mutations
(Kikuchi, 2017). The PCR products are the same size as the template. DNase I is then used to
cut the molecules at random positions along the DNA strands, breaking them into random
fragments (Kikuchi, 2017). After fragmentation, an agarose gel is used to purify fragments of
the desired size. These fragments are then denatured, annealed, and extended to reassemble
them (Kikuchi, 2017). No primers are added during this PCR. Denaturation involves heating the
double-stranded fragments to separate them into single strands. The temperature is then
lowered so that single-stranded fragments with overlapping regions can anneal to one another
through their complementary bases (Kikuchi, 2017). These homologous fragments thus prime each
other. The annealed pairs have 5' and 3' overhangs.

The overhangs are extended at a higher temperature, the optimum for the PCR DNA polymerase.
Where a pair has 5' overhangs, the polymerase extends the recessed 3' end of each strand,
using the overhang of the partner strand as the template. Since DNA polymerase can only
extend from a 3' end, 3' overhangs are not filled in. Multiple PCR cycles are carried out to
amplify the reassembled genes and obtain millions of copies.
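
The net effect of fragmentation and primerless reassembly is a chimeric gene whose segments
come from different parents. The sketch below is a deliberately simplified model of that
outcome, not of the chemistry: it assumes aligned parents of equal length and uses random
crossover points in place of DNase I fragment boundaries. The function name and the parent
sequences are hypothetical.

```python
import random


def shuffle_genes(parents, crossovers: int = 3, seed: int = 0) -> str:
    """Build one chimera from equal-length, aligned parent genes.

    Toy model: random crossover points stand in for fragment boundaries,
    and each segment is copied from a randomly chosen parent.
    """
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned and of equal length")
    rng = random.Random(seed)
    points = sorted(rng.sample(range(1, length), crossovers))
    chimera = []
    for start, end in zip([0] + points, points + [length]):
        donor = rng.choice(parents)
        chimera.append(donor[start:end])
    return "".join(chimera)


# Hypothetical homologous genes that differ by a few point mutations.
parents = [
    "ATGGCTAAAGGTCTGGAATAA",
    "ATGGCAAAAGGTCTCGAATAA",
    "ATGGCTAAGGGTCTGGAGTAA",
]
child = shuffle_genes(parents)
print(child)
```

Because every position in the chimera comes from one of the parents at the same position,
point mutations that arose in different parents can end up combined in a single gene.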

Error-Prone PCR

Error-prone PCR is a variant of standard PCR that is used to generate randomized gene
libraries. Tiny amounts of a parent DNA molecule can be amplified to produce a large number
of mutated genes. The working principle of this technique is the ability of the thermostable
Taq polymerase to incorporate mismatched bases and continue amplification under imperfect
reaction conditions (Kikuchi, 2017). The imperfect conditions cause the polymerase to make
mistakes while pairing bases, so that the newly synthesized DNA strands contain errors
(Kikuchi, 2017). Taq polymerase also lacks proofreading ability, which makes the error-prone
PCR technique efficient.
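
How copying errors accumulate into a mutant library can be illustrated with a small
simulation. This is a toy model under stated assumptions: each base is miscopied
independently with a fixed probability per cycle, strand complementarity is ignored, and the
error rate is exaggerated so that mutations show up in a short sequence. The function names
and the template are hypothetical.

```python
import random

BASES = "ACGT"


def error_prone_copy(template: str, error_rate: float, rng: random.Random) -> str:
    """Copy a DNA strand, miscopying each base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            base = rng.choice([b for b in BASES if b != base])
        copy.append(base)
    return "".join(copy)


def error_prone_pcr(template: str, cycles: int, error_rate: float, seed: int = 0) -> list:
    """Return the pool after `cycles` rounds; every molecule is copied each round."""
    rng = random.Random(seed)
    pool = [template]
    for _ in range(cycles):
        pool += [error_prone_copy(molecule, error_rate, rng) for molecule in pool]
    return pool


# Hypothetical template and an exaggerated error rate so mutations are visible.
pool = error_prone_pcr("ATGGCTAAAGGTCTGGAATAA", cycles=5, error_rate=0.02)
print(len(pool), len(set(pool)))
```

The first number printed is the pool size after five doublings and the second is the number
of distinct sequences in it; errors made in early cycles are inherited by all later copies.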

Conclusion

The properties of proteins are not always optimized for efficient use in applications.
Factors such as low stability, a non-optimal amino acid sequence, and undesired substrate
specificity pose a challenge to the industrial use of proteins. Protein engineering is the
synthesis of novel proteins, or the modification of an existing protein's sequence or
structure, to achieve desired results such as novel functionality, altered substrate
specificity, and improved stability. It is facilitated by other fields, such as
bioinformatics, and by molecular biology techniques such as PCR. The rational design approach
is the most classical protein engineering method and involves 'site-directed mutagenesis' of
proteins. In rational design, the structure and function of a protein must be known before a
rational gene mutation is carried out. However, in most cases of protein engineering, the
structure and function of the protein of interest are not well known, so the rational
approach becomes difficult. An alternative approach, directed evolution, which involves
random mutagenesis, can then be used.

References

Alley, E. C., Khimulya, G., Biswas, S., et al. (2019). Unified rational protein engineering
with sequence-based deep representation learning. Nature Methods, 16, 1315–1322.
https://doi.org/10.1038/s41592-019-0598-1

Allison, J. (2018). Phage display helps study the interaction of proteins.
Whatisbiotechnology.org.
https://www.whatisbiotechnology.org/index.php/science/summary/phage/phage-display-helps-study-the-interaction-of-proteins

Kikuchi, M., & Harayama, S. (2017). DNA shuffling and family shuffling for in vitro gene
evolution. In J. Braman (Ed.), In Vitro Mutagenesis Protocols (Methods in Molecular Biology,
vol. 182). Humana Press, Totowa, NJ. https://doi.org/10.1385/1-59259-194-9:243

Singh, R. K., Lee, J. K., Selvaraj, C., Singh, R., Li, J., Kim, S. Y., & Kalia, V. C. (2018).
Protein engineering approaches in the post-genomic era. Current Protein & Peptide Science,
19(1), 5–15. https://doi.org/10.2174/1389203718666161117114243
