
DR THOMAS SHAFEE

CC BY 4.0

PROTEIN
ENGINEERING
MANIPULATING PROTEIN
STRUCTURE AND FUNCTION FOR
BIOTECHNOLOGY

OUTLINE
WHAT PROTEIN ENGINEERING IS
‐ Background and introduction
Definition
Aims
Applications

RATIONAL DESIGN
‐ Knowledge-based engineering
Methods
Examples
Limitations

DIRECTED EVOLUTION
‐ Brute force engineering
Methods
Examples
Limitations

COMBINED APPROACHES
‐ The best of both worlds

WHAT PROTEIN
ENGINEERING
IS
DEFINITION | AIMS | APPLICATIONS

PROTEIN ENGINEERING

noun: protein engineering
Techniques … to manipulate the structure and function of a
protein so that it acquires specific desired properties

noun: genetic engineering
The alteration of the genome of an organism by laboratory
techniques

Oxford English Dictionary

AIMS OF PROTEIN
ENGINEERING

‐ Activity: catalysis, specificity, stereoselectivity, regulation
‐ Environment: temperature, pH, salinity, solvent
‐ Expression: sequence, host organism, solubility
‐ Stability: temperature, solvent, pH, salt, proteolysis
‐ Characterise: solve structure, function, substrate, regulation, binding partners
‐ Allergenicity, toxicity, immunogenicity

[Figure: natural evolution, analysis, protein engineering]

Images: Wikimedia commons

APPLICATIONS OF PROTEIN
ENGINEERING

PHARMACEUTICALS | AGRICULTURE | RESEARCH | WHITE BIOTECH | BIOENERGY

Images: Wikimedia commons
RATIONAL
DESIGN
METHODS | EXAMPLES | LIMITATIONS

THE RATIONAL
DESIGN
PROCESS

‐ Based on protein knowledge
Structure
Mechanism
Dynamics
Natural variation

‐ Analogous to mechanical engineering

[Figure: Knowledge ⟶ Hypothesis ⟶ Modification]

TOPICS
‐ Active site modification
‐ Split proteins
‐ Disulphides
‐ Chemical modification
‐ Fusion proteins
‐ Cyclisation
‐ Bioinformatics
‐ Computer modelling

CHEMICAL MODIFICATION
‐ Formaldehyde
Extensive modification ⟶ Inactivated toxoid production

‐ PEGylation (polyethylene glycol)
Flexible hydrophilic coat ⟶ Solubility
Reduced accessibility ⟶ Protease resistance and reduced antigenicity
Increased size ⟶ Serum half-life

‐ Fluorophores (e.g. fluorescein)
Fluorescent labelling ⟶ Tracking location or dynamics

‐ Prosthetic catalytic groups (e.g. selenocysteine)
Modified reactivity ⟶ Altered or novel catalysis

‐ Considerations
Exposure of modified residues
Original function of modified residues

URICASE
IMMUNOGENICITY

AIM
‐ Optimise uricase as gout treatment
Reduce immunogenicity
Increase serum half-life

ENGINEERING
‐ Attached PEG polymers
Lysine coupling
‐ Optimised PEG number and length
Maximise improvements
Avoid destabilisation or activity reduction

OUTCOME
‐ Optimal PEG number and length
10 kDa polymers
9 polymers per subunit of the tetramer
1000x reduced antigenicity
Also improved solubility at neutral pH
Also increased serum half-life
‐ Krystexxa (Crealta Pharmaceuticals)

[Figure: uricase converts uric acid to allantoin; model of PEG-uricase with 10 kDa polyethylene glycol chains]

PEG-uricase in the management of treatment-resistant gout and hyperuricemia, (M Sherman 2008)
Images: Wikimedia commons

SITE-DIRECTED
MUTAGENESIS

‐ Modified PCR
Whole plasmid
Overlap extension

‐ Introduce point mutations
‐ Introduce short insertions or deletions

[Figure labels: mutation, primer, DNA, exponential increase, exponential decrease, linear increase, non-methylated, DpnI-digested]
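A point mutation can be planned in silico before primers are ordered. The sketch below is only an illustration: the function names `mutate_codon` and `mismatches` and the 15-base sequence are made up for this example, and it does not design primers or model the PCR itself.

```python
def mutate_codon(orf: str, residue_number: int, new_codon: str) -> str:
    """Return a copy of `orf` with the codon for `residue_number` (1-based) replaced."""
    orf = orf.upper()
    new_codon = new_codon.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if len(new_codon) != 3 or set(new_codon) - set("ACGT"):
        raise ValueError("new codon must be three DNA bases")
    start = (residue_number - 1) * 3
    if not 0 <= start < len(orf):
        raise ValueError("residue number outside the ORF")
    return orf[:start] + new_codon + orf[start + 3:]


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Made-up ORF: Met-Ala-Lys-Gly-stop
wild_type = "ATGGCTAAAGGTTAA"
# Lys3 -> Ala (AAA -> GCA)
mutant = mutate_codon(wild_type, 3, "GCA")
print(mutant, mismatches(wild_type, mutant))
```

The mismatch count is the number of bases a mutagenic primer would have to change, which is one practical input when choosing between synonymous target codons.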

FUSION PROTEINS
‐ Creation
Remove stop codon of first gene
Ligate genes together in frame
Include linker codons

‐ Aims
Combine the properties of the components
E.g. Addition of antibody Fc fragment to proteins increases their serum half-life
Co-localise the components
E.g. Set of enzymes that work in a reaction pathway

‐ Considerations
Linker length and flexibility
Ability of proteins to rotate relative to each other
Distance between protein components
Protease resilience
Ability of domains to fold

[Figure: Gene 1 and Gene 2 joined as Gene 1, Linker, Gene 2]
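The three creation steps (remove the stop codon, keep the reading frame, include linker codons) can be sketched as a string operation. This is a minimal sketch under stated assumptions: `fuse_genes` is a hypothetical helper, the sequences are invented, and the Gly-Gly-Ser linker is just a common illustrative choice, not one taken from the slides.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_genes(gene1: str, gene2: str, linker: str) -> str:
    """Join two ORFs in frame: gene1 (stop removed) + linker + gene2."""
    gene1, gene2, linker = gene1.upper(), gene2.upper(), linker.upper()
    for name, seq in (("gene1", gene1), ("gene2", gene2), ("linker", linker)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of 3")
    if gene1[-3:] in STOP_CODONS:
        gene1 = gene1[:-3]  # remove stop codon of first gene
    return gene1 + linker + gene2


def codons(seq: str) -> list[str]:
    """Split a sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


# Invented example sequences; GGT GGT TCT encodes Gly-Gly-Ser
fusion = fuse_genes("ATGGCTTAA", "ATGAAATGA", "GGTGGTTCT")
print(codons(fusion))
```

A real construct would also need checks this sketch omits, such as restriction sites and whether the linker length lets both domains fold.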

PFU POLYMERASE
PROCESSIVITY

AIM
‐ Create a polymerase for long templates
Increase processivity
Retain fidelity and stability

ENGINEERING
‐ Fusion
Pyrococcus furiosus DNA polymerase (Pfu)
Sulfolobus solfataricus dsDNA binding domain (Sso7d)
‐ Linker
Short tripeptide linker
‐ Generality
Also works with other polymerases

OUTCOME
‐ Improvements
10x increase in processivity
Improved salt tolerance
Can amplify >15 kb templates
‐ Phusion (New England Biolabs)

[Figure: polymerase fused to dsDNA binder; amplification versus template length (kb) for Pfu (80 U/ml) and Pfu-S (10 U/ml)]

A novel strategy to engineer DNA polymerases for enhanced processivity and improved performance in vitro, (Y Wang 2004)
New England Biolabs

SPLIT PROTEINS
‐ Creation
Locate flexible, surface loops
Create two open reading frames
First half of protein with stop codon in loop
Second half of protein with start codon in loop

‐ Aims
Couple colocalisation to activity
Fuse half-proteins to other proteins
Measure protein binding
Biosensor
Logic gates

‐ Considerations
Half-proteins must:
fold independently
not spontaneously ‘dimerise’
be inactive when apart
bind and be active when brought together

[Figure: original protein split into half-proteins; half-proteins colocalised by fusion proteins A and B form the active split protein, which converts substrate to product]

DISULPHIDES
‐ Creation
Mutation of two codons to cysteine
Protein kept in oxidising environment

‐ Aims
Stability enhancement
Enthalpy increase ≈ 3.5 kcal/mol
Entropy decrease ≈ Logarithm of trapped loop length

‐ Considerations
Inter-cysteine distance
Inter-cysteine orientation
Trapped loop length and flexibility
Original function of mutated residues
Original function of flexibility
Folding pathway of protein (multistep)


CYCLISATION
‐ Creation
Termini of most proteins happen to be close together
Express protein with extra linker to bridge gap
Ligate peptide ends

‐ Aims
Thermostability
Up to 1.7 kcal/mol
Protease resistance
Especially exopeptidase

‐ Considerations
Linker length
Ligation method
Chemically (e.g. by solid-phase synthesis)
Enzymatically (e.g. by sortase)

[Figure: fraction of total structures that are cyclisable versus N- to C-termini distance (Å); improvement in folding energy ΔΔGcycl (kcal/mol) versus linker length (residues)]

CyBase: a database of cyclic protein sequences and structures, with applications in protein discovery and engineering, (C Wang 2008)
Effect of Backbone Cyclization on Protein Folding Stability: Chain Entropies of both the Unfolded and the Folded States are Restricted, (H Zhou

CONOTOXIN
STABILITY

AIM
‐ Increase conotoxin protease resistance
Pain killer activity by specific binding to ion channels
‐ Improve stability in human blood

ENGINEERING
‐ Produced whole peptide by solid-phase synthesis
Linker length of 5, 6, or 7 residues
cMII-5, cMII-6, cMII-7

OUTCOME
‐ cMII-5
No longer folded or functional
‐ cMII-6 and cMII-7 retained full activity
Specific ion channel blocking
Minimal structural difference
‐ Reduced protease susceptibility
With purified EndoGluC protease (a)
In human blood plasma (b)

[Figure: structures of original MII and cyclic cMII-6; degradation (a) in purified EndoGluC protease and (b) in human blood plasma]

Engineering stable peptide toxins by means of backbone cyclization: stabilization of the alpha-conotoxin MII, (R Clark 2005)

ACTIVE SITE MODIFICATION
‐ Creation
Structural insight into function of active site residues
Site-directed mutagenesis to alter key functional groups

‐ Aims
Modify binding
Affinity
Specificity
Stereoselectivity
Modify catalysis
Modify regulation

‐ Considerations
Requires knowledge of protein structure and mechanism
Mutations may have additional, unpredicted effects

[Figure: PROTEIN STRUCTURE (scaffold for supporting active site; modulate dynamics) contains the ACTIVE SITE, made up of BINDING SITES (bind and orient substrate) and the CATALYTIC SITE (stabilise transition state; stabilise leaving groups; form intermediate covalent bonds)]

BIOINFORMATIC
APPROACHES
‐ Codon optimisation
Different organisms have different tRNA ratios
Matching codon frequency to host increases expression

‐ Considerations
Altered codons can affect mRNA (stability, 2° structure, IRES)
Increased translation rates can cause misfolding

‐ Consensus sequence
Most mutations are mildly destabilising
Through genetic drift, homologues accumulate different mutations
Therefore consensus should be more stable than existing sequences

‐ Considerations
Availability of homologous sequences

[Figure: Naturally existing sequences ⟶ Hypothesis ⟶ Modification]
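Matching codon frequency to the host can be illustrated with the simplest possible strategy: always pick the host's most frequent codon for each residue. Treat this as a sketch under loud assumptions. The usage fractions in `HOST_CODON_USAGE` are illustrative placeholders rather than measured values for any organism, the table covers only four symbols, and `back_translate` is a hypothetical name.

```python
# Illustrative codon preferences for an imaginary host (NOT measured data).
HOST_CODON_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.74, "AAG": 0.26},
    "L": {"CTG": 0.50, "TTA": 0.13, "TTG": 0.13,
          "CTT": 0.10, "CTC": 0.10, "CTA": 0.04},
    "*": {"TAA": 0.64, "TGA": 0.29, "TAG": 0.07},
}


def back_translate(protein: str) -> str:
    """Encode a protein using the most frequent host codon for every residue."""
    dna = []
    for residue in protein:
        usage = HOST_CODON_USAGE[residue]
        dna.append(max(usage, key=usage.get))
    return "".join(dna)


optimised = back_translate("MKL*")
print(optimised)
```

This "most frequent codon" rule deliberately ignores the considerations listed on the slide: it can change mRNA stability and secondary structure, and faster translation can cause misfolding.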

PHYTASE
STABILITY

AIM
‐ Improve phytase thermostability
Improving phosphorus bioavailability in animal feed

ENGINEERING
‐ Align 13 related fungal sequences
Sequences 50 - 70% identical to each other
If no consensus in column ⟶ most common residue (*)
or ⟶ residue from most stable sequence (^)
‐ Starting thermostabilities (Tm) 56 - 63 °C

OUTCOME
‐ Final Tm = 78 °C
Crystal structure resolves loops too flexible to be seen in natural phytases
Some residues form hydrogen bond network
‐ Later work further increased Tm to 90 °C
Added 6 extra sequences to alignment
Changed consensus residues that weren’t stabilising

[Figure: alignment of starting sequences with the consensus (Con.) row]

From DNA sequence to improved functionality: using protein sequence comparisons to rapidly design a thermostable consensus phytase, (M Lehmann 1999)
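The column-by-column consensus rule can be sketched as follows. This is one reading of the slide, not the published protocol: the most common residue wins each column, and a tie is broken with the residue from the most stable parent. The function name `consensus`, the toy alignment, and the choice of which row counts as "most stable" are all assumptions for illustration.

```python
from collections import Counter


def consensus(aligned: list[str], most_stable_index: int) -> str:
    """Build a consensus from equal-length, gap-free aligned sequences."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    result = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        tied = [residue for residue, n in counts.items() if n == top]
        if len(tied) == 1:
            result.append(tied[0])                    # clear most common residue
        else:
            result.append(column[most_stable_index])  # tie: use most stable parent
    return "".join(result)


# Toy alignment (invented); the last row is assumed to be the most stable parent.
alignment = ["MKVLA", "MKILA", "MRVLA", "MRILG"]
print(consensus(alignment, most_stable_index=3))
```

Note that the result need not match any single parent, which is the point of the approach: each parent's mildly destabilising drift mutations are outvoted.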

COMPUTATIONAL
MODELLING

‐ Improving stability
Model energy of folded and unfolded protein variants

‐ Improving activity
Increase existing catalysis
Catalyse new reactions, never seen in nature
e.g. Kemp elimination or Retro-aldol

‐ Considerations
Requires deep knowledge of reaction mechanism
Requires extreme computational power
Simulation either ignores:
quantum mechanics of active site
or structure and dynamics in rest of protein

[Figure: Computer simulation ⟶ Hypothesis ⟶ Modification]

DE NOVO
ENZYME
DESIGN

‐ Theozymes
Theoretical enzyme
Quantum mechanical modelling
‐ Disembodied amino acids placed to stabilise reaction transition state
‐ Existing protein structures searched for backbones with correct orientations
‐ Other residues in active site optimised for packing

De novo enzymes by computational design, (H Kries 2013)

CREATING A
RETRO-ALDOLASE

AIM
‐ Enzymatically catalyse unnatural reaction
Retro-aldol reaction not performed by any known enzyme

ENGINEERING
‐ Theozyme
Amino acids positioned to increase reactivity of nucleophilic Lys, stabilise transition state, stabilise leaving group
Protein structures searched for backbones that could correctly position these residues
Surrounding residues optimised for packing
‐ 42 designs in 13 protein scaffolds
Active sites grafted onto backbone
Genes synthesised and expressed

OUTCOME
‐ 75% of variants showed rate enhancements of 10^1 - 10^4 (kcat/kuncat)
Still many orders of magnitude worse than natural enzymes
‐ Crystal structure of most active complexed with covalent inhibitor
Confirmed mechanism proceeds as designed

[Figure: retro-aldol reaction scheme catalysed by the enzyme]

Robust design and optimization of retroaldol enzymes, (E Althoff 2012)

PROS AND CONS OF
RATIONAL DESIGN

BENEFITS
‐ Intellectually satisfying
‐ Controlled outcome
‐ Range of available techniques
‐ Increasing computational power

LIMITATIONS
‐ Requires deep understanding
Natural variation
Structure
Dynamics
Mechanism
…for starting protein and changes
‐ High failure rate
Failures rarely reported
INTERMISSION
DIRECTED
EVOLUTION
METHODS | EXAMPLES | LIMITATIONS

THE
EVOLUTION
CYCLE

‐ Mimic natural evolution
Single gene evolved in cycle
Mutagenesis (variation)
Screening (fitness differences)
Gene amplification (heredity)

‐ Experimental control over
Mutation rate
Environment
Selection pressure

‐ Protein understanding not required

Images: Wikimedia commons
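The cycle of variation, screening and heredity can be mimicked with a toy simulation. Everything here is a stand-in: `TARGET` is an invented "ideal" sequence, `fitness` (number of matching positions) replaces a real assay, and the defaults for rounds, library size and mutation rate are arbitrary. It is meant to show the loop structure only.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
TARGET = "MKTAYIAKQR"  # hypothetical ideal sequence standing in for an assay


def fitness(seq: str) -> int:
    """Toy assay: number of positions matching the hypothetical target."""
    return sum(a == b for a, b in zip(seq, TARGET))


def mutate(seq: str, rate: float, rng: random.Random) -> str:
    """Variation: replace each residue with a random one with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else aa for aa in seq)


def evolve(start, rounds=20, library_size=200, rate=0.1, seed=0):
    rng = random.Random(seed)
    parent = start
    history = [fitness(parent)]
    for _ in range(rounds):
        library = [mutate(parent, rate, rng) for _ in range(library_size)]  # mutagenesis
        best = max(library, key=fitness)                                    # screening
        if fitness(best) >= fitness(parent):                                # heredity
            parent = best
        history.append(fitness(parent))
    return parent, history


final, history = evolve("A" * len(TARGET))
print(final, history)
```

The three experimental controls on the slide map onto parameters: `rate` is the mutation rate, `fitness` encodes the environment, and taking only the single best variant forward is a very strong selection pressure.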

TOPICS

‐ Generating variation: mutagenesis
‐ Detecting fitness differences: screening
‐ Ensuring heredity: gene amplification

UNDERSTANDING
MUTATIONS

noun: distribution of fitness effects
The relative proportions of fitnesses within a population of mutants, i.e. the ratio of deleterious:neutral:beneficial mutations

noun: epistasis
The non-additive effects of multiple mutations

[Figure: (a) and (b) frequency histograms of fitness effects, 0% to 100%; plot of trait value illustrating epistasis]
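Both definitions can be made concrete with a few lines of arithmetic. The sketch below assumes an additive model on the raw trait scale (other scales, such as log fitness, are equally common), and the ±0.05 band used to call a mutation "neutral" is an arbitrary choice. The function names and all numbers are invented for illustration.

```python
def dfe_proportions(effects: list[float], neutral_band: float = 0.05) -> dict[str, float]:
    """Summarise fitness effects as deleterious : neutral : beneficial proportions."""
    counts = {"deleterious": 0, "neutral": 0, "beneficial": 0}
    for effect in effects:
        if effect < -neutral_band:
            counts["deleterious"] += 1
        elif effect > neutral_band:
            counts["beneficial"] += 1
        else:
            counts["neutral"] += 1
    return {label: n / len(effects) for label, n in counts.items()}


def epistasis(wt: float, a: float, b: float, ab: float) -> float:
    """Deviation of the double mutant from the additive expectation."""
    expected_ab = wt + (a - wt) + (b - wt)
    return ab - expected_ab


# Invented fitness effects of eight single mutants, relative to wild type
effects = [-0.5, -0.2, 0.0, 0.01, 0.3, -0.9, -0.04, 0.2]
print(dfe_proportions(effects))

# Invented trait values: wild type, mutant A, mutant B, double mutant AB
print(epistasis(wt=1.0, a=1.5, b=2.0, ab=4.0))  # positive: AB better than additive
```

A result of zero from `epistasis` means the two mutations combine additively; a positive or negative value is the non-additive part that the definition refers to.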

GENERATING
VARIATION

‐ Point mutations
Error prone PCR
Random mutations
Whole gene or region

‐ Insertion/deletions
Difficult protocols
Large effects

‐ Shuffling
Requires similar genes
Can use multiple genes

[Figure: starting gene diversified by point mutations, insertions and deletions, or shuffling]

Images: Wikimedia commons

STABILISING
CYTOCHROME P450

AIM
‐ Improve thermostability of P450
Used for oxidation in white biotech

ENGINEERING
‐ T50 of starting proteins
T50: temperature at which 50% is irreversibly unfolded after 10 minutes
55 °C | 44 °C | 49 °C
‐ Shuffled parent genes in 8 fragments
Recombination points chosen to be in loops
‐ Screened thermostability of 184 variants
Expressed in cell lysates
Measured percent soluble after heating

OUTCOME
‐ Found stabilised variant
64 °C
Retains oxidase activity
‐ Generated model to predict stability from sequence
Synthesised sequence with highest predicted stability

Exploring protein fitness landscapes by directed evolution, (P Romero 2009)
A diverse family of thermostable cytochrome P450s created by recombination of stabilizing fragments, (Y Li 2007)

CONSIDERATIONS

‐ Mutation rate
Proportion of beneficial mutations
Likelihood of finding synergistic interactions
Avoiding multiple mutants all being deleterious
Expected screening/selection throughput

‐ Library size
Throughput of the screening/selection
Number of possible single or double mutations for the gene
Full coverage of library can guarantee that best available variant was found

‐ Types of mutations
Single nt mutations can’t convert a codon to all others
Shuffling introduces large numbers of mutations likely to be functional
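
Two of the points above are easy to check numerically: how fast the number of possible mutants grows, and which amino acids a single nucleotide change can reach. The sketch uses the standard genetic code; the function names and the 300-residue example length are assumptions chosen for illustration.

```python
from itertools import product
from math import comb

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reachable_amino_acids(codon: str) -> set[str]:
    """Amino acids reachable from `codon` by exactly one nucleotide substitution."""
    original = CODON_TABLE[codon]
    found = set()
    for position in range(3):
        for base in BASES:
            if base == codon[position]:
                continue
            neighbour = codon[:position] + base + codon[position + 1:]
            aa = CODON_TABLE[neighbour]
            if aa not in ("*", original):
                found.add(aa)
    return found


def single_mutants(length: int) -> int:
    """Number of distinct single amino acid substitutions in a protein."""
    return 19 * length


def double_mutants(length: int) -> int:
    """Number of distinct double amino acid substitutions in a protein."""
    return comb(length, 2) * 19 ** 2


print(sorted(reachable_amino_acids("ATG")))   # from Met, only a subset of the other 19
print(single_mutants(300), double_mutants(300))
```

Comparing `double_mutants` with the throughput of the available screen shows quickly whether full coverage of a library is realistic.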


DETECTING FITNESS
DIFFERENCES

SELECTION
‐ Direct coupling
Desired activity ⟶ Survival to next round

SCREENING
‐ Indirect coupling
Desired activity ⟶ Quantitative assay ⟶ Sorting above threshold ⟶ Survival to next round

SELECTION SYSTEMS

SELECTION
‐ Direct coupling
Desired activity ⟶ Survival to next round

‐ Binding
Immobilised substrate
Bind, wash, elute

‐ Cell survival
Genome kept the same except target gene
Antibiotic resistance / Auxotrophy

[Figure: variants A, B, C with an immobilised target; variants A, B, C with a toxic compound and a non-toxic compound]

BIOFUEL EFFLUX
TRANSPORTER

AIM
‐ Improve efflux of biofuels from bacteria
Affects overall rate of bio-production of n-octane

ENGINEERING
‐ Octane not toxic enough (true target)
‐ Cell survival selection (toxic surrogate)
Genomic transporter removed
epPCR library of transporter gene on plasmid
Growth on toxic surrogate substrate n-octanol
Relative growth proportional to efflux
Extract mixed population plasmids
‐ Combine identified mutations
Variant with 4 mutations identified as optimal

OUTCOME
‐ Surprise promiscuous activity
47% improved n-octane efflux
400% improved α-pinene efflux
‐ Only one of the mutations in the channel of transporter

Directed evolution of an E. coli inner membrane transporter for improved efflux of biofuel molecules, (J L Foo 2013)

SCREENING SYSTEMS

SCREENING
‐ Indirect coupling
Desired activity ⟶ Quantitative assay ⟶ Sorting above threshold ⟶ Survival to next round

‐ Microtitre plate (96 well)
Chromogenic (e.g. para-nitrophenol)
Fluorogenic (e.g. fluorescein)

‐ Microfluidics
Flow cytometry
Microdroplet emulsion

[Figure: enzyme reaction scheme with a nitrophenyl substrate and H2O; microtitre plate (96 well); microdroplet sorter]

Images: Wikimedia commons

CONSIDERATIONS

SELECTION
High throughput
High sensitivity
Usually cheap
‐ Binding
Versatile
Doesn’t require surrogate substrate
Can’t assay catalysis
‐ Cell survival
Multiple rounds fast
Simple to use but difficult to create
Difficult to tune dynamic range

SCREENING
High control
Information-rich
‐ Microtitre plate
Low throughput
High sensitivity
Simple to create
‐ Microfluidics
High throughput
Low sensitivity
Difficult to create

CONSIDERATIONS
BOTH SELECTION AND SCREENING
‐ You get what you screen for
Surrogate substrates may be different from true targets
Evolved variants mostly display the ‘easiest’ solution (most evolvable)

‐ You have to have a starting point


Low-level promiscuous activities against non-native substrates

‐ Improving upon nature is difficult


Proteins are typically already highly optimised for their native function
Improving upon the native function is typically very difficult


ENSURING HEREDITY

‐ After identifying an improved protein variant, it is necessary to isolate the DNA sequence that encodes it.

COMPARTMENT | COVALENT LINK

[Figure: gene and protein held together in a compartment; gene and protein joined by a covalent link]

COMPARTMENT

‐ Compartment
Protein and gene are co-localised either in vivo or in vitro

‐ In vivo examples
Cellular assays

‐ In vitro examples
Microtitre plate
Micro-droplet sorting

Compartment = Cell
Each cell expresses one gene variant
Substrate must be able to enter cell
Or protein displayed on cell surface
Each cell assayed (e.g. FACS)

Compartment = Microtitre plate
Cells grown in clonal populations in plate wells
Each population expresses one gene variant
Cells lysed to release protein
Substrate added
Storage copy made
Cell lysate assayed

Compartment = Microdroplet
Each compartment contains
Gene variant
Transcription-translation (cellular or in vitro)
Substrate
Each microdroplet assayed

COVALENT
LINK

‐ Covalent link
In vitro transcription-translation
Chemical bond between protein and gene

‐ Co-translation examples
Phage display
Ribosome display
mRNA display

‐ Transient compartment examples
Bead display
DNA display

Ribosome display
Gene ends in linker without stop codon
Translation in cold conditions, Mg2+
Stalled ribosome binds protein C-ter

mRNA display
Puromycin is attached to the mRNA 3’
Translation ends
Puromycin covalently binds protein C-ter

DNA display
Benzylguanine is added to the DNA 3’
Protein expressed as a fusion with alkylguanine DNA alkyltransferase (AGT) in emulsion
Benzylguanine covalently inhibits AGT
Compartment removed when reaction complete

ANTIBODY
MATURATION

AIM
‐ Make high-affinity therapeutic antibody
Target = signalling or receptor protein
Affinity (Kd) < 1 nanomolar

ENGINEERING
‐ Phage display
Library size up to 10^10
Fusion of gene to viral coat protein gene
Panning (binding and washing)
Phage infection of bacteria
‐ Repeat half cycle
Increasing selection stringency
‐ Repeat full cycle
Introducing new variation

OUTCOME
‐ Sub nanomolar affinities
Single round from extremely large naïve libraries
Improved through multiple rounds
‐ Many commercialised antibody therapeutics use these methods
E.g. adalimumab, necitumumab, belimumab, trastuzumab

[Figure: phage M13 displaying an antibody fragment fused to the pIII coat protein; gene library ⟶ phage library ⟶ panning on immobilised target]

Selecting and screening recombinant antibody libraries, (H Hoogenboom 2006)
Images: Wikimedia commons

CONSIDERATIONS

COMPARTMENTALISATION
Substrate must enter compartment
‐ In vivo
High protein expression
‐ In vitro
Can engineer toxic activities
High control over environment

COVALENT LINKAGE
Can engineer toxic activities
Typically binding only
‐ mRNA/ribosome display
RNA not very stable
‐ DNA display
High control over environment

EVOLUTION ROUNDS
‐ Selection consistency
Increase stringency
Alter selected properties

‐ Number of variants taken forwards
Avoid getting stuck in a local maximum
Ease of analysis
Shuffle successful variants together

‐ Number of rounds
Improve upon last round
Find epistatic interactions
Diminishing returns (frequency and magnitude)

[Figure: cycle of variation, fitness differences, heredity]

RATIONALISING RESULTS

‐ Output of directed evolution is a set of improved variants

‐ Understanding improvement
Typically difficult
Intellectually satisfying
May inform future engineering

‐ Methods
Measuring effects of mutations individually
Finding the minimal set of mutations that give the improvement
Removes ‘hitchhiking’ mutations
Structural and mechanistic studies

[Figure: cycle of variation, fitness differences, heredity leading to output]
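Finding the minimal set of mutations can be framed as a search over subsets, smallest first. The sketch below is illustrative only: `minimal_set` is a hypothetical helper, the mutation names are invented, and `toy_activity` is a made-up lookup standing in for real measurements of each reconstructed variant.

```python
from itertools import combinations


def minimal_set(mutations, measure, threshold):
    """Return the smallest subset of mutations whose measured activity meets the threshold."""
    for size in range(len(mutations) + 1):
        for subset in combinations(mutations, size):
            if measure(frozenset(subset)) >= threshold:
                return set(subset)
    return None


def toy_activity(present: frozenset) -> float:
    """Invented activities: A12V and L78F act synergistically, the rest hitchhike."""
    activity = 1.0
    if "A12V" in present:
        activity *= 2.0
    if "L78F" in present:
        activity *= 3.0
    if {"A12V", "L78F"} <= present:
        activity *= 5.0  # synergy only when both are present
    return activity


evolved = ["A12V", "G45D", "L78F", "S101T"]
full_activity = toy_activity(frozenset(evolved))
print(minimal_set(evolved, toy_activity, threshold=0.9 * full_activity))
```

Exhaustive search needs up to 2^n measurements for n mutations, so in practice it suits variants with few mutations; for larger sets, mutations are often reverted one at a time instead.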

PROS AND CONS OF DIRECTED EVOLUTION

BENEFITS
‐ Simple concepts
‐ Widely applicable
‐ High success rate
‐ Requires no knowledge of protein (or of mutations)

LIMITATIONS
‐ Starting activity range
Requires some starting activity
Can rarely improve native activity
‐ Requires high-throughput assay
Typically enzyme-specific
Screens can be expensive
Synergistic mutations hard to find
‐ Understanding results is not trivial
Especially to apply to another protein
‐ You get what you screen for
Surrogate substrates can yield artefacts
Side-effects if not constrained

COMBINED
APPROACHES
CUTTING EDGE IDEAS

BLENDED TECHNIQUES

SIMULTANEOUS
‐ Smart libraries
Maximising library fitness
Finding synergistic mutations

SEQUENTIAL
‐ Improving designed enzymes
Optimising crude designs
Compensating for structural disruption


SMART LIBRARIES
‐ The problem with mutagenic libraries
High throughput screening is difficult and expensive
Beneficial mutations are rare
Synergistic mutations are even rarer

‐ Optimise DFE proportions
Reduce mutations likely to be deleterious
Increase mutations likely to be beneficial
Make multiple simultaneous mutations to find synergy

‐ Choosing mutations to include in library
Focussed saturation around active site
Natural variation
Computational model prediction

[Figure: prior knowledge (natural variation, previous experiment, structure, mechanism) ⟶ focussed mutagenesis (DNA synthesis, semi-random codons)]
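Semi-random codons are usually written with IUPAC degenerate bases. As an illustration, the sketch below expands a degenerate codon and counts what it encodes under the standard genetic code. NNK and NDT are widely used examples chosen here as assumptions; the slides do not say which codons any particular study used, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Subset of IUPAC nucleotide codes needed for the examples below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT", "D": "AGT"}


def expand(degenerate: str) -> list[str]:
    """List every concrete codon matched by a three-letter degenerate codon."""
    return ["".join(c) for c in product(*(IUPAC[symbol] for symbol in degenerate))]


def summarise(degenerate: str) -> dict[str, int]:
    """Count codons, distinct amino acids and stop codons for a degenerate codon."""
    encoded = [CODON_TABLE[codon] for codon in expand(degenerate)]
    return {
        "codons": len(encoded),
        "amino_acids": len(set(encoded) - {"*"}),
        "stops": encoded.count("*"),
    }


for scheme in ("NNN", "NNK", "NDT"):
    print(scheme, summarise(scheme))
```

Fewer codons per position means a smaller library to screen: randomising two positions together gives the square of the per-position codon count, which is why restricted codons make simultaneous mutations tractable.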

LIPASE
STEREOSELECTIVITY

AIM
‐ Alter stereoselectivity of a bacterial lipase
For use as stereospecific esterase in white biotech
‐ Compare Iterative Saturation Mutagenesis to epPCR

ENGINEERING
‐ Iterative Saturation Mutagenesis (ISM)
Systematic mutation around active site pocket
6 residues randomised in three pairs (A: 1,2 | B: 3,4 | C: 5,6)
Focussed custom codons
‐ Microtitre plate UV absorbance assay
Screened 10,000 variants in three rounds
‐ Compare to previous studies
50,000 epPCR variants screened

OUTCOME
‐ Stereoselectivity
ISM increase 1.1 ⟶ 594 (3 mutations)
epPCR increase 1.1 ⟶ 51 (6 mutations)
‐ Extreme positive epistasis
Removing any one of the mutations reduced stereoselectivity to <3

[Figure: substrate in the active site binding pocket, with pocket residues grouped into pairs A, B, C]

Iterative Saturation Mutagenesis Accelerates Laboratory Evolution of Enzyme Stereoselectivity: Rigorous Comparison with Traditional Methods, (M Reetz 2010)

IMPROVING DESIGNED
ENZYMES
Initial protein designs or modifications done rationally

Engineered protein then evolved to optimise function

‐ Optimisation of ‘crude’ designs


Catalytic residue angles
Binding interactions
Hydrophobic core packing

‐ Compensatory mutations
Offset structural destabilisation


IMPROVING A
RETRO-ALDOLASE

AIM
‐ Further increase designer enzyme efficiency
Designed enzyme rate well below natural enzymes

ENGINEERING
‐ RA45: screen 1000 variants
10 rounds of epPCR and screening
Top 1-10% most active each round
‐ RA95: screen 800 variants
5 rounds of saturation mutagenesis around active site
Consensus surface mutations introduced for stability
8 rounds of epPCR and screening
Top 1% most active each round

OUTCOME
‐ RA45 ⟶ 700x activity increase
14 mutations throughout structure
‐ RA95 ⟶ 4,400x activity increase
Altered catalytic lysine residue!
13 total mutations throughout structure

[Figure: evolved RA45 and designed RA95 structures, with mutations from saturation mutagenesis and epPCR marked]

Robust design and optimization of retroaldol enzymes, (E Althoff 2012)
Evolution of a designed retro-aldolase leads to complete active site remodelling, (L Giger 2013)

SUMMARY
WHAT PROTEIN ENGINEERING IS
Manipulation of protein structure and function to acquire specific properties

RATIONAL DESIGN
‐ Use of structural and mechanistic knowledge
Manual, bioinformatic and computational approaches
‐ Limited by understanding

DIRECTED EVOLUTION
‐ Mimic of natural evolution cycle
Generating diversity
Detecting fitness differences
Ensuring heredity
‐ Limited by throughput

COMBINED APPROACHES
Semi-rational ‘smart’ libraries (Simultaneous)
Improving upon previous designs (Sequential)

[END]
‐ Recommended reading
Recent advances in engineering proteins for biocatalysis, (Y Li 2014)
Exploring protein fitness landscapes by directed evolution, (P Romero 2009)
Beyond directed evolution—semi-rational protein engineering and design, (S Lutz 2010)

‐ Further reading for more specific topic reviews


Protein disulfide engineering, (A Dombkowski 2014)
Split-protein systems: beyond binary protein–protein interactions, (S Shekhawat 2011)
A critical analysis of codon optimization in human therapeutics, (V Mauro 2014)
Computational tools for designing and engineering enzymes, (J Damborsky 2014)
Engineering proteins for thermostability: the use of sequence alignments versus rational design and directed
evolution, (M Lehmann 2001)
Advances in the directed evolution of proteins, (M Lane 2014)
Expanding the Enzyme Universe: Accessing Non-Natural Reactions by Mechanism-Guided Directed
Evolution, (H Renata 2015)
New genotype–phenotype linkages for directed evolution of functional proteins, (H Leemhuis 2006)
The role of phage display in therapeutic antibody discovery, (C Chan 2014)
Semi-rational approaches to engineering enzyme activity: combining the benefits of directed evolution and
rational design, (R Chica 2005)

‐ References for example experiments are footnoted on the slides


OPEN QUESTIONS TO
CONSIDER

‐ Why do most rational design experiments fail?


‐ Why do later rounds of directed evolution typically give
smaller improvements?
‐ How can you know if the property you’re trying to engineer
is even possible?
‐ How can you maximise your chances of success?
‐ Which techniques are the best to combine?
