Understanding Enzymes
Function, Design, Engineering, and Analysis

edited by
Allan Svendsen
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2016 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works


Version Date: 20160419

International Standard Book Number-13: 978-981-4669-33-7 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher
cannot assume responsibility for the validity of all materials or the consequences of their use. The
authors and publishers have attempted to trace the copyright holders of all material reproduced in
this publication and apologize to copyright holders if permission to publish in this form has not
been obtained. If any copyright material has not been acknowledged please write and let us know so
we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc.
(CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization
that provides licenses and registration for a variety of users. For organizations that have been
granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and
are used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com

March 21, 2016 12:20 PSP Book - 9in x 6in 00-Allan-Svendsen-Prelims

Contents

Introduction xix

PART I ENZYME FUNCTION

1 A Short Practical Guide to the Quantitative Analysis of
Engineered Enzymes 3
Christopher D. Bayer and Florian Hollfelder
1.1 Introduction 3
1.2 Quantifying Reaction Progress 4
1.3 Typical Saturation Plots Give Michaelis–Menten
Parameters 5
1.4 What Can Go Wrong? 8
1.5 Dealing with Multiphasic and Pre-Steady-State
Kinetics 12
1.6 Evaluating Enzymes 16

2 Protein Conformational Motions: Enzyme Catalysis 21
Xinyi Huang, C. Tony Liu, and Stephen J. Benkovic
2.1 Introduction 21
2.2 Multidimensional Protein Landscape and the
Timescales of Motions 22
2.3 Conformational Changes in Enzyme–Substrate
Interactions 26
2.4 Conformational Changes in Catalysis 28
2.4.1 Protein Dynamics of DHFR in the Catalytic
Cycle 30
2.4.2 Temporally Overlap: Correlation Does Not
Mean Causation 32
2.4.3 Fast Timescale Conformational Fluctuations 34
2.4.4 Effect of Conformational Changes on the
Electrostatic Environment 36
2.5 Conservation of Protein Motions in Evolution 38
2.6 Designing Protein Dynamics 39
2.7 Concluding Remarks 40

3 Enzymology Meets Nanotechnology: Single-Molecule
Methods for Observing Enzyme Kinetics in Real Time 47
Kerstin G. Blank, Anna A. Wasiel, and Alan E. Rowan
3.1 Introduction 48
3.2 Single-Turnover Detection 53
3.2.1 Fluorescent Reporter Systems 53
3.2.2 Measurement Setup 56
3.2.3 Data Analysis 57
3.3 Single-Enzyme Kinetics 60
3.3.1 Candida antarctica Lipase B 63
3.3.2 Thermomyces lanuginosus Lipase 67
3.3.3 α-Chymotrypsin 73
3.3.4 Nitrite Reductase 78
3.3.5 Summary 84
3.4 New Developments Facilitated by Nanotechnology 88
3.4.1 Nano-optical Approaches 89
3.4.2 Nano-electronic Approaches 96
3.4.3 Nanomechanical Approaches 103
3.4.4 Summary 108
3.5 Conclusion 110

4 Interfacial Enzyme Function Visualized Using Neutron, X-Ray,
and Light-Scattering Methods 125
Hanna Wacklin and Tommy Nylander
4.1 Phospholipase A2: An Interfacially Activated Enzyme 126
4.1.1 Neutron Reflection 129
4.1.2 Ellipsometry 130
4.1.3 Activity of Naja mossambica mossambica PLA2 130
4.1.4 Fate of the Reaction Products 133
4.1.5 The Lag Phase and Activation of Pancreatic
PLA2 135
4.1.6 Distribution of Products during the Lag Phase 138
4.1.7 Hydrolysis of DPPC by Pancreatic PLA2 139
4.1.8 Role of the Reaction Products in PLA2
Activation 141
4.1.9 Effect of pH and Activation by
Me-β-cyclodextrin 144
4.2 Other Lipolytic Enzyme Reactions on Surfaces 150
4.2.1 Triacylglycerol Lipases and the Role of Lipid
Liquid Crystalline Nanostructures 150
4.3 Cellulase Enzymes 154
4.4 Conclusion 158

5 Folding Dynamics and Structural Basis of the Enzyme
Mechanism of Ubiquitin C-Terminal Hydrolases 167
Shang-Te Danny Hsu
5.1 Introduction 169
5.1.1 UCH-L1 171
5.1.1.1 Genetic association between UCH-L1
and neurodegenerative diseases 171
5.1.1.2 UCH-L1 in oncogenesis 175
5.1.2 Molecular Insights into the Pathogenesis
Associated with UCH-L1 175
5.1.3 UCHL3 177
5.1.4 UCHL5 178
5.1.5 BAP1 179
5.2 UCH Structures 180
5.3 Folding Dynamics and Kinetics 183
5.4 Substrate Recognition 184
5.5 Enzyme Mechanism 186
5.6 Conclusion 189

6 Stabilization of Enzymes by Metal Binding: Structures of Two
Alkalophilic Bacillus Subtilases and Analysis of the Second
Metal-Binding Site of the Subtilase Family 203
Jan Dohnalek, Katherine E. McAuley, Andrzej M. Brzozowski,
Peter R. Østergaard, Allan Svendsen, and Keith S. Wilson
6.1 Introduction: Subtilases and Metal Binding 203
6.1.1 Calcium-Binding Sites in Bacillus: Proposal for
a Standard Nomenclature 209
6.1.2 The Weak Metal-Binding Site 214
6.2 Two New Structures of Subtilases with Altered
Calcium Sites 216
6.2.1 Proteinase SubTY 216
6.2.1.1 The overall fold 216
6.2.1.2 The active site 216
6.2.1.3 SubTY calcium and sodium sites 218
6.2.1.4 SubTY disulfide bridge 219
6.2.2 SubHal 220
6.2.2.1 The unliganded form of SubHal 220
6.2.2.2 The SubHal:CI2A complex 221
6.2.2.3 Termini, surface, and pH stability of
SubHal 221
6.2.2.4 The two crystallographically
independent SubHal:CI2A complexes 223
6.2.2.5 The calcium sites in SubHal 224
6.2.2.6 The active site of SubHal 226
6.2.3 Enzymatic Activity of SubTY and SubHal 228
6.2.4 Comparison of SubTY and SubHal with Other
Subtilases 228
6.2.5 The SubHal C-domain Compared to the
Eukaryotic PCs, Furin and Kexin 232
6.2.5.1 Active site comparison 233
6.2.5.2 The specificity pockets 234
6.2.5.3 Inhibitor CI2A binding 234
6.2.6 Activity Profiles 236
6.2.7 Comparison of Metal Binding at the Strong and
Weak Sites in the S8 Family 236
6.2.8 The Ca-II and Na-II Metal-Binding Sites 237
6.3 Conclusion: Implications for Structural Studies of
Enzymes 248
6.4 Materials and Methods 249
6.4.1 SubTY 249
6.4.1.1 Protein production and purification 249
6.4.1.2 Purification of the SubTY:CI2A (1:1)
complex 250
6.4.1.3 Crystallization 250
6.4.1.4 Structure determination 251
6.4.2 SubHal 251
6.4.2.1 Protein production and purification 251
6.4.2.2 Purification of the SubHal:CI2A (1:1)
complex 252
6.4.2.3 Crystallization 252
6.4.2.4 Structure determination 253
6.4.3 Protease Assays 256
6.4.4 pH Stability 257
6.4.5 Data Deposition 257

7 Structure and Functional Roles of Surface Binding Sites in
Amylolytic Enzymes 267
Darrell Cockburn and Birte Svensson
7.1 Introduction 267
7.2 Identification of SBSs: X-Ray Crystallography 271
7.3 Bioinformatics of SBS Enzymes 273
7.4 Binding Site Isolation 275
7.5 Protection of Binding Sites from Chemical Labeling 277
7.6 Nuclear Magnetic Resonance 277
7.7 Binding Assays 278
7.8 Activity Assays 282
7.9 Future Prospects 283
7.10 Conclusion 286

8 Interfacial Enzymes and Their Interactions with Surfaces:
Molecular Simulation Studies 297
Nathalie Willems, Mickaël Lelimousin, Heidi Koldsø,
and Mark S. P. Sansom
8.1 Introduction 297
8.2 Enzyme Interactions at Interfaces 299
8.3 Molecular Dynamic Simulations of Biomolecular
Systems 301
8.4 Lipases 303
8.4.1 Atomistic MD Studies of Lipase Interactions
with Interfaces 304
8.4.2 The Role of Water in Lipase Catalysis at
Interfaces 307
8.5 Coarse-Grained MD Studies of Interfacial Enzymes:
Orientation and Interactions 309
8.5.1 Phospholipase A2 309
8.5.2 PTEN 310
8.6 Conclusions 311

PART II ENZYME DESIGN

9 Sequence, Structure, Function: What We Learn from
Analyzing Protein Families 321
Michael Widmann and Jürgen Pleiss
9.1 Introduction 321
9.2 Detection of Inconsistencies Utilizing a Standard
Numbering Scheme 323
9.3 Identification of Functionally Relevant Positions 327
9.4 The Modular Structure of Thiamine
Diphosphate–Dependent Decarboxylases 330
9.5 Stereoselectivity-Determining Positions: The
S-Pocket Concept in Thiamine
Diphosphate–Dependent Decarboxylases 333
9.6 Regioselectivity-Determining Positions: Design of
Smart Cytochrome P450 Monooxygenase Libraries 336
9.7 Substrate Specificity–Determining Positions: The
GX/GGGX Motif in Lipases 340
9.8 Conclusion 341

10 Bioinformatic Analysis of Protein Families to Select
Function-Related Variable Positions 351
Dmitry Suplatov, Evgeny Kirilin, and Vytas Švedas
10.1 Introduction 352
10.2 Bioinformatic Analysis of Evolutionary Information
to Identify Function-Related Variable Positions 359
10.2.1 Problem Definition 359
10.2.2 Scoring Schemes in the Variable Position
Selection: High-Entropy, Subfamily-Specific,
and Co-Evolving Positions 361
10.2.3 Association of the Variable Positions with
Functional Subfamilies 366
10.2.4 How to Select Functionally Important
Positions as Hotspots for Further
Evaluation: Implementation of Statistical
Analysis 366
10.3 The Bioinformatic Analysis of Diverse Protein
Superfamilies 369
10.3.1 Bioinformatic Challenges at Studying
Enzymes 369
10.3.2 Zebra: A New Algorithm to Select
Functionally Important Subfamily-Specific
Positions from Sequence and Structural
Data 370
10.4 Subfamily-Specific Positions as a Tool for Enzyme
Engineering 375
10.5 Conclusion 377
11 Decoding Life Secrets in Sequences by Chemicals 387
Zizhang Zhang
11.1 Introduction 388
11.2 Linking an Enzyme’s Activity to Its Sequence 389
11.3 Refining the Sequence Space to a Specific Function
by Directed Evolution 395
11.4 Linking Chemistry to -Omics with High-Throughput
Screening Methods 398
11.5 Finding Large Sequence Space of a Specific
Function from Microbial Diversity 400
11.6 Linking Sequences to Substromes at the Molecular
Level 404
11.6.1 Biocatalytic Study of EHs 405
11.6.2 Pharmacological Study of EHs 407
11.6.3 Mechanistic Study of EHs 407
11.6.4 What We Have Learned from the Studies of
EH 410
11.6.5 Technologies with Potentials in
Genochemistry Approach 410
11.7 Correlating with Computational Methods 410
11.8 Problems That Genochemistry Can Potentially
Tackle 413
11.9 Conclusion 414

12 Role of Tunnels and Gates in Enzymatic Catalysis 421
Sérgio M. Marques, Jan Brezovsky, and Jiri Damborsky
12.1 Introduction 421
12.2 Protein Tunnels 423
12.2.1 Structural Basis and Function 423
12.2.2 Identification Methods 427
12.2.3 Molecular Engineering 429
12.3 Protein Gates 431
12.3.1 Structural Basis and Function 431
12.3.2 Identification Methods 437
12.3.3 Molecular Engineering 440
12.4 Conclusions 442

13 Molecular Descriptors for the Structural Analysis of Enzyme
Active Sites 465
Valerio Ferrario, Lydia Siragusa, Cynthia Ebert,
Gabriele Cruciani, and Lucia Gardossi
13.1 Introduction: Molecular Descriptors for
Investigation of Enzyme Catalysis 465
13.2 Molecular Descriptors Based on Molecular
Interaction Fields 467
13.3 Multivariate Statistical Analysis for Processing and
Interpretation of Molecular Descriptors 472
13.4 Grind Descriptors for the Study of Substrate
Specificity 475
13.5 VolSurf Descriptors for the Modeling of Substrate
Specificity 477
13.6 Differential MIFS Descriptors for the Study of
Enantioselectivity 479
13.7 Hybrid MIFS Descriptors for the Computation of
Entropic Contribution to Enantiodiscrimination 481
13.8 Analysis of Enzyme Active Sites for Rational
Enzyme Engineering 484
13.9 BioGPS Descriptors for in silico Rational Design
and Screening of Enzymes 489
13.10 Conclusions 495

14 Hydration Effects on Enzyme Properties in Nonaqueous
Media Analyzed by MD Simulations 501
Diana Lousa, António M. Baptista, and Cláudio M. Soares
14.1 Enzyme Reactions in Nonaqueous Solvents 502
14.2 Classes of Nonaqueous Solvents 503
14.3 The Role of Water in Nonaqueous Biocatalysis 504
14.4 Effect of Water Content on Enzyme Structure and
Dynamics 504
14.5 Effect of Water Content on Enzyme Selectivity 507
14.6 Hydration Mechanisms of Enzymes in Polar and
Nonpolar Solvents 508
14.7 Enzyme Behavior as a Function of Water Activity 510
14.8 Hydration Effects on Enzyme Reactions in Ionic
Liquids 512
14.9 Hydration Effects on Enzyme Reactions in
Supercritical Fluids 514
14.10 Conclusions 516

15 Understanding Esterase and Amidase Reaction Specificities
by Molecular Modeling 523
Per-Olof Syrén
15.1 Introduction 523
15.2 Fundamental Catalytic Concepts 525
15.2.1 Fundamental Chemistry of Amides and
Esters 525
15.2.2 Esterases and Amidases and Their
Metabolic Significance 525
15.2.3 Fundamental Chemical Aspects of Amidase
and Esterase Catalysis 526
15.2.4 Impact of Stereoelectronic Effects on the
Enzymatic Reaction Mechanism 529
15.3 Molecular Modeling of Fundamental Catalytic
Concepts 529
15.3.1 QM Calculations on Amidases and
Esterases 529
15.3.2 MD Simulations on Amidases and Esterases 535
15.3.3 QM/MM Simulations on Amidases and
Esterases 539
15.4 Outlook and Implications for Enzyme Design 544
15.5 Additional Comments 546

PART III ENZYME DIVERSITY

16 Toward New Nonnatural TIM-Barrel Enzymes Using
Computational Design and Directed Evolution Approaches 561
Mirja Krause and Rik K. Wierenga
16.1 Introduction 562
16.2 General Aspects of Protein Engineering 566
16.2.1 Library Creation Methods 569
16.2.2 Structure-Based Library Design 572
16.2.3 Optimal Libraries for Directed Evolution
Methods 574
16.2.4 Data-Driven Design (Semirational Design) 578
16.2.5 Protein Engineering by Selection and
Screening Methods 579
16.3 Directed Evolution Studies with TIM-Barrel
Enzymes 584
16.3.1 Protein Engineering Studies of TIM-Barrel
Proteins 586
16.3.2 The Kemp Eliminases 590
16.4 Concluding Remarks 596

17 Handling the Numbers Problem in Directed Evolution 613
Carlos G. Acevedo-Rocha and Manfred T. Reetz
17.1 Introduction 614
17.2 Saturation Mutagenesis in Directed Evolution 617
17.3 Statistical Analyses 620
17.3.1 Conventional Statistics Based on the
Patrick and Firth Algorithm 620
17.3.2 Statistics Based on the Nov Algorithm 624
17.4 How to Group and Randomize Amino Acid
Positions 626
17.5 Fitness Landscapes 628
17.5.1 Fujiyama vs. Badlands Fitness Landscapes 628
17.5.2 Fitness-Pathway Landscapes and How to
Escape from Local Minima 630
17.6 Conclusions and Perspectives 636

18 Hints from Nature: Metagenomics in Enzyme Engineering 643
Esther Gabor, Birgit Heinze, and Jürgen Eck
18.1 Metagenomics and the Ideal Enzyme 644
18.2 Molecular Microdiversity 647
18.3 Metagenomic Enzyme Chimera 650
18.4 Outlook 653

19 A Functional and Structural Assessment of Circularly
Permuted Bacillus circulans Xylanase and Candida
antarctica Lipase B 657
Stephan Reitinger and Ying Yu
19.1 Introduction 657
19.2 Naturally Occurring Circular Permutations:
Selected Examples 658
19.3 Circular Permutation of Bacillus circulans
Xylanase 661
19.4 Circular Permutation on Candida antarctica
Lipase B 669
19.5 Conclusion 674

20 Ancestral Reconstruction of Enzymes 683
Satoshi Akanuma and Akihiko Yamagishi
20.1 Introduction 683
20.2 Reconstruction of an Ancestral Protein Sequence 684
20.2.1 Overview 684
20.2.2 Methods for Ancestral Sequence
Reconstruction 684
20.2.3 Early Works 686
20.3 The Commonote 687
20.3.1 The Last Universal Common Ancestor, the
Commonote 687
20.3.2 Theoretical Studies on the Environmental
Temperature of the Commonote 688
20.3.3 Reconstruction of an Ancestral Nucleoside
Diphosphate Kinase 689
20.3.4 Estimation of the Environmental
Temperature of the Commonote 692
20.4 Application to Designing Thermally Stable Proteins 693
20.4.1 Design of Thermally Stable Proteins 693
20.4.2 Case Studies to Create Thermally Stable
Enzymes by Introducing Ancestral
Residues as Amino Acid Substitutions 694
20.4.3 Reconstruction of Thermally Stable,
Ancestral DNA Gyrase Using a Small Set of
Homologous Amino Acid Sequences 696
20.5 Conclusion 697

PART IV ENZYME SCREENING AND ANALYSIS

21 High-Throughput Screening or Selection Methods for
Evolutionary Enzyme Engineering 707
Shuobo Shi, Hongfang Zhang, Ee Lui Ang,
and Huimin Zhao
21.1 Introduction 708
21.2 Selection 710
21.2.1 Solid-Medium-Based Selection 717
21.2.2 Liquid-Medium-Based Selection 719
21.2.3 Display-Based Selection 722
21.3 Screening 724
21.3.1 Chromatography- and
Mass-Spectrometry-Based Screening 725
21.3.2 Solid-Medium-Based Screening 726
21.3.3 Microtiter-Plate-Based Screening 727
21.3.4 Yeast Two-/Three-Hybrid System 729
21.3.5 FACS-Based Screening 729
21.3.6 Microfluidics-Based Screening 732
21.4 Conclusions and Prospects 734

22 Nanoscale Enzyme Screening Technologies 745
Helen Webb-Thomasen and Andreas H. Kunding
22.1 Introduction 745
22.2 Approaches to Nanocompartmentalization of
Enzymes 746
22.2.1 Liposomes 747
22.2.1.1 Addressability 747
22.2.1.2 Reagent exchange 749
22.2.2 Polymersomes and Virus-Like Particles 751
22.2.3 Water-in-Oil Emulsion Droplets 752
22.2.3.1 Addressability 755
22.2.3.2 Reagent exchange 755
22.3 Microfabricated Chip Devices for Enzyme
Compartmentalization and Screening 756
22.3.1 Microfluidic-Generated Emulsion Droplets 757
22.3.2 Microfabricated Arrays 762
22.3.2.1 Optical fiber microarrays 762
22.3.2.2 Elastomeric microarrays 763
22.3.2.3 Surface tension microarrays 765
22.4 Conclusion and Current Challenges 767
22.5 Future Improvements 769

23 Computational Enzyme Engineering: Activity Screening
Using Quantum Chemistry 777
Martin R. Hediger
23.1 Motivation 778
23.2 Introduction 779
23.3 Methods 780
23.3.1 Calculation Engines 780
23.3.2 Molecular Modeling 782
23.3.3 Software 786
23.4 Applications 786
23.4.1 Overview 786
23.4.2 Engineering Candida antarctica Lipase B 787
23.4.3 Engineering Bacillus circulans Xylanase 793
23.5 Conclusions 800

24 In Silico Screening of Enzyme Variants by Molecular
Dynamics Simulation 805
Hein J. Wijma
24.1 Potential Applications of MD Simulations For
Improving Enzymes 805
24.2 Molecular Dynamics vs. Other in silico Methods 809
24.3 Improving Catalytic Activity by MD Screening 812
24.3.1 Transition-State Simulation 812
24.3.2 High-Energy Intermediate Simulation 814
24.3.3 Substrate Simulation with Near-Attack
Conformations 815
24.3.4 Substrate Simulation with Monitoring of H
Bonds 817
24.4 Predicting and Improving Binding Affinity 818
24.5 MD Screening to Improve Enzyme Stability 819
24.6 Improving Correlation between MD and
Experiment 822
24.6.1 Force Field Inaccuracies 822
24.6.2 Sampling Concerns 823
24.6.3 Other Concerns 824
24.7 Outlook and Further Possibilities 825

25 Kinetic Stability of Variant Enzymes 835
Jose M. Sanchez-Ruiz
25.1 Kinetics vs. Thermodynamics in Protein Stability 835
25.2 Mutation Effects on Kinetic Stability: A Description
Based on the Transition State for Irreversible
Denaturation 838
25.3 Kinetic Stability Linked to the Breakup of
Interactions in the Transition State: Pro-dependent
Proteases 841
25.4 Kinetic Stability Linked to Substantially Unfolded
Transition States: Thioredoxin and Phytase
Enzymes 842
25.5 Role of Solvation Barriers in Kinetic Stability:
Lipases and Triose Phosphate Isomerases 848
25.6 Concluding Remarks 852

Index 859


Introduction

More than three decades ago, the hope emerged that protein
engineering would be able to predict protein and enzyme function
on the basis of X-ray crystal structures. The expectations were that
we should be able to create goal-oriented functions in the enzyme of
interest. A large effort was made to obtain the structures of enzymes
of great importance for understanding biological processes and
enzymes of general commercial interest in many industries. A large
variety of structures of enzymes from many biological pathways,
as well as enzymes of commercial interest, have been solved,
including carbohydrate-acting enzymes, proteolytic enzymes, and
lipolytic enzymes, and have helped tremendously in understanding
the structure–function relationships. They have also revealed how
much we still need to learn in order to manipulate genes to make
enzymes react in a desired way.
Today, efforts to understand and benefit from enzyme function have
at least two major focuses: (1) data analysis and (2) a more detailed
understanding. Much of what has been learned cannot be said to be
statistically well founded, but I hope the scientific community will
still accept a few examples as plausible hypotheses to investigate
further. Our growing knowledge of enzyme function, with input from
atomistic mobility, hydrogen bonding, the electrostatic shifts that
accompany mobility and changes in the relative positions of
coordinated atoms, and the macroscopic dependence on changes in the
enzyme environment, leaves us with a very complex multidimensional
space describing how enzymes work. This makes it nearly impossible
to collect experimentally enough statistics on all the possible
influencing factors, as would theoretically be needed, and so it is
difficult to draw sound, comprehensive, and significant conclusions.
Commonly, even very large data sets yield single conclusions that
are incorrectly drawn, because the amount of data for each parameter
alone is too small to make the findings statistically significant.
Data analysis will nevertheless add to a more detailed understanding
and to suggestions for function. Some chapters touch upon
data-driven discovery, but most focus on hypothesis-driven research
that tests one specific enzyme in a specific environment with few
parameters, giving exciting insights into the complexity of enzyme
nature.
During my work on developing enzymes for technical use and on the
enzyme–substrate interaction, it has been tempting to combine
quantum mechanical calculations of the energetics of the catalytic
reaction with the overall molecular mobility obtained using standard
force fields, as well as electrostatics calculations and docking, in
order to inform on three important topics of enzyme function:
(1) the initial substrate binding to the enzyme, (2) the local
fitting needed to reach the spatial arrangement that can accommodate
the reactive state, as seen in molecular dynamics mobility and
hydrogen-bonding patterns, and (3) the energetics of the reactive
state, as estimated by quantum mechanical calculations. This overall
picture can be stated in a formula as shown below:
Enzyme function = f(overall binding)
+ f(local fluctuations and interactions) + f(reactive energy)
In other words, enzyme function depends on three key factors:
(1) the overall fit of the substrate, binding in the correct
orientation for the more detailed local interactions in the
immediate surroundings of the active site, (2) the hydrogen bonding
and electrostatic interactions needed to secure the correct
arrangement for catalysis to take place, and (3) the quantum
mechanical energy of the catalytic reaction. Molecular dynamics
simulations show that some hydrogen bonds are present only at
certain times during a simulation, which suggests that activity
occurs only when the structure is in a particular conformational
state containing the important hydrogen bonds. When these hydrogen
bonds are in place at the same time, the reaction can occur; if any
one of the three factors is not fulfilled at that moment, no
reaction occurs.
Examples of important hydrogen bonds are presented in Chapter 15.

Chapter 10, on sequences and design, discusses how sequence
alignment information, docking, and molecular simulation of variant
molecules can be combined to extract more combinatorial information.
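The requirement that several hydrogen bonds be in place at the same time can be illustrated with a short calculation. The Python sketch below is only an illustration under stated assumptions: the bond names, the per-frame occupancy values, and the function `fraction_reactive_frames` are all hypothetical and are not taken from any chapter of this book. It counts the fraction of simulation frames in which every required hydrogen bond is present simultaneously.

```python
from typing import Dict, List


def fraction_reactive_frames(
    hbond_present: Dict[str, List[bool]], required: List[str]
) -> float:
    """Fraction of frames in which all required hydrogen bonds exist at once.

    hbond_present maps a (hypothetical) bond name to one True/False value
    per simulation frame; required lists the bonds that must coexist.
    """
    n_frames = len(hbond_present[required[0]])
    if any(len(hbond_present[name]) != n_frames for name in required):
        raise ValueError("all hydrogen-bond traces must cover the same frames")
    if n_frames == 0:
        return 0.0
    # A frame counts only if every required bond is present in that frame.
    reactive = sum(
        all(hbond_present[name][i] for name in required)
        for i in range(n_frames)
    )
    return reactive / n_frames


# Hypothetical per-frame occupancy for two bonds over eight frames.
traces = {
    "hbond_A": [True, True, False, True, True, False, True, True],
    "hbond_B": [True, False, False, True, True, True, False, True],
}
fraction = fraction_reactive_frames(traces, ["hbond_A", "hbond_B"])
print(f"Fraction of frames with both bonds present: {fraction:.2f}")
```

In this made-up example each bond is present in more than half of the frames, yet both are present together in only half of them, so looking at each bond in isolation would overstate how often the arrangement needed for reaction is reached.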
This book focuses on the understanding gained over the past
10–15 years. In the 1980s the focus was on determining 3D
structures and on understanding and analyzing proteins. In the
1990s the focus was on diversity methods and screening methods,
whereas in the 2000s it has been on bioinformatics, simulation, and
statistical methods, as well as ultrahigh-throughput methods, with
revised views on proteins. Today we hope that the analysis of large
data sets will help find the desired results. Many new technologies
have brought new insights into enzyme function, with emphasis on
single-molecule behavior, molecular mobility, and electrostatics,
as well as on enzymes working on large and complex substrates. The
impact of mobility on substrate interaction is covered in
Chapter 2.
The book is divided into four major sections: enzyme function
(Chapters 1–8), enzyme design (Chapters 9–15), enzyme diversity
(Chapters 16–20), and enzyme screening and analysis (Chapters
21–25). The enzyme function part addresses enzyme kinetics on
simple substrates in Chapter 1 and the more complex interactions
with larger substrates in Chapters 4, 7, and 8. Structural aspects
are addressed in Chapter 6, NMR structures in Chapter 5, and
dynamic aspects in Chapters 2 and 3. The enzyme design part covers
sequence-derived design methods in Chapters 9, 10, and 11 (as well
as in Chapter 20) and 3D structural methods; 3D structural design
and understanding are mainly discussed in Chapters 12–15. The
design area is also partly covered under enzyme diversity,
especially in Chapter 16, which reviews both diversity methods and
some design ideas. The enzyme diversity part also covers
metagenomics, circular permutations, and ancestral reconstruction
in Chapters 18, 19, and 20, respectively, as well as the numbers
problem in directed evolution in Chapter 17. The enzyme screening
and analysis part includes both in silico screening in Chapters 23
and 24 and wet chemistry screening methods in Chapters 21 and 22,
as well as an example of the analysis of enzyme variants in
Chapter 25.
Computer simulations give great insight into the function of
enzymes and can help in designing new functionalities and
activities. Their predictive power is still limited, but
simulations can be used to screen for potential variants of
interest, which then need to be tested for the desired function.
Decades ago, a single predicted variant would be selected for
testing; today it is commonly understood that a number of top
candidates, say the top 10 or 100, could be of interest. The speed
of today's computers allows for this kind of suggestion, and
sometimes a reasonable simplification is also used to make the
screening possible. Chapters 23 and 24 address these possibilities,
and Chapter 16 also touches upon in silico design.
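The shortlist idea described above can be sketched in a few lines of Python. This is a minimal illustration, not a method from Chapters 23 or 24: the variant names, the scores, and the function `top_candidates` are hypothetical, and a higher score is simply assumed to mean a more promising prediction. Instead of trusting the single best prediction, the top n variants are kept for experimental testing.

```python
from typing import Dict, List, Tuple


def top_candidates(scores: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Return the n variants with the highest predicted score, best first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Sort by score in descending order; the names only label the variants.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical predicted scores from an in silico screen.
predicted = {
    "variant_01": 0.42,
    "variant_02": 0.91,
    "variant_03": 0.15,
    "variant_04": 0.77,
    "variant_05": 0.63,
}
shortlist = top_candidates(predicted, 3)
for name, score in shortlist:
    print(name, score)
```

Because the predictions are imprecise, the choice of n is a trade-off between experimental effort and the risk of discarding a good variant that was ranked just below the cutoff.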
More than a decade has passed since enzyme promiscuity became a
major field of interest. The versatility of enzymes and their
activities is more widely recognized today than ever, and the
general EC classification system is seldom fully explanatory. A few
chapters touch upon promiscuity, not from the standpoint of
specificity but from that of reaction mechanism; see Chapters 15
and 23.
Other wet chemistry screening methods are being developed, and now
that screening has passed its first decade in protein engineering,
its limitations are becoming more visible and its possibilities
better utilized. A few chapters address the methodologies
(Chapters 16, 17, 21, and 22). Micro- and nanotechnology have
entered the screening area, and screening very high numbers of
variants has become a reality. Smart techniques to secure the
picking of hits are important, and an interesting method is
mentioned in Chapter 22.
In an earlier book I edited, Enzyme Engineering: Function, Design,
Variant Generation and Screening, the focus was more on the variant
generation and screening part and less on the function and design
part. In this book the main focus is on enzyme function and
design and less on variant generation and screening methods. This
reflects the fact that many new insights into the more complex
aspects of enzyme function have emerged in recent years. Massive
quantities of information on enzyme variants and on the multiple
states of their structures, as well as single-molecule insights,
have added to the collective understanding of enzyme function.
The generation of many mutations has yielded not only a lot of data
but also the realization of how little we still understand about
enzyme function. This is therefore emphasized in the first eight
chapters, with examples of the variety of factors influencing
enzyme activity and enzyme–substrate interaction. Around 20 years
ago, the understanding of enzymes was based mainly on simple
kinetics and interactions with soluble substrates. In industry, we
are aware that enzymes often function under conditions other than
those of simple substrate–enzyme interaction theory, which is very
well described by mathematical equations. Chapter 3 (on
single-enzyme function) and Chapter 2 (on enzyme motions) emphasize
the rather complicated behavior of enzymatic function, which
continues to open new depths of understanding. Examples of such
complicated behavior are presented in Chapter 4 on surface-active
enzymes and Chapter 7 on the carbohydrate-hydrolyzing enzyme
family.
While the chapters of this book, which represent important
directions in research on enzyme function, design, engineering, and
analysis, were being written, new findings were published,
including enzymes' use of the energy released by the catalyzed
chemical reaction itself, which adds to the chapters on enzyme
mobility. The importance of electrostatics and its impact on enzyme
function is not addressed in a dedicated chapter, but it is clearly
a major part of some of the chapters and has been established as an
important factor in enzyme function and catalysis. Clearly, the
factors mentioned in the chapters and above will need to be
combined further in the future to understand the full functional
space of enzymes and thus how to approach improvements by protein
engineering.
February 19, 2016 18:56 PSP Book - 9in x 6in 01-Allan-Svendsen-c01

PART I

ENZYME FUNCTION


Chapter 1

A Short Practical Guide to the Quantitative Analysis of Engineered
Enzymes

Christopher D. Bayer and Florian Hollfelder


Department of Biochemistry, University of Cambridge, 80 Tennis Court Road,
Cambridge CB2 1GA, UK
fh111@cam.ac.uk

1.1 Introduction

Quantitative analysis of any protein engineering effort is necessary
to find out to what extent the properties of a catalyst have been
altered and to assess whether the engineering objectives were
successfully met. A quantitative framework is also a prerequisite
for understanding in which way a catalyst’s properties have been
modified.
This chapter summarizes the key elements of a straightforward
standard analysis and provides a working knowledge of the
parameters used to characterize enzyme reactivity. Useful textbooks
are available [1–4] and should be consulted. The textbooks make
many finer points that go beyond the scope of this chapter and lead
up to the elucidation of enzyme mechanisms. Enzyme mechanisms
should never be considered proven; instead they are usually a work
in progress in which a mark of success is a plausible scenario that is
consistent with the available evidence. Every scenario should then
be revised when additional experiments come to the fore.

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

1.2 Quantifying Reaction Progress

Kinetics invariably start with the measurement of a time course
of product formation. This requires a way to detect the product,
either continuously (e.g., by monitoring the emergence of a
spectroscopically active product) or discontinuously, by withdrawing
aliquots that are quenched and analyzed separately to derive the
product concentration. Continuous monitoring is preferred. If the
product concentration cannot be measured, the disappearance of the
substrate can be monitored instead.
When the product concentration (P) is plotted against time, a
progress curve emerges (Fig. 1.1). The linear initial portion of this
curve is then fitted to give an initial rate v.

• This initial rate v is ideally measured over a very short
timescale so that the substrate concentration does not
change appreciably over the data fitted to a linear equation.
Most enzymatic time courses will eventually deviate from
linearity, but a linear fit to the very early, quasi-linear part
will yield a sufficiently accurate approximation to the
initial rate v.
• It is necessary to make sure that there is no systematic
deviation from linearity for the time points that are used to
derive the initial rate v. Measuring very early and for a short
time interval means that the concentration of the product
is small enough that the dangers of, for example, product
inhibition, reverse reaction, or other processes leading to
enzyme inactivation are minimized.
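
The two points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the time course is made up, and the conversion cutoff `max_conversion` is a hypothetical parameter (not taken from this chapter) that limits the fit to points where little substrate has been consumed. The residuals are returned so that a systematic trend, which would indicate curvature, can be spotted.

```python
def linear_fit(x, y):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def initial_rate(times, product, substrate_0, max_conversion=0.1):
    """Fit only the early points, where product <= max_conversion * S0.

    max_conversion is a hypothetical cutoff chosen for illustration.
    Returns the slope (initial rate v) and the residuals of the fit.
    """
    early = [(t, p) for t, p in zip(times, product)
             if p <= max_conversion * substrate_0]
    if len(early) < 3:
        raise ValueError("need at least three early time points")
    t, p = zip(*early)
    slope, intercept = linear_fit(t, p)
    residuals = [pi - (slope * ti + intercept) for ti, pi in early]
    return slope, residuals


# Made-up time course (s, µM): linear at 2 µM/s early on, then levelling off.
times = [0, 5, 10, 15, 20, 60, 120]
product = [0.0, 10.0, 20.0, 30.0, 40.0, 85.0, 98.0]
v, residuals = initial_rate(times, product, substrate_0=100.0, max_conversion=0.4)
```

Residuals that are all of one sign at the ends and the opposite sign in the middle suggest that the chosen window already includes curved data and should be shortened.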


Figure 1.1 Generic time course of an enzymatic reaction. An initially linear
product increase shows deviation from linearity at later time points. The
initial rate v is obtained by measuring the slope of the linear part of the
curve.

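If the appearance of the product follows an exponential time course,
the observed rate constant kobs can also be derived from a fit to the
first-order equation

$$P = P_\infty \left(1 - e^{-k_{\mathrm{obs}} t}\right)$$

where P∞ is the product concentration at the end of the reaction.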
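
As a minimal sketch of such an exponential analysis, the code below assumes first-order product formation, P = P∞(1 − exp(−kobs·t)), with a known end point P∞ (both are assumptions of this example). Under that assumption ln(1 − P/P∞) = −kobs·t, so kobs is the negative slope of a line through the origin. The data are synthetic; with noisy data a direct nonlinear fit is generally preferable.

```python
import math


def k_obs_from_progress(times, product, p_inf):
    """Estimate k_obs from ln(1 - P/P_inf) = -k_obs * t.

    Uses a least-squares line forced through the origin and skips
    points at or beyond the end point, where the logarithm is undefined.
    """
    points = [(t, math.log(1.0 - p / p_inf))
              for t, p in zip(times, product) if p < p_inf]
    numerator = sum(t * y for t, y in points)
    denominator = sum(t * t for t, _ in points)
    return -numerator / denominator


# Synthetic first-order time course generated with k = 0.05 s^-1.
k_true, p_inf = 0.05, 100.0
times = [0, 10, 20, 40, 80]
product = [p_inf * (1 - math.exp(-k_true * t)) for t in times]
k_obs = k_obs_from_progress(times, product, p_inf)
```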

1.3 Typical Saturation Plots Give Michaelis–Menten Parameters

Figure 1.2 shows the characteristic effect of increasing substrate
concentration on the initial rate of an enzyme-catalyzed reaction at
constant enzyme concentration (E0). At higher substrate
concentrations, the catalyst will eventually become saturated, so
nonlinear curves (as shown in Fig. 1.2) are typically observed.
Saturation kinetics are seen as a hallmark of enzyme catalysis and can
be straightforwardly rationalized in terms of catalysis taking place in
the active site of an enzyme: at a certain substrate concentration, at
a given time, the active sites of all enzyme molecules in the solution
are bound to the substrate and operate at their maximum (first-order)
rate.

Figure 1.2 A Michaelis–Menten curve for an enzyme-catalyzed reaction.
The rate v of the catalyzed reaction reaches a maximum (Vmax) at high,
saturating concentrations of substrate S. (Inset) The Michaelis–Menten
equation [4].

Fitting the curve to the basic Michaelis–Menten equation (inset in
Fig. 1.2) gives the parameters kcat and KM, which define the kinetic
behavior of a particular enzyme toward a particular substrate. Even
without fitting, the Michaelis constant KM can be read off in a useful
visual check as the substrate concentration at which the rate is half
the maximum value Vmax reached at high, saturating substrate
concentrations. kcat is obtained from this limiting rate as Vmax/E0,
where E0 is the concentration of the enzyme used. The initial rate
at low substrate concentration (when S approaches 0 and S ≪ KM)
is linearly correlated with S, with a slope Vmax/KM. The rates of
reactions catalyzed by enzymes are generally measured under
steady-state conditions, that is, after the enzyme, substrate, and
enzyme–substrate complex (the Michaelis complex E·S [see Scheme
1.1]) have already reached thermodynamic equilibrium, and E·S is
present at a constant steady-state concentration.


Historically a large number of linearization methods have been
used, but nonlinear curve fitting is now a straightforward operation
(e.g., using readily available software packages such as KaleidaGraph
or Origin).
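
For readers without access to such packages, a nonlinear least-squares fit of the Michaelis–Menten equation can be sketched with the Python standard library alone. This is an illustrative approach chosen for this example, not the method of any particular package: for a trial KM the best Vmax has a closed form, so only KM is searched numerically (golden-section search on log10 KM). The search bounds, the data, and the enzyme concentration are made-up assumptions.

```python
import math


def fit_michaelis_menten(s, v, km_low=1e-3, km_high=1e3, iterations=100):
    """Least-squares fit of v = Vmax * S / (KM + S).

    Returns (Vmax, KM, sum of squared residuals). km_low and km_high
    bound the search and must bracket the true KM (same units as S).
    """
    def vmax_and_ssr(km):
        x = [si / (km + si) for si in s]
        vmax = sum(vi * xi for vi, xi in zip(v, x)) / sum(xi * xi for xi in x)
        ssr = sum((vi - vmax * xi) ** 2 for vi, xi in zip(v, x))
        return vmax, ssr

    golden = (math.sqrt(5) - 1) / 2
    a, b = math.log10(km_low), math.log10(km_high)
    for _ in range(iterations):
        c = b - golden * (b - a)
        d = a + golden * (b - a)
        if vmax_and_ssr(10 ** c)[1] < vmax_and_ssr(10 ** d)[1]:
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    km = 10 ** ((a + b) / 2)
    vmax, ssr = vmax_and_ssr(km)
    return vmax, km, ssr


# Synthetic initial rates generated with Vmax = 10 µM/s and KM = 2 µM.
e0 = 0.01                                   # µM enzyme (made up)
s = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]        # µM substrate
v = [10.0 * si / (2.0 + si) for si in s]    # µM/s
vmax_fit, km_fit, ssr = fit_michaelis_menten(s, v)
kcat_fit = vmax_fit / e0                    # s^-1
```

Note that the substrate range in this example extends to 10 × KM; with real data the quality of the fit depends strongly on how well the saturation curve is covered.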
This treatment can be (and is) carried out for all sorts of
enzymes, but the Michaelis–Menten parameters only have a distinct
meaning for a simple one-step reaction such as

$$E + S \underset{k_{\mathrm{off}}}{\overset{k_{\mathrm{on}}}{\rightleftharpoons}} E \cdot S \xrightarrow{k_{\mathrm{cat}}} E + P$$

Scheme 1.1

In general, the observed initial rate for this most simple model
of an enzymatic reaction is described by the Michaelis–Menten
equation:

$$v = \frac{k_{\mathrm{cat}} E_0 S}{K_M + S}$$

Here the Michaelis–Menten parameters can be interpreted as
follows:
• The catalytic constant kcat (also referred to as the turnover
number) is a measure of catalytic efficiency and has units
of s⁻¹. It describes the number of molecules of substrate
converted per second per active site at saturating substrate
concentrations. The rate constant kcat also reports on
the free-energy difference between the E·S complex and
the transition state for the enzyme-catalyzed reaction (the
smaller the energy difference, the larger the value of kcat).
Thus kcat is a first-order rate constant for the reaction of the
bound substrate. kcat is related to Vmax as follows:

$$V_{\max} = k_{\mathrm{cat}} E_0$$

It can be compared directly with other first-order
reactions, for example, intramolecular reactions. It is easy to
interpret: bigger is better.
• The Michaelis constant KM is a measure of binding,
generally of all enzyme-bound species (substrate(s) and
intermediate(s)). It may be a substrate binding constant
in the simplest case, when the binding step is a rapid
pre-equilibrium (i.e., koff ≫ kcat). KM then becomes equal to
the dissociation constant of the Michaelis complex E·S.
Expressed as a formula, we see that, if koff ≫ kcat holds, KM
can be expressed as koff/kon, which is how Kd is defined:

$$K_M = \frac{k_{\mathrm{off}} + k_{\mathrm{cat}}}{k_{\mathrm{on}}} \approx \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}} = K_d$$

In cases where the above approximations hold, KM is easy
to interpret: smaller values correspond to tighter binding.
• kcat/KM is the apparent second-order rate constant
(M⁻¹·s⁻¹) at very low substrate concentrations (S ≪ KM),
in which case the Michaelis–Menten equation simplifies to
the expression

$$v = \left(\frac{k_{\mathrm{cat}}}{K_M}\right) E_0 S$$

kcat/KM is a useful measure of the overall efficiency of the
catalyst, including the binding step. It reports on the
free-energy difference (Fig. 1.3) between the free substrate and
enzyme and the transition state (the smaller the energy
difference is, the larger is kcat/KM). It can be compared
with other second-order rate constants. It is therefore a
particularly useful quantity to use in comparing mutant
enzymes with the wild-type enzyme (see later) or assessing
the ability of two alternative substrates to compete with
each other for the same enzyme.
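
The three interpretations above follow from the standard textbook steady-state treatment of the one-step mechanism in which E and S bind reversibly (kon, koff) and E·S reacts with kcat. The derivation is sketched here for orientation, assuming that S is in large excess over E0 so that the free and total substrate concentrations are practically equal:

```latex
\begin{aligned}
\frac{d[E \cdot S]}{dt} &= k_{\mathrm{on}} [E][S] - (k_{\mathrm{off}} + k_{\mathrm{cat}}) [E \cdot S] = 0,
  \qquad [E] = E_0 - [E \cdot S] \\[4pt]
[E \cdot S] &= \frac{E_0 S}{K_M + S},
  \qquad K_M = \frac{k_{\mathrm{off}} + k_{\mathrm{cat}}}{k_{\mathrm{on}}} \\[4pt]
v &= k_{\mathrm{cat}} [E \cdot S] = \frac{k_{\mathrm{cat}} E_0 S}{K_M + S} \\[4pt]
S \ll K_M &: \quad v \approx \frac{k_{\mathrm{cat}}}{K_M} E_0 S \\
S = K_M &: \quad v = \tfrac{1}{2} V_{\max} \\
S \gg K_M &: \quad v \approx k_{\mathrm{cat}} E_0 = V_{\max}
\end{aligned}
```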

Figure 1.3 Simplified thermodynamic schemes for comparison of an
idealized enzymatic one-step process with the corresponding background
reaction. (a) Thermodynamic box illustrating ground-state (KS) and
transition-state (KTS) binding. The bottom line represents the reaction in
the enzyme active site and the upper line the uncatalyzed or background
reaction, with the enzyme present but not involved. KTS is formally the
dissociation constant of the transition state from the enzyme. (b) The
free-energy scheme shows the relationships between the kinetic parameters
and the corresponding free-energy differences (ΔG), which can be related to
the Michaelis–Menten parameters by kcat/kuncat = K‡cat/K‡uncat = KS/KTS.
Reproduced from Ref. [4] with permission of the Royal Society of Chemistry.

1.4 What Can Go Wrong?

Steady-state parameters are generally straightforwardly obtained,
but the following points summarize a few checks designed to probe
whether this has been done responsibly:
• Are the assumptions in the Michaelis–Menten model valid?
The steady-state assumption is only valid when E < S,
but sometimes this scenario cannot be experimentally
implemented (e.g., when the enzyme is a bad catalyst and
higher concentrations are needed to measure a reliable rate
and/or when the KM toward a substrate is particularly
low).
• Were the initial rates v obtained from linear time courses?
Not infrequently, Michaelis–Menten plots in the literature
are based on measurement of just one experimental time
point (instead of a time course). Together with the zero
point, one can draw a line, but this construct does not
reflect a genuine time course. If the increase in product
concentration is still linear at the point in time when
this measurement was taken, then this treatment can
exceptionally turn out to be right. But in most cases the
time frame in which the increase in product concentration
is linear must be experimentally determined; this means
more than one time point for product formation has to be
determined.
• Are there sufficient data points to fit a curve to the nonlinear
Michaelis–Menten equation?
If a Michaelis–Menten curve cannot be fully described,
for example, because the substrate is insoluble at concen-
trations above KM , a large uncertainty exists about the
shape of the nonlinear curve. Any curve fit, be it obtained
computationally or by a linearization method (e.g., a
historical Lineweaver–Burk plot), will be meaningless. A
curve-fitting program may indicate a seemingly reasonable
fit, but that is misleading. The program does not assess the
data critically, so the experimenter has to make a number of
checks:

(i) There should be initial rate data for substrate
concentrations up to at least 5 × KM (ideally 10 × KM) so
that the saturation profile is overall well covered. If
saturation is not reached, kcat will be badly off.
(ii) There should ideally be a minimum of three initial rate
data points per KM. This should be the case at least for
the substrate concentration range between S = 0 and
3 × KM so that the nonlinear portion of the saturation
profile is well described. If too few data points are
available for the nonlinear part of the saturation curve,
the fit is not well described and the accuracy of the KM
determined in this way will be poor.

A simple test to probe whether a sufficient number of points has
been obtained is to remove one or more points from the Michaelis–
Menten plot and see whether the fit still makes sense (i.e., whether
it still describes the remaining data points as well as when all
points are considered). If not, it means that you have too few data
points. If the curve fit is sensitive to the inclusion of one or very few
points, clearly more data are needed to avoid being at the mercy of
one or a few measurements that may have an experimental error.
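
The point-removal test can be automated as a leave-one-out refit. The sketch below is an illustration only: the fitting routine is a compact standard-library implementation written for this example (closed-form Vmax for each trial KM, golden-section search on log10 KM), and the data are made up with a little scatter. What counts as an acceptable spread in KM is left to the experimenter.

```python
import math


def fit_mm(s, v, km_low=1e-3, km_high=1e3, iterations=100):
    """Least-squares Michaelis-Menten fit; returns (Vmax, KM)."""
    def vmax_and_ssr(km):
        x = [si / (km + si) for si in s]
        vmax = sum(vi * xi for vi, xi in zip(v, x)) / sum(xi * xi for xi in x)
        return vmax, sum((vi - vmax * xi) ** 2 for vi, xi in zip(v, x))

    golden = (math.sqrt(5) - 1) / 2
    a, b = math.log10(km_low), math.log10(km_high)
    for _ in range(iterations):
        c, d = b - golden * (b - a), a + golden * (b - a)
        if vmax_and_ssr(10 ** c)[1] < vmax_and_ssr(10 ** d)[1]:
            b = d
        else:
            a = c
    km = 10 ** ((a + b) / 2)
    return vmax_and_ssr(km)[0], km


def leave_one_out(s, v):
    """Refit the curve once for each data point left out."""
    refits = []
    for i in range(len(s)):
        refits.append(fit_mm(s[:i] + s[i + 1:], v[:i] + v[i + 1:]))
    return refits


# Made-up initial rates (arbitrary units) with a little scatter.
s = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
v = [2.1, 3.2, 5.1, 6.5, 8.1, 8.8, 9.5]
vmax_all, km_all = fit_mm(s, v)
refits = leave_one_out(s, v)
km_values = [km for _, km in refits]
# Relative spread of KM across the refits; a large value means the fit
# hinges on individual points and more data are needed.
km_spread = (max(km_values) - min(km_values)) / km_all
```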
In cases where it is experimentally impossible to obtain a full,
hyperbolic Michaelis–Menten curve based on a sufficient number of
initial rates, fits (and the Michaelis–Menten parameters) should not
be reported without a disclaimer, as an error that goes beyond the
standard deviation provided by the curve-fitting program cannot be
excluded.
However, even in cases where no saturation can be achieved, it
is always possible to fit the linear portion of the initial rate data
at substrate concentrations well below KM (ideally KM > 10 × S) to
a linear equation. As described earlier, under such conditions the
initial rate is proportional to the substrate concentration, with the
slope corresponding to (kcat/KM) × E0, that is, Vmax/KM. Values of
kcat/KM can thus be determined accurately and safely used to compare
enzymes and their mutants, for example, when assessing substrate
specificity.
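
A minimal sketch of this low-substrate analysis follows. It assumes that the plot of v against S is fitted with a line through the origin and that the slope equals (kcat/KM) × E0, as for the limiting form of the Michaelis–Menten equation. All numbers are made up; the rates are generated from the full equation so that the small underestimate caused by residual curvature is visible.

```python
def kcat_over_km(substrate, rates, e0):
    """Slope of v against S (line through the origin) divided by E0."""
    slope = (sum(s * v for s, v in zip(substrate, rates))
             / sum(s * s for s in substrate))
    return slope / e0


# Hypothetical enzyme: kcat = 100 s^-1, KM = 1 mM, so kcat/KM = 1e5 M^-1 s^-1.
kcat, km, e0 = 100.0, 1e-3, 1e-8          # s^-1, M, M
substrate = [1e-6, 2e-6, 4e-6, 8e-6]      # M, all below KM/100
rates = [kcat * e0 * s / (km + s) for s in substrate]   # M/s
estimate = kcat_over_km(substrate, rates, e0)           # M^-1 s^-1
```

In this example the estimate falls slightly (less than 1%) below the true value because even at S = KM/125 the curve is not perfectly linear.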
When Michaelis–Menten parameters that have been experimen-
tally measured are compared to literature values for the enzyme
studied previously, differences will almost invariably emerge. Here
are sources for these deviations:

• Differences in kcat
Since kcat is obtained using the equation Vmax/E0 = kcat,
this value may carry an error if the enzyme concentration
is determined inaccurately. In addition to poor estimates of
enzyme concentration the presence of inactive enzymes (e.g., as
a result of partial denaturation during purification or storage)
would skew the kcat value.
• Differences in KM
KM values may appear to be artificially increased if competitive
inhibitors are present, for example, when KM is determined
with an impure enzyme preparation. This may happen in the
case that an inhibitor copurifies or when buffer components are
acting as inhibitors.
Usually KM should be the parameter that does not vary
between enzyme preparations. This insight proved useful in
the case of a retracted publication: the inability
to reproduce the kinetic data for a computationally generated
enzyme variant [5] had been ascribed to the presence of an
Escherichia coli enzyme performing the same task, and the
paper was retracted [6]. However, Kirsch [7] and Richard [8]
questioned this explanation because the reported KM was not
identical to that of the E. coli enzyme. Therefore the result in
question must have been artifactual, suggesting that some sort of foul
play (and not contamination with an E. coli enzyme) had led to
the irreproducible result.

1.5 Dealing with Multiphasic and Pre-Steady-State Kinetics

The reaction sequences and mechanisms of most enzymes are not
well described by the simplified one-step reaction with a fast
pre-equilibrium binding step (Scheme 1.1): in such cases, the
Michaelis–Menten parameters will not have a simple meaning. Instead the
values for kcat and KM correspond to composites of multiple
microscopic rate constants, defying straightforward molecular
interpretations. Steady-state kinetics at best allow access to the
slowest step of a more complex multistep reaction (see Fig. 1.4A
for an example of a mechanism with two products formed in two
sequential, irreversible steps). For each enzyme the meaning of the
Michaelis–Menten parameters has to be established, and often no
unambiguous interpretation is possible. In this case a discussion
of the differences from the classical Michaelis–Menten model should
be included in any publication. Even if no straightforward
interpretation is possible, the parameters derived from Michaelis–Menten
curves are often the only quantitative descriptors of an enzyme and
are therefore widely used.

Figure 1.4 Multistep reactions, such as the case shown in (A), can give
rise to burst and lag kinetics. Here, the time courses for formation of
two products P and Q will show burst and lag kinetics (B) if the second
irreversible step described by k3 is rate limiting. A burst occurs if the
detected product is released before a rate-limiting step; formation of a
product released during the rate-limiting step will show lag kinetics (see
Ref. [9] for an excellent discussion of the mathematical solution of the
rate equations). After an initial transient pre-steady-state phase, a linear
steady-state regime is reached. The linear phase corresponds to the data
used for determination of initial rates v (to be used for determination of
Michaelis–Menten parameters). Under ideal conditions, the transient allows
extraction of information on microscopic rate constants. This also holds
for the x and y intercepts of the extrapolated linear slopes (the y intercept
being the amplitude of the burst); see dashed lines in (B). The burst
amplitude π is determined at multiple enzyme concentrations and should be
proportional to the enzyme concentration or at maximum the same as the
enzyme concentration (if k3 ≪ k2). (C) Example data for hydrolysis of a
phosphonate monoester by the enzyme RlPMH. Reprinted from Ref. [15],
Copyright (2008), with permission from Elsevier. Burst amplitudes exceeding
the enzyme concentration indicate that the observed burst is not a kinetic
burst but an artifact, possibly due to enzyme inactivation or a slow
conformational change.

Experimental insight into the sequence of elementary steps of an
enzyme reaction comes from observation of reaction progress in the
very first seconds of turnover, before the system has equilibrated to
the steady state. Special equipment like a stopped- or quenched-flow
apparatus is required in most cases in order to reveal such transient
pre-steady-state time traces (one example of which is shown in
Fig. 1.4C).
Transient phases of enzymatic time courses may show burst or
lag kinetics (Fig. 1.4B) that point to more complex mechanisms,
and usually several lines of evidence have to be followed up
to construct a kinetic scheme that contains all microscopic rate
constants. However, this effort can be started with an analysis of the
characteristics of the transient phase: a burst is consistent with a
two-step reaction, as shown in Fig. 1.4A, and the detected product
(P) is released in a step before the rate-limiting step (see Fig.
1.4B). Initially, the fast second step can produce the product rapidly,
but the reaction slows down and reaches the steady state as the
concentration of free enzyme drops and the regeneration of free
enzyme (last step, k3 in Fig. 1.4A) becomes rate limiting. Observation
of a lag indicates that the detected product (Q) is released in
the rate-limiting step described by k3 . The maximum steady-state
rate of formation of Q is only reached once the intermediate EQ
has accumulated to its maximum steady-state concentration. For a
useful in-depth treatment of the rate equations that hold in this case
and their derivation, refer to Refs. [2, 9].
It is advisable to test if an observed burst or lag is a genuinely
kinetic phenomenon, as an apparent transient may be caused by
enzyme inactivation or a slow conformational change, in particular
if it occurs on longer timescales of >10 s (or even minutes). For a
burst, this can be done by plotting the burst amplitude π against
the enzyme concentration (Fig. 1.4C). The burst amplitude π for the
formation of P in the reaction shown in Fig. 1.4A is given by the
expression

$$\pi = \frac{E_0}{1 + (k_3/k_2)}$$


This means that for a true kinetic burst, its amplitude is proportional
to the enzyme concentration and equal to it if k3 ≪ k2. In
other cases, the burst is smaller than the stoichiometric amount of
enzyme, for example, when competition between rates of product
formation and product release exists: the faster the product is
released, the faster is the approach to the steady state, and the burst
amplitude is decreased. A burst amplitude that is larger than the
enzyme concentration can be seen, for example, when the burst
generates product concentrations above KP so that the following
linear phase underestimates the real rate and leads to a larger
apparent burst amplitude.
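
The check of burst amplitude against enzyme concentration can be sketched as follows. The example is an illustration under assumptions that are not taken from this chapter: the traces are synthetic and follow the commonly used biphasic form P(t) = π(1 − exp(−kb·t)) + vss·t, the amplitude is read off as the y intercept of a line fitted to the steady-state phase, and the start of that phase (`steady_from`) is chosen by eye.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def burst_amplitude(times, product, steady_from):
    """Extrapolate the steady-state phase (t >= steady_from) to t = 0.

    Returns (y intercept, steady-state slope); the intercept is the
    apparent burst amplitude.
    """
    late = [(t, p) for t, p in zip(times, product) if t >= steady_from]
    t, p = zip(*late)
    slope, intercept = linear_fit(t, p)
    return intercept, slope


def synthetic_trace(e0, times, fraction=0.8, k_burst=5.0, turnover=0.05):
    """Made-up burst trace: amplitude = fraction * E0, v_ss = turnover * E0."""
    amplitude = fraction * e0
    return [amplitude * (1 - math.exp(-k_burst * t)) + turnover * e0 * t
            for t in times]


times = [0.5 * i for i in range(21)]      # 0 to 10 s
enzyme_concs = [1.0, 2.0, 4.0]            # µM
amplitudes = [burst_amplitude(times, synthetic_trace(e0, times), 4.0)[0]
              for e0 in enzyme_concs]
ratios = [a / e0 for a, e0 in zip(amplitudes, enzyme_concs)]
```

For a genuine kinetic burst the ratios should be constant across enzyme concentrations and should not exceed 1; ratios above 1 point to an artifact of the kind discussed above.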
Even if no burst or lag is observed, careful analysis is required
to rule out the presence of a more complex mechanism. A burst/lag
could be too fast to be observed even if stopped-flow equipment is
used, as the typical dead time of the instruments is in the range of
1 ms. In this case examination of the y intercepts of the extrapolated
steady-state slopes (as shown in Fig. 1.4B) can be used to infer that
a pre-steady-state burst has occurred [10]: if the steady-state slopes
at different enzyme concentrations do not extrapolate back to the
origin (and instead give a y axis intercept > 0), the burst was too
fast to be observed. Only if neither a burst nor a y axis intercept
is observed is it possible to assume that the first chemical step
described by k2 in the mechanism shown in Fig. 1.4A is the
rate-limiting step (k2 ≪ k3). As the general expression for the
steady-state rate constant kcat for this mechanism is
kcat = k2k3/(k2 + k3), in this case kcat reflects the first step, that
is, k2. A useful discussion of these relationships for the example of
glycosidases can be found in Ref. [11].
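
The two limiting cases of the expression for kcat in the two-step mechanism of Fig. 1.4A can be written out explicitly; this is simple algebra on the formula quoted above, not an additional result:

```latex
k_{\mathrm{cat}} = \frac{k_2 k_3}{k_2 + k_3}
\quad\Longrightarrow\quad
\begin{cases}
k_{\mathrm{cat}} \approx k_2, & k_2 \ll k_3 \quad \text{(first chemical step rate limiting; no burst)} \\[4pt]
k_{\mathrm{cat}} \approx k_3, & k_3 \ll k_2 \quad \text{(second step rate limiting; burst of } P\text{)}
\end{cases}
```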
While kcat and KM can be made up of complex terms, their
ratio kcat /KM is immediately useful because more complex terms
contributing to kcat and KM often cancel out. kcat /KM measures the
energy barrier from free enzyme and substrate to the transition
state of the first irreversible step of the enzymatic reaction. There-
fore specificity comparisons based on kcat /KM can be made even
in case of a multistep mechanism (e.g., when reaction sequences
share the same irreversible step as in specificity comparisons of
promiscuous hydrolases [12]).

1.6 Evaluating Enzymes

To assess the proficiency of enzymes, a comparison of the
uncatalyzed and the enzymatic rate of a reaction is important. While
the rates of spontaneous reactions range from 10⁻²⁰ to 10⁻¹ s⁻¹, the
rates (kcat) of the same reactions catalyzed by enzymes fall within a
range of 10 to 10⁶ s⁻¹ (and the great majority between 10 and 10³ s⁻¹)
[13, 14]. Unless we appreciate whether an enzyme has overcome a
more or less difficult free-energy barrier, we do not know whether
its catalytic power is more or less remarkable.
On one level enzymes can be characterized by rate constants:
kcat is the first-order rate constant referring to the reaction of
the enzyme with a fully bound substrate, kcat /KM is the second-
order rate constant under subsaturating conditions, and KM can
be the substrate-binding constant if substrate binding is rapid and
reversible in a one-step reaction. These parameters are adequate to
characterize the practical and evolutionary utility of an engineered
enzyme function as they describe how quickly a product is formed.
Rate accelerations by contrast allow an evaluation of enzymes
as catalysts. Enzymes achieve enormous rate accelerations and
effectively bind transition states with high affinity. An evaluation of
transition-state stabilization for native and promiscuous reactions
benchmarks the proficiency of an enzyme and allows a quantitative
discussion of the effects that lead to catalysis and selective
recognition of the two (or more) alternative transition states.
Figure 1.3B shows a free-energy diagram and a reaction scheme
for the most basic one-step reaction in which the mechanisms of
catalyzed and uncatalyzed reactions are identical. Several useful
comparisons of rates can be made: for example, a comparison
between the rate of the enzyme reaction at saturating substrate
concentrations and the background reaction in water (kcat/kuncat)
gives the first-order rate enhancement. The determination of kuncat
requires careful attention: a particular kobs is simply a first-order
rate constant measured under a given set of reaction conditions.
Observed rates may vary with pH and may depend on other
components of the reaction mixture, such as the buffer (Fig. 1.5)
used to maintain a constant pH. In this case the ratio kcat/kobs
compares the enzyme reaction with the sum total of several
chemically distinct processes occurring in water.

Figure 1.5 Derivation of background rate constants kuncat. (A) The plot
contrasts two different reactions that are buffer catalyzed (dashed line) or
independent of buffer concentration (solid line). (B) The plot shows a pH
rate profile that can help to interpret the observed rates. The intercept at
zero buffer concentration, k0 in (A), represents the spontaneous reaction
with water, k0, case 1 in (B), or the sum of the hydroxide- and
water-catalyzed rates (k0 + kOH[OH⁻]), case 2 in (B). Reproduced from
Ref. [4] with permission of the Royal Society of Chemistry.

Figure 1.5A shows how the reactions in water can be more clearly
defined, and independent processes isolated, by extrapolating the
observed rate constants kobs back to zero buffer concentration. If
this extrapolation is made at a pH where kobs is pH independent
(Fig. 1.5B), then the y axis intercept (k0 in Fig. 1.5A) will be kH₂O,
representing the pH-independent reaction with H₂O. In cases where
this extrapolation yields an intercept k0 on the y axis that depends
on pH, there is an additional reaction catalyzed by OH⁻ (or H₃O⁺).
An extrapolated rate constant in this situation is not kH₂O but yet
another kobs, this time the sum of the hydroxide (or hydronium)
and water-catalyzed reactions. (kHO⁻ is a second-order rate constant
that can be determined by plotting these values for kobs against the
hydroxide ion concentration; see below.) kcat/kuncat indicates by how
much the transition-state energies of the enzyme reaction and the
competing uncatalyzed reactions (e.g., the sum of the reactions of
water, buffer, and hydroxide) differ. This comparison is practically
useful, as it represents a direct comparison to a rate in water, the
usual solvent for enzymatic reactions. It does not, however, always
reflect comparisons of the same functional groups. For example, if
the reaction in the enzyme active site involves another active site
nucleophile instead of water, a direct comparison with a reaction of
the appropriate nucleophile is required.
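
The extrapolation to zero buffer concentration, and the subsequent separation of the water and hydroxide contributions, amount to two linear fits. The sketch below uses made-up rate constants and assumes that kobs depends linearly on the buffer concentration at fixed pH and that the intercepts k0 depend linearly on the hydroxide concentration; both are assumptions of this illustration.

```python
def linear_fit(x, y):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Step 1: k_obs at one pH and several buffer concentrations (made up).
buffer_conc = [0.05, 0.10, 0.20, 0.40]          # M
k_obs = [2.5e-6, 3.0e-6, 4.0e-6, 6.0e-6]        # s^-1
k_buffer, k0 = linear_fit(buffer_conc, k_obs)   # slope in M^-1 s^-1, intercept in s^-1

# Step 2: intercepts k0 obtained at several pH values, plotted against [OH-].
hydroxide = [1e-6, 1e-5, 1e-4]                  # M
k0_values = [2.1e-6, 3.0e-6, 1.2e-5]            # s^-1 (made up)
k_oh, k_h2o = linear_fit(hydroxide, k0_values)  # slope = k_OH, intercept = k_H2O
```

A slope k_buffer close to zero corresponds to the buffer-independent case (solid line) in Fig. 1.5A.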
Such a comparison can be made by using the second-order rate
constant k2 . The comparison between the second-order rates of the
same reactive group in solution and in the enzyme active site is
given by (kcat /KM )/k2 . This second-order rate enhancement refers
to the lowering of the activation barrier by the enzyme compared
to the uncatalyzed reaction. It can represent a better comparison
of transition states, if in each state, the same functional group
(e.g., nucleophile) is used. In cases where water is the reagent (i.e.,
hydrolytic reactions), the concentration of water in water (55 M) can
be used to convert the first-order rate constant (kH2 O ) into a second-
order rate constant (kw ).
A third measure is catalytic proficiency ([kcat/KM]/kuncat) [14].
Depending on the difficulty of the chemical background reaction,
catalytic proficiencies ([kcat/KM]/kuncat) from 10⁸ (for the
thermodynamically undemanding CO₂ hydration by carbonic anhydrase) up
to 10²⁶ (for the more difficult phosphate monoester hydrolysis by
fructose-1,6-bisphosphatase) have been calculated [16].
Following the definition of catalysis as a selective recognition of
the transition state versus the ground state, apparent binding
constants can be used for evaluation of the transition-state binding.
KTS is obtained by dividing kuncat by kcat/KM and represents the upper
limit for the dissociation constant of the enzyme and the altered
substrate in the transition state. Values of KTS range down to 10⁻²⁶ M
and are formally much stronger than ordinary binding processes but
include the thermodynamic benefits of catalytic effects (including
those that cannot be classed as binding events, for example,
nucleophilic catalysis or solvation and medium effects on reactivity).
For comparison, KM serves as an indicator for ground-state binding.
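
The measures discussed in this section reduce to simple ratios, collected here in a short sketch. The kinetic constants are made up and do not describe any particular enzyme; the formal free energy RT·ln(KTS) (relative to a 1 M standard state) and the temperature of 298.15 K are assumptions of this example.

```python
import math

GAS_CONSTANT = 8.314  # J mol^-1 K^-1


def evaluate_enzyme(kcat, km, k_uncat, temperature=298.15):
    """Rate enhancement, catalytic proficiency, and K_TS.

    kcat in s^-1, km in M, k_uncat (first-order background rate) in s^-1.
    """
    rate_enhancement = kcat / k_uncat            # dimensionless
    proficiency = (kcat / km) / k_uncat          # M^-1
    k_ts = k_uncat / (kcat / km)                 # M
    # Formal free energy of transition-state binding (negative = favorable).
    dg_ts = GAS_CONSTANT * temperature * math.log(k_ts)   # J mol^-1
    return {
        "rate_enhancement": rate_enhancement,
        "proficiency": proficiency,
        "k_ts": k_ts,
        "dg_ts": dg_ts,
    }


# Hypothetical values: kcat = 100 s^-1, KM = 0.1 mM, k_uncat = 1e-10 s^-1.
result = evaluate_enzyme(kcat=100.0, km=1e-4, k_uncat=1e-10)

# Converting a first-order water rate into a second-order rate constant
# by dividing by the concentration of water in water (55 M).
k_w = 1e-10 / 55.0                               # M^-1 s^-1
second_order_enhancement = (100.0 / 1e-4) / k_w  # dimensionless
```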

Acknowledgments

FH is an ERC starting investigator. CDB was supported by a BBSRC
studentship and the Cambridge European Trust.


References

1. Cook, P., and Cleland, W. W. (2012). Enzyme Kinetics and Mechanism
(Garland Science, New York).
2. Cornish-Bowden, A. (2013). Fundamentals of Enzyme Kinetics, 4th ed.
(Wiley-Blackwell, Weinheim).
3. Fersht, A. R. (1999). Structure and Mechanism in Protein Science: Guide
to Enzyme Catalysis and Protein Folding, 3rd ed. (W. H. Freeman, New
York).
4. Kirby, A. J., and Hollfelder, F. (2009). From Enzyme Models to Model
Enzymes (Royal Society of Chemistry, Cambridge).
5. Dwyer, M. A., Looger, L. L., and Hellinga, H. W. (2004). Computational
design of a biologically active enzyme, Science, 304, pp. 1967–1971.
6. Dwyer, M. A., Looger, L. L., and Hellinga, H. W. (2008). Retraction, Science,
319, p. 569.
7. Kirsch, J. F. (2008). Comment on retraction in Ref. 6. Science, e-letter
response.
8. Richard, J. P. (2008). Comment on retraction in Ref. 6. Science, e-letter
response.
9. Bender, M. (1967). Alpha-chymotrypsin: enzyme concentration and
kinetics, J. Chem. Educ., 44, pp. 84–88.
10. van Loo, B., Berry, R., Boonyuen, U., Mohamed, M. F., Golicnik, M., Hengge,
A. C., and Hollfelder, F. (2015). Transition state interactions in enzyme-
catalyzed sulfate monoester hydrolysis by Pseudomonas aeruginosa
arylsulfatase, in preparation.
11. Sinnott, M. (2007). Chapter 3 in Carbohydrate Chemistry and Biochem-
istry: Structure and Mechanism (Royal Society of Chemistry, Cambridge),
pp. 305–312.
12. van Loo, B., Bayer, C. D., Fischer, G., Jonas, S., Valkov, E.,
Mohamed, M. F., Vorobieva, A., Dutruel, C., Hyvönen, M., and Hollfelder,
F. (2015). Balancing specificity and promiscuity in enzyme evolution:
multidimensional activity transitions in the alkaline phosphatase
superfamily, submitted.
13. Wolfenden, R., and Snider, M. J. (2001). The depth of chemical time and
the power of enzymes as catalysts, Acc. Chem. Res., 34, pp. 938–945.
14. Lad, C., Williams, N. H., and Wolfenden, R. (2003). The rate of
hydrolysis of phosphomonoester dianions and the exceptional catalytic
proficiencies of protein and inositol phosphatases, Proc. Natl. Acad. Sci.
U S A, 100, pp. 5607–5610.
15. Jonas, S., van Loo, B., Hyvönen, M., and Hollfelder, F. (2008). A new
member of the alkaline phosphatase superfamily with a formylglycine
nucleophile: structural and kinetic characterisation of a phosphonate
monoester hydrolase/phosphodiesterase from Rhizobium leguminosarum,
J. Mol. Biol., 384, pp. 120–136.
16. Wolfenden, R. (2006). Degrees of difficulty of water-consuming reac-
tions in the absence of enzymes, Chem. Rev., 106, pp. 3379–3396.

March 15, 2016 10:40 PSP Book - 9in x 6in 02-Allan-Svendsen-c02

Chapter 2

Protein Conformational Motions: Enzyme Catalysis

Xinyi Huang, C. Tony Liu, and Stephen J. Benkovic


Department of Chemistry, Pennsylvania State University, University Park,
PA 16802, USA
xinyi.huang@bio-techne.com, tonyliuc@gmail.com, sjb1@psu.edu

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

2.1 Introduction

Enzymes display a variety of remarkable catalytic functions essential
for cell viability and reproduction. Understanding the origin
of enzymatic rate enhancement has been a goal of biochemistry
for more than half a century. Initial attempts to understand
enzyme mechanisms were based on steady-state kinetics studies
that provided a measure of the catalytic efficiency characterized
by turnover number and the strength of substrate binding. Along
with direct studies of enzymes, the combination of physical organic
chemistry with protein structure fostered the hypothesis that the
catalytic efficiency of enzymes was attributable to the restriction of
substrate rotations and the orientation of catalytic groups within
the active site to facilitate an optimal low-energy transition state.
A more complete molecular description of catalysis was developed
through progress in rapid transient kinetics methods that extended
the time range available for observation, leading to the discovery of
intermediates that occur in virtually every enzyme catalytic cycle.
Various types of evidence in structural biology and biophysics
opened a new era in enzymology and focused attention on the
role of protein motions during enzyme catalysis. These methods
demonstrated conformational heterogeneity in an enzyme catalytic
cycle and provided evidence for how molecular binding at distal
sites could also control enzymatic activity through induced confor-
mational changes. These observations prompted the development
of models that connect the dynamics of enzyme conformational
changes to catalytic function. Complementary to the experimental
studies was the development of computational simulation as
a powerful tool to discover protein conformations with low
probability and a short lifetime inaccessible to other methods. In
this chapter, we present a review of recent studies on the role
of structural dynamics occurring during the enzymatic catalytic
process. Progress in studying motions that range from small-scale
atomic motions in catalytic sites to global domain fluctuations
provides essential links to an enzyme’s substrate recognition,
chemical turnover, and transition-state stabilization during the
course of the catalytic cycle.

2.2 Multidimensional Protein Landscape and the Timescales of Motions

Protein motions ranging from picoseconds to seconds have been
implicated in playing vital roles in the enzymatic process, such
as ligand binding, product dissociation, reorganization of the
reactive Michaelis–Menten complex, and the chemical reaction itself
[1–9]. One of the main driving forces in the exploration of the
functional significance of protein conformational changes derives
from advances in modern spectroscopic techniques, which have
improved researchers’ ability to probe structural changes over a
broad time range. These techniques include time-resolved X-ray
crystallography, solution nuclear magnetic resonance (NMR),
Fourier transform infrared (FTIR) spectroscopy, ultrafast 1D/2D
infrared (IR) spectroscopy, Förster resonance energy transfer
(FRET), and single-molecule techniques. New techniques are continuously
being developed to monitor the rearrangement of the
3D protein architecture in finer detail and with better time resolution. For
example, using picosecond time-resolved X-ray diffraction from a
myoglobin mutant, one can watch the frame-by-frame evolution of
the protein conformations that are associated with facilitating the
migration of a bound carbon monoxide (CO) ligand from one binding
site to another [10, 11]. Techniques like this greatly add to the static
information provided by conventional crystallography data, which
typically represent thermodynamically stable states of a protein that
might not be functionally important in a predominantly aqueous
environment. This is especially true under physiological conditions
where there is ample thermal energy in the system for a high degree
of thermal fluctuations of chemical bonds. It would be impossible
to truly comprehend the intricate relationship between enzyme
function and structure without these sophisticated spectroscopic
tools.
The energetics of an enzyme can be best described as possessing
a multidimensional energy landscape composed of multiple confor-
mational states, where the population of the various states is defined
by the relative potential free energy of each state (Fig. 2.1) [12,
13]. The energy barriers that separate the different states define
the rate, or the ease, for the interconversion between different
conformational states. Depending on the amount of thermal energy
in a system, a protein can fluctuate between various equilibrium
structures, thus exhibiting a large ensemble of conformations. This
multidimensional energy landscape description of a protein has
been corroborated by theoretical studies, which can examine protein
conformational changes that occur on different timescales. Typically,
closely related equilibrium structures are separated by a smaller
energy barrier, while structures that differ by large conformational
changes can easily be separated by a barrier of at least 10–20
kcal/mol. Experimentally, it is easier to extract information
on protein motions that result in significant conformational changes
(larger barrier and slower timescale process). Molecular dynamics
simulations can examine subtle changes (rapid changes with a
small energy barrier) in the atomic coordinates that occur along
the trajectory of an enzymatic reaction. Consequently, theoretical
studies have provided many valuable insights that are difficult to
obtain with current experimental means.

Figure 2.1 Multidimensional energy landscape of an enzyme. Conformational
changes from one structural state can be on the trajectory of the
reaction coordinate or orthogonal to the reaction coordinate or contain
components of both.

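
The link between relative free energy and the population of each conformational state can be made concrete with a small numerical sketch. The example below assumes simple Boltzmann weighting at equilibrium; the three state names and their relative free energies are hypothetical values chosen for illustration, not data from any enzyme discussed in this chapter.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def boltzmann_populations(rel_free_energies_kcal, temperature_k=298.15):
    """Return equilibrium fractions for states with the given relative free energies."""
    rt = R_KCAL * temperature_k
    # Boltzmann weight of each state relative to an arbitrary zero of energy.
    weights = [math.exp(-g / rt) for g in rel_free_energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical landscape: a ground state and two higher-energy conformers
# (illustrative free energies in kcal/mol, not measured values).
states = {"ground": 0.0, "excited_1": 1.0, "excited_2": 3.0}
pops = boltzmann_populations(list(states.values()))
for name, p in zip(states, pops):
    print(f"{name}: {100 * p:.1f}%")
```

Under these assumptions, a state lying only a few kcal/mol above the ground state is populated at well below 1%, which is one way to see why such conformers are hard to observe directly.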
While considering the relationship between protein function
and conformational fluctuations, we need to establish a clear
picture of the type of motions that are available in a protein
and distinguish the motions that are relevant to function. There
exists a wide spectrum of available motions spanning from events
that occur on timescales of femtoseconds to seconds (Fig. 2.2)
[6]. These events can be further classified into two groups: fast
picosecond–nanosecond motions that reflect the local fluctuations
of the protein residues and slow microsecond–second motions
that involve the collective conformational changes of a protein
in transitioning between equilibrium conformations. Examples of
fast local fluctuations include bond vibration, bond rotation, and
hydrogen bond formation/disruption. These rapid local fluctuations
allow the protein to sample from a large ensemble of structurally
similar states that are separated by very low-energy barriers.
Collectively, these fast timescale local fluctuations can produce the
larger structural rearrangements of a protein that are associated
with overcoming a higher-energy barrier. These slower timescale
(millisecond–second) conformational changes have attracted special
interest in recent years because most enzymatic reactions also
take place on this timescale [14].

Figure 2.2 Various protein motions spanning a broad range of timescales
and the current experimental capacities. Note that while techniques like
UV-Vis, fluorescence, and IR spectroscopy have the capability of following
events faster than the microsecond timescale, in most cases they are
only useful for events on the microsecond–second timescale. Except for
MD simulations, it is still very difficult to experimentally monitor protein
conformational fluctuations that are faster than the microsecond timescale.

2.3 Conformational Changes in Enzyme–Substrate Interactions

The catalytic cycle of an enzyme typically involves a series of
ligand-binding and dissociation events on either side of the actual
chemical transformation. In fact, enzyme–ligand interactions are
often optimized to ensure good catalytic efficiency and reaction
specificity. Binding interactions between ligand and protein require
close contact for short-range forces to act; thus it is reasonable
to assume that a substrate must have a matching shape to fit into
the active site of the enzyme. This process was initially explained
by the lock-and-key analogy proposed by Emil Fischer [15]. Conformations
of the free and ligand-bound protein (lock) are essentially
the same, and the substrate (key) should have the correct shape
to fit into the active site (keyhole) (Fig. 2.3A). Although Fischer’s
lock-and-key model insightfully emphasized the importance of
shape complementarity in an enzyme’s specificity, its inherent
rigid docking contradicts the intrinsic fluctuations observed
in the protein structure.

Figure 2.3 Models of enzyme–substrate interactions. Proteins are shown as
rectangles with a binding site for substrates (yellow triangles). (A) Lock and
key. Conformations of the free and ligand-bound protein are essentially the
same, with an active site that correctly matches the shape of the substrate.
(B) Induced fit. Proteins in unbound form can only weakly interact with the
substrate, and the initial interaction induces the conformational changes
in proteins to strengthen the binding. (C) Conformational selection. The
protein samples an ensemble of conformations, and the substrate selects
the conformer that allows optimal interaction. The binding process induces
a population shift in favor of the conformer that binds the substrate.

Conformational changes in proteins vary from global domain
movements to local fluctuations. Koshland’s induced fit model,
which originated from the conformational changes observed
between the substrate-bound and the substrate-unbound forms of
a given enzyme, takes flexibility into consideration [16]. In this
model, binding of the substrate drives the protein
toward a new conformation that is more complementary to its
binding partner (Fig. 2.3B). Thus, the geometric fit is ensured only
after the structural rearrangements of the proteins induced by their
interaction with the substrate(s). The induced fit model is
supported by numerous examples of X-ray structures of the same
proteins in free form without a ligand and in bound form with
ligands [17, 18]. However, it still treats proteins as if they exist in
a single, stable conformation under given experimental conditions
and ignores the consequence of thermal fluctuations.
It is now accepted that proteins have an intrinsic ability to sample
a vast ensemble of conformations irrespective of the substrate
[19, 20]. The conformational selection model originally proposed
by Weber takes into account this structural heterogeneity and
suggests that all protein conformations are predisposed to undergo
conformational fluctuations that are related to, or even required
for, their biological functions [21]. Conformational ensembles exist
in equilibrium, and weakly populated conformations with higher
energy are responsible for recognizing and binding to ligands. The
substrate simply stabilizes the fittest conformations that already
exist in an unbound state [22]. Once a substrate selects its most
favored conformation, a subsequent population shift toward that
conformer occurs (Fig. 2.3C). Such a process has been explained
using energy landscape theory [2, 23]. A protein may not be defined
by a single conformation but rather by an ensemble of closely related
conformations, or substates, that coexist in equilibrium
due to internal degrees of freedom that permit the structure to
relax/rearrange without altering the structural fold. Experimental
results from single-molecule, NMR, and other spectroscopic studies
have shown that unliganded proteins are capable of
populating different substates [24–26]. The energy landscape near
the native state (i.e., the lowest energy) contains several minima
corresponding to these substates. The most suitable substates
other than the native conformation will bind the ligand, shifting the
equilibrium toward complex formation [2].
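
The population shift described above can be sketched with a minimal two-conformer equilibrium model. This is an illustrative simplification, not a model taken from the cited studies: it assumes the free protein interconverts between a major conformer A and a minor, binding-competent conformer B, and that the ligand binds only to B. The function name and the values of K_CONF and KD_B are hypothetical.

```python
def conformer_fractions(ligand_conc, k_conf, kd_b):
    """Equilibrium fractions for A <-> B, with ligand L binding only to B.

    k_conf = [B]/[A] in the absence of ligand; kd_b = [B][L]/[BL].
    ligand_conc is the free ligand concentration, in the same units as kd_b.
    """
    # Statistical weights of each species relative to free conformer A.
    w_a = 1.0
    w_b = k_conf
    w_bl = k_conf * ligand_conc / kd_b
    total = w_a + w_b + w_bl
    return {"A": w_a / total, "B": w_b / total, "BL": w_bl / total}


K_CONF = 0.05  # hypothetical: about 5% of free protein is binding-competent
KD_B = 1.0     # hypothetical dissociation constant for conformer B, micromolar

for ligand in (0.0, 1.0, 10.0, 100.0, 1000.0):
    f = conformer_fractions(ligand, K_CONF, KD_B)
    in_b = f["B"] + f["BL"]  # total protein in the B-like conformation
    print(f"[L] = {ligand:7.1f} uM  A = {f['A']:.3f}  B-like total = {in_b:.3f}")
```

In this sketch the ligand never converts A into B directly; the B-like population grows only because binding removes free B from the pre-existing equilibrium, which is the essence of conformational selection.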
The concept of conformational selection has been applied to
explain the molecular recognition in several enzyme models. One
example is the adenylate kinase that catalyzes interconversion of
adenine nucleotides. A two-state conformational switch between
open and closed states along with ligand binding was initially
observed in X-ray crystallography [5]. As part of the catalytic
process, a conformational switch from open to closed states
is strongly correlated to the catalytic cycle, as discovered in a
recent NMR relaxation dispersion study [27]. Remarkably, structural
fluctuation into a bound conformation is essentially independent of
ligand binding, as shown by NMR spectroscopy and single-molecule
FRET in the absence of a substrate [5, 28]. Another example is
dihydrofolate reductase (DHFR). This enzyme adopts five different
intermediate complexes to complete catalysis, and each complex can
fluctuate into a conformation resembling the next and/or previous
steps in the catalytic cycle [23, 28], indicating that molecular binding
in DHFR alters the nature of thermally accessible states in the
conformational ensemble, organizing the protein structure for the
binding of an upcoming ligand or the release of a product.

2.4 Conformational Changes in Catalysis

Once the substrates are bound inside the active site, a main function
of an enzyme is to bring the reactive species into an appropriate
orientation and configuration inside its shielded protein cavity,
which provides an electrostatic environment highly favorable for the
chemical transformation to occur [4, 12, 29–34]. The electrostatic
environment inside the active site is largely responsible for lowering
the transition-state energy of an enzymatic reaction, making it a
more energy-efficient process than the corresponding uncatalyzed
reaction. Synthetic catalysts are designed to achieve a similar
goal, lowering the activation energy barrier. Unlike most small-molecule
catalysts or artificial enzymes, natural enzymes are large,
complex biological molecules consisting of many flexible domains.
Structural determination of an enzyme often yields an ensemble
of conformations under slightly different experimental conditions,
such as pH, salt concentrations, and the buffer used. The B-factors
determined from X-ray crystallography can pinpoint regions of a
protein that exhibit a higher degree of disorder. Often the more
disordered portions are the loop regions, and in many cases the
mobility of a specific loop region of a particular protein has been
linked to the catalytic process. While many of the conformational
changes around these flexible domains are important for ligand
selection, it is becoming clear that certain flexibilities are important
for catalysis.
The notion that a protein has evolved to obtain a certain
tertiary structure with a preorganized catalytic site to assist the
necessary chemical reaction is correct but inadequate [33–37]. This
view considers the conformational fluctuations leading up to the
transition state of an activation energy barrier as insignificant. In
a more expanded view, catalysis is described from the perspective
of an ensemble of conformations (Fig. 2.1) across the free-energy
landscape of an enzyme, with multiple conformational changes
involved in enzyme processes [9, 12, 13, 38, 39]. From this
perspective, the probability of sampling the reaction-conducive
configuration is related to the free-energy barrier of the enzyme
reaction. The probability of sampling the different configurations
(both reactive and nonreactive) depends on the relative energy
barriers that separate the available configurations. For example,
single-molecule fluorescence studies have also shown that the
enzymatic turnover rate for a single biomolecule can fluctuate over
a wide continuum of timescales (milliseconds to several seconds),
consistent with the protein sampling multiple conformers along the
reaction coordinate [40].
To fully address the relationship between conformational change
and catalysis is difficult because it is still not clear how to quantify
the degree of contribution. In fact, while the involvement of protein
motions in enzymatic processes is widely accepted, how much
they contribute to the overall reduction of the energy barrier
(catalysis) is still being debated. What is known is that there are
conformational changes that occur concurrently with the chemical
step in the catalytic cycle of an enzyme. In many cases selective
mutations that hinder these protein conformational changes also
result in the reduction of the rates of reactions [9, 41–45]. These
experimental observations have suggested a tantalizing possibility
of direct coupling (i.e., a causal and functional connection) between
enzyme motions and catalysis. In the following sections we will focus
mainly on one enzyme system to examine the structure–activity
relationship in enzyme catalysis.

2.4.1 Protein Dynamics of DHFR in the Catalytic Cycle


DHFR has emerged as a common enzyme model for studying the re-
lationship between structure and function, and herein we will use it
as an example to illustrate the importance of conformational change
in catalysis. DHFR is a ubiquitous enzyme that catalyzes the reduced
nicotinamide adenine dinucleotide phosphate (NADPH)-dependent
conversion of 7,8-dihydrofolate (DHF) to 5,6,7,8-tetrahydrofolate
(THF), which is involved in the biosynthesis of purines, thymidylate,
and several amino acids (Scheme 2.1) [46]. The catalytic cycle of the
Escherichia coli DHFR (ecDHFR) consists of five major complexes
(Fig. 2.4A), with the rate-limiting step being the release of THF
from the E:NADPH:THF complex. X-ray crystallography [47], NMR
relaxation dispersion [23], and photophysical data [48, 49] have
shown significant conformational changes along the catalytic cycle,
especially in the flexible active-site Met20 loop (residues 9–24;
Fig. 2.4B).

Scheme 2.1 DHFR-catalyzed reduction of dihydrofolic acid.

Figure 2.4 (A) The five major complexes and (B) the two main enzyme
conformations in the catalytic cycle of ecDHFR. Complexes in the closed
conformation (PDB 7DFR) are shown in green, and complexes in the
occluded conformation (PDB 1RC4) are shown in pink.

Specifically, the enzyme can adopt either a closed conformation
(as in E:NADPH and E:NADPH:DHF) or an occluded conformation
(as in E:NADP+:THF, E:THF, and E:NADPH:THF). The closed conformation
is best described as having the Met20 loop packed against
the nicotinamide ring of NADPH. In contrast, the loop protrudes
into the active site in the occluded conformation, while sterically
extruding the nicotinamide. NMR relaxation dispersion experiments
found a rate constant of ∼1200 s⁻¹ at 300 K for the transition
from the closed conformation adopted by the Michaelis–Menten
complex (using E:NADP+:FOL to model the E:NADPH:DHF complex)
to the initial product complex (E:NADP+:THF), which is in the
occluded conformation [23]. Within experimental error, this kinetic
value determined for the conformational change across the hydride
transfer step matches the pre-steady-state hydride transfer rate
(khyd = 950 s⁻¹ at 298 K) for the chemical reaction [46]. Therefore,
there is a temporal correlation between the conformational change
and the actual chemical reaction.
Subsequent studies showed that the poly-proline mutation
(N23PP) in ecDHFR can prevent the active-site Met20 loop from
undergoing its closed to occluded conformational change [41].
This mutation was inspired by examining the human DHFR (Homo
sapiens DHFR, or hsDHFR), which has a proline-rich PWPP region in
the Met20 loop that keeps the loop solely in the closed conformation
through the entire catalytic cycle. Similar NMR relaxation dispersion
experiments have shown that the N23PP mutation introduced into
wild-type ecDHFR does restrict the millisecond-timescale Met20
loop conformational transition observed in the wild-type enzyme
associated with the conversion of the reactive ternary Michaelis–Menten
complex to the product complex. Furthermore, this mutation
also lowered the rate of the enzyme-promoted hydride transfer
reaction ∼30-fold. Together, the NMR studies on the wild-type (WT) enzyme
and the motion-restricted mutant suggest that both the ligand
release (turnover step) and the hydride transfer are somehow linked
to the rates of conformational changes. It is important to point out
that the conformational changes monitored in these NMR studies
represent the collective structural rearrangements and not just the
specific motion of a domain or a residue. The transition from the
closed to occluded conformations involves the global reorganization
of protein residues, with the changes in the Met20 loop orientation
being the most pronounced feature.
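
To put the ∼30-fold rate reduction in energetic terms, the sketch below converts a fold change in rate into a change in activation free energy. It assumes simple transition-state theory with an unchanged pre-exponential factor, so that the change in barrier is RT ln(fold change); the temperature of 298.15 K and the function name are assumptions made for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol*K)


def barrier_change_kcal(fold_decrease, temperature_k=298.15):
    """Increase in activation free energy implied by a fold decrease in rate."""
    if fold_decrease <= 0:
        raise ValueError("fold_decrease must be positive")
    # ddG = R * T * ln(k_fast / k_slow), assuming the same prefactor for both.
    return R_KCAL * temperature_k * math.log(fold_decrease)


ddg = barrier_change_kcal(30.0)
print(f"30-fold slower hydride transfer ~ +{ddg:.1f} kcal/mol in barrier height")
```

Under these assumptions, a 30-fold slower hydride transfer corresponds to a barrier increase of roughly 2 kcal/mol.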

2.4.2 Temporal Overlap: Correlation Does Not Mean Causation

Aside from the Met20 loop, the conformational changes in other
flexible regions of DHFR have been implicated as important for
catalysis [23, 50, 51]. Other flexible regions include the Gly51
(residues 48–54), GH (residues 142–150), and FG loops (residues
116–132), and theoretical studies have found that these regions
also exhibit large conformational fluctuations that are coupled to
the hydride transfer reactions (Fig. 2.4B). To gain insight into how
specific motions might be related to catalysis, fluorescence probes
were installed into selected regions of the protein (Fig. 2.5) [49].
By utilizing the FRET signal between Alexa Fluor 555
maleimide and QSY 35 iodoacetamide probes (the Förster distance R0,
the distance at which the energy transfer efficiency is 50%, is
24 Å for this pair) [48], one can selectively monitor the movements of
the Met20, Gly51, GH, and FG loops during the hydride transfer step.
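
The distance sensitivity of such a probe pair can be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)^6). The sketch below uses the Förster distance of 24 Å quoted above; the probe separations in the loop are arbitrary illustrative values, not distances measured in ecDHFR.

```python
def fret_efficiency(distance_angstrom, r0_angstrom=24.0):
    """Förster transfer efficiency E = 1 / (1 + (r / R0)^6)."""
    if distance_angstrom < 0:
        raise ValueError("distance must be non-negative")
    return 1.0 / (1.0 + (distance_angstrom / r0_angstrom) ** 6)


# Illustrative donor-acceptor separations (in Å), chosen around R0.
for r in (12.0, 20.0, 24.0, 28.0, 36.0):
    print(f"r = {r:4.1f} Å  ->  E = {fret_efficiency(r):.3f}")
```

Because of the sixth-power dependence, the efficiency changes most steeply for separations close to R0, which is why a probe pair is most informative when the monitored distance lies near its Förster distance.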

Figure 2.5 The locations of the inserted FRET probes, which are used
to monitor the changes in the distance and geometry between the
individual probe pairs along the reaction coordinate. Yellow spheres, QSY
35 iodoacetamide; blue spheres, Alexa Fluor 555 maleimide.

At pH 7.0 the rate constants for the conformational change
(measured by changes in the FRET intensity) and the catalyzed hydride
transfer reaction are essentially identical within experimental
error [49]. Consistent with the NMR relaxation dispersion study on
ecDHFR, these specific motions (just the relative change between
the probes and not the global rearrangement of protein structure)
also occur on the same timescales as the chemical reaction. Another
example of temporally related conformational change–chemical
reaction is the ultrafast pump-probe IR and stopped-flow FTIR
(SF-FTIR) spectroscopy study, which detected both picosecond and
millisecond protein motions during the catalytic cycle of coenzyme
B12-dependent ethanolamine ammonia lyase [52]. Protein motions
that occur on the same timescale as the chemical reaction may
simply reflect the intrinsic flexibility of the protein and have
no connection with the chemical step [5, 53, 54]. In fact, a
perturbation that affects the activation energy barrier (or rate) of
the chemical step should have a similar effect on the reaction-
coupled motions. The above FRET study with ecDHFR revealed a
disconnection between the measured conformational changes and
the chemical transformation despite the initial temporal similarity
between the two events. As expected, the chemical reaction khyd
is sensitive to the pH (which affects the protonation of DHF) and
isotope substitution on the cofactor (which affects the transfer of a
negatively charged hydride). However, the conformational changes
were found to be insensitive to both pH and isotope substitution.
This indicates that the specific conformational changes monitored
in the above study are only related to the hydride transfer step in
the sense that they both occur on the same timescale.
This was supported by empirical valence bond molecular dynamics
simulations that examined the equilibrium thermally averaged Cα–Cα
distance changes on the residues that were attached to the FRET
probes. Simulations found no changes in the thermally averaged
distance between the reactant state and the transition state of the
reaction. In fact, it is not surprising that there are many reports that
claim reaction-coupled protein motions without solid evidence of
how these structural changes might be related to the energetics of
the chemical reaction. It is highly probable that many of these claims
have no causal basis.

2.4.3 Fast Timescale Conformational Fluctuations


In Sections 2.4.1 and 2.4.2 we focused on slow timescale motions
(millisecond–second). Fast timescale dynamics in proteins are
intrinsic properties encoded in the amino acid sequences. Although
there is no doubt that fast dynamics impact protein structure,
the contribution of motions on this timescale during the catalytic
cycle is still unclear. Obtaining experimental evidence to solve the
puzzle requires fast reaction techniques to extend the timescale
available for detection (Fig. 2.2), and exciting progress has been
made in recent years. Advances in X-ray technology allow the
possibility of producing structural models that include a timescale
of residue motions [55, 56]. As mentioned in an earlier section, one
remarkable example is the real-time observation of carbon monoxide
migration correlated with side-chain motions in myoglobin using
Laue X-ray diffraction [57]. However, this method requires the
reaction to be triggered within the crystal and the crystal
lattice to tolerate structural changes, which limits its applicability
for most enzymes. NMR relaxation methods are another powerful tool
in characterizing picosecond dynamics, but the technique requires
the local dynamics to be faster than the overall tumbling of the protein
[6]. Measurement of bond vibrations on the femtosecond timescale
has been achieved through progress in laser technology. The
recently developed methods such as 4D ultrafast electron diffraction
(UED), ultrafast crystallography (UEC), and ultrafast microscopy
(UEM) provide atomic-scale resolution in space and time, allowing
direct observation of the transient structures in enzymes and the
chemical steps, including the breaking and forming of bonds and the
transfer of protons, hydride ions, and electrons [58, 59].
It is of interest to note that this timescale is consistent with
the transition-state lifetimes estimated for both enzymatic reactions
and gas-phase reactions [60], essentially linking fast dynamics to the
chemical reaction of breaking and forming of reactive bonds. Besides
the chemical turnover, transitions commonly occurring in enzymes
such as the making and breaking of the water structure, hydrogen
bonding, and molecular vibrations also occur on the fast timescale
[4]. One proposal suggests that fast protein dynamics dominates
the transition-state barrier crossing during enzymatic catalysis [61]
(Fig. 2.6). In this theory, the height of the transition-state barrier
is determined by the probability of rare promoting vibrations at
the catalytic site. Fast local fluctuations involving simultaneous
enzyme–substrate interactions on the femtosecond timescale allow
stochastic searches of multidimensional conformation space for
the transition-state ensembles. Once the dynamic search obtains
optimized configurations in the catalytic site, reaction chemistry
occurs. However, this timescale is many orders of magnitude faster
than ligand binding and the conformational changes known to occur
during enzyme catalysis [61, 62].

Figure 2.6 Simplified energy landscape of enzyme catalytic reaction.
Reaction barrier crossing is typically on a fast timescale, while the catalytic
cycle is on a slow timescale.

Others have also speculated the presence of a special promoting
mode of motions that occur in the femtosecond–nanosecond
timescales, which can facilitate the cleavage/formation of chemical
bonds at the transition state through the compression (electro-
statically or sterically) of reacting substrates [8, 63, 64]. For
March 15, 2016 10:40 PSP Book - 9in x 6in 02-Allan-Svendsen-c02

36 Protein Conformational Motions

example, the activation energy barrier for proton-coupled electron


transfer reactions is related to the donor–acceptor distances (DADs).
The barrier decreases as the DAD decreases, and the process
becomes barrierless at extremely short distances [65]. It may
be that there are motions that help to lower the activation
energy barrier by compressing the DAD. Bandaria et al. applied
ultrafast 2D IR echo spectroscopy to study a transition-state complex
mimic of formate dehydrogenase (FDH) and found hydrogen bond
fluctuations between four active-site residues and a transition-state
ligand analog (azide, N₃⁻) on the femtosecond timescale
[66]. This was interpreted as motions relevant for the optimized
DAD between the reacting substrates in the transition state. This
interpretation ignores the fact that similar compression of reacting
species would also exist in the uncatalyzed reaction in solution
[34]. Also, the energetic costs (e.g., from electrostatic repulsion
and entropy) for compressing the DAD were unaccounted for. More
importantly, such an analysis only considers the energy costs for
bond breaking/forming when the system is at the transition state
of a reaction coordinate, and it completely ignores the energy
cost to get to the transition state (Fig. 2.6). The average DAD
typically decreases as the reaction evolves along the collective
reaction coordinate from the reactant state to the transition state,
which usually occurs on the millisecond–second timescale for most
enzymatic reactions. A process that takes milliseconds to seconds to
complete is typically associated with an energy barrier of at least
10 kcal/mol at room temperature. Therefore, there is a temporal
and energetic disconnect between these promoting modes and a
typical enzyme reaction. It is not clear how these fast motions
detected along the reaction coordinate can contribute energetically
to overcoming the energy barrier of the chemical reaction. In other
words, there is currently no convincing evidence of the catalytic
relevance of such promoting modes of motion.
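The link between the timescale of a process and its energy barrier, used in the argument above, can be made concrete with a short calculation. The sketch below assumes simple transition-state theory (the Eyring equation with a transmission coefficient of 1) and a temperature of 298.15 K; neither assumption is stated in the text, and the function name is purely illustrative.

```python
import math

# Physical constants (CODATA values).
K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J*s
R_KCAL = 1.987204e-3    # gas constant, kcal/(mol*K)


def eyring_barrier_kcal(rate_per_s, temperature_k=298.15):
    """Free-energy barrier (kcal/mol) implied by a first-order rate constant.

    Assumption: simple transition-state theory with a transmission
    coefficient of 1, i.e. k = (k_B*T/h) * exp(-dG/(R*T)).
    """
    prefactor = K_B * temperature_k / H  # roughly 6.2e12 s^-1 near 298 K
    return R_KCAL * temperature_k * math.log(prefactor / rate_per_s)


if __name__ == "__main__":
    examples = [
        ("millisecond process", 1e3),
        ("second process", 1.0),
        ("picosecond process", 1e12),
    ]
    for label, k in examples:
        barrier = eyring_barrier_kcal(k)
        print(f"{label}: k = {k:.0e} s^-1, barrier ~ {barrier:.1f} kcal/mol")
```

Under these assumptions, millisecond-to-second processes correspond to barriers well above 10 kcal/mol, whereas a picosecond process corresponds to a barrier of about 1 kcal/mol, which illustrates the temporal and energetic disconnect described above.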

2.4.4 Effect of Conformational Changes on the Electrostatic Environment
So what is the significance of protein motions in catalysis? It
is logical to think that conformational rearrangements of the
protein could significantly influence the electrostatic environment
in the active site, and an appropriate electrostatic environment
is necessary to stabilize the transition state of the reaction.
Vibrational IR spectroscopy has been used to probe the local
electrostatic environment in a number of biomolecules [67–71].
For example, Boxer and coworkers have used nitrile (CN) probes
to detect differences in the electrostatic field projected along the
CN axis in proteins upon ligand binding and protein folding for
Δ⁵-3-ketosteroid isomerase (KSI) [32, 36, 71–74]. The changes
in the active-site electrostatics of ecDHFR have been investigated by
incorporating small and catalytically nonintrusive IR-active probes
(thiocyanate groups) at site-specific locations. By analyzing the
experimental data with complementary vibrational spectroscopy
methods and mixed quantum mechanical/molecular mechanical
(QM/MM) simulations, one can extract information relating to the
electrostatics and degree of hydration of the microenvironments
surrounding the thiocyanate probes [75]. Not surprisingly, the
study showed significant changes in the microenvironments that
constitute the heterogeneous ecDHFR active site as the enzyme
takes on different conformations while progressing along the
catalytic cycle. The electrostatic contributions from the active-
site microenvironments surrounding the probe can be further
decomposed into individual residues and ligands. This analysis was
able to demonstrate that the electrostatic interactions between the
protein and the substrates/ligands help in orienting the reacting
species in a geometry that maintains a large electric field favoring
hydride transfer from the cofactor to the substrate. In addition,
the enzyme’s active-site environment provides a significant portion
(∼33%) of the electric field that facilitates the transfer of a
negatively charged hydride. Moreover, it is reasonable to think
that the active-site electrostatic environment will be influenced by
the conformational arrangement of the enzyme. This would imply
that as the enzyme undergoes these millisecond–second timescale
collective conformational changes during the catalytic cycle, the
active-site electrostatics should change concurrently. Therefore,
while individual protein motions probably cannot contribute signi-
ficantly to catalysis, collective thermal conformational fluctuations
that occur as the system evolves from the reactant state to the
transition state are important for optimizing the degree
of energy stabilization for the reaction transition state [4, 30, 31, 75,
76]. Future progress in the ability to quantitatively probe relevant
electrostatic changes as a function of protein motions along the
reaction coordinate could provide the necessary insights into the
structure–electrostatic activity relationship in enzyme catalysis.
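The way a vibrational probe reports on the electric field projected along its bond axis can be sketched with the linear vibrational Stark relation, in which the frequency shift is proportional to the change in the projected field. This is a minimal illustration only: the sign convention, the function names, and the tuning rate of 0.7 cm^-1 per MV/cm are assumptions chosen for the example and are not values taken from the studies cited above. The linear picture also ignores hydrogen-bonding contributions to nitrile frequency shifts, which ref. 71 treats separately.

```python
def stark_shift_cm1(tuning_rate, field_change_mv_per_cm):
    """Frequency shift (cm^-1) from a change in the field along the probe axis.

    Assumption: linear Stark effect, shift = -tuning_rate * field_change,
    with tuning_rate in cm^-1 per (MV/cm). The sign convention is illustrative.
    """
    return -tuning_rate * field_change_mv_per_cm


def field_change_from_shift(tuning_rate, shift_cm1):
    """Invert the linear relation to estimate the projected field change (MV/cm)."""
    return -shift_cm1 / tuning_rate


if __name__ == "__main__":
    ASSUMED_TUNING_RATE = 0.7  # cm^-1 per (MV/cm); placeholder, not a measured value
    observed_shift = -3.5      # cm^-1; hypothetical shift between two enzyme states
    field_change = field_change_from_shift(ASSUMED_TUNING_RATE, observed_shift)
    print(f"Estimated projected field change: {field_change:.1f} MV/cm")
```

In practice the tuning rate has to be calibrated for the specific probe, so the numbers here only show how a measured shift would be converted into a field change.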

2.5 Conservation of Protein Motions in Evolution

If protein motions play an important role in the function of enzymes,
it is reasonable to hypothesize that catalytically relevant motions
might be conserved through evolution. This is no different from
the idea of important catalytic residues being conserved among
homologous enzymes. Many evolutionarily conserved residues
serve important structural roles, and mutations to these residues
would have serious consequences on protein folding. Since it
is known that homologous enzymes found in different species often
share a conserved structural core, it is possible that conformational
fluctuations within the core structure are also conserved.
Despite the fact that human and E. coli DHFRs share a very low
sequence identity (∼26%), a common
overall structural scaffold has been retained over the billions of
years since the species diverged in evolution [51]. Both DHFRs
are highly active in converting DHF into THF. Thus, in view of the
trillions of generations that E. coli has undergone since then [77], the 26%
identity may represent a floor to the divergence that still permits
retention of structure and function. By comparing the DHFRs from
E. coli, Mycobacterium tuberculosis, Candida albicans, and Homo
sapiens, conformational fluctuations detected to be relevant to the
catalyzed hydride transfer reactions were found to be the same
among the four species examined [50]. Not only were the confor-
mational fluctuations found in the same regions of the proteins,
but the behavior of the motions was similar, too. Similar findings
were obtained for a peptidyl-prolyl isomerase (CypA) and RNase A,
where similar reaction-related motions were detected in enzymes
from different species. In other words, these data suggest that
homologous enzymes utilize similar conformational motions to
facilitate the same chemical transformations. The study also showed
that these conserved motions can be found both around and distal
to the active site, implicating a network of conserved interactions
that connect the fluctuations around surface regions to those in the
active site.
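The sequence identity figure quoted above for the human and E. coli DHFRs is a statistic computed from a pairwise alignment. A minimal sketch of one common definition is given below: identical residues divided by the number of aligned, ungapped columns. Other definitions use a different denominator, so this is one convention among several. The function name and the toy sequences are hypothetical and are not DHFR sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment for illustration only.
    seq_a = "MKT-AIL"
    seq_b = "MRTQA-L"
    print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```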
Similar conclusions were obtained with combined NMR relax-
ation dispersion and ligand titration experiments on two structural
homologues of the pancreatic ribonuclease family, RNase A and
RNase 3 [78]. These two homologues also exhibited very similar,
and yet functionally distinct, clusters of millisecond conformational
fluctuations. However, some have also suggested that homologous
enzymes might evolve to employ different protein dynamics to carry
out the same chemical reaction. It is possible that the same enzyme
in different species can adjust to different environments/needs
simply by fine-tuning conformational flexibility without altering the
main core structure. Regardless, the above examples show that while
the concept of evolutionarily conserved functional protein motions
is not well understood or clearly defined, it is clear that there is
a certain mobility encoded in the primary sequence of a protein,
translating into specific conformational motions that are integral
to enzyme function and are more regulated than just random
thermal fluctuations.

2.6 Designing Protein Dynamics

One of the challenges in the design of artificial proteins with biological
activities is that their catalytic proficiency falls far short of that of native
enzymes [79]. Current design efforts tend to consider only the
chemical steps in an enzyme catalytic cycle. Once transition-state
ensembles have been defined for a potential catalytic mechanism, the design
process focuses on tailoring the active site for recognition
of transition-state mimics. For example, rational design applies
a computational algorithm to optimize the orientation of the
composite transition-state rigid body and the catalytic side chains
in input protein scaffolds [80, 81]. Similarly, the catalytic antibodies
raised in response to transition-state analogs in the immune system
are anticipated to catalyze the desired reaction by forcing the bound
substrate to resemble the transition state [82]. Despite some exciting
successes, artificial catalysts typically achieve rate enhancements
in the range of 10⁴–10⁶ relative to solution chemistry, failing to
reach the range of 10¹²–10²⁰ commonly found in natural enzymes
[79]. Thus, strong binding of the transition state alone is not sufficient to
achieve the catalytic rate enhancements of native enzymes.
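To put the rate enhancements quoted above on an energy scale, a rate enhancement can be converted into the corresponding reduction of the activation free energy. The sketch below assumes transition-state theory with equal prefactors for the catalyzed and uncatalyzed reactions and a temperature of 298.15 K; these assumptions and the function name are illustrative and do not come from the text.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol*K)


def barrier_reduction_kcal(rate_enhancement, temperature_k=298.15):
    """Barrier lowering (kcal/mol) equivalent to a given rate enhancement.

    Assumption: k_cat / k_uncat = exp(ddG / (R*T)) with identical prefactors.
    """
    return R_KCAL * temperature_k * math.log(rate_enhancement)


if __name__ == "__main__":
    for label, enhancement in [
        ("designed catalyst, lower end", 1e4),
        ("designed catalyst, upper end", 1e6),
        ("natural enzyme, lower end", 1e12),
        ("natural enzyme, upper end", 1e20),
    ]:
        ddg = barrier_reduction_kcal(enhancement)
        print(f"{label}: {enhancement:.0e}-fold ~ {ddg:.1f} kcal/mol")
```

Under these assumptions, designed catalysts lower the barrier by roughly 5 to 8 kcal/mol, while natural enzymes lower it by roughly 16 to 27 kcal/mol.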
Neglecting the conformational changes that are essential for ligand
recognition and chemistry severely limits the catalytic efficiency of
designed enzymes. Naturally occurring enzymes usually have large protein
matrices, and dynamics from the full protein architecture occurs
throughout the catalytic cycle [23]. Small and subtle motions of
the protein backbone both around the bound ligands and remote
from the catalytic site are crucial to the function of the catalyst,
as discussed above. Thus, a new design strategy should consider
the dynamic contribution, including conformational changes for
redistribution of the conformer populations and distal effects
essential to chemical turnover, as one of the keys to creating highly
efficient artificial catalysts.

2.7 Concluding Remarks

Protein flexibility and conformational changes play important roles
in enzyme functions. For an enzymatic reaction, specific collective
motions are necessary for the reactive complex to sample various
conformations along the reaction coordinate. These progressive
conformational changes, from one conformer to another as the
enzyme goes from the reactant state to the transition state, occur
concurrently with the catalyzed reaction, resulting in gradual
optimization of the active-site electrostatics to complement the
transition state. This allows the active-site electrostatics to provide
greater energy stabilization in the transition state than in the
reactant state. Similarly, the enzyme can
utilize the same strategy to provide a more favorable environment
for ligand-related steps (e.g., substrate binding and product release).
It is true that quantitative assessment of the catalytic contribution
from conformational motions is difficult, and this topic has caused
much confusion and disagreement in the literature. Nevertheless,
there is a strong interest in the field and a consensus on the
importance of the issue.

References

1. Benkovic, S. J., and Hammes-Schiffer, S. (2006). Biochemistry. Enzyme
motions inside and out, Science, 312(5771), pp. 208–209.
2. Boehr, D. D., Nussinov, R., and Wright, P. E. (2009). The role of dynamic
conformational ensembles in biomolecular recognition, Nat. Chem. Biol.,
5(11), pp. 789–796.
3. Frauenfelder, H., et al. (2009). A unified model of protein dynamics, Proc.
Natl. Acad. Sci. U S A, 106(13), pp. 5129–5134.
4. Hammes, G. G., Benkovic, S. J., and Hammes-Schiffer, S. (2011).
Flexibility, diversity, and cooperativity: pillars of enzyme catalysis,
Biochemistry, 50(48), pp. 10422–10430.
5. Henzler-Wildman, K. A., et al. (2007). Intrinsic motions along an
enzymatic reaction trajectory, Nature, 450(7171), pp. 838–844.
6. Henzler-Wildman, K., and Kern, D. (2007). Dynamic personalities of
proteins, Nature, 450(7172), pp. 964–972.
7. Masterson, L. R., et al. (2010). Dynamics connect substrate recognition
to catalysis in protein kinase A, Nat. Chem. Biol., 6(11), pp. 821–828.
8. Schwartz, S. D., and Schramm, V. L. (2009). Enzymatic transition states
and dynamic motion in barrier crossing, Nat. Chem. Biol., 5(8), pp. 551–
558.
9. Nagel, Z. D., and Klinman, J. P. (2009). A 21st century revisionist’s view
at a turning point in enzymology, Nat. Chem. Biol., 5(8), pp. 543–550.
10. Schotte, F., et al. (2003). Watching a protein as it functions with 150-
ps time-resolved X-ray crystallography, Science, 300(5627), pp. 1944–
1947.
11. Bourgeois, D., et al. (2003). Complex landscape of protein structural
dynamics unveiled by nanosecond Laue crystallography, Proc. Natl.
Acad. Sci. U S A, 100(15), pp. 8704–8709.
12. Benkovic, S. J., Hammes, G. G., and Hammes-Schiffer, S. (2008). Free-
energy landscape of enzyme catalysis, Biochemistry, 47(11), pp. 3317–
3321.
13. Frauenfelder, H., Sligar, S. G., and Wolynes, P. G. (1991). The energy
landscapes and motions of proteins, Science, 254(5038), pp. 1598–
1603.
14. Wolfenden, R., and Snider, M. J. (2001). The depth of chemical time and
the power of enzymes as catalysts, Acc. Chem. Res., 34(12), pp. 938–
945.
15. Fischer, E. (1894). Einfluss der konfiguration auf die wirkung der
enzyme, Ber. Dtsch. Chem. Ges., 27(3), pp. 2985–2993.
16. Koshland, D. E. (1958). Application of a theory of enzyme specificity to
protein synthesis, Proc. Natl. Acad. Sci. U S A, 44(2), pp. 98–104.
17. Flores, S., et al. (2006). The database of macromolecular motions: new
features added at the decade mark, Nucleic Acids Res., 34(Database
issue), pp. D296–D301.
18. Qi, G., Lee, R., and Hayward, S. (2005). A comprehensive and non-
redundant database of protein domain movements, Bioinformatics,
21(12), pp. 2832–2838.
19. Henzler-Wildman, K. A., et al. (2007). A hierarchy of timescales in
protein dynamics is linked to enzyme catalysis, Nature, 450(7171), pp.
913–916.
20. Bahar, I., Chennubhotla, C., and Tobi, D. (2007). Intrinsic dynamics of
enzymes in the unbound state and relation to allosteric regulation, Curr.
Opin. Struct. Biol., 17(6), pp. 633–640.
21. Weber, G. (1972). Ligand binding and internal equilibria in proteins,
Biochemistry, 11(5), pp. 864–878.
22. Ma, B., et al. (2002). Multiple diverse ligands binding at a single protein
site: a matter of pre-existing populations, Protein Sci., 11(2), pp. 184–
197.
23. Boehr, D. D., et al. (2006). The dynamic energy landscape of dihydrofo-
late reductase catalysis, Science, 313(5793), pp. 1638–1642.
24. Greenleaf, W. J., Woodside, M. T., and Block, S. M. (2007). High-resolution,
single-molecule measurements of biomolecular motion, Annu. Rev.
Biophys. Biomol. Struct., 36, pp. 171–190.
25. Hinterdorfer, P., and Dufrene, Y. F. (2006). Detection and localization of
single molecular recognition events using atomic force microscopy, Nat.
Methods, 3(5), pp. 347–355.
26. Busenlehner, L. S., and Armstrong, R. N. (2005). Insights into enzyme
structure and dynamics elucidated by amide H/D exchange mass
spectrometry, Arch. Biochem. Biophys., 433(1), pp. 34–46.
27. Wolf-Watz, M., et al. (2004). Linkage between dynamics and catalysis in
a thermophilic-mesophilic enzyme pair, Nat. Struct. Mol. Biol., 11(10),
pp. 945–949.
28. Hanson, J. A., et al. (2007). Illuminating the mechanistic roles of enzyme
conformational dynamics, Proc. Natl. Acad. Sci. U S A, 104(46), pp.
18055–18060.
29. Doshi, U., and Hamelberg, D. (2014). The dilemma of conformational dy-
namics in enzyme catalysis: perspectives from theory and experiment,
Adv. Exp. Med. Biol., 805, pp. 221–243.
30. Garcia-Meseguer, R., et al. (2013). Studying the role of protein dynamics
in an S(N)2 enzyme reaction using free-energy surfaces and solvent
coordinates, Nat. Chem., 5(7), pp. 566–571.
31. Kosugi, T., and Hayashi, S. (2012). Crucial role of protein flexibility
in formation of a stable reaction transition state in an alpha-amylase
catalysis, J. Am. Chem. Soc., 134(16), pp. 7045–7055.
32. Suydam, I. T., et al. (2006). Electric fields at the active site of an enzyme:
direct comparison of experiment with theory, Science, 313(5784), pp.
200–204.
33. Warshel, A., et al. (2006). Electrostatic basis for enzyme catalysis, Chem.
Rev., 106(8), pp. 3210–3235.
34. Kamerlin, S. C., and Warshel, A. (2010). At the dawn of the 21st century:
is dynamics the missing link for understanding enzyme catalysis?
Proteins, 78(6), pp. 1339–1375.
35. Loveridge, E. J., et al. (2012). Evidence that a “dynamic knockout” in
Escherichia coli dihydrofolate reductase does not affect the chemical
step of catalysis, Nat. Chem., 4(4), pp. 292–297.
36. Jha, S. K., et al. (2011). Direct measurement of the protein response to an
electrostatic perturbation that mimics the catalytic cycle in ketosteroid
isomerase, Proc. Natl. Acad. Sci. U S A, 108(40), pp. 16612–16617.
37. Childs, W., and Boxer, S. G. (2010). Solvation response along the reaction
coordinate in the active site of ketosteroid isomerase, J. Am. Chem. Soc.,
132(18), pp. 6474–6480.
38. Glowacki, D. R., Harvey, J. N., and Mulholland, A. J. (2012). Taking
Ockham’s razor to enzyme dynamics and catalysis, Nat. Chem., 4(3), pp.
169–176.
39. Hammes-Schiffer, S., and Benkovic, S. J. (2006). Relating protein motion
to catalysis, Annu. Rev. Biochem., 75, pp. 519–541.
40. English, B. P., et al. (2006). Ever-fluctuating single enzyme molecules:
Michaelis-Menten equation revisited, Nat. Chem. Biol., 2(2), pp. 87–94.
41. Bhabha, G., et al. (2011). A dynamic knockout reveals that confor-
mational fluctuations influence the chemical step of enzyme catalysis,
Science, 332(6026), pp. 234–238.
42. Wang, L., et al. (2006). Coordinated effects of distal mutations on
environmentally coupled tunneling in dihydrofolate reductase, Proc.
Natl. Acad. Sci. U S A, 103(43), pp. 15753–15758.
43. Wang, Z., et al. (2012). A remote mutation affects the hydride transfer
by disrupting concerted protein motions in thymidylate synthase, J. Am.
Chem. Soc., 134(42), pp. 17722–17730.
44. Toney, M. D., Castro, J. N., and Addington, T. A. (2013). Heavy-enzyme
kinetic isotope effects on proton transfer in alanine racemase, J. Am.
Chem. Soc., 135(7), pp. 2509–2511.
45. Pudney, C. R., et al. (2013). Fast protein motions are coupled to enzyme
H-transfer reactions, J. Am. Chem. Soc., 135(7), pp. 2512–2517.
46. Fierke, C. A., Johnson, K. A., and Benkovic, S. J. (1987). Construction
and evaluation of the kinetic scheme associated with dihydrofolate
reductase from Escherichia coli, Biochemistry, 26(13), pp. 4085–4092.
47. Sawaya, M. R., and Kraut, J. (1997). Loop and subdomain movements
in the mechanism of Escherichia coli dihydrofolate reductase: crystallo-
graphic evidence, Biochemistry, 36(3), pp. 586–603.
48. Antikainen, N. M., et al. (2005). Conformation coupled enzyme catalysis:
single-molecule and transient kinetics investigation of dihydrofolate
reductase, Biochemistry, 44(51), pp. 16835–16843.
49. Liu, C. T., et al. (2013). Temporally overlapped but uncoupled motions in
dihydrofolate reductase catalysis, Biochemistry, 52(32), pp. 5332–5334.
50. Ramanathan, A., and Agarwal, P. K. (2011). Evolutionarily conserved
linkage between enzyme fold, flexibility, and catalysis, PLOS Biol., 9(11),
p. e1001193.
51. Liu, C. T., et al. (2013). Functional significance of evolving protein
sequence in dihydrofolate reductase from bacteria to humans, Proc.
Natl. Acad. Sci. U S A, 110(25), pp. 10159–10164.
52. Russell, H. J., et al. (2012). Protein motions are coupled to the reaction
chemistry in coenzyme B12-dependent ethanolamine ammonia lyase,
Angew. Chem., Int. Ed. Engl., 51(37), pp. 9306–9310.
53. Kurkcuoglu, Z., et al. (2012). Coupling between catalytic loop motions
and enzyme global dynamics, PLOS Comput. Biol., 8(9), p. e1002705.
54. Williams, J. C., and McDermott, A. E. (1995). Dynamics of the flexible
loop of triosephosphate isomerase: the loop motion is not ligand gated,
Biochemistry, 34(26), pp. 8309–8319.
55. Merritt, E. A. (1999). Expanding the model: anisotropic displacement
parameters in protein structure refinement, Acta Crystallogr., Sect. D:
Biol. Crystallogr., 55(Pt 6), pp. 1109–1117.
56. Bourgeois, D., and Royant, A. (2005). Advances in kinetic protein
crystallography, Curr. Opin. Struct. Biol., 15(5), pp. 538–547.
57. Schotte, F., et al. (2004). Picosecond time-resolved X-ray crystallogra-
phy: probing protein function in real time, J. Struct. Biol., 147(3), pp.
235–246.
58. Zewail, A. H. (2006). 4D ultrafast electron diffraction, crystallography,
and microscopy, Annu. Rev. Phys. Chem., 57(1), pp. 65–103.
59. Zhong, D. (2007). Ultrafast catalytic processes in enzymes, Curr. Opin.
Chem. Biol., 11(2), pp. 174–181.
60. Chatfield, D. C., et al. (1992). Control of chemical reactivity by quantized
transition states, J. Phys. Chem., 96(6), pp. 2414–2421.
61. Schwartz, S. D., and Schramm, V. L. (2009). Enzymatic transition states
and dynamic motion in barrier crossing, Nat. Chem. Biol., 5(8), pp. 551–
558.
62. Quaytman, S. L., and Schwartz, S. D. (2007). Reaction coordinate of an
enzymatic reaction revealed by transition path sampling, Proc. Natl.
Acad. Sci. U S A, 104(30), pp. 12253–12258.
63. Nunez, S., et al. (2004). Promoting vibrations in human purine
nucleoside phosphorylase. A molecular dynamics and hybrid quantum
mechanical/molecular mechanical study, J. Am. Chem. Soc., 126(48), pp.
15720–15729.
64. Cui, Q. A., and Karplus, M. (2002). Promoting modes and demoting
modes in enzyme-catalyzed proton transfer reactions: a study of models
and realistic systems, J. Phys. Chem. B, 106(32), pp. 7927–7947.
65. Borgis, D., and Hynes, J. T. (1993). Dynamic theory of proton tunneling
transfer rates in solution: general formulation, Chem. Phys., 170(3), pp.
315–346.
66. Bandaria, J. N., et al. (2010). Characterizing the dynamics of functionally
relevant complexes of formate dehydrogenase, Proc. Natl. Acad. Sci. U S
A, 107(42), pp. 17974–17979.
67. Waegele, M. M., Culik, R. M., and Gai, F. (2011). Site-specific spectro-
scopic reporters of the local electric field, hydration, structure, and
dynamics of biomolecules, J. Phys. Chem. Lett., 2, pp. 2598–2609.
68. Kim, H., and Cho, M. (2013). Infrared probes for studying the structure
and dynamics of biomolecules, Chem. Rev., 113(8), pp. 5817–5847.
69. Weitman, H., et al. (2001). Solvatochromic effects in the electronic
absorption and nuclear magnetic resonance spectra of hypericin in
organic solvents and in lipid bilayers, Photochem. Photobiol., 73(2), pp.
110–118.
70. Taft, R. W., and Kamlet, M. J. (1980). Linear solvation energy relation-
ships. 8. Solvent effects on NMR spectral shifts and coupling-constants,
Org. Magn. Reson., 14(6), pp. 485–493.
71. Fafarman, A. T., et al. (2010). Decomposition of vibrational shifts of
nitriles into electrostatic and hydrogen-bonding effects, J. Am. Chem.
Soc., 132(37), pp. 12811–12813.
72. Fafarman, A. T., et al. (2012). Quantitative, directional measurement of
electric field heterogeneity in the active site of ketosteroid isomerase,
Proc. Natl. Acad. Sci. U S A, 109(6), pp. E299–308.
73. Bagchi, S., Fried, S. D., and Boxer, S. G. (2012). A solvatochromic model
calibrates nitriles’ vibrational frequencies to electrostatic fields, J. Am.
Chem. Soc., 134(25), pp. 10373–10376.
74. Bagchi, S., Boxer, S. G., and Fayer, M. D. (2012). Ribonuclease S dynamics
measured using a nitrile label with 2D IR vibrational echo spectroscopy,
J. Phys. Chem. B, 116(13), pp. 4034–4042.
75. Liu, C. T., et al. (2014). Probing the electrostatics of active site microen-
vironments along the catalytic cycle for Escherichia coli dihydrofolate
reductase, J. Am. Chem. Soc., 136(29), pp. 10349–10360.
76. Doshi, U., et al. (2012). Resolving the complex role of enzyme
conformational dynamics in catalytic function, Proc. Natl. Acad. Sci. U S
A, 109(15), pp. 5699–5704.
77. Brocks, J. J., et al. (1999). Archean molecular fossils and the early rise of
eukaryotes, Science, 285(5430), pp. 1033–1036.
78. Gagne, D., et al. (2012). Conservation of flexible residue clusters among
structural and functional enzyme homologues, J. Biol. Chem., 287(53),
pp. 44289–44300.
79. Baker, D. (2010). An exciting but challenging road ahead for computa-
tional enzyme design, Protein Sci., 19(10), pp. 1817–1819.
80. Siegel, J. B., et al. (2010). Computational design of an enzyme catalyst for
a stereoselective bimolecular Diels-Alder reaction, Science, 329(5989),
pp. 309–313.
81. Rothlisberger, D., et al. (2008). Kemp elimination catalysts by computa-
tional enzyme design, Nature, 453(7192), pp. 190–195.
82. Hilvert, D. (2000). Critical analysis of antibody catalysis, Annu. Rev.
Biochem., 69, pp. 751–793.

March 21, 2016 12:25 PSP Book - 9in x 6in 03-Allan-Svendsen-c03

Chapter 3

Enzymology Meets Nanotechnology:
Single-Molecule Methods for Observing
Enzyme Kinetics in Real Time

Kerstin G. Blank (a, b), Anna A. Wasiel (a), and Alan E. Rowan (a)

(a) Department of Molecular Materials, Institute for Molecules and Materials,
Radboud University, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands
(b) Max Planck Institute of Colloids and Interfaces, 14424 Potsdam, Germany
kerstin.blank@mpikg.mpg.de

Enzymes are complex molecular machines. They break down
catalytic reactions into several substeps, often involving different
conformations. These different conformations are a direct conse-
quence of the multidimensional energy landscape of the enzymes.
The energy landscape further determines regulation processes and
might also allow parallel reaction pathways. Information about
the kinetics of multistep reactions, regulation events, and parallel
reaction pathways is intrinsically difficult to obtain in ensemble
measurements. Single-molecule experiments are ideally suited
for studying these processes. They not only provide the possibility
of observing an enzyme individually but also, more importantly,
allow for following the enzymatic reaction in real time, directly
yielding the sequence of events.

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

In this chapter we describe how single-molecule fluorescence
microscopy and spectroscopy can be used to obtain the desired
kinetic information. In these experiments, fluorogenic substrates
and cofactor reporter systems are utilized for observing the
sequence of individual turnover reactions. We summarize the
results obtained for the enzymes Pseudozyma (Candida) antarctica
lipase B (CaLB), Thermomyces lanuginosus lipase (TLL), bovine α-
chymotrypsin, and nitrite reductase from Alcaligenes species. Using
these examples, we describe the novel information that has been
obtained, illustrate the power of the single-molecule approach,
and highlight the limitations of currently used single-molecule
fluorescence detection schemes.
Current bottlenecks, such as the artificial nature of the substrates
and the limited signal-to-noise (S:N) ratio, can be addressed with
a number of new developments in the field of nanotechnology.
Nano-optical approaches directly address the S:N ratio problem by
reducing the background noise and facilitating signal enhancement
mechanisms. Nano-electronic approaches, for example, based on
carbon nanotube field-effect transistors, allow for the use of natural
enzyme substrates. Nanomechanical techniques, such as the atomic
force microscope, can be used to position enzymes in sensor
hotspots or to mechanically manipulate an enzyme. While still under
development, many techniques have already been implemented for
studying enzymatic reactions. Nanotechnology can provide both
better detection schemes and strategies to control enzymatic
activity. It is therefore an important and powerful addition to the
field of single-molecule enzymology.

3.1 Introduction

Enzymes are specific and efficient catalysts with the ability to
evolve new functions. Essential to life processes, they accelerate
biochemical reactions with impressive rate enhancements. The most
efficient enzymes increase the reaction rate by 10¹⁹-fold compared
to the uncatalyzed reaction [1]. Even though our knowledge about
chemical reaction mechanisms and enzyme structures has increased
tremendously, still little is known about how enzymes achieve
this enormous rate acceleration [2]. One crucial aspect that
contributes to the catalytic efficiency of enzymes is that the chemical
reaction is broken down into several substeps.

Figure 3.1 Timescales of protein motion and functional events. The
timescale accessible with single-molecule experiments is indicated in gray.

Often these substeps involve conformational changes of the
enzyme. Enzymes are intrinsically dynamic molecules as they
consist of polymeric amino acid chains folded into complex 3D
structures. Conformational diversity originates, for example, from
side-chain rotations, movements of active-site loops, and domain
rearrangements (Fig. 3.1) [3–5]. Nuclear magnetic resonance (NMR)
relaxation–dispersion experiments have been employed success-
fully for studying these dynamics in different steps of the catalytic
cycle [3, 6]. In a number of cases the timescale of the observed
conformational change was found to be similar to the timescale of
the enzymatic reaction, suggesting that a conformational change
might be the rate-limiting step of the reaction [6, 7]. In this context,
protein dynamics is frequently discussed as another important
factor contributing to the catalytic efficiency [2, 3, 5, 8]. It is
important to note, however, that the timescales of conformational
changes and the timescales of the chemical step of the reaction
are not necessarily matched [2, 8]. Conformational changes on the
millisecond timescale are most likely not directly connected to
the catalytic reaction itself but rather contribute to an accurate
positioning of the reactants for the chemical step.
Conformational changes are not only important for facilitating
the catalytic reaction, they are also crucial for the regulation of
enzymatic activity either by posttranslational modifications (e.g.,
phosphorylation [9]) or by allosteric effectors. As early as the
1960s, the Monod–Wyman–Changeux (MWC) [10] and Koshland–
Némethy–Filmer (KNF) [11] models of allosteric regulation were
introduced, describing effector binding as conformational
selection or induced fit, respectively. In the case of induced fit, the
effector binds to one conformation, thereby inducing a conforma-
tional change to a new conformation with higher (activator) or
lower (inhibitor) activity. In contrast, the conformational selection
model assumes a large number of conformational states. The
effector binds to the conformation it fits best, stabilizing this
conformation. Subsequently, the equilibrium is shifted toward the
bound conformation (population shift).
Both models can be easily explained when considering the en-
ergy landscape theory of proteins. Originally developed to visualize
protein folding [12, 13], it can also explain the conformational space
utilized by the enzyme for regulation processes. Considering this
energy landscape, induced fit and conformational selection are two
intrinsically related processes. In both cases, the binding energy of
the effector is used to stabilize a certain conformation on the energy
landscape. The same process might contribute to the enzymatic
reaction itself, where the free energy of the system is altered in every
step along the reaction pathway (Fig. 3.2a). The free-energy profile
of the reaction is described by the combination of the energy of
the chemical substeps and the conformational energy of the enzyme
itself.
The key difference between the induced fit and the confor-
mational selection models is the number of conformations that
the enzyme visits at a given point in time. While the induced fit
model assumes only one conformation, this number is large in the
conformational selection model. In this context, the conformational
selection model poses an interesting question about the enzymatic
reaction itself. Can all these conformations be involved in the
catalytic reaction? Or in other words, are multiple reaction pathways


Figure 3.2 Alternative illustrations showing the free-energy profile of an
enzymatic reaction. (a) In the conventional two-dimensional representation
the free energy of the reaction steps is plotted against the reaction
coordinate. (b) Considering conformational changes, a three-dimensional
energy landscape includes both a conformational and a chemical coordinate.
It appears likely that many enzymes utilize multiple pathways on this three-
dimensional energy landscape. Adapted from Ref. [14], Copyright (2007),
with permission from Elsevier.

possible? If they are, do all these reaction pathways require the same
time for one catalytic cycle? And, do all these reaction pathways have
the same rate-limiting step?
Figure 3.2b illustrates the possibility of multiple reaction
pathways. The scheme considers a chemical and a conformational
coordinate. In every substep the enzyme proceeds along the chemical
coordinate, while possibly visiting a different conformation.
If the enzyme follows only one pathway through this 3D landscape,
the energies along this pathway can be projected on the reaction
coordinate as shown in Fig. 3.2a. Currently, the existence of multiple
reaction pathways is a matter of debate [14–16] and convincing
experimental proof is difficult to obtain. Ensemble kinetic mea-
surements only provide a time and number average of the overall
efficiency of the catalytic reaction; possible rate fluctuations cannot
be resolved. NMR measurements fail to identify states that are
populated with a probability below 0.5% [17]. More importantly,
they are difficult to perform in the presence of a substrate as
the substrate is consumed during the measurement. Molecular
dynamics (MD) simulations have provided many new insights into
protein conformational dynamics as they allow for observing the
time sequence of events with impressive time resolution. MD
simulations, however, cannot yet reach the millisecond-to-second
timescale of the enzymatic reaction and consequently do not capture
the relevant conformational changes.
Single-molecule experiments have the potential to fill in an
important gap, shedding light on the possible role of conforma-
tional changes. Single-molecule enzymology looks at enzymes as
individual molecules, overcoming the averaging problem intrinsic
to classical ensemble measurements. Many technological advances
have been implemented since the 1960s, when Boris Rotman [18]
measured the activity of individual β-D-galactosidase molecules
compartmentalized in water-in-oil droplets. Using a fluorogenic
substrate, which was converted by the enzyme into a fluorescent
product, Rotman [18] was able to observe the accumulated product
molecules after many hours. Today, using sensitive detectors, single
fluorophores can be detected with a nanosecond time resolution
[19–21]. State-of-the-art single-molecule experiments allow for
observing an individual enzyme in real time as it is performing
the catalytic reaction or a conformational change. In this way
the sequence of individual enzymatic turnovers or conformational
motions can be recorded.
In the first part of this chapter we focus on fluorescence-
based approaches for measuring the sequence of single turnovers.
This turnover sequence contains all kinetic information of the
reaction, including fluctuations in the enzymatic rate related to
regulation events or alternative reaction pathways. In some cases,
substeps of an enzymatic reaction cycle can also be observed. We
start with a description of the experimental setup and the
typical steps of a single-turnover experiment. On the basis of our
own experience, we will describe several enzyme–substrate model
systems that have been characterized in detail. We will summarize
the unique information that has been obtained and also discuss
current limitations. The second part of this chapter introduces
novel detection schemes enabled by new developments in the
nanotechnology field that have the potential to overcome these
limitations. We will conclude this chapter with an outlook on where
we see the future potential of single-molecule enzymology.

3.2 Single-Turnover Detection

Before we describe how single-molecule fluorescence experiments
can be utilized for studying the kinetics of enzymatic reactions and
for investigating possible related conformational changes, we
first focus on the technological requirements for the detection
of individual enzymatic turnovers or reaction substeps. Single-
turnover detection requires a fluorescent reporter system that
changes its fluorescent properties during the time course of an
enzymatic reaction cycle (Fig. 3.3a). These changes need to be
detected using a fluorescence microscope with single-fluorophore
sensitivity and high time resolution ideally in the nanosecond
range. The recorded fluorescence time trace finally needs to be
analyzed with statistical methods and the kinetic information
needs to be extracted. Many different data analysis methods have
been developed and are described in the literature. An extensive
discussion is beyond the scope of this chapter, and we will only focus
on the methods used for the examples described in the following.

3.2.1 Fluorescent Reporter Systems


The fluorescent reporter system is the most crucial component of a
single-turnover experiment. In general, every reporter system used
to measure enzyme activity at the ensemble level can also be used
at the single-molecule level, provided that the fluorophore shows
sufficient brightness for single-molecule detection. Fluorescent
cofactors (Fig. 3.3b) and fluorogenic substrates (Fig. 3.3d) are

Figure 3.3 Single-turnover detection using fluorescent reporter systems.
(a) The enzyme is immobilized on a glass coverslip in the detection volume
of a confocal fluorescence microscope. Illustrated is the example of a
fluorogenic substrate where every individual fluorophore produced by the
enzyme is recorded in real time, yielding the desired turnover sequence. (b)
A fluorogenic substrate is converted into a fluorescent product molecule as a
result of the enzymatic reaction. (c) A fluorescent cofactor is cycled through
a fluorescent on-state and a nonfluorescent off-state during one turnover
cycle. (d) A FRET donor is coupled to the enzyme at a distance that allows
efficient energy transfer with a cofactor that is cycled between two states
during one turnover cycle.

typical reporter systems that have been used both at the single-
molecule and the ensemble level.
Fluorescent cofactors can be found in oxidoreductases. Both
flavin adenine dinucleotide (FAD) and flavin mononucleotide (FMN)
have been used for single-molecule experiments [22–25]. In such
an experiment, every enzymatic turnover cycle consists of two
half reactions that are detected as a fluorescent on-state (oxidized
cofactor) and a nonfluorescent off-state (reduced cofactor). This
approach has been extended to enzymes that do not contain a
fluorescent cofactor but instead contain a cofactor that absorbs light
at a certain wavelength (Fig. 3.3c) [26–28]. The cofactor can be
used as the acceptor of a Förster resonance energy transfer (FRET)
process. When a donor fluorophore is coupled to the enzyme in close
proximity to the cofactor, energy transfer to the cofactor can take
place, reducing the fluorescence emission from the coupled donor
fluorophore. FRET only occurs when the cofactor is in a state where
it absorbs light but not when it is switched to a nonabsorbing state.
As a result, the donor intensity reports on the state of the cofactor
during the catalytic reaction cycle.
In both of the above cases, the measurement requires following
the same fluorophore until it photobleaches. This restricts the
measurement time to a few seconds, limiting the amount of
information that can be obtained from one enzyme. This problem
can be overcome with the use of fluorogenic substrates (Fig. 3.3a,d).
When monitoring the conversion of a fluorogenic substrate into flu-
orescent product molecules, a new fluorophore is produced in every
enzymatic turnover cycle [29–37]. This allows measurement times
of several minutes or even up to one hour, depending on the turnover
rate of the enzyme. Measurements only need to be terminated once
a high number of product molecules has accumulated in solution,
reducing the signal-to-noise (S:N) ratio. Fluorogenic substrates are,
for example, available for oxidoreductases. The enzyme horseradish
peroxidase has been measured at the single-molecule level using the
fluorogenic substrate dihydrorhodamine 6G [29, 31]. Another class
of enzymes that can be measured using fluorogenic substrates is
hydrolases such as lipases and esterases [33, 34, 37–39], proteases
[30, 35, 36], glycosidases [32], and phosphatases.
Another common approach for measuring enzymes at the single-
molecule level is the use of FRET-based reporter systems that allow
for monitoring conformational changes [40–44]. FRET is a sensitive
reporter of distance changes in the 2–8 nm range and provides a
direct readout of domain or loop movements. In such an experiment,
a donor fluorophore and an acceptor fluorophore are coupled to
specific positions on the enzyme. These positions are expected to
move relative to each other upon substrate or ligand binding. This
distance change between the fluorophores can be determined from
the FRET efficiency, which is described by the following equations:

$$E_{\mathrm{FRET}} = 1 - \frac{I_{\mathrm{DA}}}{I_{\mathrm{D}}} \tag{3.1}$$

where $I_{\mathrm{DA}}$ is the fluorescence intensity in the presence of the
acceptor and $I_{\mathrm{D}}$ is the fluorescence intensity in the absence of the
acceptor.

$$E_{\mathrm{FRET}} = \frac{1}{1 + (r/R_0)^6} \tag{3.2}$$

where $r$ is the donor–acceptor distance and $R_0$ is the Förster radius
(the distance at which $E_{\mathrm{FRET}}$ = 50%).
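
The two FRET relations can be combined to turn a measured donor intensity ratio into a distance estimate. The following Python sketch is only an illustration: the function names, the intensities, and the Förster radius of 5 nm are assumed values chosen for the example, not data from this chapter.

```python
def fret_efficiency_from_intensities(i_da: float, i_d: float) -> float:
    """Eq. 3.1: efficiency from the intensity with (i_da) and without (i_d) acceptor."""
    return 1.0 - i_da / i_d


def fret_efficiency_from_distance(r: float, r0: float) -> float:
    """Eq. 3.2: efficiency for a donor-acceptor distance r and Foerster radius r0."""
    return 1.0 / (1.0 + (r / r0) ** 6)


def distance_from_fret_efficiency(e_fret: float, r0: float) -> float:
    """Eq. 3.2 solved for r; requires 0 < e_fret <= 1."""
    return r0 * (1.0 / e_fret - 1.0) ** (1.0 / 6.0)


# Hypothetical example values (assumptions, not measured data).
r0_nm = 5.0
e = fret_efficiency_from_intensities(i_da=400.0, i_d=1000.0)
r_nm = distance_from_fret_efficiency(e, r0_nm)
```

Because the efficiency depends on the sixth power of r/R0, the distance estimate is most sensitive close to R0 and becomes unreliable when the efficiency approaches 0 or 1.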
A major drawback of this type of experiment is that the FRET
reporter system does not directly report on the catalytic reaction
but only on the conformational changes accompanying it. In many
cases the kinetics of the reaction and of the conformational changes
will be related directly. Still, FRET remains an indirect readout that
is further limited by photobleaching of donor and acceptor. Furthermore,
FRET is only useful if the enzyme structure and the conformational
change are already known so that the FRET labels can be attached
in appropriate positions. In the following section, only reporter
systems that give direct access to the turnover sequence and the
underlying kinetics are described. The reader interested in FRET
measurements is referred to recent reviews describing these in
more detail [45, 46].

3.2.2 Measurement Setup


Once a suitable fluorescent reporter system has been found, the
single-turnover sequence is usually recorded using a confocal
fluorescence microscope [47, 48]. In a confocal microscope the
excitation laser is focused into a diffraction-limited spot, resulting
in a detection volume of approximately 1 fL ($10^{-15}$ L). In such a
small detection volume even a single fluorescent molecule has a
high concentration in the nanomolar range. The fluorophore can be
detected with a sufficiently high signal above the background, which
is composed of scattering and fluorescent impurities. Light emitted
from the fluorophore is detected with an avalanche photodiode
(APD). APDs record the sequence of individual photons emitted
by the fluorophore (and the background) with a time resolution
in the low nanosecond range (Fig. 3.3a). This high time resolution
is essential to ensure the accurate detection of single turnovers,
even at high turnover rates (>100 turnovers per second). The
use of APD detectors also facilitates multiparameter fluorescence
measurements where not only the fluorescence intensity (photons/
time) but also the lifetime and anisotropy of the fluorophore can
be recorded [49, 50]. In this way, additional information about the
catalytic process can be obtained.
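
The statement that a single molecule in a detection volume of about 1 fL corresponds to a nanomolar concentration can be checked with a short calculation. The helper name below is an assumption made for this sketch; the only inputs are the Avogadro constant and the volume quoted in the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def concentration_molar(n_molecules: float, volume_liters: float) -> float:
    """Molar concentration of n_molecules contained in volume_liters."""
    return n_molecules / (AVOGADRO * volume_liters)


# One molecule in a confocal detection volume of roughly 1 fL.
c_single = concentration_molar(1, 1e-15)
print(f"{c_single * 1e9:.2f} nM")
```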
To be able to follow the reaction of one single enzyme, the
enzyme needs to be immobilized on the surface of a coverslip.
Ideally, the immobilization is performed in a site-specific way and
uses low concentrations of enzymes. This reduces heterogeneities
to a minimum and ensures sufficient spacing between the enzymes
so that, indeed, single enzymes are measured. Fluorescent labeling
of the enzyme sample allows for the localization of the enzymes on
the surface by scanning the confocal laser spot over the surface. Once
the location of the enzyme molecules is known, the confocal volume
is moved to the position of one enzyme and the fluorescence signal
originating from the catalytic reaction is monitored as a function of
time.

3.2.3 Data Analysis


Following data collection, the obtained photon arrival time traces
are analyzed in two steps. In the first step, time intervals with
a high photon count rate (high intensity; on-state) are separated
from intervals with a low photon count rate (low intensity; off-
state). Subsequently, the durations of each state, the on- and off-state
waiting times, are determined. These are the starting points for the
second step—the kinetic analysis of the system.
For the first step, the on–off assignment, two different methods
are available. The most frequently used method relies on applying a
threshold for separating the high and low intensity levels (Fig. 3.4a).
An alternative and more objective method is change-point analysis
(Fig. 3.4b). For the second step, different approaches can also be
used for extracting the desired kinetic information from the data,
such as the fitting of waiting-time probability distributions (Fig.
3.4c), correlation analysis (Fig. 3.4d), and hidden Markov modeling
(Fig. 3.4e).
The threshold approach requires binning to convert the photon
arrival time trace into an intensity time trace. To achieve this, the

Figure 3.4 Steps of the data analysis procedure. For the on–off assignment
two different methods are possible. (a) Binning converts the photon
arrival time trace into an intensity (photons/time) time trace. In the
next step a threshold is used to separate intervals with high and low
intensity. (b) Change-point analysis determines the probability of a certain
photon to be an intensity change point. The calculation is based on a
maximum-likelihood algorithm that determines the log-likelihood ratio
(LLR) for every photon along the photon arrival time trace. (c) Plotting the
duration of every off- (on-) time into a histogram yields the corresponding
probability distribution. (d) Correlations between subsequent off-times can
be visualized in a 2D correlation plot where the duration of off-time i is
plotted against the duration of the following off-time i +1. (e) The number
of states underlying an on-time–off-time sequence and the corresponding
rate constants can be represented using hidden Markov models.

number of photons in a given time interval (normally between
1 and 5 ms) is counted and plotted against the time axis of the
experiment (Fig. 3.4a). Once the intensity time trace has been
obtained, the threshold is applied to the time trace. Separating the
two intensity levels is usually not a problem when the on- and off-state
intensities are sufficiently different (high S:N ratio), but can
become problematic for data with a low S:N ratio [35]. Besides this
problem, the threshold approach has two additional drawbacks. The
choice of the bin size is often arbitrary and limits the time resolution
of the experiment to the bin size used. Also, the best threshold
position can be difficult to determine, introducing some subjectivity
into the data analysis procedure.
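
The binning and threshold steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the analysis code used for the experiments in this chapter; the function names, the bin width, and the threshold are assumptions that would have to be chosen for a given data set.

```python
from typing import List, Tuple


def bin_photons(arrival_times: List[float], bin_width: float,
                duration: float) -> List[int]:
    """Convert photon arrival times into photon counts per time bin."""
    n_bins = int(round(duration / bin_width))
    counts = [0] * n_bins
    for t in arrival_times:
        idx = int(t / bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def on_off_durations(counts: List[int], threshold: float,
                     bin_width: float) -> Tuple[List[float], List[float]]:
    """Split an intensity trace into on- and off-times.

    A bin counts as 'on' if it holds more photons than the threshold.
    Durations are multiples of the bin width, so the bin width limits the
    time resolution. The first and last runs are cut off by the
    measurement window and may need to be discarded in a real analysis.
    """
    on_times: List[float] = []
    off_times: List[float] = []
    if not counts:
        return on_times, off_times
    state = counts[0] > threshold
    run = 1
    for c in counts[1:]:
        new_state = c > threshold
        if new_state == state:
            run += 1
        else:
            (on_times if state else off_times).append(run * bin_width)
            state, run = new_state, 1
    (on_times if state else off_times).append(run * bin_width)
    return on_times, off_times
```

Repeating the analysis with several bin widths and thresholds is one simple way to judge how strongly the extracted waiting times depend on these two subjective choices.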
With the goal of reducing the subjective factors in the on–off
assignment, change-point analysis was developed as an alternative.
Change-point analysis uses a statistical hypothesis test to identify
changes in the photon count rate along the photon arrival time trace
(Fig. 3.4b) [35, 51]. Using a maximum-likelihood algorithm, the most
likely change point (i.e., the photon where a change in the intensity
level is the most likely) is identified in the complete photon arrival
time trace. No binning of the data is required. In the next step, the
time trace is cut into two pieces at the change-point photon. Then
change-point detection is performed again on the two fragments of
the time trace. This process is repeated until no more change points
can be found. Lastly, the on- and off-times that correspond to the
intervals between two change points are determined.
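
The recursive procedure described above can be sketched as follows. The sketch assumes photons emitted as a Poisson process with a piecewise-constant rate and uses the corresponding generalized log-likelihood ratio. The fixed decision threshold and the minimum segment size are placeholder assumptions; published implementations derive the critical value from the number of photons and the chosen confidence level.

```python
import math
from typing import List


def find_change_points(times: List[float], t_start: float, t_end: float,
                       threshold: float = 20.0,
                       min_photons: int = 5) -> List[float]:
    """Recursively locate intensity change points in sorted arrival times.

    times holds the photons detected between t_start and t_end.
    threshold and min_photons are assumed values for this sketch.
    """
    n = len(times)
    duration = t_end - t_start
    if n < 2 * min_photons or duration <= 0.0:
        return []
    best_llr, best_k = 0.0, None
    for k in range(min_photons, n - min_photons + 1):
        # Fraction of the segment duration elapsed at photon k.
        v = (times[k - 1] - t_start) / duration
        v = min(max(v, 1e-12), 1.0 - 1e-12)
        # Log-likelihood ratio: two constant rates versus one constant rate.
        llr = 2.0 * (k * math.log(k / (n * v))
                     + (n - k) * math.log((n - k) / (n * (1.0 - v))))
        if llr > best_llr:
            best_llr, best_k = llr, k
    if best_k is None or best_llr < threshold:
        return []
    t_cp = times[best_k - 1]
    # Cut the trace at the change-point photon and repeat on both pieces.
    left = find_change_points(times[:best_k], t_start, t_cp,
                              threshold, min_photons)
    right = find_change_points(times[best_k:], t_cp, t_end,
                               threshold, min_photons)
    return left + [t_cp] + right
```

The intervals between consecutive change points can then be classified as on- or off-states by comparing their mean photon count rates.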
Having obtained the on-time–off-time sequence, the kinetics of
the enzymatic reaction can be investigated. Kinetic constants are
most easily obtained from the probability distributions of the on-
and off-times (Fig. 3.4c). For a reaction with a single rate constant
the data are described by an exponential function. Deviations
from single-exponential kinetics indicate a more complex kinetic
behavior of the enzyme. Frequently, multiple rate constants can
be extracted from the probability distributions, but the analysis
becomes less and less accurate, with an increasing number of
processes contributing to the enzymatic reaction.
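
For the simplest case of a single rate constant, the waiting times are exponentially distributed and the rate constant can be estimated directly from their mean. The sketch below also computes the ratio of the variance to the squared mean, which equals 1 for a single-exponential process. This diagnostic is added here for illustration and is not taken from the analysis described in this chapter; the function names are assumptions.

```python
import statistics
from typing import Sequence


def exponential_rate_mle(waiting_times: Sequence[float]) -> float:
    """Maximum-likelihood rate constant for p(t) = k * exp(-k * t)."""
    return 1.0 / statistics.fmean(waiting_times)


def randomness_parameter(waiting_times: Sequence[float]) -> float:
    """Variance divided by the squared mean; 1 for a single exponential.

    Values well below 1 suggest several sequential steps of similar rate;
    values above 1 suggest a mixture of fast and slow processes.
    """
    mean = statistics.fmean(waiting_times)
    return statistics.pvariance(waiting_times, mu=mean) / mean ** 2
```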
More complex kinetic behavior requires different strategies
to describe the enzymatic reaction. When using probability dis-
tributions only the duration of each individual off- (on-) time
is considered. Important information about the duration of the
preceding or the following on- or off-time is lost. This sequence of
events contains important information, however. The reaction might
follow defined pathways in a kinetic scheme that can be identified
from correlations between successive off–off, on–on, off–on, and
on–off pairs. Correlations can, for example, be visualized in a 2D
correlogram. In the plot shown in Fig. 3.4d the diagonal indicates
the correlations between successive off-times.
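
A numerical counterpart to the 2D correlation plot is the correlation coefficient between each off-time and the one that follows it. The sketch below is a simplified illustration with assumed function names. Independent waiting times give a coefficient close to zero, whereas a slowly fluctuating rate gives a positive value.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def successive_pairs(durations: Sequence[float]) -> List[Tuple[float, float]]:
    """Pairs (off-time i, off-time i+1), i.e., the points of a 2D correlation plot."""
    return list(zip(durations[:-1], durations[1:]))


def lag1_correlation(durations: Sequence[float]) -> float:
    """Pearson correlation between successive waiting times."""
    x, y = durations[:-1], durations[1:]
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

A single coefficient compresses the information in the 2D plot considerably, so it is best used as a first screen rather than as a replacement for the plot.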
Correlation analysis is a powerful tool to identify complex kinetic
behavior; it does not allow for establishing a kinetic scheme,
however. Obtaining a unique kinetic scheme is one of the key
challenges of the data analysis procedure. The observable of the
reaction is the on–off sequence. While the readout provides only
two states, the kinetic scheme might consist of a higher number
of states (Fig. 3.4e). These states are hidden and cannot be
observed directly; the kinetic scheme is a hidden Markov model
[52]. Very often multiple kinetic schemes can be found that are
able to produce the measured on–off sequence. Currently, many
theoreticians are working on this topic, developing methods that
allow for determining the best kinetic scheme describing the data
[53, 54]. A rigorous test of these methods is currently limited by the
availability of high-quality experimental data, as will be described in
more detail below.

3.3 Single-Enzyme Kinetics

The main goal of single-turnover experiments is to extract as
much detail as possible about the kinetics of the enzymatic
reaction. Enzyme kinetics is commonly described using a two-step
reaction scheme, where the first step corresponds to substrate
binding and the second step to the chemical reaction (Eq. 3.3).
The corresponding kinetic constants can be obtained using the
Michaelis–Menten equation (Eq. 3.4)
$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_2} E + P \tag{3.3}$$

where S is the substrate, E the enzyme, ES the enzyme–substrate
complex, and P the product.

$$v = \frac{v_{\max}[S]}{K_M + [S]} \tag{3.4}$$

where $v$ is the reaction velocity, [S] is the substrate concentration,
$v_{\max}$ is the maximum velocity of the reaction, and $K_M$ is the Michaelis
constant.

The Michaelis–Menten equation is also valid for single-molecule
experiments. In the classical form of the Michaelis–Menten equation,
the time course of the reaction is described using the measured
changes in the concentrations of the substrate, the enzyme–
substrate complex, and the product. In the single-molecule form,
the enzyme is considered to reside in the different substeps of the
reaction with a certain probability. This probability directly refers to
the amount of time it spends in the respective substeps [16].
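
The correspondence between the ensemble and the single-molecule description can be illustrated with a small stochastic simulation of the scheme in Eq. 3.3. The sketch assumes the standard steady-state expression K_M = (k−1 + k2)/k1 and uses arbitrary rate constants. It is not a model of any particular enzyme discussed in this chapter, and all names are hypothetical.

```python
import random
from typing import List


def michaelis_menten_rate(s: float, k1: float, k_minus1: float,
                          k2: float) -> float:
    """Turnover rate per enzyme, v/[E] = k2 [S] / (K_M + [S])."""
    k_m = (k_minus1 + k2) / k1
    return k2 * s / (k_m + s)


def simulate_waiting_times(s: float, k1: float, k_minus1: float, k2: float,
                           n_turnovers: int, rng: random.Random) -> List[float]:
    """Waiting times between product-forming events of one enzyme."""
    waits = []
    for _ in range(n_turnovers):
        t = 0.0
        while True:
            t += rng.expovariate(k1 * s)          # E + S -> ES
            t += rng.expovariate(k_minus1 + k2)   # dwell time in ES
            if rng.random() < k2 / (k_minus1 + k2):
                break                             # ES -> E + P
            # otherwise ES -> E + S and the enzyme has to bind again
        waits.append(t)
    return waits


# Arbitrary example parameters (assumptions made for this sketch).
rng = random.Random(3)
waits = simulate_waiting_times(s=2.0, k1=1.0, k_minus1=1.0, k2=3.0,
                               n_turnovers=20000, rng=rng)
mean_rate = len(waits) / sum(waits)
predicted = michaelis_menten_rate(2.0, 1.0, 1.0, 3.0)
```

For this simple scheme the reciprocal of the mean waiting time reproduces the Michaelis–Menten rate, while the full waiting-time distribution carries the additional information that is averaged out in an ensemble experiment.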
It is widely accepted that KM is a measure of the affinity between
the enzyme and the substrate and that kcat describes the efficiency
of the chemical step (k2 = kcat = vmax/[E]). This simple treatment
is also valid for enzymatic reactions that consist of more complex
reaction schemes with more substeps. The individual rate constants
of the substeps cannot be obtained, however. Instead KM and kcat are
composed of several rate constants. To really obtain the individual
rate constants, more advanced approaches are required, such as
stopped-flow measurements [4]. When using a fluorescent cofactor
as the reporter system, substeps of the reaction can be directly
observed in a single-turnover experiment, providing direct access to
the corresponding rate constants [22–25].
Single-turnover experiments are not only useful for observing
reaction substeps; they are also very powerful for following
regulation events. A single-molecule time trace immediately shows
if an enzyme switches between active and inactive phases. A single-
turnover experiment is therefore ideal for studying the action of
inhibitors. After adding an inhibitor, an inactive phase indicates
that the inhibitor is bound to the enzyme. The durations of active
and inactive phases are directly related to the on- and off-rates
of the inhibitor and can be used to extract the corresponding
kinetic information. Whereas ensemble measurements yield only
the equilibrium inhibition constant Ki , single-molecule experiments
give direct access to the rates of the reaction. This information is
difficult to obtain in ensemble experiments.
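
How the phase durations translate into rate constants depends on the inhibition mechanism. The sketch below assumes the simplest case, a one-step reversible binding of the inhibitor in which every active phase ends with a binding event and every inactive phase ends with a dissociation event. Under this assumption the mean active duration equals 1/(k_on [I]) and the mean inactive duration equals 1/k_off. The function name and the example numbers are hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def inhibitor_rates(active_durations: Sequence[float],
                    inactive_durations: Sequence[float],
                    inhibitor_conc: float) -> Tuple[float, float, float]:
    """Return (k_on, k_off, K_i) for an assumed one-step binding model.

    k_on is in 1/(M s) if durations are in seconds and inhibitor_conc in M.
    """
    k_on = 1.0 / (statistics.fmean(active_durations) * inhibitor_conc)
    k_off = 1.0 / statistics.fmean(inactive_durations)
    return k_on, k_off, k_off / k_on


# Hypothetical durations in seconds at 1 µM inhibitor.
k_on, k_off, k_i = inhibitor_rates([2.0, 4.0], [0.5, 1.5], 1e-6)
```

More complex mechanisms, for example inhibitor binding that competes with the substrate, would require the corresponding kinetic scheme instead of this two-state approximation.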
Heterogeneities in the turnover sequence in the absence of any
regulation events might point toward multiple reaction pathways as
discussed in Section 3.1. Assuming that the enzyme is characterized
by a complex energy landscape, these different pathways might
require different times to complete. The reaction might still have the

Table 3.1 Summary of enzymes measured with single-turnover resolution using a fluorescence-based readout

| Enzyme | Readout | Substrate | Reference |
|---|---|---|---|
| Cholesterol oxidase | Cofactor | Cholesterol | [23] |
| Horseradish peroxidase | Fluorogenic substrate | Dihydrorhodamine 6G | [31] |
| DHOD (Escherichia coli) | Cofactor | DHO, DCIP | [24] |
| Candida antarctica lipase B (CaLB) | Fluorogenic substrate | BCECF-AM, CFDA | [37, 39] |
| Thermomyces lanuginosus lipase (TLL) | Fluorogenic substrate | CFDA | [33, 34] |
| β-Galactosidase (E. coli) | Fluorogenic substrate | Resorufin-β-D-galactopyranoside | [32] |
| DHOD (Lactococcus lactis) | Cofactor | DHO, fumarate | [25] |
| α-Chymotrypsin (bovine) | Fluorogenic substrate | (suc-AAPF)2-Rhodamine110 | [30, 35, 36] |
| Nitrite reductase (Alcaligenes faecalis) | FRET with cofactor | Nitrite | [27] |
| Nitrite reductase (Alcaligenes xylosoxidans) | FRET with cofactor | Nitrite | [26, 28] |
| P450 oxidoreductase (Sorghum bicolor) | Fluorogenic substrate | Resazurin | [55] |

DHOD: dihydroorotate dehydrogenase A; DHO: dihydroorotate; DCIP: dichlorophenol indophenol;
BCECF-AM: 2′,7′-bis-(2-carboxyethyl)-5-(and-6)-carboxyfluorescein acetoxymethyl ester;
CFDA: 5-(6-)carboxyfluorescein diacetate

same rate-limiting step, but the height of the barrier that needs to
be crossed might vary over time. Alternatively, the rate-limiting
step itself might differ along the different reaction pathways. In both
cases, the rate of the reaction will fluctuate over time, a phenomenon
called dynamic disorder. It is not possible to observe dynamic
disorder in an ensemble experiment, as these rate fluctuations are
averaged out when looking at a large number of enzymes.
Overall, single-turnover experiments are a powerful strategy
to investigate enzyme kinetics. A number of model systems have
been studied with the goal of resolving details of kinetic schemes,
investigating the kinetics of regulation events, and finding evidence
for dynamic disorder. A list of examples is given in Table 3.1.
In the following section we will focus on four examples in
more detail: lipase B from Candida antarctica, the lipase from
Thermomyces lanuginosus, bovine α-chymotrypsin, and nitrite re-
ductases (NiRs) from A. faecalis and A. xylosoxidans.


3.3.1 Candida antarctica Lipase B


Lipase B is a thermostable enzyme produced by the yeast strain
C. antarctica (recently reclassified as Pseudozyma antarctica). The
33 kDa enzyme preferentially catalyzes the hydrolysis of small
triglycerides such as tributyrin with a relatively low regioselectivity
for the different alcohol groups on glycerol [56]. Having a Ser-
His-Asp catalytic triad, it exhibits the same catalytic mechanism
as a serine protease. Despite its deep and narrow active site, it
also shows activity toward a broad range of other esters, making
C. antarctica lipase B (CaLB) an attractive biocatalyst. It is active
in organic solvents where it can also catalyze esterification and
transesterification reactions with high enantioselectivity [56].
Lipase activity in aqueous solvents (the natural surroundings
of the enzyme) is usually regulated by interfacial activation: lipase
activity increases when the triglyceride substrate forms an emulsion
and the lipase can interact with a hydrophobic interface [57]. In the
absence of a lipid interface, the enzyme is in a closed conformation.
Its hydrophobic active site is covered with a “lid” that prevents its
exposure to the surrounding environment. In the presence of the
interface, the lid opens and the active site becomes accessible. In
contrast to other lipases, no clear evidence of interfacial activation
has been reported for CaLB thus far, even though a possible lid has
been identified [58, 59]. Possible factors that might regulate the
activity of CaLB are still a topic of intensive research.
The absence of interfacial activation and the broad substrate
specificity of CaLB make this robust protein a good candidate for
single-molecule studies [37, 38]. CaLB was successfully immobilized
on a hydrophobic glass surface functionalized with
dichlorodimethylsilane. Single turnovers were detected using the fluorogenic
substrate 2′,7′-bis-(2-carboxyethyl)-5-(and-6)-carboxyfluorescein
acetoxymethyl ester (BCECF-AM) that was converted by the en-
zyme into a fluorescent product (Fig. 3.5). Interestingly, no other
immobilization techniques were successful. Neither the entrapment
in agarose or polyacrylamide gels nor the deposition of CaLB
on a hydrophilic glass surface yielded active enzymes for single-
turnover experiments. Clearly, the immobilization method needs to
be customized to the enzyme’s needs and depends on its unique

Figure 3.5 Single-turnover detection of C. antarctica lipase B (CaLB)
using the substrate 2′,7′-bis-(2-carboxyethyl)-5-(and-6)-carboxyfluorescein
acetoxymethyl ester (BCECF-AM). CaLB was immobilized on a hydrophobic
glass surface functionalized with dichlorodimethylsilane. The possible lid of
CaLB is colored in blue, and the active-site residues are in red.

properties. It appears likely that CaLB activity requires a hydrophobic
surface to mimic its natural function as a lipase. Possibly,
the hydrophobic surface stabilizes CaLB upon immobilization. One
might also speculate that the interface stabilizes the active, open
conformation of CaLB, which would provide indirect evidence of
interfacial activation.
To be able to measure the turnover sequence at the single-
enzyme level, several other aspects need to be considered. The
enzyme, with a size of ∼5 nm, is significantly smaller than the optical
resolution of ∼250 nm. To ensure that the recorded turnovers
are really originating from one enzyme only, the overall number
of enzymes on the surface needs to be low. Also the location of
the enzymes on the surface needs to be known (Fig. 3.6a). In
the CaLB experiment the enzymes were fluorescently labeled. This
allowed for establishing their location with the confocal microscope
before the activity measurement started. Once the detection volume
had been positioned at the location of an enzyme, the attached


Figure 3.6 Typical data of a single-turnover experiment. (a) Surface scan
showing the location of fluorescently labeled enzymes (10 × 10 μm). (b)
Fluorescence intensity time trace after the addition of substrate showing
the signal detected in the absence (1) and the presence (2) of an enzyme in
the detection volume. The high-intensity peak (∗) at the start of the enzyme
time trace (2) originates from the fluorescence label coupled to the enzyme.
The label is bleached after less than 1 s of measurement time.

fluorophore was photobleached with the laser and the activity
measurement was started. Figure 3.6b shows a typical turnover
time trace at the position of an enzyme, together with a control
measurement recorded on the empty surface.
The control experiment is important for determining if the events
detected are indeed originating from enzymatic turnovers. In the
experiment shown, an increase in the background fluorescence was
observed when adding BCECF-AM to the sample. The background
originates from fluorescent product molecules that are already
present in a low concentration in the substrate solution at the
beginning of the measurement. On top of the background, a clear on
and off blinking behavior was observed when placing the laser at the
position of an enzyme instead of the empty surface (Fig. 3.6b). Every
turnover produces a new fluorophore that remains in the detection
volume for some time before it diffuses away (or photobleaches).
This blinking could be observed for a measurement time up to one
hour. The length of the single-turnover time trace is only limited by
product accumulation in the sample, leading to a gradual rise in the
background and reducing the S:N ratio.
The possibility of measuring CaLB for extended periods of time
yielded a high number of turnovers, allowing for a detailed statistical
analysis. The off-times were determined using the threshold method
[38] and plotted into a histogram, as described above (Fig. 3.7a).
For all substrate concentrations tested (0.6, 0.8, and 1.4 μM),
the histograms could not be fitted with one exponential function.
Instead a stretched exponential had to be used (Eq. 3.5):
$$P(t_{\mathrm{off}}) = e^{-(t/\tau)^{\alpha}} \qquad (3.5)$$
where τ is the time constant and α is the stretching exponent (α = 1
for a monoexponential function).
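As a minimal sketch of Eq. 3.5, the stretched exponential can be evaluated as shown here. The function name and all numerical values are illustrative assumptions and are not taken from the CaLB measurement. With α = 1 the function reduces to a single exponential; with α < 1 it decays faster at short times and more slowly at long times.

```python
import math

def stretched_exponential(t, tau, alpha):
    """Eq. 3.5: P(t_off) = exp(-(t / tau) ** alpha)."""
    if t < 0 or tau <= 0 or alpha <= 0:
        raise ValueError("t must be >= 0; tau and alpha must be > 0")
    return math.exp(-((t / tau) ** alpha))

# Hypothetical values, chosen only to illustrate the shape of the curve.
tau = 0.1  # time constant in seconds (assumed)
for t in (0.01, 0.1, 1.0):
    mono = stretched_exponential(t, tau, alpha=1.0)
    stretched = stretched_exponential(t, tau, alpha=0.5)
    print(f"t = {t:5.2f} s  mono = {mono:.4f}  stretched = {stretched:.4f}")
```

Both curves cross at t = τ, which makes τ a convenient reference point when comparing fits with different α.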
A stretched exponential fit is not expected for a reaction
following standard reaction kinetics [38]. It points toward the
existence of multiple parallel processes. This was a first indication
that CaLB exhibits dynamic disorder. It has been proposed that
dynamic disorder originates from different enzyme conformations
that hydrolyze the substrate with different rate constants [16, 23,
38]. Each conformation contributes one exponential to the overall
off-time histogram, yielding the observed stretched exponential.
For a more detailed kinetic analysis of the system the fluctuating
enzyme model was established (Fig. 3.7b). In this model, the off-
state consists of a spectrum of conformational substates that can
interconvert. The on-state is formed upon substrate conversion.
Each substate is able to convert the substrate into product but at
a different rate. After substrate conversion, the product molecule
diffuses out of the detection volume and the system switches back
to the off-state. Using this model, it was established that CaLB shows
periods of high activity that alternate with periods of low activity.
In the high-activity period, the rate of the catalytic reaction was
determined to be kfast = 125 s⁻¹. In contrast, the average rate was
only 4 s⁻¹, leading to the conclusion that CaLB is active for only 3%
of the total measurement time. A possible explanation is that only
a small number of conformations show high catalytic activity, while
many others are (almost) inactive. This behavior cannot be observed
in ensemble measurements, where high- and low-activity periods
are averaged out. It is a very interesting observation, however, that
opens up new questions about a possible interfacial activation of
CaLB.
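The 3% figure can be retraced with a short calculation. The sketch assumes, as a simplification, that essentially all turnovers occur during the high-activity periods, so that the active fraction is the ratio of the average rate to the fast rate. The function name is hypothetical.

```python
def active_time_fraction(k_average, k_fast):
    """Fraction of time spent in the high-activity state.

    Simplifying assumption: low-activity periods contribute no turnovers.
    """
    if k_fast <= 0 or not 0 <= k_average <= k_fast:
        raise ValueError("require k_fast > 0 and 0 <= k_average <= k_fast")
    return k_average / k_fast

# Rates from the text, in s^-1.
fraction = active_time_fraction(k_average=4.0, k_fast=125.0)
print(f"active for {fraction:.1%} of the measurement time")
```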

Figure 3.7 Kinetic analysis of CaLB measured at a substrate concentration
of 1.4 μM BCECF-AM. (a) The off-time probability distribution shows clear
deviations from single-exponential kinetics. Note that the y axis is plotted
on a log scale so that a monoexponential process yields a straight line. (b)
The fluctuating enzyme model describes a possible kinetic scheme that is
able to explain the observed stretched exponential behavior.

3.3.2 Thermomyces lanuginosus Lipase


The lipase from T. lanuginosus (formerly Humicola lanuginosa) is
another important industrial biocatalyst that has been studied at
the single-molecule level. In contrast to CaLB, interfacial activation
of this lipase has been observed experimentally using a number
of different conditions [60, 61]. When setting up a single-turnover
experiment, the experimental conditions need to be chosen such
that interfacial activation can occur. In a first attempt, the same
immobilization method as described above for CaLB was tested, but
no activity could be detected using the fluorogenic substrate 5(6)-
carboxyfluorescein diacetate (CFDA). While the hydrophobic surface

Figure 3.8 Experimental setup to measure the single-turnover sequence
of TLL. The enzyme was immobilized via a BSA foot to facilitate a defined
orientation of the TLL–BSA conjugate on the surface, preventing TLL
inactivation. CFDA was used as a substrate. The lid of TLL is colored in blue,
and the active-site residues are in red.

seems to be a good environment for CaLB, it does not yield active
T. lanuginosus lipase (TLL). It is not clear if the enzymes denature
as a consequence of random adsorption or if the surface does not
facilitate interfacial activation.
To overcome this problem, a site-specific immobilization method
was developed using a protein foot [33]. This elegant protocol is
based on bovine serum albumin (BSA) that was conjugated to TLL
using defined positions in both proteins (Fig. 3.8). BSA is often used
as a surface coating as it adsorbs to hydrophilic and hydrophobic
surfaces. It was expected that TLL–BSA heterodimers would
adsorb to a surface via the BSA part of the molecule, preventing
direct contact between TLL and the surface. The heterodimer
was prepared using the well-established Cu(I)-catalyzed Huisgen
cycloaddition reaction. TLL, genetically modified to carry only one
surface-accessible lysine, was acetylene-functionalized by coupling
4-pentynoic acid to this lysine residue (Lys46). BSA, which naturally
carries a free cysteine (Cys34), was functionalized
with azide using 3-azidopropyl-1-maleimide. Subsequently, the
two proteins were coupled in a copper-catalyzed click reaction.
To visualize the enzymes on the surface, the BSA was further
functionalized with the fluorescent dye Alexa Fluor 488. In this
context, the BSA foot offers an additional advantage: the enzyme
does not need to be labeled directly, which might otherwise affect
its activity.
In an ensemble control experiment using CFDA, the TLL–
BSA conjugate showed a higher activity than the unmodified
enzyme. This effect was attributed to a (partial) lid opening
caused by the presence of the hydrophobic BSA. For the single-
molecule experiments, the TLL–BSA conjugate was deposited on
a hydrophobic coverslip. After locating the enzymes, a solution
of CFDA was added and the measurement was started. Just as in
the CaLB experiments, single turnovers could be detected after
bleaching the fluorescence label used to locate the enzymes. Clearly
the BSA foot was essential to obtain active enzymes on the surface,
suggesting that the heterodimer is indeed oriented such that the
TLL molecule is not in contact with the surface. It might also be
speculated, however, that the measurement became possible due to
the interfacial activation caused by the presence of the BSA.
Data analysis was again performed using the threshold method,
and an average number of 17 turnovers/second was determined
from the on-time–off-time trace. Similar to CaLB, evidence for the
fluctuating enzyme model was found. The analysis of consecutive
off-times revealed that a short off-time is more likely to be followed
by another short off-time. Likewise, a long off-time precedes another
long off-time with a higher probability. Correlations between
subsequent events are most easily visualized in 2D correlograms
(Fig. 3.9). For TLL a clear diagonal was observed in the correlogram
where each off-time was plotted against the directly following off-
time in the on–off sequence. Even after 15 turnover events this
memory effect was still visible.
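The correlogram construction can be sketched in a few lines: each off-time is paired with the off-time n turnovers later, and the strength of the diagonal is summarized by a correlation coefficient. The function names and the off-time values are hypothetical and serve only as an illustration; the published analysis may differ in detail.

```python
import math

def lagged_pairs(off_times, lag=1):
    """Pair each off-time with the off-time `lag` turnovers later."""
    if lag < 1:
        raise ValueError("lag must be >= 1")
    return list(zip(off_times[:-lag], off_times[lag:]))

def pearson_r(pairs):
    """Pearson correlation coefficient of the (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical off-times (s): a run of short waits followed by a run of long waits.
off_times = [0.01, 0.02, 0.01, 0.03, 0.02, 0.50, 0.40, 0.60, 0.45, 0.55]
pairs = lagged_pairs(off_times, lag=1)  # points of the n = 1 correlogram
print(f"{len(pairs)} pairs, r = {pearson_r(pairs):.2f}")
```

Repeating the calculation for increasing lags gives a simple measure of how quickly the memory effect decays.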
In terms of the fluctuating enzyme model this memory effect
is another indication for the presence of different, slowly

Figure 3.9 2D correlation plots using the off-times of the TLL reaction.
Shown are the durations of each off-time and its directly following off-time
(n = 1) as well as the 15th and the 100th following off-time. A clear diagonal
is observed when the duration of the directly following off-time is plotted,
indicating correlations between these off-times. Some correlations are still
present when the off-times are separated by 15 turnovers (n = 15). After
100 turnovers the correlations are lost (n = 100).

interconverting enzyme conformations with different activities. As
long as the enzyme resides in one conformation it catalyzes the
reaction with a certain rate constant. Once it changes conformation,
the memory is lost. Taken together, the CaLB and TLL results
described so far indicate that both enzymes exhibit
dynamic disorder. The origin of this dynamic disorder is not clear,
however. Fluctuations between different conformations might be
an intrinsic property of these and maybe also other enzymes.
Alternatively, the observed dynamic disorder might originate from
the fact that interfacially activated enzymes have been measured
under suboptimal conditions.
To investigate interfacial activation of TLL in more detail, a new
series of experiments was performed mimicking a more natural
environment for the enzyme. TLL was attached to surface-tethered
liposomes [34]. In this way, assuming that the lid opens upon
interaction with the lipid membrane, the enzyme was kept in close
proximity to its activator, the lipid membrane. The experiment
was designed such that the accessibility of the bilayer could be
controlled, providing the possibility of regulating TLL activity
(Fig. 3.10).
For enzyme attachment and surface immobilization, the liposomes
contained ∼0.1% of biotinylated lipids carrying the biotin
group on a poly(ethylene glycol) linker (DSPE-PEG2000-biotin).

Figure 3.10 Experimental setup to investigate the regulation of TLL
activity. TLL is immobilized on liposomes containing a different number of
PEGylated lipids. The number of PEG chains determines the access of TLL to
the lipid membrane. (a) If the amount of PEGylated lipids is low the enzyme
can access the layer easily and become activated at the interface. (b) If the
surface is covered with PEG chains, the access of the enzyme to the lipid
layer is limited.

In addition, they were doped with the intercalating fluorescent
dye 1,1′-dioctadecyl-3,3,3′,3′-tetramethylindodicarbocyanine perchlorate
(DiD) for detecting their location on the surface. The
biotinylated and fluorescently labeled liposomes were incubated
with a preformed complex of biotinylated enzyme and Alexa Fluor
488–labeled streptavidin to allow for binding of the enzymes to
the liposomes. Using an excess of liposomes, it was ensured that
every liposome contained maximally one enzyme. Subsequently,
the liposome preparation was added to a BSA-passivated and
streptavidin-functionalized glass surface for liposome immobilization.
As both the liposomes and the enzyme-bound streptavidin
carried different fluorescent labels, enzymes bound to the liposome
could be distinguished from nonspecifically adsorbed enzymes and
liposomes that did not contain an enzyme.
The liposomes were further supplemented with 0% to 2.1%
of PEGylated lipids (DOPE-PEG2000). While the PEG chains used
for attaching the enzyme ensured its mobility and bilayer access,
the additional PEG chains hindered the enzyme from accessing
the liposome surface. Using the different liposome preparations,
single-molecule experiments were performed in the same way, as
described above. The average turnover rate varied between 7 and
26 s⁻¹ for the highest and the lowest fraction of PEGylated lipids,
respectively. This dependence of TLL activity on the number of
PEG chains clearly indicates that the PEG chains contribute to the
regulation of TLL activity in an indirect fashion.
A more detailed kinetic analysis again showed deviations
from monoexponential behavior, indicating that more than one
conformational state is involved in the catalytic reaction. In contrast
to previous experiments, the data could not be fitted with a stretched
exponential. Two exponentials were sufficient to obtain a good fit
of the data, suggesting that only two conformations are relevant for
the observed catalytic reaction. On the basis of the catalytic rate
constants for the two conformations, kact1 = 230 s⁻¹ and kact2 =
12.5 s⁻¹, it appears likely that these conformations correspond to
the active, lid-open (kact1) and closed (kact2) conformations. These
rate constants are independent of the amount of PEGylation and,
consequently, represent intrinsic rate constants of the open and
closed conformations. Bilayer access merely shifted the equilibrium
between the open and closed conformations, thereby regulating the
time the enzyme spent in each conformation. Bilayer
access not only stabilized the open conformation but also
affected the rate constant for lid opening, while leaving
the rate constant for lid closing unaffected.
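A biexponential off-time distribution of this kind can be sketched as follows. The two rate constants are the values quoted above; the weight of the open conformation is a hypothetical fit amplitude, and the normalized functional form is an assumption, since the exact fit function is not given here. The sketch only illustrates that changing the weight shifts the mean off-time while both rate constants stay fixed.

```python
import math

K_ACT1 = 230.0  # s^-1, lid-open conformation (value from the text)
K_ACT2 = 12.5   # s^-1, closed conformation (value from the text)

def _check_weight(weight_open):
    if not 0.0 <= weight_open <= 1.0:
        raise ValueError("weight_open must lie in [0, 1]")

def biexponential_pdf(t, weight_open, k1=K_ACT1, k2=K_ACT2):
    """Off-time probability density as a weighted sum of two exponentials."""
    _check_weight(weight_open)
    return (weight_open * k1 * math.exp(-k1 * t)
            + (1.0 - weight_open) * k2 * math.exp(-k2 * t))

def mean_off_time(weight_open, k1=K_ACT1, k2=K_ACT2):
    """Mean of the biexponential density defined above."""
    _check_weight(weight_open)
    return weight_open / k1 + (1.0 - weight_open) / k2

# weight_open is hypothetical; it stands in for the effect of bilayer access.
for weight_open in (0.2, 0.8):
    print(f"weight_open = {weight_open}: mean off-time = "
          f"{mean_off_time(weight_open) * 1e3:.1f} ms")
```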
This experiment clearly shows that the regulation of lipase
activity can be studied at the single-molecule level in an experiment
that introduces an external control parameter: the access of the
enzyme to the bilayer. For TLL, interfacial activation originates
from the energetic stabilization of the active conformation. This
clearly points toward a conformational selection mechanism where
the equilibrium of conformational states is shifted toward this
conformation. This mechanism is in agreement with the related
structural transition. The inactive conformation of TLL is fully
closed, and no substrate binding is possible. Substrate can only bind
to the open conformation that is favored by the contact with the
lipid layer. These results provide new insights into the regulation
mechanism. They question the presence of dynamic disorder and
the accompanying memory effect that has been observed in earlier
experiments [23, 32, 33, 37].

3.3.3 α-Chymotrypsin
The digestive protease α-chymotrypsin from bovine pancreas is one
of the most thoroughly studied enzymes. α-chymotrypsin is a 25
kDa serine protease with a Ser-Asp-His catalytic triad. It specifically
cleaves peptides at the C-terminal side of large hydrophobic amino
acids such as leucine, phenylalanine, and tryptophan. The catalytic
reaction proceeds via a covalent intermediate. After cleavage of
the peptide bond, the C-terminal fragment dissociates immediately,
while the N-terminal fragment remains covalently bound to the
catalytic serine. The N-terminal peptide fragment is released from
the enzyme in a second step using a water molecule to hydrolyze the
covalent intermediate.
As for most other proteases, α-chymotrypsin activity is tightly
regulated. The enzyme is expressed as a proenzyme that is
processed into the mature and active enzyme by proteolytic
cleavage. The activation mechanism involves the internal cleavage
of the amino acid chain into three fragments (amino acids 1–13,
16–146, and 149–245). In the active enzyme the three fragments
are held together by disulfide bonds and noncovalent interactions.
The enzyme has a narrow pH optimum at pH 8.0. α-Chymotrypsin
activity decreases upon lowering the pH due to protonation of the
catalytic histidine. At high pH, the N-terminal amine of Ile16 is
deprotonated, which disrupts its salt bridge with Asp194.
In the absence of the salt bridge, α-chymotrypsin
adopts a different conformation. This conformation has a structure
similar to that of the proenzyme and is consequently inactive [62]. The
rate constants for this conformational change have been determined
using ensemble techniques. They are pH dependent and range from
3.1 s⁻¹ (pH 5.6) to 0.24 s⁻¹ (pH 9.5) for the conformational change
from the inactive to the active conformation and from 0.8 s⁻¹ (pH
5.6) to 1.5 s⁻¹ (pH 9.5) for the conformational change from the active
to the inactive conformation [63]. As the conformational changes
occur on the millisecond-to-second timescale they are accessible in
single-molecule experiments.
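To illustrate the timescale argument, the quoted ensemble rate constants can be inserted into a simple two-state model (inactive ⇌ active). The two-state treatment and the function name are assumptions made for this sketch. Under this model the equilibrium fraction of active enzyme is the activation rate divided by the sum of both rates, and the relaxation time is the inverse of that sum.

```python
# Rate constants from the text in s^-1: (inactive -> active, active -> inactive)
RATES_BY_PH = {5.6: (3.1, 0.8), 9.5: (0.24, 1.5)}

def two_state_summary(k_activate, k_inactivate):
    """Return (equilibrium active fraction, relaxation time in s)."""
    total = k_activate + k_inactivate
    if k_activate < 0 or k_inactivate < 0 or total == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return k_activate / total, 1.0 / total

for ph, (k_activate, k_inactivate) in RATES_BY_PH.items():
    fraction_active, relaxation_time = two_state_summary(k_activate, k_inactivate)
    print(f"pH {ph}: active fraction = {fraction_active:.2f}, "
          f"relaxation time = {relaxation_time:.2f} s")
```

With these numbers the relaxation times lie between roughly 0.2 and 0.6 s, which is consistent with the millisecond-to-second range stated above.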
Fluorogenic substrates for α-chymotrypsin were designed using
Rhodamine 110, a frequently used fluorophore for the synthesis of
protease substrates. It carries two amine groups that are utilized

Figure 3.11 Experimental setup for measuring α-chymotrypsin activity at
the single-molecule level. The Rhodamine 110–based substrate carries two
peptide chains that are cleaved by α-chymotrypsin in a two-step reaction.
Both the intermediate and the final product of the reaction are fluorescent.
They can be distinguished on the basis of their different quantum yield Φ
and fluorescence lifetime τ.

for coupling a short peptide via the C-terminus (Fig. 3.11). As
α-chymotrypsin recognizes the amino acid sequence AlaAlaProPhe
with high specificity, the substrate (succinyl-AlaAlaProPhe)2-
Rhodamine 110 (sAAPF2-Rh110) was used to detect the activity of
single enzymes immobilized in an agarose gel.
The fluorogenic substrate sAAPF2-Rh110 is cleaved by the
enzyme in a two-step reaction (just as the fluorescein-based
lipase substrates described above). Double-substituted Rhodamine
110 and fluorescein substrates are frequently used in ensemble
measurements. But obtaining accurate kinetics of such a two-step
reaction is intrinsically difficult, as the intermediate and the final
product have similar emission wavelengths but different quantum
yields (Fig. 3.11). For both fluorescein- and Rhodamine 110–based
substrates, the intermediate is less fluorescent than the final product
[64, 65]. Typically the final product is used for a calibration curve.
This calibration does not account for the presence of the less
fluorescent intermediate, however. This is especially problematic
under initial rate conditions where mostly the intermediate is
produced, leading to an underestimation of the reaction velocity
[65]. Additional problems arise from the possibility that the first
and the second step might have different rate constants and the
possibility of intermediate channeling [66–68]. Even though the
corresponding rate equations have been established, fitting the
progress curve of the enzymatic reaction is not possible due to a
large number of fitting parameters [65].
The two-step reaction might also complicate single-turnover
detection if the intermediate is not bright enough to be detected
at the single-molecule level. A large number of turnovers would
be lost, preventing any kinetic analysis. If the intermediate can be
detected, however, a detailed kinetic analysis is possible. Assuming
that the intermediate and the final product can be distinguished
on the basis of their different fluorescent properties, even the
occurrence of channeling can be investigated on the basis of the
turnover sequence. The intermediate and the final product possess
not only a different brightness but also a different fluorescence
lifetime (Fig. 3.11). Using a time-correlated single-photon counting
(TCSPC) detection scheme, both the fluorescence intensity and the
fluorescence lifetime can be detected for every individual enzymatic
turnover [36, 49, 50].
A TCSPC experiment of α-chymotrypsin, performed at a substrate
concentration of 30 μM, revealed that the intermediate could
be detected and that it was the dominant fluorescent species
in the sample (Fig. 3.12). It was clearly the fluorophore that
had been produced by the enzymatic reaction (on-state) and
also contributed to the background signal (off-state). Hardly any
Rhodamine 110 was produced. This surprising result is easily
explained when considering the low enzyme concentration used for
a single-turnover experiment. The substrate and the intermediate
compete for binding to the active site. The substrate concentration
is far higher than the intermediate concentration, however, so the
second hydrolysis step is extremely unlikely to occur. This would
be different in the case of intermediate channeling where the
intermediate rebinds with an increased probability. Consequently,
intermediate channeling could be excluded in this experiment—
an observation that is consistent with the catalytic mechanism of
α-chymotrypsin. The intermediate corresponds to the C-terminal
product that is released first, while the peptide remains covalently
bound until the second reaction step occurs. It would have been

Figure 3.12 Two-dimensional histograms showing the lifetimes and intensities
of all individual on- and off-states. The maximum of the distribution
is centered at a lifetime of 2.7 and 2.5 ns for the on- and off-states,
respectively. The on-time histogram also shows a very small population with
a higher intensity and a lifetime of approximately 3.7 ns, which most likely
corresponds to Rhodamine 110. The histograms show that the intermediate
was the main species that was produced in the enzymatic reaction.

interesting to lower the substrate concentration to facilitate the
detection of the second reaction step. This was not possible,
however, as the overall number of turnovers would have been too
low at the required substrate concentration.
The fact that only the intermediate was detected when using
a high substrate concentration does not allow for a more detailed
analysis of the two-step reaction. It ensures, however, that every
fluorescent molecule is detected and that the observed reaction has
1:1 stoichiometry. This is an important prerequisite for investigating
the presence of a (pH-induced) inactive conformation and possible
dynamic disorder. In a first series of α-chymotrypsin experiments
performed close to its pH optimum (pH 7.5), nonexponential off-
time distributions were observed when using threshold analysis.
They were interpreted as dynamic disorder [30].
As the detected reaction intermediate has low brightness, the
single-turnover time traces have an intrinsically low S:N ratio.
The on- and off-state intensity levels overlap, partially impeding
the performance of threshold analysis. To determine the error of
threshold analysis in data with a low S:N ratio (<5:1), it was
compared with change-point analysis in subsequent experiments
(Fig. 3.13). With both methods deviations from single-exponential

Figure 3.13 Single-turnover experiment of α-chymotrypsin analyzed with
the threshold method and with change-point analysis. (a) The off-time
histograms have a fundamentally different shape, depending on the data
analysis method used. (b) Autocorrelation analysis and 2D correlograms
(n = 1; timescale = 40 ms) show correlations between subsequent
turnovers when threshold analysis is used but not after change-point
analysis.

kinetics were observed in the off-time histograms at short
timescales. After threshold analysis, the histogram showed a very
high number of short events, whereas this number was significantly
lower when using change-point analysis. More importantly, no
correlations between events were seen when using change-point
analysis (Fig. 3.13b). With threshold analysis, the correlations only
occurred at short timescales where the differences between the
off-time histograms were the largest. Besides the shape of the
histograms, the observation of correlated turnover events is also
critically influenced by the data analysis method used.
This result has been confirmed using a large set of simulated data
with different S:N ratios and intensity levels [35]. The simulations
support the observation that threshold analysis overestimates the
number of short off-times. This artifact is introduced by accidentally
dividing noisy on-times into several shorter on- and off-times. In
contrast, change-point analysis underestimates the number of short
off-times when the number of photons is too low to detect them
with a sufficient statistical accuracy. It is clear from these results
that the previously determined off-time histograms obtained using
threshold analysis might not (only) represent the contribution of
different conformations to the catalytic reaction. These findings are
of fundamental importance as they question previous observations
of dynamic disorder and memory effects [23, 32, 33, 37].
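The threshold artifact can be reproduced with a minimal sketch of threshold analysis on a binned intensity trace. The function name, the photon counts, and the threshold are hypothetical, and the published implementations may differ in detail. A single noisy bin within an on-period is enough to split that on-time and to create a spurious short off-time.

```python
def off_times_by_threshold(counts, threshold, bin_width_ms):
    """Return durations (ms) of runs of consecutive bins below the threshold.

    Runs that touch the start or the end of the trace are skipped because
    their true duration is unknown.
    """
    durations = []
    run_length = 0
    seen_on = False  # True once the first on-bin has been encountered
    for count in counts:
        if count >= threshold:
            if seen_on and run_length > 0:
                durations.append(run_length * bin_width_ms)
            seen_on = True
            run_length = 0
        elif seen_on:
            run_length += 1
    return durations

# Hypothetical photon counts per 1 ms bin; the threshold of 6 is also assumed.
clean = [2, 2, 10, 10, 10, 10, 2, 2, 2, 10, 2]
noisy = [2, 2, 10, 5, 10, 10, 2, 2, 2, 10, 2]  # one on-bin dips below threshold

print(off_times_by_threshold(clean, threshold=6, bin_width_ms=1.0))
print(off_times_by_threshold(noisy, threshold=6, bin_width_ms=1.0))
```

The noisy trace yields one additional off-time of a single bin, which is the kind of short event that inflates the first bins of the off-time histogram.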
As a result of these problems with low-S:N data, no detailed
analysis of the pH-dependent activity of α-chymotrypsin has been
performed so far. Still, the experiments with α-chymotrypsin can
be considered the most detailed single-enzyme experiments
performed to date. Essential knowledge has been obtained about
the amount of information that can be extracted from a single-
turnover experiment when using double-substituted substrates.
α-chymotrypsin is an ideal model system for investigating possible
technical improvements in the measurement setup as the measured
rates can be directly compared to the ensemble rates available for
both the catalytic reaction and the pH-dependent conformational
change. Improvements in the fluorogenic substrate design yielding
substrates with 1:1 stoichiometry and high brightness are an
essential next step forward [65]. Further, new detection schemes,
as described in Sections 3.4.1 and 3.4.2, will allow for increased S:N
ratios, facilitating more accurate single-turnover detection.

3.3.4 Nitrite Reductase


Bacterial NiRs, such as the enzymes from A. faecalis and A.
xylosoxidans, are key dissimilatory enzymes in the global nitrogen
cycle. They are involved in the denitrification process where nitrite
(NO2−) is reduced to the gaseous products nitric oxide (NO)
and nitrogen (N2) in a stepwise fashion. Copper-containing NiRs
(CuNiRs) reduce NO2− to NO. They are trimeric enzymes with a
molecular weight of 37 kDa per monomer. Every monomer contains
two copper ions that are both directly involved in the catalytic
reaction (Fig. 3.14a). The type 1 copper site (T1) accepts an electron
from an electron donor and passes it on to the type 2 copper
site (T2) where NO2− is reduced to NO. CuNiRs are structurally
related to the multicopper oxidases (MCOs). The industrially used
laccases are a typical example of this group of enzymes [69, 70]. NiR
activity is usually detected electrochemically or spectroscopically
following a change in the optical properties of the electron donor
[71]. Electrodes functionalized with NiR can be used as NO2− sensors
[72].
The redox state of oxidoreductases can often be read out
optically. Many cofactors such as FAD and FMN show different

Figure 3.14 FRET reporter system for following the catalytic reaction of
nitrite reductase. (a) Structure of the trimeric enzyme from A. faecalis
showing the T1 and T2 copper sites. The fluorescent dye ATTO 655 is
coupled in close proximity to the T1 site, allowing for FRET from the
fluorophore to the oxidized T1 copper site. (b) The oxidized T1 site of A.
faecalis CuNiR (FRET acceptor) shows a broad absorption spectrum with
characteristic peaks at 450 and 590 nm. The FRET donor ATTO 655 has
an emission maximum at 684 nm. The spectral overlap is indicated in gray.
During the catalytic reaction the T1–Cu ion cycles between its blue oxidized
state (Cu2+) and a colorless reduced state (Cu+). This switching can be
observed directly in single-molecule time traces utilizing changes in (c) the
donor intensity or (d) the lifetime.

absorption and fluorescence spectra in the oxidized and the reduced
state. In the case of CuNiR, the copper ion bound at the T1 site
changes its optical properties when it is cycled between the oxidized
and reduced states. It is colorless in the reduced state (Cu+),
whereas the oxidized state (Cu2+) displays blue color characterized
by a broad absorption spectrum (Fig. 3.14b). In contrast to FAD
and FMN, Cu2+ is not fluorescent, limiting assay sensitivity. Making
use of FRET, a new reporter system was designed that provides a
direct and highly sensitive readout for the oxidation state of the T1
site [73]. The oxidized Cu2+ ion was used as a FRET acceptor for a
number of fluorescent dyes that emit fluorescence in the 650–800
nm region (Cy5 [73], ATTO 655 [27], and ATTO 647N [26, 28]). In
the oxidized state, where the Cu2+ ion absorbed light, it efficiently
quenched the donor fluorescence and a low donor emission was
observed. In the reduced state, the Cu+ ion did not absorb and was
switched off as a FRET acceptor, leading to high donor emission.
At the ensemble level, the fluorescence intensity of an ATTO 655–
labeled CuNiR sample decreased when NaNO2 was added as an
oxidant and increased when ascorbate was added as a reductant
[27]. At the single-molecule level, switching between a high FRET
and a low FRET state can potentially be observed in single-turnover
time traces. As FRET affects both donor intensity and donor lifetime,
the FRET efficiency can potentially be determined from both readout
parameters (Fig. 3.14c,d).
The applicability of the FRET reporter system for single-turnover
detection was tested using the following experimental setup. An L93C
mutant of the A. faecalis CuNiR was labeled with an amine-reactive
derivative of the fluorophore ATTO 655 under conditions favoring
N-terminal coupling. Using a low dye concentration in the labeling
experiment, maximally one monomer was labeled with a donor
fluorophore. For the ATTO 655–T1–Cu2+ FRET pair a Förster radius
of 3.5 nm was calculated. The distance between the N-terminus
and the T1–Cu was estimated to be 3.9 nm, leading to a FRET
efficiency of 30%–45% [27]. The labeled enzymes were immobilized
on a thiol-functionalized surface using a bis-maleimide crosslinker.
After establishing the location of the enzymes on the surface, the
confocal volume was placed at the position of an enzyme and the
donor fluorescence was measured in the presence and absence
of the substrate. Only in the presence of the substrate (NO2−),
the electron donor (ascorbate), and the redox mediator (phenazine
ethosulfate) was distinct switching between two intensity levels
observed, indicating that the T1–Cu is cycled between its oxidized
and reduced forms [27].
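The quoted FRET efficiency can be checked against the standard Förster relation E = 1/(1 + (r/R0)^6) for a single donor–acceptor pair. The sketch below uses the Förster radius and the distance given above; the function names are hypothetical, and the single-distance treatment is a simplification.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Foerster relation E = 1 / (1 + (r / R0) ** 6)."""
    if distance_nm <= 0 or forster_radius_nm <= 0:
        raise ValueError("distance and Foerster radius must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)

def relative_donor_signal(efficiency):
    """Donor intensity (or lifetime) with the acceptor active, relative to
    the value with the acceptor switched off."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie in [0, 1]")
    return 1.0 - efficiency

# Values from the text: R0 = 3.5 nm, N-terminus to T1-Cu distance = 3.9 nm.
efficiency = fret_efficiency(distance_nm=3.9, forster_radius_nm=3.5)
print(f"E = {efficiency:.2f}, relative donor signal (oxidized T1) = "
      f"{relative_donor_signal(efficiency):.2f}")
```

The result of about 0.34 falls within the reported range of 30%–45%, so the donor signal in the oxidized state is expected to drop to roughly two-thirds of its value in the reduced state.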
The obtained single-turnover sequence provides information
about the durations of the oxidized and reduced states. It can
potentially be utilized for investigating the multistep catalytic
reaction of CuNiR in more detail (Fig. 3.15). The details of this
mechanism have been a matter of debate. After reduction of the
T1–Cu site in the first reaction step, two alternative pathways are
possible. NO2− binding can occur before the electron is transferred
to the T2 site (binding-first pathway). Alternatively, NO2− binding
might follow electron transfer (reduction-first pathway). In addition,
a random sequential mechanism has been proposed where both
pathways coexist. Ensemble experiments have shown NO2−- and
pH-dependent deviations from Michaelis–Menten kinetics that could
be explained with a random sequential mechanism where the
binding-first pathway is dominant at low pH and high substrate
concentrations, whereas the reduction-first pathway is more likely
at high pH and low substrate concentrations [71]. These possibilities
can be tested using the FRET time traces as the durations of the low
FRET and the high FRET state depend on the pathway utilized by the
enzyme (Fig. 3.15).

Figure 3.15 Possible kinetic scheme for the reduction of nitrite (S) to nitric
oxide (P). In the first reaction step, the T1–Cu site gets reduced using an
electron from, for example, ascorbic acid. In the following step two options
are possible. In pathway A, nitrite binds to the enzyme first, followed by
electron transfer to the T2 site. Alternatively, the electron is transferred to
the T2 site before nitrite binding occurs (pathway B). In the last step nitrite
is reduced to nitric oxide, leaving two oxidized Cu sites. When the reaction
follows pathway A, the system spends more time in the low FRET state (high
donor emission) than when pathway B is utilized.

As indicated above, the FRET efficiency was only 30%–45% so
that a clear separation of the high and low FRET states was not
possible using the threshold method. The problem was overcome
by using autocorrelation analysis of the binned intensity time traces
[27]. Autocorrelation analysis performs a global analysis of the
timescale(s) of the reaction without the need of extracting the
durations of individual events. The measurement was performed
under conditions that were considered to favor the reduction-first
pathway [71]. Using the corresponding kinetic model for fitting the
autocorrelation data, a good fit was obtained and the rate constants
for electron transfer were determined, suggesting that control over
the reaction conditions has been achieved. The autocorrelation data
also indicated a distribution of these rates suggesting dynamic
disorder. This disorder was tentatively assigned to differences in
the redox potentials of the T1–Cu and T2–Cu sites originating from
variations in the hydrogen-bonding network around the active site.
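The idea behind autocorrelation analysis of a binned trace can be illustrated with simulated data. The sketch below is not the analysis or kinetic model used in [27]; it assumes a simple two-state (dim/bright) telegraph process with Poisson photon noise, and all rates, count levels, and function names are hypothetical. For such a process the autocorrelation decays with approximately the sum of the two switching rates, so a timescale is recovered without assigning individual events.

```python
import numpy as np


def simulate_trace(k_leave_dim, k_leave_bright, bin_s, n_bins,
                   counts_dim, counts_bright, seed=0):
    """Binned photon counts for a molecule hopping between a dim and a bright state.

    All rates (1/s) and count levels are hypothetical illustration values.
    """
    rng = np.random.default_rng(seed)
    p_switch = (k_leave_dim * bin_s, k_leave_bright * bin_s)  # per-bin probabilities
    draws = rng.random(n_bins)
    states = np.empty(n_bins, dtype=int)
    state = 0  # 0 = dim, 1 = bright
    for i in range(n_bins):
        if draws[i] < p_switch[state]:
            state = 1 - state
        states[i] = state
    mean_counts = np.where(states == 1, counts_bright, counts_dim)
    return rng.poisson(mean_counts)


def autocorrelation(trace, max_lag):
    """G(lag) = <dI(t) dI(t + lag)> / <I>^2 for lag = 1 .. max_lag (in bins).

    Lag 0 is skipped because uncorrelated shot noise contributes only there.
    """
    mean = trace.mean()
    d = trace - mean
    g = [np.mean(d[:-lag] * d[lag:]) for lag in range(1, max_lag + 1)]
    return np.array(g) / mean**2


def relaxation_rate(g, bin_s):
    """Negative slope of log G versus lag time; assumes a single-exponential decay."""
    lag_s = np.arange(1, len(g) + 1) * bin_s
    slope, _ = np.polyfit(lag_s, np.log(g), 1)
    return -slope


BIN_S = 1e-3  # 1 ms bins
trace = simulate_trace(50.0, 30.0, BIN_S, 200_000, 5.0, 8.0, seed=1)
g = autocorrelation(trace, max_lag=10)
print(f"relaxation rate ~ {relaxation_rate(g, BIN_S):.0f} 1/s (sum of input rates: 80 1/s)")
```

The two intensity levels chosen here overlap strongly within a single bin, which mimics the situation in which a threshold cannot separate the states but the correlation function still carries the kinetic timescale.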
With the goal of confirming these initial results, new experiments
were designed with optimized measurement setups that allowed
for a more accurate assignment of the FRET states. In a first series
of experiments the CuNiR from A. faecalis was replaced with the
enzyme from A. xylosoxidans [28]. The T1–Cu2+ of the A. xylosoxidans
enzyme shows a stronger absorption in the 600 nm region, leading
to an improved FRET efficiency of 70%–80%. The labeling
and immobilization strategy was also modified. The agarose-entrapped
enzyme was labeled site specifically with a thiol-reactive derivative
of ATTO 647N using a K329C mutation. Most importantly, the
assignment of the low FRET and the high FRET state was based on
the donor lifetime instead of the donor intensity (Fig. 3.14d) using a
TCSPC measurement. For the reduced T1–Cu+ (low FRET) a lifetime
of 3.7 ns was determined, whereas the lifetime of the oxidized T1–
Cu2+ (high FRET) was only 1.1 ns, allowing for a clear discrimination
of the two states.
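The two lifetimes can be related to a FRET efficiency with the usual lifetime-based expression. The estimate below is a sketch that assumes the lifetime in the reduced state (3.7 ns) approximates the unquenched donor lifetime, i.e., that residual energy transfer to T1–Cu+ is negligible; the symbols are notation chosen here.

```latex
E = 1 - \frac{\tau_{DA}}{\tau_{D}}
  \approx 1 - \frac{1.1\ \text{ns}}{3.7\ \text{ns}}
  \approx 0.70
```

Under this assumption the result agrees with the lower end of the 70%–80% efficiency quoted above.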
A time-averaged analysis of the donor lifetime reflects the
relative amount of time the enzyme spends in the oxidized
or reduced state; for example, the higher the average lifetime,
the longer the duration of the reduced state. The analysis of
several hundred enzymes yielded two enzyme populations that
were tentatively assigned to the two possible reaction pathways.
Subsequently, the durations of every high and low FRET state were
obtained for enzymes from both populations separately. Fits to the
corresponding histograms showed that the two populations differed
in the time the enzymes spent in the reduced state: 20–30 ms versus
10 ms, respectively. This observation supports the coexistence of the
binding-first and reduction-first pathways. Moreover, an increase in
the substrate concentration led to a higher fraction of enzymes in the
binding-first population, a result that is expected when more NO2−
is available for binding to the enzyme [71].
Using the same ATTO 647N–labeled enzyme from A. xylosoxidans,
a measurement setup was developed that allowed for measuring the
turnover sequence of one single CuNiR enzyme without the need
for immobilization [26]. To facilitate long observation times (several
seconds), a single-enzyme molecule was caught in a so-called anti-
Brownian electrokinetic (ABEL) trap. In the ABEL trap, the position
of the enzyme molecule is tracked and subsequently controlled by
using electroosmotic forces. Due to the higher FRET efficiency of the
A. xylosoxidans enzyme, the high and low FRET states were clearly
separated and the corresponding waiting times were determined
using change-point analysis. Comparing the relative duration of
the low versus the high FRET state, it was established that the
enzyme spent a larger fraction of time in the oxidized high FRET
state when the substrate concentration was increased. Performing
a global fit to a set of waiting-time distributions obtained from
measurements at different NO2− concentrations, all rate constants
of the kinetic scheme shown in Fig. 3.15 could be obtained. The
kinetic scheme supports the results from the lifetime experiment.
At low substrate concentrations, the reduction-first pathway is
dominant, whereas the binding-first pathway is utilized at high
substrate concentrations. At intermediate substrate concentrations
the reaction follows the proposed random sequential mechanism.
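The simplest building block of a waiting-time analysis is the estimate of a single rate constant from a set of dwell times. The sketch below is not the global fit of [26]; it assumes single-exponential dwell times, uses a hypothetical rate of 100 s⁻¹ (mean dwell time 10 ms), and shows how the uncertainty of the estimate shrinks with the number of observed events.

```python
import numpy as np


def fit_rate(dwell_times_s):
    """Maximum-likelihood rate constant (1/s) and its approximate standard error
    for dwell times assumed to follow a single-exponential distribution."""
    dwell = np.asarray(dwell_times_s, dtype=float)
    rate = 1.0 / dwell.mean()
    return rate, rate / np.sqrt(dwell.size)


rng = np.random.default_rng(seed=0)
TRUE_RATE = 100.0  # 1/s, hypothetical; corresponds to a mean dwell time of 10 ms

for n_events in (20, 200, 2000):
    dwells = rng.exponential(scale=1.0 / TRUE_RATE, size=n_events)
    rate, err = fit_rate(dwells)
    print(f"{n_events:5d} events: k = {rate:6.1f} +/- {err:.1f} 1/s")
```

The relative error scales roughly as one over the square root of the number of events, which is why short time traces from many enzymes often have to be pooled.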
The CuNiR example clearly demonstrates the power of the single-
molecule approach for determining the rate constants of a multistep
kinetic scheme. Experiments at different pH values can provide
additional support for the kinetic scheme. Novel information might
further be obtained when immobilizing the enzyme on an electrode
to control the redox potential of the system, an approach that
has already been shown to be successful in ensemble experiments
[74]. This example also highlights the main drawback of the
cofactor reporter system, however. The time traces obtained contain
a far lower number of turnovers than when using fluorogenic
substrates. As a consequence, the waiting times from many enzymes
have to be combined when constructing waiting-time histograms.
This precludes any statistically sound conclusions about dynamic
disorder and memory effects.

3.3.5 Summary
Single-turnover experiments have contributed to a more detailed
understanding of enzyme kinetics on multiple levels. The experiments
with α-chymotrypsin and NiR have helped to clarify substeps
of the kinetic scheme. In the α-chymotrypsin case the fluorogenic
substrate is hydrolyzed in a two-step reaction. It has long been
speculated whether the intermediate is hydrolyzed immediately after it
has been formed and before it can diffuse away from the active site
(intermediate channeling). In ensemble experiments this question
can only be answered if a method is available that allows for measuring
the intermediate and the product concentration independently.
As they cannot be distinguished in a fluorescence experiment,
time-consuming high-performance liquid chromatography (HPLC)
measurements need to be performed that do not yield accurate
kinetic information. In contrast, the two reaction steps can easily
be distinguished in single-molecule experiments making use of
both the fluorescence intensity and the fluorescence lifetime. In
these experiments no evidence was found that would support
channeling.
As for many other redox enzymes, the reaction of NiR follows
multiple steps involving substrate binding and electron transfer. It
has been speculated whether the sequence of these steps is clearly defined
or whether they can occur in a random order. By making clever use of
the optical properties of the cofactor, not only the single-turnover
sequence but also reaction substeps are observed at the single-
molecule level. The duration of these substeps contains the kinetic
information required to test the sequence of events in one catalytic
cycle. The combination of single-molecule and ensemble results
supports the hypothesis that the enzyme can utilize two different
reaction pathways, depending on the substrate concentration and
the solution pH.

The sequence of events is not only crucial for investigating
reaction substeps but also contains information about regulation
events that switch the enzyme into a more or less active state. This
has been demonstrated for the enzyme TLL that is activated upon
contact with a lipid bilayer. A clear correlation is observed between
the overall enzymatic activity and the frequency of interaction
with the bilayer. A detailed kinetic analysis also reveals the rate
constants for both the catalytic reaction and the regulatory process.
Even though regulation events have only been studied for TLL so
far, single-turnover experiments are a powerful method to also
investigate the kinetics of soluble activators and inhibitors. In
contrast to steady-state ensemble experiments that provide only
the equilibrium inhibition constant Ki, single-turnover experiments
have the potential to give direct access to the rate constants of
enzyme activation and inhibition.
The processes described above are directly relevant for the function
of enzymes. Single-turnover time traces have also suggested
that the rate of the rate-limiting step of the enzymatic reaction fluctuates
in time, an observation that does not appear to be functionally
relevant. These fluctuations, termed dynamic disorder, are
characterized by a stretched exponential off-time distribution and
the related memory effect. Considering the frequently rugged energy
landscapes of proteins, dynamic disorder has been explained by
the presence of different enzyme conformations that possess different
rate constants for the catalytic reaction. These observations
have stimulated the interest of experimentalists and theoreticians
who wish to understand the energy landscape of enzymes in more
detail. The existence of alternative reaction pathways is also directly
relevant for enzyme evolution and the laboratory optimization of
enzymes [75, 76] as well as for drug design [77].
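For reference, a stretched exponential off-time distribution is commonly written in the form below. The symbols p_0, τ, and β are notation chosen here for illustration and are not taken from the cited studies.

```latex
p(t) = p_0 \exp\!\left[-\left(\frac{t}{\tau}\right)^{\beta}\right],
\qquad 0 < \beta \le 1
```

With β = 1 the expression reduces to the single exponential expected for one fixed rate constant, whereas β < 1 is usually read as a sign of a distribution of rate constants.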
Dynamic disorder has been observed for CaLB [37, 38], TLL
[33], and α-chymotrypsin [30] in initial experiments. For TLL
and α-chymotrypsin these interpretations have been questioned
in more detailed follow-up experiments [34, 35]. The detailed
analysis performed for α-chymotrypsin suggests that observations
of dynamic disorder are a data analysis artifact [35]. An accurate
assignment of on- and off-times is difficult for data with a low S:N
ratio where the intensity levels of the on- and off-states overlap.
Even though change-point analysis is more objective and can be
considered more reliable from this point of view, it can only identify
the intensity changes accurately if the data are of sufficient quality.
At the current stage it is neither possible to find sufficient support
for dynamic disorder nor possible to unambiguously rule out its
existence.
In this context the limitations of the currently used reporter
systems and measurement setups need to be discussed critically.
For three of the enzymes described above activity is measured
using a fluorogenic substrate. These substrates are ideal reporter
systems as they allow long measurement times. They have a
number of drawbacks, however. They are synthesized using
relatively hydrophobic dyes by coupling the functional group to
the fluorophore unit. Depending on the functional group, the
resulting enzyme substrates might suffer from a low solubility
in aqueous solution, limiting the concentration range that can be
employed for activity measurements. The upper concentration limit
is determined not only by the substrate solubility but also by the
purity of the substrate. As a direct consequence of the synthesis
procedure, the substrate always contains a small amount of the
free fluorophore. Even nanomolar concentrations of the fluorophore
are problematic in single-turnover experiments. The fluorophores
diffuse through the detection volume where they might be mistaken
for enzymatic turnovers or cause an increase in the background
signal. These issues become even more problematic if the substrate
autohydrolyzes, which is especially critical for esterase substrates.
The most frequently used fluorogenic substrates are based on
the xanthene dyes fluorescein and Rhodamine 110 (Rh110). These
substrates carry two functional groups that are cleaved by the
enzyme in two steps. As illustrated for α-chymotrypsin above,
the intermediate and the product possess different fluorescent
properties. Monofunctionalized Rh110 derivatives are fluorescent
[64, 65] and can be detected at the single-molecule level [36].
This is most likely not the case for monofunctionalized fluorescein
derivatives [78–82]. If the formed intermediate cannot be detected,
many turnovers will be missed and no accurate kinetic information
can be obtained from the time traces. Except for α-chymotrypsin
no attention has been given to this problem so far. Substrates
with 1:1 stoichiometry are urgently required. First attempts in this
direction have been made for Rh110-based substrates. One of the
amino groups of Rh110 can be converted into a urea group, for example,
yielding morpholinecarbonyl Rh110 (MC-Rh110) [65, 83]. This
Rh110 derivative shows a relatively high brightness compared
to other monofunctionalized Rh110 derivatives. Considering MC-
Rh110 as the product of the reaction, a peptide can be coupled to
the second amino group, yielding the enzyme substrate (Fig. 3.16a).
Substrates with 1:1 stoichiometry can also be easily synthesized
using the fluorophore Singapore Green (Fig. 3.16b), which is a hybrid
of Rh110 and Tokyo Green. As it carries only one amino group, only
one peptide can be coupled to the fluorophore [84]. Alternatively,
a hydroxymethyl derivative of Rh110 might be used. Monosubstitution
of this derivative with a peptide induces a spirocyclization
reaction yielding a nonfluorescent structure (Fig. 3.16c) [85].

Figure 3.16 Rhodamine-110-based next-generation fluorogenic substrates
with 1:1 stoichiometry. (a) Conversion of one amino group into a urea
group (morpholinecarbonyl Rhodamine 110) yields a monosubstituted
derivative with a high brightness. The second amino group can be used
for coupling the peptide. As the urea group is not cleaved enzymatically,
a monosubstituted enzyme substrate is obtained. (b) The fluorophore
Singapore Green contains only one amino group for coupling the peptide. (c)
Functionalization of hydroxymethyl Rhodamine 110 with only one peptide
leads to spirocyclization of the structure. The spirocyclized form is not
fluorescent as the π-conjugated system of the xanthene unit is absent.
Reprinted from Ref. [82], Copyright (2014), with permission from Elsevier.

Similar strategies are available for fluorescein and its derivative
Tokyo Green [86]. The brightness of monofunctionalized fluoresceins
depends on the redox potential between the xanthene unit
and the benzene ring. The redox potential can be tuned by varying
the substituents on the corresponding ring systems. In this way a
number of monofunctionalized fluorogenic fluorescein derivatives
have been synthesized, among them substrates for β-galactosidase
[86] and alkaline phosphatase [87]. Even though these substrates
show great potential, they have not been tested in single-molecule
experiments so far.
The S:N ratio depends not only on the brightness of the
fluorescent product but also on the size of the detection volume.
The larger the detection volume, the higher the contribution
from scattering and diffusing fluorescent molecules. While scattered
photons can frequently be filtered out, this is not possible
for fluorescence photons originating from product molecules. A
smaller detection volume would reduce the number of diffusing
fluorophores that reside in the detection volume while leaving the
signal from the enzymatic product largely unaffected. Using confocal
microscopy, a smaller detection volume is prohibited by the optical
diffraction limit, however. Near-field optical approaches facilitated
by the use of nanostructures have the potential to overcome these
limitations. Alternatives to optical detection schemes have also
emerged in the last few years. These new developments will be
described in the following.

3.4 New Developments Facilitated by Nanotechnology

Single-molecule fluorescence approaches have clearly proven their
potential for unraveling kinetic details of enzymatic reactions that
are not accessible with conventional ensemble techniques. Before
single-turnover detection can become a broadly applied standard
method, a number of technological limitations have to be overcome.
A key bottleneck is the S:N ratio that can maximally be achieved with
conventional diffraction-limited detection schemes. Nano-optical
approaches directly target this problem. Nanostructures designed to
harness near-field optical phenomena reduce the effective detection
volume and can potentially even enhance the detected fluorescence
signal. Single-turnover detection remains limited to enzymes where
a fluorescence reporter system is available, however. Fluorescence
reporter systems are also highly artificial. The ideal experiment
would allow single-turnover detection using the enzyme’s natural
substrate. Nano-electronic approaches, for example, utilizing carbon
nanotubes (CNTs) as sensors, are currently being developed for the
label-free detection of enzymatic turnovers at the single-molecule
level. Lastly, nanomanipulation methods utilizing the atomic force
microscope (AFM) might contribute in special cases. The applications
of the AFM cover the study of processive enzymes, the
positioning of enzymes in the sensing hotspot, and the mechanical
manipulation of enzymes.

3.4.1 Nano-optical Approaches


In confocal microscopy the size of the detection volume is defined
by the diameter of the confocal volume (xy plane) and the depth
of the focal plane in the z direction. The depth of the focal plane
depends on the size of the pinhole and can be adjusted within a
certain range. The diameter of the detection volume is determined
by the diffraction limit, however, and cannot become smaller than
approximately half of the wavelength of the excitation light. As a
direct consequence, the detection volume in a confocal microscope
cannot become smaller than approximately 1 fL (10⁻¹⁵ L). To ensure
that maximally one fluorophore is present in the detection volume,
the fluorophore concentration cannot exceed a few nanomolar.
Considering, for example, that the fluorogenic substrate contains
approximately 0.1% of fluorescent product molecules, the substrate
concentration cannot exceed a few micromolar. The average KM of
an enzyme lies between 100 μM and 1 mM, however (Fig. 3.17).
Kinetic measurements where the substrate concentration exceeds
the KM value are only possible for a rather small number of enzymes.
Measurements below the KM value might not capture the full kinetic
behavior of the enzyme. In particular, possible dynamic disorder might
be hidden when the diffusion of the substrate to the active site is rate
limiting [32]. An obvious, but technically challenging, solution to this
problem is reducing the size of the effective detection volume.
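The concentration limits quoted above follow from simple arithmetic with Avogadro's number. The sketch below reproduces them for a 1 fL detection volume and the 0.1% impurity fraction used as an example in the text; the function names are hypothetical, and "maximally one fluorophore" is interpreted here as an average occupancy of one molecule.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def max_concentration_molar(volume_l: float, molecules: float = 1.0) -> float:
    """Concentration (mol/L) at which `molecules` are present on average in `volume_l`."""
    return molecules / (AVOGADRO * volume_l)


def max_substrate_molar(volume_l: float, fluorescent_fraction: float) -> float:
    """Substrate concentration (mol/L) at which its fluorescent impurity alone
    contributes one molecule on average to the detection volume."""
    return max_concentration_molar(volume_l) / fluorescent_fraction


CONFOCAL_VOLUME_L = 1e-15    # ~1 fL, from the text
FLUORESCENT_FRACTION = 1e-3  # ~0.1% fluorescent product in the substrate, from the text

fluorophore_limit = max_concentration_molar(CONFOCAL_VOLUME_L)
substrate_limit = max_substrate_molar(CONFOCAL_VOLUME_L, FLUORESCENT_FRACTION)
print(f"fluorophore limit: {fluorophore_limit * 1e9:.2f} nM")
print(f"substrate limit:   {substrate_limit * 1e6:.2f} uM")
```

Since the limit scales inversely with the volume, the same function can be used to estimate how far the accessible substrate concentration rises when the detection volume is reduced.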

Figure 3.17 Distribution of KM values considering all enzyme–substrate
pairs in the BRENDA database (http://www.brenda-enzymes.info/).

In recent years, exciting new developments have achieved a
dramatic reduction in the size of the detection volume using near-field
optical effects. These effects become dominant in nanostructured
materials where the size of the structures is smaller than the
wavelength of the light. The near-field scanning optical microscope
(NSOM) was the first detection technique to make use of these
effects [88, 89]. In NSOM the light needs to pass through a small
hole (aperture) with a diameter of around 50 nm [89]. As the
size of the aperture is smaller than the wavelength, a direct
propagation of the light through the aperture is not possible. Instead,
a nonpropagating evanescent wave is generated that can excite
fluorophores in the direct vicinity of the aperture. Similar to confocal
microscopy, NSOM is a raster scanning technique that is frequently
used for imaging. Its application for measuring enzymatic turnover
sequences is difficult, as the aperture has to be positioned in close
proximity to the enzyme for extended periods of time.
Zero-mode waveguides (ZMWs) make use of the same principle
but are far better suited for the measurement of enzymatic activity.
ZMWs contain an array of nanometer-sized holes in a metallic film
supported on a microscope coverslip (Fig. 3.18a). Their fabrication
usually involves the deposition of a 150- to 400-nm-thick gold or
aluminum layer on the coverslip followed by hole milling using an
electron or focused ion beam [90]. The typical hole size ranges
from 30 to 300 nm. The most attractive feature of ZMWs is that
the nanosized holes simultaneously act as apertures and nanosized
reaction vessels, that is, the enzyme can be directly immobilized
at the bottom of the hole where the excitation intensity is highest
(Fig. 3.18b). Depending on the excitation wavelength, an evanescent
field is generated in the holes of this nanoaperture array that
decays within a few tens of nanometers from the coverslip surface.
For example, when using a 30 nm aperture, an effective detection
volume of approximately 10 zeptoliters (1 zL = 10⁻²¹ L) is obtained,
five orders of magnitude smaller than in a confocal microscope
[91]. ZMWs of this size consequently allow for increasing the
substrate concentration into the physiologically relevant μM–mM
concentration range (Fig. 3.17).

Figure 3.18 Detection of single-enzyme activity in zero-mode waveguide
structures (ZMWs). (a) A typical layout showing a 200-nm-thick aluminum
layer on top of the glass coverslip. The usually cylindrical holes can have a
diameter between 30 and 300 nm. In the example shown the whole array is
illuminated, facilitating the observation of enzymatic activity in many holes
simultaneously. (b) Following passivation of the metal surface, the enzyme
is immobilized on the glass surface where the intensity of the evanescent
field is highest. Single turnovers can be observed using either an area (CCD
camera) or a point detector (APD).

ZMWs not only allow for the use of significantly higher
concentrations of fluorogenic substrates but, impressively, also
facilitate the use of intrinsically fluorescent substrates (Fig. 3.18b).
In a first proof-of-principle experiment, the catalytic activity of
DNA polymerase was observed following the incorporation of
the fluorescently labeled nucleotide coumarin-dCTP [91]. DNA
polymerase was immobilized by nonspecific adsorption in 43-
nm-sized holes fabricated in an aluminum film. At a coumarin-
dCTP concentration of 7.5 μM, fluorescence bursts were detected,
each representing a nucleotide incorporation event followed by
photobleaching of the coumarin label.
This approach was subsequently optimized [92] to yield one
of the most powerful next-generation DNA sequencing techniques
currently available (www.pacificbiosciences.com). The optimization
involved a well-defined protocol facilitating the selective and
site-specific immobilization of DNA polymerase. Biotinylated DNA
polymerase was bound to a streptavidin functionalized glass surface
while the aluminum was passivated against nonspecific adsorption
with a polyphosphonate polymer [93]. More importantly, all four
nucleotides were fluorescently labeled at the terminal phosphate
group. In this way, nucleotide binding resulted in a fluorescence
burst that was terminated by the enzymatic cleavage of the
pyrophosphate and rapid diffusion of the fluorescence label out
of the detection volume. As every nucleotide carried a spectrally
distinct fluorophore, they could be easily discriminated and the
sequence of nucleotide incorporation events could be followed in
real time. Obviously, the resulting time traces contain not only
the desired nucleotide sequence but also detailed kinetic
information about the catalytic reaction of DNA polymerase.
This example impressively demonstrates the power of the ZMW
setup for performing enzyme activity measurements in the presence
of high fluorophore concentrations. The small detection volume
not only increased the S:N ratio to approximately 25:1 [92]
but also, facilitated by a large difference in timescales, allowed
the discrimination of binding (and conversion) events from the
simple diffusion of fluorescent molecules through the detection
volume. In this way the data could be corrected for diffusion events
that are frequently mistaken for enzymatic turnovers in confocal
detection schemes. Moreover, the dramatic increase in the S:N ratio
allowed for the use of coumarin derivatives, which is not possible in
conventional detection schemes [91].
The smaller detection volume is clearly the main advantage
of ZMW structures. In addition, the metal walls defining the
aperture of the ZMW can lead to fluorescence enhancement
[94, 95]. This phenomenon, called metal-enhanced fluorescence,
increases the S:N ratio even further. Metal-enhanced fluorescence
depends on both the wavelength and the material. It originates
from the coupling between the electromagnetic waves of the light
and surface plasmons of the metal. Enhancement is consequently
strongest for metals supporting surface plasmons in the region of
the fluorescence wavelength, such as gold. Using gold apertures with
a diameter of 120 nm, a 12-fold enhancement was observed for
the dye Alexa Fluor 647 at an excitation wavelength of 633 nm. In
contrast, the enhancement factor was only around 3 when using
Rhodamine 6G excited at 488 nm [94].
The presence of the metal in the vicinity of the fluorophore
affects both the excitation and the emission intensities. Coupling
between the fluorophore dipole and the surface plasmon resonance
oscillations in the nearby metal increases the quantum yield and
decreases the fluorescence lifetime. In this way a much larger
number of photons can be collected from a fluorophore placed in
proximity to the metal surface. The effect is biggest for fluorophores
with a low quantum yield [96]. At the same time, the local
excitation intensity is also increased due to resonance between the
excitation light and the surface plasmons [97]. The excitation
enhancement is frequently larger than the emission enhancement
and strictly depends on the geometry of the nanostructure.
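Why the emission enhancement is largest for dyes with a low quantum yield can be seen from a simple rate picture. The sketch below assumes, as a deliberate simplification, that the metal only multiplies the radiative rate by a factor gamma and leaves the nonradiative rate unchanged; metal-induced quenching is ignored, and gamma, the lifetime, and the function name are hypothetical.

```python
def with_radiative_enhancement(phi0: float, tau0_ns: float, gamma: float):
    """Quantum yield and lifetime when only the radiative rate is scaled by gamma.

    phi0, tau0_ns: quantum yield and lifetime of the free fluorophore.
    gamma: hypothetical radiative-rate enhancement factor near the metal.
    """
    k_r = phi0 / tau0_ns            # radiative rate (1/ns)
    k_nr = (1.0 - phi0) / tau0_ns   # nonradiative rate (1/ns), assumed unchanged
    k_total = gamma * k_r + k_nr
    return gamma * k_r / k_total, 1.0 / k_total


for phi0 in (0.05, 0.30, 0.90):
    phi, tau = with_radiative_enhancement(phi0, tau0_ns=1.0, gamma=10.0)
    print(f"phi0 = {phi0:.2f}: phi = {phi:.2f} ({phi / phi0:.1f}x), tau = {tau:.2f} ns")
```

In this picture the quantum yield can never exceed 1, so a dye that is already bright gains little, while the lifetime shortens in every case.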
The hole geometry of the ZMW is far from ideal for achieving
a high excitation enhancement. In contrast, metal nanoparticles
are very efficient structures for enhancing the incident field. They
act as point light sources that efficiently concentrate the local
excitation intensity. The nanoparticle acts like an antenna collecting
the electromagnetic radiation from its surroundings and localizing it
in its direct vicinity. This local concentration of excitation intensity is
even more efficient when using a pair of nanoparticles. Placed at the
right distance, the excitation field is highest in the middle between
the two particles [98]. This geometry with two point sources
mimics the simplest design of a radio wave antenna, and optical
nanostructures inspired by this geometry are frequently called
optical antennas [99]. Using these antenna structures, excitation
enhancements of at least 100-fold have been achieved [96, 100]. One
of the biggest challenges in fabricating optical antennas is their
small dimensions in the nanometer range [99]. A number of different
designs have been tested, and the most promising for biological
applications are shown in Fig. 3.19.
The first antenna structure used for single-molecule fluorescence
detection was a gold bowtie antenna fabricated using electron-
beam lithography (Fig. 3.19a) [96]. The fluorescence enhancement
was quantified using the near-infrared terrylene derivative TPQDI
immobilized in a poly(methyl methacrylate) (PMMA) layer covering
a surface with multiple antenna structures. The dyes were
positioned randomly with respect to the antenna gaps all across
the surface, and the highest enhancement observed for a single
dye located in the gap was 1,340-fold. Fluorophore immobilization
facilitates a straightforward quantification of the enhancement. This
is more complicated with freely diffusing molecules. Illuminated
with a diffraction-limited excitation laser, molecules are not only
excited in the sensor hotspot. They are also exposed to the excitation
laser outside the gap, albeit with a lower excitation intensity.
As the number of molecules in the overall excited volume is far
larger than in the gap region, they contribute to the fluorescence
background, limiting the S:N ratio. To overcome this problem with
freely diffusing molecules, a next-generation design was introduced
that places the optical antenna inside a rectangular hole similar
to the ZMWs described above [100]. In this antenna-in-box design
(Fig. 3.19b) the enhancement effect can be utilized in combination
with a reduced detection volume. Focused ion beam lithography
was used to fabricate the nanostructures, and a fluorescence
enhancement of up to 1,100-fold was observed for freely diffusing
Alexa Fluor 647 molecules. Subsequently, the diffusion of a number
of fluorescently labeled biomolecules was investigated in the gap
region using fluorescence correlation spectroscopy, highlighting that
the experimental setup facilitates the analysis of biological systems.
To eliminate the need for expensive top-down nanofabrication
techniques, an alternative bottom-up approach has been developed.
Two gold nanoparticles were placed in close proximity using a
DNA origami structure that self-assembles on a glass coverslip (Fig.
3.19c) [98]. DNA origami is a very powerful approach for folding
DNA into highly defined and stable 3D structures [101, 102]. DNA
oligonucleotides integrated into the structure can be modified with
functional groups, allowing for the introduction of specific coupling
sites at well-defined positions. In this way, the nanoparticles consti-
tuting the antenna were positioned very precisely. More importantly,
the molecule of interest was placed between the nanoparticles
where the excitation intensity is highest. In a first proof-of-principle
experiment, the DNA origami structure was extended with short
sequences of single-stranded DNA. Association and dissociation of a
complementary ATTO 655–labeled oligonucleotide were monitored,
facilitated by a fluorescence enhancement of 60-fold.

Figure 3.19 Optical antennas designed for fluorescence enhancement. (a)
Bowtie nanoantenna made of gold deposited on a transparent surface; gap
distance between 5 and 80 nm. (b) Antenna-in-box design combining an
antenna structure with a nanoaperture; gap distance between 12 and 40
nm. (c) Antenna fabricated from two gold nanoparticles attached to a
well-defined DNA origami structure; gap distance 23 nm. Reprinted from Ref.
[82], Copyright (2014), with permission from Elsevier.

DNA origami addresses one of the key challenges of current
nano-optical approaches. It is extremely difficult to specifically
immobilize a molecule of interest at a desired position in a
nanostructure that is composed of different materials. The glass and
the gold (or aluminum) surfaces need to be treated with different
chemistries [93] to prevent nonspecific binding and facilitate a site-
specific attachment of the molecule at the desired position. Even
with a carefully optimized immobilization protocol, the yield of
nanostructures that contain a molecule of interest in the ZMW hole
or in the sensor hotspot of an antenna structure is low. When adding
the molecules in solution, the ZMWs are randomly loaded following
a Poisson distribution. With a calculated average concentration of
one enzyme per hole, 37% of the holes contain no enzyme, 37% of
the holes contain one enzyme, 18% contain two enzymes, and 6%
even contain three enzymes. To avoid multiple enzymes per hole,
lower concentrations need to be used, leaving a large number of
holes empty. For the bowtie nanoantenna and the antenna-in-box
design it is even more difficult to place the molecule of interest in the
sensor hotspot. The distance of the molecules from the metal
walls is also important, as the metal interaction will directly influence
the fluorescence intensity [103]. This problem has been addressed
by making use of the positioning accuracy of the AFM (Section
3.4.3) that facilitates the placement of individual molecules with
nanometer precision [103, 104]. An easier and much faster strategy
is the use of self-assembled DNA origami antenna structures where
the nanoparticles and the molecule of interest are automatically
placed at the correct distances with respect to each other.
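The Poisson loading statistics quoted above for the ZMW holes can be reproduced with a few lines of code. This is a minimal sketch; the function names are chosen for illustration only, and the lower mean occupancy of 0.1 enzymes per hole is a hypothetical example of a "lower concentration".

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a hole contains exactly k enzymes at the given mean occupancy."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def occupancy_fractions(mean: float):
    """Return the fractions of holes that are empty, singly, and multiply occupied."""
    empty = poisson_pmf(0, mean)
    single = poisson_pmf(1, mean)
    multiple = 1.0 - empty - single
    return empty, single, multiple


if __name__ == "__main__":
    for mean in (1.0, 0.1):  # 0.1 is a hypothetical lower loading
        empty, single, multiple = occupancy_fractions(mean)
        print(f"mean {mean}: empty {empty:.1%}, single {single:.1%}, multiple {multiple:.1%}")
```

At a mean of one enzyme per hole the values match the percentages given in the text. At the lower loading, multiply occupied holes become rare, but roughly nine out of ten holes remain empty, which is the trade-off described above.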
Overall, both ZMWs and antenna structures have a high po-
tential for improving the S:N ratio of single-molecule fluorescence
measurements. While ZMWs have already evolved into a highly
powerful approach for single-enzyme measurements, this step
still needs to be taken for the antenna structures. Considering
the impressive results so far, nano-optical approaches will greatly
contribute to the field of single-molecule enzymology in the near
future. These new detection schemes will expand the range of
potential enzyme substrates that can be used. They will allow the use
of fluorogenic substrates based on low-quantum-yield fluorophores
and of fluorescent substrates such as DNA nucleotides [91]. This will
ultimately give access to a much larger number of different enzymes
that can be studied at the single-molecule level with fluorescence
techniques.

3.4.2 Nano-electronic Approaches


Optical approaches crucially depend on the availability of fluoro-
genic substrates or other fluorescent reporter systems. Even though
nano-optical approaches expand the range of possible substrates,
the number of easily accessible designs remains small compared to
the diversity of enzymatic reactions. More importantly, the large size
of the fluorophores might alter the binding of the substrate into the
active site and, as a result, the kinetics of the reaction. The ultimate
dream in single-molecule enzymology is to detect the turnover
sequence using the natural enzyme substrate, thereby eliminating
the need for introducing artificial labels. Much effort has been
invested in the development of electronic detection schemes that
are able to sense redox-active or charged analytes such as enzyme
substrates, intermediates, or products [105–109]. To perform such
measurements at the single-molecule level is challenging; just as for
the previously described fluorescence detection schemes, electronic
detection systems are characterized by their own intrinsic noise
[109]. Miniaturized sensors with a high sensitivity are required to
ensure that the signal from a single molecule can be detected.
CNT sensors are ideal candidates due to their small size and
unique chemical structure [105, 108, 110]. CNTs are structurally
related to graphene (Figs. 3.20 and 3.21). In graphene, carbon
atoms are arranged in an sp2-hybridized honeycomb lattice that gives
graphene its unique electronic, mechanical, and optical properties.
This lattice structure is rolled up into a tube in the case of CNTs.
Different configurations are possible to close the tube, and CNTs
can have different structures and diameters. The structure directly
affects the electronic properties, making some CNTs conducting
(metallic) and others semiconducting [108]. Single-walled CNTs
(SWCNTs) consist of one layer only, whereas multiwalled CNTs
(MWCNTs) consist of several layers of tubes often combining tubes
with semiconducting and metallic properties.
SWCNTs have a number of advantages over MWCNTs for single-
molecule sensing applications. In SWCNTs every atom is exposed to
the environment, making them ideal sensors [110]. As they consist of
only one tube, they are either metallic or semiconducting, allowing
their use for different types of applications [105, 108]. Metallic
SWCNTs are characterized by a high charge mobility. Combined with
their large surface-to-volume ratio, their extraordinary conductivity
and electron transfer rates make them ideal electrode materials
for electrochemical applications [108, 114]. Their functionality has
been demonstrated for a large number of analytes; however, single-
molecule sensitivity has not been reached so far. Most likely, the
detection of single-electron transfer reactions is hidden in the noise
of the measurement, as the CNT simultaneously interacts with a
large number of solvent molecules in its environment. In contrast,
sensors based on semiconducting SWCNTs have recently reached the
single-molecule level [111, 115, 116].
Semiconducting SWCNTs allow for the fabrication of transistor-
like devices (Fig. 3.20a) [117]. In these CNT field-effect transistors
(CNT–FETs) one CNT is directly connected to two electrodes. In
this way a voltage is applied across the CNT and the current
through the CNT is measured. As the CNT is semiconducting,
no current is flowing through the CNT without manipulating the
charge environment of the CNT. If, on the other hand, the charge
environment surrounding the CNT is altered, electrons or holes are
injected into the CNT. These charge carriers can freely move on
the CNT lattice, and a current is detected. This can, for example,
be achieved by altering the solution potential between the CNT
and a so-called liquid gate electrode. The application of a positive
gate voltage injects electrons into the CNT, whereas a negative gate
voltage generates holes as charge carriers.
Charged biomolecules bound to the CNT can alter the local
CNT charge environment, thereby altering CNT conductivity [110].
Similar to electrochemical CNT sensors, biosensors can be designed
that are able to specifically detect the binding of analytes to
functionalized CNTs. CNT–FETs have been used for the detection of a
broad range of biological analytes making use of DNA hybridization
or antibody–antigen interactions [105]. The first demonstration
that enzyme activity can be observed with a CNT–FET device was
reported by Dekker and coworkers using the enzyme glucose oxidase
[118]. The CNT–FET was functionalized with a small number of
approximately 50 enzyme molecules and a clear change in CNT–FET
conductivity was observed when the substrate glucose was added.
The first successful single-turnover experiment using a CNT–
FET device was performed with the enzyme lysozyme from phage
T4 [111]. T4 lysozyme is an 18.6 kDa enzyme that hydrolyzes the
peptidoglycan of bacterial cell walls. Its active site is located between
two domains that open and close during the catalytic reaction. A
single lysozyme molecule was immobilized on the CNT–FET using
the linker molecule pyrene-maleimide (Fig. 3.20b). Pyrene is a
frequently used linker for the noncovalent functionalization of CNTs
[119]. In polar solvents it forms a strong π–π stacking interaction
with the CNT. The lysozyme was modified at the genetic level to
introduce a cysteine at a specific position (Ser90 → Cys). This
mutation allows for the site-specific coupling of the enzyme to the
maleimide functional group of the pyrene linker. Pyrene not only
provides an easy method for attaching proteins to CNTs. More importantly,
it does not alter the CNT structure chemically, so that the unique
electronic properties of the CNT are maintained.

Figure 3.20 Experimental setup for measuring enzymatic activity on a
CNT–FET. (a) FET layout. The CNT is connected between the source and
the drain electrodes. When a voltage Vsd is applied, the current I flowing
through the CNT can be measured. Using a third electrode, the gate
electrode, the solution potential Vgate is altered. The resulting change in
CNT conductivity is read out as the corresponding change in the current.
When an enzyme is immobilized on the CNT, the charges on the protein
surface also influence the electrostatic environment of the CNT, affecting
its conductivity. If the electrostatic environment changes as a function of
the catalytic reaction, enzyme activity can be detected as a change in the
current. (b) Enzyme immobilization. The CNT can be coated with the dye
molecule pyrene that forms strong π–π interactions with the CNT lattice.
Pyrene can be modified with functional groups such as reactive esters or
maleimide, allowing for the coupling of proteins via their amino or thiol
groups, respectively. Shown is pyrene-maleimide that has been used for the
site-specific attachment of lysozyme in Refs. [111–113] after mutating a
surface-exposed serine into cysteine (S90C).

When bacterial cell wall particles were added to the lysozyme
functionalized CNT–FET, the current started switching between two
levels (Fig. 3.21a). A number of control experiments, for example,
with two inactive mutants, confirmed that this switching behavior
is the result of enzymatic activity [111, 112]. The resulting time
traces (Fig. 3.21a) can be analyzed in a similar way to fluorescence
time traces, for example, using a threshold. A detailed analysis
of lysozyme activity time traces revealed two distinct phases
characterized by fast (284 s⁻¹) and slow (24 s⁻¹) switching events.
This observation might at first sight provide supporting evidence
for different conformational states that catalyze the reaction with
different rates. In the case of lysozyme, however, these rate fluctuations
originated from the heterogeneous nature of the substrate.
Peptidoglycan is a crosslinked polymeric material that is processively
hydrolyzed by the enzyme. Whenever the enzyme is working on the
same polymer strand it catalyzes the reaction with a given rate that
is well described by a monoexponential waiting-time distribution
[111]. The processivity is disturbed when the enzyme arrives at a
crosslink where it needs to reorient and find a new polymer strand
[112].

Figure 3.21 Detection of local charge fluctuations resulting from the
enzymatic reaction. (a) Turnover time trace showing the current
fluctuations resulting from the enzymatic activity of lysozyme. The high-current
state corresponds to an enzymatic turnover, while the low-current state
represents the enzyme without bound substrate. (b) Position of the charged
amino acids lysine 83 (K83) and arginine 119 (R119) relative to the
attachment site. During the catalytic reaction, the enzyme performs a
hinge-bending motion accompanied by a significant movement of the side chains
of K83 and R119. Reprinted with permission from Ref. [113]. Copyright
2013 American Chemical Society.

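The threshold analysis of a two-level current trace mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in Refs. [111, 112]: the function names, the threshold value, and the toy trace are hypothetical, and the rate estimate (the inverse of the mean dwell time) is the standard maximum-likelihood estimate for a monoexponential waiting-time distribution.

```python
from statistics import mean


def dwell_times(current, threshold, dt):
    """Split a current trace into dwell times (in seconds) above and below a threshold.

    The first and last dwell are discarded because they are cut off by the
    start and end of the recording, so their true durations are unknown.
    """
    runs = []  # each entry is [is_high, number_of_samples]
    for value in current:
        is_high = value > threshold
        if runs and runs[-1][0] == is_high:
            runs[-1][1] += 1
        else:
            runs.append([is_high, 1])
    complete = runs[1:-1]
    high = [n * dt for is_high, n in complete if is_high]
    low = [n * dt for is_high, n in complete if not is_high]
    return high, low


def rate_from_dwells(dwells):
    """Rate (1/s) of a monoexponential waiting-time distribution: 1 / mean dwell time."""
    return 1.0 / mean(dwells)


if __name__ == "__main__":
    # Toy trace in arbitrary current units, sampled every millisecond (hypothetical).
    trace = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1]
    high, low = dwell_times(trace, threshold=0.5, dt=1e-3)
    print("high-state dwells (s):", high)
    print("low-state dwells (s):", low)
    print("rate from high-state dwells (1/s):", rate_from_dwells(high))
```

For real data, the dwell-time histogram should be inspected before a single rate is quoted, as a mixture of fast and slow phases such as the one described for lysozyme is not captured by one mean value.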
Following these initial experiments with lysozyme, single-
turnover experiments have subsequently been performed using the
catalytic domain of the enzyme cAMP-dependent protein kinase
A (PKA) [120] and the Klenow fragment of DNA polymerase I
[121]. For PKA, three current levels were detected that most
likely represent different substeps of the enzymatic reaction. PKA
binds the two substrates ATP and the model peptide Kemptide
sequentially, indicating that CNT–FETs are powerful tools for
following reaction substeps in real time. For DNA polymerase,
incorporation of the bases A, T, and G/C could be distinguished based
on different current levels, albeit with a low S:N ratio. Further,
different rates were observed for the formation of G/C or A/T base
pairs.
Despite impressive progress, a rational design of an enzyme CNT–
FET is not yet possible, as the detection mechanism is not fully
understood. In general, three different sensing mechanisms might
contribute to the detected signal from a biological point of view.
First, semiconducting CNTs might also be able to participate in electron
transfer reactions where an electron of a redox reaction is exchanged
with the CNT. Alternatively, the binding of a charged substrate and
the subsequent release of a charged product might temporarily
alter the charge environment of the CNT and consequently its
conductivity [113, 122]. Lastly, charged amino acids on the enzyme
surface might move as the result of a conformational change.
The resulting redistribution of charged amino acids relative to
the attachment site on the CNT might cause a change in CNT
conductivity [113].
Considering the experiments described above, a redox sensing
mechanism can be directly ruled out as none of the measured
enzymes catalyzes a redox reaction. The other two sensing
mechanisms are more difficult to distinguish experimentally. Ex-
periments using different attachment sites have shown that not
all orientations of the enzyme on the CNT lead to the desired
current fluctuations, highlighting the importance of the attachment
site [113]. Different attachment sites are characterized by different
local charge environments. But at the same time the active site is
positioned differently relative to the CNT surface. Using lysozyme
as a model system, the charge environment around the attachment
site (C90) was investigated in more detail. It was found that two
charged amino acids (K83 and R119) are located in proximity to
the attachment site (Fig. 3.21b). More importantly, the side chain
positions of these amino acids were significantly different in the
open and the closed conformations of the enzyme. This suggests that
two positive charges are switching between different positions as a
function of the catalytic hinge-bending motion. A series of mutants
was investigated where the positively charged K83 and R119 were
mutated to neutral alanine or negatively charged glutamate residues
either individually or in combination. The corresponding CNT–FET
devices all showed a clear switching behavior in the presence of
substrate, however, with a different magnitude and sign of the
current with respect to the baseline. This systematic experiment
clearly shows that the CNT–FET is sensitive to the number and
the nature of charges in its vicinity, supporting the hypothesis that
the relocation of charged amino acids is crucial for the sensing
mechanism.

Figure 3.22 Analysis of processive enzymes with AFM force spectroscopy.
(a) Experimental setup for measuring the catalytic activity of the enzyme
dextran sucrase (DSase). The enzyme is coupled to the surface, and a short
piece of dextran polymer is coupled to the cantilever. (b) The incorporation
of new sugar monomers increases the length of the dextran molecule. This
length increase can be extracted from the measured force–distance curves.

These results clearly suggest that substrate conversion is not
detected directly but that the CNT–FET senses conformational
changes that occur during the catalytic reaction. These experiments
are consequently similar to FRET-based detection schemes that
are designed for the detection of conformational changes [41].
Even though there is often a clear correlation between substrate
turnover and the conformational change, it cannot be excluded
that unproductive binding events might also involve conformational
changes that lead to detectable changes in the signal. In fact, the fast
phase observed for lysozyme contains many such unproductive
events while the enzyme is searching for a new polymer strand.
Even though some clues can be learned from the mutational
analysis of lysozyme [113], many questions remain. How close
do the charged amino acids have to be to the CNT surface or
the attachment point at a given salt concentration? How much
do the charged amino acids have to move to yield a detectable
signal? How exactly does the CNT sense changes in its charge
environment? Is it possible to engineer charges into proteins that
allow for sensing a conformational change? Despite their huge
potential, CNT–FET detection schemes might eventually only be
useful for enzymes that undergo large conformational changes such
as the well-characterized hinge-bending motion of lysozyme. More
systematic studies are needed to investigate the application range
and the limitations of this novel and promising detection
scheme. Only once the above questions are answered will it become
possible to choose the best attachment point, to optimize the S:N
ratio and to relate the behavior of the enzyme to the detected signal.
In parallel to the development of CNT–FETs, other detection
schemes are being developed that allow for the measurement
of electronic or electrochemical signals, partially directly related
to the catalytic reaction. For example, many efforts are directed
at the detection of redox reactions in single enzymes using
miniaturized nano-electrodes [107]. Further, single-molecule DNA
sequencing in hemolysin nanopores is an impressive example of how
individual enzymatic turnovers can be sensed electronically. One
implementation of nanopore sequencing uses an exonuclease that is
coupled to the nanopore [106]. The cleaved bases are translocated
through the nanopore one by one where they cause a corresponding
change in the current flowing through the nanopore. This approach
is easily adapted to other (processive) enzymes that release product
molecules for translocation through the nanopore. Partially driven
by the desire to sequence individual DNA molecules, many other
detection schemes are currently being developed and will extend the
possibilities for studying single enzymes electronically.

3.4.3 Nanomechanical Approaches


In addition to the nano-optical and nano-electronic approaches
described above, nanomechanical approaches complement the
single-molecule toolkit in a number of ways. Nanomechanical
techniques, mostly the AFM, can be used for a wide range of different
applications aimed at studying enzymatic turnover reactions. The
main component of the AFM is a cantilever containing a small tip
with a typical radius of 10–15 nm. When the tip is brought close
to a surface, interactions with the surface exert a force on the
cantilever, thereby bending it. Bending of the cantilever is detected
using a laser beam focused on the back of the cantilever. Upon
bending, the cantilever deflects the laser beam so that it reaches the
photodetector in a different position. The bending of the cantilever
is directly proportional to the force acting on it.
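The proportionality between cantilever bending and force is Hooke's law, F = k · d, where k is the spring constant of the cantilever and d its deflection. The sketch below converts a photodetector signal into a force. It assumes that the deflection has been calibrated as a sensitivity in nm per volt, which is a common practice but is not described in the text. All numerical values and function names are hypothetical.

```python
def deflection_nm(detector_signal_v: float, sensitivity_nm_per_v: float) -> float:
    """Convert the photodetector signal (V) into a cantilever deflection (nm)."""
    return detector_signal_v * sensitivity_nm_per_v


def force_pn(deflection: float, spring_constant_pn_per_nm: float) -> float:
    """Hooke's law: force (pN) = spring constant (pN/nm) * deflection (nm)."""
    return spring_constant_pn_per_nm * deflection


if __name__ == "__main__":
    signal = 0.12          # V, hypothetical detector reading
    sensitivity = 50.0     # nm/V, assumed calibration value
    spring_constant = 10.0 # pN/nm (equal to 0.01 N/m), assumed soft cantilever

    d = deflection_nm(signal, sensitivity)
    print(f"deflection: {d:.1f} nm, force: {force_pn(d, spring_constant):.0f} pN")
```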
The AFM is used in two fundamentally different operation
modes [123, 124]. It can be used for scanning the topography of
a surface (imaging mode) or for measuring the exact force the
cantilever experiences at one specific position on the surface (force
spectroscopy mode). In imaging mode the cantilever is moved
relative to the surface in the xy direction. In the z direction the
distance between the cantilever tip and the surface is regulated
with a feedback loop to maintain a constant force between the tip
and the surface (contact mode imaging). Features on the surface
with different height consequently require an adjustment of the
cantilever position. The height information, representing the sample
topography, can be directly extracted from the feedback signal. In
contact mode, the tip touches the surface and is dragged across it. This
can be damaging to soft biological samples. Other, less destructive,
imaging techniques such as tapping mode have been developed
as an alternative. AFM imaging has, for example, been used for
following the catalytic reaction of single phospholipase enzymes.
A phospholipid layer was deposited on a flat mica surface and the
enzymes were added to the sample in different concentrations. The
enzymes cleaved the phospholipids on the surface, causing them to
dissociate. Enzymatic activity consequently produced surface areas
without phospholipids that were characterized by a lower height. At
very low enzyme concentrations, the tracks of single enzymes could
be followed as they removed the phospholipids from the surface
[125].
In force spectroscopy mode the cantilever is only moved in the
z direction. It is used for measuring the binding forces of molecular
interactions. In a typical experiment the cantilever is functionalized
with a ligand and the corresponding receptor is attached to the
surface. When the cantilever is brought into contact with the
surface, the ligand binds to the receptor. Subsequently, the cantilever
is retracted from the surface. The receptor–ligand interaction
experiences an increasing force, thereby bending the cantilever.
When the interaction cannot withstand the force anymore, it breaks
and the cantilever returns to its resting position. This type of
experiment yields the rupture force of the interaction as well as
information about the length of the molecules that have been
stretched. Typically force spectroscopy is used for the measurement
of bond dissociation events but rarely for enzymatic reactions. It is
an interesting approach, however, to determine the turnover rate of
processive enzymes, which is often difficult to access.
Using force spectroscopy, the rate of the processive enzyme
dextran sucrase (DSase) was determined (Fig. 3.22) [126]. DSase
cleaves sucrose and links the released D-glucose monomer to
a dextran polymer, thereby increasing the polymer length. The
dextran polymer remains bound to the enzyme in all steps of the
reaction so that the dextran–DSase interaction could be probed
with force spectroscopy. The length of the growing dextran chain
was extracted from the distance information in the obtained force–
distance curves. Knowing the monomer length, the number of
incorporated monomers was determined from the length increase
between two time intervals. In this way, the rate of the reaction was
obtained directly, even though single-turnover resolution was not
possible.
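The rate determination described above reduces to simple arithmetic: the length increase divided by the monomer length gives the number of incorporated monomers, and dividing by the elapsed time gives the rate. The following is a minimal sketch; the function names are illustrative and the numerical values, including the monomer length, are hypothetical placeholders rather than data from Ref. [126].

```python
def monomers_added(length_start_nm: float, length_end_nm: float,
                   monomer_length_nm: float) -> float:
    """Number of monomers incorporated, from the contour-length increase."""
    return (length_end_nm - length_start_nm) / monomer_length_nm


def turnover_rate(length_start_nm: float, length_end_nm: float,
                  monomer_length_nm: float, interval_s: float) -> float:
    """Average turnover rate (monomers per second) between two measurements."""
    return monomers_added(length_start_nm, length_end_nm, monomer_length_nm) / interval_s


if __name__ == "__main__":
    # Hypothetical example: the chain grows from 100 nm to 150 nm within 20 s,
    # with an assumed monomer length of 0.5 nm.
    rate = turnover_rate(100.0, 150.0, monomer_length_nm=0.5, interval_s=20.0)
    print(f"average rate: {rate:.1f} monomers per second")
```

The result is an average over the time interval, which is consistent with the statement that single-turnover resolution is not reached with this approach.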
The above examples show that the AFM is a powerful tool
for studying processive enzymes using both its imaging and force
spectroscopy modes. Besides single-molecule studies of processive
enzymes, the AFM has also been used as a nanomanipulation tool. It
has for example been utilized for positioning single molecules on a
surface with a very high accuracy in the low nanometer range. Using
the AFM, a single molecule can specifically be picked up in a depot
area containing a high density of molecules. The cantilever-bound
molecule can then be transported to the desired surface area where
it is released [104, 127]. This strategy has, for example, been used
for placing single titin kinase molecules in a hole of a zero-mode
waveguide structure for further analysis [127]. In the future, the
same strategy can be used for placing single-enzyme molecules in
the sensor hotspot of nanoantennas, thereby significantly increasing
the number of functional nanostructures.
The AFM can not only enable more controlled single-molecule
fluorescence experiments by placing an enzyme in the proper
position in a nanostructure; it can further be combined with
fluorescence detection to obtain additional information about a
molecular system. The AFM can, for example, be used to stretch
an enzyme mechanically while the catalytic reaction is followed
using single-molecule fluorescence detection. The applied force is
known to alter the energy landscape of the enzyme. In this way
different enzyme conformations can be stabilized and their activities
can be monitored using a fluorogenic substrate. This strategy
provides valuable information about the energy landscape of the
enzyme, even though the structures of these conformations are not
accessible.

Figure 3.23 Combination of AFM force spectroscopy with fluorescence
detection. (a) Experimental setup. The enzyme, containing a GCN4 peptide
sequence fused to its N-terminus, is immobilized to the glass coverslip.
The AFM cantilever is functionalized with an antibody specific for the
GCN4 peptide. This antibody–antigen interaction allows for stretching
the enzyme up to a certain threshold force. The response of the enzyme to
the applied force is monitored using a fluorogenic substrate. (b) Overlay of
the force (black) and fluorescence (gray) time trace. Negative forces indicate
contact of the cantilever with the surface (approach). Upon retraction
of the cantilever, the force rises above zero, providing evidence that an
interaction between the molecules attached to the cantilever and the surface
is established. Once the threshold force is reached, the antibody–antigen
interactions rupture and the measured force returns to zero. After a certain
waiting time, an increase in the fluorescence signal is observed that has
originated from enzymatic activity.

This strategy was used for investigating the enzyme CaLB
in a first proof-of-principle experiment that was designed to
validate the experimental setup (Fig. 3.23a) [39]. Making use
of a C-terminal cysteine, the enzyme was coupled covalently to
the surface of a maleimide-functionalized glass coverslip. To be
able to apply a force onto the enzyme, it further needs to be
connected to the AFM cantilever. An antibody–antigen interaction
was used for this purpose. The enzyme carried a GCN4-derived
peptide that was fused to the N-terminus at the genetic level. The
cantilever was functionalized with an agarose bead that contained
the corresponding anti-GCN4 antibody. In this way the cantilever
contained a large number of antibodies so that many interactions
could be formed when the cantilever was brought into contact with
the surface. This strategy did not only allow for the highly efficient
formation of many interactions with the GCN4–CaLB fusion proteins
but also provided a threshold force (60 pN) that could maximally act
on the enzyme when the cantilever was retracted from the surface. In
this way unfolding of the enzyme was prevented.
The reversibility and specificity of the antibody–antigen interac-
tion further allowed for a sequence of approach and retract cycles
where the enzyme was picked up by the cantilever, stretched, and
released. Using the fluorogenic substrate 5-(6-)-carboxyfluorescein
diacetate (CFDA), the fluorescence signal was monitored during
these cycles using a total internal reflection fluorescence (TIRF)
microscope. In many of these cycles a higher fluorescence intensity
was observed following a rupture event detected with the AFM
(Fig. 3.23b). This higher intensity was observed roughly 1.6 s after
the enzyme had been released from the cantilever. Most likely the
enzyme was stretched into an inactive conformation by the applied
force. Once it was released from the cantilever, it rearranged into
its equilibrium conformation. This rearrangement involved multiple
steps representing different conformational states. One of these
conformations was catalytically active and is responsible for the
observed activity. Interestingly, the equilibrium conformation of
CaLB is not very active when CFDA is used as a substrate. Most likely
this bulky substrate does not fit well into the active site. The force-
induced catalytically active conformation might possess a more open
active site, facilitating easier access of the substrate.
Even though experimentally challenging, the above results with
CaLB clearly highlight the potential of combined single-molecule
force and fluorescence measurements. Such measurements have a
high potential for studying enzymatic reactions that are naturally
controlled by mechanical stimuli. One example is the aforementioned
protein titin kinase [127]. This protein has a regulatory
function in the muscle where it detects the force acting along
the muscle protein titin. It has a so-called cryptic catalytic site
that is only active once the protein is partially unfolded by the
applied force. Many similar, and largely unexplored, mechanical
regulation mechanisms exist in nature that can be characterized
with nanomechanical techniques combined with single-molecule
fluorescence detection [128, 129]. Mechanical stress might also
influence enzymatic activity in bioreactors so that understanding of
these parameters is of high fundamental interest in many fields of
research.

3.4.4 Summary
Recent developments in nanotechnology have yielded a wide range
of optical, electrical, and mechanical techniques that can improve
our understanding of enzymes. Single-molecule fluorescence tech-
niques have already proven their power for investigating the kinetics
of enzymes, including the detection of reaction intermediates.
Despite huge progress in the field, diffraction-limited detection
schemes, such as confocal microscopy, suffer from a number of
drawbacks. One key limitation is the relatively large detection
volume that causes low S:N ratios as well as artifacts from product
molecules entering the detection volume. This problem can be
overcome with nano-optical and nano-electronic approaches that
are both characterized by significantly smaller detection volumes.
Nano-optical approaches utilize nanostructures for confining the
excitation light in tiny holes or in the gap of a nanoantenna, thereby
reducing the size of the detection volume at least 100-fold. The
improved S:N ratio justifies the increased effort of fabricating the
nanostructures. Currently, no general solution has been found to
immobilize an enzyme in a defined position either in the ZMW
holes or in the antenna hotspot. One way to overcome this problem is
to use the AFM to transport a single molecule to a defined area on a
surface with nanometer precision.
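
To make the benefit of a smaller detection volume concrete, the following minimal sketch estimates the mean number of fluorescent molecules present in the detection volume at a given concentration, N = c * N_A * V. The confocal volume of 1 fL is an assumed order of magnitude and is not taken from this chapter; the 100-fold reduction is the lower bound stated above. All function and variable names are illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)

# Assumed values, for illustration only.
CONFOCAL_VOLUME_L = 1e-15                  # about 1 fL, assumed confocal detection volume
NANO_VOLUME_L = CONFOCAL_VOLUME_L / 100.0  # 100-fold smaller, as stated in the text


def mean_occupancy(concentration_molar, volume_liters):
    """Mean number of molecules in the detection volume: N = c * N_A * V."""
    return concentration_molar * AVOGADRO * volume_liters


def max_concentration(volume_liters, max_molecules=1.0):
    """Highest concentration (in M) that keeps the mean occupancy at or below max_molecules."""
    return max_molecules / (AVOGADRO * volume_liters)


if __name__ == "__main__":
    for label, volume in (("confocal", CONFOCAL_VOLUME_L),
                          ("nanostructure", NANO_VOLUME_L)):
        n = mean_occupancy(1e-6, volume)  # 1 micromolar fluorophore
        c_max = max_concentration(volume)
        print(f"{label}: {n:.2f} molecules at 1 uM, "
              f"single occupancy up to {c_max:.2e} M")
```

Under these assumptions, the concentration at which on average one background molecule resides in the detection volume scales inversely with the volume, so a 100-fold smaller volume tolerates a 100-fold higher substrate or product concentration.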
Nano-electronic techniques do not require a fluorescent reporter
system but allow the use of natural, unmodified enzyme substrates.
CNT–FETs only sense charge fluctuations very close to their surface,
indicating that conformational changes are detected instead of
substrate binding and product release. This makes the S:N ratio
independent of the presence of substrate and product molecules,
enabling a larger range of substrate concentrations to be used
in the measurements. This is a clear advantage of electronic
measurements. On the other hand, they do not give direct access
to the rate of the enzymatic reaction. It would be very interesting
to combine a CNT–FET experiment with a fluorescence experiment
using a fluorogenic substrate. In this way the conformational
changes could be directly correlated with enzymatic turnover events
and productive and nonproductive conformational changes could be
identified.
Nanomechanical approaches are a powerful addition, especially
when combined with fluorescence detection. Mechanical influences
on enzymes have long been ignored due to the lack of proper
characterization methods. In recent years, not only the AFM but also
other mechanical techniques such as optical tweezers and magnetic
tweezers have been combined with single-molecule fluorescence
detection [128]. With these new possibilities of combining force
and fluorescence measurements, new experimental strategies are
emerging for systematically studying possible correlations between
force and enzyme regulation. Because mechanical effects
are readily investigated in MD simulations, the corresponding
structural changes can be visualized directly, providing structural
insight. It is expected that insights into the mechanical properties of
protein structures can be used to explain conformational dynamics
and allosteric effects.
A general trend toward combining different techniques can be observed. The
combined AFM-fluorescence experiment aimed at studying CaLB
is one example. The proposed combination of a CNT–FET with a
fluorescence readout is another interesting approach that allows
for correlating two different properties of an enzyme with the
goal of learning more about its function. Clearly electrochemical
approaches will also benefit from a combination with fluorescence
detection. Many enzyme cofactors such as FAD or FMN are
fluorescent and redox switching of the cofactor can be detected
as a change in its fluorescence. As single-electron transfer events
cannot (yet) be read out electronically, the state of the cofactor can
instead be detected using single-molecule fluorescence detection
[130, 131]. Clearly, developments in nanotechnology will open up
many new possibilities for studying enzyme kinetics and dynamics
at the single-molecule level.

3.5 Conclusion

Conformational changes and multistep kinetic schemes are intrinsically
linked on a multidimensional energy landscape. Every new step
along the reaction coordinate might be linked to a different enzyme
conformation. In addition, off-pathway conformations might be
involved in the regulation of enzyme activity or facilitate alternative
reaction pathways. Dynamic disorder is a direct consequence of
alternative reaction pathways if these pathways possess different
rate-limiting steps. Despite impressive progress in the field, it
remains difficult to determine the kinetics of conformational
changes along and perpendicular to the reaction coordinate. Single-
molecule fluorescence detection is a powerful strategy to obtain this
kinetic information, as it allows for following the catalytic reaction of
a single enzyme in real time. Making use of fluorogenic substrates or
the optical properties of redox cofactors, single-turnover time traces
can be obtained that contain the desired kinetic information.
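
As a minimal sketch of how kinetic information can be read from such a time trace, the example below assumes that the turnover events have already been identified (for example by thresholding or change-point analysis) and are available as a sorted list of time stamps. It computes the waiting times between consecutive events and, under the additional assumption of a single rate-limiting step (a single-exponential waiting-time distribution), the maximum-likelihood estimate of the rate constant, k = 1/<tau>. The function names and the example time stamps are hypothetical.

```python
def waiting_times(event_times):
    """Return the intervals between consecutive turnover events."""
    pairs = list(zip(event_times, event_times[1:]))
    if any(later < earlier for earlier, later in pairs):
        raise ValueError("event times must be sorted in ascending order")
    return [later - earlier for earlier, later in pairs]


def rate_constant_mle(taus):
    """Maximum-likelihood rate for exponentially distributed waiting times.

    For an exponential distribution the estimate is k = 1 / mean(tau).
    """
    if not taus:
        raise ValueError("at least one waiting time is required")
    return len(taus) / sum(taus)


if __name__ == "__main__":
    # Hypothetical turnover time stamps in seconds.
    events = [0.0, 0.5, 1.5, 2.0, 4.0]
    taus = waiting_times(events)
    print("waiting times (s):", taus)
    print("estimated rate constant (1/s):", rate_constant_mle(taus))
```

For multistep schemes the waiting-time distribution is no longer single exponential, so this estimate should be read only as an apparent rate.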
The experiments summarized for CaLB, T. lanuginosus lipase, α-
chymotrypsin, and NiR demonstrate the current status of the field.
They impressively show that single-turnover detection provides
access to the kinetic constants and the kinetic schemes of multistep
reaction sequences and regulatory processes. The experimental
designs are powerful but also have a number of limitations
hampering the accurate detection of the turnover sequence. It is
often possible to clearly identify a small number of states with
distinctly different rate constants. Dynamic disorder, on the other
hand, is characterized by a large number of conformational states
with an almost continuous spectrum of rate constants. Better-
quality data with a higher S:N ratio is required before any definitive
conclusions about dynamic disorder can be drawn, either ruling out
or confirming its contributions to the catalytic reaction.
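
The difference between a single rate constant and a spectrum of rate constants can be illustrated with a simple statistic of the waiting-time distribution. The sketch below assumes that each waiting time is drawn from one of several conformational states, each with its own single-exponential rate constant (a mixture of exponentials). This is a simplifying assumption, not the analysis used in the studies cited in this chapter. For such a mixture the squared coefficient of variation, CV^2 = Var(tau)/<tau>^2, equals 1 for a single state and exceeds 1 as soon as the rate constants differ. Names and numbers are illustrative.

```python
def waiting_time_cv2(rates, weights):
    """Squared coefficient of variation of a mixture of exponentials.

    rates   -- rate constant of each state (1/s), all positive
    weights -- relative probability that a turnover occurs from each state
    """
    if len(rates) != len(weights) or not rates:
        raise ValueError("rates and weights must be non-empty and of equal length")
    if any(k <= 0 for k in rates) or any(w < 0 for w in weights):
        raise ValueError("rates must be positive and weights non-negative")
    total = sum(weights)
    probs = [w / total for w in weights]
    # First and second moments of an exponential: 1/k and 2/k**2.
    mean = sum(p / k for p, k in zip(probs, rates))
    second_moment = sum(2.0 * p / k ** 2 for p, k in zip(probs, rates))
    return (second_moment - mean ** 2) / mean ** 2


if __name__ == "__main__":
    print("single state:", waiting_time_cv2([5.0], [1.0]))
    print("two states:  ", waiting_time_cv2([1.0, 10.0], [1.0, 1.0]))
```

A value above 1 is consistent with heterogeneous rate constants but does not prove dynamic disorder on its own: sequential multistep kinetics push the value below 1, and missed or false events in noisy data distort it in either direction, which is one reason why a higher S:N ratio matters.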
Recent developments in nanotechnology can directly address
the most important bottlenecks in single-turnover experiments.
Both nano-optical and nano-electronic detection schemes facilitate
more accurate turnover detection. They further expand the range
of useful enzyme substrates so that a wider variety of enzymes can
be studied. The development of new detection schemes is largely
driven by the desire to sequence single DNA molecules. Many
emerging detection schemes such as zero-mode waveguides and
nanopores were originally developed with the aim of following the
activity of single molecules of DNA polymerase. It is obvious that
these now well-established DNA sequencing methods will also find
application for other enzymes and that other new techniques will
emerge in the future.
These new developments will soon be implemented for studying
many more enzymes at the single-molecule level and ultimately shed
light on the question of whether enzymatic reactions show dynamic disorder.
To really understand the complex energy landscape of enzymes,
methods will have to be implemented to manipulate the energy
landscape. Investigating mutations and ligand binding are the most
obvious strategies to test how these parameters affect the kinetic
scheme. Other strategies might involve the mechanical manipulation
of the enzyme structure or alterations of the microenvironment
by immobilization or the addition of crowding agents. We still
know little about how the structure of an enzyme determines
its dynamics and how dynamic processes affect catalysis. Single-
enzyme experiments are one approach to contribute to this
important direction of enzymology.

Acknowledgments

The authors thank Petri Turunen for help with preparing Fig. 3.6,
as well as Turunen and Emilia Grad for critically reading the
chapter. This work was funded by the Netherlands Organization
for Scientific Research (NWO; VICI (AER) and VIDI (KB) grants),
the Human Frontier Science Program (HFSP), the Foundation for
Fundamental Research on Matter (FOM), and the Dutch National
Research School Combination Catalysis Controlled by Chemical
Design (NRSC Catalysis).

References

1. Wolfenden, R., and Snider, M. J. (2001). The depth of chemical time and
the power of enzymes as catalysts, Acc. Chem. Res., 34, pp. 938–945.
2. Olsson, M. H. M., Parson, W. W., and Warshel, A. (2006). Dynamical con-
tributions to enzyme catalysis: critical tests of a popular hypothesis,
Chem. Rev., 106, pp. 1737–1756.
3. Boehr, D. D., Dyson, H. J., and Wright, P. E. (2006). An NMR perspective
on enzyme dynamics, Chem. Rev., 106, pp. 3055–3079.
4. Callender, R., and Dyer, R. B. (2006). Advances in time-resolved ap-
proaches to characterize the dynamical nature of enzymatic catalysis,
Chem. Rev., 106, pp. 3031–3042.
5. Henzler-Wildman, K., and Kern, D. (2007). Dynamic personalities of
proteins, Nature, 450, pp. 964–972.
6. Loria, J. P., Berlow, R. B., and Watt, E. D. (2008). Characterization of
enzyme motions by solution NMR relaxation dispersion, Acc. Chem.
Res., 41, pp. 214–221.
7. Eisenmesser, E. Z., Millet, O., Labeikovsky, W., Korzhnev, D. M., Wolf-
Watz, M., Bosco, D. A., Skalicky, J. J., Kay, L. E., and Kern, D. (2005).
Intrinsic dynamics of an enzyme underlies catalysis, Nature, 438, pp.
117–121.
8. Kamerlin, S. C., and Warshel, A. (2010). At the dawn of the 21st century:
is dynamics the missing link for understanding enzyme catalysis?,
Proteins, 78, pp. 1339–1375.
9. Volkman, B. F., Lipson, D., Wemmer, D. E., and Kern, D. (2001). Two-
state allosteric behavior in a single-domain signaling protein, Science,
291, pp. 2429–2433.
10. Monod, J., Wyman, J., and Changeux, J. P. (1965). On the nature of
allosteric transitions: a plausible model, J. Mol. Biol., 12, pp. 88–118.
11. Koshland, D. E., Jr., Nemethy, G., and Filmer, D. (1966). Comparison
of experimental binding data and theoretical models in proteins
containing subunits, Biochemistry, 5, pp. 365–385.
12. Dill, K. A., and Chan, H. S. (1997). From Levinthal to pathways to
funnels, Nat. Struct. Biol., 4, pp. 10–19.
13. Wolynes, P. G., Onuchic, J. N., and Thirumalai, D. (1995). Navigating the
folding routes, Science, 267, pp. 1619–1620.
14. Swint-Kruse, L., and Fisher, H. F. (2008). Enzymatic reaction sequences
as coupled multiple traces on a multidimensional landscape, Trends
Biochem. Sci., 33, pp. 104–112.
15. Ma, B., Kumar, S., Tsai, C. J., Hu, Z., and Nussinov, R. (2000). Transition-
state ensemble in enzyme catalysis: possibility, reality, or necessity?, J.
Theor. Biol., 203, pp. 383–397.
16. Xie, S. N. (2001). Single-molecule approach to enzymology, Single Mol.,
2, pp. 229–236.
17. Korzhnev, D. M., and Kay, L. E. (2008). Probing invisible, low-
populated states of protein molecules by relaxation dispersion NMR
spectroscopy: an application to protein folding, Acc. Chem. Res., 41, pp.
442–451.
18. Rotman, B. (1961). Measurement of activity of single molecules of
beta-D-galactosidase, Proc. Natl. Acad. Sci. U S A, 47, pp. 1981–1991.
19. Gell, C., Brockwell, D., and Smith, A. (2006). Handbook of Single
Molecule Fluorescence Spectroscopy (Oxford University Press, Oxford).
20. Moerner, W. E., and Fromm, D. P. (2003). Methods of single-molecule
fluorescence spectroscopy and microscopy, Rev. Sci. Instrum., 74, pp.
3597–3619.
21. Tinnefeld, P., and Sauer, M. (2005). Branching out of single-molecule
fluorescence spectroscopy: challenges for chemistry and influence on
biology, Angew. Chem., Int. Ed., 44, pp. 2642–2671.
22. Brender, J. R., Dertouzos, J., Ballou, D. P., Massey, V., Palfey, B. A., Entsch,
B., Steel, D. G., and Gafni, A. (2005). Conformational dynamics of the
isoalloxazine in substrate-free p-hydroxybenzoate hydroxylase: single-
molecule studies, J. Am. Chem. Soc., 127, pp. 18171–18178.
23. Lu, H. P., Xun, L., and Xie, X. S. (1998). Single-molecule enzymatic
dynamics, Science, 282, pp. 1877–1882.
24. Shi, J., Palfey, B. A., Dertouzos, J., Jensen, K. F., Gafni, A., and Steel, D.
(2004). Multiple states of the Tyr318Leu mutant of dihydroorotate
dehydrogenase revealed by single-molecule kinetics, J. Am. Chem. Soc.,
126, pp. 6914–6922.
25. Shi, J., Dertouzos, J., Gafni, A., Steel, D., and Palfey, B. A. (2006).
Single-molecule kinetics reveals signatures of half-sites reactivity in
dihydroorotate dehydrogenase A catalysis, Proc. Natl. Acad. Sci. U S A,
103, pp. 5775–5780.
26. Goldsmith, R. H., Tabares, L. C., Kostrz, D., Dennison, C., Aartsma, T. J.,
Canters, G. W., and Moerner, W. E. (2011). Redox cycling and kinetic
analysis of single molecules of solution-phase nitrite reductase, Proc.
Natl. Acad. Sci. U S A, 108, pp. 17269–17274.
27. Kuznetsova, S., Zauner, G., Aartsma, T. J., Engelkamp, H., Hatzakis, N.,
Rowan, A. E., Nolte, R. J., Christianen, P. C., and Canters, G. W. (2008).
The enzyme mechanism of nitrite reductase studied at single-molecule
level, Proc. Natl. Acad. Sci. U S A, 105, pp. 3250–3255.
28. Tabares, L. C., Kostrz, D., Elmalk, A., Andreoni, A., Dennison, C., Aartsma,
T. J., and Canters, G. W. (2011). Fluorescence lifetime analysis of
nitrite reductase from Alcaligenes xylosoxidans at the single-molecule
level reveals the enzyme mechanism, Chem. Eur. J., 17, pp. 12015–
12019.
29. Comellas-Aragones, M., Engelkamp, H., Claessen, V. I., Sommerdijk, N.
A., Rowan, A. E., Christianen, P. C., Maan, J. C., Verduin, B. J., Cornelissen,
J. J., and Nolte, R. J. (2007). A virus-based single-enzyme nanoreactor,
Nat. Nanotechnol., 2, pp. 635–639.
30. De Cremer, G., Roeffaers, M. B., Baruah, M., Sliwa, M., Sels, B. F.,
Hofkens, J., and De Vos, D. E. (2007). Dynamic disorder and stepwise
deactivation in a chymotrypsin catalyzed hydrolysis reaction, J. Am.
Chem. Soc., 129, pp. 15458–15459.
31. Edman, L., Foldes-Papp, Z., Wennmalm, S., and Rigler, R. (1999). The
fluctuating enzyme: a single molecule approach, Chem. Phys., 247, pp.
11–22.
32. English, B. P., Min, W., van Oijen, A. M., Lee, K. T., Luo, G., Sun, H.,
Cherayil, B. J., Kou, S. C., and Xie, X. S. (2006). Ever-fluctuating single
enzyme molecules: Michaelis-Menten equation revisited, Nat. Chem.
Biol., 2, pp. 87–94.
33. Hatzakis, N. S., Engelkamp, H., Velonia, K., Hofkens, J., Christianen, P. C.,
Svendsen, A., Patkar, S. A., Vind, J., Maan, J. C., Rowan, A. E., and Nolte, R.
J. (2006). Synthesis and single enzyme activity of a clicked lipase-BSA
hetero-dimer, Chem. Commun., pp. 2012–2014.
34. Hatzakis, N. S., Wei, L., Jorgensen, S. K., Kunding, A. H., Bolinger, P.
Y., Ehrlich, N., Makarov, I., Skjot, M., Svendsen, A., Hedegard, P., and
Stamou, D. (2012). Single enzyme studies reveal the existence of
discrete functional states for monomeric enzymes and how they are
“selected” upon allosteric regulation, J. Am. Chem. Soc., 134, pp. 9296–
9302.
35. Terentyeva, T. G., Engelkamp, H., Rowan, A. E., Komatsuzaki, T., Hofkens,
J., Li, C. B., and Blank, K. (2012). Dynamic disorder in single-enzyme
experiments: facts and artifacts, ACS Nano, 6, pp. 346–354.
36. Terentyeva, T. G., Hofkens, J., Komatsuzaki, T., Blank, K., and Li, C. B.
(2013). Time-resolved single molecule fluorescence spectroscopy of
an alpha-chymotrypsin catalyzed reaction, J. Phys. Chem. B, 117, pp.
1252–1260.
37. Velonia, K., Flomenbom, O., Loos, D., Masuo, S., Cotlet, M., Engelborghs,
Y., Hofkens, J., Rowan, A. E., Klafter, J., Nolte, R. J., and de Schryver, F. C.
(2005). Single-enzyme kinetics of CALB-catalyzed hydrolysis, Angew.
Chem., Int. Ed., 44, pp. 560–564.
38. Flomenbom, O., Velonia, K., Loos, D., Masuo, S., Cotlet, M., Engelborghs,
Y., Hofkens, J., Rowan, A. E., Nolte, R. J., Van der Auweraer, M., de
Schryver, F. C., and Klafter, J. (2005). Stretched exponential decay
and correlations in the catalytic activity of fluctuating single lipase
molecules, Proc. Natl. Acad. Sci. U S A, 102, pp. 2368–2372.
39. Gumpp, H., Puchner, E. M., Zimmermann, J. L., Gerland, U., Gaub, H. E.,
and Blank, K. (2009). Triggering enzymatic activity with force, Nano
Lett., 9, pp. 3290–3295.
40. Antikainen, N. M., Smiley, R. D., Benkovic, S. J., and Hammes, G. G.
(2005). Conformation coupled enzyme catalysis: single-molecule and
transient kinetics investigation of dihydrofolate reductase, Biochem-
istry, 44, pp. 16835–16843.
41. Chen, Y., Hu, D. H., Vorpagel, E. R., and Lu, H. P. (2003). Probing single-
molecule T4 lysozyme conformational dynamics by intramolecular
fluorescence energy transfer, J. Phys. Chem. B, 107, pp. 7947–7956.
42. Ha, T., Ting, A. Y., Liang, J., Caldwell, W. B., Deniz, A. A., Chemla, D.
S., Schultz, P. G., and Weiss, S. (1999). Single-molecule fluorescence
spectroscopy of enzyme conformational dynamics and cleavage
mechanism, Proc. Natl. Acad. Sci. U S A, 96, pp. 893–898.
43. Hanson, J. A., Duderstadt, K., Watkins, L. P., Bhattacharyya, S., Brokaw,
J., Chu, J. W., and Yang, H. (2007). Illuminating the mechanistic roles of
enzyme conformational dynamics, Proc. Natl. Acad. Sci. U S A, 104, pp.
18055–18060.
44. Santoso, Y., Joyce, C. M., Potapova, O., Le Reste, L., Hohlbein, J., Torella,
J. P., Grindley, N. D. F., and Kapanidis, A. N. (2010). Conformational
transitions in DNA polymerase I revealed by single-molecule FRET,
Proc. Natl. Acad. Sci. U S A, 107, pp. 715–720.
45. Hohlbein, J., Gryte, K., Heilemann, M., and Kapanidis, A. N. (2010).
Surfing on a new wave of single-molecule fluorescence methods, Phys.
Biol., 7, p. 031001.
46. Roy, R., Hohng, S., and Ha, T. (2008). A practical guide to single-
molecule FRET, Nat. Methods, 5, pp. 507–516.
47. Blank, K., De Cremer, G., and Hofkens, J. (2009). Fluorescence-based
analysis of enzymes at the single-molecule level, Biotechnol. J., 4, pp.
465–479.
48. Claessen, V. I., Engelkamp, H., Christianen, P. C., Maan, J. C., Nolte, R. J.,
Blank, K., and Rowan, A. E. (2010). Single-biomolecule kinetics: the art
of studying a single enzyme, Annu. Rev. Anal. Chem., 3, pp. 319–340.
49. Rothwell, P. J., Berger, S., Kensch, O., Felekyan, S., Antonik, M., Wohrl,
B. M., Restle, T., Goody, R. S., and Seidel, C. A. (2003). Multiparameter
single-molecule fluorescence spectroscopy reveals heterogeneity of
HIV-1 reverse transcriptase:primer/template complexes, Proc. Natl.
Acad. Sci. U S A, 100, pp. 1655–1660.
50. Widengren, J., Kudryavtsev, V., Antonik, M., Berger, S., Gerken, M.,
and Seidel, C. A. (2006). Single-molecule detection and identification
of multiple species by multiparameter fluorescence detection, Anal.
Chem., 78, pp. 2039–2050.
51. Watkins, L. P., and Yang, H. (2005). Detection of intensity change points
in time-resolved single-molecule measurements, J. Phys. Chem. B, 109,
pp. 617–628.
52. Talaga, D. S. (2007). Markov processes in single molecule fluorescence,
Curr. Opin. Colloid Interface Sci., 12, pp. 285–296.
53. Flomenbom, O., and Silbey, R. J. (2006). Utilizing the information
content in two-state trajectories, Proc. Natl. Acad. Sci. U S A, 103, pp.
10907–10910.
54. Li, C.-B., and Komatsuzaki, T. (2013). Aggregated markov model using
time series of single molecule dwell times with minimum excessive
information, Phys. Rev. Lett., 111, p. 058301.
55. Laursen, T., Singha, A., Rantzau, N., Tutkus, M., Borch, J., Hedegård, P.,
Stamou, D., Møller, B. L., and Hatzakis, N. S. (2014). Single molecule
activity measurements of cytochrome P450 oxidoreductase reveal the
existence of two discrete functional states, ACS Chem. Biol., 9, pp. 630–
634.
56. Anderson, E. M., Larsson, K. M., and Kirk, O. (1998). One biocatalyst:
many applications: the use of Candida antarctica B-Lipase in organic
synthesis, Biocatal. Biotransform., 16, pp. 181–204.
57. Verger, R. (1997). ‘Interfacial activation’ of lipases: facts and artifacts,
Trends Biotechnol., 15, pp. 32–38.
58. Skjot, M., De Maria, L., Chatterjee, R., Svendsen, A., Patkar, S. A.,
Ostergaard, P. R., and Brask, J. (2009). Understanding the plasticity of
the alpha/beta hydrolase fold: lid swapping on the Candida antarctica
lipase B results in chimeras with interesting biocatalytic properties,
ChemBioChem, 10, pp. 520–527.
59. Uppenberg, J., Hansen, M. T., Patkar, S., and Jones, T. A. (1994).
The sequence, crystal-structure determination and refinement of two
crystal forms of lipase-B from Candida-antarctica, Structure, 2, pp.
293–308.
60. Berg, O. G., Cajal, Y., Butterfoss, G. L., Grey, R. L., Alsina, M. A., Yu, B.-Z.,
and Jain, M. K. (1998). Interfacial activation of triglyceride lipase from
Thermomyces (Humicola) lanuginosa: kinetic parameters and a basis
for control of the lid, Biochemistry, 37, pp. 6615–6627.
61. Brzozowski, A. M., Savage, H., Verma, C. S., Turkenburg, J. P., Lawson,
D. M., Svendsen, A., and Patkar, S. (2000). Structural origins of the
interfacial activation in Thermomyces (Humicola) lanuginosa lipase,
Biochemistry, 39, pp. 15071–15082.
62. McConn, J., Fasman, G. D., and Hess, G. P. (1969). Conformation of the
high pH form of chymotrypsin, J. Mol. Biol., 39, pp. 551–562.
63. Fersht, A. R., and Requena, Y. (1971). Equilibrium and rate constants
for the interconversion of two conformations of α-chymotrypsin: the
existence of a catalytically inactive conformation at neutral pH, J. Mol.
Biol., 60, pp. 279–290.
64. Leytus, S. P., Melhado, L. L., and Mangel, W. F. (1983). Rhodamine-based
compounds as fluorogenic substrates for serine proteinases, Biochem.
J., 209, pp. 299–307.
65. Terentyeva, T. G., Van Rossom, W., Van der Auweraer, M., Blank, K.,
and Hofkens, J. (2011). Morpholinecarbonyl-Rhodamine 110 based
substrates for the determination of protease activity with accurate
kinetic parameters, Bioconjug. Chem., 22, pp. 1932–1938.
66. Fiedler, F., and Hinz, H. (1994). No intermediate channelling in
stepwise hydrolysis of fluorescein di-beta-D-galactoside by beta-
galactosidase, Eur. J. Biochem., 222, pp. 75–81.
67. Hofmann, J., and Sernetz, M. (1983). A kinetic study on the enzy-
matic hydrolysis of fluorescein diacetate and fluorescein-di-beta-D-
galactopyranoside, Anal. Biochem., 131, pp. 180–186.
68. Huang, Z. J. (1991). Kinetic fluorescence measurement of fluorescein
di-beta-D-galactoside hydrolysis by beta-galactosidase: intermediate
channeling in stepwise catalysis by a free single enzyme, Biochemistry,
30, pp. 8535–8540.
69. Mikolasch, A., and Schauer, F. (2009). Fungal laccases as tools for the
synthesis of new hybrid molecules and biomaterials, Appl. Microbiol.
Biotechnol., 82, pp. 605–624.
70. Rodriguez Couto, S., and Toca Herrera, J. L. (2006). Industrial and
biotechnological applications of laccases: a review, Biotechnol. Adv., 24,
pp. 500–513.
71. Wijma, H. J., Jeuken, L. J., Verbeet, M. P., Armstrong, F. A., and Canters,
G. W. (2006). A random-sequential mechanism for nitrite binding and
active site reduction in copper-containing nitrite reductase, J. Biol.
Chem., 281, pp. 16340–16346.
72. Almeida, M. G., Serra, A., Silveira, C. M., and Moura, J. J. (2010). Nitrite
biosensing via selective enzymes: a long but promising route, Sensors,
10, pp. 11530–11555.
73. Kuznetsova, S., Zauner, G., Schmauder, R., Mayboroda, O. A., Deelder,
A. M., Aartsma, T. J., and Canters, G. W. (2006). A Förster-resonance-
energy transfer-based method for fluorescence detection of the
protein redox state, Anal. Biochem., 350, pp. 52–60.
74. Krzeminski, L., Ndamba, L., Canters, G. W., Aartsma, T. J., Evans, S. D.,
and Jeuken, L. J. (2011). Spectroelectrochemical investigation of in-
tramolecular and interfacial electron-transfer rates reveals differences
between nitrite reductase at rest and during turnover, J. Am. Chem. Soc.,
133, pp. 15085–15093.
75. James, L. C., and Tawfik, D. S. (2003). Conformational diversity and
protein evolution: a 60-year-old hypothesis revisited, Trends Biochem.
Sci., 28, pp. 361–368.
76. Tokuriki, N., and Tawfik, D. S. (2009). Protein dynamism and
evolvability, Science, 324, pp. 203–207.
77. Teague, S. J. (2003). Implications of protein flexibility for drug
discovery, Nat. Rev. Drug. Discov., 2, pp. 527–541.
78. Burchak, O. N., Mugherli, L., Chatelain, F., and Balakirev, M. Y. (2006).
Fluorescein-based amino acids for solid phase synthesis of fluorogenic
protease substrates, Bioorg. Med. Chem., 14, pp. 2559–2568.
79. Liu, B., Fletcher, S., Avadisian, M., Gunning, P. T., and Gradinaru, C. C.
(2009). A photostable, pH-invariant fluorescein derivative for single-
molecule microscopy, J. Fluoresc., 19, pp. 915–920.
80. Maeda, H., Matsuno, H., Ushida, M., Katayama, K., Saeki, K., and
Itoh, N. (2005). 2,4-Dinitrobenzenesulfonyl fluoresceins as fluorescent
alternatives to Ellman’s reagent in thiol-quantification enzyme assays,
Angew. Chem., Int. Ed., 44, pp. 2922–2925.
81. Melhado, L. L., Peltz, S. W., Leytus, S. P., and Mangel, W. F. (1982). p-
Guanidinobenzoic acid esters of fluorescein as active-site titrants of
serine proteases, J. Am. Chem. Soc., 104, pp. 7299–7306.
82. Turunen, P., Rowan, A. E., and Blank, K. (2014). Single-enzyme kinetics
with fluorogenic substrates: lessons learnt and future directions, FEBS
Lett., 588, pp. 3553–3563.
83. Wang, Z. Q., Liao, J., and Diwu, Z. (2005). N-DEVD-N’-morpholine-
carbonyl-rhodamine 110: novel caspase-3 fluorogenic substrates for
cell-based apoptosis assay, Bioorg. Med. Chem. Lett., 15, pp. 2335–
2338.
84. Li, J., and Yao, S. Q. (2008). “Singapore Green”: a new fluorescent dye for
microarray and bioimaging applications, Org. Lett., 11, pp. 405–408.
85. Sakabe, M., Asanuma, D., Kamiya, M., Iwatate, R. J., Hanaoka, K., Terai,
T., Nagano, T., and Urano, Y. (2013). Rational design of highly sensitive
fluorescence probes for protease and glycosidase based on precisely
controlled spirocyclization, J. Am. Chem. Soc., 135, pp. 409–414.
86. Urano, Y., Kamiya, M., Kanda, K., Ueno, T., Hirose, K., and Nagano,
T. (2005). Evolution of fluorescein as a platform for finely tunable
fluorescence probes, J. Am. Chem. Soc., 127, pp. 4888–4894.
87. Kamiya, M., Urano, Y., Ebata, N., Yamamoto, M., Kosuge, J., and
Nagano, T. (2005). Extension of the applicable range of fluorescein: a
fluorescein-based probe for Western blot analysis, Angew. Chem., Int.
Ed., 44, pp. 5439–5441.
88. Betzig, E., and Chichester, R. J. (1993). Single molecules observed
by near-field scanning optical microscopy, Science, 262, pp. 1422–
1425.
89. Harootunian, A., Betzig, E., Isaacson, M., and Lewis, A. (1986). Super-
resolution fluorescence near-field scanning optical microscopy, Appl.
Phys. Lett., 49, pp. 674–676.
90. Zhu, P., and Craighead, H. G. (2012). Zero-mode waveguides for single-
molecule analysis, Annu. Rev. Biophys., 41, pp. 269–293.
91. Levene, M. J., Korlach, J., Turner, S. W., Foquet, M., Craighead, H. G.,
and Webb, W. W. (2003). Zero-mode waveguides for single-molecule
analysis at high concentrations, Science, 299, pp. 682–686.
92. Eid, J., Fehr, A., Gray, J., Luong, K., Lyle, J., Otto, G., Peluso, P., Rank,
D., Baybayan, P., Bettman, B., Bibillo, A., Bjornson, K., Chaudhuri, B.,
Christians, F., Cicero, R., Clark, S., Dalal, R., Dewinter, A., Dixon, J.,
Foquet, M., Gaertner, A., Hardenbol, P., Heiner, C., Hester, K., Holden, D.,
Kearns, G., Kong, X., Kuse, R., Lacroix, Y., Lin, S., Lundquist, P., Ma, C.,
Marks, P., Maxham, M., Murphy, D., Park, I., Pham, T., Phillips, M., Roy, J.,
Sebra, R., Shen, G., Sorenson, J., Tomaney, A., Travers, K., Trulson, M.,
Vieceli, J., Wegener, J., Wu, D., Yang, A., Zaccarin, D., Zhao, P., Zhong,
F., Korlach, J., and Turner, S. (2009). Real-time DNA sequencing from
single polymerase molecules, Science, 323, pp. 133–138.
93. Korlach, J., Marks, P. J., Cicero, R. L., Gray, J. J., Murphy, D. L., Roitman,
D. B., Pham, T. T., Otto, G. A., Foquet, M., and Turner, S. W. (2008).
Selective aluminum passivation for targeted immobilization of single
DNA polymerase molecules in zero-mode waveguide nanostructures,
Proc. Natl. Acad. Sci. U S A, 105, pp. 1176–1181.
94. Lenne, P. F., Rigneault, H., Marguet, D., and Wenger, J. (2008). Fluores-
cence fluctuations analysis in nanoapertures: physical concepts and
biological applications, Histochem. Cell Biol., 130, pp. 795–805.
95. Rigneault, H., Capoulade, J., Dintinger, J., Wenger, J., Bonod, N., Popov,
E., Ebbesen, T. W., and Lenne, P. F. (2005). Enhancement of single-
molecule fluorescence detection in subwavelength apertures, Phys.
Rev. Lett., 95, p. 117401.
96. Kinkhabwala, A., Yu, Z., Fan, S., Avlasevich, Y., Mullen, K., and Moerner,
W. E. (2009). Large single-molecule fluorescence enhancements
produced by a bowtie nanoantenna, Nat. Photon., 3, pp. 654–657.
97. Lakowicz, J. R. (2005). Radiative decay engineering 5: metal-enhanced
fluorescence and plasmon emission, Anal. Biochem., 337, pp. 171–194.
98. Acuna, G. P., Möller, F. M., Holzmeister, P., Beater, S., Lalkens, B., and
Tinnefeld, P. (2012). Fluorescence enhancement at docking sites of
DNA-directed self-assembled nanoantennas, Science, 338, pp. 506–
510.
99. Novotny, L., and van Hulst, N. (2011). Antennas for light, Nat. Photon.,
5, pp. 83–90.
100. Punj, D., Mivelle, M., Moparthi, S. B., van Zanten, T. S., Rigneault, H., van
Hulst, N. F., Garcia-Parajo, M. F., and Wenger, J. (2013). A plasmonic
‘antenna-in-box’ platform for enhanced single-molecule analysis at
micromolar concentrations, Nat. Nanotechnol., 8, pp. 512–516.
101. Douglas, S. M., Dietz, H., Liedl, T., Hogberg, B., Graf, F., and Shih, W.
M. (2009). Self-assembly of DNA into nanoscale three-dimensional
shapes, Nature, 459, pp. 414–418.
102. Rothemund, P. W. K. (2006). Folding DNA to create nanoscale shapes
and patterns, Nature, 440, pp. 297–302.
103. Heucke, S. F., Baumann, F., Acuna, G. P., Severin, P. M., Stahl, S. W.,
Strackharn, M., Stein, I. H., Altpeter, P., Tinnefeld, P., and Gaub, H. E.
(2014). Placing individual molecules in the center of nanoapertures,
Nano Lett., 14, pp. 391–395.
104. Kufer, S. K., Puchner, E. M., Gumpp, H., Liedl, T., and Gaub, H. E.
(2008). Single-molecule cut-and-paste surface assembly, Science, 319,
pp. 594–596.
105. Allen, B. L., Kichambare, P. D., and Star, A. (2007). Carbon nanotube
field-effect-transistor-based biosensors, Adv. Mater., 19, pp. 1439–
1451.
106. Clarke, J., Wu, H. C., Jayasinghe, L., Patel, A., Reid, S., and Bayley, H.
(2009). Continuous base identification for single-molecule nanopore
DNA sequencing, Nat. Nanotechnol., 4, pp. 265–270.
107. Hoeben, F. J. M., Meijer, F. S., Dekker, C., Albracht, S. P. J., Heering,
H. A., and Lemay, S. G. (2008). Toward single-enzyme molecule
electrochemistry: [NiFe]-hydrogenase protein film voltammetry at
nanoelectrodes, ACS Nano, 2, pp. 2497–2504.
108. Kim, S. N., Rusling, J. F., and Papadimitrakopoulos, F. (2007). Carbon
nanotubes for electronic and electrochemical detection of biomole-
cules, Adv. Mater., 19, pp. 3214–3228.
109. Lemay, S. G., Kang, S., Mathwig, K., and Singh, P. S. (2013). Single-
molecule electrochemistry: present status and outlook, Acc. Chem. Res.,
46, pp. 369–377.
110. Heller, I., Janssens, A. M., Mannik, J., Minot, E. D., Lemay, S. G., and
Dekker, C. (2007). Identifying the mechanism of biosensing with
carbon nanotube transistors, Nano Lett., 8, pp. 591–595.
111. Choi, Y., Moody, I. S., Sims, P. C., Hunt, S. R., Corso, B. L., Perez, I., Weiss,
G. A., and Collins, P. G. (2012). Single-molecule lysozyme dynamics
monitored by an electronic circuit, Science, 335, pp. 319–324.
112. Choi, Y., Moody, I. S., Sims, P. C., Hunt, S. R., Corso, B. L., Seitz, D. E.,
Blaszczak, L. C., Collins, P. G., and Weiss, G. A. (2012). Single-molecule
dynamics of lysozyme processing distinguishes linear and cross-linked
peptidoglycan substrates, J. Am. Chem. Soc., 134, pp. 2032–2035.
113. Choi, Y., Olsen, T. J., Sims, P. C., Moody, I. S., Corso, B. L., Dang, M. N.,
Weiss, G. A., and Collins, P. G. (2013). Dissecting single-molecule signal
transduction in carbon nanotube circuits with protein engineering,
Nano Lett., 13, pp. 625–631.
114. Balasubramanian, K., and Burghard, M. (2006). Biosensors based on
carbon nanotubes, Anal. Bioanal. Chem., 385, pp. 452–468.
115. Goldsmith, B. R., Coroneus, J. G., Kane, A. A., Weiss, G. A., and Collins, P.
G. (2007). Monitoring single-molecule reactivity on a carbon nanotube,
Nano Lett., 8, pp. 189–194.
116. Sorgenfrei, S., Chiu, C.-y., Gonzalez, R. L., Yu, Y.-J., Kim, P., Nuckolls,
C., and Shepard, K. L. (2011). Label-free single-molecule detection
of DNA-hybridization kinetics with a carbon nanotube field-effect
transistor, Nat. Nanotechnol., 6, pp. 126–132.
117. Tans, S. J., Verschueren, A. R. M., and Dekker, C. (1998). Room-
temperature transistor based on a single carbon nanotube, Nature,
393, pp. 49–52.
118. Besteman, K., Lee, J.-O., Wiertz, F. G. M., Heering, H. A., and Dekker,
C. (2003). Enzyme-coated carbon nanotubes as single-molecule
biosensors, Nano Lett., 3, pp. 727–730.
119. Chen, R. J., Zhang, Y., Wang, D., and Dai, H. (2001). Noncovalent
sidewall functionalization of single-walled carbon nanotubes for
protein immobilization, J. Am. Chem. Soc., 123, pp. 3838–3839.
120. Sims, P. C., Moody, I. S., Choi, Y., Dong, C., Iftikhar, M., Corso, B. L., Gul,
O. T., Collins, P. G., and Weiss, G. A. (2013). Electronic measurements of
single-molecule catalysis by cAMP-dependent protein kinase A, J. Am.
Chem. Soc., 135, pp. 7861–7868.
121. Olsen, T. J., Choi, Y., Sims, P. C., Gul, O. T., Corso, B. L., Dong, C., Brown,
W. A., Collins, P. G., and Weiss, G. A. (2013). Electronic measurements of
single-molecule processing by DNA polymerase I (Klenow fragment), J.
Am. Chem. Soc., 135, pp. 7855–7860.
122. Prisbrey, L., Schneider, G., and Minot, E. (2010). Modeling the
electrostatic signature of single enzyme activity, J. Phys. Chem. B, 114,
pp. 3330–3333.
123. Engel, A., and Muller, D. J. (2000). Observing single biomolecules at
work with the atomic force microscope, Nat. Struct. Biol., 7, pp. 715–
718.
124. Hinterdorfer, P., and Dufrene, Y. F. (2006). Detection and localization
of single molecular recognition events using atomic force microscopy,
Nat. Methods, 3, pp. 347–355.
125. Grandbois, M., Clausen-Schaumann, H., and Gaub, H. (1998). Atomic
force microscope imaging of phospholipid bilayer degradation by
phospholipase A2, Biophys. J., 74, pp. 2398–2404.
126. Mori, T., Asakura, M., and Okahata, Y. (2011). Single-molecule force
spectroscopy for studying kinetics of enzymatic dextran elongations,
J. Am. Chem. Soc., 133, pp. 5701–5703.
127. Heucke, S. F., Puchner, E. M., Stahl, S. W., Holleitner, A. W., Gaub, H.
E., and Tinnefeld, P. (2013). Nanoapertures for AFM-based single-
molecule force spectroscopy, Int. J. Nanotechnol., 10, pp. 607–619.

128. Jacobs, M. J., and Blank, K. (2014). Joining forces: integrating the
mechanical and optical single molecule toolkits, Chem. Sci., 5, pp.
1680–1697.
129. Puchner, E. M., and Gaub, H. E. (2012). Single-molecule mechanoenzy-
matics, Annu. Rev. Biophys., 41, pp. 497–518.
130. Hill, C. M., Clayton, D. A., and Pan, S. (2013). Combined optical and
electrochemical methods for studying electrochemistry at the single
molecule and single particle level: recent progress and perspectives,
Phys. Chem. Chem. Phys., 15, pp. 20797–20807.
131. Zhao, J., Zaino III, L. P., and Bohn, P. W. (2013). Potential-dependent
single molecule blinking dynamics for flavin adenine dinucleotide
covalently immobilized in zero-mode waveguide array of working
electrodes, Faraday Discuss., 164, pp. 57–69.
March 21, 2016 14:10 PSP Book - 9in x 6in 04-Allan-Svendsen-c04

Chapter 4

Interfacial Enzyme Function Visualized Using Neutron, X-Ray, and
Light-Scattering Methods

Hanna Wacklin (a, b, c) and Tommy Nylander (c)

(a) European Spallation Source ERIC, Lund, Sweden
(b) Department of Chemistry, University of Copenhagen, Copenhagen, Denmark
(c) Division of Physical Chemistry, Department of Chemistry, Lund University,
Lund, Sweden
hanna.wacklin@esss.se, tommy.nylander@fkem1.lu.se

In this chapter we describe recent results on using neutron,
X-ray, and light-scattering methods to analyze structural and
compositional changes due to the action of interfacially active
enzymes on a substrate of low aqueous solubility. The interfacial
mechanism poses challenges in terms of both the structural analysis
of a heterogeneous system and methods to monitor and interpret
the kinetics. Today many of these enzyme systems are used in a
range of applications, including detergency, material science, food
technology, biotechnology, and biofuels. In spite of the growing
industrial relevance, a surprisingly small number of fundamental
studies have been carried out. Here we will address mainly two
classes of important interfacially active enzymes, namely lipolytic
enzymes and cellulases. We will discuss the common features of
the two types of degradation processes, associated with the fact
that these are water-soluble enzymes acting on a substrate surface
that continuously changes due to the enzyme action. One distinctive
difference is the fate of the reaction products, which in the case of
lipolytic enzymes typically remain in the substrate matrix, whereas
the cellulose degradation products are fully soluble. In both cases
the enzymatic process causes changes to the substrate morphology
and structure. Therefore understanding these types of enzymes
requires the use of structural and surface-sensitive techniques that
can determine changes on the nanometer-length scale.

4.1 Phospholipase A2: An Interfacially Activated Enzyme

Phospholipase A2s (PLA2) are a class of enzymes [1] that
selectively hydrolyze the sn-2 acyl ester bond in phosphoglyceride
lipids, releasing a fatty acid and a lysolipid. These products are
precursors to important biochemical signaling agents (eicosanoids,
prostaglandins, and platelet-activating factor). The catalytic site
mechanism of PLA2 [2, 3] is shared between all the intracellular
and extracellular types of the calcium-dependent enzyme. Acting
in diverse environments from invertebrate and insect venoms to
our own cells and immune system, PLA2 hydrolyzes phospholipids
as part of inflammatory signaling [4], host defense [5], digestion
of fats, phospholipid remodeling, and cell lysis. Although
high-resolution crystal structures of many PLA2s have been determined
[6], the mechanism by which their activity is regulated is not fully
understood. The fact that catalysis takes place on the surface of a
lipid membrane adds to the complexity: collective interactions of
the membrane with the enzyme take place alongside substrate binding
to the catalytic site. The heterogeneous nature of the reaction also
makes it necessary to understand the surface–solution equilibria of
the enzyme and lipids as well as the resulting structural changes.
The reaction scheme in Fig. 4.1 illustrates the processes that
can occur in heterogeneous enzyme catalysis at the lipid–water
interface.

[Figure 4.1: reaction scheme, E ⇌ E* ⇌ E*S ⇌ E*P ⇌ E* + P1* + P2*,
with exchange of the enzyme E and the products P1 and P2 between the
membrane and the solution]

Figure 4.1 Reaction scheme for PLA2 hydrolysis of lipids at a membrane
interface. Reprinted from Ref. [7], Copyright (2007), with permission from
Elsevier.

The initial binding of an enzyme, such as PLA2, to the substrate
interface is characterized by an association rate constant $k_{-d}^{0}$
and a dissociation rate constant $k_{d}^{0}$. Once at the interface, the
membrane-bound enzyme E* has to bind a substrate molecule (S) in its
catalytic pocket, an equilibrium described by the rate constants $k_1$
and $k_{-1}$. The chemical conversion step is described by the catalytic
rate constants $k_2$ and $k_{-2}$. Once the lipid sn-2 ester bond is
broken, the products P1 and P2 are released by the enzyme
($k_3$/$k_{-3}$), and E* is free to bind another substrate molecule or
to leave the membrane. The reaction products are in equilibrium with
the surrounding aqueous phase according to their solubility, which is
generally higher than for the substrate phospholipid. If the products
remain in the membrane, their presence will lead to dilution of the
substrate. This effect can be further enhanced if the lateral diffusion
of lipids is restricted. If one of the products has a higher tendency
to partition into solution, the compositional change can lead to
variations in membrane charge, spontaneous curvature, lipid packing, or
phase separation. The substrate membrane composition is not affected if
both reaction products immediately leave the membrane and have no
affinity for PLA2. However, PLA2 activity or affinity for the membrane
may be altered if either of the products remains in the membrane and
acts as an inhibitor/activator for hydrolysis or changes the membrane
binding affinity of the enzyme (in which case $k_d \neq k_d^{0}$).
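
The surface steps of the scheme in Fig. 4.1 can be written as rate equations and integrated numerically. The sketch below is a minimal illustration, not the analysis used in the studies cited here. It lumps the two products into a single species P, ignores exchange of the enzyme and products with the bulk solution, and uses arbitrary dimensionless rate constants and amounts chosen only for illustration. The function name `simulate_surface_cycle` is hypothetical.

```python
def simulate_surface_cycle(k1=1.0, km1=0.5, k2=2.0, km2=0.1, k3=1.0, km3=0.0,
                           e0=0.1, s0=1.0, dt=1e-3, steps=20000):
    """Explicit Euler integration of E* + S <-> E*S <-> E*P <-> E* + P.

    All rate constants and amounts are illustrative, dimensionless
    assumptions; none of them are values from the chapter.
    """
    e, s, es, ep, p = e0, s0, 0.0, 0.0, 0.0
    for _ in range(steps):
        r1 = k1 * e * s - km1 * es   # substrate binding (k1 / k-1)
        r2 = k2 * es - km2 * ep      # chemical conversion (k2 / k-2)
        r3 = k3 * ep - km3 * e * p   # product release (k3 / k-3)
        e += dt * (r3 - r1)
        s -= dt * r1
        es += dt * (r1 - r2)
        ep += dt * (r2 - r3)
        p += dt * r3
    return {"E*": e, "S": s, "E*S": es, "E*P": ep, "P": p}


result = simulate_surface_cycle()
# Mass balances that any solution of this simplified scheme must satisfy.
enzyme_total = result["E*"] + result["E*S"] + result["E*P"]
lipid_total = result["S"] + result["E*S"] + result["E*P"] + result["P"]
print(result)
print("enzyme total:", enzyme_total, "lipid total:", lipid_total)
```

The two totals act as a sanity check: the bound enzyme and the lipid-derived species are each conserved in this closed surface model. Adding desorption terms for E*, P1, and P2 would break these balances, which is the situation the scattering experiments described below are designed to resolve.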
Several approaches have been developed to provide a further
understanding of PLA2 activation, regulation, and substrate
specificity. Most investigations of PLA2 kinetics have been carried out
in bulk phospholipid dispersions or in Langmuir monolayers at
the air–water interface, where only the overall rate of product
formation can be monitored [8–10] and no information is
obtained about the location of the enzyme or the distribution of
reaction products between the membrane and the aqueous phase.
Some approaches have focused on measuring the rate of hydrolysis under
conditions where the complications resulting from the interfacial
character of the reaction are reduced. For example, Verger’s
zeroth-order trough method uses short-chain lipids that produce
soluble reaction products in a Langmuir trough to maintain a
constant surface pressure, keeping the nature of the substrate
unchanged [11]. Jain’s scooting model is based on the use of
negatively charged lipid vesicles, which do not exchange lipids
and which constitute independent reaction substrates, on which
the enzyme is irreversibly bound [10]. In the surface dilution
kinetics of Dennis and coworkers, the substrate is solubilized
in nonionic detergent micelles, allowing the bulk and interfacial
concentrations of the substrate to be controlled separately [12]. The
term “interfacial quality” has often been used to account for the
differences between the rates observed for lipids in different physical
forms, such as detergent micelles, vesicles, monolayers,
and supported bilayers. This classification is meant to reflect the
ability of the enzyme to penetrate into the membrane and/or
undergo a conformational change required for activation [13]. While
it is possible that, for example, lipid density or interfacial curvature
can modulate PLA2 binding or its activity, it is difficult to conclusively
determine such effects in the absence of structural evidence.
Structural analysis of the membrane–PLA2 assembly is challenging,
given the aqueous environment and the disordered and fluid nature
of lipid membranes. In the following sections we describe the use
of neutron scattering and ellipsometry to measure the membrane
structure and location of PLA2 in supported lipid bilayers. This
has enabled some key aspects of the enzyme–lipid interaction to
be determined as a function of membrane composition. Specular
neutron reflection and ellipsometry are techniques that noninvasively
probe interfacial structures. In both cases, the structural
parameters (thickness, surface coverage, refractive index) of an
adsorbed film are derived from variations in reflected intensity
as a function of angle (neutrons) or polarization (ellipsometry).
Ellipsometry has a time resolution that enables the observation of
adsorption and desorption kinetics at a timescale of seconds, while
neutron reflection exploits deuterium labeling to highlight parts of
a multicomponent system. Both methods are suitable for structure
determination in systems with no long-range molecular ordering in
the interfacial plane, as is typical of fluid biological membranes, and
can be employed at buried interfaces (i.e., under water).

4.1.1 Neutron Reflection


Neutron reflection from planar surfaces can be used to measure the
structure of membranes perpendicular to the lipid–water interface
and probe the composition depth profile in fluid, single membranes
in an aqueous environment. The sensitivity of neutrons to the
membrane structure is based on the contrast to the surrounding
media, determined by the nuclear isotopic composition of the
membrane. The overall contrast will be determined by the sum of the
neutron scattering length densities of the components. For example,
in a mixture of lipid, enzyme, and water, the scattering length density
can be expressed as

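$$\rho_{\mathrm{membrane}} = \phi_{\mathrm{lipid}}\,\rho_{\mathrm{lipid}} + \phi_{\mathrm{water}}\,\rho_{\mathrm{water}} + \phi_{\mathrm{enzyme}}\,\rho_{\mathrm{enzyme}} \tag{4.1}$$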

where φ represents the volume fraction of each component and
ρ its molecular scattering length density. As there are significant
differences between the neutron scattering lengths of proteins and
lipids, it is possible to determine the structural arrangement of
the membrane–enzyme assembly, as well as the composition.
Furthermore, because there is a large difference between the neutron
scattering length of hydrogen and its heavy isotope deuterium, the
sensitivity to the different components of the membrane can be
enhanced by deuterium labeling. Thus, the membrane lipid
composition, its water content, and the enzyme location can be determined.
Since neutrons interact weakly with many common materials, such
as glass or silicon, it is possible to probe solid-supported membrane
samples enclosed in an aqueous compartment. Neutrons are also a
low-energy probe and do not cause any radiation or heat damage to
the biological molecules, which allows measurements of structural
changes or processes to be carried out over long timescales.
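
Eq. 4.1 is a volume-fraction-weighted average, which is straightforward to evaluate. The following sketch is illustrative only. The headgroup value of 1.86 (in units of 10⁻⁶ Å⁻²) is the one quoted later in this chapter, the values for D2O (6.35) and H2O (−0.56) are the commonly tabulated ones, and the enzyme value (3.0) and all volume fractions are assumed placeholders rather than fitted results. The function name `mixture_sld` is hypothetical.

```python
def mixture_sld(volume_fractions, slds):
    """Volume-fraction-weighted scattering length density (Eq. 4.1).

    Both arguments are dicts keyed by component name; SLDs are in
    units of 1e-6 per square angstrom.
    """
    total = sum(volume_fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("volume fractions must sum to 1")
    return sum(volume_fractions[name] * slds[name] for name in volume_fractions)


# Assumed, illustrative composition of a headgroup layer containing enzyme.
fractions = {"lipid": 0.5, "water": 0.3, "enzyme": 0.2}

# 1.86: headgroup value quoted in the chapter; 3.0: assumed placeholder.
sld_in_d2o = mixture_sld(fractions, {"lipid": 1.86, "water": 6.35, "enzyme": 3.0})
sld_in_h2o = mixture_sld(fractions, {"lipid": 1.86, "water": -0.56, "enzyme": 3.0})

print(f"layer SLD in D2O: {sld_in_d2o:.3f} x 1e-6 A^-2")
print(f"layer SLD in H2O: {sld_in_h2o:.3f} x 1e-6 A^-2")
```

Changing only the water term shifts the layer contrast substantially, which is why measurements in different isotopic solvents, together with deuterium-labeled lipids, help separate the contributions of lipid, water, and enzyme.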

4.1.2 Ellipsometry
Ellipsometry measures the changes in the polarization of visible
light upon reflection at a specific wavelength and relates this
to the refractive index of the membrane at a reference surface,
typically silicon dioxide. The surface density of lipids in a supported
membrane (mass per unit area) can be computed from the refractive index
using de Feijter’s equation

$$\Gamma\ (\mathrm{mg/m^2}) = d\,\frac{n - n_0}{dn/dc} \tag{4.2}$$

where $n$ and $n_0$ are the refractive indices of the lipid membrane and
bulk solvent, respectively; $d$ is the membrane thickness; and $dn/dc$
is the refractive index increment of the lipids (typically 0.154 mL/g
[14]).
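
As a worked example of Eq. 4.2, the sketch below converts a film thickness and refractive index into a surface density. The thickness (40 Å) and the refractive indices (1.48 for the film, 1.333 for water) are assumed, illustrative inputs and not measured values from this chapter; only the increment dn/dc = 0.154 mL/g is taken from the text. The function name is hypothetical.

```python
def de_feijter_surface_density(thickness_angstrom, n_film, n_solvent,
                               dn_dc_ml_per_g=0.154):
    """Surface density in mg/m^2 from de Feijter's equation (Eq. 4.2)."""
    thickness_nm = thickness_angstrom / 10.0
    # (n - n0) / (dn/dc) is a concentration in g/mL, and
    # 1 nm x 1 g/mL = 1e-9 m x 1e6 g/m^3 = 1e-3 g/m^2 = 1 mg/m^2.
    return thickness_nm * (n_film - n_solvent) / dn_dc_ml_per_g


# Assumed inputs for illustration: a 40 angstrom film with n = 1.48 in water.
gamma = de_feijter_surface_density(40.0, 1.48, 1.333)
print(f"surface density: {gamma:.2f} mg/m^2")
```

Because the thickness and the refractive index contrast enter as a product, their individual uncertainties are correlated for very thin films, while the surface density itself is usually the better-determined quantity.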

4.1.3 Activity of Naja mossambica mossambica PLA2


Neutron reflectivity and ellipsometry were used to monitor the
effect of Naja mossambica mossambica PLA2 on supported
phosphocholine bilayers composed of 1,2-dioleylphosphatidylcholine
(DOPC), 1-palmitoyl-2-oleylphosphatidylcholine (POPC), and
1,2-dipalmitoylphosphatidylcholine (DPPC) at 25 °C. The enzymatic
breakdown of the lipids leads to the destruction of the supported
bilayer by solubilization of a significant fraction of the original lipid
material [15]. However, there are important differences among the
lipids. The ellipsometry traces recording the lipid surface density
in Fig. 4.2 show that the membranes are solubilized starting
immediately after injection of the enzyme but with considerable
differences in both the initial rate and the extent of the process.
Remarkably, the initial rate for POPC is faster than that for DOPC,
despite both lipids being in the fluid or liquid crystalline phase,
whereas for DPPC (in the gel phase at 25 °C) the reaction stops after
only 15% of the membrane has been solubilized.

Figure 4.2 Surface density Γ (mg/m²) of DOPC-, POPC-, and DPPC-
supported bilayers recorded after injection of Naja mossambica mossambica
PLA2 at 0.02 mg/mL. Reprinted with permission from Ref. [15], Copyright
© 2005, American Chemical Society.

Clearly the lipid chains, as well as the physical state of the
membrane, have a significant effect on the reaction progress, but
without knowledge of which membrane components are being
solubilized and what happens to the enzyme, it is difficult to
interpret the results. We therefore used neutron reflection to
elucidate the differences between the phospholipids.
The neutron reflectivity profiles of DOPC, POPC, and DPPC
bilayers before and after PLA2 hydrolysis are shown in Fig. 4.3,
with sketches illustrating the corresponding structural models that
were fitted to the data. As there is a large contrast between the
phospholipids and D2O, the reflectivity is very sensitive to the overall
thickness and surface density of the phospholipid bilayer, as well
as the presence of the enzyme. The results fit well to a structure in
which the enzyme resides in a 21 ± 1 Å thick layer at the lipid–water
interface, with partial penetration into the outer membrane leaflet
in each case. No evidence was found for the enzyme penetrating into
the lower phospholipid leaflet.

Figure 4.3 Neutron reflectivity recorded from (a) DOPC-, (b) POPC-, and
(c) DPPC-supported bilayers before (pink circles) and after (green squares)
Naja mossambica mossambica PLA2 injection (0.02 mg/mL). The reflectivity
of the clean silica substrate is shown in comparison (blue diamonds).
Reprinted with permission from Ref. [15], Copyright © 2005, American
Chemical Society.

While the neutron reflectivity results confirmed the differences
in reaction extent and rate observed by ellipsometry, they also showed
that the enzyme penetration depth increases with increasing lipid chain
saturation. In particular, PLA2 mainly resides on top of the DOPC
bilayer and only interacts with the headgroups, while penetration
into part of the hydrophobic region was observed for POPC and
DPPC. This implies that the nature of the lipid chains influences
the enzyme–membrane interaction and is further supported by
the strong and irreversible binding of PLA2 that was observed
on a purely hydrophobic self-assembled monolayer composed of
octadecylsilane. The number of PLA2 molecules adsorbed on the
lipid bilayer was greatest for POPC and somewhat lower for both
DOPC and DPPC, which is consistent with the differences observed
in the reaction rates. In all three cases, the enzyme remained bound
at the membrane surface even after the reaction had stopped.
The difference between the three lipid bilayers lies mainly in the
packing order of the lipid hydrophobic chains and in the properties
of the hydrolysis products. Palmitic acid is only produced in the
hydrolysis of DPPC, and the striking difference in the extent of
reaction compared to DOPC and POPC suggests that it plays a major
role. One possible reason for the short reaction with PLA2 is that
all the fatty acid produced condenses the bilayer [16]. This occurs
to such an extent that the enzyme can no longer access the bonds
to be hydrolyzed. Here an increased penetration depth of the enzyme
into the DPPC bilayer seems to have an inhibiting effect. The
formation of what are thought to be product-rich domains in DPPC
and other monolayers containing palmitic acid and phase separation
of the enzyme itself into condensed domains have been observed by
fluorescence microscopy [17, 18].
By using neutron reflection it was possible to observe how the
PLA2–membrane interaction changes concurrently with the effects
that these differences have on the reaction kinetics.

4.1.4 Fate of the Reaction Products


One of the limiting factors in the study of PLA2 kinetics has been the
lack of methods to analyze changes in the membrane composition.
Such analysis is needed to determine the partitioning of the
reaction products between the membrane and the water. The use of
a single-chain deuterium-labeled lipid substrate (d31-POPC) allows
the distribution of the reaction products to be monitored. This lipid
has a perdeuterated sn-1 palmitoyl (C16:0) chain and an unmodified sn-2
oleyl (C18:1) chain. A reaction scheme with the neutron scattering
length density values (in units of 10⁻⁶ Å⁻²) of the phosphocholine
lipid components is shown in Fig. 4.4.

[Figure 4.4: reaction scheme for the phospholipid and its hydrolysis
products; the headgroups are labeled with a scattering length density
of 1.86]

Figure 4.4 d31-POPC hydrolysis reaction catalyzed by PLA2. The neutron
scattering length densities of the lipid components are indicated in units of
10⁻⁶ Å⁻². Reprinted from Ref. [7], Copyright (2007), with permission from
Elsevier.

As the half-deuterated phospholipid molecules are hydrolyzed
by PLA2, a deuterated lysolipid and a nondeuterated fatty acid are
created, with a large contrast in their scattering length densities
(+6.44 and −0.2, respectively, in units of 10⁻⁶ Å⁻²). If there is
unequal partitioning of these fragments between the membrane
and the solution, this can be monitored as changes in the neutron
reflectivity.
The reflectivity profiles of d31-POPC before hydrolysis and after
70 min and 10 h 10 min of N. mossambica mossambica PLA2
hydrolysis are shown in Fig. 4.5. Compared to unlabeled POPC,
which is immediately and completely hydrolyzed by the enzyme,
the reflectivity changes during the entire course of hydrolysis of
d31-POPC are very small and mainly correspond to a decrease of the
membrane thickness to 31 ± 2 Å. Remarkably, at 70 min the
scattering length density of the lipid chains has decreased to
(1.44 ± 0.15) × 10⁻⁶ Å⁻², which corresponds to a 1:3 ratio of deuterated
palmitoyl chains and nondeuterated oleyl chains. This indicates that
50% of the lipid molecules have been hydrolyzed and that all the
released d31-lyso-palmitoyl-phosphocholine has been solubilized
into the aqueous phase. The small changes in reflectivity arise because
the scattering length density of lyso-C16:0-PC (6.44 × 10⁻⁶ Å⁻²)
is very close to that of D2O, meaning that its replacement by the solvent
is barely observable.

Figure 4.5 Neutron reflectivity profiles of d31-POPC before and during Naja
mossambica mossambica PLA2 hydrolysis. The black lines indicate the fits
corresponding to the solubilization of the deuterated lysolipid. The alternate
red and blue lines show the reflectivity that would correspond to the case
where both reaction products leave the interface at equal rates. The lowest
curve shows the reflectivity of the substrate in the absence of lipids. The
membrane reflectivity curves have been shifted up by successive factors of
10 for clarity. Reprinted from Ref. [7], Copyright (2007), with permission
from Elsevier.
At this stage of the reaction, a 22 ± 1 Å thick layer of the enzyme
was found at the membrane–water interface in a position similar
to that found in POPC and DOPC and replacing 40 ± 5% of the
volume of the outer lipid headgroups. This correlates well with the
amount of solubilized lysolipid (50%) indicated by the scattering
length density changes. After 10 h the reaction has gone to near
completion, with a 23 ± 1 Å layer of oleic acid
(ρ = (−0.2 ± 0.15) × 10⁻⁶ Å⁻²) remaining on the surface at a volume
fraction of 0.55 ± 0.05 and a 21 ± 1 Å thick layer of PLA2 with a
significantly increased volume fraction of 0.5 ± 0.05.
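
A fitted layer thickness and volume fraction can be converted into an adsorbed mass per unit area. The sketch below is a rough illustration under a stated assumption: it uses a protein mass density of 1.35 g/cm³, a typical textbook value that does not come from this chapter, so the resulting number is an estimate rather than a reported result. The function name is hypothetical.

```python
def adsorbed_amount_mg_per_m2(thickness_angstrom, volume_fraction,
                              density_g_per_cm3):
    """Adsorbed mass per unit area of a layer, in mg/m^2."""
    thickness_nm = thickness_angstrom / 10.0
    # 1 nm x 1 g/cm^3 = 1e-9 m x 1e6 g/m^3 = 1e-3 g/m^2 = 1 mg/m^2.
    return volume_fraction * thickness_nm * density_g_per_cm3


# Thickness (21 angstrom) and volume fraction (0.5) are the fitted values
# quoted in the text; the density of 1.35 g/cm^3 is an assumption.
enzyme_amount = adsorbed_amount_mg_per_m2(21.0, 0.5, 1.35)
print(f"estimated PLA2 coverage: {enzyme_amount:.2f} mg/m^2")
```

The same conversion applies to any other layer of the fitted model, provided a suitable mass density is available for the component in question.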
These data point strongly to a mechanism in which the
accumulation of the negatively charged fatty acid in the membrane is
accompanied by an increased adsorption of PLA2, while the lysolipid
leaves the supported bilayer, giving the enzyme easier access to the
hydrophobic membrane region.

4.1.5 The Lag Phase and Activation of Pancreatic PLA2


Pancreatic PLA2 exhibits a long lag period in zwitterionic
phosphocholine membrane substrates before the onset of rapid hydrolysis
[19]. The phenomenon has been studied intensively using
phospholipid vesicles, where it has been established that during the lag
phase a shift in the intrinsic tryptophan fluorescence of PLA2 occurs.
This indicates either a conformational change or membrane association
of the enzyme or both [20, 21]. Recently it has also been found
that the burst of hydrolysis coincides with the maximum available
membrane edge in supported membranes [22]. Such a burst also
occurs with the maximum of lateral membrane heterogeneity that
accompanies the lipid main phase transition [23]. Perhaps the most
intriguing result is that there appears to be a critical fraction of the
hydrolysis products (0.083) that is required for the onset of rapid
hydrolysis [24]. Apart from membrane heterogeneity, the suggested
mechanisms of PLA2 activation to explain the lag phase include slow
penetration of the enzyme into the membrane [13], a
membrane-induced conformational change or dimerization of the enzyme [25,
26], and autocatalysis mediated by the reaction products [27], which
are known to act as membrane permeabilization agents [28].
To understand what changes occur in the membrane during the
lag phase and how they relate to the onset of enzyme activity, we
recorded the neutron reflectivity profiles of phospholipid bilayers
during and after the lag phase and followed the kinetics using
ellipsometry.
Figure 4.6 Surface coverage and thickness of a DOPC bilayer recorded
by ellipsometry during porcine pancreatic PLA2 hydrolysis. PLA2 (0.01
mg/mL): surface coverage Γ (mg/m²) (gray solid line) and bilayer thickness
t (Å) (gray crosses). PLA2 (0.02 mg/mL): bilayer surface coverage Γ
(mg/m²) (black solid lines) and bilayer thickness t (Å) (open circles). PLA2
is injected at t = 0, and the end of the induction period is indicated in
each case. Reprinted from Ref. [7], Copyright (2007), with permission from
Elsevier.

Figure 4.6 shows the surface coverage Γ and thickness of a
DOPC bilayer recorded during attack by porcine pancreatic PLA2
at 0.01 and 0.02 mg/mL concentrations. In both cases there is
a significant induction period before the surface excess starts to
decrease rapidly, which is interpreted as the onset of sustained
hydrolysis. This lag period is shortened from 350 to 230 min as
the enzyme concentration is doubled. Linear fits to the initial rate
of rapid phase hydrolysis after the end of the lag phase resulted in
0.426 and 0.802 μmol/(m²·s) for 0.01 and 0.02 mg/mL, respectively.
In other words, doubling the enzyme concentration leads to a nearly
twofold increase in the rate of removal of lipid from the surface,
while the length of the lag phase decreases by only 34%. This implies
either that not all of the enzyme is immediately associated with
the lipid bilayer, but accumulates during the lag period, or that,
once at the interface, the enzyme molecules require a period of
time to overcome an activation barrier to catalysis. It is not possible
to distinguish these two mechanisms without knowledge of the
location and amount of PLA2.
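
The comparison between the two enzyme concentrations is simple arithmetic on the values quoted above, and can be checked as follows. The variable names are chosen here for illustration only.

```python
# Values quoted in the text for 0.01 and 0.02 mg/mL pancreatic PLA2.
lag_min = {0.01: 350.0, 0.02: 230.0}   # lag period, min
rate = {0.01: 0.426, 0.02: 0.802}      # initial rapid-phase rate, umol/(m^2 s)

rate_ratio = rate[0.02] / rate[0.01]
lag_decrease = 1.0 - lag_min[0.02] / lag_min[0.01]

print(f"rate ratio on doubling the concentration: {rate_ratio:.2f}")
print(f"relative shortening of the lag phase: {lag_decrease:.0%}")
```

A rate ratio close to 2 together with a lag phase that shortens by only about a third is what motivates the two alternative explanations given above.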
The shortening of the lag phase is consistent with earlier
suggestions in Ref. [24] that gradual accumulation of reaction
products is required to provide the ideal conditions for PLA2
catalysis, in which case the time taken to generate a sufficient
fraction of products would be shorter in the presence of a larger
number of PLA2 molecules bound to the membrane.
Figure 4.7 shows the reflectivity profiles of a DOPC bilayer
recorded before and after the injection of porcine pancreatic PLA2.
No significant changes occur in the membrane surface density
during the first 3 h of lipid–PLA2 interaction, as the data could be
fitted by maintaining the lipid bilayer coverage within ± 10% of
the original. This confirms the existence of a lag phase of similar
length to that observed by ellipsometry. The increase in reflectivity
observed at high Q (0.15 < Q < 0.21 Å⁻¹) corresponds to a 21
± 3 Å layer of the enzyme at the lipid–water interface, which
penetrates through the outer headgroup region and 6 ± 3 Å into the
hydrocarbon chains of DOPC. After 6 h, a much more pronounced
change in the reflectivity shows that hydrolytic breakdown of DOPC
has started to cause removal of lysolipid from the support surface.
At 9 h after enzyme injection, the destruction of the DOPC bilayer
continues at a rate comparable to that previously observed for N.
mossambica mossambica PLA2, with 40% of the original bilayer mass
having been removed at this stage. Coincident with the start of rapid
hydrolysis is also a 40% increase in membrane-bound PLA2 from
0.32 to 0.45 mg/m².

Figure 4.7 Neutron reflectivity profiles of DOPC recorded before and
after porcine pancreatic PLA2 injection. (Open diamonds) Clean substrate
reflectivity in D2O, (open circles) DOPC bilayer in 10 mM tris-D2O at pH
7.4, (open squares) DOPC bilayer 3 h after injection of 0.01 mg/mL PLA2,
(open triangles) DOPC bilayer 6 h after PLA2 injection, and (crosses) DOPC
bilayer 9 h after PLA2 injection. Reprinted from Ref. [7], Copyright (2007),
with permission from Elsevier.

4.1.6 Distribution of Products during the Lag Phase


Given the proposed critical fraction of reaction products required
for activation of pancreatic PLA2 , we investigated the membrane
composition during the lag phase using d31 -POPC. The reflectivity
profiles in Fig. 4.8 show that at least a 5 h induction period exists
before any significant amount (>10%) of the lipid material begins
to leave the membrane. The interaction with PLA2 , which during
the lag phase resides in a 21 ± 3 Å layer partially embedded in the
membrane, leads mainly to thinning of the d31 -POPC hydrocarbon
region by 4 ± 1 Å. After 5 h a slow decrease in the chain scattering
length density from 3.17 × 10−6 to 2.75 × 10−6 Å−2 can be observed,
which implies that the deuterated lysopalmitoyl lipid has started to
leave the membrane. On the basis of the phospholipid component
volumes [29], this fraction corresponds to 5% of the deuterated sn-
1 chains in the original bilayer. The interfacial PLA2 volume fraction


[Figure 4.8 plot: log reflectivity versus Q (Å−1), 0.01–0.21 Å−1. Legend: substrate in D2O; d31-POPC at pH 7.4 in 10 mM Tris; 0.02 g/L pig PLA2 after 1 h, 5 h, and 10 h.]

Figure 4.8 Reflectivity profiles of d31-POPC recorded before and after
porcine pancreatic PLA2 injection. (Open diamonds) Clean substrate
reflectivity in D2O, (open circles) d31-POPC bilayer in 10 mM tris-D2O at pH
7.4, (open squares) d31-POPC bilayer 1 h after injection of 0.01 mg/mL PLA2,
(crosses) d31-POPC bilayer 5 h after PLA2 injection, and (open triangles) d31-
POPC bilayer 10 h after PLA2 injection. Reprinted from Ref. [7], Copyright
(2007), with permission from Elsevier.

is initially lower for d31 -POPC than for DOPC but increases threefold
during the lag period.
Our results constitute the first direct measurement of the
absolute amount of PLA2 bound to a phospholipid bilayer during
the lag phase and show unambiguously that it increases, although
the changes observed in lipid composition are small. More remark-
able is that in both cases, DOPC and d31 -POPC, the lag phase is ter-
minated when 5 ± 3% of the lipid molecules have been hydrolyzed
although the time required for this is considerably longer for d31 -
POPC. The volume fraction of PLA2 bound to d31 -POPC is initially
higher, but in both cases, it increases by ∼5 vol% during the course
of the lag phase, indicating that the departure of the lyso-PC and
generation of fatty acid enhance PLA2 binding to the membrane.

4.1.7 Hydrolysis of DPPC by Pancreatic PLA2


The interaction of pancreatic PLA2 with DPPC at the same
temperature (25 °C) is markedly different from that with DOPC and POPC.

Figure 4.9 Reflectivity profiles of d62-DPPC before and after porcine
pancreatic PLA2 injection. (Circles) d62-DPPC bilayer in 10 mM tris-D2O
at pH 7.4, (crosses) d62-DPPC bilayer in 10 mM tris-CmSi at pH 7.4, (open
squares) d62-DPPC bilayer 1.5 h after injection of 0.01 mg/mL PLA2, and
(open diamonds) d62-DPPC bilayer after 7.5 h. Reprinted from Ref. [7],
Copyright (2007), with permission from Elsevier.

Figure 4.9 shows the reflectivity profiles from a chain-deuterated
d62-DPPC bilayer recorded before and after injection of PLA2.
In this case the lipid bilayer had a relatively low volume fraction
of only 45% with large (nanometer size) lipid-free areas of the
support surface. The injection of PLA2 was found to lead to a
small increase in reflectivity corresponding to PLA2 adsorption
and penetration uniformly throughout the bilayer. In a manner
similar to DOPC and POPC, the DPPC bilayer becomes 4 ± 1 Å
thinner upon interaction with PLA2 , indicating that some changes
in the lipid packing occur, but in the case of DPPC, there is
no observable lipid hydrolysis. This indicates that the pancreatic
enzyme is inactive toward DPPC in the gel state. It has been argued
by several researchers that PLA2 can be activated by defects in model
membranes [23, 24], but our data point to the opposite conclusion:
at 25 °C in the absence of Ca2+, DPPC is not hydrolyzed
by pancreatic PLA2 even when the bilayer has a large number of
defects.

4.1.8 Role of the Reaction Products in PLA2 Activation


The neutron reflectivity results are in good agreement with the
critical product mole fraction of 8.3% found in previous studies
[24] to be sufficient to eliminate the lag phase of pancreatic PLA2 .
Our results also show that despite the lysolipid being a leaving
group in the membrane, its chemical structure has an effect on
the activation of PLA2 by way of regulating the length of the lag
phase. Under conditions where the lag phase has been eliminated
by addition of Ca2+ , the effect of the lysolipid on the initial rate of
PLA2 hydrolysis has been found to be much smaller than that of the
fatty acid [30]. Considered together with the fact that the enzyme
seems to be inactive toward but still interacts with DPPC in the gel
state, this suggests that the PLA2–membrane interaction has at least
two steps: adsorption of the enzyme to the lipid–water interface and
subsequent interaction with the lipid hydrocarbon region.
The amount of the enzyme bound to the lipid interface initially
is very similar for DOPC and DPPC and for pancreatic and cobra
venom PLA2s, which suggests that adsorption is predominantly driven by
interactions with the lipid headgroup but that the penetration step,
which is required for catalytic activity, is more dependent on the
nature of the phospholipid packing and hence the nature of the lipid
chains.
The solution partitioning of the d31 -lysolipid and the accumula-
tion of fatty acid in the membrane are consistent with their relative
water solubilities and with the lysolipid having a large zwitterionic
headgroup and the fatty acid having a long saturated hydrocarbon
chain. Although both have relatively low water solubilities, when
they are generated in the membrane in contact with a bulk
aqueous phase, they will partition into the water according to
their solubilities. The critical micellization concentration (cmc) of
1-palmitoyl-sn-glycero-3-phosphocholine is 7 μM [31], which is considerably
higher than the entire stock of lysolipid (0.5 μM) that can be
released from the single supported membrane into the relatively
large water phase. This drives its solubilization from the membrane.
The C18:1 fatty acid is orders of magnitude less soluble. The neutron
reflectivity results are the first confirmation of the changing lipid
composition and the almost exclusive solubilization of the C16:0-
lyso-PC.
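
The solubility argument above can be illustrated with a short estimate of the largest bulk lysolipid concentration a single supported bilayer could produce. This is a minimal sketch: the bilayer area, area per lipid, and cell volume are hypothetical values chosen only to show the order of magnitude and are not taken from the experiments; the cmc used for comparison is the 7 μM value quoted for C16:0 lyso-PC in this chapter [31].

```python
AVOGADRO = 6.02214076e23  # mol^-1

def max_lysolipid_conc_uM(bilayer_area_cm2: float,
                          area_per_lipid_A2: float,
                          cell_volume_mL: float) -> float:
    """Upper bound on the bulk lysolipid concentration (µM) if every lipid in a
    supported bilayer (two leaflets) were hydrolyzed and released into solution."""
    area_A2 = bilayer_area_cm2 * 1e16              # 1 cm^2 = 1e16 Å^2
    n_lipids = 2.0 * area_A2 / area_per_lipid_A2   # two leaflets
    moles = n_lipids / AVOGADRO
    litres = cell_volume_mL * 1e-3
    return moles / litres * 1e6

# Hypothetical geometry (assumption, not from the chapter):
# 40 cm^2 support, 65 Å^2 per lipid, 40 mL of solution
conc = max_lysolipid_conc_uM(40.0, 65.0, 40.0)
cmc_uM = 7.0  # cmc quoted for C16:0 lyso-PC in this chapter [31]
print(f"maximum lysolipid concentration ≈ {conc:.2f} µM; below cmc: {conc < cmc_uM}")
```

With these assumed dimensions the total lysolipid stock is about 0.5 μM, well below the cmc, so the released lysolipid remains monomeric in the bulk phase.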

An increased interaction of the enzyme with reaction products
in the membrane has been suggested earlier [24, 27, 32, 33]. This
is considered to be due to the increased electrostatic attraction
between the negatively charged fatty acid and the
cationic residues of the enzyme that participate in the membrane
interaction [34, 35]. Our results support the product activation
hypothesis by confirming that the amount of PLA2 at the interface
increases during the lag phase (pancreatic PLA2 ) and also during the
active phase (cobra PLA2). However, if the activation of PLA2 were
dependent only on the initial electrostatic attraction, then the length
of the lag phase should be the same for all phosphatidylcholine
bilayers, because the amount of enzyme initially adsorbed is
remarkably similar for all three lipids. Instead, we see a
significant difference in the lag length even between DOPC (3 h)
and POPC (5 h) and no catalytic activity toward DPPC, which clearly
suggests that the activation involves another rate-limiting step.
A debate about the mechanism of PLA2 activation has been going
on for decades between the leading groups in the area. While the Jain
group postulates that the activation is solely based on electrostatic
interactions, that is, only the number of enzymes adsorbed at the
membrane interface counts [36], the Verger laboratory originally
suggested that a penetration step is also required [37].
Our neutron reflection and ellipsometry results offer the first
confirmation of PLA2 penetration and a possible explanation
of several previously predicted effects. The location of PLA2 in
supported bilayers overlaps with the lipid headgroups and part of
the outer leaflet chain region [15]. Accompanying the lag phase we
also observed an increase in the penetration depth. Both results
strongly support the idea of a penetration step as rate limiting in
catalytic activity. In addition, we see a similar amount of PLA2 adsorbed
on the inactive DPPC bilayer during the lag phase as on the active
DOPC bilayer during hydrolysis. Therefore the enzyme adsorption
step cannot be the single activating process.
At least two different modes of membrane binding have been
identified for PLA2 by fluorescence spectroscopy [19, 20]. While
there is no direct structural evidence for a conformational change
of the enzyme, it is more certain that the different fluorescence
states represent two different types of PLA2–lipid interactions, that
is, two different locations of the enzyme at the lipid–water interface.
Penetration into the lipid chain region implies that the lipids are
also required to have the conformational freedom to allow PLA2 to
reach its catalytic depth as a part of lipid insertion into the active
site. Thus an inactive form of PLA2 can be bound at the lipid–water
interface without being able to perform catalysis, as in the case of
DPPC in the gel phase. The activity appears when the lipids are
heated to near or above the phase transition temperature, as has
indeed been observed [26]. The importance of lipid conformation is
further supported by the preferential hydrolysis of the fluid lipid in
mixtures such as 1,2-dimyristoylphosphatidylcholine (DMPC)/1,2-
distearoylphosphatidylcholine (DSPC) [38].
It has also been suggested [17, 24] that the presence of
reaction products causes phase separation in the membrane and can
activate PLA2 via defects. Given the complete solubilization of the
lysolipid evident in our neutron reflectivity data, this effect would
then be solely due to fatty acids, which can have very different
properties according to the nature of their hydrocarbon chains. In
particular it has been found that the effect of unsaturated fatty
acids on a phospholipid bilayer structure is negligible compared to
saturated fatty acids [16]. A saturated long-chain fatty acid such
as palmitic acid tends to pack tightly into a crystalline lattice, even
at physiological temperatures, while the corresponding unsaturated
fatty acid is much more flexible and has virtually no effect on the
lipid bilayer melting temperature.
The increased binding of PLA2 to the d31 -POPC bilayer with
increasing oleic acid content confirms that the binding of this
enzyme is favored by the accumulating fatty acid, but the strength
of their interaction does not inhibit catalysis, which proceeds to
near completion. However, hydrolysis of DPPC by N. mossambica
mossambica PLA2 stops after only ∼15% of the phospholipid has
been consumed, although PLA2 is still found to be present in the
inactive membrane. This behavior can only be explained by a binding
interaction, which has become so strong that the catalytic cycle of
PLA2 is disrupted and the enzyme is irreversibly bound to the lipid
matrix. Such trapping could be caused by tightly packed clusters of
palmitic acid.

The reversibility of PLA2 binding to lipid interfaces has been a
matter of interest for some decades, mainly due to the influence it
has on the types of kinetic analysis that can be applied, with the
scooting and hopping modes being referred to as the irreversible
binding and the reversible binding, respectively [10]. The apparent
deactivation or trapping of PLA2 in DPPC membranes suggests that
the binding in this case at least is irreversible, as otherwise the
enzyme could continue the reaction by hopping in and out of the
membrane. It also suggests that the effect of fatty acids is local
substrate depletion because the reaction stops at such an early stage.
The biological functions and substrates of PLA2 are diverse,
and it is likely that its regulation is related to the membrane lipid
composition in each environment. Since PLA2 plays a major role in
the inflammatory response, it is a good candidate for drug development
aimed at its selective inhibition [39], but progress is hampered
by a lack of understanding about the subtleties of its regulation.
Neutron reflection gives unique information about the composition
and structure of phospholipid membranes and is a valuable tool in
studying the regulation of membrane binding enzymes such as PLA2 .

4.1.9 Effect of pH and Activation by Me-β-cyclodextrin


Having observed that the fatty acid is retained in the membrane
and is largely responsible for the product activation of PLA2 , it is
of interest to study the pH dependence since the fatty acid product
(pKa 7–8) becomes more negatively charged and more water soluble
with increasing pH.
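
To make the expected pH dependence concrete, the degree of ionization of a monoprotic acid can be estimated from the Henderson–Hasselbalch relation. This is a minimal sketch that treats the fatty acid as an ideal monoprotic acid with a single apparent pKa; the function name is ours, and the pKa range and pH values are those quoted in the text.

```python
def fraction_ionized(pH: float, pKa: float) -> float:
    """Fraction of a monoprotic acid in its deprotonated (charged) form."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))

for pKa in (7.0, 8.0):            # apparent pKa range quoted in the text
    for pH in (5.0, 7.4, 9.0):    # pH values studied
        print(f"pKa {pKa:.0f}, pH {pH}: {fraction_ionized(pH, pKa):.1%} ionized")
```

Under this idealization the fatty acid is essentially uncharged at pH 5 and predominantly charged at pH 9, with pH 7.4 falling in the steep part of the titration curve.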
At pH 5, changes in reflectivity after PLA2 injection (as shown in
Fig. 4.10) are very small and correspond to 0.03 ± 0.01 μmol/m2
of PLA2 adsorbed at the membrane–water interface, where it does
not, however, cause any observable decrease in the membrane
lipid volume fraction within 10 h. The solubility of the lysolipid
is unchanged at pH 5, and if formed, it should partition into the
solution phase. The resolution of fitting the data corresponds to a
detection limit of 3 mol% of lysolipid, and from this, we can conclude
that no significant hydrolysis occurs at pH 5.
At pH 9, the lipid volume fraction starts to decrease within 1
h, with considerably more (0.08 ± 0.01 μmol/m2 ) PLA2 initially


Figure 4.10 Neutron reflectivity profiles recorded before and during
d31-POPC hydrolysis by 0.01 mg/mL PLA2 at (a) pH 5 and (b) pH 9.
(Insets) Neutron scattering length density profiles corresponding to the fits.
Reprinted with permission from Ref. [40], Copyright © 2009, American
Chemical Society.

adsorbed on the membrane surface. After 9 h, the membrane has
been completely digested into a fatty acid layer of 21 ± 3 Å, with
0.14 ± 0.03 μmol/m2 PLA2 attached to the lipid–water interface.
Comparison of this result with previous data measured at pH 7.4 [7]
reveals that the reaction rate is not significantly higher at pH 9 than
at neutral pH. At pH 9, 42% of the lysolipid has partitioned into the
solution phase, which agrees with the pH 7.4 result.
To determine the effects of changing pH on PLA2 activity and
the enzyme–membrane interaction independent of the solubility of
the reaction products, hydrolysis of d31 -POPC was also investigated
in the presence of Me-β-cyclodextrin (Me-β-CD), which has been
shown to extract fatty acids from monolayers at the air–water
interface [41, 42] and to have an activating effect on PLA2 and other
lipase enzymes [43, 44]. In contrast to the previous results, the
hydrolysis completes within 1 h at all pH values in the presence of
0.5 mM Me-β-CD.
Contrary to the assumption that cyclodextrins activate PLA2 by
extracting the fatty acid [43], Me-β-CD predominantly facilitates
solubilization of the lysolipid. In each case, a nearly identical fatty
acid layer is formed, but the amount of PLA2 bound to it increases
from 0.08 ± 0.03 μmol/m2 at pH 5 to 0.12 ± 0.03 μmol/m2 at pH 9.
Only 22% of the oleic acid formed is co-extracted with the lysolipid

Figure 4.11 Initial lipid surface concentration, final fatty acid concentra-
tion, and initial/final lipid:PLA2 ratios as functions of pH in the (a) absence
and (b) presence of 0.5 mM Me-β-CD. Reprinted with permission from Ref.
[40], Copyright © 2009, American Chemical Society.

at pH 9. This result is not surprising considering that the lysolipid
has a much higher cmc (7 μM) [31] and faster equilibrium between
the membrane and the solution than the fatty acid. The thicknesses
of the fatty acid layers (21–23 Å) are much smaller than the combined length of
two oleic acid molecules (C18:1cis9 ≈ 15 Å each), suggesting that severe
tilting of the molecules in the bilayer (44°–50°) or interdigitation of the chains occurs.
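
The geometric reasoning behind the tilt estimate can be sketched as follows, assuming rigid, fully extended chains in a non-interdigitated bilayer, so that the layer thickness d and the molecular length L are related by cos θ = d/(2L). The function name and the rigid-chain assumption are ours; the chain length is only approximately known, so two values are shown.

```python
import math

def tilt_angle_deg(layer_thickness_A: float, chain_length_A: float) -> float:
    """Tilt from the layer normal (degrees) for a non-interdigitated bilayer
    of rigid molecules of the given length."""
    ratio = layer_thickness_A / (2.0 * chain_length_A)
    if not 0.0 < ratio <= 1.0:
        raise ValueError("thickness must be positive and at most twice the chain length")
    return math.degrees(math.acos(ratio))

for d in (21.0, 23.0):            # measured layer thicknesses (Å)
    for length in (15.0, 16.0):   # assumed oleic acid lengths (Å)
        print(f"d = {d:.0f} Å, L = {length:.0f} Å -> tilt ≈ {tilt_angle_deg(d, length):.0f}°")
```

With chain lengths of 15–16 Å this simple model gives tilts of roughly 40°–49°, of the same order as the range quoted above; the exact values depend on the molecular length assumed.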
Figure 4.11 shows the initial lipid and final fatty acid amounts
and the lipid–PLA2 ratio as a function of pH. In the absence of Me-
β-CD there is no difference in the solubilization of the fatty acid
between pH 7.4 and pH 9. This is evidenced by the surface coverage
of the final oleic acid layer, which corresponds to the same number
of molecules as in the original lipid bilayer. The lipid-to-PLA2 ratio
decreases with increasing pH, as expected with an increasing degree
of fatty acid ionization, which increases the affinity of the enzyme
for the negatively charged membrane. At pH 9, more PLA2 is also
recruited to the membrane as the reaction progresses.
In the presence of Me-β-CD, the reaction rate is so fast at all pH
values that only the final fatty acid–PLA2 ratio could be measured,
which also decreases with pH.
A closer look at the data reveals more subtle differences in
the behavior in the presence and absence of Me-β-CD. With Me-
β-CD, the amount of PLA2 at the membrane surface at pH 5 is
much higher than that in the absence of Me-β-CD, which seems
counterintuitive since the lysolipid extraction should not alter the
enzyme–membrane interaction at pH 5 when the fatty acid is
uncharged (the apparent pKa of long-chain fatty acids is 8–9 in an
aggregated membrane [45]). At pH 7.4, it may be that a small number
of the fatty acids produced are ionized, but at pH 9, they are fully
ionized, which is consistent with the increase in the reaction rate
from pH 7.4 to 9 being accompanied by more PLA2 at the membrane
interface. The enzyme amount also increases during the reaction
faster at pH 9 than at pH 7.4.
Changes in the ionization state of the enzyme are one possible
cause for the pH-dependent activity. The N. mossambica mossambica
PLA2 sequence has several ionizable groups [6], of which three
histidines (pKa ≈ 6.2) change their ionization states going from pH
9 to pH 5. It is also possible that in these experiments some of the
seven aspartates (pKa 4.5) and six glutamic acids (pKa 4.6) of PLA2
were partially uncharged, as pH 5 in D2 O corresponds to pD 4.6 [46].
The net enzyme charge (including the Ca2+ cofactor) estimated from
the pKa values of all residues is neutral at pH 9 (when the N-terminus is
deprotonated) and +1 at pH 7.4 but becomes increasingly positive
at pH 5 (+10, assuming that the Asp and Glu residues have a 50%
degree of ionization). The positively charged residues (Lys and Arg)
that are largely responsible for the activation of PLA2 by negatively
charged lipids or fatty acids are clustered on the interfacial binding
face [47], but they do not change their ionization states within the
pH 5–9 range.
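
The contribution of the histidine, aspartate, and glutamate residues to the change in net charge can be estimated with the Henderson–Hasselbalch relation. The sketch below uses only the residue counts and pKa values quoted above and treats each group as independent; it deliberately omits Lys, Arg, the termini, and the Ca2+ cofactor, whose charges are taken as constant over this pH range, and it does not apply the pD correction mentioned in the text. All names are ours.

```python
def protonated_fraction(pH: float, pKa: float) -> float:
    """Fraction of a titratable group that carries its proton at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

# residue: (count, pKa, charge when protonated, charge when deprotonated)
GROUPS = {
    "His": (3, 6.2, +1, 0),
    "Asp": (7, 4.5, 0, -1),
    "Glu": (6, 4.6, 0, -1),
}

def titratable_charge(pH: float) -> float:
    """Summed average charge of the His, Asp, and Glu side chains at a given pH."""
    total = 0.0
    for count, pKa, q_prot, q_deprot in GROUPS.values():
        f = protonated_fraction(pH, pKa)
        total += count * (f * q_prot + (1.0 - f) * q_deprot)
    return total

shift = titratable_charge(5.0) - titratable_charge(9.0)
print(f"charge shift from pH 9 to pH 5 ≈ {shift:+.1f}")
```

With these pKa values the shift is about +6, somewhat smaller than the value obtained when the Asp and Glu residues are assumed to be 50% ionized as in the text, which shows how sensitive the estimate is to the assumed degree of carboxylate ionization.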
The inactivity at low pH could be explained by inhibition of
the active site mechanism of PLA2 , which relies on nucleophilic
attack on the lipid sn-2 ester bond by a water molecule, catalyzed
by proton removal from the water by His48 [2, 48]. At pH 5,
His48 is protonated, and therefore it should not be able to act as
the catalytic base. However, activation by Me-β-CD shows that the
hydrolytic mechanism is not fully inhibited at pH 5. PLA2 is generally
inhibited by phosphonate transition-state analogues at low pH,
which has been found to be based on hydrogen bonding with the
phosphate group [49]. It has also been shown that the protonation of
His48 promotes the binding of PLA2 to phosphorylcholine surfactant
micelles [50, 51]. This suggests that PLA2 binding to the lipid
phosphate becomes stronger when His48 is protonated, but it is
not reflected in the amount of the enzyme bound to the membrane,
which is lowest at pH 5. Cleavage of the sn-2 bond is also not
inhibited because extraction of the lysolipid by Me-β-CD leads to
full activation. The results therefore suggest the presence of another
mechanism by which PLA2 can cleave the sn-2 acyl bond at low pH,
in which the release of the products depends on extraction by Me-β-
CD.
At pH 5, it is plausible that the substrate lipid could be bound to
the active site in the same orientation as that found for phosphate-
based substrate analogue inhibitors, with the phosphate forming a
hydrogen bond with the protonated His48, and both the phosphate
and the sn-2 carbonyl oxygen coordinated to the calcium cofactor
(bound to Asp49). At low pH, it may also be possible for the
catalysis to proceed via general acid-catalyzed hydrolysis. However,
general acid-catalyzed ester hydrolysis does not take place in the
absence of the enzyme, as the neutron reflectivity profiles of the
lipid substrate membranes were recorded over a period of several
hours, and the membranes were found to be stable. General acid
catalysis could also mean the additional hydrolysis of the sn-1 fatty
acyl chain. However, this would not explain activation by Me-β-CD,
as the glycerophosphocholine group could still inhibit the enzyme by
remaining bound to His48. We therefore propose that specific acid-
catalyzed hydrolysis of the sn-2 ester is promoted by the alternative
conformation adopted by the substrate at pH 5 and requires Me-β-
CD to release the lysolipid product.
This raises a question about the mechanism by which Me-β-
CD activates PLA2 at the membrane–water interface. Me-β-CD does
not adsorb on or penetrate the membrane, and extraction of the
lysolipid directly from the membrane is therefore unlikely if PLA2
is unable to release it. The substrate lipids are too large to be
encapsulated by Me-β-CD. PLA2 binding to the membrane is weak
at pH 5, suggesting that it operates via the hopping mechanism. At
low pH, in the weak substrate-binding limit, membrane-bound PLA2
is in rapid equilibrium with the solution phase and hops from site to
site on the membrane surface between catalytic cycles, as illustrated
in Fig. 4.12. The enzyme detaches from the membrane after
hydrolyzing a substrate lipid, with the lysolipid product bound to it.
Me-β-CD extracts the lysolipid from PLA2 , which allows the enzyme
to return to the catalytic cycle. In the absence of Me-β-CD, PLA2 is
unable to release the lysolipid products and remains bound to the
membrane.


Figure 4.12 PLA2 activation by Me-β-CD during the hopping mechanism.
As weakly bound PLA2 unbinds from the substrate membrane after
hydrolyzing a lipid substrate, the lysolipid is extracted from PLA2 by Me-
β-CD, which releases the enzyme back to the catalytic cycle. Reprinted with
permission from Ref. [40], Copyright © 2009, American Chemical Society.

In conclusion, neutron reflection has allowed the in situ observation
of several key phenomena in supported membranes during
PLA2 hydrolysis. First, the penetration of the enzyme could be
clearly observed and depends on the lipid composition. Second, the
amount of the enzyme at the membrane–water interface correlates
with the rate of hydrolysis, but the extent of reaction varies with lipid
chain saturation and temperature. The fate of the reaction products,
measured by the use of a partially deuterated lipid, was determined
for the first time, with the exclusive solubilization of the lysolipid
leading to accumulation of the fatty acid in the membrane. The
length of the lag phase of pancreatic PLA2 depends on lipid chain
unsaturation, with the number of lipids hydrolyzed during the lag phase
corresponding well with the critical mole fraction previously found
necessary to activate pancreatic PLA2 for both DOPC and POPC.
The enzyme penetrates deeper into the membrane and associates
more strongly with the membrane during the lag phase. The fatty
acid solubility varies only moderately with pH, whereas Me-β-CD
activates PLA2 independent of pH by facilitating the solubilization
of the lysolipid.

Neutron reflection is clearly capable of detecting several important
aspects of interfacial catalysis. The measurements presented
here have been relatively long, of the order of hours, but recent
advances in instrumentation have decreased the time required for
acquiring a membrane structure to minutes, and with the advent of
new neutron sources, there is the prospect of reaching second or even
subsecond timescales.

4.2 Other Lipolytic Enzyme Reactions on Surfaces

4.2.1 Triacylglycerol Lipases and the Role of Lipid Liquid Crystalline Nanostructures

Lipolytic enzymes or lipases not only have important biological
functions in lipid metabolism but also are used in numerous
applications [52]. The substrates for lipolytic enzymes are self-
assembled structures or aggregates of different lipid molecules,
as discussed before. Most natural substrates have low aqueous
solubility and are dispersed in or exposed to an aqueous solution
containing the enzyme. In contrast to phospholipases the substrate
for triacylglycerol lipases (TGLs) is often nonlamellar, so in addition
to the interfacial interaction one also has to consider the effect
of the curvature and phase transformations that might occur as
a consequence of the enzyme action. Here it is important to bear
in mind that lipolytic enzymes act at the oil–water interface, and
hence, the term “interfacial activation” has been used to describe
the lipase action, which implies that the interfacial structure of the
substrate is important [53]. The surface properties of the substrate
are dependent on the lipid composition as well as the conditions
in the aqueous phase. Lipolytic enzymes are generally small in size
compared to the substrate assembly, but they can be similar in size
to the local curvature of the interface.
In the pioneering in vitro study of lipolysis of triglyceride
droplets in an intestine-like environment, Patton and Carey ob-
served a sequence of liquid crystalline phases depending on
the solution conditions, among them a viscous isotropic phase
composed of monoglycerides and fatty acids, identical to the one
formed in monoglyceride systems [54]. The isotropic phase that
these lipolysis products form, later defined as a cubic phase, is
under normal physiological conditions rapidly solubilized in mixed
micelles of fatty acids and bile salts if present in excess. However,
after a fat-rich meal, the bile acid amounts in vivo are not always
sufficient to solubilize all lipids, and therefore, it has been argued
that the cubic liquid crystalline phases can be present during
digestion [55]. Here, the bicontinuity and the ability of the cubic
monoglyceride phases to solubilize hydrophobic and amphiphilic
molecules are thought to be important for the lipolysis process [56].
These structural features make it possible for the lipase and water to
freely diffuse through the phases formed by the lipolysis products,
which surround the diminishing fat droplet.
Inspired by the work of Patton and Carey, the effect of lipase
action on the liquid crystalline phase as well as other self-assembled
structures such as vesicles and dispersions of cubic phases was
studied by Borné et al. using small-angle X-ray scattering (SAXS)
and nuclear magnetic resonance (NMR) [57–59]. They showed that
the observed changes in self-assembled structures could be mapped
by following the monoolein–oleic acid phase diagram (low pH)
where lipolysis gives rise to a sequence of phase transitions:
cubic → reversed hexagonal → micellar cubic → reversed micellar
→ dispersion.
Alternatively, the monoolein–sodium oleate aqueous ternary phase
diagram is followed at high pH, when the corresponding phase
sequence proceeds from the lamellar to the normal hexagonal
phase.
Salentinig et al. also investigated the impact of the lipolysis
products, for example, oleic acid, on the structure of monoolein cubic
phase dispersions stabilized by Pluronic F127 [60]. Their SAXS
data show that the system undergoes structural transitions from
a dispersion of bicontinuous cubic phases (cubosomes) through
dispersions of reversed hexagonal phases (hexosomes) and micellar
cubic phases (Fd3m symmetry) to emulsified microemulsions with
increasing oleic acid concentration [60]. As expected and previously
reported by Borné et al. [58], the internal structure of the dispersed
particles depends strongly on the pH, where high pH tends to favor
the formation of vesicles instead of the reverse phases at low pH.
When triglyceride emulsions were stabilized by β-lactoglobulin and
β-casein, in vitro lipolysis by pancreatic lipase caused a transition
sequence from an oil emulsion to a microemulsion, micellar cubic,
inverse hexagonal, and, finally, bicontinuous cubic liquid crystalline
droplets [61]. The authors also observed strong effects on the lipolysis
reaction of solution properties such as bile-juice concentration and
pH as well as of hydrophobic additives. Since then several studies
on the liquid crystalline phase and colloidal transformations during
the digestion of lipid assemblies, emulsions of acylglycerides, or
liquid crystalline nanoparticles (LCNPs) containing polar lipids have
been presented [61–67]. These have shown that apart from the
lipid composition and the type of lipolytic enzyme, the solution
conditions such as bile salt concentration, pH (i.e., the degree of
protonation of the fatty acids), and buffer conditions are important
both for the kinetics of the lipolysis and the formed nanostructure.
For example, Salentinig et al. observed by time-resolved synchrotron
SAXD and cryo–transmission electron microscopy (cryo-TEM) that
highly ordered nanostructures are formed during the digestion
of milk fat globules catalyzed by lipolytic enzymes [64]. They
observed that at low-bile conditions highly ordered lipid particles
with substantial internal surface area are formed.
Recently Wadsäter et al. showed how the structure of
glycerol dioleate (GDO)/soy phosphatidylcholine (soy-PC) LCNPs
evolves during exposure to a TGL under near-physiological
temperature and pH conditions [65]. TGL catalyzes the degradation
of glycerol dioleate to monoglycerides, glycerol, and free fatty acids.
During the degradation, the internal liquid crystalline structure of
the nanoparticles changes continuously from the reversed Fd3m
structure to structures with less negative curvature (hexagonal,
bicontinuous cubic, and sponge phases) and finally results in the
formation of multilamellar liposomes (Fig. 4.13).
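
Phase assignments of this kind rest on the characteristic ratios of Bragg peak positions in the SAXS/SAXD pattern. The sketch below lists the standard spacing ratios for the phases mentioned in this section and picks the best match for a set of observed peaks. It is an illustration only: the function names, the tolerance, and the example peak positions are hypothetical, and it assumes the first observed peak is the first allowed reflection of the phase.

```python
import math

# First allowed reflections, given as h^2 + k^2 + l^2 (cubic), h^2 + hk + k^2
# (hexagonal), or h^2 (lamellar); peak positions scale as the square root.
PEAK_INDICES = {
    "lamellar": [1, 4, 9, 16],
    "hexagonal": [1, 3, 4, 7],
    "bicontinuous cubic Pn3m": [2, 3, 4, 6],
    "bicontinuous cubic Im3m": [2, 4, 6, 8],
    "micellar cubic Fd3m": [3, 8, 11, 12],
}

def relative_peak_positions(phase: str) -> list:
    """Expected peak positions divided by the position of the first peak."""
    idx = PEAK_INDICES[phase]
    return [math.sqrt(i / idx[0]) for i in idx]

def best_match(q_peaks, tol=0.03):
    """Return the phase whose ratios best fit the observed peaks, or None."""
    observed = [q / q_peaks[0] for q in q_peaks]
    best_phase, best_dev = None, tol
    for phase in PEAK_INDICES:
        expected = relative_peak_positions(phase)[: len(observed)]
        if len(expected) < len(observed):
            continue  # more observed peaks than tabulated reflections
        dev = max(abs(o - e) for o, e in zip(observed, expected))
        if dev < best_dev:
            best_phase, best_dev = phase, dev
    return best_phase

# Hypothetical peak positions (Å^-1) with ratios 1 : sqrt(3) : 2
print(best_match([0.100, 0.1732, 0.200]))
```

In practice several phases can coexist during lipolysis and weak reflections may be missing, so such automatic matching is a starting point for indexing rather than a substitute for it.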
A number of lipase activity studies have concerned the interfacial
reactions of these enzymes using monolayers and have provided
some leads on how to control lipase activity by modulating the lipid
composition [68, 69]. We have previously studied lipase-catalyzed
hydrolysis of cubic nanoparticles formed from monoolein, which is
the final step in the lipolysis of triolein and leads to drastic changes


Figure 4.13 (a) Cryo-TEM images of 50/50 soy-PC/GDO nanoparticles in


0.12 M Tris and pH 7.5 degraded by TGL at 37◦ C. The sequential evaluation
of phases is illustrated. (b) SAXD data showing the effect of TGL-catalyzed
degradation of 50/50 GDP/soy-PC LCNPs in 0.12 M Tris at pH 7.5 and
37◦ C. The lipase-free reference is shown as a thick line. Diffractograms
were initially recorded every minute for 56 min after addition of TGL.
Additional diffractograms were then recorded after 4 h and 15 min
and 9 h. Reprinted with permission from Ref. [65], Copyright  c 2014,
American Chemical Society. The article is available as open access from
http://pubs.acs.org/doi/pdf/10.1021/am501489e.

in the liquid crystalline structure [70]. Thermomyces lanuginosus


lipase (TLL) was conjugated to gold nanoparticles to visualize the
enzyme location and the enzymatic digestion of lipid aggregates by
means of cryo-TEM.

4.3 Cellulase Enzymes

Lignocellulose fibers are hierarchical structures of cellulose fibrils
with intertwined hemicellulose chains, which are considered to
protect the cellulose from hydrolysis. The degradation of cellulosic
fibers in nature therefore requires the concerted and cooperative
action of a number of cellulolytic and hemicellulolytic enzymes [71,
72]. Cellulolytic and hemicellulolytic enzymes resemble lipolytic
enzymes in one important aspect, namely that they are water-soluble
enzymes acting on an insoluble substrate. Therefore,
although they are strictly speaking not interfacially activated
(i.e., the substrate does not have to be insoluble), interactions
with the substrate surface are important. In fact, carbohydrate-binding
modules (CBMs) are often appended to cellulases and
hemicellulases for this purpose. For instance, different CBMs that
bind crystalline cellulose (CBM1) or amorphous cellulose (CBM4)
can be used [73, 74].
The increased interest in biofuels and novel applications of
cellulosic materials has triggered a large number of studies, and
a comprehensive account of this work would require a
separate review. Here we will, therefore, only address some of the
relevant aspects that have been elucidated using surface techniques
to monitor the cellulase activity by following the degradation of a
model cellulose substrate. These techniques include ellipsometry
[75–77], quartz crystal microbalance (QCM) [78–81], and neutron
reflectometry [78, 79]. Eriksson et al. showed using ellipsometry
that degradation of cellulose by a commercial cellulase mixture,
Celluzyme, was preceded by an initial enzyme adsorption to the
model cellulose surface and increased with enzyme concentration
[76]. However, they noted that both enzyme adsorption and activity
depended only weakly on pH. Interestingly, they
found that the activity saturated at sufficiently high enzyme
concentration. In a later study, they used a better-defined
cellulase preparation, Humicola insolens GH45 cellulase [75], and
found the CBM to be very important for maintaining both the surface
binding and the activity of the enzyme, which are otherwise significantly
inhibited. Removing the CBM also eliminated the pH dependence of
the enzyme-catalyzed degradation process.


Hu et al. used QCM with dissipation monitoring (QCM-D) in an
interesting way to measure changes
in the bulk solution viscosity as a consequence of the digestion
of different cellulosic thin films [80]. Their results were cross-correlated
with other methods to find that celluloses with higher
crystallinity indices were hydrolyzed more slowly and to a lower
extent than those of low crystallinity.
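As an illustration of how a QCM frequency shift can be tied to the viscosity of the contacting liquid, the sketch below uses the Kanazawa–Gordon relation for a crystal with one face in a Newtonian liquid. This is a generic textbook relation and not necessarily the analysis used in Ref. [80]; the function names are hypothetical, and the quartz constants are standard literature values for AT-cut quartz.

```python
import math

# Standard literature values for AT-cut quartz (assumed, not from the chapter).
RHO_QUARTZ = 2648.0    # density, kg/m^3
MU_QUARTZ = 2.947e10   # shear modulus, Pa


def kanazawa_gordon_shift(f0_hz, liquid_density, liquid_viscosity):
    """Frequency shift (Hz) when one crystal face contacts a Newtonian liquid.

    f0_hz: fundamental resonance frequency (Hz)
    liquid_density: kg/m^3
    liquid_viscosity: Pa s
    """
    return -(f0_hz ** 1.5) * math.sqrt(
        liquid_density * liquid_viscosity / (math.pi * RHO_QUARTZ * MU_QUARTZ)
    )


def viscosity_from_shift(f0_hz, delta_f_hz, liquid_density):
    """Invert the relation above to estimate viscosity (Pa s) from a shift."""
    return (delta_f_hz ** 2) * math.pi * RHO_QUARTZ * MU_QUARTZ / (
        f0_hz ** 3 * liquid_density
    )


if __name__ == "__main__":
    # A 5 MHz crystal in water at room temperature: roughly -0.7 kHz.
    shift = kanazawa_gordon_shift(5e6, 1000.0, 1.0e-3)
    print(f"Predicted shift: {shift:.0f} Hz")
    print(f"Recovered viscosity: {viscosity_from_shift(5e6, shift, 1000.0):.2e} Pa s")
```

In this picture, soluble sugars released by digestion that raise the bulk viscosity would make the frequency shift more negative, which is the kind of signal a viscosity-based activity assay could follow.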
Maurer et al. carried out a more systematic kinetic study
using flow ellipsometry to record the degradation of spin-coated
cellulose films by Trichoderma reesei cellulases as a function of
time [77]. The kinetic data obtained were analyzed by applying
a combined Langmuir/Michaelis–Menten model, which takes into
account reversible adsorption of cellulase to the cellulose surface
and the formation of complexes between surface cellulose chains
and the adsorbed enzyme. The latter was found to be the
rate-determining step. As in previous studies [75, 76], the rate of cellulose
digestion at the water–solid interface increases with enzyme
concentration until a level that corresponds to the maximum
adsorption of cellulase is reached. Maurer et al. also used atomic force
microscopy (AFM) to characterize the morphology of the cellulose
surface before and after exposure to the enzyme. They found that
the surface became significantly rougher after cellulase exposure,
and they concluded from the occurrence of deep crevasses that the
enzyme action does not occur in an isotropic fashion.
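The saturation of the digestion rate with enzyme concentration can be illustrated with a minimal sketch. It is not the fitted model of Ref. [77]: it simply assumes equilibrium Langmuir adsorption of the enzyme and a digestion rate proportional to the adsorbed amount. The parameter names (k_ads, gamma_max, k_cat) and the numerical values are hypothetical.

```python
def langmuir_adsorbed_amount(c_enzyme, k_ads, gamma_max):
    """Equilibrium adsorbed amount for reversible Langmuir adsorption.

    c_enzyme: bulk enzyme concentration
    k_ads: adsorption equilibrium constant (inverse concentration units)
    gamma_max: adsorbed amount at full surface coverage
    """
    return gamma_max * k_ads * c_enzyme / (1.0 + k_ads * c_enzyme)


def surface_digestion_rate(c_enzyme, k_ads, gamma_max, k_cat):
    """Rate of substrate removal, assumed proportional to adsorbed enzyme."""
    return k_cat * langmuir_adsorbed_amount(c_enzyme, k_ads, gamma_max)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the shape of the curve.
    k_ads, gamma_max, k_cat = 1.0, 2.0, 3.0
    for c in (0.1, 1.0, 10.0, 100.0):
        rate = surface_digestion_rate(c, k_ads, gamma_max, k_cat)
        print(f"c = {c:6.1f}  rate = {rate:.3f}")
```

Under these assumptions the rate rises almost linearly at low concentration and levels off at k_cat * gamma_max once the surface is saturated with enzyme, which mirrors the qualitative behavior described in the text.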
Cheng et al. compared the action of four cellulases (endoglucanases)
on amorphous cellulose films by using a combination
of neutron reflectometry and QCM-D to follow the evolution of
the cellulose layer structure [78]. The four cellulases compared
were a mesophilic fungal endoglucanase (Cel45A from H. insolens),
a processive endoglucanase from a marine bacterium (Cel5H from
Saccharophagus degradans), and two enzymes from thermophilic
bacteria (Cel9A from Alicyclobacillus acidocaldarius and Cel5A from
Thermotoga maritima). The neutron reflectivity curves and QCM-D
data for the Cel45A endoglucanase at 20 °C are shown in Fig. 4.14.
Cel45A and Cel5H, which contain CBMs, penetrated and digested
the bulk of the amorphous cellulose films to a much greater extent
than Cel9A and Cel5A, which lack CBMs. Cel45A causes a substantial
expansion and roughening of the cellulose film, evidenced by the
shift and smearing out of the interference fringes in the neutron
reflectivity data (Fig. 4.14a). A similar observation was made for
the same enzyme on a semicrystalline cellulose film using AFM
[77]. However, in the case of Cel5H, the observed decrease in film
thickness due to enzyme action was not accompanied by the change in
layer roughness seen for Cel45A. Cheng et al. found their results
consistent with Cel45A acting as a classic endoglucanase digesting
the interior cellulose chains, while Cel5H digested predominantly
the ends of the chains, consistent with its properties as a processive
endoglucanase [78].

Figure 4.14 (a) Neutron reflectometry curves recorded before and after
exposing a regenerated cellulose film to a solution of Cel45A cellulase at
5 μM and 20 °C. (b) QCM-D data showing the frequency shift Δf/n and the
change in dissipation ΔD versus time for regenerated cellulose films
exposed to 5 μM cellulase Cel45A. Reprinted with permission from Ref. [78],
Copyright © 2012, American Chemical Society.
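To see why a shift in the interference fringes signals a change in film thickness, the sketch below applies the usual Kiessig-fringe approximation, d ≈ 2π/ΔQ, where ΔQ is the spacing between neighboring fringe minima. This is a rough estimate that holds well above the critical edge and is no substitute for the full model fitting used in reflectometry studies; the function name and the fringe positions are hypothetical.

```python
import math


def thickness_from_fringes(q_minima):
    """Estimate film thickness from consecutive fringe minima.

    q_minima: fringe minimum positions in inverse angstroms, ascending.
    Returns the thickness in angstroms using d = 2*pi / mean(delta Q).
    """
    if len(q_minima) < 2:
        raise ValueError("need at least two fringe positions")
    spacings = [b - a for a, b in zip(q_minima, q_minima[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return 2.0 * math.pi / mean_spacing


if __name__ == "__main__":
    # Hypothetical fringe positions before and after enzyme exposure.
    before = [0.020, 0.040, 0.060]
    after = [0.016, 0.032, 0.048]
    print(f"before: {thickness_from_fringes(before):.0f} angstroms")
    print(f"after:  {thickness_from_fringes(after):.0f} angstroms")
```

In this approximation, fringes that move closer together indicate a thicker (expanded) film, while loss of fringe contrast points to a rougher or more diffuse interface.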
One interesting aspect is the ability to control the cellulase
activity with additives, for example, by modulating the
cellulase–substrate binding. Such an effect can depend on the type of
cellulase used. This was demonstrated in the study by Jausovec et
al., who used in situ null ellipsometry to monitor the activity of two
cellulases extracted from Trichoderma viride and Aspergillus niger
[82]. The effect of 3-(trimethoxysilyl)-propyldimethyloctadecyl
ammonium chloride (TMPAC), an antimicrobial agent, on the cellulase
activity was also investigated. Upon addition, the enzyme initially
adsorbs to the substrate (Fig. 4.15a,b), followed by a decrease in
the amount of cellulose on the surface due to enzyme-catalyzed
digestion. We can also note that the changes in cellulose film thickness
do not follow the decrease in mass but suggest swelling of the
cellulose film. Furthermore, the activities of the two enzymes are
quite different, with T. viride cellulase appearing to have a much
higher degradation rate. The degradation rate with this cellulase
also decreased significantly if the cellulose film was pretreated with
the antimicrobial agent, whereas A. niger cellulase action seems to
be affected only to a limited extent (Fig. 4.15c,d).

Figure 4.15 The removal of cellulose from cellulose films by means of
cellulase action as determined by in situ ellipsometry. (Filled symbols)
The normalized film mass, Γ − Γsubst (mg m−2), and (open symbols) the
normalized thickness of the film, dx − dx,subst (nm), as a function of
time (min) after addition of cellulase from a buffer solution
of pH 4.7. (a) T. viride cellulase at 10 mg/L (circles) and 1 mg/L (squares)
and (b) A. niger cellulase at 10 mg/L (circles) and 54 mg/L (squares). When
the film was pre-exposed to the antimicrobial agent 3-(trimethoxysilyl)-
propyldimethyloctadecyl ammonium chloride (TMPAC), the effect of cellulase
activity is reduced, as shown in (c) and (d), which compare pure cellulose
film (circles) with cellulose film treated with TMPAC (squares). (c) The
results as a function of time after addition of 10 mg/L T. viride cellulase
and (d) the effect of adding 54 mg/L A. niger cellulase. Reprinted from
Ref. [82], Copyright (2008), with permission from Elsevier.

4.4 Conclusion

Many types of enzymes catalyze reactions at interfaces, either in
the physiological environment or in technologically important
applications, and a detailed understanding of the surface phenomena
is key to modeling, regulating, and modifying enzymatic activities.
Neutron, X-ray, and light-scattering methods have contributed new
knowledge by elucidating the structural details of the enzyme–substrate
assemblies and their transformations as a consequence
of catalytic activities. In the case of interfacially activated enzymes
such as phospholipases, the fate of the reaction products and the
relationship to enzymatic activities are crucial for understanding
the regulation of activities of these ubiquitous enzymes. For lipases,
the phase transitions related to the physical properties of the
reaction products are a key controlling factor in determining how
the soluble enzyme accesses the substrate–oil interface and how the
morphology of the product–enzyme dispersion is formed. Cellulases
are used in a range of chemical and biochemical applications, from
food technology to biofuel production, and there scattering methods
have helped to relate the observed enzyme activity to the surface-binding
properties and the physical effects on the cellulose substrate
films.

Acknowledgments

The Swedish Research Council (VR), both through regular grants
and the Linnaeus Center of Excellence “Organizing Molecular
Matter,” the Swedish Foundation for Strategic Research (SSF) via
framework grant RMA08-0056, the EU STREP FP6 project BIOSCOPE
(Contract No. NMP4-CT-2003-505211), and NanoLund, the Center
for Nanoscience at Lund University, are gratefully acknowledged for
financial support.

References

1. Six, D. A., and Dennis, E. A. (2000). The expanding superfamily
of phospholipase A(2) enzymes: classification and characterization,
Biochim. Biophys. Acta, 1488(1–2), pp. 1–19.
2. Dijkstra, B. W., Drenth, J., and Kalk, K. H. (1981). Active-site and catalytic
mechanism of phospholipase-A2, Nature, 289(5798), pp. 604–606.
3. Volwerk, J. J., Pieterson, W. A., and de Haas, G. H. (1974). Histidine at
active-site of phospholipase-A2, Biochemistry, 13(7), pp. 1446–1454.
4. Leslie, C. C. (2004). Regulation of the specific release of arachidonic
acid by cytosolic phospholipase A2, Prostaglandins Leukot. Essent. Fatty
Acids, 70(4), pp. 373–376.
5. Nevalainen, T. J., Haapamaki, M. M., and Gronroos, J. M. (2000). Roles
of secretory phospholipases A(2) in inflammatory diseases and trauma,
Biochim. Biophys. Acta, 1488(1–2), pp. 83–90.
6. Arni, R. K., and Ward, R. J. (1996). Phospholipase A(2): a structural
review, Toxicon, 34(8), pp. 827–841.
7. Wacklin, H. P., Tiberg, F., Fragneto, G., and Thomas, R. K. (2007).
Distribution of reaction products in phospholipase A2 hydrolysis,
Biochim. Biophys. Acta, 1768(5), pp. 1036–1049.
8. Hoyrup, P., Mouritsen, O. G., and Jorgensen, K. (2001). Phospholipase
A(2) activity towards vesicles of DPPC and DMPC-DSPC containing small
amounts of SMPC, Biochim. Biophys. Acta, 1515(2), pp. 133–43.
9. Honger, T., Jorgensen, K., Stokes, D., Biltonen, R. L., and Mouritsen, O.
G. (1997). Phospholipase A(2) activity and physical properties of lipid-
bilayer substrates, Methods Enzymol., 286, pp. 168–190.
10. Berg, O. G., and Jain, M. K. (2002). Interfacial Enzyme Kinetics (John Wiley
and Sons, Chichester).
11. Verger, R., and de Haas, G. H. (1973). Enzyme reactions in a membrane
model. 1. New technique to study enzyme reactions in monolayers,
Chem. Phys. Lipids, 10(2), pp. 127–136.
12. Carman, G. M., Deems, R. A., and Dennis, E. A. (1995). Lipid signaling
enzymes and surface dilution kinetics, J. Biol. Chem., 270(32), pp.
18711–18714.
13. Verger, R., Mieras, M. C. E., and de Haas, G. H. (1973). Action of
phospholipase A at interfaces, J. Biol. Chem., 248(11), pp. 4023–4034.
14. Tiberg, F., Harwigsson, I., and Malmsten, M. (2000). Formation of model
lipid bilayers at the silica-water interface by co-adsorption with non-
ionic dodecyl maltoside surfactant, Eur. Biophys. J. Biophys. Lett., 29(3),
pp. 196–203.
15. Vacklin, H., Tiberg, F., Fragneto, G., and Thomas, R. K. (2005).
Phospholipase A2 hydrolysis of supported phospholipid bilayers: a
neutron reflectivity and ellipsometry study, Biochemistry, 44(8), pp.
2811–2821.
16. Inoue, T., Yanagihara, S., Misono, Y., and Suzuki, M. (2001). Effect of fatty
acids on phase behavior of hydrated dipalmitoylphosphatidylcholine
bilayer: saturated versus unsaturated fatty acids, Chem. Phys. Lipids,
109(2), pp. 117–133.
17. Maloney, K. M., Grandbois, M., Grainger, D. W., Salesse, C., Lewis, K.
A., and Roberts, M. F. (1995). Phospholipase A(2) domain formation
in hydrolyzed asymmetric phospholipid monolayers at the air/water
interface, Biochim. Biophys. Acta, 1235(2), pp. 395–405.
18. Panaiotov, I., and Verger, R. (2000). Enzymatic reactions at interfaces:
interfacial and temporal organization of enzymatic lipolysis. In Physical
Chemistry of Biological Interfaces, Baszkin, A., and Norde, W., eds.
(Marcel Dekker, New York), pp. 359–400.
19. Burack, W. R., Gadd, M. E., and Biltonen, R. L. (1995). Modulation
of phospholipase A(2): identification of an inactive membrane-bound
state, Biochemistry, 34(45), pp. 14819–14828.
20. Gadd, M. E., and Biltonen, R. L. (2000). Characterization of the
interaction of phospholipase A(2) with phosphatidylcholine-phosphat-
idylglycerol mixed lipids, Biochemistry, 39(32), pp. 9623–9631.
21. Bell, J. D., and Biltonen, R. L. (1992). Molecular details of the activation of
soluble phospholipase- A(2) on lipid bilayers: comparison of computer-
simulations with experimental results, J. Biol. Chem., 267(16), pp.
11046–11056.
22. Nielsen, L. K., Risbo, J., Callisen, T. H., and Bjornholm, T. (1999). Lag-burst
kinetics in phospholipase A(2) hydrolysis of DPPC bilayers visualized by
atomic force microscopy, Biochim. Biophys. Acta, 1420(1–2), pp. 266–
271.
23. Honger, T., Jorgensen, K., Biltonen, R. L., and Mouritsen, O. G. (1996).
Systematic relationship between phospholipase A(2) activity and
dynamic lipid bilayer microheterogeneity, Biochemistry, 35(28), pp.
9003–9006.

24. Burack, W. R., Yuan, Q., and Biltonen, R. L. (1993). Role of lateral
phase-separation in the modulation of phospholipase-A2 activity,
Biochemistry, 32(2), pp. 583–589.
25. Tatulian, S. A., Biltonen, R. L., and Tamm, L. K. (1997). Structural changes
in a secretory phospholipase A(2) induced by membrane binding: a clue
to interfacial activation?, J. Mol. Biol., 268(5), pp. 809–815.
26. Romero, G., Thompson, K., and Biltonen, R. L. (1987). The activation
of porcine pancreatic phospholipase-A2 by dipalmitoylphosphatidyl-
choline large unilamellar vesicles: analysis of the state of aggregation
of the activated enzyme, J. Biol. Chem., 262(28), pp. 13476–13482.
27. Apitz-Castro, R., Jain, M. K., and de Haas, G. H. (1982). Origin of the
latency phase during the action of phospholipase-A2 on unmodified
phosphatidylcholine vesicles, Biochim. Biophys. Acta, 688(2), pp. 349–
356.
28. Davidsen, J., Mouritsen, O. G., and Jorgensen, K. (2002). Synergistic
permeability enhancing effect of lysophospholipids and fatty acids on
lipid membranes, Biochim. Biophys. Acta, 1564(1), pp. 256–262.
29. Petrache, H. I., Feller, S. E., and Nagle, J. F. (1997). Determination of
component volumes of lipid bilayers from simulations, Biophys. J., 72(5),
pp. 2237–2242.
30. Jain, M. K., and Jahagirdar, D. V. (1985). Action of phospholipase-A2
on bilayers: effect of fatty-acid and lysophospholipid additives on the
kinetic-parameters, Biochim. Biophys. Acta, 814(2), pp. 313–318.
31. Stafford, R. E., Fanni, T., and Dennis, E. A. (1989). Interfacial properties
and critical micelle concentration of lysophospholipids, Biochemistry, 28,
pp. 5113–5120.
32. Cajal, Y., Berg, O., and Jain, M. K. (2004). Origin of delays in monolayer
kinetics: phospholipase A2 paradigm, Biochemistry, 43, pp. 9256–
9264.
33. Wieloch, T., Borgstrom, B., Pieroni, G., Pattus, F., and Verger, R. (1982).
Product activation of pancreatic lipase, J. Biol. Chem., 257(19), pp.
11523–11528.
34. Pan, Y. H., Epstein, T. M., Jain, M. K., and Bahnson, B. J. (2001).
Five coplanar anion binding sites on one face of phospholipase A(2).
Relationship to interface binding, Biochemistry, 40(3), pp. 609–617.
35. Yu, B. Z., Poi, M. J., Ramagopal, U. A., Jain, R., Ramakumar, S., Berg, O.
G., et al. (2000). Structural basis of the anionic interface preference
and k*(cat) activation of pancreatic phospholipase A(2), Biochemistry,
39(40), pp. 12312–12323.
36. Cajal, Y., Alsina, M. A., Berg, O. G., and Jain, M. K. (2000). Product
accumulation during the lag phase as the basis for the activation of
phospholipase A(2) on monolayers, Langmuir, 16(1), pp. 252–257.
37. Verger, R., Mieras, M. C. E., and de Haas, G. H. (1973). Action of
phospholipase A at interfaces, J. Biol. Chem., 248(11), pp. 4023–
4034.
38. Leidy, C., Mouritsen, O. G., Jorgensen, K., and Peters, N. H. (2004).
Evolution of a rippled membrane during phospholipase A(2) hydrolysis
studied by time-resolved AFM, Biophys. J., 87(1), pp. 408–418.
39. Yedgar, S., Lichtenberg, D., and Schnitzer, E. (2000). Inhibition of
phospholipase A(2) as a therapeutic target, Biochim. Biophys. Acta,
1488(1–2), pp. 182–187.
40. Wacklin, H. P. (2009). Interfacial mechanism of phospholipase A(2):
pH-dependent inhibition and Me-beta-cyclodextrin activation, Biochem-
istry, 48(25), pp. 5874–5881.
41. Slotte, J. P., and Illman, S. (1996). Desorption of fatty acids from
monolayers at the air/water interface to cyclodextrin in the subphase,
Langmuir, 12(23), pp. 5664–5668.
42. Alahverdjieva, V., Ivanova, M., Verger, R., and Panaiotov, I. (2005).
A kinetic study of the formation of [beta]-cyclodextrin complexes
with monomolecular films of fatty acids and glycerides spread at the
air/water interface, Colloids Surf., B, 42(1), pp. 9–20.
43. Ivanova, M., Verger, R., and Panaiotov, I. (1997). Mechanisms underlying
the desorption of long-chain lipolytic products by cyclodextrins:
application to lipase kinetics in monolayer, Colloids Surf., B, 10(1), pp.
1–12.
44. Ivanova, M. G., Ivanova, T., Verger, R., and Panaiotov, I. (1996).
Hydrolysis of monomolecular films of long chain phosphatidylcholine
by phospholipase A(2) in the presence of beta-cyclodextrin, Colloids
Surf., B, 6(1), pp. 9–17.
45. Kanicky, J. R., and Shah, O. D. (2003). Effect of premicellar aggregation on
the pKa of fatty acid soap solutions, Langmuir, 19, pp. 2034–2038.
46. Kaiser, B. L., and Kaiser, E. T. (1969). Effect of D2O on the
carboxypeptidase-catalyzed hydrolysis of O-(trans-cinnamoyl)-L-beta-
phenyllactate and N-(N-benzoylglycyl)-L-phenylalanine, Proc. Natl.
Acad. Sci. U S A, 64, pp. 36–41.
47. Scott, D. L., Mandel, A. M., Sigler, P. B., and Honig, B. (1994). The electro-
static basis for the interfacial binding of secretory phospholipases A2,
Biophys. J., 67(2), pp. 493–504.

48. Scott, D. L., White, S. P., Otwinowski, Z., Yuan, W., Gelb, M. H., and Sigler,
P. B. (1990). Interfacial catalysis–the mechanism of phospholipase-A2,
Science, 250(4987), pp. 1541–1546.
49. Yu, L., and Dennis, E. A. (1991). Critical role of a hydrogen bond in
the interaction of phospholipase A2 with transition-state and substrate
analogues, Proc. Natl. Acad. Sci. U S A, 88(20), pp. 9325–9329.
50. Ikeda, K., Sano, S.-I., Teshima, K., and Samejima, Y. (1984). pH depen-
dence of the binding constant of a phospholipase A2 from Agkistrodon
halys blomhoffii venom to micelles of n-hexadecylphosphorylcholine, J.
Biochem., 96(5), pp. 1427–1436.
51. Donne-Op den Kelder, G. M., Hille, J. D. R., Dijkman, R., De Haas, G. H., and
Egmond, M. R. (1981). Binding of porcine pancreatic phospholipase A2
to various micellar substrate analogs. The involvement of histidine-48
and aspartic acid-49 in the binding process, Biochemistry, 20(14), pp.
4074–4078.
52. Schmid, R. D., and Verger, R. (1998). Lipases: interfacial enzymes with
attractive applications, Angew. Chem., Int. Ed., 37, pp. 1608–1633.
53. Verger, R. (1997). “Interfacial activation” of lipases: facts and artifacts,
Trends Biotechnol., 15, pp. 32–38.
54. Patton, J. S., and Carey, M. C. (1979). Watching fat digestion. The
formation of visible product phases by pancreatic lipase is described,
Science, 204, pp. 145–148.
55. Lindström, M., Ljusberg-Wahren, H., Larsson, K., and Borgström, B.
(1981). Aqueous lipid phases of relevance to intestinal fat digestion and
absorption, Lipids, 16, pp. 749–754.
56. Patton, J. S., Vetter, R. D., Hamosh, M., Borgström, B., Lindström, M.,
and Carey, M. C. (1985). The light microscopy of fat digestion, Food
Microstruct., 4, pp. 29–41.
57. Borné, J., Nylander, T., and Khan, A. (2002). Effect of lipase on different
lipid liquid crystalline phases formed by oleic acid based acyl glycerols
in aqueous systems, Langmuir, 18, pp. 8972–8981.
58. Borné, J., Nylander, T., and Khan, A. (2002). Effect of lipase on
monoolein-based cubic phase dispersion (cubosomes) and vesicles, J.
Phys. Chem. B, 106(40), pp. 10492–10500.
59. Caboi, F., Borné, J., Nylander, T., Khan, A., Svendsen, A., and Patkar, S.
(2002). Lipase action on a monoolein/sodium oleate aqueous cubic
liquid crystalline phase: a NMR and X-ray diffraction study, Colloids Surf.,
B, 26, pp. 159–171.
60. Salentinig, S., Sagalowicz, L., and Glatter, O. (2010). Self-assembled
structures and pKa value of oleic acid in systems of biological relevance,
Langmuir, 26, pp. 11670–11679.
61. Salentinig, S., Sagalowicz, L., Leser, M. E., Tedeschi, C., and Glatter, O.
(2011). Transitions in the internal structure of lipid droplets during fat
digestion, Soft Matter, 7, pp. 650–661.
62. Barauskas, J., and Nylander, T. (2008). Lyotropic liquid crystals as
delivery vehicles for food ingredients. In Delivery and Controlled Release
of Bioactives in Foods and Nutraceuticals, Garti, N., ed. (Woodhead,
Cambridge) pp. 107–131.
63. Borné, J., Nylander, T., and Khan, A. (2002). Effect of lipases on
monoolein based cubic phase dispersions (cubosomes) and vesicles, J.
Phys. Chem. B, 106(40), pp. 10492–10500.
64. Salentinig, S. J., Phan, S., Khan, J., Hawley, A., and Boyd, B. J. (2013).
Formation of highly organized nanostructures during the digestion of
milk, ACS Nano, 7, pp. 10904–10911.
65. Wadsäter, M., Barauskas, J., Nylander, T., and Tiberg, F. (2014).
Formation of highly structured cubic micellar lipid nanoparticles of
soy phosphatidylcholine and glycerol dioleate and their degradation by
triacylglycerol lipase, ACS Appl. Mater. Interfaces, 6, pp. 7063–7069.
66. Warren, D. B., Anby, M. U., Hawley, A., and Boyd, B. J. (2011). Real
time evolution of liquid crystalline nanostructure during the digestion
of formulation lipids using synchrotron small-angle X-ray scattering,
Langmuir, 27, pp. 9528–9534.
67. Fong, W. K., Salentinig, S., Prestidge, C. A., Mezzenga, R., Hawley, A.,
and Boyd, B. J. (2014). Generation of geometrically ordered lipid-based
liquid-crystalline nanoparticles using biologically relevant enzymatic
processing, Langmuir, 30, pp. 5373–5377.
68. Reis, P., Holmberg, K., Watzke, H., Leser, M. E., and Miller, R. (2009).
Lipases at interfaces: a review, Adv. Colloid Interface Sci., 147–148, pp.
237–250.
69. Golding, M., and Wooster, T. J. (2010). The influence of emulsion
structure and stability on lipid digestion, Curr. Opin. Colloid Interface Sci.,
15(1–2), pp. 90–101.
70. Brennan, J. L., Kanaras, A. G., Nativo, P., Tshikhudo, T. R., Rees, C.,
Fernandez, L. C., et al. (2010). Enzymatic activity of lipase-nanoparticle
conjugates and the digestion of lipid liquid crystalline assemblies,
Langmuir, 26, pp. 13590–13599.
71. Van Dyk, J. S., and Pletschke, B. I. (2012). A review of lignocellulose
bioconversion using enzymatic hydrolysis and synergistic cooperation
between enzymes–factors affecting enzymes, conversion and synergy,
Biotechnol. Adv., 30, pp. 1458–1480.
72. Zhang, Y.-H. P., and Lynd, L. R. (2006). A functionally based model for
hydrolysis of cellulose by fungal cellulases, Biotechnol. Bioeng., 94, pp.
888–898.
73. Boraston, A. B., Bolam, D. N., Gilbert, H. J., and Davies, G. J. (2006).
Carbohydrate-binding modules: fine tuning polysaccharide recognition,
Biochem. J., 382, pp. 769–781.
74. Hammel, M., Fierobe, H.-P., Czjzek, M., Kurkal, V., Smith, J. C., Bayer, E. A.,
et al. (2005). Structural basis of cellulosome efficiency explored by small
angle X-ray scattering, J. Biol. Chem., 280, pp. 38562–38568.
75. Eriksson, J., Malmsten, M., Tiberg, F., Hønger Callisen, T., Damhus,
T., and Johansen, K. S. (2005). Model cellulose films exposed to H.
insolens glucoside hydrolase family 45 endo-cellulase: the effect of the
carbohydrate-binding module, J. Colloid Interface Sci., 285, pp. 94–99.
76. Eriksson, J., Malmsten, M., Tiberg, F., Hønger Callisen, T., Damhus, T., and
Johansen, K. S. (2005). Enzymatic degradation of model cellulose films,
J. Colloid Interface Sci., 284, pp. 99–106.
77. Maurer, S. A., Bedbrook, C. N., and Radke, C. J. (2012). Cellulase
adsorption and reactivity on a cellulose surface from flow ellipsometry,
Ind. Eng. Chem. Res., 51, pp. 11389–11400.
78. Cheng, G., Datta, S., Liu, Z., Wang, C., Murton, J. K., Brown, P. A., et al.
(2012). Interactions of endoglucanases with amorphous cellulose films
resolved by neutron reflectometry and quartz crystal microbalance with
dissipation monitoring, Langmuir, 28, pp. 8348–8358.
79. Cheng, G., Liu, Z., Murton, J. K., Jablin, M., Dubey, M., Majewski, J., et al.
(2011). Neutron reflectometry and QCM-D study of the interaction of
cellulases with films of amorphous cellulose, Biomacromolecules, 12, pp.
2216–2224.
80. Hu, G., Heitmann Jr., J. A., and Rojas, O. J. (2009). Quantification of
cellulase activity using the quartz crystal microbalance technique, Anal.
Chem., 81, pp. 1872–1880.
81. Josefsson, P., Henriksson, G., and Wågberg, L. (2008). The physical action
of cellulases revealed by a quartz crystal microbalance study using
ultrathin cellulose films and pure cellulases, Biomacromolecules, 9, pp.
249–254.
82. Jausovec, D., Angelescu, D., Voncina, B., Nylander, T., and Lindman, B.
(2008). The antimicrobial reagent role on the degradation of model
cellulose film, J. Colloid Interface Sci., 327, pp. 75–83.

Chapter 5

Folding Dynamics and Structural Basis of
the Enzyme Mechanism of Ubiquitin
C-Terminal Hydrolases

Shang-Te Danny Hsu


Institute of Biological Chemistry, Academia Sinica, 128, Section 2, Academia Road,
Taipei 11529, Taiwan
sthsu@gate.sinica.edu.tw

Ubiquitination is one of the key posttranslational modifications
associated with protein homeostasis, signaling, and trafficking.
Ubiquitination requires a highly coordinated action that involves a
ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme
(E2), and a ubiquitin ligase (E3) to covalently link the C-terminus
of ubiquitin to the lysine side-chain ε-amino group of the substrate
protein to form an isopeptide bond [1–3]. The human genome
encodes two E1s, almost 40 E2s, and up to 1000 E3s [2].
The vast number of E1–E2–E3 combinations provides substrate specificity
for ubiquitination. Upon monoubiquitination, polyubiquitination
can proceed by conjugating additional ubiquitin molecules to one
of the seven lysine residues (K6, K11, K27, K29, K33, K48, and
K63) or to the methionine (M1) at the N-terminus of the
ubiquitin molecule that is conjugated to the substrate protein [4].

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

The K48-linked polyubiquitin chain is the most common form of
polyubiquitination, and it targets substrate proteins to the
ubiquitin–proteasome system (UPS) for degradation as a means to clear
misfolded and dysfunctional proteins in order to maintain protein
homeostasis [5, 6]. A similar function is also conferred by K11-linked
polyubiquitin chains. K63-linked polyubiquitination, on the
other hand, is associated with endocytic trafficking, inflammation,
translation, and lysosome-dependent protein degradation [7–9].
Recently, K33-linked polyubiquitination has
been implicated in protein trafficking as well [10]. Depending
on the type of ubiquitination, polyubiquitin chains can adopt
very different conformations. For instance, K6-, K11-, and K48-linked
polyubiquitin chains are compact, while M1- and K63-linked
polyubiquitin chains are linear and extended [4, 11]. These different
conformations can be recognized by various ubiquitin-interacting
motifs (UIMs) and deubiquitinases (DUBs) [12, 13]. In addition to
protein degradation and trafficking, other forms of ubiquitination
are involved in a myriad of biological functions. For further details,
the readers are referred to a recent review by Komander and Rape
[4].
Like ubiquitination, deubiquitination is a tightly regulated
process [14]. The human genome encodes for at least 98 DUBs [15],
and many of these DUBs are associated with cancer [16]. According
to the sequence and structural similarities, DUBs can be categorized
into six families, namely ubiquitin carboxy-terminal hydrolases
(UCHs), ubiquitin-specific proteases (USPs), ovarian-tumor pro-
teases (OTUs), Machado–Joseph disease protein-domain proteases,
JAMM/MPN domain-associated metallopeptidases (JAMMs), and
monocyte chemotactic protein-induced protein (MCPIP) [15–19].
Among all the DUB families, UCHs were the first to be identified. They
are unique among the DUB families in that a single UCH domain both
binds ubiquitin and cleaves the protein adduct from the C-terminus of
ubiquitin; the other DUBs usually require
a ubiquitin-binding domain to bring the substrates for ubiquitin
removal by another catalytic domain, which is either covalently
linked to the ubiquitin-binding domain or forms a noncovalent
complex with auxiliary components. In this review, the focus will
be on the biochemical and biophysical analyses of UCHs, with an
additional emphasis on the folding and functional dynamics of UCH
variants in the context of human diseases.

5.1 Introduction

UCHs form the peptidase C12 superfamily, which belongs to the
papain-like cysteine proteases. Like all papain-like cysteine proteases,
UCHs utilize conserved cysteine, histidine, and aspartate residues to
hydrolyze the isopeptide bond at the C-terminus of ubiquitin. This
is a means to recycle monoubiquitins or, in the case of proubiquitin,
to digest the C-terminal extensions to yield mature ubiquitins. The
enzyme mechanisms and functionalities have been described in
detail by Reyes-Turcu and Wilkinson in their recent reviews [17, 19].
UCHs are present in most, if not all, eukaryotes, and their primary
sequences exhibit substantial divergence for organisms of different
genomic complexity (Fig. 5.1). While the yeast genome encodes
only one UCH, Yuh1, humans and other higher organisms have
multiple UCH isoforms. The human genome encodes four UCHs,
namely UCH-L1, UCH-L3, UCH-L5 (also known as UCH37), and
BRCA1-associated protein 1 (BAP1). Human UCH-L1 shares 55% sequence
identity with UCH-L3, but it only shares 22% and 21% with UCH-
L5 and BAP1, respectively. In this regard, human UCH-L1 is as
similar to yeast Yuh1 (sequence identity 23%) as to UCH-L5 and
BAP1, suggesting a functional divergence at some point during
evolution (Fig. 5.2) [20]. Indeed, human UCH-L1 and UCH-L3 are
single-domain proteins 223 and 230 residues, respectively, in length.
UCH-L5 is 329 residues long with a C-terminal extension that
forms a coiled-coil helical motif, known as the UCH37 domain,
which is involved in proteasome binding (Section 5.1.4) [21].
BAP1 is the largest human UCH, which contains 729 residues. It
also contains a long C-terminal extension with multiple protein-
interacting motifs [22] (Section 5.1.5). Interestingly, human BAP1
shares 42% sequence identity with the UCH from Arabidopsis
thaliana, UCH1; human UCH-L5 shares 47% sequence identity
with its ortholog in Trichinella spiralis and their functions are
evolutionarily conserved [23]. Despite the sequence diversity, all the
reported UCH crystal structures preserve the same 3D structures.

Figure 5.1 Sequence alignment of the catalytic domains of selected UCHs.
The sequences were selected from the BLAST assembled RefSeq genomes.
The alignment and coloring were prepared by ESPript 3.0. The gene IDs
are shown for each entry, followed by the names of the proteins and species.
The solvent accessibility of individual residues is colored blue, cyan, and
white at the bottom of the sequence alignment on the basis of the crystal
structure of UCH-L1 (PDB ID: 2ETL).

Figure 5.2 Phylogenetic tree of selected members of the UCH family. The
phylogram is generated by ClustalW2 using the selected sequences shown
in Fig. 5.1.

The residues that are involved in ubiquitin binding and catalysis are
also highly conserved (Fig. 5.3).
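
The pairwise sequence identities quoted above (for example, 55% between human UCH-L1 and UCH-L3) are computed from aligned sequences. The following is a minimal sketch of one common convention, in which identical residues are counted over the alignment columns where both sequences carry a residue. The function name, the gap character, and the toy sequences are illustrative assumptions; the chapter does not state which denominator was used, and other conventions (alignment length, or length of the shorter sequence) give slightly different numbers.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Only columns in which both sequences have a residue are compared;
    columns containing a gap in either sequence are skipped (an assumed
    convention, not necessarily the one used in the chapter).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Toy alignment (hypothetical sequences, not real UCH fragments):
# 4 identical residues out of 5 compared columns.
print(percent_identity("MKV-LE", "MRVALE"))
```

With real data, the two input strings would be rows taken from a multiple sequence alignment such as the one summarized in Fig. 5.1.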

5.1.1 UCH-L1
UCH-L1 is one of the most abundant proteins in neuronal cells [24,
25]. It is also known as protein gene product 9.5 (PGP9.5), which has
been identified through 2D polyacrylamide gel electrophoresis (2D-PAGE)
analysis [26]. Its expression level is estimated to account
for 1%–5% of total cytosolic proteins in neuronal tissues, and it
is therefore a widely used biomarker for abnormalities associated with
brain function [27–30]. Its DUB activity was first established using bovine
homologs, namely UCH-L1, L2, L3, and H4, whose names refer to
their lower (L) and higher (H) molecular masses during chromatographic
purification [31, 32]. Although UCH-L1 is the most abundant of
the four bovine homologs, its DUB activity is the lowest. The same
applies to the human orthologs (see discussion below).

5.1.1.1 Genetic association between UCH-L1 and neurodegenerative diseases
Figure 5.3 Structural mapping of sequence conservation of the UCH family.
(Left) The crystal structure of human UCH-L1 in complex with UbVME
(PDB ID: 3KW5) is used as the template. UCH-L1 is shown in surface
representation, except for the crossover loop (residues 148–159), which
is shown in ribbon. UbVME is shown in cartoon representation in green,
with the C-terminus being inside the binding pocket. The degree of
sequence conservation is color-ramped from blue to white for high to
low sequence conservation, respectively. (Right) Sausage representation
of the UCH domain with structural alignment and sequence conservation
rendering. The catalytic residues are shown in ball-and-stick representation
and labeled with residue type and sequence number according to the
sequence of human UCH-L1. The backbone structure is color-ramped from
blue to white to red for high to low degree of sequence conservation.
The radii of the backbone sausage representation correspond to the
root-mean-squared deviation of the structure alignment among all reported UCH
structures (PDB ID: 2ETL, 3IRT, 2LEN, 3IFW, 1XD3, 2WDT, 1CMX, 3RIS,
4I6N, 3A7S, 3TB3, 3IGR, 4IG7, and 3RII). The side-chain atoms of three
missense mutations in human UCH-L1, E7A, S18Y, and I93M, which are
associated with neurodegenerative diseases, are shown in green, yellow,
and red spheres, respectively. The structural alignment and rendering were
generated by ENDscript and PyMOL.

Despite low DUB activity, UCH-L1 is closely associated with
neurodegenerative diseases [33]. The expression level of UCH-L1 was found
to be elevated in the Alzheimer’s disease (AD) hippocampal proteome
[34]. UCH-L1 is also known as PARK5 for its genetic association
with Parkinson’s disease (PD) [35–37]. Immunohistochemistry
staining showed that UCH-L1 coaggregated with α-synuclein in
Lewy bodies, a hallmark of PD [38]. The gracile axonal dystrophy
(gad) mouse has been generated with an intragenic deletion of the
UCH-L1 gene, with typical neurodegeneration phenotypes, including
sensory and motor ataxia, in addition to the dying-back-type axonal
degeneration, and formation of inclusions in nerve terminals [39].
One of the possible physiological functions of UCH-L1 is the stabilization
of monoubiquitin in neurons [40]. In cell cultures, the aggregation
levels of UCH-L1 and its PD-associated variants, namely S18Y and
I93M, are elevated in response to the inhibition of proteasome [41],
suggesting the functional association between UPS and PD. Notably,
the I93M variant of UCH-L1, which is associated with increased risk
of PD, exhibits markedly increased aggregation propensity under
normal cell growth conditions, suggesting that misfolding of UCH-L1
may be one of the underlying factors for PD pathogenesis [41, 42].
Unlike most UCHs, UCH-L1 has been reported to display unexpected
ubiquitin ligase activity that is dependent on the formation of
noncovalent dimers [43]. Liu et al. further suggested that the
protective role of the S18Y variant is associated with its reduced
ability to dimerize and its inhibitory effect against the ligase
activity of the I93M variant in trans.

5.1.1.1.1 I93M
The I93M polymorphism was first identified in a German family
that suffered from familial early-onset PD [44]. The recombinant I93M
variant showed a loss of about half of its ubiquitin hydrolase
activity compared to the wild type (wt), which may be attributed
to the conformational perturbations induced by the I93M mutation,
which is located in close proximity to the catalytic cysteine residue,
C90, in the hydrophobic core (Fig. 5.3) [44]. However, subsequent
surveys have failed to find strong evidence of a genetic association
of the I93M mutation with PD, either in other European
countries [45] or in China [46]. Despite the lack of strong genetic
association with PD, the I93M variant has been subject to a gamut
of detailed investigations. Transgenic mice expressing the I93M
variant showed significant dopaminergic neuronal loss [47]. In
the COS-7 cell line, the I93M variant is more aggregation prone
than wt [41]. The I93M variant exhibited aberrant interactions
with cellular components that are involved in chaperone-mediated
autophagy (CMA), including heat-shock proteins Hsp90 and Hsc70,
as demonstrated by affinity pull-down assays [42]. This is highly
relevant because CMA is a major pathway for the clearance of α-
synuclein aggregates without the involvement of UPS [48]. The
interference of the CMA pathway by the I93M variant may therefore
lead to accumulation of toxic synuclein oligomers, thereby causing
neural toxicity [49]. However, using purified recombinant Hsp90 and
Hsc70, we did not observe direct physical interaction between UCH-L1
variants and these molecular chaperones by solution-state
nuclear magnetic resonance (NMR) spectroscopy, intrinsic
fluorescence spectroscopy, or native PAGE, suggesting the
involvement of additional cellular factors (unpublished data).

5.1.1.1.2 S18Y
The S18Y polymorphism was originally proposed to play a protective
role against early-onset PD [50]. Similar results were reported for
Chinese [51], Japanese [52–54], and German populations [55]. An
international consortium was established, and a similar conclusion
was reached [56]. However, several other studies of
populations in Italy [57] and Australia [58], of European Caucasians in
the U.S. [53], of Han Chinese [59, 60], and more recently of Japanese [61],
in addition to other reports [62–65], found no conclusive connection
between the S18Y polymorphism and PD. The exact role of the S18Y
polymorphism in PD remains to be established.
Furthermore, the S18Y polymorphism has been found to be
weakly associated with Huntington’s disease (HD) [66–68]. Some
reports also suggested that the S18Y polymorphism is protective
against AD [69], although contradicting results have also been
reported recently [70]. Reduction in the mRNA level of UCH-L1 has
been found in an AD mouse model [71]. Interestingly, S18Y has
been suggested to promote cataract [72], which, like PD, AD, and HD,
is a misfolding disease, in this case resulting from β-crystallin
aggregation [73].

5.1.1.1.3 E7A
Recently, an E7A polymorphism has been identified in a Turkish
family that suffers from early onset of progressive neurodegener-
ation that involves impairment of eyesight in childhood, cerebellar
ataxia, and spasticity with upper motor neuron dysfunction [74].
The recombinant E7A variant exhibited significant loss of ubiquitin-
binding capacity, thereby leading to nearly complete loss of DUB
activity. This can be rationalized structurally, as E7 is highly
conserved among all UCHs (Fig. 5.1) and is involved in direct binding
to ubiquitin (Fig. 5.3).

5.1.1.2 UCH-L1 in oncogenesis


In addition to its association with neurodegenerative diseases, UCH-
L1 is also associated with various forms of cancer [75], potentially
through functional perturbations to the UPS pathway [76]. UCH-
L1 was found to promote the development of lymphoma [77, 78]
and prostate cancer [79]. Its expression level was upregulated
in colorectal cancer [80] and breast cancer [81]. Its association
with lung carcinoma has also been proposed [82]. The underlying
molecular mechanism of UCH-L1-associated metastasis may be
linked to its association with adhesion complexes that promote
cell migration and anchorage-independent growth [83]. UCH-L1 can
regulate the activities of a number of cyclin-dependent kinases,
which, in turn, enhances cancer cell proliferation. Importantly, such
a regulatory function of UCH-L1 is independent of its hydrolase
activity [84].

5.1.2 Molecular Insights into the Pathogenesis Associated with UCH-L1
Although its association with neurodegeneration has not yet
been firmly established, UCH-L1 has been subject to extensive
biochemical and biophysical characterizations in the context of
neurodegenerative diseases. Small-angle neutron scattering (SANS)
data suggested that wt UCH-L1 and its I93M and S18Y variants
can dimerize in solution at relatively low protein concentration
(less than 40 μM) [85]. Concentration-dependent dimerization of
UCH-L1 variants was also observed by analytical ultracentrifugation;
this dimerization is associated with its unusual ubiquitin ligase activity
that is responsible for the accumulation of α-synuclein inclusions [43].
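
Concentration-dependent dimerization of this kind is often rationalized with a simple monomer–dimer equilibrium. The sketch below computes the fraction of protein chains found in dimers for a given total concentration, assuming a single dissociation constant Kd = [M]²/[D]. The function name and the Kd value used in the example are hypothetical; the chapter does not report a dissociation constant, and real SANS or ultracentrifugation data may require a more elaborate model.

```python
import math


def dimer_fraction(total_conc: float, kd: float) -> float:
    """Fraction of protein chains present as dimer at equilibrium.

    Assumes a two-state monomer-dimer equilibrium, 2 M <-> D, with
    Kd = [M]^2 / [D] and mass balance Ct = [M] + 2 [D].
    total_conc and kd must be in the same concentration units.
    """
    if total_conc < 0 or kd <= 0:
        raise ValueError("total_conc must be >= 0 and kd must be > 0")
    if total_conc == 0:
        return 0.0
    # Solve 2[M]^2/Kd + [M] - Ct = 0 for the free monomer concentration.
    monomer = kd * (math.sqrt(1.0 + 8.0 * total_conc / kd) - 1.0) / 4.0
    return 1.0 - monomer / total_conc


# Illustrative only: a hypothetical Kd of 100 (in µM), not a measured value.
HYPOTHETICAL_KD_UM = 100.0
for conc_um in (5.0, 20.0, 40.0, 100.0, 400.0):
    frac = dimer_fraction(conc_um, HYPOTHETICAL_KD_UM)
    print(f"{conc_um:6.1f} µM total -> {frac:.2f} of chains in dimers")
```

In this model exactly half of the chains are dimeric when the total concentration equals Kd, which gives a quick way to relate an observed dimer population to an apparent affinity.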
The crystal structure of the apo form of human UCH-L1 (PDB
ID: 2ETL) [86] and those of the I93M and S18Y variants (PDB
ID: 4JKJ and 3IRT, respectively) [87] are essentially identical—
the pair-wise positional root-mean-square deviation (RMSD) of the
backbone Cα atoms is less than 0.3 Å between these variants.
Nonetheless, far-UV circular dichroism (CD) spectroscopy suggested
that the I93M mutation induces an increased amount of β-sheet
content, whereas the S18Y mutation has no discernible effect
on the secondary structure content of UCH-L1 [85]. Using
high-resolution heteronuclear NMR spectroscopy, we found
large chemical shift perturbations around the mutation site of the
I93M variant; in contrast, the size and range of the chemical shift
perturbations in the S18Y variant are relatively small [88], which is
expected for a mutation that lies on the surface of UCH-L1 (Fig.
5.3). To examine the effect of PD-associated mutations on the folding
of UCH-L1, we compared the folding stabilities and kinetics of wt
UCH-L1 and the I93M and S18Y variants [88]. While the structural
perturbation in the S18Y variant is marginal, the I93M variant
is significantly destabilized during equilibrium unfolding by urea,
and its unfolding kinetics are accelerated by approximately one order
of magnitude. In the same study, we employed NMR hydrogen–
deuterium exchange (HDX) experiments under native conditions to
investigate residue-specific folding stabilities of wt and the I93M
variant of UCH-L1. In line with the time-resolved fluorescence
measurements, accelerated HDX rates were observed for the I93M
variant. For the residues that exhibited EX1 HDX behavior, that is,
the HDX process is kinetically controlled by the opening rate of
the corresponding hydrogen bond [89], their HDX rates were also
accelerated by approximately one order of magnitude in the I93M
variant compared to those of wt. Our solution-based biophysical
analysis showed that the PD-associated mutations indeed induced
substantial structural perturbation as well as reduced folding
stability and increased unfolding frequency for the I93M variant.
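
The EX1 behavior referred to above can be made explicit with the standard Linderstrøm-Lang scheme for amide hydrogen exchange. The following is a sketch in conventional notation; the symbols k_op, k_cl, and k_int (opening, closing, and intrinsic chemical exchange rates) are assumptions introduced here for illustration and are not defined in the chapter.

```latex
\[
\mathrm{NH_{closed}}
\;\underset{k_{\mathrm{cl}}}{\overset{k_{\mathrm{op}}}{\rightleftharpoons}}\;
\mathrm{NH_{open}}
\;\xrightarrow{\;k_{\mathrm{int}}\;}\;
\mathrm{ND},
\qquad
k_{\mathrm{obs}} \approx
\frac{k_{\mathrm{op}}\,k_{\mathrm{int}}}{k_{\mathrm{cl}} + k_{\mathrm{int}}}
\quad (k_{\mathrm{op}} \ll k_{\mathrm{cl}}).
\]
\[
\text{EX1 } (k_{\mathrm{int}} \gg k_{\mathrm{cl}}):\quad
k_{\mathrm{obs}} \approx k_{\mathrm{op}};
\qquad
\text{EX2 } (k_{\mathrm{cl}} \gg k_{\mathrm{int}}):\quad
k_{\mathrm{obs}} \approx \frac{k_{\mathrm{op}}}{k_{\mathrm{cl}}}\,k_{\mathrm{int}},
\quad
\Delta G_{\mathrm{op}} = -RT\,\ln\frac{k_{\mathrm{op}}}{k_{\mathrm{cl}}}.
\]
```

Under this scheme, a roughly tenfold increase in the observed exchange rate of an EX1 residue corresponds directly to a roughly tenfold increase in its opening rate, which is how accelerated HDX in the I93M variant can be read as more frequent local or global unfolding.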
Additionally, UCH-L1 is susceptible to oxidative modification
[90]. Wada and coworkers have shown that I93M shares the same
aberrant interactions as the posttranslationally modified form of
UCH-L1 by 4-hydroxy-2-nonenal (HNE), which is also present in
Lewy bodies [91]. Posttranslational modification of UCH-L1 may
therefore contribute to the etiology of neurodegeneration in
sporadic PD patients [91]. Indeed, HNE modification leads to
an increased population of a partially unfolded form (PUF), which is
aggregation-prone as evidenced by the poorly resolved NMR ¹⁵N–¹H
HSQC spectra [92]. Given its small size, HNE is able to access
and modify the catalytic C90, which has limited solvent-accessible
surface area compared to the highly reactive C152 on the crossover
loop. Once C90 or C152 is covalently modified by HNE, the modification
would trigger subsequent unfolding events in UCH-L1, accompanied by a
cascade of HNE modifications that eventually lead to aggregation
[92]. Likewise, a dopamine derivative was found to covalently modify
C152 and lead to aggregation of UCH-L1 [93]. 1,2-Naphthoquinone,
which is a reactive metabolite, also targets C152 in a
glutathione-mediated manner [94]. Prostaglandin (PG) and its derivative,
Δ12-prostaglandin J2, have been shown to inhibit the activity of UCH-L1
and UCH-L3 [95]. It was later established by NMR spectroscopy and
mass spectrometry that covalent modification of C152 of UCH-L1
by 15-deoxy-Δ12,14-prostaglandin J2 (15d-PGJ2) can lead to partial
unfolding and aggregation of the mouse homologs of wt and I93M
UCH-L1 [96].

5.1.3 UCH-L3
UCH-L3 was one of the first identified UCHs [24], and its crystal
structure was also the first to be determined [97]. Together with
UCH-L1, UCH-L3 served as the model system for the original
characterization of substrate recognition and catalysis mechanism
[98]. UCH-L3 knockout mice showed no obvious phenotype,
suggesting that the function of UCH-L3 overlaps with that of other DUBs
[99]. However, UCH-L1 and UCH-L3 double-knockout mice exhibited
neurodegeneration, posterior paralysis, and dysphagia [100]. With
regard to substrate specificity, UCH-L3 but not UCH-L1 binds
to diubiquitin; nevertheless, K48-linked diubiquitin specifically
inhibits UCH-L3 but not UCH-L1 [101]. UCH-L3, as well as UCH-L1,
binds to and cleaves the C-terminal extension of mutant ubiquitin
(UBB+1), which is implicated in tauopathies and polyglutamine
diseases [102]. In addition to the canonical ubiquitin C-terminal
hydrolysis activity, UCH-L3 also cleaves the C-terminus of a
ubiquitin-like protein, NEDD8 [103]. Although UCH-L3 has not been
linked to human diseases, its high DUB activity, relative to other
UCHs, has made it a model system for the development of screening
assays and UCH inhibitors [104–107].

5.1.4 UCH-L5
UCH-L5 is also called UCH37 because its molecular weight is
approximately 37 kDa. I shall hereafter refer to UCH-L5 as UCH37
to keep the nomenclature consistent with the literature. In
addition to the UCH domain at the N-terminal part, UCH37 has
a C-terminal extension that contains several coiled-coil segments.
Through direct binding to adhesion-regulating molecule 1
(Adrm1), the ortholog of yeast Rpn13, UCH37 forms a complex with the
proteasome [21]. Complexation with the 19S proteasome complex
is required for the efficient disassembly of polyubiquitin chain
by UCH37 [108]. Additionally, UCH37 can translocate into the
nucleus and interact with the human Ino80 chromatin-remodeling
complex (hINO80) [109]. Interplay between UCH37, hINO80, and
proteasome was proposed to regulate DNA transcription and repair.
While the DUB activity of proteasome-associated UCH37 can reduce
proteasome-dependent protein degradation, a recent finding has
suggested that the binding of loosely folded proteins to proteasome-
bound UCH37 can activate adenosine triphosphate (ATP) hydrolysis
and initiate their own degradation [110]. Therefore, polyubiquitin
trimming by proteasome-associated UCH37 can either promote or
inhibit protein degradation by proteasome. Its regulation requires a
coordinated action with Rpn11 (another 19S proteasome-associated
protein) and other factors for proteolysis substrate unfolding and
translocation [111].
Recently, Liu and coworkers have used solution-state NMR spectroscopy
to determine the structure of the C-terminal domain of Rpn13. Combined
with molecular modeling and small-angle X-ray scattering (SAXS),
they proposed a model of UCH37 in complex with the C-terminal
domain of Rpn13 [112]. Additionally, they generated a number
of C-terminally truncated constructs of UCH37 to demonstrate that
the C-terminal extension is responsible for oligomerization and
autoinhibition. Upon binding to Rpn13, full-length UCH37 is
more active than UCH-L3 in terms of ubiquitin hydrolysis.
The expression level of UCH37 is upregulated in esophageal
squamous cell carcinoma (ESCC) patients [113]. Through glu-
tathione S-transferase (GST) pull-down assays, UCH37 was found to
bind weakly to Smad2 and Smad3 but strongly to Smad7, which is
associated with Smurf ubiquitin ligase, suggesting its functional role
in ubiquitination regulation [114]. The same study also showed that
UCH37 can deubiquitinate and stabilize type I transforming growth
factor (TGF)-β receptor, which is implicated in oncogenesis [114].
UCH37 knockdown can downregulate the expression of several
genes along the TGF-β signaling pathway [115].
Compared to UCH-L1 and UCH-L3, UCH37 has the longest
crossover loop, that is, residues 150–160 (Fig. 5.1). The length of
the crossover loop is responsible for substrate recognition [116].
When the long crossover loop of UCH37 is grafted onto UCH-L1, the
chimeric UCH can hydrolyze K48-linked diubiquitin, which is
not possible for wt UCH-L1.

5.1.5 BAP1
BAP1 is a BRCA1 (breast/ovarian cancer susceptibility gene
product) binding protein, which was identified through a yeast two-
hybrid assay [22]. It is a large multidomain protein of 729 amino
acids, encompassing the UCH domain at the N-terminus, followed by
four functional domains, namely (i) BRCA-associated RING domain
protein 1 (BARD1) binding domain, (ii) host cell factor 1 (HCF-1)
binding domain, (iii) transcription factor Yin Yang 1 (YY1) binding
domain, and (iv) nuclear localization sequence at the C-terminus
that targets BAP1 into the nucleus [117]. Mutations in BAP1 have
been reported to be associated with oncogenesis. For further details,
the readers are referred to a recent review by Carbone et al. [117].
In the context of DUB activity of the UCH domain of BAP1,
Harbour et al. first reported a number of frequent mutations
in BAP1 that are associated with uveal melanomas, including
two nonsense truncating mutations, Q36X and W196X, and four
missense mutations, C91G, G128R, H169Q, and S172R, in the UCH
catalytic domain [118]. In particular, mutations at the catalytic
residues, namely C91G and H169Q, would abolish the hydrolase
activity of BAP1. Furthermore, the G128R and S172R mutations are
also located in close proximity to the catalytic site. These results
strongly suggest that the DUB activity of BAP1 is closely associated
with the oncogenesis of metastatic uveal melanomas [119].

Deletion of mouse BAP1 is embryonic lethal as it is involved
in a network of interactions with many transcription factors
[120]. Biochemical data have shown that mammalian BAP1 is
assembled into a megadalton–multiprotein complex that contains
several transcription factors and cofactors, including HCF-1, YY1,
forkhead transcription factors FOXK1 and FOXK2, the histone
acetyltransferase HAT1, and the human homologs of additional sex
combs ASXL1 and ASXL2, suggesting a critical role of BAP1 in gene
regulation [121]. While the assembly of the multiprotein complex
is independent of the DUB activity of BAP1, BAP1 depletion or
the C91S mutation, which abrogates the DUB activity of BAP1,
leads to significant changes in the expression patterns of numerous
genes, many of which are associated with cell cycle progression, DNA
replication, recombination repair, and cell death [121]. Recently,
the Drosophila ortholog of BAP1, Calypso, has been shown to form
a complex with Additional sex combs (ASX), yielding the Polycomb
repressive DUB (PR-DUB) that is essential for developmental gene repression
[122]. The substrate selectivity is also present for the in vitro
reconstituted Drosophila PR-DUB and human PR-DUB. Intriguingly,
the DUB activity of BAP1 and Calypso is only activated upon the
formation of PR-DUB, that is, binding to ASX is essential for the DUB
activity.

5.2 UCH Structures

The crystal structure of free human UCH-L3 was the first structure
determined for the UCH family (PDB ID: 1UCH) [97]. It contains a
six-stranded β-sheet,
surrounded by seven α-helices. Residues 147–165 that correspond
to the crossover loop and the preceding α-helix 6 were missing in
this particular structure due to conformational flexibility. Through
structural homology search using the program DALI [123], UCH-
L3 was found to be structurally similar to another papain-like
protease, cathepsin B. However, although the catalytic residues, as
well as the secondary structural elements of UCH-L3 and cathepsin
B, are spatially aligned, the order in which individual secondary
structural elements are arranged differs quite significantly. Shortly
after the structure of yeast Yuh1 in complex with the inhibitor
ubiquitin aldehyde (Ubal) was reported (PDB ID: 1CMX) [124], the
structure of human UCH-L3 in complex with the suicide inhibitor ubiquitin
vinylmethylester (UbVME; PDB ID: 1XD3) was also determined.
This structure revealed substantial conformational ordering of the
crossover loop and α-helix 6 that are involved in ubiquitin binding
[125]. It was therefore proposed that the crossover loop plays an
important role in substrate recognition in UCHs (Section 5.4).
The past few years saw rapid growth in the wealth of structural
understanding of UCHs. The crystal structure of human UCH-L1 in
its apo form was first reported (PDB ID: 2ETL) [86] followed by
that of UCH-L1 in complex with UbVME (PDB ID: 3KW5), along with
the structures of the PD-associated I93M and S18Y variants, both in
their apo forms and UbVME-bound forms (PDB ID: 4JKJ, 3IRT, 3IFW,
and 3KVF) [87]. Unlike in UCH-L3, the crossover loop and α-helix 6
of UCH-L1 are clearly resolved in the apo form. Ubiquitin binding
only resulted in relatively small conformational rearrangements in
the loop structure without significant perturbation in the helical
conformation. This is confirmed by NMR chemical shift-derived
secondary structure contents of UCH-L1 and UCH-L3 [126, 127].
Notably, however, ubiquitin binding induces a cascade of side-chain
rearrangements to align the catalytic residues into the productive
configuration (Fig. 5.4). This is achieved by a concerted inward
motion of F214, which is in direct contact with ubiquitin, and F53,
which pushes the imidazole ring of H161 into close proximity to C90.
Indeed, localized conformational rearrangements are also observed
by NMR chemical shift perturbations for human UCH-L1 upon
binding to ubiquitin (unpublished data). It is worth mentioning that
although F214 of UCH-L1 is highly conserved among UCHs (Fig. 5.1),
such concerted side-chain motions are only seen in human UCH-L1.
The catalytic side chains of UCH-L3 and UCH-L5 are properly aligned
in a productive configuration that is poised to carry out ubiquitin
hydrolysis.
In the case of UCH37, the structure of the catalytic domain alone
(PDB ID: 3A7S; residues 1–228) [128], that of the catalytic domain
with a short C-terminal extension (PDB ID: 3RII and 3RIS; residues
1–240) [129], and that of full-length UCH37 (PDB ID: 3IHR) [130]
have been reported. In particular, the full-length UCH37 forms a
tetramer in the crystal structure with the C-terminal helices of one
monomer in contact with the catalytic site of the other monomer.
Size-exclusion chromatography showed that the oligomeric state of
UCH37 is concentration dependent [130]. On the basis of the crystal
structure of the oligomeric UCH37, the authors proposed that the
C-terminal extension functions as a switch for autoinhibition, which
can be switched off upon binding to the proteasome-associated
Rpn13.

Figure 5.4 Structural alignment of catalytic residues of UCHs. The side
chains of the conserved catalytic residues (C90, H161, and D176 in UCH-L1)
as well as the substrate-stabilizing glutamine residue (Q84 in UCH-L1)
are shown in sticks. The VME group at the C-terminus of ubiquitin
is shown in ball-and-stick representation. The separations between Oδ2 of
D176 and Nε2 of H161, between Nδ1 of H161 and Sγ of C90, and between
Nε2 of Q84 and the thiohemiacetal hydroxyl oxygen on the aldehyde group
of vinylmethyl ester (VME) are shown as yellow dashed lines with the
distances indicated. The crystal structures of UCHs correspond to those that
are used in Fig. 5.3. To highlight the conformational changes of the catalytic
residues in UCH-L1 upon ubiquitin binding, the side chains of apo- and
ubiquitin-VME-bound UCH-L1 are colored in green and indigo with larger
radii of the sticks, compared to those of the other UCHs. Note that significant
conformational changes are also observed in the side chains of the crossover
loop-truncation variant of apo-UCH37 [116], shown in magenta.

Recently, the crystal structure of the catalytic domain of UCH37
of T. spiralis, a parasitic nematode, in complex with ubiquitin has
been reported (PDB ID: 4IG7) [131], as have those of Plasmodium
falciparum UCH-L3 in complex with a suicide inhibitor (PDB ID:
2WE6 and 2WDT) [132]. The structures of these parasitic UCHs
may provide hints for therapeutic developments that target such
an evolutionarily conserved UCH–ubiquitin interaction [23]. In
view of improving UCH-oriented structure-based drug design, we
performed structural mapping of the sequence conservation of the
UCHs from model eukaryotic organisms, ranging from yeast to
humans. While the sequence identities among UCH-L1 orthologs from
different organisms are high, the four UCHs in humans share only marginal
sequence homology. The sequence of yeast UCH and that of P.
falciparum are evolutionarily divergent from the other UCHs (Figs.
5.1 and 5.2). Overall, the surface residues are generally less
conserved than those in the hydrophobic core, including the central
β-sheet and α-helix 3. Nevertheless, those that are located at the
ubiquitin-binding interface are highly conserved (Fig. 5.3).

5.3 Folding Dynamics and Kinetics

As mentioned earlier, the 3D structures of UCHs closely resemble
those of other papain-like cysteine proteases, for example, cathepsin
B [97] (Section 5.2). However, what was not realized until recently
is that UCHs are topologically knotted with five crossings in their
backbone structures to form an intricate 5₂ Gordian knot [133, 134].
While it has become evident that topological knots are prevalent in
the topological space of protein structures, how and why proteins
knot themselves remain open questions [135–137].
Despite the intricate knotted elements, UCH-L1 and UCH-L3 are
not particularly resistant to chemical or thermal unfolding:
both proteins begin to unfold in the presence of ca. 3 M urea [88,
138] or at around 50 °C [98]. Notably, however, both proteins exhibit
folding intermediates along their folding pathways. In the case of
UCH-L1, there is strong evidence to suggest that the urea-induced
intermediate is oligomeric and resembles the intermediate
that accumulates under native conditions (unpublished data).

Stopped-flow fluorescence analysis showed that UCH-L3 exhibits
multiple distinct kinetic phases during urea-induced unfolding
and refolding, which correspond to two parallel folding pathways,
along which a distinct hyperfluorescent folding intermediate is
transiently populated [138]. Multiple unfolding kinetic phases were
also observed for UCH-L1, but their rates were significantly
slower than those of UCH-L3 [88].
Theoretical simulations have suggested that the tying mechanism
of knotted protein structures involves the formation of a
slipknot intermediate [139]. There is a certain kinetic barrier
associated with the threading of the polypeptide chain through the
crossing loop. Our NMR HDX analysis of UCH-L1 has identified
some degree of protection for the N-terminus of UCH-L1, which
is part of the crossing elements together with α-helices 5 and 6
[88]. A recent study has reported that a UCH-L1 variant lacking the
first 12 residues at the N-terminus (NT-UCH-L1) readily aggregates
and is degraded in cell cultures. HDX coupled with mass
spectrometry revealed that NT-UCH-L1 is less stable than wt UCH-L1
[140]. Presumably, the knotted element at the N-terminal part of
UCH-L1 plays an important role in providing folding stability as well
as substrate-binding affinity.

5.4 Substrate Recognition

To date, all the crystal structures of ubiquitin-bound UCHs are
covalent complexes [86, 87, 124, 125, 131]. It is unclear why crystals
of noncovalent complexes could not be obtained for structure
determination, given that UCH-L1 and UCH-L3 bind to ubiquitin
quite tightly (the corresponding dissociation constants are tens
to hundreds of nM) and dissociate slowly [98, 141, 142].
Solution-state NMR spectroscopy has been employed
to explore the binding interfaces between UCHs and ubiquitin.
Significant line broadening was observed in yeast Yuh1 during
ubiquitin titration, suggesting that binding is in intermediate
exchange on the microsecond-to-millisecond timescale [143].
Similar phenomena were also observed during ubiquitin titration
into human UCH-L3 and the catalytic domain of UCH37 (unpublished
data). NMR was also used to map the UCH-L3 binding site on
ubiquitin through chemical shift perturbation mapping [144, 145],
and the results are in good agreement with X-ray crystallography
[125].
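To put the quoted affinities in perspective, the following minimal sketch computes the fraction of a UCH that would be bound to ubiquitin at equilibrium. It assumes simple 1:1 binding; the function name and the example concentrations (100 μM of each protein) are illustrative assumptions and are not taken from the cited studies.

```python
import math


def fraction_bound(enzyme_total_um, ligand_total_um, kd_um):
    """Fraction of enzyme in complex for 1:1 binding.

    Solves the exact quadratic for the complex concentration, so it stays
    valid when the protein concentrations are far above the Kd.
    """
    b = enzyme_total_um + ligand_total_um + kd_um
    complex_um = (b - math.sqrt(b * b - 4.0 * enzyme_total_um * ligand_total_um)) / 2.0
    return complex_um / enzyme_total_um


# Hypothetical equimolar mixture (100 uM each) over the quoted Kd range.
for kd_nm in (10, 100, 500):
    bound = fraction_bound(100.0, 100.0, kd_nm / 1000.0)
    print(f"Kd = {kd_nm} nM: {100.0 * bound:.1f}% of the enzyme is bound")
```

Under these assumptions the complex is almost fully populated across the whole Kd range, which is why the lack of noncovalent crystal structures cannot be attributed to weak binding alone.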
Using a number of naturally occurring proubiquitin constructs
in addition to designed constructs that contain single or multiple
amino acids at the C-terminus of ubiquitin, Wilkinson and coworkers
demonstrated that UCH-L1 and UCH-L3 display no preference for
the residue type, except for proline, at the P1 position, that is,
the residue immediately attached to the C-terminus of ubiquitin
[146]. Both UCHs can process proubiquitins with long C-terminal
extensions, for example, Ub-CEP52, which is processed into the yeast
ribosomal protein L40 upon UCH hydrolysis, as long as they do not
fold into a globular structure; for Ub-CEP52, hydrolysis by
UCH-L1 is about four orders of magnitude slower than hydrolysis by
UCH-L3.
The ability to hydrolyze folded protein substrates is associated
with the length of the crossover loop of UCHs. As mentioned
above, grafting the long crossover loop of UCH37, which is six
residues longer than that of UCH-L1, onto UCH-L1 yields an
engineered enzyme that is able to hydrolyze K48-linked diubiquitin,
which is otherwise not hydrolyzable by UCH-L1 and UCH-L3 [116].
Structurally speaking, even with the additional six residues, it is
unlikely that one folded ubiquitin can thread through the crossover
loop to expose the isopeptide bond to the catalytic site (Fig. 5.3).
In comparison with the loop-out model that was put forward by
Misaghi et al. [125], it is likely that the longer crossover loop,
together with the preceding α-helix 6, undergoes sizeable
conformational rearrangements to create a much larger crevice to
accommodate a K48-linked diubiquitin at the catalytic site. With an
even longer crossover loop (four residues longer than that of
UCH37), BAP1 may target even larger substrates, such as
polyubiquitin chains or multiprotein complexes (Fig. 5.1). Note that
the same difference in crossover loop length is also present in the
genome of a distantly related parasite, Schistosoma mansoni, which
diverged very early in evolution [147], suggesting that the
loop-length-dependent substrate specificity of UCHs may be a highly
conserved characteristic in eukaryotic systems.

5.5 Enzyme Mechanism

To facilitate mechanistic analysis of the enzyme activity of UCHs
and development of potent UCH inhibitors, ubiquitin C-terminal
7-amido-4-methylcoumarin (UbAMC) has been developed as a
fluorogenic substrate for UCHs [141]. UbAMC’s advantage is that the
UCH enzyme activity can be monitored in real time by following
the emergence of AMC fluorescence (λex = 340 nm, λem = 425
nm) upon release from ubiquitin. The reaction begins with
nucleophilic attack of the catalytic cysteine on the scissile bond,
with the conserved histidine side chain acting as the general base;
this forms a thioester intermediate and releases the first product
(in this case, the AMC moiety). Hydrolysis of the thioester bond
then releases ubiquitin and regenerates the free enzyme. Detailed
mechanistic insights have been delineated
for UCH-L3 [141] and UCH-L1 [142]. While the catalytic triad, C90,
H161, and D176 in UCH-L1, is essential for the enzyme activity, Das
and coworkers also showed that a conserved glutamine residue in
the proximity of the active site, Q84 in UCH-L1, plays an important
role in stabilizing the transition state of the UCH–UbAMC complex
[148]. Indeed, most of the crystal structures of apo- and
ubiquitin-bound UCHs showed properly aligned side-chain
configurations, except for the side chain of H161 of apo-UCH-L1
(Fig. 5.4). This may explain why UCH-L1 displays the slowest
turnover rate, kcat, among all UCHs that have been characterized
thus far despite its relatively high binding affinity toward
ubiquitin.
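The acyl-enzyme route described above can be summarized as a minimal three-step kinetic scheme. The rate constants k_1, k_{-1}, k_2 (acylation), and k_3 (deacylation) are notation introduced here for illustration only and do not come from the cited studies; the expressions are the standard steady-state results for this textbook scheme, not a fit to UCH data.

```latex
\begin{align*}
\mathrm{E} + \mathrm{UbAMC}
  \;\underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}}\;
  \mathrm{E{\cdot}UbAMC}
  &\xrightarrow{\;k_2\;} \mathrm{E{-}Ub} + \mathrm{AMC},
  \qquad
  \mathrm{E{-}Ub} \xrightarrow[\mathrm{H_2O}]{\;k_3\;} \mathrm{E} + \mathrm{Ub} \\[6pt]
k_{\mathrm{cat}} &= \frac{k_2 k_3}{k_2 + k_3}, \qquad
K_{\mathrm{M}} = \frac{k_{-1} + k_2}{k_1}\cdot\frac{k_3}{k_2 + k_3}, \qquad
\frac{k_{\mathrm{cat}}}{K_{\mathrm{M}}} = \frac{k_1 k_2}{k_{-1} + k_2}
\end{align*}
```

Because AMC is released in the acylation step, the steady-state fluorescence increase reports kcat, which in this scheme is limited by the slower of k_2 and k_3.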
Comparison of the reported enzyme kinetics parameters of UCHs
in the literature reveals a startling finding. The turnover rates
of UbAMC hydrolysis by UCH-L1 in different studies can vary by
as much as 17-fold, while the overall kcat/KM values can vary
by fivefold (Table 5.1). A recent study has reported that reactive
oxygen species can reversibly inactivate a variety of DUBs [151].
Preincubation with reducing agents, such as dithiothreitol (DTT),
can markedly reactivate UCH-L1, UCH-L3, and UCH37. Indeed,
preincubation of UCHs with DTT is a common practice that is
reported in the literature. However, our laboratory also observed
the formation of noncovalent oligomers of UCH-L1 variants when
the stock solutions were stored at 4 °C or −80 °C over an extended
period of time (unpublished data). Quantitative analysis of the
enzyme kinetics parameters of UCHs, at least for those that exhibit
significant aggregation propensities, would therefore require a
stringent sample preparation procedure to ensure that the samples
are free of inactive, oligomeric populations, in order to minimize
undesirable artifacts.

Table 5.1 Enzyme kinetics parameters of UCH variants using UbAMC as
substrate

Protein          Mutation  kcat (s−1)  KM (μM)  kcat/KM (×10⁶ M−1·s−1)  Reference
UCH-L1                     0.010       0.034    0.31                    [142]
                           0.02        0.040    0.50                    [149]
                           0.035       0.047    0.74                    [148]
                           0.174       0.12     1.45                    [150]
                 Q84A      0.0011      0.056    0.02                    [148]
                 I93M      0.079       0.11     0.72                    [150]
                 S18Y      0.196       0.136    1.44                    [150]
UCH-L3                     8.1         0.039    207.7                   [141]
                           18.6        0.077    241.6                   [148]
                           9.1         0.05     182.0                   [105]
                 Q89A      1.03        0.099    10.40                   [148]
UCH-L5(1–240)              33.67       21.5     1.57                    [148]
                 Q82A      –           –        0.01                    [148]
Yeast Yuh1                 4.49        0.02     224.5                   [124]
TsUCH37(1–226)             0.37        1.09     0.34                    [131]

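The spread quoted for UCH-L1 can be checked directly against the wild-type rows of Table 5.1. The sketch below is only an illustration: the variable and function names are hypothetical, and the numbers are copied from the table as printed.

```python
# Wild-type UCH-L1 rows of Table 5.1:
# (kcat in s^-1, KM in uM, reported kcat/KM in 1e6 M^-1 s^-1, reference)
UCH_L1_WT = [
    (0.010, 0.034, 0.31, "[142]"),
    (0.02, 0.040, 0.50, "[149]"),
    (0.035, 0.047, 0.74, "[148]"),
    (0.174, 0.12, 1.45, "[150]"),
]


def fold_range(values):
    """Ratio of the largest to the smallest value in a sequence."""
    values = list(values)
    return max(values) / min(values)


def efficiency_1e6(kcat_per_s, km_um):
    """kcat/KM in units of 1e6 M^-1 s^-1 (s^-1 divided by uM gives that unit)."""
    return kcat_per_s / km_um


kcat_fold = fold_range(row[0] for row in UCH_L1_WT)
efficiency_fold = fold_range(row[2] for row in UCH_L1_WT)

print(f"kcat spread across studies: {kcat_fold:.1f}-fold")
print(f"kcat/KM spread across studies: {efficiency_fold:.1f}-fold")
for kcat, km, reported, ref in UCH_L1_WT:
    recomputed = efficiency_1e6(kcat, km)
    print(f"{ref}: recomputed kcat/KM = {recomputed:.2f}, reported = {reported:.2f}")
```

The two ratios correspond to the 17-fold and roughly fivefold variations stated in the text. Recomputing kcat/KM from the tabulated kcat and KM also shows how closely each reported efficiency follows from its own row.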
The use of UbAMC has enabled high-throughput fragment-based
screens for UCH inhibitors. Liu et al. have identified a number of
oxime isatin derivatives that can inhibit UCH-L1 and UCH-L3 with
IC50 values down to 0.8 μM. The best compound is up to 28-fold
selective for UCH-L1 over UCH-L3, as judged by the respective IC50
values. Importantly, treatment with these UCH-L1-specific inhibitors
suppresses the proliferation of the H1299 lung cancer
cell line, demonstrating the significance of developing UCH-specific
inhibitors as potential therapeutics [106]. Through structure–activity
relationship studies, a 7H-thieno[2,3-b]pyridin-6-one derivative has
been developed to selectively inhibit UCH-L1 with a Ki,app value
of 2.8 μM with no appreciable inhibitory effects on UCH-L3 and
other papain-like proteases at a concentration of 20 mM [152].
An in silico screen followed by UbAMC assays also identified several
1,5-dihydro-2H-pyrrol-2-one derivatives with moderate inhibitory
effects against UCH-L3 (IC50 100–150 μM) [107].

Table 5.2 Substrate specificity of UCH variants

Protein  Substrate         kcat (s−1)  KM (μM)  kcat/KM (×10⁶ M−1·s−1)  Reference
UCH-L1   UbAMC             0.02        0.04     0.500                   [149]
         UbW               0.03        0.13     0.231
         UbWA              0.0005      0.45     0.001
         UbAW              0.0001      0.08     0.001
UCH-L3   UbAMC             5.9         0.02     295.0
         UbW               2.6         0.2      13.0
         UbWA              2.4         0.78     3.1
         UbAW              1.5         0.16     9.4
         UbAMC             9.1         0.05     182.0                   [105]
         Ub-LysTARMA       4.5         0.86     5.2
         Ub-Gly-LysTARMA   27          0.07     385.7
         Ub-p53(384-389)   0.92        3.8      0.24

Despite all the aforementioned advantages of UbAMC and its
common usage in the literature, it is not a true physiological
substrate for UCHs. For instance, the Michaelis–Menten enzyme
kinetics parameters of UCH-L3 using UbAMC as a substrate are
distinctly different from those that are based on a C-terminal
fusion of natural amino acids (Ub-X, X = W, AW, or WA); the latter
showed markedly reduced substrate turnover rates and binding
affinities. In particular, the KM values are 10–40 times higher for
the tryptophan-containing Ub-fusion proteins than for
UbAMC (Table 5.2), suggesting binding interference by the
bulky side chain of the substrate [149]. Studying the enzyme
mechanism of UCHs using natural substrates is challenging, not least
because of the difficulty of obtaining homogeneous isopeptide-bonded
substrates in large quantities. Recently, Brik and coworkers have
developed a repertoire of synthetic methods to ubiquitinate peptide
substrates through chemical ligation [104] (see review by Spasser
and Brik [153]). Such an approach also enabled the introduction
of fluorescent probes onto the ubiquitinated substrate for
high-throughput screening of UCH inhibitors, for example,
Ub-p53(384-389), a ubiquitinated peptide derived from the tumor
suppressor p53 [105]. The ability to introduce polyubiquitination
to specific recombinant proteins has also paved the way toward
mechanistic characterization of the true physiological substrates of
not only UCHs but also other DUBs [154].
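The comparison between UbAMC and the tryptophan-containing fusions can be made explicit from the first UCH-L3 block of Table 5.2. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the values are copied from the table as printed.

```python
# UCH-L3 rows of Table 5.2 measured alongside UbAMC:
# substrate -> (kcat in s^-1, KM in uM)
UCH_L3 = {
    "UbAMC": (5.9, 0.02),
    "UbW": (2.6, 0.2),
    "UbWA": (2.4, 0.78),
    "UbAW": (1.5, 0.16),
}


def relative_to_ubamc(table):
    """Return {substrate: (KM fold increase, kcat fold decrease)} versus UbAMC."""
    ref_kcat, ref_km = table["UbAMC"]
    return {
        name: (km / ref_km, ref_kcat / kcat)
        for name, (kcat, km) in table.items()
        if name != "UbAMC"
    }


changes = relative_to_ubamc(UCH_L3)
for name, (km_fold, kcat_fold) in changes.items():
    print(f"{name}: KM {km_fold:.0f}x higher, kcat {kcat_fold:.1f}x lower than UbAMC")
```

With the tabulated values, the KM increases come out at roughly 8- to 39-fold, in line with the range quoted in the text, whereas the turnover rates drop by only about two- to fourfold.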

5.6 Conclusion

While the link between oncogenesis and BAP1 has been firmly
established in the literature [117], emerging evidence has suggested
important functional roles of UCH-L1 and UCH37 in various other
forms of cancer, for their DUB activities are directly or indirectly
associated with the UPS pathway, which is key to oncogenesis [15,
16]. Indeed, UCH-L1-specific inhibitors have been developed to
suppress lung cancer cell proliferation [106]. Despite being one of
the most abundant neuronal proteins, however, the physiological
substrates of UCH-L1 remain elusive. Nevertheless, structural
comparison of the apo- and ubiquitin-bound UCH-L3 demonstrated
that the folding dynamics of the crossover loop are closely associated
with substrate binding [125]. The length of the crossover loop also
determines substrate specificity of different UCHs [116]. Although
ubiquitin binding results in entropic loss due to structural ordering,
UCH-L3 exhibits the most efficient DUB activity among all UCHs that
have been investigated so far. It is likely that the local dynamics
around the catalytic site, such as concerted side-chain motions, plays
an important role in the hydrolysis activity.
The controversy in the genetic associations of UCH-L1 variants,
namely I93M and S18Y, with neurodegenerative diseases is another
outstanding question. With the advent of chemical ligation-based
methods to ubiquitinate proteins [104], together with site-specific
fluorophore labeling, one may begin to devise high-throughput
screening procedures to scout for inhibitors or potentiators against
specific UCHs, in order to help identify the physiological substrates
of various UCHs. Compared with other UCHs, UCH-L1 is a very
inefficient enzyme. Loss of UCH-L1 DUB activity due to mutations
or posttranslational modifications is likely to be compensated by
other DUBs. In other words, loss of function is unlikely to be the
major contribution of UCH-L1 to neurodegenerative diseases.

Given the abundance of UCH-L1 in neuronal cells, its misfolding and
aggregation, which may tip the balance of cellular
protein homeostasis, are more likely gain-of-toxicity contributions
to neurodegeneration. Indeed, in vitro biophysical analyses have
observed highly populated folding intermediates during equilibrium
and kinetic unfolding by urea. Furthermore, the PD-associated
I93M mutation significantly destabilizes UCH-L1 and accelerates
its unfolding kinetics [88]. There is good evidence to suggest the
presence of PUFs of UCH-L1 under native conditions, which are
susceptible to posttranslational modifications that would further
increase the aggregation propensity of UCH-L1 [92]. Moreover,
the tendency to co-localize with α-synuclein and Parkin [41],
both of which are well-established PD risk factors, increases the
likelihood of forming aggresomes of UCH-L1 and α-synuclein that
would eventually form Lewy bodies. Structural insights into the
molecular interactions between these PD risk factors are therefore
needed to unravel the underlying mechanism of neurodegeneration.

References

1. Hershko, A., Heller, H., Elias, S., and Ciechanover, A. (1983). Compo-
nents of ubiquitin-protein ligase system. Resolution, affinity purifica-
tion, and role in protein breakdown, J. Biol. Chem., 258, pp. 8206–8214.
2. Ye, Y., and Rape, M. (2009). Building ubiquitin chains: E2 enzymes at
work, Nat. Rev. Mol. Cell Biol., 10, pp. 755–764.
3. Scheffner, M., Nuber, U., and Huibregtse, J. M. (1995). Protein
ubiquitination involving an E1-E2-E3 enzyme ubiquitin thioester
cascade, Nature, 373, pp. 81–83.
4. Komander, D., and Rape, M. (2012). The ubiquitin code, Annu. Rev.
Biochem., 81, pp. 203–229.
5. Ciechanover, A. (1998). The ubiquitin-proteasome pathway: on protein
death and cell life, EMBO J., 17, pp. 7151–7160.
6. Hershko, A., and Ciechanover, A. (1998). The ubiquitin system, Annu.
Rev. Biochem., 67, pp. 425–479.
7. Miranda, M., and Sorkin, A. (2007). Regulation of receptors and
transporters by ubiquitination: new insights into surprisingly similar
mechanisms, Mol. Interventions, 7, pp. 157–167.

8. Mukhopadhyay, D., and Riezman, H. (2007). Proteasome-independent
functions of ubiquitin in endocytosis and signaling, Science, 315, pp.
201–205.
9. Tanno, H., and Komada, M. (2013). The ubiquitin code and its decoding
machinery in the endocytic pathway, J. Biochem., 153, pp. 497–504.
10. Yuan, W. C., et al. (2014). K33-linked polyubiquitination of coronin 7
by Cul3-KLHL20 ubiquitin E3 ligase regulates protein trafficking, Mol.
Cell, 54, pp. 586–600.
11. Trempe, J. F. (2011). Reading the ubiquitin postal code, Curr. Opin.
Struct. Biol., 21, pp. 792–801.
12. Mevissen, T. E., et al. (2013). OTU deubiquitinases reveal mechanisms
of linkage specificity and enable ubiquitin chain restriction analysis,
Cell, 154, pp. 169–184.
13. Ye, Y., et al. (2012). Ubiquitin chain conformation regulates recognition
and activity of interacting proteins, Nature, 492, pp. 266–270.
14. Huang, O. W., and Cochran, A. G. (2013). Regulation of deubiquitinase
proteolytic activity, Curr. Opin. Struct. Biol., 23, pp. 806–811.
15. Nijman, S. M., et al. (2005). A genomic and functional inventory of
deubiquitinating enzymes, Cell, 123, pp. 773–786.
16. Fraile, J. M., Quesada, V., Rodriguez, D., Freije, J. M., and Lopez-Otin,
C. (2012). Deubiquitinases in cancer: new functions and therapeutic
options, Oncogene, 31, pp. 2373–2388.
17. Reyes-Turcu, F. E., Ventii, K. H., and Wilkinson, K. D. (2009). Regulation
and cellular roles of ubiquitin-specific deubiquitinating enzymes,
Annu. Rev. Biochem., 78, pp. 363–397.
18. Amerik, A. Y., and Hochstrasser, M. (2004). Mechanism and function of
deubiquitinating enzymes, Biochim. Biophys. Acta, 1695, pp. 189–207.
19. Reyes-Turcu, F. E., and Wilkinson, K. D. (2009). Polyubiquitin binding
and disassembly by deubiquitinating enzymes, Chem. Rev., 109, pp.
1495–1508.
20. Sanchez-Pulido, L., Kong, L., and Ponting, C. P. (2012). A common
ancestry for BAP1 and Uch37 regulators, Bioinformatics, 28, pp. 1953–
1956.
21. Lam, Y. A., Xu, W., DeMartino, G. N., and Cohen, R. E. (1997). Editing of
ubiquitin conjugates by an isopeptidase in the 26S proteasome, Nature,
385, pp. 737–740.
22. Jensen, D. E., et al. (1998). BAP1: a novel ubiquitin hydrolase which
binds to the BRCA1 RING finger and enhances BRCA1-mediated cell
growth suppression, Oncogene, 16, pp. 1097–1112.
23. White, R. R., et al. (2011). Characterisation of the Trichinella spiralis
deubiquitinating enzyme, TsUCH37, an evolutionarily conserved
proteasome interaction partner, PLOS Negl. Trop. Dis., 5, p. e1340.
24. Wilkinson, K. D., Deshpande, S., and Larsen, C. N. (1992). Comparisons
of neuronal (PGP 9.5) and non-neuronal ubiquitin C-terminal hydro-
lases, Biochem. Soc. Trans., 20, pp. 631–637.
25. Lavender, F. L., Day, I. N., and Thompson, R. J. (1992). DNA sequence
comparison between human and marsupial genes encoding PGP9.5: a
neurone-specific ubiquitin C-terminal hydrolase, Biochem. Soc. Trans.,
20, p. 263S.
26. Thompson, R. J., Doran, J. F., Jackson, P., Dhillon, A. P., and Rode,
J. (1983). PGP 9.5: a new marker for vertebrate neurons and
neuroendocrine cells, Brain Res., 278, pp. 224–228.
27. Papa, L., et al. (2010). Ubiquitin C-terminal hydrolase is a novel
biomarker in humans for severe traumatic brain injury, Crit. Care Med.,
38, pp. 138–144.
28. Liu, M. C., et al. (2010). Ubiquitin C-terminal hydrolase-L1 as a
biomarker for ischemic and traumatic brain injury in rats, Eur. J.
Neurosci., 31, pp. 722–732.
29. Svetlov, S. I., et al. (2010). Morphologic and biochemical character-
ization of brain injury in a model of controlled blast overpressure
exposure, J. Trauma, 69, pp. 795–804.
30. Mondello, S., et al. (2012). Clinical utility of serum levels of ubiquitin
C-terminal hydrolase as a biomarker for severe traumatic brain injury,
Neurosurgery, 70, pp. 666–675.
31. Mayer, A. N., and Wilkinson, K. D. (1989). Detection, resolution, and
nomenclature of multiple ubiquitin carboxyl-terminal esterases from
bovine calf thymus, Biochemistry, 28, pp. 166–172.
32. Wilkinson, K. D., Lee, K. M., Deshpande, S., Duerksen-Hughes, P., Boss,
J. M., and Pohl, J. (1989). The neuron-specific protein PGP 9.5 is a
ubiquitin carboxyl-terminal hydrolase, Science, 246, pp. 670–673.
33. Gong, B., and Leznik, E. (2007). The role of ubiquitin C-terminal
hydrolase L1 in neurodegenerative disorders, Drug News Perspect., 20,
pp. 365–370.
34. Sultana, R., et al. (2007). Proteomics analysis of the Alzheimer’s
disease hippocampal proteome, J. Alzheimers Dis., 11, pp. 153–164.
35. Hattori, N., and Mizuno, Y. (2004). Pathogenetic mechanisms of parkin
in Parkinson’s disease, Lancet, 364, pp. 722–724.

36. Gosal, D., Ross, O. A., and Toft, M. (2006). Parkinson’s disease: the
genetics of a heterogeneous disorder, Eur. J. Neurol., 13, pp. 616–627.
37. Mizuno, Y., et al. (2006). Progress in familial Parkinson’s disease, J.
Neural Transm. Suppl., pp. 191–204.
38. Lowe, J., McDermott, H., Landon, M., Mayer, R. J., and Wilkinson, K. D.
(1990). Ubiquitin carboxyl-terminal hydrolase (PGP 9.5) is selectively
present in ubiquitinated inclusion bodies characteristic of human
neurodegenerative diseases, J. Pathol., 161, pp. 153–160.
39. Saigoh, K., et al. (1999). Intragenic deletion in the gene encoding
ubiquitin carboxy-terminal hydrolase in gad mice, Nat. Genet., 23, pp.
47–51.
40. Osaka, H. (2003). Ubiquitin carboxy-terminal hydrolase L1 binds to
and stabilizes monoubiquitin in neuron, Human Mol. Genet., 12, pp.
1945–1958.
41. Ardley, H. C., Scott, G. B., Rose, S. A., Tan, N. G., and Robinson, P.
A. (2004). UCH-L1 aggresome formation in response to proteasome
impairment indicates a role in inclusion formation in Parkinson’s
disease, J. Neurochem., 90, pp. 379–391.
42. Kabuta, T., Furuta, A., Aoki, S., Furuta, K., and Wada, K. (2008). Aberrant
interaction between Parkinson disease-associated mutant UCH-L1 and
the lysosomal receptor for chaperone-mediated autophagy, J. Biol.
Chem., 283, pp. 23731–23738.
43. Liu, Y., Fallon, L., Lashuel, H. A., Liu, Z., and Lansbury, P. T., Jr. (2002). The
UCH-L1 gene encodes two opposing enzymatic activities that affect
alpha-synuclein degradation and Parkinson’s disease susceptibility,
Cell, 111, pp. 209–218.
44. Leroy, E., et al. (1998). The ubiquitin pathway in Parkinson’s disease,
Nature, 395, pp. 451–452.
45. Harhangi, B. S., et al. (1999). The Ile93Met mutation in the
ubiquitin carboxy-terminal-hydrolase-L1 gene is not observed in
European cases with familial Parkinson’s disease, Neurosci. Lett., 270,
pp. 1–4.
46. Shi, Q., and Tao, E. (2003). An Ile93Met substitution in the UCH-L1 gene
is not a disease-causing mutation for idiopathic Parkinson’s disease,
Chin. Med. J. (Engl.), 116, pp. 312–313.
47. Setsuie, R., et al. (2007). Dopaminergic neuronal loss in transgenic
mice expressing the Parkinson’s disease-associated UCH-L1 I93M
mutant, Neurochem. Int., 50, pp. 119–129.
48. Cuervo, A. M., Stefanis, L., Fredenburg, R., Lansbury, P. T., and
Sulzer, D. (2004). Impaired degradation of mutant alpha-synuclein by
chaperone-mediated autophagy, Science, 305, pp. 1292–1295.
49. Kabuta, T., and Wada, K. (2008). Insights into links between familial
and sporadic Parkinson’s disease: physical relationship between UCH-
L1 variants and chaperone-mediated autophagy, Autophagy, 4, pp.
827–829.
50. Wang, J., Zhao, C. Y., Si, Y. M., Liu, Z. L., Chen, B., and Yu, L. (2002). ACT
and UCH-L1 polymorphisms in Parkinson’s disease and age of onset,
Mov. Disord., 17, pp. 767–771.
51. Tan, E. K., et al. (2006). Case-control study of UCHL1 S18Y variant in
Parkinson’s disease, Mov. Disord., 21, pp. 1765–1768.
52. Satoh, J., and Kuroda, Y. (2001). A polymorphic variation of serine to
tyrosine at codon 18 in the ubiquitin C-terminal hydrolase-L1 gene
is associated with a reduced risk of sporadic Parkinson’s disease in a
Japanese population, J Neurol. Sci., 189, pp. 113–117.
53. Zhang, J., et al. (2000). Association between a polymorphism of
ubiquitin carboxy-terminal hydrolase L1 (UCH-L1) gene and sporadic
Parkinson’s disease, Parkinsonism Relat. Disord., 6, pp. 195–197.
54. Momose, Y., et al. (2002). Association studies of multiple candidate
genes for Parkinson’s disease using single nucleotide polymorphisms,
Ann. Neurol., 51, pp. 133–136.
55. Wintermeyer, P., et al. (2000). Mutation analysis and association
studies of the UCHL1 gene in German Parkinson’s disease patients,
Neuroreport, 11, pp. 2079–2082.
56. Maraganore, D. M., et al. (2004). UCHL1 is a Parkinson’s disease
susceptibility gene, Ann. Neurol., 55, pp. 512–521.
57. Savettieri, G., et al. (2001). Lack of association between ubiquitin
carboxy-terminal hydrolase L1 gene polymorphism and PD, Neurology,
57, pp. 560–561.
58. Mellick, G. D., and Silburn, P. A. (2000). The ubiquitin carboxy-terminal
hydrolase-L1 gene S18Y polymorphism does not confer protection
against idiopathic Parkinson’s disease, Neurosci. Lett., 293, pp. 127–
130.
59. Zhang, Z. J., et al. (2008). Lack of evidence for association of a UCH-
L1 S18Y polymorphism with Parkinson’s disease in a Han-Chinese
population, Neurosci. Lett., 442, pp. 200–202.
60. Tan, E. K., et al. (2010). Analysis of the UCHL1 genetic variant in
Parkinson’s disease among Chinese, Neurobiol. Aging, 31, pp. 2194–
2196.

61. Miyake, Y., et al. (2012). UCHL1 S18Y variant is a risk factor for
Parkinson’s disease in Japan, BMC Neurol., 12, p. 62.
62. Levecque, C., et al. (2001). No genetic association of the ubiquitin
carboxy-terminal hydrolase-L1 gene S18Y polymorphism with familial
Parkinson’s disease, J. Neural. Transm., 108, pp. 979–984.
63. Elbaz, A., et al. (2003). S18Y polymorphism in the UCH-L1 gene and
Parkinson’s disease: evidence for an age-dependent relationship, Mov.
Disord., 18, pp. 130–137.
64. Healy, D. G., et al. (2006). UCHL-1 is not a Parkinson’s disease
susceptibility gene, Ann. Neurol., 59, pp. 627–633.
65. Hutter, C. M., et al. (2008). Lack of evidence for an association between
UCHL1 S18Y and Parkinson’s disease, Eur. J. Neurol., 15, pp. 134–139.
66. Naze, P., Vuillaume, I., Destee, A., Pasquier, F., and Sablonniere, B.
(2002). Mutation analysis and association studies of the ubiquitin
carboxy-terminal hydrolase L1 gene in Huntington’s disease, Neurosci.
Lett., 328, pp. 1–4.
67. Metzger, S., et al. (2006). The S18Y polymorphism in the UCHL1 gene is
a genetic modifier in Huntington’s disease, Neurogenetics, 7, pp. 27–30.
68. Xu, E. H., Tang, Y., Li, D., and Jia, J. P. (2009). Polymorphism of HD and
UCHL-1 genes in Huntington’s disease, J. Clin. Neurosci., 16, pp. 1473–
1477.
69. Xue, S., and Jia, J. (2006). Genetic association between Ubiquitin
Carboxy-terminal Hydrolase-L1 gene S18Y polymorphism and spo-
radic Alzheimer’s disease in a Chinese Han population, Brain Res.,
1087, pp. 28–32.
70. Zetterberg, M., et al. (2010). Ubiquitin carboxy-terminal hydrolase L1
(UCHL1) S18Y polymorphism in Alzheimer’s disease, Mol. Neurode-
gener., 5, p. 11.
71. Poon, W. W., et al. (2013). beta-Amyloid (Abeta) oligomers impair
brain-derived neurotrophic factor retrograde trafficking by down-
regulating ubiquitin C-terminal hydrolase, UCH-L1, J. Biol. Chem., 288,
pp. 16937–16948.
72. Rudolph, T., et al. (2011). Ubiquitin carboxyl-terminal esterase L1
(UCHL1) S18Y polymorphism in patients with cataracts, Ophthalmic
Genet., 32, pp. 75–79.
73. Chiti, F., and Dobson, C. M. (2006). Protein misfolding, functional
amyloid, and human disease, Annu. Rev. Biochem., 75, pp. 333–366.
74. Bilguvar, K., et al. (2013). Recessive loss of function of the neuronal
ubiquitin hydrolase UCHL1 leads to early-onset progressive neurode-
generation, Proc. Natl. Acad. Sci. U S A, 110, pp. 3489–3494.
75. Fang, Y., Fu, D., and Shen, X. Z. (2010). The potential role of ubiquitin c-
terminal hydrolases in oncogenesis, Biochim. Biophys. Acta, 1806, pp.
1–6.
76. Jara, J. H., Frank, D. D., and Ozdinler, P. H. (2013). Could dysregu-
lation of UPS be a common underlying mechanism for cancer and
neurodegeneration? Lessons from UCHL1, Cell Biochem. Biophys., 67,
pp. 45–53.
77. Hussain, S., et al. (2010). The de-ubiquitinase UCH-L1 is an oncogene
that drives the development of lymphoma in vivo by deregulating
PHLPP1 and Akt signaling, Leukemia, 24, pp. 1641–1655.
78. Bheda, A., Yue, W., Gullapalli, A., Shackelford, J., and Pagano, J. S. (2011).
PU.1-dependent regulation of UCH L1 expression in B-lymphoma cells,
Leuk. Lymphoma, 52, pp. 1336–1347.
79. Jang, M. J., Baek, S. H., and Kim, J. H. (2011). UCH-L1 promotes cancer
metastasis in prostate cancer cells through EMT induction, Cancer
Lett., 302, pp. 128–135.
80. Ma, Y., et al. (2010). Proteomic profiling of proteins associated with
lymph node metastasis in colorectal cancer, J. Cell Biochem., 110, pp.
1512–1519.
81. Schroder, C., et al. (2013). Prognostic relevance of ubiquitin C-terminal
hydrolase L1 (UCH-L1) mRNA and protein expression in breast cancer
patients, J. Cancer Res. Clin. Oncol., 139, pp. 1745–1755.
82. Orr, K. S., et al. (2011). Potential prognostic marker ubiquitin carboxyl-
terminal hydrolase-L1 does not predict patient survival in non-small
cell lung carcinoma, J. Exp. Clin. Cancer Res., 30, p. 79.
83. Frisan, T., Coppotelli, G., Dryselius, R., and Masucci, M. G. (2012).
Ubiquitin C-terminal hydrolase-L1 interacts with adhesion complexes
and promotes cell migration, survival, and anchorage independent
growth, FASEB J., 26, pp. 5060–5070.
84. Kabuta, T., et al. (2013). Ubiquitin C-terminal hydrolase L1 (UCH-L1)
acts as a novel potentiator of cyclin-dependent kinases to enhance cell
proliferation independently of its hydrolase activity, J. Biol. Chem., 288,
pp. 12615–12626.
85. Naito, S., et al. (2006). Characterization of multimeric variants of
ubiquitin carboxyl-terminal hydrolase L1 in water by small-angle
neutron scattering, Biochem. Biophys. Res. Commun., 339, pp. 717–725.
86. Das, C., et al. (2006). Structural basis for conformational plasticity of
the Parkinson’s disease-associated ubiquitin hydrolase UCH-L1, Proc.
Natl. Acad. Sci. U S A, 103, pp. 4675–4680.

87. Boudreaux, D. A., Maiti, T. K., Davies, C. W., and Das, C. (2010). Ubiquitin
vinyl methyl ester binding orients the misaligned active site of the
ubiquitin hydrolase UCHL1 into productive conformation, Proc. Natl.
Acad. Sci. U S A, 107, pp. 9117–9122.
88. Andersson, F. I., et al. (2011). The effect of Parkinson’s-disease-
associated mutations on the deubiquitinating enzyme UCH-L1, J. Mol.
Biol., 407, pp. 261–272.
89. Krishna, M. M., Hoang, L., Lin, Y., and Englander, S. W. (2004).
Hydrogen exchange methods to study protein folding, Methods, 34,
pp. 51–64.
90. Choi, J., et al. (2004). Oxidative modifications and down-regulation of
ubiquitin carboxyl-terminal hydrolase L1 associated with idiopathic
Parkinson’s and Alzheimer’s diseases, J. Biol. Chem., 279, pp. 13256–
13264.
91. Kabuta, T., et al. (2008). Aberrant molecular properties shared by
familial Parkinson’s disease-associated mutant UCH-L1 and carbonyl-
modified UCH-L1, Hum. Mol. Genet., 17, pp. 1482–1496.
92. Werrell, E. F. (2011). Biophysical Studies on the Neuronal Ubiquitin C-
terminal hydrolase, UCH-L1 (University of Cambridge, Cambridge).
93. Contu, V. R., et al. (2014). Endogenous neurotoxic dopamine deriv-
ative covalently binds to Parkinson’s disease-associated ubiquitin
C-terminal hydrolase L1 and alters its structure and function, J.
Neurochem., 130, pp. 826–838.
94. Toyama, T., Shinkai, Y., Yazawa, A., Kakehashi, H., Kaji, T., and Kumagai,
Y. (2014). Glutathione-mediated reversibility of covalent modification
of ubiquitin carboxyl-terminal hydrolase L1 by 1,2-naphthoquinone
through Cys152, but not Lys4, Chem. Biol. Interact., 214,
pp. 41–48.
95. Li, Z., et al. (2004). Delta12-Prostaglandin J2 inhibits the ubiquitin
hydrolase UCH-L1 and elicits ubiquitin-protein aggregation without
proteasome inhibition, Biochem. Biophys. Res. Commun., 319, pp.
1171–1180.
96. Koharudin, L. M., Liu, H., Di Maio, R., Kodali, R. B., Graham, S. H.,
and Gronenborn, A. M. (2010). Cyclopentenone prostaglandin-induced
unfolding and aggregation of the Parkinson disease-associated UCH-
L1, Proc. Natl. Acad. Sci. U S A, 107, pp. 6835–6840.
97. Johnston, S. C., Larsen, C. N., Cook, W. J., Wilkinson, K. D., and Hill, C. P.
(1997). Crystal structure of a deubiquitinating enzyme (human UCH-
L3) at 1.8 A resolution, EMBO J., 16, pp. 3787–3796.
March 21, 2016 13:39 PSP Book - 9in x 6in 05-Allan-Svendsen-c05

198 Folding Dynamics and Structural Basis of the Enzyme Mechanism

98. Larsen, C. N., Price, J. S., and Wilkinson, K. D. (1996). Substrate binding
and catalysis by ubiquitin C-terminal hydrolases: identification of two
active site residues, Biochemistry, 35, pp. 6735–6744.
99. Kurihara, L. J., Semenova, E., Levorse, J. M., and Tilghman, S. M.
(2000). Expression and functional analysis of Uch-L3 during mouse
development, Mol. Cell Biol., 20, pp. 2498–2504.
100. Kurihara, L. J., Kikuchi, T., Wada, K., and Tilghman, S. M. (2001). Loss
of Uch-L1 and Uch-L3 leads to neurodegeneration, posterior paralysis
and dysphagia, Hum. Mol. Genet., 10, pp. 1963–1970.
101. Setsuie, R., Sakurai, M., Sakaguchi, Y., and Wada, K. (2009). Ubiquitin
dimers control the hydrolase activity of UCH-L3, Neurochem. Int., 54,
pp. 314–321.
102. Dennissen, F. J., et al. (2011). Mutant ubiquitin (UBB+1) associated
with neurodegenerative disorders is hydrolyzed by ubiquitin C-
terminal hydrolase L3 (UCH-L3), FEBS Lett., 585, pp. 2568–2574.
103. Wada, H., Kito, K., Caskey, L. S., Yeh, E. T., and Kamitani, T. (1998).
Cleavage of the C-terminus of NEDD8 by UCH-L3, Biochem. Biophys.
Res. Commun., 251, pp. 688–692.
104. Kumar, K. S., Spasser, L., Ohayon, S., Erlich, L. A., and Brik, A. (2011).
Expeditious chemical synthesis of ubiquitinated peptides employing
orthogonal protection and native chemical ligation, Bioconjug. Chem.,
22, pp. 137–143.
105. Ohayon, S., Spasser, L., Aharoni, A., and Brik, A. (2012). Targeting
deubiquitinases enabled by chemical synthesis of proteins, J. Am.
Chem. Soc., 134, pp. 3281–3289.
106. Liu, Y., et al. (2003). Discovery of inhibitors that elucidate the role of
UCH-L1 activity in the H1299 lung cancer cell line, Chem. Biol., 10, pp.
837–846.
107. Hirayama, K., Aoki, S., Nishikawa, K., Matsumoto, T., and Wada, K.
(2007). Identification of novel chemical inhibitors for ubiquitin C-
terminal hydrolase-L3 by virtual screening, Bioorg. Med. Chem., 15, pp.
6810–6818.
108. Yao, T., et al. (2006). Proteasome recruitment and activation of the
Uch37 deubiquitinating enzyme by Adrm1, Nat. Cell Biol., 8, pp. 994–
1002.
109. Yao, T., et al. (2008). Distinct modes of regulation of the Uch37 deu-
biquitinating enzyme in the proteasome and in the Ino80 chromatin-
remodeling complex, Mol. Cell, 31, pp. 909–917.

www.ebook3000.com
March 21, 2016 13:39 PSP Book - 9in x 6in 05-Allan-Svendsen-c05

References 199

110. Peth, A., Kukushkin, N., Bosse, M., and Goldberg, A. L. (2013).
Ubiquitinated proteins activate the proteasomal ATPases by binding
to Usp14 or Uch37 homologs, J. Biol. Chem., 288, pp. 7781–7790.
111. Liu, C. W., and Jacobson, A. D. (2013). Functions of the 19S complex in
proteasomal degradation, Trends Biochem. Sci., 38, pp. 103–110.
112. Jiao, L., et al. (2014). Mechanism of the Rpn13-induced activation of
Uch37, Protein Cell, 5, pp. 616–630.
113. Chen, Y., et al. (2012). Expression and clinical significance of UCH37
in human esophageal squamous cell carcinoma, Dig. Dis. Sci., 57, pp.
2310–2317.
114. Wicks, S. J., et al. (2005). The deubiquitinating enzyme UCH37 interacts
with Smads and regulates TGF-beta signalling, Oncogene, 24, pp. 8080–
8084.
115. Cutts, A. J., Soond, S. M., Powell, S., and Chantry, A. (2011). Early phase
TGFbeta receptor signalling dynamics stabilised by the deubiquitinase
UCH37 promotes cell migratory responses, Int. J. Biochem. Cell Biol., 43,
pp. 604–612.
116. Zhou, Z. R., Zhang, Y. H., Liu, S., Song, A. X., and Hu, H. Y. (2012). Length
of the active-site crossover loop defines the substrate specificity of
ubiquitin C-terminal hydrolases for ubiquitin chains, Biochem. J., 441,
pp. 143–149.
117. Carbone, M., Yang, H., Pass, H. I., T., K., Testa, J. R., and Gaudino, G.
(2013). BAP1 and cancer, Nat. Rev. Cencer, 13, pp. 153–159.
118. Harbour, J. W., et al. (2010). Frequent mutation of BAP1 in metastasiz-
ing uveal melanomas, Science, 330, pp. 1410–1413.
119. Wiesner, T., et al. (2012). Toward an improved definition of the tumor
spectrum associated with BAP1 germline mutations, J. Clin. Oncol., 30,
pp. e337–340.
120. Dey, A., et al. (2012). Loss of the tumor suppressor BAP1 causes
myeloid transformation, Science, 337, pp. 1541–1546.
121. Yu, H., et al. (2010). The ubiquitin carboxyl hydrolase BAP1 forms a
ternary complex with YY1 and HCF-1 and is a critical regulator of gene
expression, Mol. Cell Biol., 30, pp. 5071–5085.
122. Scheuermann, J. C., et al. (2010). Histone H2A deubiquitinase activity
of the Polycomb repressive complex PR-DUB, Nature, 465, pp. 243–
247.
123. Holm, L., and Sander, C. (1993). Protein structure comparison by
alignment of distance matrices, J. Mol. Biol., 233, pp. 123–138.
March 21, 2016 13:39 PSP Book - 9in x 6in 05-Allan-Svendsen-c05

200 Folding Dynamics and Structural Basis of the Enzyme Mechanism

124. Johnston, S. C., Riddle, S. M., Cohen, R. E., and Hill, C. P. (1999).
Structural basis for the specificity of ubiquitin C-terminal hydrolases,
EMBO J., 18, pp. 3877–3887.
125. Misaghi, S., Galardy, P. J., Meester, W. J., Ovaa, H., Ploegh, H. L.,
and Gaudet, R. (2005). Structure of the ubiquitin hydrolase UCH-L3
complexed with a suicide substrate, J. Biol. Chem., 280, pp. 1512–
1520.
126. Andersson, F. I., Jackson, S. E., and Hsu, S. T. (2010). Backbone
assignments of the 26 kDa neuron-specific ubiquitin carboxyl-terminal
hydrolase L1 (UCH-L1), Biomol. NMR Assign., 4, pp. 41–43.
127. Harris, R., et al. (2007). Backbone 1H, 13C, and 15N resonance
assignments for the 26-kD human de-ubiquitinating enzyme UCH-L3,
Biomol. NMR Assign., 1, pp. 51–53.
128. Nishio, K., et al. (2009). Crystal structure of the de-ubiquitinating
enzyme UCH37 (human UCH-L5) catalytic domain, Biochem. Biophys.
Res. Commun., 390, pp. 855–860.
129. Maiti, T. K., Permaul, M., Boudreaux, D. A., Mahanic, C., Mauney, S., and
Das, C. (2011). Crystal structure of the catalytic domain of UCHL5,
a proteasome-associated human deubiquitinating enzyme, reveals an
unproductive form of the enzyme, FEBS J., 278, pp. 4917–4926.
130. Burgie, S. E., Bingman, C. A., Soni, A. B., and Phillips, G. N., Jr. (2011).
Structural characterization of human Uch37, Proteins, 80, pp. 649–
654.
131. Morrow, M. E., et al. (2013). Stabilization of an unusual salt bridge
in ubiquitin by the extra C-terminal domain of the proteasome-
associated deubiquitinase UCH37 as a mechanism of its exo specificity,
Biochemistry, 52, pp. 3564–3578.
132. Artavanis-Tsakonas, K., et al. (2010). Characterization and structural
studies of the Plasmodium falciparum ubiquitin and Nedd8 hydrolase
UCHL3, J. Biol. Chem., 285, pp. 6857–6866.
133. Virnau, P., Mirny, L. A., and Kardar, M. (2006). Intricate knots in
proteins: function and evolution, PLOS Comp. Biol., 2, p. e122.
134. Lai, Y. L., Yen, S. C., Yu, S. H., and Hwang, J. K. (2007). pKNOT: the protein
KNOT web server, Nucleic Acids Res., 35, pp. W420–424.
135. Virnau, P., Mallam, A., and Jackson, S. (2011). Structures and folding
pathways of topologically knotted proteins, J. Phys. Condens. Matter, 23,
p. 033101.
136. Mallam, A. L. (2009). How does a knotted protein fold?, FEBS J., 276,
pp. 365–375.

www.ebook3000.com
March 21, 2016 13:39 PSP Book - 9in x 6in 05-Allan-Svendsen-c05

References 201

137. Taylor, W. R. (2000). A deeply knotted protein structure and how it


might fold, Nature, 406, pp. 916–919.
138. Andersson, F. I., Pina, D. G., Mallam, A. L., Blaser, G., and Jackson, S. E.
(2009). Untangling the folding mechanism of the 5(2)-knotted protein
UCH-L3, FEBS J., 276, pp. 2625–2635.
139. Sulkowska, J. I., Noel, J. K., Ramirez-Sarmiento, C. A., Rawdon, E. J.,
Millett, K. C., and Onuchic, J. N. (2013). Knotting pathways in proteins,
Biochem. Soc. Trans., 41, pp. 523–527.
140. Kim, H. J., et al. (2014). N-terminal truncated UCH-L1 prevents
Parkinson’s disease associated damage, PLOS ONE, 9, p. e99654.
141. Dang, L. C., Melandri, F. D., and Stein, R. L. (1998). Kinetic and
mechanistic studies on the hydrolysis of ubiquitin C-terminal 7-amido-
4-methylcoumarin by deubiquitinating enzymes, Biochemistry, 37, pp.
1868–1879.
142. Case, A., and Stein, R. L. (2006). Mechanistic studies of ubiquitin C-
terminal hydrolase L1, Biochemistry, 45, pp. 2443–2452.
143. Rajesh, S., Sakamoto, T., Iwamoto-Sugai, M., Shibata, T., Kohno, T., and
Ito, Y. (1999). Ubiquitin binding interface mapping on yeast ubiquitin
hydrolase by NMR chemical shift perturbation, Biochemistry, 38, pp.
9242–9253.
144. Roth, G., et al. (2007). Ubiquitin binds to a short peptide segment of
hydrolase UCH-L3: a study by FCS, RIfS, ITC and NMR, ChemBioChem,
8, pp. 323–331.
145. Wilkinson, K. D., et al. (1999). The binding site for UCH-L3 on ubiquitin:
mutagenesis and NMR studies on the complex between ubiquitin and
UCH-L3, J. Mol. Biol., 291, pp. 1067–1077.
146. Larsen, C. N., Krantz, B. A., and Wilkinson, K. D. (1998). Substrate speci-
ficity of deubiquitinating enzymes: ubiquitin C-terminal hydrolases,
Biochemistry, 37, pp. 3358–3368.
147. Pereira, R. V., et al. (2014). Conservation and developmental expres-
sion of ubiquitin isopeptidases in Schistosoma mansoni, Mem. Inst.
Oswaldo Cruz, 109, pp. 1–8.
148. Boudreaux, D. A., Chaney, J., Maiti, T. K., and Das, C. (2012). Contribution
of active site glutamine to rate enhancement in ubiquitin C-terminal
hydrolases, FEBS J., 279, pp. 1106–1118.
149. Luchansky, S. J., Lansbury, P. T., Jr., and Stein, R. L. (2006). Substrate
recognition and catalysis by UCH-L1, Biochemistry, 45, pp. 14717–
14725.
March 21, 2016 13:39 PSP Book - 9in x 6in 05-Allan-Svendsen-c05

202 Folding Dynamics and Structural Basis of the Enzyme Mechanism

150. Nishikawa, K., et al. (2003). Alterations of structure and hydrolase ac-
tivity of parkinsonism-associated human ubiquitin carboxyl-terminal
hydrolase L1 variants, Biochem. Biophys. Res. Commun., 304, pp. 176–
183.
151. Lee, J. G., Baek, K., Soetandyo, N., and Ye, Y. (2013). Reversible
inactivation of deubiquitinases by reactive oxygen species in vitro and
in cells, Nat. Commun., 4, p. 1568.
152. Mermerian, A. H., Case, A., Stein, R. L., and Cuny, G. D. (2007). Structure-
activity relationship, kinetic mechanism, and selectivity for a new class
of ubiquitin C-terminal hydrolase-L1 (UCH-L1) inhibitors, Bioorg. Med.
Chem. Lett., 17, pp. 3729–3732.
153. Spasser, L., and Brik, A. (2012). Chemistry and biology of the ubiquitin
signal, Angew. Chem., Int. Ed. Engl., 51, pp. 6840–6862.
154. Hemantha, H. P., et al. (2014). Nonenzymatic polyubiquitination of
expressed proteins, J. Am. Chem. Soc., 136, pp. 2665–2673.

www.ebook3000.com
March 21, 2016 13:40 PSP Book - 9in x 6in 06-Allan-Svendsen-c06

Chapter 6

Stabilization of Enzymes by Metal Binding: Structures of Two
Alkalophilic Bacillus Subtilases and Analysis of the Second
Metal-Binding Site of the Subtilase Family

Jan Dohnalek,a,b,c Katherine E. McAuley,a,d Andrzej M. Brzozowski,a
Peter R. Østergaard,e Allan Svendsen,e and Keith S. Wilsona

a Structural Biology Laboratory, Department of Chemistry, University of York,
Heslington, York YO10 5DD, UK
b Institute of Biotechnology of the Academy of Sciences of the Czech Republic,
Vídeňská 1083, 14220 Praha 4, Czech Republic
c Institute of Macromolecular Chemistry, Academy of Sciences of the Czech Republic,
Heyrovskeho nam. 2, 16206 Praha 6, Czech Republic
d Diamond Light Source, Harwell Science and Innovation Campus, Didcot,
Oxon, OX11 0DE, UK
e Novozymes A/S, Brudelysvej 26, DK-2880 Bagsværd, Denmark

dohnalek@ibt.cas.cz, keith.wilson@york.ac.uk

6.1 Introduction: Subtilases and Metal Binding

The subtilases, a family of serine endoproteinases (EC. 3.4.21.14)
with molecular weight typically in the range 26–29 kDa, make
up one of the most intensively studied families of enzymes [1].
The subtilase family S8 [2] contains many nonspecific enzymes
that function through a classical aspartate-histidine-serine catalytic
triad but have no sequence or fold similarity to the trypsin family.
Their substrate specificity is usually rather broad and they function
as general-purpose proteinases with a preference for cleavage after
hydrophobic amino acids. In bacteria, archaea, and fungi, family
members are probably involved in nutrition, and the majority of
them are secreted. Most subtilases are active at neutral to mildly
alkaline pH, and many are thermostable.
Subtilases have acted as paradigms for studies inter alia of
kinetics and mechanism [3], 3D structure, protein–ligand interac-
tions, stability to denaturation, complexes with naturally occurring
polypeptide inhibitors, and site-directed mutagenesis [4]. Hence a
wealth of knowledge is available on their structures and functions.
Subtilases were first isolated from bacilli, and Bacillus subtilisins
have long been a focus of both academic study and commercial
exploitation, the latter primarily through their use in the detergent
industry. Bacilli generally secrete subtilisins from the cell, and
this imposes a need for extra stability compared to intracellular
enzymes. Bacillus subtilisins vary considerably in sequence, with
identities as low as 30%, but have very similar 3D folds. They share
several key features, such as a leader sequence to ensure transport
from the cell, and the majority of the family possess di- and/or
monovalent metal-binding sites that contribute to stability. In the
classical subtilisins there is a strong calcium-binding site and a
second, weak calcium site that has an alternative sodium site close to it.
The term “subtilisin” is conventionally restricted to enzymes
from members of the bacilli, and the name “subtilase” was
introduced later to allow inclusion of a number of homologous
enzymes from other species such as proteinase K [5] from the fungus
Tritirachium album Limber and thermitase [6] from the bacterium
Thermoactinomyces vulgaris. These enzymes retain the family fold
and have active sites very similar to those of the Bacillus subtilisins
but have much lower levels of sequence similarity with many more
insertions/deletions in surface loops. While they generally retain
a dependency on calcium for their structural integrity and activity,
the calcium-binding sites often differ considerably from those in the
Bacillus enzymes. Thus, while the principle of using calcium as an
extracellular stabilizing agent is exploited by all the extracellular
subtilases, individual calcium-binding sites have been independently
lost or acquired at various points during their evolution.
The Bacillus amyloliquefaciens subtilisin BPN′ and the identical
subtilisin Novo were among the first enzyme structures to be
determined by X-ray crystallography [7, 8]. The shape corresponds
roughly to a half sphere with a diameter of about 40 Å, the active
site being located on the flat surface of the hemisphere. The fold is
made up of a single domain, composed of mixed αβ elements based
on a central parallel β-sheet with helices running anti-parallel to it.
Since then an extensive series of structures has followed, both from
Bacillus species and of subtilases from other organisms. To simplify
discussion elsewhere in the text, the species, PDB codes, acronyms,
and original references are listed in Table 6.1 for a representative
set of subtilases of known 3D structure. The table is not intended
to encompass all members of the family but rather to include
representatives of the major Bacillus enzyme groupings together
with other representative species. Figure 6.1 shows the sequences of
the enzymes in Table 6.1 aligned on the basis of 3D structural
equivalence using the EBI secondary structure matching (SSM)
algorithm [9]. The conservation of the core of the structure is evident,
with a number of key residues present in all species.
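
The "structure-based sequence identity" columns of Table 6.1 report a percentage together with a count of identical residues. As a minimal sketch of that kind of bookkeeping, the Python function below counts identities over the columns of a pairwise alignment in which both rows carry a residue. The function name, the gap character, and the choice of denominator (aligned columns only) are assumptions made for illustration; the exact definition used by EBI-SSM [9] may differ, and the two toy rows are invented rather than taken from Fig. 6.1.

```python
def aligned_identity(seq_a, seq_b, gap="-"):
    """Return (percent identity, identical count, aligned count) for two
    equal-length alignment rows. Columns with a gap in either row are
    treated as not structurally equivalent and are skipped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("alignment rows must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        aligned += 1
        if a == b:
            identical += 1
    percent = 100.0 * identical / aligned if aligned else 0.0
    return percent, identical, aligned


# Toy rows, invented for illustration only (not taken from Fig. 6.1).
row_1 = "AQSVPYG-VSQIKA"
row_2 = "AQTVPWGISR-QAP"
pct, n_identical, n_aligned = aligned_identity(row_1, row_2)
print(f"{pct:.0f}% identity ({n_identical} of {n_aligned} aligned residues)")
```

Reporting the count alongside the percentage, as Table 6.1 does, makes it possible to see how many equivalent positions a given percentage actually rests on.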
Subtilases bind their polypeptide substrates through a set of
amino acid-binding subsites, labeled as S1, S2, S3, etc., on the N-
terminal side of the peptide bond to be cleaved, and S1′, S2′, S3′, etc.,
on the C-terminal side [24]. The corresponding sites on the substrate
are referred to as P1, P1′, etc. This notation is adhered to for the
present complexes with the chymotrypsin inhibitor 2A (CI2A) [25].
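
The subsite notation can be made concrete with a short sketch. The Python function below labels the residues of a peptide relative to the scissile bond, with unprimed positions counted toward the N-terminus and primed positions toward the C-terminus. The function name, its arguments, and the example hexapeptide are hypothetical and serve only to illustrate the numbering.

```python
def subsite_labels(peptide, scissile_after):
    """Label residues Pn..P1 | P1'..Pm' around the scissile bond.

    `scissile_after` is the 1-based index of the residue on the
    N-terminal side of the cleaved bond (the P1 residue)."""
    if not 1 <= scissile_after < len(peptide):
        raise ValueError("scissile bond must lie inside the peptide")
    labels = []
    for i, residue in enumerate(peptide, start=1):
        if i <= scissile_after:
            # N-terminal side: P1 is adjacent to the bond, numbers grow outward.
            labels.append((f"P{scissile_after - i + 1}", residue))
        else:
            # C-terminal side: primed positions, numbers grow outward.
            labels.append((f"P{i - scissile_after}'", residue))
    return labels


# Hypothetical hexapeptide cleaved between residues 4 and 5.
for label, residue in subsite_labels("AAPFGS", 4):
    print(label, residue, "binds enzyme subsite", label.replace("P", "S"))
```

Each substrate position Pn (or Pn′) is accommodated by the enzyme subsite Sn (or Sn′) of the same index.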
We report here the crystal structures of two subtilases, one from
Bacillus sp. TY145 (henceforth SubTY) and one from B. halmapalus
(henceforth SubHal). SubTY is closely related to the sphericase
subtilase from B. sphaericus 2297, for which the 3D structure was
reported earlier [18, 25]. SubHal is highly similar (94% identity)
to subtilase KP-43 [21], and these two show substantial differences
from other Bacillus subtilisins, with an additional domain and
different calcium-binding sites. These differences are evident from
even a quick glance at Fig. 6.1.

Table 6.1 Representative set of subtilases with known structures used for comparison. Sequence identity derived from standard
sequence alignment. Sequence identities were calculated for the catalytic domains only. Structure-based sequence identities are
defined as by EBI-SSM [9]. Columns 11–14 give secondary structure matching to SubTY and columns 15–18 secondary structure
matching to SubHal.

Common name [reference] | Species of origin | Sequence length (catalytic) | PDB id of compared structure | Sequence identity to Sub BPN′ (sequence based only) (%) | Structure based sequence identity to Sub BPN′ (%, no) | Number of disulfide bonds | Number of Ca2+ sites | Number of Na+ sites | CI2 or Eglin C complex determined | Sequence identity to SubTY (%) | No. residues aligned | No. SSEa aligned | r.m.s.d. (Å) | Sequence identity to SubHal (%) | No. residues aligned | No. SSE aligned | r.m.s.d. (Å)
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Subtilisin BPN′ Novo [12] | Bacillus amyloliquefaciens | 275 | 1lw6 | 100 | 100 | 0 | 1 | 0 | CI2 | 35 | 247 | 14 | 1.1 | 24 | 252 | 17 | 1.5
Mesentericopeptidase [13] | Bacillus pumilus | 275 | 1mee | 86 | 86, 237 | 0 | 2 | 0 | Eglin C | 35 | 250 | 14 | 1.1 | 25 | 255 | 17 | 1.6
Subtilisin E [14] | Bacillus subtilis 168 | 275 | 1scj | 85 | 86, 236 | 0 | 2 | 0 | Autodigestion product | 34 | 248 | 14 | 1.1 | 24 | 249 | 17 | 1.5
Subtilisin Carlsberg [15] | Bacillus licheniformis | 274 | 1cse | 69 | 70, 192 | 0 | 2 | 0 | Eglin C | 39 | 249 | 14 | 1.1 | 24 | 253 | 17 | 1.6
Savinase [16] | Bacillus lentus | 275 | 1svn | 59 | 61, 164 | 0 | 1 | 1 | CI2 | 38 | 249 | 14 | 1.1 | 26 | 250 | 17 | 1.7
M-protease [17] | Bacillus KSM-K16 | 269 | 1mpt | 59 | 61, 163 | 0 | 2 | 0 | – | 38 | 247 | 14 | 1.1 | 26 | 250 | 17 | 1.8
Thermitase [6] | Thermoactinomyces vulgaris | 279 | 2tec | 41 | 44, 115 | 0 | 2 | 1 | Eglin C | 33 | 246 | 14 | 1.2 | 22 | 249 | 17 | 1.7
Sphericase [18] | Bacillus sphaericus | 310 | 1ea7 | 35 | 41, 102 | 1 | 4+ | 1+ | – | 72 | 310 | 16 | 0.6 | 25 | 242 | 14 | 1.7
SubTY | Bacillus sp. TY145 | 311 | 5ffn | 35 | 42, 104 | 1 | 3 | 0 | CI2 | 100 | – | – | – | 26 | 243 | 13 | 1.6
Proteinase K [19] | Tritirachium album | 279 | 1ic6 | 32 | 39, 92 | 2 | 2 | 0 | – | 31 | 250 | 14 | 1.6 | 20 | 229 | 15 | 1.8
Vibrio proteinase [20] | Vibrio sp. PA-44 | 291 | 1sh7 | 33 | 38, 93 | 3 | 3 | 0 | PMS | 30 | 251 | 14 | 1.4 | 22 | 245 | 15 | 2.1
Kp-43 [21] | Bacillus sp. KSM-KP43 | 434 (317) | 1wmd | 25 | 31, 78 | 0 | 3 | 0 | – | 26 | 241 | 14 | 1.5 | 94 | 433 | 28 | 0.5
SubHal | Bacillus halmapalus | 433 (318) | 5fbz | 24 | 31, 79 | 0 | 3 | 0 | CI2 | 26 | 243 | 13 | 1.6 | 100 | – | – | –
Furin [22] | Mus musculus | 686 (331) | 1p8j | 22 | 25, 63 | 3 | 2 | 0 | Peptide inhibitor | 18 | 247 | 11 | 1.9 | 15 | 251 | 14 | 1.8
Kexin [23] | Saccharomyces cerevisiae | 701 (337) | 1ot5 | 22 | 25, 64 | 2 | 3 | 0 | Peptide inhibitor | 19 | 256 | 11 | 2.1 | 16 | 253 | 14 | 1.8

a Secondary structure element.
+ The number referred to by the authors is 5+0 in contrast to our findings in their structure.

Figure 6.1 Structure-based multiple sequence alignment of subtilases (EBI SSM [9]). Shading: 100% conserved red, similar
(Risler matrix, 0.7 similarity score) yellow, and residues of the mature protein with unknown coordinates. Framed blocks mark
equivalent residues in 3D. Consensus regions of secondary structure elements are indicated. Full mature sequences are shown
with the exception of furin and kexin for which only residues with known coordinates are included. Prepared using programs
GENEDOC [10] and ESPRIPT [11].

Through the rest of the paper we refer to a set of classical Bacillus
subtilisins. This set comprises subtilisins
BPN′ [7], Novo [8], Carlsberg (also called Alcalase) [15], and B. clausii
(B. lentus) subtilisin (also called subtilase 309 or Savinase) [16].
For all of these the sequences and 3D structures are known, with
highly similar folds and almost no insertions or deletions, and
all possess equivalently located strong Ca-I and weak Ca-II calcium-
binding sites, with a monovalent ion site (Na-II) close to Ca-II.
Dependence upon bound divalent (usually calcium) and mono-
valent ions for stability and activity is an important feature of the
Bacillus subtilisins, as of several other secreted bacterial enzymes.
In the next section we propose a convention for the labeling of these
sites in the enzymes, which should allow a more straightforward
comparison of their presence or absence in future structures.

6.1.1 Calcium-Binding Sites in Bacillus: Proposal for a Standard Nomenclature

The two calcium sites, first identified in BPN′ and subsequently in
classical Bacillus subtilisin structures, were referred to as Ca-I (or
sometimes A) for the tighter-binding site and Ca-II (or sometimes
B) for the weaker-binding site. There is an alternatively occupied
monovalent cation (usually Na+)-binding site, which we propose to
refer to as Na-II, roughly 2.7 Å from Ca-II. We propose to retain the
names Ca-I and Ca-II for these two calcium sites. Sphericase was the
first Bacillus subtilisin reported to deviate substantially from this
classical two-calcium model. We propose to renumber the five sites
in sphericase (Ca-1 to Ca-5) [18] as follows: Ca-1 becomes Ca-IV,
Ca-2 becomes Ca-V, Ca-3 becomes Ca-VI, Ca-4 becomes Ca-VII, while
Ca-5 becomes Na-II (i.e., the sodium ion–binding site in the reference
set). On the basis of our present structural analysis of SubTY (see the
following section), we believe that site Ca-5 in sphericase is in all
probability a sodium ion, although the absence of anomalous scattering
information from the deposited data prevents us from calculating
anomalous difference maps to confirm this.
Sites in future structures (starting with SubHal below) can then
be either referred to one of these sites or assigned higher numbers if
they lie in novel locations (as found in SubHal and KP-43). Table 6.2
shows the suggested numbering of the sites as well as the moiety
present at the site in the set of structures included in our
comparisons. We have renumbered calcium sites Ca-I, Ca-II, and
Ca-III from the KP-43 publication [21] as Ca-X, Ca-XII, and Ca-XI,
respectively.

Table 6.2 Ca2+/Na+ sites in the compared structures. The Kp43 and SubHal sites XI and XII are not present in the P-domains of
furin and kexin. For each subtilase only the sites with an entry are listed; all other sites (I to XII) are marked "–" in the
original layout.

Subtilase | Site | Ion | q | Residues
--- | --- | --- | --- | ---
BPN′ | I | Ca2+ | 1.0 | Q2, D41, L75, N77, I79, V81
BPN′ | II | (H2O) | not given | G169, Y171, V174
Mesentericopeptidase | I | Ca2+ | 1.0 | Q2, D41, L75, N77, I79, V81
Mesentericopeptidase | II | Ca2+ | 1.0 | A169, Y171, T174, 1× H2O
Subtilisin E | I | Ca2+ | 1.0 | Q2, D41, L75, N77, I79, V81
Subtilisin E | II | Ca2+ | 1.0 | A169, Y171, T174, 1× H2O
Carlsberg | I | Ca2+ | 0.9 | Q2, D41, L75, N77, T79, V81
Carlsberg | II | Ca2+ | 0.42 | A169, Y171, V174, 2× H2O
Savinase (unpublished CI2A complex) | I | Ca2+ | 1.0 | Q2, D41, L75, N77, I79, V81
Savinase (unpublished CI2A complex) | II | Na+ | 1.0 | A169, Y171, A174, 2× H2O
M-protease | I | Ca2+ | 1.0 | Q2, D41, L75, N77, I79, V81
M-protease | II | Ca2+ | 1.0 | A169, Y171, A174, G195, D197
Thermitase | I | Ca2+ | 1.0 | D5, D47, V82, N85, T87, I89
Thermitase | II | Na+ | 1.0 | A173, Y175, A178, 1× H2O
Thermitase | III | Ca2+ | 1.0 | D57, D60, D62, T64, Q66
Sphericase | II | Na+* | 1.0 | G181, L183, A186, 2× H2O
Sphericase | IV | Ca2+ | 1.0 | D287, I288, A295, G297, D299, 1× H2O
Sphericase | V | Ca2+ | 1.0 | T214, D217, V219, Q221, D224, 1× H2O
Sphericase | VI | Ca2+ | 1.0 | N29, E49, D98, 3× H2O
Sphericase | VII | Ca2+ | 1.0 | D115, 5× H2O
SubTY | II | Na+ | 1.0 | G182, L184, A187, 2× H2O
SubTY | IV | Ca2+ | 1.0 | D288, I289, G296, G298, D300, 1× H2O
SubTY | V | Ca2+ | 1.0 | T215, D218, I220, Q222, D225, 1× H2O
SubTY | VII | – | – | D116
Proteinase K | II | Ca2+ (a) | 1.0 | P175, V177, D200, 4× H2O
Vibrio proteinase | II | Ca2+ | 1.0 | P171, G173, D196, 4× H2O
Vibrio proteinase | III | Ca2+ | 1.0 | D56, D61, D63, 3× H2O
Vibrio proteinase | IX | Ca2+ | 1.0 | D9, D12, Q13, D19, N21, 1× H2O
Kp43 | X | Ca2+ | 1.0 | E186, S194, D197, H201, 2× H2O
Kp43 | XI | Ca2+ | 1.0 | D367, L368, D369, D394, 1× H2O
Kp43 | XII | Ca2+ | 1.0 | D384, T386, P388, D391, N392, 1× H2O
SubHal (CI2 complex) | X | Ca2+ | 1.0 | E185, S193, D196, H200, 2× H2O
SubHal (CI2 complex) | XI | Ca2+ | 1.0 | D366, L367, D368, D393, E399, 1× H2O
SubHal (CI2 complex) | XII | Ca2+ | 1.0 | D383, T385, P387, N390, N391, 1× H2O
Furin | I | Ca2+ | 1.0 | D115, D162, V205, N208, V210, G212
Furin | II | Na+ (b) | not given | T309, S311, T314, 1× H2O
Furin | VIII | Ca2+ | 1.0 | D258, D301, E331, 3× H2O
Kexin | I | Ca2+ | 1.0 | D135, D184, K224, F229, G231, N227
Kexin | II | Na+ (c) | not given | T328, S330, S333, 1× H2O
Kexin | VIII | Ca2+ | 1.0 | D277, D320, E350, 3× H2O

* The metal ion present in the PDB entry is Ca2+; in many previous studies this metal was modeled as a Ca2+ ion with low
occupancy and/or relatively high B value, but according to our strong evidence a Na+ ion should be present here.
a The ion is shifted by more than 2 Å away from the expected position.
b The structure has a H2O oxygen present with B = 0.75 Å2 in the place of Na+ in other structures. We think this is a sodium ion.
c Ca2+ in the structure (q = 1.0, B = 39 and 45 Å2, two copies); this has a pattern of typical Na+ binding as known from other structures.
Residues = amino acids forming the site; q is the occupancy factor.
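
The proposed renumbering is essentially a lookup from the labels used in the original publications to the family-wide labels. A minimal Python sketch of such a lookup is given here; the dictionary and function names are hypothetical, and the sketch covers only the two relabelings stated in the text (sphericase [18] and KP-43 [21]). Labels that are retained, such as Ca-I and Ca-II of the classical subtilisins, pass through unchanged.

```python
# Proposed relabeling, transcribed from the text of this section.
RENUMBERING = {
    "sphericase": {
        "Ca-1": "Ca-IV",
        "Ca-2": "Ca-V",
        "Ca-3": "Ca-VI",
        "Ca-4": "Ca-VII",
        "Ca-5": "Na-II",
    },
    "KP-43": {
        "Ca-I": "Ca-X",
        "Ca-II": "Ca-XII",
        "Ca-III": "Ca-XI",
    },
}


def unified_site(enzyme, original_label):
    """Translate a site label from the original publication into the
    proposed family-wide label; unlisted labels are returned unchanged."""
    return RENUMBERING.get(enzyme, {}).get(original_label, original_label)


print(unified_site("sphericase", "Ca-5"))
print(unified_site("KP-43", "Ca-II"))
print(unified_site("BPN'", "Ca-I"))
```

Keying the lookup on the enzyme as well as the label matters because the same original label can mean different sites: Ca-I in the KP-43 publication is not the Ca-I site of the classical subtilisins.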

6.1.2 The Weak Metal-Binding Site


The Ca-II/Na-II sites warrant further discussion. The Na-II site was
first identified in a seminal subtilisin paper by Drenth et al. [8]. The
two sites were later described in detail by Pantoliano and coworkers
in a series of papers on the structure and stability of subtilisin
BPN′ and its mutants [26–28]. Pantoliano et al. [27] performed
an extensive study of the weak site and identified a divalent Ca-II
and a monovalent Na-II site (our terminology) lying 2.7 Å apart.
Occupancy of the two sites was shown to be mutually exclusive.
Mutations of individual residues in the vicinity of Ca-II affected
calcium affinity in this region.
Kinetic and inactivation analysis on subtilisin BPN′ lacking the
strong Ca-I site was carried out by Alexander et al. [29] to evaluate
the contribution of the weak-binding sites independent of any
effects of the Ca-I site. They measured affinities for Ca2+ (Ka = 67
mM−1), K+ (10), Na+ (1.3), NH4+ (5), and Li+ (12) ions, presumably
at Ca-II for calcium and Na-II for the others. This indicated stronger
metal binding to the monovalent Na-II in comparison to the divalent
Ca-II site. Two further papers [28, 30] describe both calcium-
dependent and independent stabilizing mutants of subtilisin BPN′
but focus on the Ca-I site.
In summary, these authors showed that the Ca-II site was only
occupied by a calcium ion if the concentration of calcium was
sufficiently high in the solution relative to that of sodium. The nearby
Na-II site can be occupied by a number of monovalent cations, usu-
ally sodium, again highly dependent on the ambient sodium concen-
tration. Potassium can also occupy the Na-II site, or more precisely
a position approximately 0.7 Å away from Na-II, toward Ca-II (PDB
ids 1sub and 1suc) [31]. This results from the larger size of the K+
ion compared to Na+ and the fact that K+ binds not only to protein
oxygens of the Na-II site but also to two oxygens of the Ca-II site.
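
The dependence of site occupancy on the relative ion concentrations can be illustrated with a simple equilibrium sketch. The Python function below assumes a three-state model (both sites empty, Ca-II occupied, or Na-II occupied) in which the two sites are mutually exclusive, as described above. The model itself, the function name, and the association constants are assumptions for illustration only; the constants are placeholders, not the measured values quoted in the text.

```python
def weak_site_occupancy(ca_conc, na_conc, ka_ca, ka_na):
    """Fractional populations (empty, Ca-II occupied, Na-II occupied)
    for two mutually exclusive sites at equilibrium.

    Concentrations and association constants must use matching units
    (for example mM and mM^-1)."""
    weight_ca = ka_ca * ca_conc
    weight_na = ka_na * na_conc
    total = 1.0 + weight_ca + weight_na
    return 1.0 / total, weight_ca / total, weight_na / total


# Placeholder constants (hypothetical, in mM^-1) chosen only to show the trend.
KA_CA = 0.05
KA_NA = 0.01
for ca in (0.1, 1.0, 10.0, 100.0):
    empty, ca_ii, na_ii = weak_site_occupancy(ca, 100.0, KA_CA, KA_NA)
    print(f"[Ca2+] = {ca:6.1f} mM: Ca-II {ca_ii:.2f}, Na-II {na_ii:.2f}, empty {empty:.2f}")
```

In this model the Ca-II population rises only when the product of affinity and concentration for calcium becomes comparable to that for the monovalent ion, which is the qualitative behavior summarized above.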

In spite of the clear conclusions from these studies, there remains
some confusion in the coordinates deposited in the Protein Data
Bank (PDB, http://www.wwpdb.org/), especially with regard to
the identity and occupancy of the metal in Ca-II and Na-II. While
a number of structures have the metal ion correctly identified
in the appropriate site, a substantial number appear to have
discrepancies, including (1) Ca2+ modeled with partial occupancy
(and often a high atomic displacement parameter, ADP) in the Na-II
site, (2) a sodium ion where there should be a water molecule (or vice
versa), or (3) a water molecule in the site actually occupied by a
calcium ion.
The ambiguity with respect to Ca-II/Na-II even includes three
structures at resolutions where the true metal ligand should have
been clear. First, a Na-II ion was incorrectly modeled in one of
the present authors’ (KSW) publications as a partially occupied
calcium ion (PDB id 1svn) [16]. Second, the structure of the B. clausii
subtilisin was determined at 0.78 Å spacing (PDB code 1gci) [32].
These authors also placed a half-occupied calcium in the Na-II site
with a B value of 11.5. The metal ligand distances clustered in
the range 2.8–3.2 Å, considerably longer than expected for calcium,
especially at this level of accuracy. There was no report of disorder
around the Ca-II site. We propose that this is probably a potassium
ion given the bond lengths and chelation of oxygens from both
the Na-II and the Ca-II sites. Unfortunately, the experimental X-ray
data were not deposited in the PDB, which would have allowed re-
refinement and validation of the model for the alternative metal ions
(e.g., inspection of the anomalous difference maps to distinguish K or
Ca from Na ions). For Savinase 1svn, the native data are available but
not the anomalous differences, which emphasizes the importance of
depositing not only the amplitude but also the anomalous scattering
data.
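
The bond-length argument used above can be sketched in a few lines of Python. This is only an illustration: the reference distances are approximate typical metal–oxygen values assumed here (they are not taken from this chapter), and the function name and tolerance are hypothetical. Because Na+ and Ca2+ have similar typical distances, a distance check alone cannot separate them, which is why the anomalous data matter.

```python
# Approximate typical metal-oxygen distances in Å (assumed illustrative values,
# not taken from this chapter).
TYPICAL_M_O_DISTANCE = {"Na+": 2.4, "Ca2+": 2.4, "K+": 2.8}


def candidate_ions(distances, tolerance=0.25):
    """Return ions whose typical M-O distance lies within `tolerance` (Å)
    of the mean observed metal-ligand distance."""
    if not distances:
        raise ValueError("at least one metal-ligand distance is required")
    mean = sum(distances) / len(distances)
    return sorted(
        ion for ion, typical in TYPICAL_M_O_DISTANCE.items()
        if abs(mean - typical) <= tolerance
    )


# Distances spanning the 2.8-3.2 Å range quoted for the 1gci Na-II site
# (the individual values here are hypothetical).
print(candidate_ions([2.8, 2.9, 3.0, 3.1, 3.2]))
```

With distances near 2.4 Å the function returns both Na+ and Ca2+, mirroring the ambiguity discussed in the text.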
Third, sphericase and the psychrophilic proteinase S41 (0.8 and
1.0 Å resolution, respectively) were reported to bind five calcium
ions, one of them in the Na-II site [33]. The calcium ion in sphericase
was modeled with decreased occupancy in the Na-II position. To
add to the confusion, in the deposited coordinates for S41, this ion
was correctly identified as sodium with full occupancy, in contrast
with the publication where it is reported to be a calcium. Both
sphericase and S41 differ from classical subtilisins at the Ca-II
site where additional side chains would preclude calcium binding.
There are local environment changes such as additional amino acids
with their side chains directed toward the site and others causing
a different conformation of the acidic side chain of site Ca-II, which
make calcium binding highly unlikely.
The above examples indicate the possibility of misinterpretation
of the metal identity in other structures. In a later section we report
an analysis of the weak metal site in the Bacillus subtilisin structures
deposited in the PDB.

6.2 Two New Structures of Subtilases with Altered Calcium Sites

6.2.1 Proteinase SubTY


6.2.1.1 The overall fold
The final model contains one molecule of SubTY with all 311
residues, the CI2A inhibitor, two calcium and one sodium ions, and
446 water molecules. Only residues 21–84 of CI2A are visible in
the electron density maps. All the residues are well ordered, with
a number possessing alternative side chain conformations, and have
good geometry, as indicated by PROCHECK [34] and WHAT_CHECK
[35]. Other validation criteria are listed in Table 6.3.
SubTY is closely similar to sphericase from B. sphaericus (PDB
id 1ea7), with a root-mean-square deviation (RMSD) in Cα positions
of 0.56 Å and a sequence identity of 72%. A structural comparison
against the SCOP [36] database using the DALI [37] server confirms
the similarity of SubTY to other subtilases such as Savinase (RMSD
1.5 Å, PDB id 1gci), proteinase K (RMSD 2.0 Å, PDB id 1pfg),
kexin fragment (RMSD 2.4 Å, PDB id 1ot5), and a serine-carboxyl
proteinase (RMSD 2.9 Å, PDB id 1ga1).
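
For readers unfamiliar with the RMSD values quoted above, the quantity can be sketched as follows. This is a minimal illustration that assumes the two sets of Cα coordinates are already paired and superposed (for example by a tool such as DALI or SSM); the function name is hypothetical and the superposition step itself is not shown.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as input) between paired, already superposed 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

The value depends on which atoms are paired, so RMSDs computed on Cα atoms and on all atoms of the same two structures generally differ.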

6.2.1.2 The active site


The SubTY catalytic triad is made up of residues Ser251, His72,
and Asp35. There are hydrogen bonds between His72 Nε2 and the
Oγ atom of Ser251 (2.71 Å) and between His72 Nδ1 and Asp35
(2.69 Å to Oδ1 and 3.07 Å to Oδ2). The chymotrypsin inhibitor CI2A
is bound to the active site via an extended loop region that forms
multiple hydrogen bonding interactions with the enzyme, and there
is continuous electron density at the scissile bond of CI2A that
indicates that the inhibitor is uncleaved.
There are only minor differences to the equivalent BPN′ complex.
The orientation of the CI2A with respect to the body of the enzyme
differs by about 17°, within the range seen in similar complexes,
probably a result of crystal packing. The inhibitor-binding cleft in
SubTY is slightly wider than that in BPN′, mainly caused by the
replacement of Tyr104 by Ser114 in SubTY and the rearrangement
of the N-terminal end of the 133–145 helix.

Table 6.3 X-ray diffraction data and structure refinement details. Numbers
in parentheses refer to the highest-resolution shell, if not stated otherwise

                                      SubTY:CI2A          SubHal:CI2A           Apo-SubHal
Data collection
  X-ray source                        Rigaku Rot. anode   Rigaku Rot. anode     ESRF ID14.2
  Wavelength (Å)                      1.54178             1.54178               0.933
  Temperature (K)                     120                 120                   100
  Detector                            IP MAR345           IP MAR345             CCD ADSC Q4
Crystal data
  Space group                         P2₁2₁2₁             P2₁                   P2₁
  a, b, c (Å)                         58.8, 66.8, 107.1   58.4, 151.4, 64.1     45.2, 100.0, 96.7
  α, β, γ (°)                         90.0, 90.0, 90.0    90.0, 117.107, 90.0   90.0, 103.52, 90.0
  Mosaicity (°)                       0.57                0.31                  0.38
  Resolution (Å)                      25.0–1.8            20.0–1.9              30.0–2.0
  Total reflections                   336213              547162                181081
  Unique reflections                  40054               59477                 41891
  Completeness (%)                    100 (99.6)          77 (15)               74 (49)
  Rmerge                              0.070 (0.347)       0.046 (0.128)         0.101 (0.268)
  Mean [I/σ(I)]                       33 (5.8)            22 (5.4)              6 (2.4)
Refinement statistics
  Reflections, working set            36035               53438                 37654
  Reflections, Rfree set (%)          2009 (5.0%)         3005 (5.1%)           2140 (5.1%)
  Rcryst                              0.152               0.125                 0.215
  Rfree                               0.188               0.181                 0.303
  No. of non-H atoms                  3156                8661                  7064
  No. Res. alternate conf.            3                   19                    1
  Contents of AU                      SubTY (311 a.a.)    2 SubHal (433 a.a.)   2 SubHal
                                      CI2A (63 a.a.)      2 CI2A (68 a.a.)
                                                          Peptide KPSLL
  Metals                              2 Ca2+, Na+         6 Ca2+                6 Ca2+
  Water                               446                 1115                  662
Mean B factors (Å²)
  All atoms                           14.8                18.4                  29.8
  Chains A, C (Enzyme)                16.5                15.3, 14.0            14.7, 13.8
  Chains B, D (CI2A)                  18.1                24.2, 36.9            –
RMS deviation from ideal
  Bond length (Å)                     0.016               0.010                 0.017
  Bond angles (°)                     1.51                1.13                  1.68
Ramachandran plot (a)
  Residues generously allowed (chain) 2 (A)               2 (A), 3 (C)          3 (A)
  Residues disallowed (chain)         0                   0                     2 (A, B)
PDB id                                5ffn                5fbz                  5fax

(a) Ramachandran plot generated with COOT [38].
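
As a quick consistency check on Table 6.3, the reflection counts can be turned into two derived quantities: the average multiplicity (total over unique reflections) and the size of the free set as a percentage of the unique reflections. The sketch below uses only numbers from the table; the function names are illustrative, and taking the unique reflections as the denominator is an assumption that happens to reproduce the percentages quoted in the table.

```python
# (total reflections, unique reflections, free-set reflections) from Table 6.3
DATASETS = {
    "SubTY:CI2A": (336213, 40054, 2009),
    "SubHal:CI2A": (547162, 59477, 3005),
    "Apo-SubHal": (181081, 41891, 2140),
}


def multiplicity(total, unique):
    """Average number of measurements per unique reflection."""
    return total / unique


def free_fraction_percent(free, unique):
    """Free-set size as a percentage of the unique reflections."""
    return 100.0 * free / unique


for name, (total, unique, free) in DATASETS.items():
    print(f"{name}: multiplicity {multiplicity(total, unique):.1f}, "
          f"free set {free_fraction_percent(free, unique):.1f}%")
```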

6.2.1.3 SubTY calcium and sodium sites


The strong Ca-I site is absent in both SubTY and sphericase. In
contrast, calcium is bound to two sites with octahedral coordination
geometry, designated as Ca-IV and Ca-V, not present in classical
subtilisins. Ca-IV is coordinated by two aspartates (Asp288 and
Asp300), three main chain carbonyl oxygens (Ile289, Gly296,
Gly298), and a water. The coordination of Ca-V is via two aspartates
(Asp218 and Asp225), a glutamine (Gln222), two main chain
carbonyl oxygens (Ile220 and Thr215), and a water.
Ca-IV and Ca-V are equivalent to two of the calcium-binding sites
of sphericase, the only differences being that two of the main-chain
oxygen ligands are provided by different residue types. Interestingly,
despite the close similarity in sequence, sphericase contains two
calcium ions Ca-VI and Ca-VII that are not present in SubTY. In
sphericase, Ca-VI is coordinated by Asn29, Glu49, and Asp98, but
in SubTY, the aspartate and asparagine residues are replaced by
lysines, which precludes calcium binding. The residues around the
Ca-VII site are the same in both structures, but there is no density for
a calcium ion in SubTY. Sphericase was crystallized with 5–25 mM
CaCl2 [18], whereas no calcium was included in the crystallization
buffer for SubTY.
SubTY contains a sodium ion at the Na-II site, with trigonal
bipyramidal geometry and coordinated by three main chain oxygens
of Gly182, Leu184, and Ala187, together with two water molecules.

Figure 6.2 SubTY anomalous difference Fourier synthesis contoured at the
3σ level showing the peaks for the Ca2+ ions and sulfur atoms.

Computation of an anomalous scattering density map using the
phases from the refined model proved conclusively that sites Ca-IV
and Ca-V were occupied by calcium, which has a significant
anomalous component at this wavelength (Cu Kα). In contrast, the
Na-II site was occupied by a metal with no significant anomalous
scattering, presumed to be sodium, as shown in Fig. 6.2, in keeping
with the observed geometry, bond lengths, and B values. Also
evident in this figure are peaks corresponding to the sulfur atoms
of the methionine and cysteine residues.
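
The reasoning can be made semi-quantitative with a short sketch. The f'' values below are rounded literature values for Cu Kα radiation, quoted from memory and assumed here for illustration only (they do not come from this chapter); the function name is hypothetical. The ratio to sulfur is only a rough proxy for relative peak height, since occupancy and B values also matter.

```python
# Approximate imaginary anomalous scattering factors f'' (electrons) at the
# Cu K-alpha wavelength; rounded literature values, assumed for illustration.
F_DOUBLE_PRIME_CU_KA = {"Na": 0.12, "S": 0.56, "K": 1.07, "Ca": 1.29}


def signal_relative_to(element, reference="S"):
    """Ratio of f'' for `element` to f'' for `reference`, a rough proxy for
    the relative anomalous peak height at full occupancy and equal B values."""
    return F_DOUBLE_PRIME_CU_KA[element] / F_DOUBLE_PRIME_CU_KA[reference]


for element in ("Na", "K", "Ca"):
    print(f"{element}: {signal_relative_to(element):.2f} x sulfur")
```

Under these assumptions calcium and potassium should give peaks stronger than the sulfur atoms, while sodium should give essentially none, which is the pattern described for Fig. 6.2.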

6.2.1.4 SubTY disulfide bridge


There is a disulfide bridge between residues Cys52 and Cys66,
equivalent to that between Cys51 and Cys65 in sphericase: it is
absent in other Bacillus subtilisins. Whereas in sphericase, the
cysteines were modeled as 75% in the bridge configuration and
25% in the reduced form (presumably due to radiation damage
arising from the long exposures necessary for recording the atomic
resolution data), they were modeled as 100% in the bridge
configuration in SubTY.

6.2.2 SubHal
6.2.2.1 The unliganded form of SubHal
The unliganded crystal form of SubHal contains two independent
molecules of the enzyme, six Ca2+ ions, and 662 waters per
asymmetric unit. The enzyme is folded into two domains: a catalytic
one that follows the overall topology of the consensus subtilase fold
[1] and a C-domain as seen previously in KP-43 (Fig. 6.3).
The C-domain is an all antiparallel β-sandwich jelly-roll compris-
ing 4 + 4 β-strands (β11–β18). Apart from KP-43, the structurally
closest relative found by a DALI [37] search is a so-called CUB fold,
composed of a parallel/antiparallel β-sandwich structure consisting
of 2 × 5 β-strands. However, the strand number and connectivity
are different. The kexin P-domain appears as match number 7 with
Z-score 6.4 in the 370 results.

Figure 6.3 Topology of B. halmapalus subtilase. Selected secondary
structure elements delimiting residue numbers and approximate positions
of Ca2+ binding sites are shown.

The interface between the C-domain and the catalytic domain of
SubHal is formed by a total of 24 hydrogen bonds and a number
of hydrophobic contacts. There are 18 water molecules mediating
interaction between the two domains. Eight of them are buried
within the interface, while four form a cluster near the side chain of
Lys216 that interacts with carbonyl oxygens forming the Ca-II/Na-II
sites in other subtilases. The buried contact surface, as calculated by
the CCP4 program AreaIMol [39], equals 1555 Å².
The two crystallographically independent enzyme chains are
labeled A and B. The enzyme molecules interact via six main chain–
main chain and four main chain–side chain hydrogen bonds mainly
between the 104 and 109 loops with a total interface surface area
of 460 Å². While the ends of the two substrate-binding sites lie
adjacent to one another, the catalytic serines are 39 Å apart. The
crystal packing of the enzyme monomers is quite different to that
seen in the SubHal:CI2A complex, suggesting the dimer of complexes
is an artifact of the crystal. Chains A and B of the unliganded SubHal
structure superpose on all C-atoms with an RMSD of 0.41 Å (0.21 Å
on Cα atoms). The overall fold is essentially identical to that in the
complex, with an RMSD on all atoms between chains A of SubHal:CI2A
and of unliganded SubHal of 0.51 Å. Since all four independent
enzyme molecules (two from the complex below and two from the
apo-enzyme) are highly similar in structure, all discussion refers to
the high-resolution complex structure unless otherwise indicated.

6.2.2.2 The SubHal:CI2A complex


This structure was refined against data to a spacing of 1.9 Å to a
final R/Rfree of 0.125/0.181. The asymmetric unit consists of two
independent proteinase:CI2A moieties (Fig. 6.4), six Ca2+ ions, a
small peptide with the sequence KPSLL (a product of autodigestion),
and 1115 water molecules. Essentially all residues were in well-
defined electron density. The enzyme and inhibitor chains are
labeled A:B and C:D in the two crystallographically independent
complexes.

Figure 6.4 Dimer of SubHal:CI2A complexes viewed approximately down
the pseudo twofold symmetry axis of the dimer. Color coding: red:
α-helices; green: β-strands; yellow: loops of the subtilisin catalytic domain;
orange: the C-domain of the protease; blue: the CI2A inhibitor; different
gray levels: the second complex of the dimer related by a pseudo twofold
symmetry; yellow spheres: Ca2+ ions.

6.2.2.3 Termini, surface, and pH stability of SubHal

There was substantial residual electron density close to Asn1 that
was successfully modeled as carbamoylation of the N-terminal
amine. Both terminal atoms are involved in strong hydrogen
bonds to protein, which confirms carbamoylation as opposed to N-
acetylation. The stability of the carbamoylated Asn1 indicates that it
is not accessible to solvent. The residue is involved in 13 hydrogen
bonds to the surrounding protein atoms (carboxy group to Arg188
Nε, Nη2, 2× Asn186 Nδ2, Val3 N, Ala4 N, Leu227 N, amine to Leu227
O, side chain amide to Gly74 O, Ser75 Oγ, Asn82 O, Ser228 Oγ,
and carbonyl to Arg5 N) with no H bonds to water molecules and
provides an extension of the H-bonding pattern of helix αa. This
embedding of the N-terminus is likely to protect it from various
forms of external stress, such as high pH.
Recalculation of the electron density map for KP-43 using
the deposited data and coordinates (PDB id 1wmd) revealed
similar density around its N-terminal Asn, suggesting it too is
carbamoylated. This was instead modeled in the published study as
two alternative conformations.
N-terminal carbamoylation has previously been observed in
other proteins, playing either a structural role in stabilization of
the N-terminus or a functional role. Thus hemoglobin binds carbon
dioxide at its N-terminus to facilitate its transfer to lungs [40].
Similar modifications were found in thymidylate synthase [41]
and in the light harvesting complex II where it is involved in
bacteriochlorophyll binding [42].
The C-domain of SubHal and KP-43, whose stability is enhanced
by the binding of two of the three calcium ions, can be seen as
a protective extension of the C-terminus and is absent in other
subtilases. An analogous domain is present in the proprotein
convertases (PCs), furin [22] and kexin [23], where, in contrast,
it fulfills the role of a stem between the transmembrane and
catalytic domains.
The optimal pH of 10 for SubHal activity classifies the enzyme
as one of the alkaline proteinases extremely stable in basic soil
environments. The structure reveals several features connected
with high pH stability. Surface analysis reveals an extensive region
with significant negative electrostatic potential around the active
site. Even the presence of the calcium ion in site Ca-XI does not
invert the sign of electrostatic potential in this region, which is
the second largest area of negative potential. Arg17 of CI2A of
the other enzyme–inhibitor complex binds specifically to Asp368
and Tyr418 in this site and the N-terminus of the inhibitor may
be directed toward this site by the local negative electrostatic
potential. The protein surface on the opposite side of the molecule
has a predominantly positive potential (calculated at pH 7). The
proportion of hydrophobic surface of a protein has been proposed
to play an important role in stability at high pH [43]; however, the
surface of SubHal does not appear to be unusually hydrophobic.

6.2.2.4 The two crystallographically independent SubHal:CI2A complexes
There are no substantial structural differences between the two
chemically identical complexes in the asymmetric unit. Superposition
(based on the Cα atoms) shows that the largest differences
are the result of a change in the inhibitor orientation in the active
site. When only the main chain atoms of the enzyme itself are
superposed, the orientation of the CI2A changes by approximately
6.3°, with the largest displacement of the inhibitor main chain
(2.25 Å) being on the outer helix (residues 32–43). This is
presumably a result of different crystal packing. Inhibitor D is less
well ordered than inhibitor B, with higher temperature factors and
many side chains having weak or missing electron density.
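
The text does not state how the change in inhibitor orientation was measured. One common approach, sketched here under that caveat, is to superpose the enzymes, compute the rotation matrix that then maps one inhibitor onto the other, and read the rotation angle off the trace of that matrix. The function names are illustrative, and the z-axis helper exists only to build a test matrix.

```python
import math


def rotation_angle_deg(r):
    """Rotation angle (degrees) of a 3x3 rotation matrix, from its trace."""
    trace = r[0][0] + r[1][1] + r[2][2]
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def rotation_about_z(angle_deg):
    """Helper: rotation matrix for a rotation of angle_deg about the z axis."""
    a = math.radians(angle_deg)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


print(round(rotation_angle_deg(rotation_about_z(6.3)), 2))
```

The angle obtained this way is independent of the direction of the rotation axis, so it summarizes the reorientation in a single number.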
Each CI2A packs against the C-domain of the enzyme of the other
complex in the asymmetric unit. In previous CI2A complexes, there
is no density for residues 1–19 (or 1–18), as for complexes with
Eglin-C where residues 1–7 or 1–6 are disordered (Lys8 of Eglin-C
is equivalent to Thr22 of CI2A). In the SubHal complex there is a
reasonable density for residues 16–19 that pack against the surface
of a neighboring enzyme in the crystal, stabilized by a specific
interaction of Arg17 of CI2A with the enzyme surface at the Ca-XI
site. This contact gives rise to a pseudodimer in the SubHal:CI2A
crystal, with a considerable buried surface area of 814 Å² (the surface
area buried between enzyme molecules A and C alone is 594 Å²). This
is unlikely to reflect a biologically significant dimerization, as there
is no experimental evidence for dimer formation in solution, but it
may explain the spontaneous crystallization often observed for this
complex.

6.2.2.5 The calcium sites in SubHal


The two classical binding sites are both absent. Ca-I does not exist
(Table 6.2), and Ca-II/Na-II is occupied by the side chain of Lys216
and forms one side of one of the two cavities at the interface between
the catalytic and C-domains. Each SubHal:CI2A complex contains
three divalent calcium cations (Fig. 6.4) (equivalent to those in KP-
43) confirmed by computation of an anomalous difference Fourier
synthesis (Table 6.4). We have numbered the three SubHal sites
as Ca-X (610), Ca-XI (611), and Ca-XII (612), where the numbers
in parentheses refer to the residue number in the deposited PDB
files. These correspond to Ca-I, Ca-III, and Ca-II in the original
KP-43 publication (see nomenclature discussion in Section 6.1).

Table 6.4 Peak heights at calcium sites in anomalous difference synthesis
for the SubHal:CI2A complex. The values are given in σ units above the
mean difference density. There are additional peaks at the positions of
many of the sulfur atoms. The highest unexplained peak is at water
number 10 at 3.5σ. The numbers in parentheses are residue numbers used
in the PDB deposition

               Molecule A    Molecule C
Ca-X (601)     12            1
Ca-XI (602)    17            17
Ca-XII (603)   17            12

Ca-X is coordinated by identical residues in both enzymes and
binds to the catalytic domain with ligands Asp196, Glu185, Ser193,
and His200 and two water oxygens. Nδ1 of His200 lies at one
vertex of the bipyramid. The ligation of Ca-XI is again identical in
both structures, with three carboxyl groups (Asp366, Asp368, and
Glu399) and two carbonyl groups (Asp393 and Leu367). Arg17
of CI2A makes two hydrogen bonds to the side chain carboxyl
of Asp368. CI2A binding is accompanied by a movement of the
neighboring Tyr418 that stacks against the Arg17 side chain. In
unliganded SubHal, the Tyr418 side chain rotates out toward site
Ca-XI by 9°. Thus the carboxyl group of Asp368 changes its orientation
between the unliganded and inhibited structures to better accommodate
the CI2A Arg17 but still coordinates Ca-XI in both structures.
Ca-XII is bound in a loop of the C-domain where it is chelated by
five amino acid residues and one water molecule in an overall
coordination shell of eight atoms. Asn390 (Asp391 in KP-43) and
Asn391 bind through their side chain amide oxygens, Asp383
through its carboxylate, Pro387 and Thr385 through their carbonyl
oxygens, and the latter, in addition, through its side chain hydroxyl
group with a rather long distance of 2.69 Å. The presence of only one
charged ligand suggests weaker binding of this calcium.
Ca2+ ions are bound in the same three sites in the unliganded
SubHal structure. In comparison to the SubHal:CI2A complex, the
Ca-X site in molecule A lacks two coordinating waters and the
Ca-XII site lacks one coordinating water in both molecules A and B. The
latter is probably the result of different crystal packing affecting the
calcium-binding site environment in the unliganded structure and
its slightly lower resolution.

Figure 6.5 SubHal view showing loop 233–245 occlusion of the active site.
The catalytic triad carbon atoms are in green, loop 233–245 is in yellow, and
CI2A inhibitor is in blue. The thumbnail shows the subtilase–CI2A complex
with the C-domain in orange. The catalytic triad and loop residues are
labeled in black, and CI2A inhibitor residues are in blue.

6.2.2.6 The active site of SubHal


The positions of the residues of the catalytic triad in the structures
of SubHal and KP-43 are essentially identical: the numbering of
Asp30 and His68 is the same in both proteins, while Ser255 in KP-
43 is Ser254 in SubHal (Fig. 6.5). However, there are substantial
differences between KP-43 and SubHal:CI2A, as summarized below.
In KP-43, dioxane is bound in the S1 pocket, while in SubHal the
Met59 side chain of the inhibitor occupies this site. Ala133 forms
the end of the S1 subsite in KP-43, while this is Pro132 in SubHal
so the stopper of this hydrophobic subsite is composed of the larger
hydrophobic Pro. In SubHal:CI2A Pro132 packs closely against the
inhibitor’s Phe69 side chain, which suggests that CI2A binding in
KP-43 will differ here. In SubHal, the conformation of the side chain
of Glu162 changes upon inhibitor binding (the side chain moves
toward the S1 pocket and thus toward the Met59 side chain, site
P1) and packs more tightly on the enzyme surface to make space for
Arg81 of the inhibitor. His68 of the catalytic triad is not H-bonded
to Asp30 in the unliganded SubHal structure; however, it has moved
into its appropriate catalytic position in the CI2A complex.
The active site in unliganded SubHal contains water molecules,
which correspond to the CI2A inhibitor H-bond donors and
acceptors in the SubHal:CI2A complex. The catalytic Asp30 and
Ser254 are in essentially identical conformations to those in the
complex, while His68 hydrogen bonds to Ser254 with a somewhat
different conformation. The short Ser254 Oγ···His68 Nε2 H-bond in the
complex (2.6 Å) is longer in the unliganded structure (2.9 Å). The
His68 side chain rotates by approximately 60° with respect to the
complex so that the His68 Nδ1···Asp30 Oδ2 hydrogen bond (2.6 Å) is
broken in the unliganded structure. The conformations are almost
identical in both molecules A and B of unliganded SubHal.
The protein surface surrounding the active site is partially
formed by the 233–245 loop of SubHal, which is absent from the
sequence in other subtilases. The residues of this loop form a barrier
at the C-terminal end of the active site cleft (from the point of view of
the substrate) and contribute to the S2 and S3 subsites of the enzyme
judged by the CI2A inhibitor contacts (Fig. 6.6).

Figure 6.6 Stereoview of the superposition of the active site of SubHal
(black), subtilisin BPN′/Novo (green), and Savinase (magenta). The part of
the CI2A inhibitor binding to the active site is shown in yellow sticks. The
loop 233–245 (on the left) forming a barrier at the end of the active site is
only present in SubHal.

Figure 6.7 pH stability profile of SubHal, SubTY, and Savinase, measured as
the relative residual activity after 2 h of incubation at a given pH at 37 °C.

As noted for KP-43, the N-terminal entrance of the cleft in SubHal
is more open and less restrictive on the location of the N-terminal
residues of the substrate than in all other subtilases. This end of the
cleft is also more open in furin and kexin.

6.2.3 Enzymatic Activity of SubTY and SubHal


Both SubTY and SubHal are stable in the broad 6–11 pH range
(Fig. 6.7), but lose activity below pH 7.0. Both show optimal pH
in the alkaline range (Fig. 6.8), with the highest relative activities
measured at pH 11.0 using the Protazyme AK assay. Their high
thermostability is demonstrated by a steady gain in relative activity with
temperature increase to 70 °C (SubTY) or 80 °C (SubHal) (Fig. 6.9).

Figure 6.8 Relative activity of SubHal, SubTY, and Savinase as a function of
pH using the Protazyme AK assay.

Figure 6.9 Relative activity of SubHal, SubTY, and Savinase at pH 9.0 as a
function of temperature using the Protazyme AK assay.

6.2.4 Comparison of SubTY and SubHal with Other Subtilases

The 3D structures of a set of nine representative Bacillus subtilisins
and other subtilases were superimposed pairwise first on SubTY
and second on SubHal using SSM (Table 6.1). The backbone and
calcium sites of a subset of these are superimposed in Fig. 6.10.

Figure 6.10 Stereoview of Cα traces and bound metals of subtilisin BPN′
(green), sphericase (red), vibrio proteinase (blue), SubHal (black), and furin
(yellow). All metal-binding sites in Table 6.2 are represented by the metal
ions (spheres) and labeled accordingly. The BPN′ catalytic triad is shown
in ball-and-stick representation (carbon [green], oxygen [red], and nitrogen
[blue]). The graphics in these figures were created with CCP4MG [44].

SubTY is very similar to sphericase, the only real difference being
the number of calcium-binding sites. An equivalent disulfide bridge
is present in both structures but not in other family members. There
are major differences compared to the classical subtilisins, namely in
the number of calcium sites and the length of the surface loops: these
are larger in several places in SubTY and sphericase (Fig. 6.1). The
differences observed for sphericase have been discussed extensively
[18] and for SubTY are briefly summarized here.

(i) Residues 16–24 form a loop similar to that in sphericase,
replacing the α-helix 13–20 in classical subtilases.
(ii) Residues 46–66, 128–135, 194–203, and 211–225 are similar
to sphericase and form different and substantially larger loops
than in classical subtilases.
(iii) The loop 173–178 is truncated and almost identical to those in
sphericase and SubHal.
(iv) Residues 283–300 extend the last α-helix before the consensus
C-terminus by one turn followed by two additional tight turns.
(v) SubTY and sphericase have a truncated C-terminus lacking the
short helix.
In SubHal, as in KP-43, there are considerably greater differences
from the classical subtilisins. While the secondary structure align-
ment revealed a substantial degree of similarity with other members
for up to 250 residues in 17 secondary structural elements (e.g., for
the mesentericopeptidase the RMSD on aligned atoms was 1.48 Å)
there are regions of SubHal that do not align with the other family
members, including SubTY. SubHal shows significant differences
in ion binding, size, and organization of several loops and the
presence of the C-domain, first observed in KP-43. The calcium-
binding pattern in SubHal follows that of KP-43—the ions bind in
sites quite different to all other subtilases and the SubHal C-domain
carries two of them. The most significant differences in the context
of the previously described KP-43 structure are summarized here.
(i) The SubHal N-terminus is shorter by 4–10 residues, modified
to N-carboxyasparagine, and buried within a rich network of
hydrogen bonds of the enzyme.
(ii) Residues 58–63 and 78–84 in SubHal in regions of high
variability within the subtilase family fall in the group of
shorter loops in contrast to SubTY where the equivalent
residues 57–67 and 82–93 form the largest loops.
(iii) Residues 95–122 in SubHal retain the loop–helix–loop orga-
nization as in other subtilases but in a different location as a
result of a reorientation of the α-helix by about 28°.
(iv) The loop–helix motif 132–161 in SubHal is shaped in a
different way than that of other subtilases. This results in
an overall shift of the helix by ca. 2.7 Å toward the SubHal
C-domain. This part of the SubHal chain participates in
formation of a contact platform between the catalytic and
C-domains (loop 351–365).
(v) The tight turn of residues 182–183 in classical subtilisins
is extended to a loop (186–199), which includes the Ca-X
site in SubHal and KP-43 (and to a loop with no calcium-
binding functionality in SubTY). This loop in SubHal contains
three turns and forms the only calcium-binding site of the
catalytic domain lying on the enzyme surface away from the
C-domain and from the active site. In SubTY this loop with
corresponding residue numbers 193–203 lies on the enzyme
surface in the opposite direction and does not form a metal-
binding site. However, it lies close to its neighboring loop 287–
301, which forms the Ca-IV calcium-binding site in SubTY
(Ca2+ 311 in sphericase).
(vi) The opposite situation occurs around residues 209–215,
where both SubHal and SubTY have additional loops but that
of SubTY contains the Ca-V-binding site (residue number 312
in sphericase). Superposition of the structures shows that
the SubTY style of loop would clash with the C-domain in
SubHal. This loop replaces a simple short turn seen in all other
subtilases.
(vii) Residues 233–245 of SubHal, as in KP-43, form a longer loop
that partially restricts access to the active site (Fig. 6.6). The
implications for protease specificity remain unclear.
(viii) The classical subtilase C-terminus corresponds to residue 316
of SubHal. SubHal and subtilase KP-43 are extended by the
C-domain at this point. In subtilisin BPN′/Novo there is a five-
residue extension beyond the consensus subtilisin C-terminus
whose direction approximately coincides with the first
β-strand of the SubHal C-domain.

6.2.5 The SubHal C-domain Compared to the Eukaryotic PCs, Furin and Kexin
The SubHal C-domain, primarily composed of β-strands, was first
observed in subtilase KP-43, and it binds two of the three calcium
ions of SubHal. Its fold closely resembles the P-domain of the
eukaryotic PC family, a set of intracellular subtilases involved in the
conversion of protein precursors to their active forms. These differ
from the bacterial subtilases in that they show a high specificity
for the cleavage site in their target proteins. X-ray structures of two
representatives of the PC family, furin [22] (PDB id 1p8j) and kexin
[23] (1ot5), showed very similar folds in both enzymes.

The structures of subtilisin BPN′, SubHal, furin, and two other
subtilases were superimposed using SSM (Fig. 6.10). It is
immediately clear that the catalytic domains all conform to the
overall subtilase topology, with the SubHal and furin structures
possessing considerable variations in their loop structures. The
P-domain of kexin is composed largely of β-strands and is very
similar in position to that of furin (Fig. 6.10). In contrast, the C-
domain of SubHal and subtilase KP-43 and the P-domain of kexin
do not lie in the same position and orientation relative to the
superimposed catalytic domains—there is some overlap but no
structural similarity within this overlap. Nevertheless, when the
kexin and furin P-domains and the SubHal C-domain are compared,
they are all seen to have the same underlying topology: furin
and kexin differ from one another only by the insertion of an extra
helix in kexin, and from SubHal by the insertion of one or two
helices, respectively.
It is intriguing that the bacterial subtilases KP-43 and SubHal
have a C-domain whose fold is similar to that of the extra P-domain of
the PC family but which is attached to the catalytic domain in a
quite different fashion. While the PC family P-domain is proposed to
contribute directly to the specificity site and to favor the preference
of furin for Arg in the P1 and P2 subsites, this cannot be the
role of the C-domain of SubHal and KP-43 as it lies too far away.
Nevertheless, the close similarity in topology and the similarity in
sequence do suggest evolution of these domains from a common
ancestor. Given the presence of the modified N-terminus and the
extra calcium-binding domain both in SubHal and in KP-43, the
main significance of the C-domain could be protection of the enzyme
C-terminus and stabilization of the overall structure based on metal
binding.

6.2.5.1 Active site comparison


The above set of subtilase structures was superposed on the basis
of the Cα atoms of Asp, His, and Ser of the catalytic triad, of the
Gly following the active-site His, and of Ser128 (SubHal numbering),
residues conserved throughout the Bacillus subtilisins. The positions
of all these atoms superimpose extremely well (Fig. 6.6), with the
conformations of the catalytic triad highly conserved. However,
significant differences are evident as we move away from the catalytic
to the specificity parts
of the active site. The contacts between SubHal and CI2A follow the
classical pattern seen in other complexes only in the very center of
the active site but differ significantly with increasing distance from
the catalytic triad.

6.2.5.2 The specificity pockets


When compared to other subtilases, significant differences can be
found in subsites S4, S3, and S1. In other subtilases the S4 subsite
is partially formed by a conserved Leu that in SubHal is replaced
by Trp129. In SubHal the loop around residue 96 and the region
around residue 103 define parts of the S3 and S4 subsites differently
from other subtilases (main chain shifts, sequence differences, etc.).
S1, normally created by Tyr217 or Ile220, is formed in SubHal by
Met250.
The SubHal Gly251 (Asn or Ser in other subtilases; Asn218 in
BPN′, participating in S2) makes way for the back-flipped 235–245
loop, which places Trp240 parallel to the 250–251 peptide bond.
Any side chain on 251 would prevent this close contact of the
loop with the active site. Residue Phe239 of this loop packs against
Met250, which would not be possible in the case of the bulky Tyr217
side chain found in other subtilases. Overall, this loop forms an
extended raised platform at the C-terminal end of the active site
(Fig. 6.6).

6.2.5.3 Inhibitor CI2A binding


CI2A (from barley, Hordeum vulgare, 83 a.a.) is known to inhibit
enzymes from the subtilase superfamily [45]. The binding of CI2A
to the SubHal active site was compared with that in other subtilase–
inhibitor complexes, including thermitase:eglin C (PDB id 2tec),
Savinase:CI2A, SubTY:CI2A (present work), B. pumilus subtilase:eglin C
(1mee), BPN′:CI2A (1lw6), and subtilase Carlsberg:eglin C (1cse).
The sequences of the barley chymotrypsin inhibitor CI2A and the leech
inhibitor eglin C are only 29% identical, but the two possess highly
similar folds.


While the fold of CI2A itself in the SubTY and SubHal complexes
is precisely as seen in other subtilase–inhibitor complexes, there
is a high degree of variation in the orientation of the inhibitor
with respect to the enzyme, probably reflecting crystal packing.
The CI2A complexes were superposed based only on their enzyme
components. The orientation of the inhibitor molecule can then be
represented by the vector between the Cα atoms of CI2A Met59
in the active site and Asp42 at the opposite end of the inhibitor.
The orientation of this vector varies by up to 29° in the published
complexes. The vector in the SubHal and SubTY complexes lies
within this range.
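
The orientation comparison described above reduces to the angle between two Cα–Cα vectors once the enzyme components have been superposed. A minimal Python sketch of that calculation follows. The coordinates are invented placeholders, not values from any of the structures discussed, and the function names are illustrative only.

```python
import math

def vector(a, b):
    """Vector from point a to point b (3D coordinates in Å)."""
    return tuple(bj - aj for aj, bj in zip(a, b))

def angle_deg(u, v):
    """Angle between two 3D vectors, in degrees."""
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))

# Hypothetical Cα coordinates (Å) of CI2A Met59 and Asp42 in two complexes,
# both expressed in the common frame of the superposed enzymes.
met59_a, asp42_a = (0.0, 0.0, 0.0), (0.0, 0.0, 25.0)
met59_b, asp42_b = (0.5, 0.0, 0.0), (0.5, 12.5, 21.65)

tilt = angle_deg(vector(met59_a, asp42_a), vector(met59_b, asp42_b))
print(f"Inhibitor axis differs by {tilt:.1f} degrees")
```

Because only the direction of each vector matters, a small translation of the inhibitor (as in the second placeholder complex) does not affect the result.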
In the other CI2A complexes (including SubTY) residues 1–19
are disordered, while in SubHal residues 16–19 can be modeled
and interact with the C-domain of the second enzyme in the
asymmetric unit to form a pseudodimer. To the best of our knowledge
this is the first observation of ordered residues preceding
residue 19 of the CI2A inhibitor and also of the formation of a dimer
of a subtilase-like protease inhibited by CI2A in which one inhibitor
molecule makes contacts not only with the active site of
the enzyme but also with another enzyme molecule in the crystal.
The interaction is most likely driven by a suitable surface potential
of the neighboring enzyme molecule. The N-terminus of CI2A,
most likely unfolded in solution, can reach a localized state in
a suitable environment. This contact may explain its tendency toward
spontaneous crystallization.
In the classical subtilisins, upon CI2A binding a short antiparallel
β-sheet is formed between enzyme residues 100–102 (BPN′
numbering) and inhibitor residues 56–58 (hydrogen bonds Gly102
O..Ile56 N, Gly102 N..Ile56 O, and Gly100 O..Thr58 N). This pattern
is conserved within the whole set of the compared structures (Table
6.1) except for SubHal, furin, and kexin. The situation is different in
the SubHal complex, where loop 97–108 folds in multiple turns quite
differently from the classical subtilases. Leu103 NH is too far from Ile56
O of CI2A (4.5 Å) to form an H-bond, and there is an extended groove
between CI2A residues 51–58 and the loop, bridged by an H-bonded
network of 10 water molecules.
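
The hydrogen-bond argument above rests on a donor–acceptor distance test. The sketch below illustrates such a test in Python. It assumes a distance-only criterion with a 3.5 Å cutoff between heavy atoms, which is a common heuristic chosen here for illustration; the chapter does not state which cutoff was applied. The coordinates are placeholders arranged to give the quoted 4.5 Å separation.

```python
import math

# Assumed donor-acceptor (heavy atom) cutoff in Å. This is a common
# heuristic, not a value taken from the chapter.
HBOND_CUTOFF = 3.5

def is_hbond(donor, acceptor, cutoff=HBOND_CUTOFF):
    """True if the donor-acceptor distance (Å) is within the cutoff."""
    return math.dist(donor, acceptor) <= cutoff

# Placeholder coordinates (Å): a 4.5 Å pair, as quoted for
# Leu103 N and Ile56 O, and a 2.9 Å pair typical of a main chain H-bond.
leu103_n, ile56_o = (0.0, 0.0, 0.0), (4.5, 0.0, 0.0)
gly_n, partner_o = (0.0, 0.0, 0.0), (2.9, 0.0, 0.0)

print("4.5 Å pair bonded:", is_hbond(leu103_n, ile56_o))
print("2.9 Å pair bonded:", is_hbond(gly_n, partner_o))
```

A fuller treatment would also test the donor–hydrogen–acceptor angle; the distance test alone is enough to show why a 4.5 Å separation is bridged by water rather than by a direct bond.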
The equivalent loop in furin (residues 227–234) and kexin (246–
253) is similar in the two enzymes but entirely different from
that in subtilisin BPN′ or SubHal. In furin and kexin there are
no analogous main chain hydrogen bond donors or acceptors in
the crucial positions and there are no structural data for CI2A
binding.

6.2.6 Activity Profiles


The pH–activity curves of SubTY and SubHal follow an alkalophilic-type
profile, as for the B. clausii subtilisin, with an optimal pH of 9.0 for
SubTY using the pNA assay and 11.0 for both enzymes using the Protazyme
AK assay (Fig. 6.8). SubHal is more sensitive to low pH conditions. At
pH 5.0 its residual activity is only 14% compared to 97% for SubTY
and 99% for Savinase. The temperature optimum (Protazyme AK
assay) for SubHal is at 80 °C, with over 20% activity at 90 °C, while
SubTY has its optimum at 70 °C, similar to Savinase (Fig. 6.9). The
activity and stability profiles of SubHal are thus shifted toward
the highest pH and temperature ranges of the subtilases. In addition,
SubHal has rather broad specificity, cleaving between charged,
polar, polar and hydrophobic, hydrophobic and hydrophobic, and
also aliphatic and aromatic side chains. Both SubHal and SubTY
are inhibited by phenylmethylsulfonyl fluoride (PMSF). The very
broad specificity of SubTY and SubHal is comparable to that of the
well-known subtilisins from B. clausii (old name: B. lentus) (Savinase),
B. licheniformis (Alcalase), and B. amyloliquefaciens (BPN′), as judged
using small pNA peptide substrates.

6.2.7 Comparison of Metal Binding at the Strong and Weak Sites in the S8 Family
The strong Ca-I site only binds calcium and is present in most
but not all subtilases. It is clearly defined in most structures but
absent in both SubHal and SubTY. The Ca-II/Na-II sites, characterized
by Pantoliano et al., are also present in many subtilases,
and careful analysis of binding patterns allows for differentiation
of the individual structures. The weak metal site is absent from
SubHal but is present in SubTY, where the Na-II site is occupied by
sodium.


6.2.8 The Ca-II and Na-II Metal-Binding Sites


We noted that the Na-II metal appeared to be incorrectly assigned
as calcium in the sphericase structure and subsequently analyzed
the weak Ca-II/Na-II site in an extensive set of subtilase structures in
the PDB. The typical Na+-, Ca2+-, and water-binding patterns from
studies properly assessing the nature of the metal were used as
master structures [27–31]. The identity, occupancies, and B-factors
of the metals reported in the weak sites of the individual structures
were taken into account and compared to the typical sodium-
and calcium-binding geometries, as well as electron density maps
when the X-ray data were available. Where there were significant
discrepancies, we propose a new identity for the metal or water
in the site. Occupation of the site close to Na-II by potassium was
assessed separately, as this larger ion binds to side chains of both
the Na-II and the Ca-II site (in a central position) and has
longer bond lengths of around 2.7 Å.
The site-content analysis covering a set of 79 subtilase structures
is limited to calcium, potassium, sodium, and water (Table 6.5).
Figure 6.11 shows a comparison of the Ca-II and Na-II sites with
the original and newly assigned ligands. The Na-II site is more
frequently occupied by an ion or atom (55 cases) than is the Ca-II site
(17). In another 23 structures only the central position is occupied
(i.e., with contacts to ligands typical for both sites). Forty-seven of
the seventy-nine structures contain an incorrectly assigned atom or
ion in one of the three positions, most frequently (16) with Ca2+
incorrectly modeled instead of Na+. A summary of the analysis is
given in Table 6.6.
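
The counts quoted in this paragraph can be cross-checked against the category totals of Table 6.6. The short Python sketch below does this arithmetic. The numbers are transcribed from the table; the zero entered for the incorrect count of the "none occupied" row is an assumption, since the table leaves that cell empty.

```python
# Counts transcribed from Table 6.6 as (number of cases, number incorrect).
# The "none occupied" row lists no incorrect count; 0 is assumed here.
TABLE_6_6 = {
    "both sites occupied": (17, 14),
    "only Na-II occupied": (38, 16),
    "only central site occupied": (23, 17),
    "none occupied": (1, 0),
}

total_cases = sum(cases for cases, _ in TABLE_6_6.values())
total_incorrect = sum(wrong for _, wrong in TABLE_6_6.values())

# The Na-II site holds an atom or ion whenever both sites are occupied
# or only Na-II is occupied.
na_ii_occupied = (TABLE_6_6["both sites occupied"][0]
                  + TABLE_6_6["only Na-II occupied"][0])

# Ca2+ modeled where Na+ is proposed: 12 (both sites) + 4 (only Na-II).
ca_for_na = 12 + 4

print(total_cases, total_incorrect, na_ii_occupied, ca_for_na)
print(f"{total_incorrect / total_cases:.0%} of structures contain a reassignment")
```

The sums reproduce the 79 structures, 47 incorrect assignments, 55 Na-II occupancies, and 16 calcium-for-sodium cases stated in the text.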
The Na-II site and, in some structures, the central position
selectively bind monovalent ions (Na+ and K+), while the Ca-II
site selectively binds divalent calcium ions. Binding of an ion to
one site precludes occupancy of the other site. However, in many
structures, there is a water molecule in one site acting as a ligand
for an ion in the other site. In only three structures was calcium
correctly modeled in the Ca-II site (Fig. 6.11). In the Na-II site,
sodium was correctly modeled 23 times and was misinterpreted 16
times as calcium and 11 times as water. We propose that in reality
Na+ occupies the Na-II site in 50 structures out of 79 and Ca2+
the Ca-II site only in three plus one potential occurrence close to
the central position. K+ was correctly modeled in three structures
(either in the central position or close to the Na-II site) and probably
misinterpreted six times, five times as Ca2+ and once as water. This
is not surprising as calcium binding to the Ca-II site occurs only at

Figure 6.11 Stereoviews of the second metal-binding site of subtilisins.
Coordinating residues of the Ca-II and Na-II sites are marked for subtilisin
BPN′. Ions or atoms (small spheres) are color coded: oxygen (red), calcium
(black), sodium (blue), and potassium (green). (a) Original identity of
ligands as reported in the deposited structures. Notice the significant scatter
of the individual atom types. (b) Revised ligand identity as shown in Table
6.5. Note the formation of four clusters: two clusters of water molecules
in the middle, usually accompanying binding of an ion in the opposite
site, and two clusters of calcium ions (Ca-II) and of sodium ions (Na-II).
Superposition of 79 structures.

Table 6.5 Content of the second metal-binding site of subtilisins, differentiated into the Ca-II, Na-II, and central sites, for 79 X-ray structures.
The atom type, the range of distances to the coordinating protein atoms, the coordinating residues, the B factor, and the occupancy factor q are given as found in
the PDB. The Suggested column contains the most likely identity of the atom on the basis of distances and coordination. dmin is the structure resolution;
dCaII-NaII is the distance between atoms or ions in the Ca-II and Na-II sites if both are occupied. If a protein residue disrupting the usual organization of a site
is identified, it is shown in parentheses; “rot” means that the side chain is fixed in a conformation that disrupts the site. Suggested atoms or ions: a question
mark is placed where there is high uncertainty in atom or ion identification; the value in parentheses is the next most likely interpretation. The
order of records within each block is given by the year of publication and then by alphabetical order of the first author, together with a reference
number, or by the year of publication in the PDB if no other reference is available.
Columns (left to right):
PDB id; Reference (year of publication); Subtilisin name; dmin (Å);
CaII site: Residues (disrupting the site); Atom/ion–residues distances (range in Å); B on residues (range in Å2); Atom or ion; q; B (Å2); Suggested atom;
Central site: Ion (q, B);
NaII site: Residues; Atom/ion–residues distances (range in Å); B on residues (range in Å2); Atom or ion; q; B (Å2); Suggested ion;
dCaII−NaII (Å)

Ca-II and Na-II occupied



1aqn Pantoliano, BPN 1.8 E195 D197 2.3 2.5 56 Ca2+ 1.0 5.6 – – A169 Y171 2.6 2.8 68 W 1.0 14.6 – 2.5
1989 [29] 8324 V174
1ak9 Pantoliano, BPN 1.8 E195 D197 2.3 2.4 9 10 Ca2+ 0.75 4.3 – – A169 Y171 2.6 2.7 6 12 Na+ 1.0 17.7 – 2.6
1989 [29] 8321 V174
1a2q Pantoliano, BPN 1.8 E195 D197 2.6 3.3 10 13 W 1.0 13.6 – – A169 Y171 2.2 2.3 89 Ca2+ 0.6 12.0 Na+ 2.3
1989 [29] V174
1au9 Pantoliano, BPN 1.8 E195 D197 2.5 3.2 56 W 1.0 15.3 – Ca2+ A169 Y171 2.6 2.9 46 – – – K+ 3.0
1989 [29] (0.7, 2.5, 2.9- V174 (W)
3.1 from CaII)
1sbn Heinz, 1991 BPN 2.1 E195 D197 2.8 (1x) 25 55 W 1.0 37.2 – Ca2+ (1.0, G169 Y171 2.5 3.0 30 35 – – – K+ 2.7
[45] 32.6, 3.0-3.5 V174 (W)
from CaII)
1svn Betzel, 1992 Savinase 1.4 G195 D197 2.6 2.8 11 12 W 1.0 9.0 – – A169 Y171 2.3 2.5 78 Ca2+ 0.6 14.0 Na+ 2.2
[16] A174
1st3 Goddette, Savinase 1.4 G189 D191 2.6 2.8 10 12 W 1.0 23.9 – – A163 Y165 2.2 2.3 67 Ca2+ 1.0 15.4 Na+ 1.9
1992 [47] A168
1sud Gallagher, BPN 1.9 E195 D197 2.5 2.8 10 11 Ca2+ 0.49 22.8 – – G169 Y171 2.5 2.5 7 11 K+ 0.5 14.3 – 2.2
1993 [32] CRB-S3 V174
1sbh Kidd, 1996 BPN 1.8 E195 D197 2.7 3.4 20 20 W 1.0 30.4 – – A169 Y171 2.3 2.4 14 20 Ca2+ 1.0 38.8 Na+ 2.2
[48] 8397+1 V174
1scj Jain, 1998 E 2.0 E195 D197 2.3 2.9 14 19 W 1.0 10.7 Ca2+ ? – A169 Y171 2.6 2.8 12 15 Ca2+ 1.0 40.4 Na+ 2.6
[14] T174
1yja Kidd, 1999 BPN 1.8 E195 D197 2.6 3.3 13 17 W 1.0 21.9 – – A169 Y171 2.2 2.4 10 17 Ca2+ 1.0 23.7 Na+ ? 2.3
[49] 8397+1 V174
1yjb Kidd, 1999 BPN 1.8 E195 D197 2.7 2.7 39 W 1.0 2.0 – – A169 Y171 2.2 2.3 22 Ca2+ 1.0 7.5 Na+ ? 2.3

[49] 8397+1 V174
1yjc Kidd, 1999 BPN 1.8 E195 D197 2.9 3.6 55 W 1.0 7.1 – – A169 Y171 2.2 2.3 23 Ca2+ 1.0 28.9 Na+ 2.1
[49] 8397+1 V174
1ubn Dinakarpan- Selenosub- 2.4 E195 D197 2.8 3.6 20 35 W 1.0 70.3 – – G169 Y171 2.4 2.5 19 20 Ca2+ 1.0 44.4 Na+ 2.1
dian, 1999 tilisin V174
[50] BPN
1ndq Pan, 2003 Savinase 1.8 G195 D197 2.7 3.5 14 25 W 1.0 21.6 – – A169 Y171 2.3 2.4 9 12 Ca2+ 0.74 24.5 Na+ 2.4
[51] A174
1ndu Pan, 2003 Savinase 1.6 G195 D197 2.7 3.7 21 23 W 0.93 17.8 – – A169 Y171 2.3 2.3 14 15 Ca2+ 0.92 28.5 Na+ 2.5
[51] A174
1tk2 Unpublished Savinase 1.5 G195 D197 2.6 2.6 15 16 W 1.0 17.5 – – A169 Y171 2.3 2.3 12 14 Ca2+ 1.0 2.0 Na+ 2.3
A174 ?**
Na-II occupied
1cse Bode, 1987 Carlsberg 1.2 (E197) – – – – – – – A169 Y171 2.3 2.4 69 Ca2+ 0.42 10.8 Na+ –
[15] V174
1sbc Neidhart, Carlsberg 2.5 (E197) – – – – – – – A169 Y171 - 4 11 – – – – –
1988 [52] V174
2sec McPhalen, Carlsberg 1.8 (E197) – – – – – – – A169 Y171 2.5 2.6 8 10 Ca2+ 0.91 9.9 K+
1988 [53] V174
1sca Fitzpatrick, Carlsberg 2.0 (E197) – – – – – – – A169 Y171 2.3 2.4 8 12 Na+ 1.0 16.1 – –
1993 [54] V174
1scb Fitzpatrick, Carlsberg 2.3 (E197) – – – – – – – A169 Y171 2.2 2.5 5 13 W 1.0 10.9 Na+ –
1993 [54] V174
1scd Fitzpatrick, Carlsberg 2.3 (E197) – – – – – – – A169 Y171 2.4 2.6 10 17 Ca2+ 1.0 32.8 Na+ –
1994 [55] V174
1af4 Schmitke, Carlsberg 2.6 (E197) – – – – – – – A169 Y171 2.4 2.6 15 20 W 1.0 4.7 Na+ –
1997 [56] V174
1bh6 Eschenburg, DY 1.8 (E197) – – – – – – – A169 Y171 2.5 2.6 9 10 Na+ 1.0 22.5 – –
1998 [57] V174
1c3l Prangé, 1998 Carlsberg 2.2 (E197) – – – – – – – A169 Y171 2.3 2.4 9 18 Ca2+ 1.0 39.3 Na+ –
[58] V174
1be6 Schmitke, Carlsberg 2.2 (E197) – – – – – – – A169 Y171 2.3 2.6 14 21 W 1.0 14.2 Na+ –
1998 [59] V174
1be8 Schmitke, Carlsberg 2.2 (E197) – – – – – – – A169 Y171 2.2 2.6 21 27 W 1.0 22.8 Na+ –
1998 [59] V174
1bfk Schmitke, Carlsberg 2.3 (E197) – – – – – – – A169 Y171 2.3 2.6 24 34 W 1.0 24.3 Na+ –
1998 [59] V174
1bfu Schmitke, Carlsberg 2.2 (E197) – – – – – – – A169 Y171 2.3 2.6 21 28 W 1.0 11.9 Na+ –
1998 [59] V174
1vsb Stoll, 1998 Carlsberg, 2.1 (E197) – – – – – – – A169 Y171 2.3 2.5 2 10 W 1.0 2.00 Na+ –
[60] type VIII V174

3vsb Stoll, 1998 Carlsberg, 2.6 (E197) – – – – – – – A169 Y171 1.9 2.1 2 24 Na+ 1.0 10.0 – –
[60] type VIII V174
1av7 Stoll, 1998 Carlsberg 2.6 (E197) – – – – – – – A169 Y171 2.4 2.8 5 12 Na+ 1.0 30.2 – –
[60] type VIII V174
1avt Stoll, 1998 Carlsberg 2.0 (E197) – – – – – – – A169 Y171 2.2 2.4 38 Na+ 1.0 24.3 – –
[60] type VIII V174
1gns Almog, 2002 BPN 1.8 E195 D197 – 8 13 – – – – – G169 Y171 2.3 2.5 67 W 1.0 18.4 Na+ –
[61] V174
1oyv Barrette-Ng, Carlsberg 2.5 (E197) – – – – – – – A169 Y171 2.3 2.8 22 33 W 1.0 16.8 Na+ –
2003 [62] V174


1r0r Horn, 2003 Carlsberg 1.1 (E197) – – – – – – – A169 Y171 2.1 2.4 2.4 12 13 2x 0.5 43.0 Na+ –
[63] A174 2.4 Ca2+ 0.5 22.8

1tm1 Radisky, BPN 1.7 E195 – 15 19 – – – – – G169 Y171 2.3 2.4 14 17 Na+ 1.0 16.3 – –
2004 [64] (D197 rot) V174
1tm3 Radisky, BPN 1.6 E195 – 16 19 – – – – – G169 Y171 2.3 2.4 13 16 Na+ 1.0 16.0 – –
2004 [64] (D197 rot) V174
1tm4 Radisky, BPN 1.7 E195 – 20 22 – – – – – G169 Y171 2.3 2.4 17 20 Na+ 1.0 20.0 – –
2004 [64] (D197 rot) V174
1tm5 Radisky, BPN 1.5 E195 – 17 18 – – – – – G169 Y171 2.3 2.4 14 15 Na+ 1.0 14.5 – –
2004 [64] (D197 rot) V174
1tm7 Radisky, BPN 1.6 E195 – 19 21 – – – – – G169 Y171 2.3 2.4 15 17 Na+ 1.0 17.7 – –
2004 [64] (D197 rot) V174
1tmg Radisky, BPN 1.7 E195 – 15 17 – – – – – G169 2.3 2.4 12 14 Na+ 1.0 15.0 – –
2004 [64] (D197 Y171
rot) V174
1to1 Radisky, BPN 1.7 E195 – 18 20 – – – – – G169 2.3 2.4 14 16 Na+ 1.0 18.2 – –
2004 [64] (D197 Y171
rot) V174
1to2 Radisky, BPN 1.3 E195 – 13 15 – – – – – G169 2.3 2.4 11 12 Na+ 1.0 13.0 – –
2004 [64] (D197 Y171
rot) V174
1yu6 Maynes, Carlsberg 1.6 (E197) – – – – – – – A169 2.3 2.4 12 13 W 1.0 5.3 Na+ –
2005 [65] Y171
V174
1y1k Radisky, BPN 1.6 E195 – 18 19 – – – – – G169 2.3 2.4 14 16 Na+ 1.0 16.7 – –
2005 [66] (D197 Y171
rot) V174
1y33 Radisky, BPN 1.8 E195 – 16 18 – – – – – G169 2.3 2.4 11 16 Na+ 1.0 16.6 – –
2005 [66] (D197 Y171
rot) V174
1y34 Radisky, BPN 1.6 E195 – 16 17 – – – – – G169 2.3 2.4 13 13 Na+ 1.0 15.0 – –
2005 [66] (D197 Y171
rot) V174
1y3b Radisky, BPN 1.8 E195 – 16 18 – – – – – G169 2.3 2.4 13 16 Na+ 1.0 15.8 – –
2005 [66] (D197 Y171
rot) V174

1y3c Radisky, BPN 1.7 E195 – 16 18 – – – – – G169 Y171 2.3 2.4 14 16 Na+ 1.0 16.4 – –
2005 [66] (D197 rot) V174
1y3d Radisky, BPN 1.8 E195 – 20 20 – – – – – G169 Y171 2.3 2.4 15 19 Na+ 1.0 19.3 – –
2005 [66] (D197 rot) V174
1y3f Radisky, BPN 1.7 E195 – 17 20 – – – – – G169 Y171 2.3 2.4 17 20 Na+ 1.0 20.6 – –
2005 [66] (D197 rot) V174
1y48 Radisky, BPN 1.8 E195 – 18 20 – – – – – G169 Y171 2.3 2.4 14 19 Na+ 1.0 18.5 – –
2005 [66] (D197 rot) V174
1y4a Radisky, BPN 1.6 E195 – 13 21 – – – – – G169 Y171 2.2 2.4 10 12 W 1.0 5.1 Na+ –
2005 [66] (D197 rot) V174
Central position occupied
1sib Heinz, 1991 BPN 2.4 E195 D197 3.1 (1x) 37 (1x) – – – – Ca2+ (1.0, G169 Y171 2.5 3.1 41 48 – – – K+ –
[46] 51.4) V174 (Na+ )
3sic Takeuchi, BPN 1.8 E195 D197 3.0 3.0 19 19 – – – – Ca2+ (1.0, G169 Y171 2.9 3.0 14 18 – – – W? –

1991 [67] 40.3) V174
5sic Takeuchi, BPN 2.2 E195 D197 2.9 3.1 99 – – – – Ca2+ (1.0, G169 Y171 2.9 3.0 8 11 – – – W –
1991 [67] 26.8) V174
1sub Gallagher, BPN 1.8 E195 D197 3.0 3.3 8 10 – – – – K+ (0.64, G169 Y171 2.5 2.8 78 – – – – –
1993 [32] CRB-S3 17.0) V174
1suc Gallagher, BPN 1.8 E195 D197 3.0 3.1 89 – – – – K + (0.75, G169 Y171 2.6 2.8 57 – – – – –
1993 [32] CRB-S3 20.0) V174
1scn Steinmetz, Carlsberg 1.9 E195 2.8 (1x) 19 (1x) – – – – Na+ (1.0, V174 3.1 17 – – – W
1994 [68] (E197) 47.5)
1spb Gallagher, BPN 2.0 E195 D197 3.0 3.3 36 – – – – Na+ (0.92, G169 Y171 2.5 2.6 29 – – – – –
1995 [69] 11.2) V174
1mpt Yamane, M- 2.4 G195 2.7 3.0 5 11 – – – – Ca2+ A169 2.4 2.8 7 12 – – – K+ (W –
1995 [17] protease D197 (1.0, Y171 or
26.6) A174 Na+ )
1sup Gallagher, BPN 1.6 E195 2.8 3.1 8 11 – – – – Na+ G169 2.8 3.0 8 11 – – – W –
1996 [70] D197 (1.0, Y171
17.5) V174
1sua Almog, BPN 2.1 E195 2.7 2.8 78 – – – – W (1.0, G169 2.8 3.1 5 10 – – – – –
1998 [71] D197 12.5) Y171
V174
1sue Gallagher, BPN 1.8 E195 2.9 3.2 7 11 – – – – Na+ G169 2.7 2.9 7 14 – – – W –
1998 [72] D197 (1.0, Y171
12.0) V174
1gci Kuhn, 1998 Savinase 0.8 G195 2.9 3.2 88 – – – – Ca2+ A169 2.8 3.0 77 – – – W –
[33] D197 (0.5, Y171
11.5) A174
1c9j Graycar, Savinase 1.8 G195 3.0 3.3 9 16 – – – – Ca2+ A169 2.6 2.9 79 – – – W –
1999 [73] D197 (0.83, Y171 (K+ )
34.8) A174
1c9m Graycar, Savinase 1.7 G195 2.7 3.2 13 13 – – – – Ca2+ A169 2.7 3.1 11 13 – – – W –
1999 [73] D197 (0.83, Y171
37.2) A174
1c9n Graycar, Savinase 1.5 G195 3.2 3.2 16 18 – – – – Ca2+ A169 2.7 3.1 14 15 – – – W –
1999 [73] D197 (0.86, Y171
43.7) A174

1iav Graycar, Savinase 1.8 G195 D197 2.7 3.1 89 – – – – Ca2+ (0.65, A169 Y171 2.7 3.1 68 – – – W –
1999 [73] 24.5) A174
1jea Graycar, Savinase 2.0 G195 D197 2.6 3.2 58 – – – – Ca2+ (0.79, A169 Y171 2.7 3.1 38 – – – W –
1999 [73] 31.5) A174
1dui Pan, 2000 BPN 2.0 E195 D197 2.9 3.0 11 12 – – – – Na+ (1.0, G169 Y171 3.0 3.1 8 15 – – – W –
[74] 10.7) V174 (K)
1gnv Almog, 2002 BPN 1.9 E195 D197 2.5 2.8 67 – – – – W (0.95, G169 Y171 3.0 3.3 8 10 – – – K+ ? –
[61] 3.9) V174 or
W?
1lw6 Radisky, BPN 1.5 E195 D197 2.9 3.1 8 10 – – – – W (1.0, G169 Y171 2.9 3.0 89 – – – – –
2002 [12] 12.1) V174


1q5p Bott, 2003 Savinase 1.6 G189 D191 2.8 3.1 00 – – – – Ca2+ (1.0, A163 Y165 2.8 3.1 0 10 – – – W* –
[75] 0.0) A168

1v5i Unpublished BPN 1.5 E195 D197 2.7 3.1 13 15 – – – – W (1.0, G169 Y171 2.8 3.3 13 14 – – – – –
14.1) V174
1y4d Radisky, BPN 2.0 E195 D197 2.7 3.1 25 25 – – – – Na+ (1.0, G169 Y171 2.7 2.9 21 23 – – – W –
2005 [66] 13.0) V174
None of the sites occupied
1sbi Kidd, 1996 BPN 2.2 E195 D197 – 8 16 – – – – – A169 Y171 – 12 16 – – – ***
[48] 8397 V174

*Difficult to classify due to the lack of structural information.


**B value(s) most likely unrealistic.
***Missing solvent.

Table 6.6 Summary of the content analysis of the Ca-II, Na-II, and central
metal-binding site of subtilisins. See Table 6.5 for details

Category                    No. of cases  No. incorrect  Frequent misinterpretation  Other
Both sites occupied         17            14             12× Ca2+, correct Na+       –
Only Na-II occupied         38            16             4× Ca2+, correct Na+        11× Na+ missed
Only central site occupied  23            17             14× ion, correct water      2× incorrect ion
None occupied               1                                                        Solvent missing
All cases                   79            47             16× Ca2+, correct Na+       14× ion, correct water

low or zero sodium concentration, rare in crystallographic studies.
The same is probably true for potassium binding but has not been
investigated in depth.
Even though some of the structures analyzed are difficult to judge
with certainty or are inconclusive (e.g., structures with no solvent
modeled), most of the misinterpreted cases indicate that sodium
binding to its typical binding site (several oxygens as ligands in
close proximity to each other, without the constraint of octahedral
coordination) should occur at high frequency. Only under special
conditions, such as intended ion-binding studies or sodium-free
solutions, can other ions be found in, or close to, the Ca-II and
Na-II sites.
Reinterpretation and refinement of the reported structures
against the experimental diffraction data would be the obvious next
step following our analysis. However, the data are not available
for many structures and, in addition, detailed information on
the composition of crystallization solutions is frequently lacking.
This analysis may seem to stress the importance of the second
site occupancy in a rather pedantic way, but its importance should
not be underestimated. It is clear that the stability of Bacillus
subtilisins is critically dependent on the metal ions present. Removal
of calcium from Ca-I leads to unfolding and autodegradation.
Stability is further controlled by the presence of a divalent calcium
in Ca-II (but only at low sodium concentration) or a monovalent
ion, usually sodium, in Na-II. Many published studies purporting to
investigate the dependency of the stability or activity of subtilases
on metal ions have clearly failed to identify which ion lies in the
Ca-II/Na-II site. Indeed from the results presented here it is evident
that in many X-ray structures calcium was not bound or could not
play a stabilizing role in the Ca-II site but rather the neighboring Na-
II site was occupied by sodium. The effect of this on stability remains
unclear.

6.3 Conclusion: Implications for Structural Studies of Enzymes

The new structures for subtilases SubHal and SubTY shed light on
the details of calcium binding to subtilases. In addition, the analysis
of the second metal-binding site content in the published structures
of subtilases clearly shows that wrong ions were often modeled
in subsites Ca-II and Na-II. This puts all solution experiments
performed on model subtilisins in the past 25 years in a different
perspective. Even if relevant data about stability and activity were
provided, their interpretation is not trivial due to the potential
discrepancies in metal site occupancy described here. Future
experiments should be performed in a more controlled manner, with
the help of the present analysis.
The structure of the SubHal:CI2A complex provides, for the first
time, structural details about this particular subtilase type and
its CI2A binding. While the structure of SubHal is similar to subtilase
KP-43, the complex with the chymotrypsin inhibitor leads to two
conclusions. First, a subtilase from an alkalophilic organism such as
B. halmapalus is not expected to encounter chymotrypsin inhibitor
in its natural environment. Even if CI2A binding to SubHal in
principle repeats what has been observed for other such complexes,
the SubHal molecule is not perfectly suited for the inhibitor binding
(discussed for regions 51–58 of CI2A). This leads to a rather loose
fit when compared to other subtilase:CI2A complexes, with direct
protein–protein interactions being replaced by water-mediated
ones. The structural details can contribute to studies focused on
evolutionary optimization of inhibitor:subtilase interactions.
Analysis of the SubTY structure led to the identification of Na+
in the Na-II site. The subsequent analysis of metal ions present
in the second binding site of deposited subtilase structures led to
several conclusions. In previous structural studies numerous cases
of misinterpretation of the metal in the Ca-II/Na-II site can be found.
The largest classes consist of sodium incorrectly modeled as calcium
(16 of 79) or water interpreted as an ion (14). Our reassignment
of the site content clarifies the relationship between the atom type
and particular subsite position. We propose that the Ca-II, Na-II,
and the central site can be distinguished from their position in the
structure, which reflects the chemical nature of the bound metal or
ion. Unfortunately, this was often not taken into account in
structure analyses. Therefore, the results of many previous solution
studies of the influence of metals in the second metal-binding site
(in particular calcium and sodium) on subtilase stability and activity
may have been biased by structural misinterpretations.

6.4 Materials and Methods

6.4.1 SubTY
6.4.1.1 Protein production and purification
The SubTY gene was transformed into Escherichia coli. DNA purified
from an overnight culture of these transformants was transformed
into B. subtilis by restriction endonuclease digestion, purification
of DNA fragments, and ligation. Transformation of B. subtilis was
performed as described by Dubnau and Davidoff-Abelson [76].
Expression procedure details can also be found in Refs. [77] and
[78].
The culture broth was centrifuged (20,000× g, 20 min), and
the supernatants were carefully decanted from the precipitates.
The combined supernatants were filtered through a Seitz EKS
plate in order to remove the rest of the Bacillus host cells. The
pH of the EKS filtrate was adjusted to pH 7, and the filtrate
was applied to a bacitracin silica column equilibrated in 100 mM
H₃BO₃, 10 mM dimethylglutaric acid, 2 mM CaCl₂, pH 7. After
washing the bacitracin column extensively with the equilibration
buffer, the protease was step eluted with 100 mM H₃BO₃, 10 mM
dimethylglutaric acid, 2 mM CaCl₂, 1 M NaCl, 25% isopropanol,
pH 7. The bacitracin eluate was transferred to 50 mM H₃BO₃, 5 mM
dimethylglutaric acid, 1 mM CaCl₂, pH 5 on a G25 Sephadex
column and applied to an S-Sepharose HP column equilibrated in
the same buffer. After washing the column extensively with the
equilibration buffer, the protease was eluted with a linear NaCl
gradient (0–0.5 M) in the same buffer. Fractions from the column
were analyzed for protease activity, and active fractions were further
analyzed by SDS-PAGE. Fractions showing a single band on
the Coomassie-stained SDS-PAGE gel were pooled as purified SubTY
protease.

6.4.1.2 Purification of the SubTY:CI2A (1:1) complex


Approximately 65 mg of SubTY protease was mixed with approximately
30 mg of purified CI2A, resulting in a molar excess of the
CI2A inhibitor in the mixture. SubTY:CI2A (1:1) complexes were
separated from excess CI2A on a Superdex 75 size-exclusion column
equilibrated in 100 mM H₃BO₃, 10 mM dimethylglutaric acid, 2 mM
CaCl₂, 150 mM NaCl, pH 7.0. Fractions from the column were
analyzed by SDS-PAGE, and SubTY:CI2A-containing fractions were
pooled and concentrated on an Amicon ultrafiltration cell to give the
final SubTY:CI2A complex product (ca. 50 mg).

6.4.1.3 Crystallization
SubTY complexed with the chymotrypsin inhibitor CI2A was concentrated
to 45 mg/mL in 10 mM Tris, pH 7.0, and crystallization
trials were performed using hanging drop vapor diffusion. The first
crystals were obtained by mixing 1 μL of the protein solution with
1 μL of a reservoir solution containing 5.5 M sodium formate and
0.2 M imidazole-malate, pH 6.5. These crystals grew as thin, stacked
plates; the addition of 200 mM NDSB 201 improved them substantially,
leading to the growth of single, prism-shaped crystals. A crystal
from the optimized condition was vitrified
directly from the crystallization drop and diffracted to a spacing
of 1.8 Å.
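As a rough guide to the starting conditions in the drop, the concentrations immediately after mixing follow from simple dilution of the two solutions. The sketch below is illustrative only: the function name and the dictionary layout are assumptions made for this example, and the NDSB 201 additive is left out because the text does not state how it was introduced into the drop.

```python
def initial_drop_concentrations(protein_solution, reservoir_solution,
                                protein_volume_ul=1.0, reservoir_volume_ul=1.0):
    """Return component concentrations right after mixing a hanging drop.

    Each solution is a dict mapping a component label to its concentration.
    Units are carried in the label and are simply scaled by the dilution.
    """
    total = protein_volume_ul + reservoir_volume_ul
    drop = {}
    for name, conc in protein_solution.items():
        drop[name] = drop.get(name, 0.0) + conc * protein_volume_ul / total
    for name, conc in reservoir_solution.items():
        drop[name] = drop.get(name, 0.0) + conc * reservoir_volume_ul / total
    return drop


# First SubTY:CI2A condition described above (1 uL protein + 1 uL reservoir).
subty_drop = initial_drop_concentrations(
    {"SubTY:CI2A (mg/mL)": 45.0, "Tris (mM)": 10.0},
    {"sodium formate (M)": 5.5, "imidazole-malate (M)": 0.2},
)
for component, value in subty_drop.items():
    print(f"{component}: {value:g}")
```

With equal volumes every component starts at half its stock concentration; the drop then concentrates as it equilibrates against the reservoir.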

6.4.1.4 Structure determination


X-ray diffraction data were collected with a Rigaku rotating anode X-
ray generator and a MAR345 imaging plate detector. The data were
processed and scaled in the space group P2₁2₁2₁ using the HKL
suite [79]. Data processing statistics are shown in Table 6.3.
Phases were determined by molecular replacement (MR) with
the program AmoRe [80] using Savinase (PDB id 1svn) as the search
model. The rotation and translation function searches resulted in a
simple clear solution. Rigid-body refinement of the resulting model
gave a correlation coefficient of 38% and an R-factor of 0.47 in
the 12–3.5 Å resolution range. The asymmetric unit contains one
SubTY–CI2A complex, corresponding to a VM value of 2.6 Å³/Da and
a solvent content of 53%. The structure was rebuilt manually using
XtalView [81]. Refinement of the structure was carried out with
REFMAC5 [82] with automated addition of water molecules.
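The R-factor quoted above measures the disagreement between observed and calculated structure-factor amplitudes, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|. A minimal sketch of this standard definition follows. It assumes the calculated amplitudes have already been placed on the scale of the observed ones, and the amplitudes in the usage example are invented for illustration; they are not taken from the SubTY data.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R-factor for paired amplitude lists.

    Assumes f_calc is already scaled to f_obs (no scale factor is refined here).
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


# Hypothetical amplitudes, for illustration only.
example_obs = [100.0, 50.0, 25.0]
example_calc = [90.0, 55.0, 25.0]
print(f"R = {r_factor(example_obs, example_calc):.3f}")
```

Rfree is the same quantity evaluated only over the reflections set aside from refinement, so the function can be reused on that subset.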

6.4.2 SubHal
6.4.2.1 Protein production and purification
Cloning and expression in B. subtilis were performed as for SubTY
[83]. The culture broth was centrifuged (20,000 × g, 20 min), and
the supernatants were carefully decanted from the precipitates. The
combined supernatants were filtered through a Seitz EKS plate in
order to remove the rest of the Bacillus host cells. The EKS filtrate
was transferred to 50 mM H₃BO₃, 5 mM succinic acid, 1 mM CaCl₂,
pH 7 on a G25 Sephadex column and applied to a bacitracin silica
column equilibrated in the same buffer. After washing the bacitracin
column extensively with the equilibration buffer, the protease was
step eluted with 100 mM H₃BO₃, 10 mM succinic acid, 2 mM
CaCl₂, 1 M NaCl, 5% isopropanol, pH 7. When the bacitracin
eluate was analyzed by SDS-PAGE, only one band was seen on the
Coomassie-stained gel. However, the eluate was colored.
To remove the color, the eluate was transferred
to 50 mM H₃BO₃, 10 mM succinic acid, 1 mM CaCl₂, pH 8 on
a G25 Sephadex column and applied to a Q Sepharose FF column
equilibrated in the same buffer. The SubHal protease did not bind
to the column under these conditions, and the colorless effluent was
collected as purified SubHal protease.

6.4.2.2 Purification of the SubHal:CI2A (1:1) complex


Approximately 50 mg of SubHal was mixed with 15 mg of purified
CI2A, resulting in a molar excess of the CI2A inhibitor in the mixture.
SubHal:CI2A (1:1) complexes were separated from excess CI2A on a
Superdex 75 size-exclusion column equilibrated in 10 mM HEPES,
1 mM CaCl₂, 100 mM NaCl, pH 7. Fractions from the column
were analyzed by SDS-PAGE to identify the SubHal:CI2A-containing
fractions. The top fractions were pooled as SubHal:CI2A complexes
(ca. 25 mg).
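The statement that 15 mg of CI2A is a molar excess over 50 mg of SubHal can be checked with a short calculation. The sketch below uses the 54.5 kDa mass of the complex quoted in Section 6.4.2.4.1; the CI2A mass of about 9 kDa is an assumption introduced for this example (it is not stated in this section), and the SubHal mass is taken as the difference.

```python
COMPLEX_MASS_KDA = 54.5          # SubHal:CI2A complex, from Section 6.4.2.4.1
ASSUMED_CI2A_MASS_KDA = 9.0      # assumption for illustration, not from the text
ASSUMED_SUBHAL_MASS_KDA = COMPLEX_MASS_KDA - ASSUMED_CI2A_MASS_KDA


def micromoles(mass_mg, molecular_mass_kda):
    """Amount in micromoles: mg divided by kDa (kg/mol) gives umol."""
    return mass_mg / molecular_mass_kda


subhal_umol = micromoles(50.0, ASSUMED_SUBHAL_MASS_KDA)
ci2a_umol = micromoles(15.0, ASSUMED_CI2A_MASS_KDA)
inhibitor_to_enzyme = ci2a_umol / subhal_umol

print(f"SubHal: {subhal_umol:.2f} umol, CI2A: {ci2a_umol:.2f} umol")
print(f"inhibitor:enzyme molar ratio = {inhibitor_to_enzyme:.2f}")
```

Under these assumed masses the inhibitor is present at roughly 1.5 equivalents, consistent with the excess CI2A later removed by size-exclusion chromatography.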

6.4.2.3 Crystallization
6.4.2.3.1 Unliganded SubHal
SubHal was concentrated to 24 mg/mL in 50 mM sodium cacodylate
buffer, pH 6.5, 2 mM CaCl₂, and 100 mM NaCl. The enzyme was
crystallized by hanging drop vapor diffusion in drops containing 1
μL of protein solution and 1 μL of reservoir solution over 0.5 mL
of reservoir: Hampton Screen 2 no. 23 (10% dioxane, 0.1 M 2-(N-
morpholino)-ethanesulfonic acid [MES] buffer, pH 6.5, and 1.6 M
ammonium sulfate) [84]. Clusters of elongated plates appeared after
several days.

6.4.2.3.2 The SubHal:CI2A inhibitor complex


SubHal (13.5 mg/mL in 50 mM sodium cacodylate buffer, pH 6.5, 2 mM
CaCl₂, and 100 mM NaCl) was incubated with concentrated
CI2A inhibitor solution (5.5 mg/mL) for several minutes at a molar
inhibitor:enzyme ratio of 2:1. Excess inhibitor was separated by gel
filtration on a Superdex 75 10/30 column (Akta FPLC chromatography
system). Fractions corresponding to the size of the SubHal:CI2A
complex were pooled and concentrated on a 1 kDa cutoff membrane to
approximately 14 mg/mL.
The SubHal:CI2A complex crystallized spontaneously upon
concentration within several hours, but this resulted in crystals
that were too small for diffraction studies. However, this effect
could be largely avoided by omitting NaCl from the concentrated
solution. Because of the potential for spontaneous crystallization,
fresh concentrated samples had to be prepared for all crystallization
trials. Native SubHal:CI2A plate-like crystals were grown by hanging
drop vapor diffusion, with the drop consisting of 2 μL of protein
(15–20 mg/mL in 10 mM sodium cacodylate-HCl
buffer, pH 6.5) and 1 μL of reservoir solution: 20% w/v PEG 4000,
0.1 M 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES)
buffer, pH 7.5, and 10% v/v isopropanol.
Crystals for heavy atom derivatives were grown under similar
conditions: 5%–17% w/v PEG 4000, 0.1 M HEPES buffer pH 7.5,
10% isopropanol, 14–22 mg/mL protein concentration in 50 mM
sodium cacodylate buffer, pH 6.5, and 2 mM CaCl₂. Heavy atom
soaks were carried out by transferring selected crystals into drops
containing the appropriate heavy atom salt and leaving them in a hanging
drop setup for the times stated in Table 6.7.

6.4.2.4 Structure determination


6.4.2.4.1 The CI2A complex
X-ray diffraction data were collected on a native monoclinic
plate-like crystal with CuKα radiation from an in-house Rigaku
rotating anode generator and a MAR345 imaging plate detector
(Marresearch GmbH, Norderstedt, Germany). Processing with the
HKL package yielded data 76.6% complete to a spacing of 1.9 Å in
space group P2₁ (Table 6.3).
The volume of the unit cell together with the molecular weight of
the complex (54.5 kDa) suggested between 1 and 3 complexes in the
asymmetric unit (estimated Matthews coefficients of 4.6, 2.3, and 1.5
and solvent contents of 73%, 46%, and 20%, respectively). However,
MR searches with a variety of models based on known structures
of subtilases did not lead to any successful solutions. Therefore,
heavy atom derivatives were prepared by soaking crystals in mother
liquor with varying concentrations of the following compounds:
ethylmercury thiosalicylic acid, K₂Au(CN)₂,
dichloro(2,2′:6′,2″-terpyridine)platinum(II) dihydrate, CH₃HgCl₂, and K₂PtCl₄. Only
data from the 10 mM K₂Au(CN)₂ and 5 mM K₂PtCl₄ soaks were
useful for phasing. Diffraction data were processed in the same
way as for the SubHal:CI2A complex (Table 6.3).
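The Matthews coefficients and solvent contents quoted above can be reproduced from the unit cell volume and the mass of the complex. The sketch below is a worked check rather than the authors' procedure. It assumes the cell of the Au derivative from Table 6.7 as a stand-in for the native cell (which is listed in Table 6.3), two symmetry-equivalent positions for space group P2₁, and the usual approximation that the solvent fraction is 1 − 1.23/VM.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit cell volume (A^3) for a monoclinic cell with unique axis b."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, n_symmetry_ops, n_copies, mass_da):
    """VM in A^3/Da: cell volume per dalton of protein in the cell."""
    return cell_volume / (n_symmetry_ops * n_copies * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from VM (standard 1.23 A^3/Da constant)."""
    return 1.0 - 1.23 / vm


# Assumed stand-in for the native cell: Au derivative cell from Table 6.7.
volume = monoclinic_cell_volume(58.01, 151.46, 63.79, 117.13)
COMPLEX_MASS_DA = 54500.0   # 54.5 kDa, as quoted in the text
P21_SYMMETRY_OPS = 2        # two general positions in P2(1)

for copies in (1, 2, 3):
    vm = matthews_coefficient(volume, P21_SYMMETRY_OPS, copies, COMPLEX_MASS_DA)
    print(f"{copies} complex(es): VM = {vm:.1f} A^3/Da, "
          f"solvent = {100 * solvent_fraction(vm):.0f}%")
```

With this cell the first two solvent contents match the quoted 73% and 46%, and the third comes out near 19%, close to the quoted 20%; the small difference presumably reflects the native cell or rounding. The same relation gives about 53% solvent for the SubTY VM of 2.6 Å³/Da reported in Section 6.4.1.4.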

Table 6.7 Heavy atom data and statistics for the SubHal structure
solution. Values in parentheses are for the highest-resolution shell.
Au derivative data were collected on a 300 mm MAR imaging plate with
a Rigaku rotating anode (3 kW) and Yale mirror focusing optics. Pt
derivative data were collected on a 345 mm MAR imaging plate with a
Rigaku rotating anode (5 kW) and Osmic mirror focusing optics.

Heavy atom derivative                     K₂Au(CN)₂                K₂PtCl₄

Wavelength (Å)                            1.5418                   1.5418
Temperature (K)                           120                      120
Salt concentration                        10 mM                    5 mM
Time of soaking                           96 hours                 192 hours
No. oscillation images ×
  oscillation range (°)                   321 × 0.5                151 × 0.5
Mosaicity (°)                             0.22                     1.12
No. observations                          380458                   146954
No. unique reflections                    63202                    16297
Diffraction limits (Å)                    20.0–2.03 (2.09–2.03)    29.4–3.00 (3.11–3.00)
Rsym                                      0.077 (0.239)            0.060 (0.166)
Data completeness (%)                     99.8 (98.1)              80.9 (61.2)
Space group                               P2₁                      P2₁
Unit cell parameters
  a, b, c (Å)                             58.01, 151.46, 63.79     58.27, 152.00, 64.06
  β (°)                                   117.13                   116.99
No. heavy atom sites per AU
  for phasing                             3                        2
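A few derived quantities help when comparing the two derivative data sets in Table 6.7: the average multiplicity (observations per unique reflection), the total rotation range covered, and the scattering angle 2θ at the resolution limit from Bragg's law, λ = 2d sin θ. The sketch below computes them from the tabulated values; the function names are chosen for this example only.

```python
import math


def multiplicity(n_observations, n_unique):
    """Average number of measurements per unique reflection."""
    return n_observations / n_unique


def total_rotation_deg(n_images, oscillation_deg):
    """Total crystal rotation covered by the data set."""
    return n_images * oscillation_deg


def two_theta_deg(d_spacing, wavelength):
    """Scattering angle 2-theta (degrees) for a given d-spacing (Bragg's law)."""
    return 2.0 * math.degrees(math.asin(wavelength / (2.0 * d_spacing)))


# Values taken from Table 6.7.
derivatives = {
    "Au": {"obs": 380458, "unique": 63202, "images": 321, "d_min": 2.03},
    "Pt": {"obs": 146954, "unique": 16297, "images": 151, "d_min": 3.00},
}
WAVELENGTH = 1.5418  # Angstrom, CuK-alpha

for name, d in derivatives.items():
    print(f"{name}: multiplicity {multiplicity(d['obs'], d['unique']):.1f}, "
          f"rotation {total_rotation_deg(d['images'], 0.5):.1f} deg, "
          f"2theta at d_min {two_theta_deg(d['d_min'], WAVELENGTH):.1f} deg")
```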

The initial positions of the gold and platinum atoms were
determined independently with the program SOLVE [85]. The heavy
atom parameters were further refined and additional substructure
sites were searched for (MLPHARE [86]). A total of three gold and
two platinum sites were found. However, the resulting phases after
improvement by density modification (using the program DM [87])
to 3 Å or to 2 Å resolution did not yield electron density maps of
good enough quality for manual interpretation or automatic model
building. Nevertheless, two copies of SubTY could be readily placed
into the 3 Å resolution MIR map by the phased MR procedure
in MOLREP [88]. Noncrystallographic symmetry further improved
the MIR map. Automatic model building and refinement starting
from this NCS-averaged map led to a first model with 188 of the
expected 1,000 residues. The same procedure, but starting from
the NCS-averaged MIR phases combined with those
from the MR model, led to approximately 900 of the 1,000 residues
being built automatically. Further refinement (REFMAC5 [82]) and
manual rebuilding (Xfit, XtalView [81]) led to a complete model of
the complex with a final R/Rfree of 0.125/0.181 (Table 6.3). Here
and elsewhere in the text the letters A and C denote the chains of
the two independent enzyme molecules in the asymmetric unit and
B and D those of the two CI2A inhibitors.
The quality of the refined model was assessed using SFCHECK
[89] and PROCHECK [34] (Table 6.3), and subsequent structure
rebuilding and analysis were performed in COOT [38]. Five residues
lie in the generously allowed region of the Ramachandran plot [90],
and all are well ordered with excellent electron density. Asp34 in
both molecules A and C forms a tight turn between strands β1 and
β2, and all its hydrogen bond donors and acceptors are involved
in strong hydrogen bonds with neighboring residues within the A
chain. In molecule A, the side chain carboxyl group forms hydrogen bonds to
the main chain amino groups of His43 and Ser232. Ala80 in both
molecules A and C lies in a very tight loop, which spans residues
79–83, stabilized by a number of short hydrogen bonds, one of
which is Nζ(LysA83)···O(AlaA80) (2.75 Å). His243 in molecule C lies on the
surface of the protein and interacts directly with a symmetry-related
molecule by a nonbonded close contact, Cδ1(LeuB40)–O(HisC243) (3.41 Å).
The carbonyl oxygen is forced by crystal packing to a less favorable
conformation. In molecule A, His243 is not involved in any direct
contact with symmetry-related molecules and lies in the allowed
region of the plot.

6.4.2.4.2 Unliganded SubHal


The quality of the crystals of the unliganded form of SubHal
from the initial screening could not be easily improved because
of autoproteolysis. Therefore X-ray data were collected from a
fragment of a crystal from the initial crystallization screen. This
crystal suffered from relatively rapid radiation damage, and it was
not possible to record complete data (Table 6.3). However, the data
were sufficient to allow refinement of the structure.
Enzyme coordinates from the refined SubHal:CI2A complex were
used as a search model for MR. The data were indexed in space
group P2₁ (Table 6.3), and MR clearly indicated the positions of two
enzyme molecules in the asymmetric unit. The model was refined
(REFMAC5 [82]) with minor manual rebuilding (Xfit, XtalView [81])
to a final R/Rfree of 0.215/0.303 (Table 6.3). Again we refer to the
two molecules as A and B.
Asp34 lies in the disallowed region of the Ramachandran plot
in both molecules. Here, small changes in the local environment
compared to the complex structure cause an even more pronounced
distortion of the protein main chain torsion angles. In
molecule A, residues Ala80, Tyr225, and Ser329 lie in the generously
allowed region of the Ramachandran plot. Ala80 was already
discussed for the complex structure above. The loop that forms
the tight turn of residues 223–226 differs by several degrees
between chain A and chain B, which distorts the main chain torsion
angles of Tyr225. SerA329 forms a tight turn and appears in two
alternative conformations, whereas in chain B this residue interacts
with symmetry-related molecules in a single fixed conformation.
In chain A, which lacks these crystal packing interactions, the
equilibrium between the alternative conformations is shifted toward the one
in which the side chain OH hydrogen bonds to O(ThrA328). This position of the
hydroxyl oxygen creates an unfavorable contact with the carbonyl
oxygen of Ser329 in molecule A. The (φ, ψ) torsion angles of this
residue are then affected by this close contact.
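The main chain torsion angles plotted in a Ramachandran diagram are dihedral angles over four consecutive backbone atoms: φ over C(i−1), N, Cα, C and ψ over N, Cα, C, N(i+1). The sketch below shows one common way to compute such a dihedral from Cartesian coordinates. It is not the procedure used by PROCHECK or COOT, and the helper names and the test coordinates are illustrative assumptions.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, -180..180) defined by four points."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)   # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)   # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """Backbone phi of residue i from C(i-1), N(i), CA(i), C(i)."""
    return dihedral_deg(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """Backbone psi of residue i from N(i), CA(i), C(i), N(i+1)."""
    return dihedral_deg(n, ca, c, n_next)


# Simple geometric check with made-up points: a right-angle twist.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applied to the deposited coordinates, the same function could be used to compare the angles of a residue such as Asp34 between chains, although validation programs additionally classify each (φ, ψ) pair against empirical region boundaries.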

6.4.3 Protease Assays


The pNA assay of protease activity with Suc-AAPF-pNA (Bachem L-1400)
as substrate was carried out at room temperature (25 °C) in assay
buffers consisting of 100 mM succinic acid, 100 mM HEPES, 100 mM
CHES, 100 mM CABS, 1 mM CaCl₂, 150 mM KCl, and 0.01% Triton
X-100, adjusted to pH values of 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0,
10.0, and 11.0 with HCl or NaOH. Twenty microliters of protease
(diluted in 0.01% Triton X-100) was mixed with 100 μL of assay
buffer. The assay was started by adding 100 μL of pNA substrate (50
mg dissolved in 1.0 mL of dimethyl sulfoxide (DMSO) and further
diluted 45× with 0.01% Triton X-100). The increase in OD405 was
monitored as a measure of the protease activity.
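The volumes given above fix the substrate concentration in the assay, and the activity readout is the slope of OD405 against time. The sketch below works through both. It assumes that "diluted 45×" means a 45-fold final dilution and that the molar mass of Suc-AAPF-pNA is about 624.6 g/mol (a value not given in the text); the time course in the usage example is invented for illustration.

```python
ASSUMED_SUBSTRATE_MW = 624.6   # g/mol for Suc-AAPF-pNA; assumption, not from the text


def final_substrate_mg_per_ml(stock_mg_per_ml=50.0, dilution_fold=45.0,
                              substrate_ul=100.0, buffer_ul=100.0, enzyme_ul=20.0):
    """Substrate concentration in the well after all components are mixed."""
    working = stock_mg_per_ml / dilution_fold
    total_ul = substrate_ul + buffer_ul + enzyme_ul
    return working * substrate_ul / total_ul


def least_squares_slope(times, readings):
    """Slope of readings versus time by ordinary least squares."""
    if len(times) != len(readings) or len(times) < 2:
        raise ValueError("need at least two paired points")
    t_mean = sum(times) / len(times)
    y_mean = sum(readings) / len(readings)
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, readings))
    return sxy / sxx


substrate_mg_ml = final_substrate_mg_per_ml()
substrate_mm = substrate_mg_ml / ASSUMED_SUBSTRATE_MW * 1000.0
print(f"substrate in assay: {substrate_mg_ml:.3f} mg/mL (~{substrate_mm:.2f} mM)")

# Hypothetical OD405 time course (seconds, absorbance), for illustration only.
rate = least_squares_slope([0, 30, 60, 90], [0.10, 0.16, 0.22, 0.28])
print(f"initial rate: {rate:.4f} OD405 units per second")
```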

The Protazyme AK assay with a Protazyme AK tablet (cross-linked
and dyed casein; from Megazyme) as substrate was carried out at
controlled temperature in the following buffers: 100 mM succinic
acid, 100 mM HEPES, 100 mM CHES, 100 mM CABS, 1 mM CaCl₂,
150 mM KCl, and 0.01% Triton X-100, adjusted to pH values
of 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, and 11.0 with HCl or NaOH.
A Protazyme AK tablet was suspended in 2.0 mL of 0.01% Triton X-100
by gentle stirring. Five hundred microliters of this suspension
and 500 μL of assay buffer were mixed and placed on ice. Twenty
microliters of protease sample (diluted in 0.01% Triton X-100)
was added. The assay was initiated by transferring the mixture
to an Eppendorf thermomixer preset to the assay temperature.
The 15-minute incubation at the highest shaking rate (1400 rpm)
was stopped by transferring the sample back to the ice bath. After
several minutes of centrifugation at 4 °C, 200 μL of supernatant was
transferred to a microtiter plate. OD650 was read as a measure of
protease activity. A buffer blank (buffer instead of
enzyme) was included in the assay.

6.4.4 pH Stability
To determine the pH-stability profiles, the proteases were
diluted 10× in the assay buffers and incubated for 2 h at 37 °C. The
samples were then transferred to pH 9 by dilution in the pH 9 assay
buffer before being assayed for residual activity.
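A pH-stability profile is usually presented as residual activity in percent. The text does not state which reference value was used, so the sketch below assumes normalization to the highest activity in the series unless an explicit reference (for example an unincubated control) is supplied. The function name and the activity values are hypothetical.

```python
def residual_activity_percent(activity_by_ph, reference=None):
    """Express activities measured after incubation as percent of a reference.

    activity_by_ph maps incubation pH to the activity measured at pH 9.
    If no reference is given, the highest activity in the series is used
    (an assumption made for this example).
    """
    if not activity_by_ph:
        raise ValueError("no activities supplied")
    if reference is None:
        reference = max(activity_by_ph.values())
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    return {ph: 100.0 * value / reference for ph, value in activity_by_ph.items()}


# Hypothetical rates (OD405 units per second) after 2 h at 37 degrees C.
measured = {3.0: 0.0004, 5.0: 0.0015, 7.0: 0.0020, 9.0: 0.0019, 11.0: 0.0010}
profile = residual_activity_percent(measured)
for ph in sorted(profile):
    print(f"pH {ph:4.1f}: {profile[ph]:5.1f}% residual activity")
```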

6.4.5 Data Deposition


The X-ray data and coordinates have been deposited in the PDB, with
deposition codes 5FFN (SubTY:CI2A complex), 5FBZ (SubHal:CI2A
complex), and 5FAX (unliganded SubHal).

Acknowledgments

JD acknowledges support by the project BIOCEV, Biotechnology
and Biomedicine Centre of the Academy of Sciences and Charles
University (CZ.1.05/1.1.00/02.0109); from the European Regional
Development Fund; and by the Ministry of Education, Youth and
Sports of the Czech Republic (Grant no. LG14009). We thank the staff
of the ESRF for the provision of data collection facilities at beam line
ID14.2.

References

1. Siezen, R. J., and Leunissen, J. A. M. (1997). Subtilases: the superfamily
of subtilisin-like serine proteases, Protein Sci., 6, pp. 501–523.
2. Rawlings, N. D., Morton, F. R., and Barrett, A. J. (2006). MEROPS: the
peptidase database, Nucleic Acids Res., 34, pp. D270–D272.
3. Ruan, B., London, V., Fisher, K. E., Gallagher, D. T., and Bryan, P. N. (2008).
Engineering substrate preference in subtilisin: structural and kinetic
analysis of a specificity mutant, Biochemistry, 47, pp. 6628–6636.
4. Bryan, P. N. (2000). Protein engineering of subtilisin, Biochim. Biophys.
Acta, 1543, pp. 203–222.
5. Pahler, A., Banerjee, A., Dattagupta, J. K., Fujiwara, T., Lindner, K., Pal, G.
P., Suck, D., Weber, G., and Saenger, W. (1984). 3-dimensional structure
of fungal proteinase-K reveals similarity to bacterial subtilisin, EMBO J.,
3, pp. 1311–1314.
6. Gros, P., Betzel, Ch., Dauter, Z., Wilson, K. S., and Hol, W. G. J. (1989).
Molecular dynamics refinement of a thermitase-eglin-c complex at 1.98
Å resolution and comparison of two crystal forms that differ in calcium
content, J. Mol. Biol., 210, pp. 347–367.
7. Wright, C. S., Alden, R. A., and Kraut, J. (1969). Structure of subtilisin
BPN′ at 2.5 Å resolution, Nature, 221, pp. 235–242.
8. Drenth, J., Hol, W. G. J., Jansonius, J. N., and Koekoek, R. (1972).
Subtilisin NOVO: the three-dimensional structure and its comparison
with subtilisin BPN′, Eur. J. Biochem., 26, pp. 177–181.
9. Krissinel, E., and Henrick, K. (2004). Secondary-structure matching
(SSM), a new tool for fast protein structure alignment in three
dimensions, Acta Crystallogr. D, 60, pp. 2256–2268.
10. Nicholas, K. B., Nicholas, H. B., Jr., and Deerfield, D. W., II. (1997). GeneDoc:
analysis and visualization of genetic variation, EMBnet.news, 4, p. 14.
11. Gouet, P., Courcelle, E., Stuart, D. I., and Metoz, F. (1999). ESPript:
multiple sequence alignments in PostScript, Bioinformatics, 15, pp.
305–308.
12. Radisky, E. S., and Koshland, D. E., Jr. (2002). A clogged gutter mechanism
for protease inhibitors, Proc. Natl. Acad. Sci. U S A, 99, pp. 10316–10321.
13. Dauter, Z., Betzel, C., Genov, N., Pipon, N., and Wilson, K. S. (1991).
Complex between the subtilisin from a mesophilic bacterium and the
leech inhibitor eglin-C, Acta Crystallogr. B, 47, pp. 707–730.
14. Jain, S. C., Shinde, U., Li, Y., Inouye, M., and Berman, H. M. (1998).
The crystal structure of an autoprocessed Ser221Cys-subtilisin E-
propeptide complex at 2.0 Å resolution, J. Mol. Biol., 284, pp. 137–
144.
15. Bode, W., Papamokos, E., and Musil, D. (1987). The high-resolution X-ray
crystal structure of the complex formed between subtilisin Carlsberg
and eglin c, an elastase inhibitor from the leech Hirudo medicinalis.
Structural analysis, subtilisin structure and interface geometry, Eur. J.
Biochem., 166, pp. 673–692.
16. Betzel, C., Klupsch, S., Papendorf, G., Hastrup, S., Branner, S., and Wilson,
K. S. (1992). Crystal structure of the alkaline proteinase savinase from
Bacillus lentus at 1.4 Å resolution, J. Mol. Biol., 223, pp. 427–445.
17. Yamane, T., Kani, T., Hatanaka, T., Suzuki, A., Ashida, T., Kobayashi, T., Ito,
S., and Yamashita, O. (1995). Structure of a new alkaline serine protease
(M-protease) from Bacillus sp. KSM-K16, Acta Crystallogr. D, 51, pp.
199–206.
18. Almog, O., Gonzalez, A., Klein, D., Greenblatt, H. M., Braun, S., and
Shoham, G. (2003). The 0.93 Å crystal structure of sphericase: a calcium-
loaded serine protease from Bacillus sphaericus, J. Mol. Biol., 332, pp.
1071–1082.
19. Betzel, C., Gourinath, S., Kumar, P., Kaur, P., Perbandt, M., Eschenburg,
S., and Singh, T. P. (2001). Structure of a serine protease proteinase K
from Tritirachium album limber at 0.98 Å resolution, Biochemistry, 40,
pp. 3080–3088.
20. Arnorsdottir, J., Kristjansson, M. M., and Ficner, R. (2005). Crystal
structure of a subtilisin-like serine proteinase from a psychrotrophic
Vibrio species reveals structural aspects of cold adaptation, FEBS J., 272,
pp. 832–845.
21. Nonaka, T., Fujihashi, M., Kita, A., Saeki, K., Ito, S., Horikoshi, K., and Miki,
M. (2004). The crystal structure of an oxidatively stable subtilisin-like
alkaline serine protease, KP-43, with a C-terminal β-barrel domain, J.
Biol. Chem., 279, pp. 47344–47351.
22. Henrich, S., Cameron, A., Bourenkov, G. P., Kiefersauer, R., Huber, R.,
Lindberg, I., Bode, W., and Than, M. E. (2003). The crystal structure of the
proprotein processing proteinase furin explains its stringent specificity,
Nat. Struct. Biol., 10, pp. 520–526.
23. Holyoak, T., Wilson, M. A., Fenn, T. D., Kettner, C. A., Petsko, G. A., Fuller,
R. S., and Ringe, D. (2003). 2.4 Å resolution crystal structure of the
prototypical hormone-processing protease Kex2 in complex with an
Ala-Lys-Arg boronic acid inhibitor, Biochemistry, 42, pp. 6709–6718.
24. Schechter, I., and Berger, A. (1967). On the size of the active site in
proteases. I. Papain, Biochem. Biophys. Res. Commun., 27, pp. 157–162.
25. Svendsen, I., Martin, B., and Jonassen, I. (1980). Characteristics of
Hiproly barley III. Amino acid sequences of two lysine-rich proteins,
Carlsberg Res. Commun., 45, pp. 79–85.
26. Bryan, P. N., Rollence, M. L., Pantoliano, M. W., Wood, J., Finzel, B.
C., Gilliland, G. L., Howard, A. J., and Poulos, T. L. (1986). Proteases
of enhanced stability: characterization of a thermostable variant of
subtilisin, Proteins, 1, pp. 326–334.
27. Pantoliano, M. W., Whitlow, M., Wood, J. F., Rollence, M. L., Finzel, B. C.,
Gilliland, G. L., Poulos, T. L., and Bryan, P. N. (1988). The engineering
of binding affinity at metal ion binding sites for the stabilization of
proteins: subtilisin as a test case, Biochemistry, 27, pp. 8311–8317.
28. Pantoliano, M. W., Whitlow, M., Wood, J. F., Dodd, S. W., Hardman, K.
D., Rollence, M. L., and Bryan, P. N. (1989). Large increases in general
stability for subtilisin BPN′ through incremental changes in the free
energy of unfolding, Biochemistry, 28, pp. 7205–7213.
29. Alexander, P. A., Ruan, B., and Bryan, P. N. (2001). Cation-dependent
stability of subtilisin, Biochemistry, 40, pp. 10634–10639.
30. Alexander, P. A., Ruan, B., Strausberg, S. L., and Bryan, P. N. (2001).
Stabilizing mutations and calcium-dependent stability of subtilisin,
Biochemistry, 40, pp. 10640–10644.
31. Gallagher, T., Bryan, P., and Gilliland, G. L. (1993). Calcium-independent
subtilisin by design, Proteins, 16, pp. 205–213.
32. Kuhn, P., Knapp, M., Soltis, S. M., Ganshaw, G., Thoene, M., and Bott,
R. (1998). The 0.78 Å structure of a serine protease: Bacillus lentus
subtilisin, Biochemistry, 37, pp. 13446–13452.
33. Almog, O., González, A., Godin, N., de Leeuw, M., Mekel, M. J., Klein, D.,
Braun, S., Shoham, G., and Walter, R. L. (2009). The crystal structures of
the psychrophilic subtilisin S41 and the mesophilic subtilisin Sph reveal
the same calcium-loaded state, Proteins, 74, pp. 489–496.
34. Laskowski, R. A., MacArthur, M. W., Moss, D. S., and Thornton, J. M. (1993).
PROCHECK: a program to check the stereochemical quality of protein
structures, J. Appl. Crystallogr., 26, pp. 283–291.
35. Hooft, R. W. W., Vriend, G., Sander, C., and Abola, E. E. (1996). Errors in
protein structures, Nature, 381, p. 272.
36. Murzin, A. G., Brenner, S. E., Hubbard, T., and Chothia, C. (1995). SCOP:
a structural classification of proteins database for the investigation of
sequences and structures, J. Mol. Biol., 247, pp. 536–540.
37. Holm, L., and Sander, C. (1993). Protein-structure comparison by
alignment of distance matrices, J. Mol. Biol., 233, pp. 123–138.
38. Emsley, P., Lohkamp, B., Scott, W., and Cowtan, K. (2010). Features and
development of Coot, Acta Crystallogr. D, 66, pp. 486–501.
39. Collaborative Computational Project, Number 4. (1994). The CCP4 suite:
programs for protein crystallography, Acta Crystallogr. D, 50, pp. 760–
763.
40. Arnone, A. (1974). X-ray studies of the interaction of CO2 with human
deoxyhaemoglobin, Nature, 247, pp. 143–145.
41. Fauman, E. B., Rutenber, E. E., Maley, G. F., Maley, F., and Stroud, R. M.
(1994). Water-mediated substrate/product discrimination: the product
complex of thymidylate synthase at 1.83 Å, Biochemistry, 33, pp. 1502–
1511.
42. Papiz, M. Z., Prince, S. M., Howard, T., Cogdell, R. J., and Isaacs, N.
W. (2003). The structure and thermal motion of the B800–850 LH2
complex from Rps. acidophila at 2.0 Å resolution and 100 K: new
structural features and functionally relevant motions, J. Mol. Biol., 326,
pp. 1523–1538.
43. Maeda, H., Mizutani, O., Yamagata, Y., Ichishima, E., and Nakajima, T.
(2001). Alkaline-resistance model of subtilisin ALP I, a novel alkaline
subtilisin, J. Biochem., 129, pp. 675–682.
44. McNicholas, S., Potterton, E., Wilson, K. S., and Noble, M. E. M. (2011).
Presenting your structures: the CCP4mg molecular-graphics software,
Acta Crystallogr. D, 67, pp. 386–394.
45. Svendsen, I., Jonassen, I., Hejgaard, J., and Boisen, S. (1980). Amino-
acid sequence homology between a serine protease inhibitor from
barley and potato inhibitor 1, Carlsberg Res. Commun., 45, pp. 389–
395.
46. Heinz, D. W., Priestle, J. P., Rahuel, J., Wilson, K. S., and Grütter, M.
G. (1991). Refined crystal structures of subtilisin Novo in complex
with wild-type and two mutant eglins. Comparison with other serine
proteinase inhibitor complexes, J. Mol. Biol., 217, pp. 353–371.
47. Goddette, D. W., Paech, C., Yang, S. S., Mielenz, J. R., Bystroff, C., Wilke, M.
E., and Fletterick, R. J. (1992). The crystal structure of the Bacillus lentus
alkaline protease, subtilisin BL, at 1.4 Å resolution, J. Mol. Biol., 228, pp.
580–595.
48. Kidd, R. D., Yennawar, H. P., Sears, P., Wong, C. H., and Farber, G. K. (1996).
A weak calcium binding site in subtilisin BPN′ has a dramatic effect on
protein stability, J. Am. Chem. Soc., 118, pp. 1645–1650.
49. Kidd, R. D., Sears, P., Huang, D. H., Witte, K., Wong, C. H., and Farber, G.
K. (1999). Breaking the low barrier hydrogen bond in a serine protease,
Protein Sci., 8, pp. 410–417.
50. Dinakarpandian, D., Shenoy, B. C., Hilvert, D., McRee, D. E., McTigue, M.,
and Carey, P. R. (1999). Electric fields in active sites: substrate switching
from null to strong fields in thiol- and selenol-subtilisins, Biochemistry,
38, pp. 6659–6667.
51. Pan, X., Bott, R., and Glatz, C. E. (2003). Subtilisin surface properties and
crystal growth kinetics, J. Cryst. Growth, 254, pp. 492–502.
52. Neidhart, D. J., and Petsko, G. A. (1988). The refined crystal structure of
subtilisin Carlsberg at 2.5 Å resolution, Protein Eng., 2, pp. 271–276.
53. McPhalen, C. A., and James, M. N. (1988). Structural comparison of
two serine proteinase-protein inhibitor complexes: eglin-c-subtilisin
Carlsberg and CI-2-subtilisin Novo, Biochemistry, 27, pp. 6582–6598.
54. Fitzpatrick, P. A., Steinmetz, A. C., Ringe, D., and Klibanov, A. M. (1993).
Enzyme crystal structure in a neat organic solvent, Proc. Natl. Acad. Sci.
U S A, 90, pp. 8653–8657.
55. Fitzpatrick, P. A., Ringe, D., and Klibanov, A. M. (1994). X-ray crystal
structure of cross-linked subtilisin Carlsberg in water vs. acetonitrile,
Biochem. Biophys. Res. Commun., 198, pp. 675–681.
56. Schmitke, J. L., Stern, L. J., and Klibanov, A. M. (1997). The crystal structure
of subtilisin Carlsberg in anhydrous dioxane and its comparison
with those in water and acetonitrile, Proc. Natl. Acad. Sci. U S A, 94, pp.
4250–4255.
57. Eschenburg, S., Genov, N., Peters, K., Fittkau, S., Stoeva, S., Wilson, K. S.,
and Betzel, C. (1998). Crystal structure of subtilisin DY, a random mutant
of subtilisin Carlsberg, Eur. J. Biochem., 257, pp. 309–318.
58. Prangé, T., Schiltz, M., Pernot, L., Colloc’h, N., Longhi, S., Bourguet, W., and
Fourme, R. (1998). Exploring hydrophobic sites in proteins with xenon
or krypton, Proteins, 30, pp. 61–73.
59. Schmitke, J. L., Stern, L. J., and Klibanov, A. M. (1998). Comparison of
x-ray crystal structures of an acyl-enzyme intermediate of subtilisin
Carlsberg formed in anhydrous acetonitrile and in water, Proc. Natl.
Acad. Sci. U S A, 95, pp. 12918–12923.
60. Stoll, V. S., Eger, B. T., Hynes, R. C., Martichonok, V., Jones, J. B., and Pai, E.
F. (1998). Differences in binding modes of enantiomers of 1-acetamido
boronic acid based protease inhibitors: crystal structures of gamma-
chymotrypsin and subtilisin Carlsberg complexes, Biochemistry, 37, pp.
451–462.
61. Almog, O., Gallagher, D. T., Ladner, J. E., Strausberg, S., Alexander, P.,
Bryan, P., and Gilliland, G. L. (2002). Structural basis of thermostability.
Analysis of stabilizing mutations in subtilisin BPN′, J. Biol. Chem., 277,
pp. 27553–27558.
62. Barrette-Ng, I. H., Ng, K. K., Cherney, M. M., Pearce, G., Ryan, C. A., and
James, M. N. (2003). Structural basis of inhibition revealed by a 1:2
complex of the two-headed tomato inhibitor-II and subtilisin Carlsberg,
J. Biol. Chem., 278, pp. 24062–24071.
63. Horn, J. R., Ramaswamy, S., and Murphy, K. P. (2003). Structure and
energetics of protein-protein interactions: the role of conformational
heterogeneity in OMTKY3 binding to serine proteases, J. Mol. Biol., 331,
pp. 497–508.
64. Radisky, E. S., Kwan, G., Lu, C. J. K., and Koshland, D. E., Jr. (2004).
Binding, proteolytic, and crystallographic analyses of mutations at
the protease-inhibitor interface of the subtilisin BPN′/chymotrypsin
inhibitor 2 complex, Biochemistry, 43, pp. 13648–13656.
65. Maynes, J. T., Cherney, M. M., Qasim, M. A., Laskowski, M., Jr., and James,
M. N. (2005). Structure of the subtilisin Carlsberg-OMTKY3 complex
reveals two different ovomucoid conformations, Acta Crystallogr. D, 61,
pp. 580–588.
66. Radisky, E. S., Lu, C. J., Kwan, G., and Koshland, D. E., Jr. (2005). Role of
the intramolecular hydrogen bond network in the inhibitory power of
chymotrypsin inhibitor 2, Biochemistry, 44, pp. 6823–6830.
67. Takeuchi, Y., Noguchi, S., Satow, Y., Kojima, S., Kumagai, I., Miura, K.,
Nakamura, K. T., and Mitsui, Y. (1991). Molecular recognition at the active site
of subtilisin BPN′: crystallographic studies using genetically engineered
proteinaceous inhibitor SSI (Streptomyces subtilisin inhibitor), Protein
Eng., 4, pp. 501–508.
68. Steinmetz, A. C., Demuth, H. U., and Ringe, D. (1994). Inactivation of
subtilisin Carlsberg by N-((tert-butoxycarbonyl)alanylprolylphenylalanyl)-
O-benzoylhydroxylamine: formation of a covalent enzyme-inhibitor
linkage in the form of a carbamate derivative, Biochemistry, 33, pp.
10535–10544.
69. Gallagher, T., Gilliland, G., Wang, L., and Bryan, P. (1995). The
prosegment-subtilisin BPN′ complex: crystal structure of a specific
“foldase,” Structure, 3, pp. 907–914.
70. Gallagher, T., Oliver, J., Bott, R., Betzel, C., and Gilliland, G. L. (1996).
Subtilisin BPN′ at 1.6 Å resolution: analysis for discrete disorder and
comparison of crystal forms, Acta Crystallogr. D, 52, pp. 1125–1135.
71. Almog, O., Gallagher, T., Tordova, M., Hoskins, J., Bryan, P., and Gilliland,
G. L. (1998). Crystal structure of calcium-independent subtilisin BPN′
with restored thermal stability folded without the prodomain, Proteins,
31, pp. 21–32.
72. Gallagher, D. T., Pan, Q. W., and Gilliland, G. L. (1998). Mechanism of ionic
strength dependence of crystal growth rates in a subtilisin variant, J.
Cryst. Growth, 193, pp. 665–673.
73. Graycar, T., Knapp, M., Ganshaw, G., Dauberman, J., and Bott, R. (1999).
Engineered Bacillus lentus subtilisins having altered flexibility, J. Mol.
Biol., 292, pp. 97–109.
74. Pan, Q., and Gallagher, D. T. (2000). Probing protein interaction
chemistry through crystal growth: structure, mutation, and mechanism
in subtilisin s88, J. Cryst. Growth, 212, pp. 555–563.
75. Bott, R. R., Chan, G., Domingo, B., Ganshaw, G., Hsia, C. Y., Knapp, M., and
Murray, C. J. (2003). Do enzymes change the nature of transition states?
Mapping the transition state for general acid-base catalysis of a serine
protease, Biochemistry, 42, pp. 10545–10553.
76. Dubnau, D., and Davidoff-Abelson, R. (1971). Fate of transforming
DNA following uptake by competent Bacillus subtilis: I. Formation and
properties of the donor-recipient complex, J. Mol. Biol., 56, pp. 209–221.
77. Outtrup, H., Dampmann, C., and Lindegård, P. (1992). Patent
WO92/17577.
78. Svendsen, A., and Draborg, H. (2004). Patent WO2004/067737.
79. Otwinowski, Z., and Minor, W. (1997). Processing of X-ray diffraction
data collected in oscillation mode, Methods Enzym., 276, pp. 307–326.
80. Navaza, J., and Saludjian, P. (1994). AMoRe: an automated molecular
replacement program package, Acta Crystallogr. A, 50, pp. 157–163.
81. McRee, D. E. (1999). XtalView/Xfit: a versatile program for manipulating
atomic coordinates and electron density, J. Struct. Biol., 125, pp. 156–
165.
82. Murshudov, G. N., Vagin, A. A., and Dodson, E. J. (1997). Refinement of
macromolecular structures by the maximum-likelihood method, Acta
Crystallogr. D, 53, pp. 240–255.
83. Svendsen, A., and Minning, S. (2004). Patent WO2004/083362.
84. Jancarik, J., and Kim, S. H. J. (1991). Sparse matrix sampling: a screening
method for crystallization of proteins, J. Appl. Cryst., 24, pp. 409–411.
85. Terwilliger, T. C., and Berendzen, J. (1999). Automated MAD and MIR
structure solution, Acta Crystallogr. D, 55, pp. 849–861.
86. Otwinowski, Z. (1991). Daresbury Study Weekend Proceedings, pp. 80–
86.
87. Cowtan, K. (1994). Joint CCP4 and ESF-EACBM Newsletter on Protein
Crystallography, 31, pp. 34–38.
88. Vagin, A., and Teplyakov, A. (1997). MOLREP: an automated program for
molecular replacement, J. Appl. Cryst., 30, pp. 1022–1025.
89. Vaguine, A. A., Richelle, J., and Wodak, S. J. (1999). SFCHECK: a unified set
of procedures for evaluating the quality of macromolecular structure-
factor data and their agreement with atomic model, Acta Crystallogr. D,
55, pp. 191–205.
90. Ramachandran, G. N., Ramakrishnan, C., and Sasisekharan, V. (1963).
Stereochemistry of polypeptide chain conformations, J. Mol. Biol., 7, pp.
95–99.
March 23, 2016 12:32 PSP Book - 9in x 6in 07-Allan-Svendsen-c07

Chapter 7

Structure and Functional Roles of Surface Binding Sites in Amylolytic Enzymes

Darrell Cockburn and Birte Svensson


Enzyme and Protein Chemistry, Department of Systems Biology,
Technical University of Denmark, Kongens Lyngby DK 2800, Denmark
bis@bio.dtu.dk

7.1 Introduction

As the continued use of fossil fuels becomes an ever more
untenable proposition, it is critical that alternative, sustainable
resources be developed to replace them. Plant biomass is one of
the leading candidates to fill this role, and starch is the second-
largest component of this material (after cellulose) and perhaps
the most easily utilizable. Additionally, starch is a very important
part of the food supply, making up a large portion of human calorie
consumption.
Starch is composed solely of the monosaccharide glucose,
present in two main polymers, the essentially linear α-1,4-linked
amylose and the branched amylopectin comprising an α-1,4-linked

Understanding Enzymes: Function, Design, Engineering, and Analysis


Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com
backbone of chains connected by periodic α-1,6 branch points.
Amylopectin is the dominant component of starch, constituting
about 75%, with some variation depending on the botanical origin
[1]. It typically forms double helices and is organized into alternating
layers of crystalline and amorphous regions in the starch granules.
The organization of the amylose is less well understood, though data
on plant mutants producing more or less amylose demonstrate its
importance for overall starch granule morphology [2]. Studies of
the granule surface using atomic force microscopy have revealed
blocklet structures, punctuated by hair-like extensions [3] that leave
the starch granule looking like a hairy billiard ball [4].
Given its prominent position within natural systems, generally
serving for energy storage, a suite of enzymes has evolved to
efficiently degrade starch in plants. This typically includes the endo-
acting α-amylases, which, in conjunction with the exo-acting β-
amylases and α-glucosidases, hydrolyze the α-1,4 linkages. The
endo-acting isoamylases and limit dextrinases (also referred to as
plant pullulanases) are responsible for cleaving the α-1,6 linkages
and serve a debranching function. Within the CAZy classification
system (www.cazy.org) [5] the β-amylases belong to the family
GH14, while the α-glucosidases can be found in the family GH31 as
well as GH13. The α-amylases, limit dextrinases, and isoamylases are
organized into different subfamilies of the family GH13. A similar
complement of enzymes, albeit by no means universal, is also typical
for nonplant starch-degrading organisms. Some organisms contain
α-amylases from GH57 and other families [6], while glucoamylase
is a prominent exo-acting, glucose-producing enzyme of the family
GH15 primarily found in fungi. Among the debranching enzymes,
pullulanases and isoamylases can act directly on amylopectin, as
opposed to limit dextrinase, which preferentially hydrolyzes shorter
branched dextrins, also called α-limit dextrins. In the bacterium
Bacteroides thetaiotaomicron a set of proteins is encoded as
colocalized genes constituting the starch utilization system (Sus)
and allows for efficient degradation of starch [7]. In conjunction
with a TonB-dependent porin and a series of binding proteins, an
outer membrane-bound α-amylase (called SusG) serves as the only
extracellular enzyme. Once a released oligosaccharide arrives in the
periplasm, a neopullulanase and an α-glucosidase [8, 9] are present
to complete the digestion of the products of the α-amylase for entry
into the central metabolic pathways of the organism.
The binding proteins of B. thetaiotaomicron (SusE and SusF) [10]
exemplify a common theme in enzymes that degrade starch granules
or indeed any insoluble polymer—the need for a mechanism to
bind to the surface of the target substrate. The surface of the starch
granule has a limited binding capacity, so it is important for enzymes
to be able to bind strongly and thus concentrate upon that surface
to allow for speedy degradation. A typical solution to this problem
is carbohydrate-binding modules (CBMs), such as seen in SusE
and SusF. These small (typically 30–100 amino acids) domains are
found in a wide variety of carbohydrate-degrading proteins; indeed,
7% of the enzymes in the CAZy database are known to possess
CBMs [11]. Similar to the glycoside hydrolases, CBMs group into
sequence-based families [12] and currently 69 CBM families are
listed in the CAZy database. Of these, 12 are starch-binding domains,
that is, CBM20, CBM21, CBM25, CBM26, CBM34, CBM41, CBM45,
CBM48, CBM53, CBM58, CBM68, and CBM69. Removal of a CBM
from an enzyme typically leads to much reduced binding and activity
toward insoluble target substrates, while the addition of one of these
domains can greatly improve these properties. For instance, removal
of a CBM20 from the Aspergillus niger glucoamylase abolished
binding to starch granules [13], while addition of the same CBM
to the barley α-amylase 1 increased affinity toward starch granules
sixfold [14]. CBMs are often thought of as separate modules that are
able to move semi-independently of the catalytic domain and thus
provide flexibility in the interaction with substrates (see Fig. 7.1a).
However, in many cases CBMs are intimately associated with the
catalytic domain and either are unable to move relative to the active
site or cannot be removed without a large impact on the structural
stability of the enzyme. Among starch-degrading enzymes CBM48s often have such
a close contact with the catalytic domain [15, 16] (Fig. 7.1b). CBM48
has some resemblance to the typically linker-connected CBM20,
sharing a structural fold, some of the key binding residues, and a
taxonomical distribution through multiple domains of life [17]. The
binding function is maintained in CBM48 as the crystal structure
of the rice branching enzyme I shows maltopentaose bound to its
CBM48 [18]. Thus it would seem that having a substrate binding site
situated in a fixed location relative to the active site provides some
advantage in the degradation of polysaccharides.

Figure 7.1 Comparison of carbohydrate-binding site types. Green indicates
the catalytic domain, while red indicates a binding site motif, either (a and b)
a CBM or (c) an SBS. (a) Fully separated CBM3 in Cel9A from Thermobifida
fusca (PDB ID: 1JS4). (b) CBM48 from the barley limit dextrinase intimately
associated with the catalytic domain (PDB ID: 2Y4S). Note that there is an
additional N-terminal domain in the limit dextrinase that is omitted for
clarity. (c) Barley α-amylase 1 with one of its SBSs (SBS1) highlighted (PDB
ID: 1RP8).

Similar to CBMs with a fixed position, surface binding sites (SBSs)
[19, 20] bind carbohydrates at a static location relative to the active
site (see Fig. 7.1c). SBSs differ, however, from the CBMs in that they
are directly part of the catalytic module. Domain C of GH13 enzymes
is an interesting case: it is generally considered part of the
catalytic module, though it often contains binding sites in the form of
SBSs, so it could perhaps be thought of as a CBM in its own right. This
distinction between SBSs and fixed-position CBMs may well be a
semantic difference from a practical standpoint, though not enough
is known about the respective roles of fixed-position CBMs and
SBSs to say for certain if key functional differences exist. However,
from a nomenclature perspective one key difference is evident: the
separate domains of CBMs make them relatively easy to identify at
the sequence level, while the identification of SBSs is much more
difficult, if not impossible, due to the far lower number of amino
acids that define the site. As a consequence most SBSs have been
identified by structural studies, while identifications on the basis
of sequence comparisons are tentative at best, as will be discussed
below.

7.2 Identification of SBSs: X-Ray Crystallography

The workhorse in terms of discovering SBSs has undoubtedly been
X-ray crystallography, with all but a few of the currently more
than 50 known SBS-containing enzymes being identified by this
technique. Although the first enzyme (lysozyme) had its structure
solved by X-ray crystallography in 1965 [21], it has only been with
the large increase in the number of crystal structures in the last
decade, particularly those of carbohydrate-active enzymes in complex
with substrates and substrate analogs, that the existence of SBSs
has come to be generally appreciated. Enzyme–ligand complex
structures are usually generated in one of two ways [22]. The first
is by co-crystallization, where the crystallization solution contains
the ligand of interest. This has the advantage that the protein
can be stabilized by the presence of the ligand, often making the
crystallization process more efficient. Indeed, some proteins seem to
require the presence of ligand in order to crystallize. Alternatively,
substrates or substrate analogs can be added after the crystals have
formed. This soaking in of oligosaccharides has the advantage that
multiple ligands can be tested fairly easily. However, portions of
the protein may remain inaccessible due to crystal contacts and if
significant motion of the protein is required to accommodate the
ligand, the result may be the cracking of the crystals rendering this
approach nonviable for some ligand–protein combinations. Both
approaches have been successful in identifying SBSs [23–25] as they
are generally fairly rigid platforms, though minor movements have
sometimes been detected [25]. One example among the amylases
where co-crystallization has been used is SusG for which the
structure was solved in complex with maltoheptaose, revealing
binding at the active site, a CBM58, and an SBS [24]. In contrast,
the complex structure of amylomaltase from Thermus aquaticus of
GH77, which belongs to the amylolytic enzymes and shares the
glycoside hydrolase clan GH-H with families GH13 and GH70 [11],
was achieved by soaking the crystals with 100 mM acarbose and
shows accommodation of this pseudomaltotetraose inhibitor at the
active site and an SBS [25]. Interestingly, this structure underwent
rearrangements in the active site upon binding acarbose at the SBS
that have implications for the activity of the enzyme, as will be
discussed in Section 7.8.
In addition to identifying the existence and location of SBSs,
X-ray crystallography provides information about the architecture
of these sites, giving a hint toward their function. The barley α-
amylase isozymes 1 and 2 (AMY1 and AMY2) provide an interesting
example of this [23, 26–28]. AMY1 has two SBSs termed SBS1 and
SBS2, illustrated in Figs. 7.2a and 7.2b, respectively. These
two sites have very different architectures, with SBS1 possessing
two key tryptophan residues (W278 and W279) joined at an
approximately 130° angle, while SBS2 has a tyrosine (Y380) and
a histidine (H395) that are positioned to clamp around a sugar
chain. These sites were originally called the starch granule–binding
site and the “pair of sugar tongs,” respectively, referring to the
roles suggested by their architectures. While the SBS2 architecture
is somewhat unique, SBS1-like motifs have been seen in other
amylolytic enzymes [20], such as Y276/W284 in porcine pancreatic
α-amylase [29], Y276/W284 in human salivary α-amylase [30] (Fig.
7.2d), and W439/W469 in the Bacillus halmapalus α-amylase [31]
(Fig. 7.2c), suggesting these sites perform similar functions in their
respective enzymes. The roles of SBS1 and SBS2 in the activity of
AMY1 have been probed more deeply over the years, and they will
serve as key examples throughout this review as to how the various
techniques described here are able to increase our understanding of
SBSs.

Figure 7.2 SBSs in GH13 enzymes. (a) SBS1 (W278/W279) and (b) SBS2
(Y380/H395) from the inactive barley α-amylase 1 catalytic nucleophile
mutant (AMY1 D180A) in complex with maltoheptaose (PDB ID: 1RP8).
(c and d) Other GH13 SBSs that are similar in architecture to SBS1. (c) SBS
(W439/W469) of the α-amylase from Bacillus halmapalus in complex with
glucose (PDB ID: 2GJP). (d) One of the four SBSs (Y276/W284) from the
human salivary α-amylase in complex with acarbose (PDB ID: 1MFU). The
enzymes are depicted as cartoons in gray, with the SBS residues shown as
sticks and colored by element (green = carbon; blue = nitrogen; red =
oxygen). Liganded sugar chains are shown in dark red.

7.3 Bioinformatics of SBS Enzymes

Approximately half of the enzymes known to contain an SBS are
found within the α-amylase family GH13. This is a large and diverse
family containing a variety of enzymatic activities all directed
toward starch or starch-like substrates. Due to this diversity, Stam et
al. [32] proposed the division of this family into 35 subfamilies (now
expanded to 40 in CAZy), which possess a greater degree of sequence
similarity than the entire GH13 and consequently are more (though
not completely) uniform in regard to activity, phylogenetic origin,
and domain organization. This subfamily classification provides a
convenient and informative way of looking at SBSs in family GH13
enzymes. Within the family GH13 as a whole none of the SBS
amino acids show much conservation, even when looking only at
enzymes known to possess an SBS, as the SBSs can occur at multiple
structural locations. However, at the subfamily level, conservation
of SBS residues tends to be quite high, likely owing to the more
conserved roles of the enzymes themselves. Currently, SBSs have been
reported for 15 subfamilies and in two enzymes not assigned to
a GH13 subfamily. Table 7.1 shows the GH13 enzymes known to
possess an SBS. One striking aspect of this list is that SBSs are found
in a variety of subfamilies scattered throughout the family tree,
suggesting that they are an important feature for GH13 enzymes
in general, despite their disparate locations on the various enzyme
surfaces. Another interesting thing to note is that CBMs are relatively
common in SBS-containing enzymes [19]. This suggests that CBMs
and SBSs play functionally complementary roles that can coexist
within the same enzyme.

Table 7.1 SBS-containing enzymes in the family GH13

Enzyme                              EC number  Organism                                  Subfamily   CBM    Reference
α-Amylase (SusG)                    3.2.1.1    B. thetaiotaomicron                       unassigned  CBM58  [24]
α-Amylase B                         3.2.1.1    Halothermothrix orenii                    unassigned  No     [33]
α-Amylase                           3.2.1.1    A. oryzae                                 1           No     [34]
Maltogenic α-amylase (Novamyl)      3.2.1.133  Geobacillus stearothermophilus            2           CBM20  [35]
CGTase                              2.4.1.19   B. circulans str. 251                     2           CBM20  [36]
CGTase                              2.4.1.19   Thermoanaerobacterium thermosulfurigenes  2           CBM20  [37]
α-1,4-Glucan:phosphate              2.4.1.-    Streptomyces coelicolor A3(2)             3           No     [38]
  α-maltosyltransferase (GlgE)
Amylosucrase                        2.4.1.4    Neisseria polysaccharea                   4           No     [39]
α-Amylase                           3.2.1.1    B. halmapalus                             5           No     [31]
Maltohexaose-producing α-amylase    3.2.1.98   Bacillus sp. 707                          5           No     [40]
α-Amylase (AMY1 and AMY2)           3.2.1.1    Hordeum vulgare (barley)                  6           No     [26, 27]
α-Amylase                           3.2.1.1    Pyrococcus woesei                         7           No     [41]
Starch branching enzyme 1           2.4.1.18   Oryza sativa Japonica group               8           CBM48  [18]
Branching enzyme                    2.4.1.18   Escherichia coli K-12 MG1655              9           CBM48  [42]
Maltooligosyltrehalose              3.2.1.141  Deinococcus radiodurans                   10          CBM48  [43]
  trehalohydrolase
Isoamylase                          3.2.1.68   Chlamydomonas reinhardtii CC425           11          No     [44]
Pullulanase type I                  3.2.1.41   B. subtilis str. 168                      14          CBM48  PDB ID: 2E9B
Trehalose synthase (TreS)           5.4.99.16  Mycobacterium smegmatis                   16          No     [45]
α-Amylase I (TVAI)                  3.2.1.1    Thermoactinomyces vulgaris R-47           21          CBM34  [46]
Porcine pancreatic α-amylase        3.2.1.1    Sus scrofa                                24          No     [29]
Salivary and pancreatic α-amylases  3.2.1.1    Homo sapiens                              24          No     [30]

7.4 Binding Site Isolation

Figure 7.3 Isolation of binding sites for the study of SBSs. The presence
of multiple binding sites complicates the study of SBS-containing enzymes;
however, it is possible to study them individually using various methods to
block one of the sites, some of which are depicted here. First, mutagenesis
can be used to remove key binding residues in SBSs and in some cases
the active site. Second, the more open nature of SBSs can allow binding
of ligands that are excluded from the active site. Third, mechanism-based
inhibitors can be used to covalently bind and block the active site.

One difficult aspect of studying SBSs is the need to differentiate
between binding taking place at multiple sites (see Fig. 7.3). Besides
binding at the SBS, substrates will need to bind in the active site. In
addition, many SBS-possessing enzymes have multiple SBSs. Due to
the apparently small number of important residues typically found
in SBSs, mutagenesis is generally effective at eliminating binding at
such sites, which allows evaluation of the impact on activity and
binding characteristics. However, this strategy often cannot be used
to eliminate binding to active sites. For instance, the active site of
AMY1 extends from subsite −6 (toward the nonreducing end of
the substrate) to +4 (toward the reducing end) [47], with multiple
important residues being involved [48, 49]. An alternative approach
to mutation is to block the active site with a covalent inhibitor, which
has been used in two different SBS-containing xylanases to confine
binding to their SBSs [50, 51]. In both cases it was also possible
to use site-directed mutants of the SBS to isolate binding in the
active site to get a full picture of sugar binding in the enzyme (see
Section 7.7). This method has not yet been applied to the GH13
enzymes, but other strategies have been used. For instance the
cyclic oligosaccharide β-cyclodextrin can bind to the SBSs of AMY1
but not to the active site [52, 53]. Thus affinity for this substrate
mimic was measured at each of the two SBSs by either mutating
one site and monitoring binding at the other or keeping both sites
intact and fitting binding data to a two-site binding model [52],
showing that β-cyclodextrin binds to SBS1 and SBS2 with a Kd of
1.4 mM and 0.07 mM, respectively. This 20-fold difference in the
Kd at the two SBSs has been further exploited in several studies
to selectively eliminate binding at SBS2 without having to rely on
a mutant enzyme [54]. The advantage in this case is that it could be
confirmed that the mutation does not have any additional impact
on the enzyme, particularly at the active site. Furthermore, blocking
a binding site avoids the complications associated with producing
and working with mutant proteins, which are frequently less stable
than the wild type. Besides the use of substrate analogs to selectively
block SBSs, the use of insoluble substrates such as starch granules
may also be effective. For AMY1 it has been found that removal
of both surface sites (either through mutation or by blocking with
β-cyclodextrin) effectively eliminates binding to starch granules,
which are also hydrolyzed extremely slowly at 0.2% of the wild-
type activity level by an SBS1/SBS2 mutant [28, 52, 55, 56]. This is
analogous to what is seen when a CBM is removed from a starch-
active enzyme [13]. Thus, in these cases, the active site does not
participate significantly in binding to the insoluble substrate, with
only the SBSs contributing to the measured affinity.
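The selective blocking of SBS2 described above follows directly from the two Kd values. The following is a minimal sketch, not the fitting procedure used in Ref. [52]: it assumes two independent, non-interacting sites and a free β-cyclodextrin concentration equal to the total added, and the function names and the concentrations scanned are illustrative choices only.

```python
# Minimal sketch: occupancy of two independent binding sites.
# The Kd values are those quoted in the text for beta-cyclodextrin on AMY1;
# function names and the scanned concentrations are illustrative assumptions.

KD_SBS1_MM = 1.4   # mM, beta-cyclodextrin at SBS1
KD_SBS2_MM = 0.07  # mM, beta-cyclodextrin at SBS2


def site_occupancy(ligand_mm: float, kd_mm: float) -> float:
    """Fraction of a single site occupied at a free ligand concentration (mM)."""
    return ligand_mm / (kd_mm + ligand_mm)


def two_site_binding(ligand_mm: float, kd1_mm: float, kd2_mm: float) -> float:
    """Average number of ligands bound per enzyme for two independent sites."""
    return site_occupancy(ligand_mm, kd1_mm) + site_occupancy(ligand_mm, kd2_mm)


if __name__ == "__main__":
    for conc in (0.07, 0.5, 1.4, 10.0):
        sbs1 = site_occupancy(conc, KD_SBS1_MM)
        sbs2 = site_occupancy(conc, KD_SBS2_MM)
        total = two_site_binding(conc, KD_SBS1_MM, KD_SBS2_MM)
        print(f"{conc:5.2f} mM: SBS1 {sbs1:.2f}, SBS2 {sbs2:.2f}, bound per enzyme {total:.2f}")
```

Under these assumptions, a concentration a few-fold above the SBS2 Kd occupies most of SBS2 while leaving SBS1 largely free, which is the window exploited for selective blocking.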

7.5 Protection of Binding Sites from Chemical Labeling

In addition to X-ray crystallography, SBSs have occasionally been
identified through chemical protection studies [26]. In such a
differential labeling experiment the protein is modified by a
chemical reagent in the presence and absence of a protecting agent
(a substrate or substrate analog) that will bind to the protein.
By determining which amino acids were protected, the binding
site can be identified. This method gave, prior to the structure
determination, early insight into the presence and roles of the
tryptophans at SBS1 in AMY2, the naturally more abundant isoform
that has about 80% sequence identity with AMY1 [26]. Even when
the binding sites are already known, such protection can prove
useful. For instance in surface plasmon resonance (SPR), the protein
first needs to be immobilized on a chip. While the immobilization
strategies commonly used employ linkers to allow access to most of
the protein even when it is bound to the chip, it is still possible for
binding sites to be rendered inaccessible. By including a substrate or
substrate analog during the immobilization (or during biotinylation
for use with streptavidin chips, for example, in Ref. [52]), SBSs can be
kept accessible. For instance SBS1 of AMY1 is undetectable during
SPR unless a protecting ligand such as β-cyclodextrin is included
during immobilization (unpublished data).

7.6 Nuclear Magnetic Resonance

As discussed above, X-ray crystallography can provide useful
information about the structures of proteins and bound ligands.
However, additional techniques are needed in order to explore the
interactions in more detail, with different experiments required to
isolate binding at each site. Nuclear magnetic resonance (NMR)
provides an all-in-one technique that can be used to determine
structure and binding affinity simultaneously and has the added
advantage of taking place in solution, which more closely
mimics the natural environment of the enzyme. One major
drawback, however, is that NMR becomes increasingly difficult to
perform as the protein being studied becomes larger [57]. The size
of most family GH13 enzymes (approximately 40–100 kDa) falls
within the range where NMR structure determination is difficult
but possible, though none of the SBS-containing enzymes in this
family have been explored by NMR. However, a small family GH11
xylanase from B. circulans contains an SBS and its structure was
determined and ligand binding investigated by NMR [51]. Using ¹H-¹⁵N
heteronuclear single quantum correlation (HSQC) spectroscopy,
the binding of xylotetraose to the enzyme was monitored, and it was
found to bind independently at the active site and at a second site
on the surface of the enzyme. Mutagenesis of three residues at this
SBS eliminated binding, and utilization of the covalent inhibitor
2′,4′-dinitrophenyl 2-deoxy-2-fluoro-β-xylobioside to block the active
site confirmed Kd values of 2.2 and 3.4 mM at the SBS and the active
site, respectively. Further investigation using xylododecaose in an
inactive mutant found a single Kd of 0.08 mM as this substrate was
binding simultaneously to both sites, indicating that this SBS acts
in concert with the active site to efficiently bind substrates. While
this is one example of NMR being used to investigate the properties
of an SBS, in principle this technique clearly has great potential for
studying SBSs in other enzymes, including those from the family
GH13.
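To put the three dissociation constants quoted above on a common scale, they can be converted to standard binding free energies with ΔG° = RT ln(Kd / 1 M). The sketch below is illustrative only: the temperature of 298.15 K is an assumption (the measurement temperature is not given here), the function name is hypothetical, and the xylotetraose and xylododecaose values refer to different ligands, so the comparison is qualitative.

```python
import math

# Minimal sketch: convert dissociation constants to standard binding free energies.
# Assumptions: 1 M standard state and T = 298.15 K (not stated in the text).

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)
TEMPERATURE_K = 298.15           # assumed temperature


def binding_free_energy(kd_molar: float, temperature_k: float = TEMPERATURE_K) -> float:
    """Standard free energy of binding in kJ/mol; more negative means tighter binding."""
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_molar)


if __name__ == "__main__":
    kd_values_molar = {
        "xylotetraose at the SBS": 2.2e-3,
        "xylotetraose at the active site": 3.4e-3,
        "xylododecaose spanning both sites": 0.08e-3,
    }
    for label, kd in kd_values_molar.items():
        print(f"{label}: {binding_free_energy(kd):.1f} kJ/mol")
```

With these assumptions, the longer ligand binds more favorably than xylotetraose at either site alone, consistent with the two sites acting together on one chain.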

7.7 Binding Assays

Figure 7.4 Affinity electrophoresis of AMY1. Lane 1 is AMY1 in the control
gel, lane 2 is the reference protein ladder, and lane 3 is AMY1 in a gel
containing 0.1% glycogen. The retardation of the mobility of AMY1 in lane 3
is indicative of its interaction with glycogen (a highly soluble α-glucan
that is more highly branched than amylopectin).

In addition to structural studies, several other techniques have
been found to be effective in the investigation of SBSs. Affinity
electrophoresis (Fig. 7.4) [58, 59] is one such technique that has
been used in several cases to characterize SBS-containing enzymes
[51, 56]. In this technique the protein of interest is applied to two
native polyacrylamide gels and subjected to electrophoresis. One
gel contains the polysaccharide under investigation, the other is a
control gel without polysaccharide. The migration distance of the
protein is compared between the two, with slower migration in the
polysaccharide gel indicating an interaction between the protein
and the polysaccharide. Affinity electrophoresis allows for the study
of the interaction between proteins and soluble polysaccharides,
which can be tricky using other techniques. Moreover, it can be
used in a quantitative manner to determine binding affinity by
running a series of gels with varying polysaccharide concentration
and fitting the migration distances to a binding curve [60, 61].
However, for characterization of SBSs it typically has been used
in a more qualitative manner to look at the impact of removing
one of the binding sites of an enzyme at a single polysaccharide
concentration. In one example Ludwiczek et al. [51] used affinity
electrophoresis to examine the impact of either mutating the SBS
or blocking the active site on the ability of the B. circulans xylanase
of GH11 to bind different forms of xylan. They found that the
active site and the SBS contributed about equally to the affinity
toward xylan, with slight differences depending on the source of
the xylan. By contrast, the SBSs of AMY1 were found to be mostly
responsible for the binding of the enzyme to starch granules [52] as
well as soluble polysaccharides [56], with only low levels of binding
detected when both SBSs were mutated. SBS2, however, was found
to be more important than SBS1 for binding to various soluble α-
glucans, pointing toward the difference in the function of these two
sites.
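The quantitative use of affinity electrophoresis mentioned above can be sketched as a straight-line fit. The example assumes the commonly used relation 1/r = (1/R0)(1 + c/Kd), where r is the relative mobility at polysaccharide concentration c and R0 is the mobility in the control gel; whether Refs. [60, 61] use exactly this form is an assumption, and the function name and the synthetic data are hypothetical.

```python
# Minimal sketch: estimate Kd from affinity electrophoresis mobilities.
# Assumed model: 1/r = (1/R0) * (1 + c/Kd), so 1/r is linear in c with
# intercept 1/R0 and slope 1/(R0 * Kd). Kd carries the units of c (e.g. % w/v).

def fit_affinity_electrophoresis(concs, mobilities):
    """Least-squares line through (c, 1/r); returns (kd, r0)."""
    inverse = [1.0 / r for r in mobilities]
    n = len(concs)
    mean_c = sum(concs) / n
    mean_y = sum(inverse) / n
    sxx = sum((c - mean_c) ** 2 for c in concs)
    sxy = sum((c - mean_c) * (y - mean_y) for c, y in zip(concs, inverse))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_c
    return intercept / slope, 1.0 / intercept


if __name__ == "__main__":
    # Synthetic, noise-free data generated from the model (not measured values).
    true_kd, true_r0 = 0.05, 0.8
    concs = [0.0, 0.025, 0.05, 0.1, 0.2]
    mobilities = [true_r0 / (1.0 + c / true_kd) for c in concs]
    kd, r0 = fit_affinity_electrophoresis(concs, mobilities)
    print(f"fitted Kd = {kd:.3f}, fitted R0 = {r0:.3f}")
```

With real gels the mobilities would be normalized to a non-interacting reference protein run in the same gel, and a nonlinear fit may be preferable when the data are noisy.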
While affinity electrophoresis is well suited to investigating
enzyme interactions with soluble polysaccharides, pull-down or
depletion assays can provide similar information about interactions
with insoluble polysaccharides [62, 63]. In these assays the enzyme
is incubated with the insoluble polysaccharide and the mixture is
centrifuged to pellet the polysaccharide, pulling down any attached
enzyme with it. The supernatant is then assayed for remaining
enzyme, allowing the calculation of the amount bound to the
polysaccharide. In the case of AMY1, mutation of SBS1 [52] and
SBS2 [55] led to a 34- and 9-fold reduction in affinity for starch
granules, respectively. Mutation of both sites resulted in a total
loss of detectable affinity [52, 56]. Further investigation revealed
that in particular the SBSs of AMY1 recognized motifs found in
A-type starches (cereals) as opposed to B-type starches (tubers
and legumes) [56]. Taken together with the affinity electrophoresis
results, these data illustrate that SBS1 is more important for binding
to intact starch granules, while SBS2 is more important for binding
to soluble fractions or, in the case of starch granules, to individual
α-glucan chains. Pull-down assays have been used with a variety
of other SBS-containing enzymes as well. For example, human salivary
α-amylase has four SBSs seen in the crystal structure, and removal of
these sites by mutagenesis decreased the affinity of the enzyme for
starch granules 10-fold [64]. For the glucoamylases (family GH15)
of the yeast Saccharomycopsis fibuligera, mutation of the residues
in the SBS resulted in a dramatic decrease in binding and activity
toward starch granules [65, 66]. Also among the xylanases there
are several examples where removal of the SBS reduced the affinity
toward insoluble xylan by 3- to 130-fold, depending on the enzyme
and substrate [51, 67, 68]. In the glucan phosphatase Lsf2 from
Arabidopsis thaliana there are two SBSs that have been identified
and removal of both sites resulted in an 87% reduction in the
enzyme’s affinity for amylopectin [69]. This is one of the few glucan
phosphatases that have been found to lack a CBM, so in this case
it is the SBSs that are playing the role of localizing the enzyme to
its substrate. In the SusG α-amylase from B. thetaiotaomicron there
is both a CBM (family CBM58) and an SBS [24]. Deletion of the
CBM leads to a complete loss in affinity for starch granules and a
71% decrease in activity, while mutation of the SBS leads to a 66%
decrease in activity toward starch granules. In this case the SBS
seems not simply to substitute for a CBM but rather
to act in concert with the CBM in some way. Given the proximity of
the SBS to the binding area of the active site accommodating the
reducing end of the substrate, it has been speculated that it may
be involved in transferring oligosaccharide reaction products to the
porin that is associated with the Sus complex for importation into
the cell [24].
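The arithmetic behind the pull-down (depletion) assays described above can be sketched in a few lines. This is a minimal illustration, not the procedure of any cited study: the function names, the single-site binding model used to convert a bound fraction into an apparent Kd, and all numerical inputs are assumptions chosen for the example.

```python
def bound_fraction(total_enzyme, supernatant_enzyme):
    """Fraction of enzyme pelleted with the polysaccharide.

    Both arguments must use the same units (e.g. activity units or nM).
    """
    if total_enzyme <= 0:
        raise ValueError("total_enzyme must be positive")
    if not 0 <= supernatant_enzyme <= total_enzyme:
        raise ValueError("supernatant_enzyme must lie between 0 and total_enzyme")
    return (total_enzyme - supernatant_enzyme) / total_enzyme


def apparent_kd(polysaccharide_conc, fraction_bound):
    """Apparent Kd from one data point, ASSUMING simple one-site binding
    with the polysaccharide in excess: f = S / (Kd + S), so Kd = S * (1 - f) / f.
    """
    if not 0 < fraction_bound < 1:
        raise ValueError("fraction_bound must be strictly between 0 and 1")
    return polysaccharide_conc * (1 - fraction_bound) / fraction_bound


def fold_reduction(kd_wild_type, kd_mutant):
    """Fold reduction in affinity of a mutant relative to the wild type."""
    return kd_mutant / kd_wild_type


# Hypothetical measurement: 10 units added, 4 units left in the supernatant
# after pelleting 3 mg/mL of starch granules.
f = bound_fraction(10.0, 4.0)
kd = apparent_kd(3.0, f)
print(f"fraction bound = {f:.2f}, apparent Kd = {kd:.2f} mg/mL")
```

In practice a Kd would be fitted from a series of polysaccharide concentrations rather than from a single point; the single-point form is used here only to make the relationship between the depleted supernatant and the affinity explicit.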
A technique highly complementary to pull-down assays and
affinity electrophoresis is confocal laser scanning microscopy
(CLSM), in which a fluorescently labeled protein is visualized,
allowing insight into where it localizes within a given system. CLSM
has been used in conjunction with pull-down assays to investigate
the interaction of AMY1 with starch granules [56], clarifying
the relative roles of the active site and SBSs in that interaction.
The results showed that at least one of the two
SBSs was required for binding to the surface of fully intact starch
granules; however, even AMY1 with both SBSs inactivated was able
to bind to imperfections or damaged regions in the granules where
the inner layers were exposed. This demonstrated that the active site
of AMY1 is sufficient for binding to the internal amorphous regions
of the starch granule, along with the exposed sides of amylopectin
double helices within the crystalline regions. By contrast, the SBSs
of AMY1 were absolutely required for binding at the surface.
While the techniques described above function well for the inves-
tigation of binding to polysaccharides, other methods are required
to examine binding to oligosaccharides. Two such techniques that
have been used in the study of SBSs are SPR and isothermal titration
calorimetry (ITC). ITC has the advantage of being performed fully in
solution without the need for any kind of labels. Additionally, ITC
allows for highly precise measurements and the determination of
the thermodynamic properties of the interaction under study. On
the other hand, SPR requires that the protein is immobilized and has
the advantage that minimal amounts of the protein being studied are
needed. Thus, SPR enables analysis of mutants or modified proteins
that are only available in small amounts. Additionally, SPR allows for
a wider range of dissociation constants to be investigated, up to the
millimolar range [70]. In comparison, protein concentrations used
in ITC are generally about 10 times the Kd, with protein solubility
often limiting the maximum Kd to the tens of micromolar range
[71]. Both ITC and SPR were utilized to investigate the binding at
the active site and the SBS of the xylanase from B. subtilis [50]. By
using an SBS mutant and the wild-type enzyme inactivated with the
covalent inhibitor 2,3-epoxypropyl β-D-xylopyranoside, the binding
at each site was isolated, determining a Kd toward xylotetraose of
0.9 and 1.45 mM for the active site and SBS, respectively. Binding at
the active site could be confirmed independently using ITC, though
the investigators did not have enough of the inhibited wild type to
perform ITC with this variant. Nonetheless, the ITC value for the
active site Kd of 1 mM agreed quite well with the SPR results. As
mentioned earlier, SPR has been used to determine that SBS1 and
SBS2 of AMY1 bind to β-cyclodextrin with Kd of 1.4 and 0.07 mM,
respectively [52, 55]. Additionally, SPR has been used to determine
the affinity of the active site and SBSs toward maltoheptaose and a
DP10 oligosaccharide with an α-1,6 branch, by using a combination
of mutants and blocking SBS2 with β-cyclodextrin [56]. Thus SBS1
was found not to bind to these oligosaccharides, at least within
the concentration range tested, while the active site and SBS2 were
found to have a similar affinity for each substrate with a Kd of 1.2
mM for maltoheptaose and 1.8 mM for the branched oligosaccharide.
This again points to very different roles of the two SBSs of AMY1,
with SBS2 being important for binding to individual sugar chains.
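To make the difference between the two β-cyclodextrin affinities quoted above more concrete, the following sketch computes the fractional occupancy of each site under a simple one-site binding model. The model, the function name, and the ligand concentrations scanned are assumptions for illustration; only the two Kd values (1.4 and 0.07 mM) come from the text.

```python
def occupancy(ligand_mM, kd_mM):
    """Fractional occupancy of a single independent site: L / (Kd + L).

    Assumes free ligand is approximately equal to total ligand.
    """
    if ligand_mM < 0 or kd_mM <= 0:
        raise ValueError("ligand_mM must be >= 0 and kd_mM must be > 0")
    return ligand_mM / (kd_mM + ligand_mM)


# Kd values for beta-cyclodextrin quoted in the text [52, 55].
KD_SBS1_MM = 1.4
KD_SBS2_MM = 0.07

# Illustrative ligand concentrations (mM); these are assumptions.
for ligand in (0.01, 0.07, 0.5, 1.4, 5.0):
    print(
        f"[beta-CD] = {ligand:5.2f} mM  "
        f"SBS1 occupancy = {occupancy(ligand, KD_SBS1_MM):.2f}  "
        f"SBS2 occupancy = {occupancy(ligand, KD_SBS2_MM):.2f}"
    )
```

Under these assumptions each site is half occupied when the ligand concentration equals its Kd, so SBS2 approaches saturation at concentrations where SBS1 is still largely free, which is consistent with using β-cyclodextrin to block SBS2 selectively.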

7.8 Activity Assays

In addition to their roles in binding, SBSs have also been investigated
for their impact on enzyme activity. The amylomaltase of GH77 from
Thermus aquaticus has an SBS [25] and is capable of performing
four types of reactions: hydrolysis, coupling, disproportionation, and
cyclization. Mutation of the SBS decreases the rate of the first three
types of reactions, while increasing the rate of cyclization [72]. In
this case the mechanism of action of the SBS appears to be allosteric
regulation as the free structure and the structure with acarbose
bound at the SBS show significant movement within the active site
of the enzyme [25]. Also for AMY1 the SBSs have been found to
play important roles in the activity of the enzyme, including for the
hydrolysis of starch granules and influencing the processivity of the
enzyme [52]. SBS2 has been found to be particularly important for
AMY1’s activity toward amylopectin. Nielsen et al. [54] found that
two rates governed the activity against the branched amylopectin
but not the linear amylose. The fast a rate, which was also
associated with a high-affinity component, was found to be dominant
and completely dependent on the presence of an intact SBS2, as
either mutation or blocking this site with β-cyclodextrin eliminated
this faster a rate, while the slower b rate was essentially not
affected. This provides a further line of evidence that SBS2 is of
key importance for the ability of AMY1 to bind to and degrade
amylopectin, which is also important for the initial binding and
attack at the surface of starch granules. Remarkably, addition of
β-cyclodextrin at a higher concentration of amylopectin elevated
activity, perhaps reflecting an allosteric regulation connected with
binding to SBS2, as proposed in a previous kinetics study of multiple
binding sites accommodating the inhibitor acarbose [53]. SBS1 on
the other hand appears to be the most important for recognizing and
binding to the surface exposed regions of crystalline amylopectin. In
this way the two SBSs and the active site act in concert to bind and
degrade starch granules (see Fig. 7.5).
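Kinetics governed by a fast and a slow rate are often represented as the sum of two exponential phases. The sketch below uses that generic form to show what losing the fast phase would look like; it is not the model fitted by Nielsen et al. [54], and the function name, amplitudes, and rate constants are hypothetical values chosen only for illustration.

```python
import math


def product_released(t, amp_a, rate_a, amp_b, rate_b):
    """Product formed at time t for two parallel first-order phases.

    amp_a, rate_a: amplitude and rate constant of the fast (a) phase
    amp_b, rate_b: amplitude and rate constant of the slow (b) phase
    """
    fast = amp_a * (1.0 - math.exp(-rate_a * t))
    slow = amp_b * (1.0 - math.exp(-rate_b * t))
    return fast + slow


# Hypothetical parameters (arbitrary units, time in minutes).
INTACT = dict(amp_a=70.0, rate_a=0.5, amp_b=30.0, rate_b=0.02)
# SBS2 mutated or blocked: fast phase removed, slow phase left unchanged.
SBS2_BLOCKED = dict(amp_a=0.0, rate_a=0.5, amp_b=30.0, rate_b=0.02)

for t in (0, 5, 20, 60):
    intact = product_released(t, **INTACT)
    blocked = product_released(t, **SBS2_BLOCKED)
    print(f"t = {t:3d} min  intact = {intact:6.2f}  SBS2 blocked = {blocked:6.2f}")
```

Setting the fast amplitude to zero while keeping the slow phase intact mirrors the qualitative observation in the text that eliminating SBS2 removed the a rate but left the b rate essentially unaffected.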

Figure 7.5 Model of AMY1 initial binding to starch granules. SBS1 serves
as the initial point of attachment to the starch granule at the crystalline
face, with SBS2 serving a secondary, though important, role in this
initial attachment by binding to an amylopectin chain. SBS2 then acts in
conjunction with the active site in the degradation of amylopectin chains.
It is important to note that this figure depicts one, albeit important, binding
scenario. It is likely that others occur as well, such as the case where three
independent sugar chains each bind at one of the three sites. Having multiple
productive binding modes is likely to contribute to the overall efficiency of
the enzyme.

7.9 Future Prospects

While much has been learned about the roles of SBSs, as detailed
above, the question of how this knowledge can be applied remains
largely open. One potential use of SBSs is to make them targets
of pharmacological chaperones (PCs) [73, 74]. In many genetic
diseases the root cause of the problem is a key enzyme misfolding
due to the presence of a mutation. A PC is a molecule that can
bind to such a protein and thereby stabilize it and allow at least
some of the protein to fold correctly, thereby relieving the disease
state. Generally the PCs are designed to bind to the active site;
however, this also means that they are potent inhibitors of the
enzyme. Several amylolytic enzymes are at the heart of genetic
diseases in humans, particularly within the various glycogen storage
diseases (GSDs) [75]. These include the GH13 glycogen debranching
enzyme (type III GSD), the GH13 glycogen branching enzyme (type
IV GSD), the GH31 acid α-glucosidase (type II GSD), and the glycogen
phosphatase laforin (Lafora disease). None of these diseases are
currently treated by PC; however, several mutations of the human
α-galactosidase lead to a misfolding disease called Fabry’s disease
[76], which can be alleviated by the PC 1-deoxygalactonojirimycin
[73, 74]. This therapy is successful as the inhibitor binds to the
enzyme at the neutral pH of the endoplasmic reticulum where folding takes
place but dissociates at the lower pH of the lysosome where the
enzyme performs its function [77]. Interestingly, α-galactosidase
also contains an SBS that was found to bind the β-anomer of
galactose, suggesting that the SBS could be specifically targeted,
leaving the enzyme uninhibited [78]. Given the mechanism of the
PC described above, it is not necessary to avoid binding to the active
site in this particular case, but SBSs could serve as targets for PCs
in other situations where a convenient way to target the active site
only during folding does not exist. While it is unknown if any of the
GSD-related enzymes described above contain SBSs, such an SBS-
targeting PC therapy remains an intriguing possibility for treating
variants of these diseases where protein misfolding leads to activity
deficiency.
In the biotechnology industry SBSs may find uses as well. As
has been seen in the examples above, SBSs can have profound
impacts on the activity of enzymes, and thus they may serve as
targets of enzyme engineering, either by modification of existing
SBSs or by introduction of them de novo. SBSs may have some
advantages over CBMs in an industrial setting as their fixed location
on the catalytic domain may allow for the use of smaller, more
stable enzymes. While there are no published reports of SBSs being
engineered in amylolytic enzymes, several attempts have been made
to engineer the SBSs of xylanases [67, 68]. These studies have
been moderately successful in improving the binding affinity at the
SBSs, but corresponding increases in activity have not been realized.
Interestingly, in the few instances where binding affinity at the active
site and SBS of an enzyme have been determined, there has been
a strong correlation in the affinities. In the GH11 xylanase from
B. circulans the Kd for xylotetraose was found to be 3.4 and 2.4
mM at the active site and SBS, respectively, as determined by NMR
[51]. For the xylanase from B. subtilis, the Kd for xylohexaose was
found to be 0.90 mM at the active site and 1.45 mM at the SBS.
Alteration of the SBS of the B. subtilis xylanase (by making two
amino acid substitutions) led to a modest decrease in the Kd toward
water-unextractable xylan from 8.8 to 5.3 mg/mL. However, these
substitutions also led to a decrease in the rate of solubilization of this
substrate. This decrease in activity may occur because the substitutions
break the correlation between the affinity at the active site and the SBS,
which may be important for the activity of the enzyme. Very remarkably,
the activity of a GH11 xylanase from Thermobacillus xylanilyticus
improved upon modification of its SBS [79]. This xylanase was
subjected to directed evolution and one of the most beneficial
mutations was found to be a serine to threonine substitution in
close proximity to the SBS, which increased the kcat/KM toward
birchwood xylan by 56%. The impact on binding affinity was not
measured, but the nature of the mutation suggests that it would
be subtle, perhaps adjusting the orientation of binding rather than
the affinity. This illustrates the point that more knowledge about
SBSs is required to advance rational engineering of these sites. While
there are no published reports of SBS engineering in GH13 enzymes,
some amylases have been subjected to substantial modification to
optimize their properties for industrial use. Two prime examples
are the B. licheniformis amylase (BLA) [80–82] and the Alkalimonas
amylolytica amylase (AmyK) [83–87]. Both of these enzymes
are from subfamily 5 of GH13, a subfamily with two members known
to possess SBSs (see Table 7.1). There are no published complex
structures for either AmyK or BLA, and in the case of AmyK none of
the known SBS residues are conserved. In BLA several but not all of
the SBS residues seen in other subfamily 5 enzymes are conserved.
Interestingly, a chimeric enzyme using the first 300 amino acids
from the B. amyloliquefaciens α-amylase and the last 186 amino
acids from BLA has been created and found to possess an SBS [88].
This SBS is equivalent in position to one of those found in the Bacillus
sp. 707 α-amylase [40], though the other SBSs from that enzyme
were not seen in the structure of the chimeric enzyme. Thus both
AmyK and BLA, two enzymes that have been extensively engineered,
may well be able to benefit from the introduction of SBSs at the
equivalent position to those found in other subfamily 5 enzymes.
Such an approach might prove easier than attempting to improve an
existing SBS.
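The modest size of the affinity change reported above for the engineered B. subtilis xylanase (apparent Kd toward water-unextractable xylan falling from 8.8 to 5.3 mg/mL) can be put on an energy scale with the standard relation between a ratio of dissociation constants and a change in binding free energy. The sketch below is illustrative only: the temperature of 298.15 K is an assumption, as is the treatment of the apparent Kd values as true equilibrium constants, and the function name is hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def delta_delta_g(kd_before, kd_after, temperature_k=298.15):
    """Change in binding free energy (kJ/mol) for a change in Kd.

    ddG = R * T * ln(kd_after / kd_before); negative means tighter binding.
    The Kd units cancel, so mg/mL values can be used directly.
    """
    if kd_before <= 0 or kd_after <= 0:
        raise ValueError("Kd values must be positive")
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_after / kd_before)


# Kd values quoted in the text; the temperature is an assumed 25 degrees C.
ddg = delta_delta_g(8.8, 5.3)
fold_tighter = 8.8 / 5.3
print(f"fold improvement = {fold_tighter:.2f}, ddG = {ddg:.2f} kJ/mol")
```

Under these assumptions the change amounts to roughly 1.3 kJ/mol of additional binding energy, well below the energy of a single typical hydrogen bond, which is in keeping with the description of the effect as modest.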

7.10 Conclusion

The current knowledge of SBSs has been presented here, with
particular emphasis on the amylolytic enzymes and the barley
α-amylase 1 serving as the prime example. Studies of the SBSs of
AMY1 have led to an unprecedented depth of understanding about
the role of binding at these sites in the activity of an amylolytic
enzyme. The techniques employed to investigate SBSs center around
their identification and examination of their binding properties.
They include X-ray crystallography, NMR, affinity electrophoresis,
pull-down assays, CLSM, SPR, and ITC. While the number of known
SBS-containing enzymes is mostly limited to those that have been
identified through structural studies, the various methods described
here should be capable of identifying SBS-containing enzymes in the
absence of structural data. Analyzing binding properties with
strategies suited to discriminating between the different sites would
allow the detection of binding sites in addition to the active site.
Application of these methods has the potential to greatly expand
our knowledge about what kinds of enzymes contain SBSs and
the properties of those binding sites. This is important, since, as
illustrated in this review, SBSs are critical for the function of many
enzymes, particularly those active against xylan or starch. It seems
likely that there are other classes of enzymes for which SBSs play an
important role that are waiting to be discovered.

Acknowledgments

This work was supported by DTU with an HC Ørsted postdoctoral
fellowship (to DC) and by a grant from the Danish Council for
Independent Research | Natural Sciences (to BS). The authors would
like to thank Monica Palcic and Lyann Sim for sharing their data on
the C. reinhardtii isoamylase.

References

1. Buléon, A., Colonna, P., Planchot, V., and Ball, S. (1998). Starch granules:
structure and biosynthesis, Int. J. Biol. Macromol., 23, pp. 85–112.
2. Glaring, M. A., Koch, C. B., and Blennow, A. (2006). Genotype-specific
spatial distribution of starch molecules in the starch granule: a
combined CLSM and SEM approach, Biomacromolecules, 7, pp. 2310–
2320.
3. Park, H., Xu, S., and Seetharaman, K. (2011). A novel in situ atomic force
microscopy imaging technique to probe surface morphological features
of starch granules, Carbohydr. Res., 346, pp. 847–853.
4. Lineback, D. R. (1986). Current concepts of starch structure and its
impact on properties, J. Jpn. Soc. Starch Sci., 33, pp. 80–88.
5. Lombard, V., Golaconda Ramulu, H., Drula, E., Coutinho, P. M., and
Henrissat, B. (2014). The carbohydrate-active enzymes database (CAZy)
in 2013, Nucleic Acids Res., 42, pp. D490–D495.
6. Janeček, Š., Svensson, B., and MacGregor, E. A. (2014). α-Amylase: an
enzyme specificity found in various families of glycoside hydrolases,
Cell. Mol. Life Sci., 71, pp. 1149–1170.
7. Martens, E. C., Koropatkin, N. M., Smith, T. J., and Gordon, J. I.
(2009). Complex glycan catabolism by the human gut microbiota:
the Bacteroidetes Sus-like paradigm, J. Biol. Chem., 284, pp. 24673–
24677.
8. D’Elia, J. N., and Salyers, A. A. (1996). Contribution of a neopullulanase,
a pullulanase, and an α-glucosidase to growth of Bacteroides thetaio-
taomicron on starch, J. Bacteriol., 178, pp. 7173–7179.
9. Kitamura, M., Okuyama, M., Tanzawa, F., Mori, H., Kitago, Y., Watanabe,
N., Kimura, A., Tanaka, I., and Yao, M. (2008). Structural and functional
analysis of a glycoside hydrolase family 97 enzyme from Bacteroides
thetaiotaomicron, J. Biol. Chem., 283, pp. 36328–36337.
10. Cameron, E. A., Maynard, M. A., Smith, C. J., Smith, T. J., Koropatkin,
N. M., and Martens, E. C. (2012). Multidomain carbohydrate-binding
proteins involved in Bacteroides thetaiotaomicron starch metabolism, J.
Biol. Chem., 287, pp. 34614–34625.
11. Cantarel, B. L., Coutinho, P. M., Rancurel, C., Bernard, T., Lombard, V.,
and Henrissat, B. (2009). The carbohydrate-active enzymes database
(CAZy): an expert resource for glycogenomics, Nucleic Acids Res., 37, pp.
D233–D238.
12. Boraston, A. B., Bolam, D. N., Gilbert, H. J., and Davies, G. J. (2004).
Carbohydrate-binding modules: fine-tuning polysaccharide recognition,
Biochem. J., 382, pp. 769–781.
13. Svensson, B., Pedersen, T. G., Svendsen, I., Sakai, T., and Ottesen, M.
(1982). Characterization of two forms of glucoamylase from Aspergillus
niger, Carlsberg Res. Commun., 47, pp. 55–70.
14. Juge, N., Nøhr, J., Le Gal-Coëffet, M., Kramhøft, B., Furniss, C. S. M.,
Planchot, V., Archer, D. B., Williamson, G., and Svensson, B. (2006). The
activity of barley α-amylase on starch granules is enhanced by fusion of
a starch binding domain from Aspergillus niger glucoamylase, Biochim.
Biophys. Acta, 1764, pp. 275–284.
15. Møller, M. S., Abou Hachem, M., Svensson, B., and Henriksen, A. (2012).
Structure of the starch-debranching enzyme barley limit dextrinase
reveals homology of the N-terminal domain to CBM21, Acta Crystallogr.
Sect. F Struct. Biol. Cryst. Commun., 68, pp. 1008–1012.
16. Vester-Christensen, M., Abou Hachem, M., Svensson, B., and Henriksen,
A. (2010). Crystal structure of an essential enzyme in seed starch
degradation: barley limit dextrinase in complex with cyclodextrins, J.
Mol. Biol., 403, pp. 739–750.
17. Janeček, Š., Svensson, B., and MacGregor, E. A. (2011). Structural and
evolutionary aspects of two families of non-catalytic domains present
in starch and glycogen binding proteins from microbes, plants and
animals, Enzyme Microb. Technol., 49, pp. 429–440.
18. Chaen, K., Noguchi, J., Omori, T., Kakuta, Y., and Kimura, M. (2012).
Crystal structure of the rice branching enzyme I (BEI) in complex with
maltopentaose, Biochem. Biophys. Res. Commun., 424, pp. 508–511.
19. Cockburn, D., and Svensson, B. (2013). Chapter 9, Surface binding sites
in carbohydrate active enzymes: an emerging picture of structural and
functional diversity. In Carbohydrate Chemistry: Chemical and Biological
Approaches, Lindhorst, T. K., and Rauter, A. P., eds. (Royal Society of
Chemistry, Cambridge), pp. 204–221.
20. Cuyvers, S., Dornez, E., Delcour, J. A., and Courtin, C. M. (2012).
Occurrence and functional significance of secondary carbohydrate
binding sites in glycoside hydrolases, Crit. Rev. Biotechnol., 32, pp. 93–
107.
21. Blake, C. C., Koenig, D. F., Mair, G. A., North, A. C., Phillips, D. C., and Sarma,
V. R. (1965). Structure of hen egg-white lysozyme. A three-dimensional
Fourier synthesis at 2 Å resolution, Nature, 206, pp. 757–761.
22. Hassell, A. M., An, G., Bledsoe, R. K., Bynum, J. M., Carter, H. L., Deng, S. J.,
Gampe, R. T., Grisard, T. E., Madauss, K. P., Nolte, R. T., Rocque, W. J., Wang,
L., Weaver, K. L., Williams, S. P., Wisely, G. B., Xu, R., and Shewchuk, L. M.
(2007). Crystallization of protein-ligand complexes, Acta Crystallogr. D,
63, pp. 72–79.
23. Kadziola, A., Søgaard, M., Svensson, B., and Haser, R. (1998). Molecular
structure of a barley α-amylase-inhibitor complex: implications for
starch binding and catalysis, J. Mol. Biol., 278, pp. 205–217.
24. Koropatkin, N. M., and Smith, T. J. (2010). SusG: a unique cell-membrane-
associated α-amylase from a prominent human gut symbiont targets
complex starch molecules, Structure, 18, pp. 200–215.
25. Przylas, I., Terada, Y., Fujii, K., Takaha, T., Saenger, W., and Sträter,
N. (2000). X-ray structure of acarbose bound to amylomaltase from
Thermus aquaticus. Implications for the synthesis of large cyclic glucans,
Eur. J. Biochem., 267, pp. 6903–6913.
26. Gibson, R. M., and Svensson, B. (1987). Identification of tryptophanyl
residues involved in binding of carbohydrate ligands to barley α-
amylase 2, Carlsberg Res. Commun., 52, pp. 373–379.
27. Robert, X., Haser, R., Gottschalk, T., Ratajczek, F., Driguez, H., Svensson,
B., and Aghajari, N. (2003). The structure of barley α-amylase isozyme 1
reveals a novel role of domain C in substrate recognition and binding: a
pair of sugar tongs, Structure, 11, pp. 973–984.
28. Søgaard, M., Kadziola, A., Haser, R., and Svensson, B. (1993). Site-
directed mutagenesis of histidine 93, aspartic acid 180, glutamic
acid 205, histidine 290, and aspartic acid 291 at the active site and
tryptophan 279 at the raw starch binding site in barley α-amylase 1,
J. Biol. Chem., 268, pp. 22480–22484.
29. Payan, F., and Qian, M. (2003). Crystal structure of the pig pancreatic α-
amylase complexed with malto-oligosaccharides, J. Protein Chem., 22,
pp. 275–284.
30. Ramasubbu, N., Ragunath, C., and Mishra, P. J. (2003). Probing the role
of a mobile loop in substrate binding and enzyme activity of human
salivary amylase, J. Mol. Biol., 325, pp. 1061–1076.
31. Lyhne-Iversen, L., Hobley, T. J., Kaasgaard, S. G., and Harris, P. (2006).
Structure of Bacillus halmapalus α-amylase crystallized with and
without the substrate analogue acarbose and maltose, Acta Crystallogr.
Sect. F Struct. Biol. Cryst. Commun., 62, pp. 849–854.
32. Stam, M. R., Danchin, E. G. J., Rancurel, C., Coutinho, P. M., and
Henrissat, B. (2006). Dividing the large glycoside hydrolase family
13 into subfamilies: towards improved functional annotations of α-
amylase-related proteins, Prot. Eng. Des. Sel., 19, pp. 555–562.
33. Tan, T., Mijts, B. N., Swaminathan, K., Patel, B. K. C., and Divne, C.
(2008). Crystal structure of the polyextremophilic α-amylase AmyB
from Halothermothrix orenii: details of a productive enzyme-substrate
complex and an N domain with a role in binding raw starch, J. Mol. Biol.,
378, pp. 852–870.
34. Vujičić-Žagar, A., and Dijkstra, B. W. (2006). Monoclinic crystal form of
Aspergillus niger α-amylase in complex with maltose at 1.8 Å resolution,
Acta Crystallogr. Sect. F Struct. Biol. Cryst. Commun., 62, pp. 716–
721.
35. Dauter, Z., Dauter, M., Brzozowski, A. M., Christensen, S., Borchert, T.
V., Beier, L., Wilson, K. S., and Davies, G. J. (1999). X-ray structure
of Novamyl, the five-domain “maltogenic” α-amylase from Bacillus
stearothermophilus: maltose and acarbose complexes at 1.7 Å resolu-
tion, Biochemistry, 38, pp. 8385–8392.
36. Knegtel, R. M. A., Strokopytov, B., Penninga, D., and Faber, O. G.
(1995). Crystallographic studies of the interaction of cyclodextrin
glycosyltransferase from Bacillus circulans strain 251 with natural
substrates and products, J. Biol. Chem., 270, pp. 29256–29264.
37. Leemhuis, H., Dijkstra, B. W., and Dijkhuizen, L. (2002). Thermoanaer-
obacterium thermosulfurigenes cyclodextrin glycosyltransferase, Eur. J.
Biochem., 270, pp. 155–162.
38. Bornemann, S. (2013). Binding, catalysis and regulation of the anti-
TB target GH13_3 GlgE, Presented at the 5th Symposium on the Alpha-
Amylase Family, October 20–24, 2013, Smolenice Castle, Slovakia.
39. Skov, L. K., Mirza, O., Sprogøe, D., Dar, I., Remaud-Siméon, M., Albenne,
C., Monsan, P., and Gajhede, M. (2002). Oligosaccharide and sucrose
complexes of amylosucrase. Structural implications for the polymerase
activity, J. Biol. Chem., 277, pp. 47741–47747.
40. Kanai, R., Haga, K., Akiba, T., Yamane, K., and Harata, K. (2004).
Biochemical and crystallographic analyses of maltohexaose-producing
amylase from alkalophilic Bacillus sp. 707, Biochemistry, 43, pp. 14047–
14056.
41. Linden, A., Mayans, O., Meyer-Klaucke, W., Antranikian, G., and
Wilmanns, M. (2003). Differential regulation of a hyperthermophilic α-
amylase with a novel (Ca,Zn) two-metal center by zinc, J. Biol. Chem.,
278, pp. 9875–9884.
42. Geiger, J. H. (2013). Structure, function and specificity of branching
enzyme, Presented at the 5th Symposium on the Alpha-Amylase Family,
October 20–24, 2013, Smolenice Castle, Slovakia.
43. Timmins, J., Leiros, H. S., Leonard, G., Leiros, I., and McSweeney, S.
(2005). Crystal structure of maltooligosyltrehalose trehalohydrolase
from Deinococcus radiodurans in complex with disaccharides, J. Mol.
Biol., 347, pp. 949–963.
44. Sim, L., Beeren, S. R., Findinier, J., Dauvillee, D., Ball, S. G., Henriksen,
A., and Palcic, M. M. (2014). Crystal structure of the Chlamydomonas
starch debranching enzyme isoamylase ISA1 reveals insights into the
mechanism of branch trimming and complex assembly, J. Biol. Chem.,
289, pp. 22991–23003.
45. Caner, S., Nguyen, N., Aguda, A., Zhang, R., Pan, Y. T., Withers, S. G., and
Brayer, G. D. (2013). The structure of the Mycobacterium smegmatis
trehalose synthase reveals an unusual active site configuration and
acarbose-binding mode, Glycobiology, 23, pp. 1075–1083.
46. Abe, A., Tonozuka, T., Sakano, Y., and Kamitori, S. (2004). Complex
structures of Thermoactinomyces vulgaris R-47 α-amylase 1 with malto-
oligosaccharides demonstrate the role of domain N acting as a starch-
binding domain, J. Mol. Biol., 335, pp. 811–822.
47. Ajandouz, E. H., Abe, J., Svensson, B., and Marchis-Mouren, G. (1992).
Barley malt α-amylase. Purification, action pattern, and subsite map-
ping of isozyme 1 and two members of the isozyme 2 subfamily using
p-nitrophenylated maltooligosaccharide substrates, Biochim. Biophys.
Acta, 1159, pp. 193–202.
48. Bak-Jensen, K., André, G., Gottschalk, T. E., Paës, G., Tran, V., and
Svensson, B. (2004). Tyrosine 105 and threonine 212 at outermost
substrate binding subsites −6 and +4 control substrate specificity,
oligosaccharide cleavage patterns, and multiple binding modes of
barley α-amylase 1, J. Biol. Chem., 279, pp. 10093–10102.
49. Mori, H., Bak-Jensen, K., Gottschalk, T. E., Motawia, M. S., Damager,
I., Møller, B. L., and Svensson, B. (2001). Modulation of activity and
substrate binding modes by mutation of single and double subsites
+1/+2 and -5/-6 of barley α-amylase 1, FEBS J., 268, pp. 6545–6558.
50. Cuyvers, S., Dornez, E., Abou Hachem, M., Svensson, B., Hothorn, M.,
Chory, J., Delcour, J. A., and Courtin, C. M. (2012). Isothermal titration
calorimetry and surface plasmon resonance allow quantifying substrate
binding to different binding sites of Bacillus subtilis xylanase, Anal.
Biochem., 420, pp. 90–92.
51. Ludwiczek, M. L., Heller, M., Kantner, T., and McIntosh, L. P. (2007). A
secondary xylan-binding site enhances the catalytic activity of a single-
domain family 11 glycoside hydrolase, J. Mol. Biol., 373, pp. 337–354.
52. Nielsen, M. M., Bozonnet, S., Seo, E., Mótyán, J. A., Andersen, J. M.,
Dilokpimol, A., Abou Hachem, M., Gyémánt, G., Naested, H., Kandra,
L., Sigurskjold, B. W., and Svensson, B. (2009). Two secondary
carbohydrate binding sites on the surface of barley α-amylase 1 have
distinct functions and display synergy in hydrolysis of starch granules,
Biochemistry, 48, pp. 7686–7697.
53. Oudjeriouat, N., Moreau, Y., Santimone, M., Svensson, B., Marchis-
Mouren, G., and Desseaux, V. (2003). On the mechanism of α-amylase,
Eur. J. Biochem., 270, pp. 3871–3879.
54. Nielsen, J. W., Kramhøft, B., Bozonnet, S., Abou Hachem, M., Stipp, S. L.
S., Svensson, B., and Willemoës, M. (2012). Degradation of the starch
components amylopectin and amylose by barley α-amylase 1: role of
surface binding site 2, Arch. Biochem. Biophys., 528, pp. 1–6.
55. Bozonnet, S., Jensen, M. T., Nielsen, M. M., Aghajari, N., Jensen, M. H.,
Kramhøft, B., Willemoës, M., Tranier, S., Haser, R., and Svensson, B.
(2007). The “pair of sugar tongs” site on the non-catalytic domain C of
barley α-amylase participates in substrate binding and activity, FEBS J.,
274, pp. 5055–5067.
56. Cockburn, D., Nielsen, M. M., Christiansen, C., Andersen, J. M., Rannes,
J. B., Blennow, A., and Svensson, B. (2015). Surface binding sites in
amylase have distinct roles in recognition of starch structure motifs and
degradation, Int. J. Biol. Macromol., 75, pp. 338–345.
57. Kwan, A. H., Mobli, M., Gooley, P. R., King, G. F., and Mackay, J. P. (2011).
Macromolecular NMR spectroscopy for the non-spectroscopist, FEBS J.,
278, pp. 687–703.
58. Bøg-Hansen, T. C., and Brogren, C. H. (1975). Identification of glycopro-
teins with one and with two or more binding sites to Con A by crossed
immuno-affinoelectrophoresis, Scand. J. Immunol., 101, pp. 135–
139.
59. Hořejší, V., and Tichá, M. (1986). Qualitative and quantitative ap-
plications of affinity electrophoresis for the study of protein-ligand
interactions: a review, J. Chromatogr., 376, pp. 49–67.
60. Blennow, A., Viksø-Nielsen, A., and Morell, M. K. (1998). α-glucan
binding of potato-tuber starch-branching enzyme I as determined by
tryptophan fluorescence quenching, affinity electrophoresis and steady-
state kinetics, Eur. J. Biochem., 252, pp. 331–338.
61. Tomme, P., Boraston, A., Kormos, J. M., Warren, R. A., and Kilburn, D. G.
(2000). Affinity electrophoresis for the identification and characteriza-
tion of soluble sugar binding by carbohydrate-binding modules, Enzyme
Microb. Technol., 27, pp. 453–458.
62. Fraiberg, M., Borovok, I., Weiner, R. M., Lamed, R., and Bayer, E. A. (2012).
Bacterial cadherin domains as carbohydrate binding modules: deter-
mination of affinity constants to insoluble complex polysaccharides,
Methods Mol. Biol., 908, pp. 109–118.
63. Yaniv, O., Jindou, S., Frolow, F., Lamed, R., and Bayer, E. A. (2012). A simple
method for determining specificity of carbohydrate-binding modules
for purified and crude insoluble polysaccharide substrates, Methods
Mol. Biol., 908, pp. 101–107.
64. Ragunath, C., Manuel, S. G. A., Venkataraman, V., Sait, H. B. R., Kasinathan,
C., and Ramasubbu, N. (2008). Probing the role of aromatic residues at
the secondary saccharide-binding sites of human salivary α-amylase in
substrate hydrolysis and bacterial binding, J. Mol. Biol., 384, pp. 1232–
1248.
65. Ševčík, J., Hostinová, E., Solovicová, A., Gašperík, J., Dauter, Z., and Wilson,
K. S. (2006). Structure of the complex of a yeast glucoamylase with
acarbose reveals the presence of a raw starch binding site on the
catalytic domain, FEBS J., 273, pp. 2161–2171.
66. Ševčík, J., Hostinová, E., Solovicová, A., Gašperík, J., Dauter, Z., and Wilson,
K. S. (2005). Acarbose binding at the surface of Saccharomycopsis
fibuligera glucoamylase suggests the presence of a raw starch-binding
site, Biologia, 60(Suppl. 16), pp. 167–170.
67. Cuyvers, S., Dornez, E., Delcour, J. A., and Courtin, C. M. (2011). The
secondary substrate binding site of the Pseudoalteromonas haloplanktis
GH8 xylanase is relevant for activity on insoluble but not soluble
substrates, Appl. Microbiol. Biotechnol., 92, pp. 539–549.
68. Cuyvers, S., Dornez, E., Rezaei, M. N., Pollet, A., Delcour, J. A., and Courtin,
C. M. (2011). Secondary substrate binding strongly affects activity and
binding affinity of Bacillus subtilis and Aspergillus niger GH11 xylanases,
FEBS J., 278, pp. 1098–1111.
69. Meekins, D. A., Guo, H., Husodo, S., Paasch, B. C., Bridges, T. M., Santelia,
D., Kötting, O., Vander Kooi, C. W., and Gentry, M. S. (2013). Structure
of the Arabidopsis glucan phosphatase like sex four2 reveals a unique
mechanism for starch dephosphorylation, Plant Cell, 25, pp. 2302–2314.
70. Rich, R. L., and Myszka, D. G. (2007). Survey of the year 2006 commercial
optical biosensor literature, J. Mol. Recognit., 20, pp. 300–366.
71. Perozzo, R., Folkers, G., and Scapozza, L. (2004). Thermodynamics of
protein–ligand interactions: history, presence, and future aspects, J.
Recept. Signal Transduct. Res., 24, pp. 1–52.
72. Fujii, K., Minagawa, H., Terada, Y., Takaha, T., Kuriki, T., Shimada, J., and
Kaneko, H. (2007). Function of second glucan binding site including
tyrosines 54 and 101 in Thermus aquaticus amylomaltase, J. Biosci.
Bioeng., 103, pp. 167–173.
73. Asano, N., Ishii, S., Kizu, H., Ikeda, K., Yasuda, K., Kato, A., Martin, O. R.,
and Fan, J. Q. (2000). In vitro inhibition and intracellular enhancement
of lysosomal α-galactosidase A activity in Fabry lymphoblasts by 1-
deoxygalactonojirimycin and its derivatives, FEBS J., 267, pp. 4179–
4186.

74. Fan, J. Q., Ishii, S., Asano, N., and Suzuki, Y. (1999). Accelerated transport
and maturation of lysosomal α-galactosidase A in Fabry lymphoblasts
by an enzyme inhibitor, Nat. Med., 5, pp. 112–115.
75. Ozen, H. (2007). Glycogen storage diseases: new perspectives, World J.
Gastroenterol., 13, pp. 2541–2553.
76. Zarate, Y. A., and Hopkin, R. J. (2008). Fabry’s disease, Lancet, 372, pp.
1427–1435.
77. Guce, A. I., Clark, N. E., Rogich, J. J., and Garman, S. C. (2011).
The molecular basis of pharmacological chaperoning in human α-
galactosidase, Chem. Biol., 18, pp. 1521–1526.
78. Guce, A. I., Clark, N. E., Salgado, E. N., Ivanen, D. R., Kulminskaya, A. A.,
Brumer, H., and Garman, S. C. (2010). Catalytic mechanism of human α-
galactosidase, J. Biol. Chem., 285, pp. 3625–3632.
79. Song, L., Siguier, B., Dumon, C., Bozonnet, S., and O’Donohue, M. J. (2012).
Engineering better biomass-degrading ability into a GH11 xylanase
using a directed evolution strategy, Biotechnol. Biofuels, 5, p. 3.
80. Liu, Y. H., Hu, B., Xu, Y. J., Bo, J. X., Fan, S., Wang, J. L., and Lu, F. P. (2012).
Improvement of the acid stability of Bacillus licheniformis α-amylase by
error-prone PCR, J. Appl. Microbiol., 113, pp. 541–549.
81. Nielsen, J. E., and Borchert, T. V. (2000). Protein engineering of
bacterial α-amylases, Biochim. Biophys. Acta, 1543, pp. 253–274.
82. Priyadharshini, R., Manoharan, S., Hemalatha, D., and Gunasekaran,
P. (2010). Repeated random mutagenesis of α-amylase from Bacillus
licheniformis for improved pH performance, J. Microbiol. Biotechnol., 20,
pp. 1696–1701.
83. Deng, Z., Yang, H., Li, J., Shin, H., Du, G., Liu, L., and Chen, J. (2014).
Structure-based engineering of alkaline α-amylase from alkaliphilic
Alkalimonas amylolytica for improved thermostability, Appl. Microbiol.
Biotechnol., 98, pp. 3997–4007.
84. Liu, L., Deng, Z., Yang, H., Li, J., Shin, H., Chen, R. R., Du, G., and
Chen, J. (2014). In silico rational design and systems engineering of
disulfide bridges in the catalytic domain of an alkaline α-amylase
from Alkalimonas amylolytica to improve thermostability, Appl. Environ.
Microbiol., 80, pp. 798–807.
85. Yang, H., Liu, L., Shin, H., Chen, R. R., Li, J., Du, G., and Chen, J. (2013).
Integrating terminal truncation and oligopeptide fusion for a novel
protein engineering strategy to improve specific activity and catalytic
efficiency: alkaline α-amylase as a case study, Appl. Environ. Microbiol.,
79, pp. 6429–6438.
86. Yang, H., Liu, L., Shin, H., Li, J., Du, G., and Chen, J. (2013). Structure-
guided systems-level engineering of oxidation-prone methionine
residues in catalytic domain of an alkaline α-amylase from Alkalimonas
amylolytica for significant improvement of both oxidative stability and
catalytic efficiency, PLOS ONE, 8, p. e57403.
87. Yang, H., Liu, L., Wang, M., Li, J., Wang, N. S., Du, G., and Chen, J. (2012).
Structure-based engineering of methionine residues in the catalytic
cores of alkaline amylase from Alkalimonas amylolytica for improved
oxidative stability, Appl. Environ. Microbiol., 78, pp. 7519–7526.
88. Brzozowski, A. M., Lawson, D. M., Turkenburg, J. P., Bisgaard-Frantzen,
H., Svendsen, A., Borchert, T. V., Dauter, Z., Wilson, K. S., and Davies, G.
J. (2000). Structural analysis of a chimeric bacterial α-amylase. High-
resolution analysis of native and ligand complexes, Biochemistry, 39, pp.
9099–9107.

March 23, 2016 12:34 PSP Book - 9in x 6in 08-Allan-Svendsen-c08

Chapter 8

Interfacial Enzymes and Their Interactions with Surfaces: Molecular
Simulation Studies

Nathalie Willems, Mickaël Lelimousin, Heidi Koldsø,
and Mark S. P. Sansom
Department of Biochemistry, University of Oxford, South Parks Road,
Oxford, OX1 3QU, UK
mark.sansom@bioch.ox.ac.uk

8.1 Introduction

Enzymes that function at an interface (i.e., interfacial enzymes) act
in biological processes as diverse as cell signaling and transduction,
membrane remodeling, digestion, endocytosis, and inflammation
[1]. Indeed, almost half the proteins in a cell may exhibit a degree
of association with membranes and, therefore, must work at an
interface [2]. Interfaces can include the cellular membrane environ-
ment, into which nonpolar enzymatic substrates can partition, or the
aggregation of nonpolar substrates that form micelle-like structures,
liposomal dispersions, or emulsions [3, 4]. Interfacial enzymes have
evolved to function within the physical constraints imposed by
such an environment, including effects on substrate accessibility,
enzyme orientation, partitioning, and the equilibrium between
associated and dissociated states [4]. These processes ultimately
govern the kinetics of the catalyzed reaction and determine the
conditions essential for the normal functioning of biologically
important processes. These dynamic processes are significant in
understanding the cell biology of membranes and the intracellular
environment and are also of biotechnological relevance. Interfacial
enzymes have considerable potential for industrial applications,
such as lipases in food, detergent, and related industries. For
example, Candida antarctica lipase B is commercially available
as an immobilized enzyme in laundry detergent [5]. However,
fuller exploitation of interfacial enzymes remains limited by our
incomplete molecular understanding of the underlying dynamic
enzymatic mechanism.
Molecular dynamics (MD) simulations provide a computational
approach to studying the dynamic behavior of interfacial enzymes.
Elucidating how these proteins interact with interfaces and undergo
subsequent conformational changes is critical to understanding
the underlying mechanisms of biological processes. Over recent
years, MD studies have contributed to unraveling how
a number of different interfacial enzymes interact with surfaces,
revealing the structural and functional consequences of these inter-
actions [6–10]. Gaining a deeper understanding of these processes
has helped us to better harness the power of many interfacial
enzymes, especially lipases. Lipases are important in processes
including chiral synthesis, fat dispersion, and flavor chemistry and
have been the subject of a number of computational studies that we
review in this chapter. Other interfacial enzymes such as cellulases
are important for biofuel production and also have been studied
computationally [10, 11], although they are not covered in detail
here. This chapter will focus on how MD simulations have aided
in exploring the structure–function relationships of a number of
interfacial enzymes, with a particular emphasis on lipases and other
enzymes acting at lipid bilayers and related surfaces.

8.2 Enzyme Interactions at Interfaces

Enzymes that display activity at an aqueous/membrane (or related)
interface include transferases, such as lipid kinases, and hydrolases.
Hydrolases, which catalyze the hydrolysis of chemical bonds, include
lipases, phospholipases (Fig. 8.1B), esterases, and phosphatases [3].
The kinetic mechanisms
exhibited by these interfacial enzymes differ, depending on the
family in question. In particular, lipases exhibit a unique mechanism
of interfacial catalysis whereby the enzyme becomes activated at
the interface. An early example of this was the dramatic increase
in activity exhibited by pancreatic lipase when its substrate was
present in micelles compared to when it was dispersed in solution
[13]. Thus, the lipase activity is not directly influenced by the molar
concentration of substrate but rather by the substrate concentration
at the interface. Such activation is seen with many other lipases and
is influenced by the lipase itself, its substrate, and the interface the
enzyme interacts with.
Clearly, the kinetics of interfacial enzyme catalysis are more
complex than those for enzymes free in solution and do not
follow a simple Michaelis–Menten-type dependence on substrate
concentration [1]. Early kinetic models such as the Verger–de Haas
model of interfacial lipid hydrolysis involve the equilibrium between
associated and dissociated states of the enzyme binding to the
interface and the formation of an enzyme–substrate complex [14].
An extension of these models based on kinetic studies with
phospholipases is shown in Fig. 8.1A [2, and references therein]. This
model takes into account the substrate-partitioning equilibrium at
the interface, the dissociation of the enzyme–product complex after
enzyme catalysis, and the equilibrium dissociation of the product
at the interface. However, interfacial enzymes often hydrolyze
esters of long-chain fatty acids and will therefore generate water-insoluble
products that may alter the equilibrium dissociation of the product
from the interface. The physicochemical properties of the product
species will therefore affect its association with a hydrophobic lipid
interface and could in turn lead to molecular reorganization of
the interface (segregation of water-insoluble products), as well as
affecting the activity of the enzyme itself [15, 16].
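
As a rough illustration of the kind of kinetic scheme discussed above, the following Python sketch numerically integrates a minimal Verger–de Haas-type model: the enzyme adsorbs to the interface (E <-> E*), binds interfacial substrate (E* + S <-> E*S), and releases product (E*S -> E* + P). The function name, rate constants, and starting amounts are hypothetical values chosen for illustration, and all species are expressed in the same arbitrary units, which glosses over the distinction between bulk and surface concentrations. Product partitioning and reorganization of the interface are not modeled.

```python
def simulate_interfacial_kinetics(e_total=1.0, s_surface=50.0,
                                  k_ads=0.5, k_des=0.1,
                                  k_on=0.2, k_off=1.0, k_cat=5.0,
                                  dt=1e-3, t_end=20.0):
    """Integrate E <-> E*, E* + S <-> E*S -> E* + P with explicit Euler steps.

    All parameters are hypothetical and in arbitrary, mutually consistent units.
    """
    e_bulk, e_star, es, s, p = e_total, 0.0, 0.0, s_surface, 0.0
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        # Net fluxes for the three elementary steps at the current state.
        adsorption = k_ads * e_bulk - k_des * e_star   # E -> E*
        binding = k_on * e_star * s - k_off * es       # E* + S -> E*S
        catalysis = k_cat * es                         # E*S -> E* + P
        e_bulk -= adsorption * dt
        e_star += (adsorption - binding + catalysis) * dt
        es += (binding - catalysis) * dt
        s -= binding * dt
        p += catalysis * dt
    return {"E": e_bulk, "E*": e_star, "E*S": es, "S": s, "P": p}


with_interface = simulate_interfacial_kinetics()
without_adsorption = simulate_interfacial_kinetics(k_ads=0.0)
```

In this toy model, setting the adsorption rate constant to zero leaves all enzyme in the bulk phase and no product is formed, which mirrors the idea that turnover depends on the enzyme reaching the interface rather than on the bulk substrate concentration alone.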

Figure 8.1 (A) Kinetic model for interfacial catalysis and activation based
on Ref. [2]. Enzyme (E) and substrate (S) species at the interface are in
equilibrium with their counterparts in the surrounding aqueous phase.
(B) Binding of an interfacial enzyme, porcine pancreatic phospholipase A2,
to a lipid bilayer. Interacting residues are represented as van der Waals
spheres, colored blue (W3), red (L19), and green (M20), and are thought to
act as hydrophobic anchors for binding to the interface. Reprinted from Ref.
[12], Copyright (2008), with permission from Elsevier.

Crystal structures of interfacial enzymes may provide clues about
the underlying structural determinants of these complex reaction
mechanisms. For example, the first crystal structure of an active
lipase (Rhizomucor miehei lipase) bound to an inhibitor revealed
structural factors related to activation of the lipase at an interface.
This provided insights about how the interaction of lipases with lipid
bilayers may influence their association equilibrium [17]. Similarly,
crystal structures of cobra venom phospholipase A2 suggested a
mechanism involving facilitated diffusion of substrate from the
interfacial binding surface to the catalytic site [18]. All these factors
will influence the overall kinetic mechanisms exhibited by interfacial
enzymes.
While crystallographic studies may provide (static) structures
of multiple conformational states of an enzyme, they do not
directly provide exhaustive details of the motions involved in
the mechanisms of interfacial binding and/or activation. Enzyme
recognition and binding to a dynamic interface such as a lipid bilayer
will likely affect its conformational dynamics. For example, calcium
destabilizes substrate binding to the cobra venom phospholipase
A2, and interfaces with high curvature may promote diffusion of
lipid into the substrate-binding channel [18]. These dynamic events
will affect the kinetics of the enzyme-catalyzed reaction and are
difficult to deduce from crystal structures. Thus, techniques able
to probe molecular motions are required to better understand
the catalytic activity of interfacial enzymes. Molecular simulations
allow us to describe the molecular motions that occur during and
following enzyme interactions with an interface and, thus, enable the
formulation of more detailed and quantitatively predictive dynamic
models of the mechanism of action of interfacial enzymes.

8.3 Molecular Dynamics Simulations of Biomolecular Systems

MD simulations allow us to predict the motions of atoms or particles
within a biomolecular system [19, 20]. Most MD simulations represent
all atoms of the simulation system explicitly. This allows details
of protein/ligand interactions, protein conformational dynamics,
and structural changes to be studied at atomic resolution. MD
simulations therefore offer a unique way to study enzyme properties
[21–23]. However, the computational demands of such atomistic
simulations can be quite severe, especially for complex systems
involving lipid bilayers or comparable interfacial environments
[24, 25]. More recently, coarse-grained (CG) methods have been
developed for simulations of membranes and related systems [11,
19, 22, 26]. CG models group three or four atoms into a single
equivalent particle (Fig. 8.2). They are simplified models and are
thus computationally less expensive (by a factor of 100 to 1000),
allowing for much longer simulations of larger systems [27].
One of the most widely applied CG methods for lipid simulations is
Martini [24, 28]. The Martini force field has been used to simulate
a wide range of systems, including proteins, complex membranes,
carbohydrates, glycoconjugated systems, and inorganic materials [30,
31].

Figure 8.2 Phospholipid (DPPC) shown in atomistic and coarse-grained
representations, based on the Martini CG model [23]. Mapping roughly
follows a 4:1 scheme, where four nonhydrogen atoms are mapped into a
single CG bead. Representations are shown as van der Waals spheres, where
glycerol groups are in green, phosphate in red, and choline in blue.
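
The roughly four-to-one mapping used by CG models such as Martini can be sketched in a few lines of Python. The example below simply groups consecutive non-hydrogen atoms and places one bead at the center of mass of each group. This is an illustrative simplification: real CG mappings are defined per molecule from chemical building blocks rather than from atom order, and the function name, masses, and coordinates here are hypothetical.

```python
def coarse_grain(atoms, atoms_per_bead=4):
    """Map heavy atoms onto beads placed at each group's center of mass.

    `atoms` is a list of (mass, (x, y, z)) tuples for non-hydrogen atoms.
    Grouping consecutive atoms is a simplification for illustration only.
    """
    beads = []
    for start in range(0, len(atoms), atoms_per_bead):
        group = atoms[start:start + atoms_per_bead]
        total_mass = sum(mass for mass, _ in group)
        # Mass-weighted average position along each Cartesian axis.
        center = tuple(
            sum(mass * pos[axis] for mass, pos in group) / total_mass
            for axis in range(3)
        )
        beads.append((total_mass, center))
    return beads


# Hypothetical eight-carbon chain segment laid out along x (coordinates in nm).
chain = [(12.011, (0.125 * i, 0.0, 0.0)) for i in range(8)]
beads = coarse_grain(chain)
```

Here eight heavy atoms collapse into two beads, which gives a sense of where the reduction in particle number, and hence much of the computational saving, comes from.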
Still, there are considerable challenges to simulating enzymes at
interfaces. In particular, one must use simulations to help define the
exact location and orientation of the enzyme at an interface. The
relatively slow relaxation of a soft interface (e.g., a lipid bilayer) in
response to binding of an enzyme may require long simulations (e.g.,
microseconds) to allow for a fully equilibrated position/orientation
of the enzyme to be reached. CG simulations, due to their increased
computational efficiency, are thus well suited to exploring such
processes. However, fully characterizing conformational changes of
the protein once bound to such an interface requires more detailed
atomistic simulations. Methods have been developed to convert CG
simulations to atomistic models [31, 32], enabling one to study
biological events on a larger range of timescales and at different
resolutions [33–36]. Multiscale simulation techniques are therefore
well suited to investigating enzyme/interface interactions.

8.4 Lipases

Lipases are among the best-studied examples of interfacial enzymes,
in part as a consequence of their roles in fat metabolism, human
disorders, and industrial biocatalysis [31–33]. The natural function
of lipases is to hydrolyze the ester bonds in triacylglycerols. They
belong to the serine hydrolase family of enzymes, containing a
Ser/His/Asp catalytic triad [37–40]. Lipases are “promiscuous”
enzymes, able to catalyze additional reactions, including
transesterification, aminolysis, alcoholysis, aldol addition, and epoxidation [41].
This versatility has resulted in the exploitation of lipases in food,
detergent, pharmaceutical, and other industries [5].
As previously mentioned, many lipases display interfacial ac-
tivation, whereby the interface affects enzyme activity. Interfacial
activation in lipases involves the movement of an amphipathic,
flexible region (the “lid”) that covers the active site. The lid is
thought to undergo a conformational change upon interaction with
a hydrophobic interface, which “opens” the lipase, making the active
site accessible to the substrate (Fig. 8.3). Many factors can affect
how the lipase is activated and the degree of activation exhibited,
including the nature of the interface. For example, while surfaces
with a high interfacial tension may irreversibly denature an enzyme
[14], lipases can remain fully active at surfaces with a low interfacial
tension (e.g., substrate emulsions) [43]. The types of interactions
between the enzyme and the interface are also important, as
enzyme inhibition can occur via electrostatic interactions that cause
conformational changes and substrate aggregation.

Figure 8.3 Open crystal structure of TLL in blue (1DT5) and closed struc-
ture in green (1DTE) [42]. Opening of TLL structure occurs via movement
of the α1 (lid) helix, indicated by the arrow, allowing substrate access to the
underlying catalytic triad, shown in orange stick representations.

A central issue is thus to elucidate the extent to which these
enzymes bind to, react with, or are activated by an interface and how
protein interaction with the surface (e.g., in the lid region) leads to
interfacial activation, substrate accessibility, and catalysis. Molecular
simulations provide a powerful method for understanding these
factors.

8.4.1 Atomistic MD Studies of Lipase Interactions with Interfaces
A number of atomistic MD simulations of lipases have provided
dynamic conformational insights into how lipases may bind to
interfaces. Simulations of both the open and the closed confor-
mations of Thermomyces lanuginosa lipase (TLL) used an air–
water interface as a simple proxy for a lipid–water interface [44].
The simulations were compared to experimental X-ray reflectivity
data of the same system. This enabled investigators to deduce the
orientation and conformation of the lipase at the air–water interface,
indicating that TLL interacts with this interface in a semi-open
conformation, with the lid orientated away from the interface. The
suggestion that lipases may be able to bind interfaces in a number
of different orientations has implications for lid movement and
activation, as seen in other experimental studies [45, 46]. This also
has implications for the reaction kinetics and thermodynamics of
lipase–surface interactions.
Simulations of lipases at more complex interfaces have also
proved useful in elucidating the binding orientations and con-
formational changes that may occur upon interaction. One such
study involved simulations of Candida rugosa lipase (CRL) at three
different water/alkane interfaces, namely with hexane, octane, and
decane [8]. The authors reported opening of the initially closed
CRL structure (PDB 1TRH [47]) after 4 ns of atomistic simulation
in the presence of each of the interfaces, with varying degrees
of lid opening, depending on the alkane. The largest root-mean-
square deviation (RMSD) of the lid region in comparison to the
closed structure of the enzyme was 14 Å upon interaction with
the decane interface, compared to 10 Å and 7 Å for octane and
hexane, respectively. Hydrogen-bonding patterns between the lid
and interface also differed depending on the alkane. Interestingly,
the open structure of CRL generated in these simulations differed
from the open crystal structure of CRL (PDB 1CRL [48]), particularly
in the orientation adopted by the lid (Fig. 8.4). The hinge region
involved in lid activation observed in the simulations included
residues Gln63 and Val86, while the hinge residues identified from
comparison of the closed and open crystal structures were Glu66
and Pro92. These results suggest that the pathway to an activated
lipase structure may differ, depending on the exact nature of the
interface as well as on the presence of substrate and the binding
orientation of the enzyme.
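
The lid RMSD values quoted above measure how far the lid atoms have moved from their positions in the closed structure. A minimal Python sketch of this quantity follows. It assumes that the two structures have already been superimposed on a common reference (for example, the protein core), so no fitting step is performed, and the coordinates are hypothetical values rather than data from the study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate sets.

    Assumes both structures are already superimposed on a common reference;
    no least-squares fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical lid C-alpha positions (in angstroms): closed vs. displaced lid.
closed_lid = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
displaced_lid = [(0.0, 0.0, 7.0), (3.8, 0.0, 7.0), (7.6, 0.0, 7.0)]
lid_rmsd = rmsd(closed_lid, displaced_lid)
```

Because RMSD averages over all selected atoms, a rigid displacement of the whole lid and a large movement of only a few residues can give similar values, so it is usually read alongside structural comparisons such as the one in Fig. 8.4.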

Figure 8.4 Position of the lid helix of CRL after simulation (red) compared
to the lid helix in the open crystal structure of CRL (blue) and closed
structure (green) for the decane–aqueous interface. Ser209, His449, and
Glu341 form the catalytic triad and are colored blue, wheat, and cyan,
respectively (sticks). Reprinted from Ref. [8], Copyright (2007), with
permission from Elsevier.

Insights into the possible role of the lid have also been
obtained from simulations investigating the activation process of
a Pseudomonas aeruginosa lipase (PAL) [49]. This lipase was
crystallized in the open (active) state and was thought to possess
one distinct lid region spanning residues 125–148. MD simulations
of the open lipase structure in water suggested, however, that the
enzyme may instead contain two lid regions, both with possible
functional roles in activation of the lipase. The second lid was
identified between residues 210 and 222 and was suggested to trigger
movement of the first lid region (residues 125–148). An increased
RMSD in both lid regions was observed over the course of
20 ns of atomistic simulations, alongside a decreasing distance
between these regions. It was suggested that this corresponds to
the early stage(s) of PAL undergoing a transition from an open
to closed state when in an aqueous environment, driven by a
tendency to bury water-exposed hydrophobic residues on the two
lid regions. Interestingly, subsequent simulations of the resultant
closed structure of PAL (re)positioned at an octane–water interface
resulted in movement of the lid regions back to their original
(open) conformations, as seen in the crystal structure. Mutations
of hydrophobic residues located in the second proposed lid of the

www.ebook3000.com
March 23, 2016 12:34 PSP Book - 9in x 6in 08-Allan-Svendsen-c08

Lipases 307

closed structure resulted in inability of the enzyme to open at the


interface, opposite to what is seen with the wild type. Simulations
of the same mutants in the original open PAL structure in water
also resulted in an inability of the enzyme to close, indicating the
importance of hydrophobic interactions in lipase activation. This
study thus highlights how the structural features of lipases (e.g.,
lid regions) and their interactions can influence lipase activation
at interfaces. This investigation also indicates that crystallographic
studies, while revealing key open conformations of these proteins,
may not reveal all possible conformations of lid regions upon
activation at different types of interfaces.

8.4.2 The Role of Water in Lipase Catalysis at Interfaces


Computational studies have also focused on understanding the role
of water in the catalytic action of lipase. Lipases may catalyze
different reactions, depending on water access to the active site, a
topic of particular relevance for industrial applications of lipases.
Water access to the active site may be altered depending on lipase
orientation and interactions at the interface. For example, TLL can
catalyze the hydrolysis of triacylglycerols at high water content
but catalyzes transesterification of substrates under virtually anhydrous
conditions [50]. A two-step water-shuttling mechanism has been
proposed for this enzyme, whereby water is released during the
adsorption/activation process upon lipase interaction with the
interface [51]. It is thus of interest to understand how water is
displaced during lipase binding/activation at an interface while also
playing a key role in the reaction catalyzed.
Simulations of open TLL at a tributyrin substrate interface were
useful in elucidating some of these aspects [52]. The study focused
particularly on the role of ordered water molecules found within the
crystal structure of open TLL and the role of a buried cluster of polar
amino acids within the protein cavities (Fig. 8.5) [53]. Indeed, these
ordered water molecules and the polar residues could have roles
in attenuating water access to the catalytic site. Over the course of
30 ns simulations, the presence of a secondary lipid-binding pocket
was revealed, as well as a polar channel that connects the active site
pocket to the surrounding solvent. Hydrogen bond analysis revealed
that residues surrounding the active site allowed water movement
from the catalytic site to the polar channel, which has access to the
surrounding aqueous environment. The simulations of the protein
thus suggested a possible role for a tyrosine residue positioned
near the polar channel, acting as a valve to control water flow to
the active site. Interestingly, varying the amount of water
present in the protein at the start of the simulation also affected the
motion of substrate molecules in the vicinity of the active site:
increased substrate interactions with catalytic residues were seen
when starting from a TLL structure with no initial water molecules
bound. This study indicates the importance of water access to the
active site of the bound lipase and how simulations may be used
to explore the dependence of this access on lipase orientation at an
interface.

Figure 8.5 Open TLL crystal structure with polar residues (blue sticks) in
a cavity close to the catalytic residues (orange sticks) that could mediate
interactions with water [53]. The lid region is colored in cyan.

8.5 Coarse-Grained MD Studies of Interfacial Enzymes:
Orientation and Interactions

Atomistic simulations are well suited to exploring the
conformational dynamics of an enzyme once correctly located at an
interface. However, it is often difficult to predict the location and
orientation of an enzyme at a soft interface such as a lipid bilayer.
Although biomolecular CG simulations have previously focused on
studying transmembrane proteins and lipid interactions [54, 55],
they constitute a valuable approach to predicting the nature and
extent of penetration of an interfacial enzyme into an interface [56].
These predictions can then be compared with available biophysical
and other experimental data.
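
Two simple descriptors are often enough to summarize where an enzyme sits at a bilayer in a simulation frame: the tilt of a chosen protein axis relative to the interface normal, and how far a given residue lies below the plane of the lipid phosphates. The Python sketch below shows one way these might be computed. It assumes a bilayer lying in the xy plane with the enzyme bound from the +z side, and the function names and coordinates are hypothetical.

```python
import math


def tilt_angle_deg(axis_start, axis_end, normal=(0.0, 0.0, 1.0)):
    """Angle in degrees between a protein axis and the interface normal."""
    axis = [end - start for start, end in zip(axis_start, axis_end)]
    norm_axis = math.sqrt(sum(a * a for a in axis))
    norm_normal = math.sqrt(sum(n * n for n in normal))
    if norm_axis == 0.0 or norm_normal == 0.0:
        raise ValueError("axis and normal must have non-zero length")
    dot = sum(a * n for a, n in zip(axis, normal))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_axis * norm_normal)))
    return math.degrees(math.acos(cosine))


def penetration_depth(residue_z, phosphate_z_values):
    """Depth of a residue below the mean phosphate plane (positive = inserted).

    Assumes the bilayer lies in the xy plane with the enzyme bound from +z.
    """
    plane_z = sum(phosphate_z_values) / len(phosphate_z_values)
    return plane_z - residue_z


# Hypothetical single-frame values (coordinates in nm).
tilt = tilt_angle_deg((0.0, 0.0, 2.0), (1.0, 0.0, 3.0))
depth = penetration_depth(1.6, [2.0, 2.1, 1.9])
```

Tracking such descriptors over a trajectory, rather than for a single frame, is what allows a predicted binding mode to be compared with biophysical measurements.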

8.5.1 Phospholipase A2
Interfacial enzymes acting at membrane surfaces include
phospholipases, which function to hydrolyze the bonds in phospholipids
[3]. Phospholipases are classified according to which bond in the
phospholipid is cleaved. For example, phospholipases A2 hydrolyze
phospholipids at the sn-2 acyl-ester bond, whereas phospholipases
A1 hydrolyze the sn-1 acyl-ester bond [57]. The phospholipase A2
family is a particularly well-studied class of interfacial enzyme.
They have evolved to hydrolyze phospholipids at organized lipid–
aqueous interfaces, are components of insect and snake venoms and
digestive secretions in mammals, and also function in inflammation.
Phospholipases A2 have thus provided important test cases for
understanding binding and orientation of interfacial enzymes.
Porcine pancreatic phospholipase A2 (pPLA2) was the subject
of a CG simulation study investigating enzyme interactions and
association with bilayers as a function of the nature of the phospho-
lipid headgroups [12] (Figs. 8.1B and 8.6). The bilayers contained
differing amounts of zwitterionic (phosphatidylcholine, PC) and
anionic (phosphatidylglycerol, PG) lipids. From these simulations,
the investigators identified a patch of hydrophobic residues on the
surface of pPLA2, surrounded by polar and basic amino acids, that
preferentially interacts with the anionic glycerol headgroups of PG
lipids (Fig. 8.6). A number of experimental studies have shown the
importance of electrostatics in PLA2–surface interactions, showing
a preference for anionic surfaces [58, 59]. The results from the
simulations were thus in good agreement with experimental data
that proposed hydrogen bonding between pPLA2 and lipid carbonyl
oxygens and electrostatic interactions with lipid phosphate moieties
[59]. These interactions anchor the protein in a specific orientation
in the bilayer, such that the catalytic His48 residue is positioned
directly above the bilayer. This suggests that the bound orientation
of pPLA2 could form a catalytically competent complex with the
membrane, able to hydrolyze substrates partitioned into the bilayer.
Importantly, the CG simulation sampled the enzyme recognition and
binding to the interface from an initially distant position, mimicking
the stochastic enzyme approach to a surface, as may occur in vivo.
More generally, this study illustrates the use of CG MD simulations
in studying specific interactions of enzymes with interfaces such as
bilayers.

Figure 8.6 Phospholipase A2 enzyme binding to phospholipid membrane
during a CG simulation. The hydrophobic anchor residues are shown as red
spheres, and lipid headgroups are shown as gold spheres. Reprinted from
Ref. [12], Copyright (2008), with permission from Elsevier.

8.5.2 PTEN
A more complex example of interfacial binding involves multido-
main recognition of the interface by the phosphatase PTEN [60].
PTEN is a key tumor suppressor protein that shuttles between a
cytosolic and an interfacial (membrane-bound) location [61]. When
bound to the inner leaflet of mammalian cell membranes, PTEN
can hydrolyze PIP3 to PIP2, thus attenuating downstream signaling
by PI3K/Akt [62]. Multiscale (CG and AT) MD simulations have
identified specific loops in a membrane recognition (C2) domain of
PTEN and related proteins that form direct interactions with anionic
lipids [51, 54]. CG studies also show clustering of PIP3 molecules
around the bound protein, suggesting strong interactions between
the enzyme and its substrate. These interactions may also function
as a driving force for reorientation of the initial encounter complex
of the protein at the interface, along with local reorganization
of the membrane environment, to form a catalytically competent
PTEN–interface complex (Fig. 8.7). Productive orientations of
PTEN within the membrane were previously proposed by both
experimental and computational studies [55, 56], suggesting that
maximum interaction of the PTEN domains with anionic lipids
causes reorientation of the protein for optimal binding of the PIP3
substrate [63–65].

Figure 8.7 Binding of PTEN to, and its reorientation on, a bilayer
containing neutral lipids (gray beads), anionic lipids (blue beads),
and PIP3 molecules (red beads) during a CG simulation. The C2 domain of
PTEN is shown in yellow, and the PD domain is shown in blue. The final
orientation of PTEN in the membrane (far right) is thought to be the
productive complex referred to in the text. Adapted from Ref. [60].
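
The clustering of PIP3 around the bound protein can be expressed as an enrichment factor: the lipid number density within a cutoff of the protein, divided by the average density in the membrane plane. The following is a minimal sketch and not the analysis used in Ref. [60]. It assumes a single frame, two-dimensional coordinates in the membrane plane, and no periodic boundary handling; the function name, cutoff, and toy coordinates are hypothetical.

```python
import math

def enrichment(lipid_xy, protein_xy, cutoff, box_area):
    """Local lipid density near the protein relative to the average density.

    Values above 1 indicate that the lipid is enriched around the protein.
    Periodic images are ignored in this sketch.
    """
    px, py = protein_xy
    n_close = sum(1 for x, y in lipid_xy
                  if math.hypot(x - px, y - py) <= cutoff)
    local_density = n_close / (math.pi * cutoff ** 2)
    average_density = len(lipid_xy) / box_area
    return local_density / average_density

# Hypothetical PIP3 headgroup positions (nm) in a 10 nm x 10 nm patch.
pip3_xy = [(5.0, 5.0), (5.5, 5.0), (5.0, 5.5), (1.0, 1.0), (9.0, 9.0), (1.0, 9.0)]
factor = enrichment(pip3_xy, protein_xy=(5.0, 5.0), cutoff=1.0, box_area=100.0)
```

In practice the factor would be averaged over many frames and compared with the same quantity for a lipid type that is not expected to bind.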

8.6 Conclusions

MD simulations help us to better understand the role of interfacial
interactions involved in the activity of key enzymes such as lipases.
In particular, CG simulations can predict protein orientations and
interactions at interfaces. In turn, atomistic simulations allow us to
explore in detail the nature of conformational changes in enzymes
as a function of their interfacial environment. There is now a
need for a more integrative approach combining both methods to
provide a complete description of enzyme–interface interactions.
It may also be useful to combine such multiscale methodology
with enhanced sampling methods (e.g., metadynamics [66]) in
order to fully explore the long-timescale motions of the interfacial
systems and obtain thermodynamic models of enzyme activity.
Furthermore, to be of greater predictive value, for example, for
biotechnological applications, we also need improved methods for
modeling a wider variety of biological and nonbiological (e.g.,
graphene [67–69]) surfaces. This will be useful in understanding
the role of the interface in modulating conformation and activity of
complex interfacial enzymes. These developments will be necessary
to provide an accurate description of interfacial enzyme systems
and bridge the computational–experimental gap. Integration of both
experimental and theoretical data will provide fundamental insight
into how enzymes behave at interfaces, and will facilitate their
continued development as important industrial biocatalysts.

References

1. Reis, P., Holmberg, K., Watzke, H., Leser, M. E., and Miller, R. (2009).
Lipases at interfaces: a review, Adv. Colloid Interface Sci., 147–148, pp.
237–250.
2. Berg, O. G., Cajal, Y., Butterfoss, G. L., Grey, R. L., Alsina, M. A., Yu, B.,
and Jain, M. K. (1998). Interfacial activation of triglyceride lipase from
Thermomyces (Humicola) lanuginosa: kinetic parameters and a basis
for control of the lid, Biochemistry, 37, pp. 6615–6627.
3. Gelb, M., and Jain, M. (1995). Interfacial enzymology of glycerolipid
hydrolases: lessons from secreted phospholipases A2, Annu. Rev.
Biochem., 64, pp. 653–688.
4. Berg, O., and Jain, M. (2002). Interfacial Enzyme Kinetics (John Wiley and
Sons, Chichester).
5. Houde, A., Kademi, A., and Leblanc, D. (2004). Lipases and their
industrial applications, Appl. Biochem. Biotechnol., 118, pp. 155–170.
6. Peters, G. H., Toxvaerd, S., Olsen, O. H., and Svendsen, A. (1997).
Computational studies of the activation of lipases and the effect of a
hydrophobic environment, Protein Eng., 10, pp. 137–147.

7. Benedetti, F., Berti, F., and Linda, P. (1996). Modeling of solvent effects in
the activation of the lipase from Rhizomucor miehei, Bioorg. Med. Chem.
Lett., 6, pp. 839–844.
8. James, J. J., Lakshmi, B. S., Seshasayee, A. S. N., and Gautam, P. (2007).
Activation of Candida rugosa lipase at alkane-aqueous interfaces: a
molecular dynamics study, FEBS Lett., 581, pp. 4377–4383.
9. Bernardi, R. C., Cann, I., and Schulten, K. (2014). Molecular dynamics
study of enhanced Man5B enzymatic activity, Biotechnol. Biofuels, 7, p.
83.
10. Sun, Y., and Cheng, J. (2002). Hydrolysis of lignocellulosic materials for
ethanol production: a review, Bioresour. Technol., 83, pp. 1–11.
11. Wohlert, J., and Berglund, L. A. (2011). A coarse-grained model for
molecular dynamics simulations of native cellulose, J. Chem. Theory
Comput., 7, pp. 753–760.
12. Wee, C. L., Balali-Mood, K., Gavaghan, D., and Sansom, M. S. P. (2008). The
interaction of phospholipase A2 with a phospholipid bilayer: coarse-
grained molecular dynamics simulations, Biophys. J., 95, pp. 1649–
1657.
13. Wong, D. W. S. (1995). Food Enzymes: Structure and Mechanism
(Springer-Verlag, New York).
14. Verger, R., Pattus, F., Pieroni, G., Riviere, C., Ferrato, F., Leonardi, J.,
and Dargent, B. (1984). Regulation by the “interfacial quality” of some
biological activities, Colloids Surf., 10, pp. 163–180.
15. O’Connor, C. J., and Walde, P. (1986). Interactions of human milk lipase
with sodium taurocholate and other surfactants, Langmuir, 2, pp. 139–
146.
16. Valivety, R., Halling, P., and Macrae, A. (1993). Water as a competitive
inhibitor of lipase-catalysed esterification in organic media, Biotechnol.
Lett., 15, pp. 1133–1138.
17. Van Tilbeurgh, H., Egloff, M.P., Martinez, C., Rugani, N., Verger, R., and
Cambillau, C. (1993). Interfacial activation of the lipase-procolipase
complex by mixed micelles revealed by X-ray crystallography, Nature,
362, pp. 814–820.
18. Scott, D. L., White, S. P., Otwinowski, Z., Yuan, W., Gelb, M. H., and Sigler,
P. B. (1990). Interfacial catalysis: the mechanism of phospholipase A(2),
Science, 250, pp. 1541–1546.
19. Karplus, M., and McCammon, J. A. (2002). Molecular dynamics simula-
tions of biomolecules, Nat. Struct. Biol., 9, pp. 646–652.
20. Adcock, S. A., and McCammon, J. A. (2006). Molecular dynamics: survey
of methods for simulating the activity of proteins, Chem. Rev., 106, pp.
1589–1615.
21. Garcia-Viloca, M., Gao, J., Karplus, M., and Truhlar, D. G. (2004).
How enzymes work: analysis by modern rate theory and computer
simulations, Science, 303, pp. 186–195.
22. Henzler-Wildman, K. A., Lei, M., Thai, V., Kerns, S. J., Karplus, M., and
Kern, D. (2007). A hierarchy of timescales in protein dynamics is linked
to enzyme catalysis, Nature, 450, pp. 913–916.
23. Pisliakov, A. V., Cao, J., Kamerlin, S. C. L., and Warshel, A. (2009). Enzyme
millisecond conformational dynamics do not catalyze the chemical step,
Proc. Natl. Acad. Sci. U S A, 106, pp. 17359–17364.
24. Marrink, S. J., de Vries, A. H., and Mark, A. E. (2004). Coarse grained
model for semiquantitative lipid simulations, J. Phys. Chem. B, 108, pp.
750–760.
25. Shelley, J. C., and Shelley, M. Y. (2000). Computer simulation of surfactant
solutions, Curr. Opin. Colloid Interface Sci., 5, pp. 101–110.
26. Izvekov, S., and Voth, G. A. (2005). A multiscale coarse-graining method
for biomolecular systems, J. Phys. Chem. B, 109, pp. 2469–2473.
27. Klein, M. L., and Shinoda, W. (2008). Large-scale molecular dynamics
simulations of self-assembling systems, Science, 321, pp. 798–800.
28. Marrink, S. J., Risselada, H. J., Yefimov, S., Tieleman, D. P., and de
Vries, A. H. (2007). The Martini force field: coarse grained model for
biomolecular simulations, J. Phys. Chem. B, 111, pp. 7812–7824.
29. Marrink, S. J., and Tieleman, D. P. (2013). Perspective on the Martini
model, Chem. Soc. Rev., 42, pp. 6801–6822.
30. Ingólfsson, H. I., Melo, M. N., van Eerden, F. J., Arnarez, C., Lopez, C. A.,
Wassenaar, T. A., Periole, X., de Vries, A. H., Tieleman, D. P., and Marrink,
S. J. (2014). Lipid organization of the plasma membrane, J. Am. Chem.
Soc., 136, pp. 14554–14559.
31. Stansfeld, P. J., and Sansom, M. S. P. (2011). From coarse grained
to atomistic: a serial multiscale approach to membrane protein
simulations, J. Chem. Theory Comput., 7, pp. 1157–1166.
32. Wassenaar, T. (2013). Going backward: a flexible geometric approach
to reverse transformation from coarse grained to atomistic models, J.
Chem. Theory Comput., 10, pp. 676–690.
33. Ayton, G. S., Noid, W. G., and Voth, G. A. (2007). Multiscale modeling of
biomolecular systems: in serial and in parallel, Curr. Opin. Struct. Biol.,
17, pp. 192–198.

34. Arkhipov, A., Yin, Y., and Schulten, K. (2009). Membrane-bending
mechanism of amphiphysin N-BAR domains, Biophys. J., 97, pp. 2727–
2735.
35. Sherwood, P., Brooks, B. R., and Sansom, M. S. P. (2008). Multiscale
methods for macromolecular simulations, Curr. Opin. Struct. Biol., 18,
pp. 630–640.
36. Kamerlin, S. C. L., Vicatos, S., Dryga, A., and Warshel, A. (2011). Coarse-
grained (multiscale) simulations in studies of biophysical and chemical
systems, Annu. Rev. Phys. Chem., 62, pp. 41–64.
37. Naik, S., Basu, A., Saikia, R., Madan, B., Paul, P., Chaterjee, R., Brask, J., and
Svendsen, A. (2010). Lipases for use in industrial biocatalysis: specificity
of selected structural groups of lipases, J. Mol. Catal. B Enzym., 65, pp.
18–23.
38. Fernandez-Lafuente, R. (2010). Lipase from Thermomyces lanuginosus:
uses and prospects as an industrial biocatalyst, J. Mol. Catal. B Enzym.,
62, pp. 197–212.
39. Hasan, F., Shah, A. A., and Hameed, A. (2006). Industrial applications of
microbial lipases, Enzyme Microb. Technol., 39, pp. 235–251.
40. Packter, N. M. (1994). Lipases—their structure, biochemistry and
application. Woolley, P. and Petersen, S.B., eds. (Cambridge University
Press) p. 216.
41. Adlercreutz, P. (2013). Immobilisation and application of lipases in
organic media, Chem. Soc. Rev., 42, pp. 6406–6436.
42. Brzozowski, A. M., Savage, H., Verma, C. S., Turkenburg, J. P., Lawson, D.
M., Svendsen, A., and Patkar, S. (2000). Structural origins of the interfacial
activation in Thermomyces (Humicola) lanuginosa lipase, Biochemistry,
39, pp. 15071–15082.
43. Stamatis, H., Xenakis, A., and Kolisis, F. N. (1999). Bioorganic reactions in
microemulsions: the case of lipases, Biotechnol. Adv., 17, pp. 293–318.
44. Jensen, M. Ø., Jensen, T. R., Kjaer, K., Bjørnholm, T., Mouritsen, O. G.,
and Peters, G. H. (2002). Orientation and conformation of a lipase at an
interface studied by molecular dynamics simulations, Biophys. J., 83, pp.
98–111.
45. Hedin, E. M. K., Høyrup, P., Patkar, S. A., Vind, J., Svendsen, A., Fransson, L.,
and Hult, K. (2002). Interfacial orientation of Thermomyces lanuginosa
lipase on phospholipid vesicles investigated by electron spin resonance
relaxation spectroscopy, Biochemistry, 41, pp. 14185–14196.
46. Cajal, Y., Svendsen, A., Girona, V., Patkar, S. A., and Alsina, M. A. (2000).
Interfacial control of lid opening in Thermomyces lanuginosa lipase,
Biochemistry, 39, pp. 413–423.
47. Grochulski, P., Li, Y., Schrag, J. D., and Cygler, M. (1994). Two
conformational states of Candida rugosa lipase, Protein Sci., 3, pp. 82–
91.
48. Grochulski, P., Schrag, J. D., Bouthillier, F., Smith, P., Harrison, D., Rubin,
B., and Cygler, M. (1993). Insights into interfacial activation from an
open structure of Candida rugosa lipase, J. Biol. Chem., 268, pp. 12843–
12847.
49. Cherukuvada, S. L., Seshasayee, A. S. N., Raghunathan, K., Anishetty,
S., and Pennathur, G. (2005). Evidence of a double-lid movement in
Pseudomonas aeruginosa lipase: insights from molecular dynamics
simulations, PLOS Comput. Biol., 1, p. e28.
50. Villeneuve, P. (2007). Lipases in lipophilization reactions, Biotechnol.
Adv., 25, pp. 515–536.
51. Noinville, S., Revault, M., Baron, M.-H., Tiss, A., Yapoudjian, S., Ivanova,
M., and Verger, R. (2002). Conformational changes and orientation of
Humicola lanuginosa lipase on a solid hydrophobic surface: an in situ
interface Fourier transform infrared-attenuated total reflection study,
Biophys. J., 82, pp. 2709–2719.
52. Santini, S., Crowet, J. M., Thomas, A., Paquot, M., Vandenbol, M., Thonart,
P., Wathelet, J. P., Blecker, C., Lognay, G., Brasseur, R., Lins, L., and
Charloteaux, B. (2009). Study of Thermomyces lanuginosa lipase in
the presence of tributyrylglycerol and water, Biophys. J., 96, pp. 4814–
4825.
53. Derewenda, U., Swenson, L., and Green, R. (1994). An unusual buried
polar cluster in a family of fungal lipases, Nat. Struct. Biol., 1, pp. 36–47.
54. Aryal, P., Sansom, M. S. P., and Tucker, S. J. (2014). Hydrophobic gating in
ion channels, J. Mol. Biol., pp. 1–10.
55. Bond, P. J., Holyoake, J., Ivetac, A., Khalid, S., and Sansom, M. S. P. (2007).
Coarse-grained molecular dynamics simulations of membrane proteins
and peptides, J. Struct. Biol., 157, pp. 593–605.
56. Balali-Mood, K., Bond, P. J., and Sansom, M. S. P. (2009). Interaction of
monotopic membrane enzymes with a lipid bilayer: a coarse-grained
MD simulation study, Biochemistry, 48, pp. 2135–2145.
57. Fisher, A. B., and Jain, M. (2009). Phospholipases: Degradation of
Phospholipids in Membranes and Emulsions (John Wiley and Sons,
Chichester).
58. Winget, J. M., Pan, Y. H., and Bahnson, B. J. (2006). The interfacial binding
surface of phospholipase A2s, Biochim. Biophys. Acta, 1761, pp. 1260–
1269.

59. Tatulian, S. A., Biltonen, R. L., and Tamm, L. K. (1997). Structural changes
in a secretory phospholipase A2 induced by membrane binding: a clue
to interfacial activation?, J. Mol. Biol., 268, pp. 809–815.
60. Kalli, A. C., Devaney, I., and Sansom, M. S. P. (2014). Interactions of phos-
phatase and tensin homologue (PTEN) proteins with phosphatidylinos-
itol phosphates: insights from molecular dynamics simulations of PTEN
and voltage sensitive phosphatase, Biochemistry, 53, pp. 1724–1732.
61. Nguyen, H. N., Afkari, Y., Senoo, H., Sesaki, H., Devreotes, P. N., and
Iijima, M. (2014). Mechanism of human PTEN localization revealed
by heterologous expression in Dictyostelium, Oncogene, 33, pp. 5688–
5696.
62. Di Cristofano, A., and Pandolfi, P. P. (2000). The multiple roles of PTEN
in tumor suppression, Cell, 100, pp. 387–390.
63. Kalli, A. C., Morgan, G., and Sansom, M. S. P. (2013). Interactions
of the auxilin-1 PTEN-like domain with model membranes result in
nanoclustering of phosphatidyl inositol phosphates, Biophys. J., 105, pp.
137–145.
64. Lumb, C. N., and Sansom, M. S. P. (2013). Defining the membrane-
associated state of the PTEN tumor suppressor protein, Biophys. J., 104,
pp. 613–621.
65. Shenoy, S., Shekhar, P., Heinrich, F., Daou, M.-C., Gericke, A., Ross, A.
H., and Lösche, M. (2012). Membrane association of the PTEN tumor
suppressor: molecular details of the protein-membrane complex from
SPR binding studies and neutron reflection, PLOS ONE, 7, p. e32591.
66. Leone, V., Marinelli, F., Carloni, P., and Parrinello, M. (2010). Targeting
biomolecular flexibility with metadynamics, Curr. Opin. Struct. Biol., 20,
pp. 148–154.
67. Geim, A.K., and Novoselov, K. S. (2007). The rise of graphene, Nat. Mater.,
6, pp. 183–191.
68. Zhang, J., Zhang, F., Yang, H., Huang, X., Liu, H., Zhang, J., and Guo,
S. (2010). Graphene oxide as a matrix for enzyme immobilization,
Langmuir, 26, pp. 6083–6085.
69. Gobbo, C., Beurroies, I., de Ridder, D., Eelkema, R., Marrink, S. J., De
Feyter, S., van Esch, J. H., and de Vries, A. H. (2013). MARTINI model for
physisorption of organic molecules on graphite, J. Phys. Chem. C, 117, pp.
15623–15631.

March 23, 2016 12:36 PSP Book - 9in x 6in 09-Allan-Svendsen-c09

PART II

ENZYME DESIGN

Chapter 9

Sequence, Structure, Function: What We Learn from Analyzing Protein Families

Michael Widmann and Jürgen Pleiss


Institute of Technical Biochemistry, University of Stuttgart, Allmandring 31,
70569 Stuttgart, Germany
Juergen.Pleiss@itb.uni-stuttgart.de

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

9.1 Introduction

Since 1951, when the protein sequence of insulin became available
[1], the amount of available sequence information has been growing
at an ever-increasing rate, with more than 10^8 sequences known
today [2]. Information about protein structures as well as about
biochemical properties of proteins is increasing as well, albeit at a
slower rate because it requires greater experimental effort than
automated sequencing methods. This progress leads to an
ever-growing wealth of information and should also increase our
understanding of proteins in equal measure. However, this is often
not the case. Scientists today are flooded with information about
protein sequences, structures, biochemical properties, interactions,
purification schemes, and many more biological data. However, a
widely accepted data model to store, assess, and systematically
classify these data is still lacking. Protein sequences are available
from data repositories such as GenBank or UniProt [3], but the
sequence entries often lack annotation, provide information that
contradicts the content of other entries, and contain only a limited
number of links to other types of information. In particular, the naming
of members of protein families with established naming schemes, such
as the metallo-β-lactamases (MBLs), is sometimes ambiguous. Thus,
large amounts of data are isolated and lack links to other, critically
important pieces of information. While the number of needles is
increasing, the haystack is also getting bigger and bigger.
However, the growing amount of available data also offers
enormous opportunities if comprehensively analyzed. Using a
systematic bioinformatics approach, a large number of protein
sequences can be collected, grouped, and compared. This allows for
the detection of patterns, the creation of profiles, the identification
of outliers and the correlation to other data sets. An appropriate
navigation tool in sequence space requires a metric derived from
a global multisequence alignment [4, 5]. However, in many cases
global sequence alignments are not suited to identify biologically
relevant similarities between protein sequences. A prominent
example is circular permutations that result in proteins with a
locally similar sequence but shifted N- and C-termini [6–8]. Although
circular permutants have a similar structure and are often stable
and functional, standard global sequence alignment methods fail to
identify their similarity. At best they find local similarities embedded
in large unmatched regions, resulting in a low global similarity
score, so circular permutants are missed by procedures that rely on the
global similarity score. Even more prevalent is the occurrence of
highly conserved modules and domains in protein families [9–11].
While the spatial arrangement of domains is often conserved, their
order on sequence level might vary [11]. If multiple protein domains
are encoded in a single gene, they might be rearranged or even
split into separate genes, resulting in homologous proteins that
are easily missed by standard global sequence similarity searches
[12]. A third challenge consists of finding conserved, functionally
relevant hotspot positions in members of a protein family with
a low global sequence similarity. Identifying these positions often
requires knowledge about the structure and enzymatic mechanism
of representative members of the protein family.
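
The circular-permutation problem described above can be illustrated with a toy example. A position-by-position comparison of a sequence with its circular permutant gives a very low identity, whereas searching for the permutant inside the duplicated parent sequence recovers the relationship immediately. This is only a sketch of the idea: it handles exact rotations of equal length, the sequence is an arbitrary made-up string, and the function names are hypothetical. Real permutants carry additional mutations and would need a local alignment against the duplicated sequence instead of an exact substring search.

```python
def identity(a, b):
    """Fraction of identical residues in an ungapped, position-wise comparison."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def rotation_offset(a, b):
    """Return k such that b == a[k:] + a[:k], or None if b is not an
    exact circular permutation of a."""
    if len(a) != len(b):
        return None
    k = (a + a).find(b)  # every rotation of a is a substring of a + a
    return k if k != -1 else None

# Made-up parent sequence and a permutant with shifted N- and C-termini.
parent = "MKTAYIAKQRQISFVKSHFSRQ"
permutant = parent[9:] + parent[:9]

naive_identity = identity(parent, permutant)   # low, despite identical content
offset = rotation_offset(parent, permutant)    # position of the new N-terminus
```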
To integrate information on homologous subfamilies, on se-
quence and structure, and on the location of domains and func-
tionally relevant positions, we developed a data warehouse system
for analyzing protein families [13]. Using a relational data model,
databases on large and diverse protein families were established
and systematically analyzed. Annotation includes a protein family-
wide standard numbering scheme, which assigns identical numbers
to structurally equivalent residues [14]. By establishing a standard
numbering scheme, a large number of sequences are implicitly
aligned and can be systematically analyzed for conservation, thus
identifying structurally or functionally relevant positions.
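
How a standard numbering scheme implicitly aligns sequences can be sketched as follows: each residue of a family member inherits the number of the reference residue in the same alignment column. The code is a hedged illustration and not the implementation behind Refs. [13, 14]. It assumes a precomputed pairwise alignment with "-" as the gap character, and it labels insertions relative to the reference with a lowercase letter suffix, which is an assumed convention chosen for this example.

```python
from string import ascii_lowercase

def standard_numbers(ref_aln, query_aln, first_number=1):
    """Assign a standard position label to every residue of the query.

    ref_aln and query_aln are rows of the same alignment ('-' = gap).
    Returns a list of (label, residue) pairs for the query.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    labels = []
    number = first_number - 1   # number of the last reference residue seen
    inserted = 0                # query residues inserted since that residue
    for ref_res, query_res in zip(ref_aln, query_aln):
        if ref_res != "-":
            number += 1
            inserted = 0
            if query_res != "-":
                labels.append((str(number), query_res))
        elif query_res != "-":
            if inserted >= len(ascii_lowercase):
                raise ValueError("insertion too long for letter suffixes")
            # Assumed convention: insertion after position n becomes na, nb, ...
            labels.append((f"{number}{ascii_lowercase[inserted]}", query_res))
            inserted += 1
    return labels

# Hypothetical alignment; the reference numbering starts at position 10.
numbered = standard_numbers("MKT-AY", "MRTGA-", first_number=10)
```

Because every sequence numbered against the same reference shares the same labels, conservation at a given standard position can be tallied across the whole family without realigning.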

9.2 Detection of Inconsistencies Utilizing a Standard Numbering Scheme

β-Lactamases (E.C. 3.5.2.6) catalyze the hydrolysis of the β-lactam
ring of antibiotics and are the main cause of resistance against these
drugs, especially in Gram-negative bacteria [15].
In contrast to serine β-lactamases, which employ a catalytic
serine for hydrolysis, MBLs (also annotated as molecular class B [16]
or functional class 3 [17] β-lactamases) employ one or two Zn(II)
ion(s) in the catalytic mechanism [18, 19]. The IMP and VIM
subfamilies [20, 21], as well as the recently identified NDM enzymes
[22, 23], are the clinically most significant MBLs. These enzymes
are often encoded on plasmids and have been isolated from several
opportunistic pathogenic bacteria, such as Serratia marcescens
[24], Pseudomonas aeruginosa [25], Klebsiella pneumoniae [26], and
Escherichia coli [23].
A better understanding of how mutations may be acquired
under the selective pressure by antibiotics is critical for the design
of novel antibacterial drugs with long-term efficacy [27]. One
important step toward a better understanding of the sequence–
structure–function relationships is to systematically analyze the
current knowledge on MBL sequences, structures, and activities
found in databases and in the scientific literature. Information on
MBL sequences, their source organisms, and their designations
can be found in the publicly accessible nonredundant National
Center for Biotechnology Information (NCBI) peptide database [28].
However, numerous protein entries show inconsistencies, such as
the use of a protein name that has already been assigned to another
protein, the use of different names for the same protein, or the
disregard of mutations in the sequence that would warrant a new
name.
Fortunately, a numbering scheme for MBLs has been established
[29], which allows for an unambiguous identification of all residues
in each MBL sequence. To facilitate the classification, nomenclature,
and analysis of MBLs, an automated database system was developed,
the Metallo-β-Lactamase Engineering Database (www.MBLED.uni-stuttgart.de).
It contains information on MBLs retrieved from the
NCBI peptide database, while strictly following the nomenclature
by Jacoby and Bush (www.lahey.org/Studies/) and the generally
accepted class B β-lactamase standard numbering scheme.
The Metallo-β-Lactamase Engineering Database comprises 589
MBL sequences and enables a systematic analysis. For the two
most important IMP and VIM subfamilies, mutation profiles were
generated [30]. Five MBL protein entries from the NCBI peptide
database were identified that were inconsistent with the Jacoby
and Bush nomenclature, and 15 new IMP candidates and 9 new
VIM candidates were suggested [30]. Mutation profiles describe all
amino acid exchanges in comparison to the reference sequence of
the respective protein family (IMP-1 and VIM-2 for the IMP and
the VIM family, respectively). By comparing the mutation profiles of
all MBLs retrieved from the NCBI database to already named MBL
variants, new variants and falsely named variants were identified.
This included variants where the sequence was already known but
that were not annotated by a name (e.g., GI 4760643 was found to
be identical to IMP-3, GI 473726 to IMP-1), as well as variants with
a falsely assigned name (e.g., GI 217038357 is reported as IMP-4
but deviates by a P39S mutation from the official IMP-4 sequence).
In total, 15 IMP-type sequences from microbial sources were found
with new mutation profiles. However, except for one entry (GI
90101507), names of existing variants were falsely assigned. A
closer examination of these entries revealed that seven of them

www.ebook3000.com
March 23, 2016 12:36 PSP Book - 9in x 6in 09-Allan-Svendsen-c09

Detection of Inconsistencies Utilizing a Standard Numbering Scheme 325

were artificially generated mutants [31, 32] and one (GI 50897036)
was missing the N-terminal methionine, possibly a polymerase
chain reaction (PCR) amplification or sequencing artifact. One
variant was published as IMP-1 [33], but the deposited sequence
(GI 110350569) contains two mutations, R102P and V202L. GI
182382568, an S262G mutant of IMP-9, was named IMP-25, but
this name had already been assigned to a different variant (GI
171188394) at that time; therefore GI 182382568 should not have
been named IMP-25. Unfortunately, the incorrectly named IMP-25
has already been adopted in the literature [21, 22], underlining
the importance of a consistent naming procedure. Two further
entries (GI 295002614 and GI 300391791) followed the incorrect
naming of IMP-25 and were given the same name, although they
harbor an additional mutation, P194S. GI 217038357 was reported
as IMP-4 [34], although the protein deviates by a P39S mutation,
resulting in a serine at position 3 of the mature protein, just
like in IMP-1. GI 40644311 was reported as IMP-13 but deviates
from the officially named IMP-13 [35] (GI 33086392 or GenBank
accession code AJ550807 according to www.lahey.org/Studies/) by
two mutations, E175G and K215E. GI 90101507 was annotated as an
IMP-type variant. This enzyme belongs to the IMP-2 cluster [21] and
differed by three substitutions from IMP-2. In addition to the 15 IMP
variants that showed new mutation profiles despite being described
as existing variants, two IMPs were annotated by names that were
inconsistent with their amino acid sequences and matched existing
profiles. The protein entry GI 83583501 was reported as IMP-4, but
its mutation profile is an exact match to IMP-26 (Table 9.1). We also
noticed that the sequence of VIM-14, which was published in 2011
[36], was deposited in 2004 as VIM-11 (GenBank accession codes
AY635904 and GI 49035768).
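
The comparison of mutation profiles described in this section can be sketched in a few lines. A profile lists every amino acid exchange relative to the family reference, written as reference residue, standard position, and variant residue (for example P39S), with "-" for a missing residue. Matching a profile against those of already named variants then separates consistent entries, misnamed entries, and new variants. The code below is an illustrative assumption and not the procedure of Ref. [30]: sequences are given as dictionaries keyed by standard position, insertions relative to the reference are ignored, and all names and residues in the toy data are hypothetical.

```python
def mutation_profile(reference, variant):
    """List exchanges of a variant relative to the reference.

    Both arguments map standard position (int) to a one-letter residue.
    A position absent from the variant is reported as a deletion ('-').
    """
    profile = []
    for pos in sorted(reference):
        ref_res = reference[pos]
        var_res = variant.get(pos, "-")
        if var_res != ref_res:
            profile.append(f"{ref_res}{pos}{var_res}")
    return profile

def classify(profile, named_profiles, annotated_name):
    """Check an entry's annotated name against the profiles of named variants."""
    key = frozenset(profile)
    matches = [name for name, p in named_profiles.items() if frozenset(p) == key]
    if not matches:
        return "new variant"
    if annotated_name in matches:
        return "consistent"
    return f"misnamed: matches {matches[0]}"

# Hypothetical reference fragment and named variants (not real MBL data).
reference = {19: "M", 20: "S", 21: "L", 22: "P"}
named = {"REF-1": [], "VAR-2": ["P22S"]}

entry = {20: "S", 21: "L", 22: "S"}          # lacks position 19, carries P22S
entry_profile = mutation_profile(reference, entry)
verdict = classify(entry_profile, named, annotated_name="REF-1")
```

An entry whose profile matches no named variant is flagged as a candidate for a new name, while one that matches a differently named variant is flagged as misnamed.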
The discovery of identical proteins in GenBank carrying different
names and entries with missing information demonstrates the need
for a standardized and consistent method for submitting and pub-
lishing protein entries. The current status of some GenBank entries
is bound to lead to confusion among the scientific community,
especially considering the steadily growing amount of data being
generated and submitted to the databases and for publication.

Table 9.1 IMPs with a new mutation profile. Mislabeled and newly
discovered IMP variants. New mutations are highlighted in bold.

Identifier  Similar to           Annotated as  Mutation profile
50897036    IMP-1                IMP-1         M19-
256260267   IMP-6                Mutant IMP-6  F218I, S262G
256260265   IMP-6                Mutant IMP-6  F218Y, S262G
256260263   IMP-6                Mutant IMP-6  S121G, S262G
256260271   IMP-6                Mutant IMP-6  F218Y, S262A
256260273   IMP-6                Mutant IMP-6  S262A
256260269   IMP-1                Mutant IMP-1  F218Y
90101507    IMP-8 with one       IMP-type      S20K, S23F, F25L, F26C, I27V, F28C, L29F, F30L,
            additional mutation                A34T, T35A, A37G, E38A, S39A, D49E, V67F,
                                               P68S, A78T, E79D, K90T, T97N, S112T, R132Q,
                                               T170K, N177S, P198Q, R208K, I223V, Y227D,
                                               I241L, L250I, K252M, P261S, V266I, L302R,
                                               L304W, K319Q
256260261   IMP-6                IMP-6         S262V
110350569   IMP-1                IMP-1         R102P, V202L
295002614   IMP-9 with 2         IMP-25        S23F, I27M, A34T, T35A, A37G, V67I, A78T,
            additional mutations               E79D, K94N, T97N, K108R, R132Q, T170K,
                                               N171Y, N177S, N183K, P194S, T197A, R208N,
                                               K215R, I216V, I223V, I241L, K252M, G254bS,
                                               S262G, E265D, V266I, A297S, L304W, A307T,
                                               L311F, P317S, S318T, K319T, P320A, S321H,
                                               N322-
300391791   IMP-9 with 2         IMP-25        S23F, I27M, A34T, T35A, A37G, V67I, A78T,
            additional mutations               E79D, K94N, T97N, K108R, R132Q, T170K,
                                               N171Y, N177S, N183K, P194S, T197A, R208N,
                                               K215R, I216V, I223V, I241L, K252M, G254bS,
                                               S262G, E265D, V266I, A297S, L304W, A307T,
                                               L311F, P317S, S318T, K319T, P320A, S321H,
                                               N322-
182382568   IMP-9                IMP-25        S23F, I27M, A34T, T35A, A37G, V67I, A78T,
                                               E79D, K94N, T97N, K108R, R132Q, T170K,
                                               N171Y, N177S, N183K, T197A, R208N, K215R,
                                               I216V, I223V, I241L, K252M, G254bS, S262G,
                                               E265D, V266I, A297S, L304W, A307T, L311F,
                                               P317S, S318T, K319T, P320A, S321H, N322-
217038357   IMP-4                IMP-4         N77D, R132Q, T170K, S174G, V201L, I241L,
                                               K252I, V266A, P320L
40644311    IMP-13               IMP-13        S20K, S23F, F25L, F26C, I27V, F28C, L29F, A34T,
                                               T35A, A37G, E38A, S39A, D49E, Y53F, P68T,
                                               A78T, E79D, K90T, T97N, K108E, S112T, R132Q,
                                               D149S, T170K, N171Y, N177S, P198Q, V201L,
                                               R208S, K215E, Y227H, I241L, L250I, K252M,
                                               P261S, V266K, L300M, L302R, L304W, V308L,
                                               N312K, P317T, K319S

Source: Reprinted with permission from Ref. [30]. Copyright (2012) American Society for
Microbiology.

By combining an automated process for the generation of protein
family databases with a family-specific identification mechanism
based on an established naming and numbering scheme, a reliably
annotated protein family database was generated that revealed
various inconsistencies in currently existing database entries, while
also discovering new protein variants. By systematically analyzing
protein families, mutation profiles and mutation hotspots were
identified and can be used as a template for further analyses.

9.3 Identification of Functionally Relevant Positions

Polyhydroxyalkanoates (PHAs) such as poly(R)-3-hydroxybutyric
acid (PHB) serve as storage compounds of carbon and energy [37] in
bacteria and are of industrial interest as biodegradable substitutes
for nondegradable plastics. PHAs are degraded by intracellular
and by extracellular PHA depolymerases [38]. Depending on their
preferred substrate, depolymerases are grouped into four families:
PHA depolymerases degrading the native intracellular granules
consisting of medium or short-chain length monomers (nPHAMCL
depolymerases and nPHASCL depolymerases, respectively) and PHA
depolymerases degrading the denatured extracellular PHA granules
(dPHAMCL depolymerases and dPHASCL depolymerases). All PHA
depolymerases belong to the α-/β-hydrolase fold family, which also
contains lipases [38].
Except for a few intracellular nPHASCL depolymerases, all PHA
depolymerases have a catalytic triad (serine–histidine–aspartic
acid) as the active site. The catalytic serine is embedded in a GxSxG
sequence motif as found in other α-/β-hydrolases. Additionally, a
conserved noncatalytic histidine near the oxyanion hole is found
analogous to lipases [38, 39]. The best-studied PHA depolymerases
are dPHASCL depolymerases. They share a common domain architec-
ture consisting of a short signal peptide, a catalytic domain, a short
linker domain, and a substrate-binding domain [40]. The factors
that enable depolymerases to degrade PHAs are not yet understood.
Although the overall sequence similarity of PHA depolymerases to
other known α-/β-hydrolases like lipases and esterases is low and
substrate specificity differs considerably, they belong to the same
fold family and possess a highly conserved active site. Thus, from
a systematic comparison of the PHA depolymerase family to other
α-/β-hydrolases, depolymerase-specific motifs can be derived.
As a data resource for a systematic analysis of the sequence–
structure–function relationship of PHA depolymerases, the PHA De-
polymerase Engineering Database (www.DED.uni-stuttgart.de) [41]
has been designed to assist a comprehensive analysis of sequences,
the annotation of new sequences, and the design of mutants.
In total, 735 PHA depolymerase sequence entries of 587 different
proteins were assigned to 8 superfamilies and 38 homologous
families.
By systematically analyzing these families, several characteristics
for the individual families could be discovered. All PHA depoly-
merases show a Gx1Sx2G sequence motif around the catalytic serine,
except for the family of intracellular nPHASCL depolymerases, which
possess a catalytic cysteine. For particular PHA depolymerases it has
been previously described that a hydrophobic residue is found at
position x1 within the Gx1Sx2G motif [39, 42, 43]. This is a common
feature of almost all PHA depolymerases. Thus, compared to other
α-/β-hydrolases like lipases and esterases, where a polar residue
is most frequently found at position x1, this conserved residue
of the Gx1Sx2G motif might be relevant to differentiate between
lipases or esterases and PHA depolymerases on the sequence level.
This hydrophobic residue is solvent exposed and located near the
catalytic serine at the bottom of a deep cleft, as seen in the structure
of the PHB depolymerase from Penicillium funiculosum (PDB entry
2D80) [44] (Fig. 9.1).
The hydrophobic residue at position x1 is tryptophan and
isoleucine for the families of intracellular nPHASCL depolymerases
and periplasmic PHA depolymerases, respectively. For the family
of intracellular nPHAMCL depolymerases, the residue at position x1
is valine for almost all proteins. All proteins of the family of extracel-
lular type 1 dPHASCL depolymerases possess a hydrophobic residue
at position x1, whereas only 81% of the proteins of the family of type
2 extracellular dPHASCL depolymerases have a hydrophobic residue
at position x1. All extracellular dPHAMCL depolymerases have an
isoleucine at position x1. One exception is the family of extracellular
nPHASCL depolymerases, which neither possesses a typical Gx1Sx2G
motif nor has a hydrophobic residue at position x1. In this family,
the Gx1Sx2G motif is altered to an AHSMG motif, which can also be
found in the family of Bacillus lipases. One member of this
special family is the PHB depolymerase from Paucimonas lemoignei,
for which structure information is also available (PDB entry 2VTV)
[45, 46]. This PHB depolymerase has special biochemical properties,
as it is an extracellular nPHASCL depolymerase degrading native
granules, and is the only experimentally validated extracellular
PHASCL depolymerase not having a substrate-binding domain.

Figure 9.1 Top view of the binding site of the PHB depolymerase from
Penicillium funiculosum (PDB entry 2D80). The catalytic residues are
marked in red, and the hydrophobic residue at position x1 of the Gx1Sx2G
motif is marked in blue. Reprinted with permission from Ref. [41]. Copyright
2009 BioMed Central.

Using the PHA Depolymerase Engineering Database [41] as
a navigation and analysis tool of PHA depolymerases, sequence–
structure–function relationships were analyzed, new sequences
were classified using multiple sequence alignments and phylogenetic
trees, and family-specific profiles were created, thus paving the
way for a deeper understanding of biochemical properties of PHA
depolymerases and for a successful design of PHA depolymerases
with improved properties.
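The motif-based distinction described in this section can be illustrated with a short scan for Gx1Sx2G motifs that reports whether x1 is hydrophobic. This is only a minimal sketch and not the procedure used to build the database: the hydrophobic residue set, the function name, and the toy sequence are assumptions made for illustration.

```python
import re

# Assumed set of hydrophobic residues; the chapter does not define one.
HYDROPHOBIC = set("AVILMFW")

# A lookahead is used so that overlapping motif occurrences are all reported.
GXSXG = re.compile(r"(?=(G(.)S(.)G))")

def find_gxsxg(sequence):
    """Return (start, motif, x1, x1_is_hydrophobic) for every GxSxG match.

    Start positions are 1-based.
    """
    hits = []
    for match in GXSXG.finditer(sequence.upper()):
        motif, x1 = match.group(1), match.group(2)
        hits.append((match.start() + 1, motif, x1, x1 in HYDROPHOBIC))
    return hits

# Hypothetical toy sequence containing one motif of each kind.
toy = "MKTAYGLSQGAAPLHGDSTGKK"
for start, motif, x1, hydrophobic in find_gxsxg(toy):
    label = "hydrophobic x1" if hydrophobic else "non-hydrophobic x1"
    print(start, motif, label)
```

A motif match alone does not identify the catalytic serine; in practice the position would be confirmed by alignment to a family profile or by a structure.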

9.4 The Modular Structure of Thiamine Diphosphate–Dependent Decarboxylases

Thiamine diphosphate (ThDP)-dependent enzymes form a vast and
diverse class of proteins, catalyzing a wide variety of enzymatic
reactions, including the formation or cleavage of carbon–sulfur,
carbon–oxygen, carbon–nitrogen, and especially carbon–carbon
bonds [47, 48]. Because of their ability to form asymmetric C–C
bonds, ThDP-dependent enzymes are versatile catalysts for a variety
of biotransformations [49–54]. In addition, the ThDP-dependent
enzyme family has been shown to possess a wide substrate
spectrum ranging from small compounds like formaldehyde to
bulky 2-hydroxyphytanoyl-CoA molecules [55, 56]. Their highly
diverse substrate specificity and catalytic activity are reflected in
their sequences and structures, which differ significantly between
different families of ThDP-dependent enzymes [57, 58]. During
the course of evolution, shuffling, rearrangement, and fusion of
domains, as well as mutation and gene duplications, have led to the
large diversity of ThDP-dependent enzymes. Due to the scientific
and industrial relevance of enzymes capable of catalyzing C–C
bond formation and cleavage, we have focused in this chapter
on the decarboxylase superfamily of ThDP-dependent enzymes.
This superfamily contains among others pyruvate decarboxylases,
pyruvate oxidases, benzaldehyde lyases, benzoylformate decarboxy-
lases, 2-hydroxyphytanoyl-CoA lyases and 2-succinyl-6-hydroxy-2,4-
cyclohexadiene-1-carboxylic acid synthases.
Despite low sequence similarities between sequences of the
decarboxylase superfamily, their structures are highly similar. They
consist of three domains: The N- and C-terminal domains are
involved in binding of the cofactor ThDP [59] and are named
pyrophosphate (PP)- and pyrimidine (PYR)-binding domains [58],
respectively. They are linked by a third domain, which is less
conserved and adopts different functions in the various enzyme
families, for example, by binding cofactors such as adenosine
diphosphate (ADP) [60] or flavin adenine dinucleotide (FAD)
[61]. This domain has structural similarity to transhydrogenase
domain dIII and therefore is called the TH3 domain [58]. The PYR
domain includes a conserved catalytic glutamic acid, while the
PP domain contains a conserved GDx(25–30)N motif. Structural
comparisons of the PYR and PP domains show that despite their
overall low sequence similarity both domains have highly similar
structures. By providing a consistent annotation of the two domains,
a systematic analysis of corresponding domains from different
proteins becomes possible despite low overall sequence similarities.
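As an illustration of how the conserved PP-domain motif could be located in a raw sequence, the following sketch searches for G-D, a spacer of 25 to 30 arbitrary residues, and N. It assumes the motif is read as a simple pattern over the one-letter code; the function name and the toy sequence are hypothetical, and the annotation in the database relies on profiles rather than on such a pattern.

```python
import re

# Lookahead so that every possible start position is examined.
# Group 2 captures the spacer between "GD" and the closing "N".
PP_MOTIF = re.compile(r"(?=(GD([A-Z]{25,30})N))")

def find_gdxn(sequence):
    """Return (start, spacer_length) for each GDx(25-30)N candidate.

    Start positions are 1-based. For each start, the regex engine reports
    the longest admissible spacer, so other spacer lengths may also fit.
    """
    return [
        (match.start() + 1, len(match.group(2)))
        for match in PP_MOTIF.finditer(sequence.upper())
    ]

# Hypothetical toy sequence: GD, 27 alanines, N, flanked by a few residues.
toy = "AA" + "GD" + "A" * 27 + "N" + "KK"
print(find_gdxn(toy))
```

Because short patterns of this kind can also occur by chance, a hit is best treated as a candidate to be checked against the domain annotation.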

Figure 9.2 Structural arrangement of protein domains of the superfamilies
of the thiamine diphosphate-dependent Enzyme Engineering Database.
All protein families are listed with their internal superfamily ID, the
superfamily name, and a 2D representation of the domain arrangement. The
pyrophosphate (PP) domain is colored in blue, and the pyrimidine (PYR)
domain is colored red. Reprinted with permission from Ref. [11]. Copyright
2010 BioMed Central.

By selecting seed sequences on the basis of the enzymatic
activity of the protein and the order of the protein domains in the
genes (Fig. 9.2), eight superfamilies were defined in the ThDP-dependent
Enzyme Engineering Database (www.TEED.uni-stuttgart.de)
[11].
The annotation of individual domains in all proteins allowed for a
systematic comparison regardless of the arrangement on the protein
sequence level. In the case of the ThDP-dependent decarboxylases,
this system was used to address the question of the evolutionary
relationships between the PP and the PYR domains. Since both
domains are required for enzymatic activity, it can be assumed that
both domains evolve at similar rates. By performing a pairwise
comparison of all PP domains of the decarboxylase superfamily as
well as of all PYR domains, a similarity distribution of each
domain was created (Fig. 9.3). While the shapes of the two distributions
are similar, the distribution of the PYR domains is shifted to higher
sequence similarities, indicating a higher degree of conservation.
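The pairwise comparison can be sketched as follows. The chapter does not state which similarity score was used, so this example assumes percent identity over pre-aligned sequences of equal length and bins the pairs in steps of 10%. All names and the toy alignment are hypothetical.

```python
from collections import Counter
from itertools import combinations

def percent_identity(a, b):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def similarity_distribution(aligned, bin_width=10):
    """Count sequence pairs per identity bin, keyed by the lower bin edge."""
    counts = Counter()
    for a, b in combinations(aligned, 2):
        pid = percent_identity(a, b)
        # Put 100% identity into the top bin instead of opening a new one.
        lower = min(int(pid // bin_width) * bin_width, 100 - bin_width)
        counts[lower] += 1
    return dict(sorted(counts.items()))

# Hypothetical aligned domain fragments (same length, no gaps).
toy_domains = ["ACDEFGHIKL", "ACDEFGHIKV", "TCDQFGHLKV"]
print(similarity_distribution(toy_domains))
```

Running the same function on the PP domains and on the PYR domains and comparing the two histograms would reproduce the kind of plot shown in Fig. 9.3, under the stated assumptions about the score.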

Figure 9.3 Pairwise comparison of all PP and all PYR domains of the
decarboxylase superfamily. A similarity distribution of each domain was
created by plotting the number of protein pairs for a given similarity score.

9.5 Stereoselectivity-Determining Positions: The S-Pocket Concept in Thiamine Diphosphate–Dependent Decarboxylases

The superfamily of ThDP-dependent decarboxylases is the most
intensively studied superfamily because of the potential of de-
carboxylases to catalyze the formation of chiral α-hydroxyketones
by carboligation of two aldehydes [62]. Two members of the
decarboxylase superfamily, benzoylformate decarboxylase from P.
putida (PpBFD) and benzaldehyde lyase from P. fluorescens (PfBAL),
were thoroughly characterized for carboligation of acetaldehyde
and benzaldehyde. With the exception of PpBFD, all decarboxylases
characterized so far are R-selective for benzaldehyde as donor or
acceptor, while PpBFD is S-selective for benzaldehyde as donor and
acetaldehyde as acceptor but R-selective with acceptors larger than
acetaldehyde [63]. Because all decarboxylases have a conserved
active site architecture, it was challenging to identify the molecular
basis of the opposite stereoselectivity and to establish a general
concept of stereoselectivity of decarboxylases, the so-called S-pocket
[63]. The acceptor molecule is assumed to bind in either of two
possible orientations relative to the donor molecule, parallel or
antiparallel (Fig. 9.4). These two orientations result in (R)- or
(S)-α-hydroxyketones, respectively. In most of the decarboxylases,
the antiparallel orientation is sterically hindered by amino acids
blocking the S-pocket. Thus, preferably R-products are formed, such
as the PfBAL-catalyzed formation of (R)-2-HPP at >99% ee from
benzaldehyde and acetaldehyde as donor and acceptor, respectively
[64]. In contrast, PpBFD has a small S-pocket that allows for binding
of small acceptor molecules such as acetaldehyde, resulting in
an S-product. This model explains the PpBFD-catalyzed formation
of (S)-2-HPP at 92% ee from benzaldehyde and acetaldehyde as
donor and acceptor, respectively. However, the S-pocket of PpBFD
is too small for any acceptor molecule larger than acetaldehyde;
therefore the R-product (21% ee) is preferably formed upon
carboligation of benzaldehyde and propanal as donor and acceptor,
respectively [65]. This model was validated by designing variants
with shifted stereoselectivity. Upon increasing the S-pocket of
PpBFD by replacing L461 by alanine, a reversal to S-selectivity (93%
ee) was observed for the formation of 2-hydroxy-1-phenylbutan-1-
one with propanal as the acceptor [65].

Figure 9.4 Schematic representation of the different shapes of the binding
sites of (a and c) PfBAL and (b and d) PpBFD. The two orientations of
the acceptor acetaldehyde, (a) parallel and (b) antiparallel, relative to the
donor benzaldehyde, result in formation of (a) (R)-2-HPP by PfBAL and
(b) (S)-2-HPP by PpBFD. (c and d) Benzaldehyde as acceptor binds in
parallel orientation for both enzymes; thus (R)-benzoin is formed by PfBAL
and PpBFD. Reprinted with permission from Ref. [63]. Copyright © 2006
WILEY-VCH.

To identify S-pocket residues in other decarboxylases, a stan-
dard numbering scheme was established on the basis of the
decarboxylase-specific profile hidden Markov model [14] and
implemented in the ThDP-dependent Enzyme Engineering Database
[11]. Because all decarboxylases have a similar substrate-binding
site structure, the S-pocket concept could be successfully transferred
to other decarboxylases with a wide range of donor specificities
such as ApPDC [66], EcMenD [67], and EcAHAS II [68], and variants
with changed stereoselectivity were designed. As predicted from
the model, decreasing the bulkiness of S-pocket residues increased
the size of the S-pocket and resulted in a decreased R-selectivity or
increased S-selectivity.
ApPDC prefers short nonbranched aliphatic donor aldehydes
in carboligase reactions. In mixed carboligation reactions with
benzaldehyde as the acceptor, PAC is synthesized at high R-
selectivity (93% ee). Upon exchange of the S-pocket residue E469
by glycine, the S-pocket was drastically increased and (S)-PAC
was formed at 61% and 89% ee for acetaldehyde and propanal,
respectively, as donors [66]. EcMenD uses α-ketoglutarate as donor
and accepts a broad range of aldehydes as acceptors. Carboligation
of α-ketoglutarate with benzaldehyde as acceptor resulted in the R-
product (ee >93%) [69]. However, by exchanging the two S-pocket
residues F475 and I474 by alanine and glycine, respectively, the S-
pocket became accessible for benzaldehyde as acceptor [67], and
the S-product was formed (75% ee). EcAHAS II starts with the initial
decarboxylation of pyruvate, which is subsequently transferred to
benzaldehyde as an acceptor, resulting in almost enantiopure (R)-
PAC (>98% ee) [70]. Mutations of the S-pocket residue V461 to
glycine and two additional residues at the entrance to the S-pocket
(G459A, M460A) allowed for benzaldehyde to bind in the S-pocket
and resulted in a considerably reduced R-selectivity (48% ee) [68].
In almost all ThDP-dependent decarboxylases, the S-pocket,
which binds the acceptor molecule, is small or inaccessible; thus
bulky acceptor molecules can only bind in parallel orientation
relative to the donor molecule and R-products are formed. Reducing
the size of the side chains of the S-pocket residues allows for binding
of the acceptor molecule in antiparallel orientation, resulting in a
decrease of R-selectivity for bulky acceptor molecules or a switch to
S-selectivity for aliphatic acceptor molecules.
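The selectivities in this section are reported as enantiomeric excess (ee). As a reading aid, the sketch below uses the standard definition, ee = 100 × (major − minor) / (major + minor), to convert between ee and the share of each enantiomer. The function names are illustrative and not part of any cited work.

```python
def enantiomeric_excess(major, minor):
    """Enantiomeric excess in percent from the amounts of both enantiomers."""
    total = major + minor
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * (major - minor) / total

def enantiomer_fractions(ee_percent):
    """Percent share of the major and minor enantiomer for a given ee."""
    major = (100.0 + ee_percent) / 2.0
    return major, 100.0 - major

# ee values quoted in the text for PpBFD with acetaldehyde (92% ee, S)
# and with propanal (21% ee, R) as acceptor.
for ee in (92, 21):
    major, minor = enantiomer_fractions(ee)
    print(f"{ee}% ee corresponds to {major}:{minor}")
```

Read this way, 92% ee corresponds to a 96:4 mixture, whereas 21% ee is a 60.5:39.5 mixture, which is only a weak preference for the R-product.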

9.6 Regioselectivity-Determining Positions: Design of Smart Cytochrome P450 Monooxygenase Libraries

Cytochrome P450 monooxygenases (CYPs) catalyze the oxidation
of inert hydrocarbons utilizing molecular oxygen as oxidant and
heme as catalytically active site [71]. The best studied enzymes are
the self-sufficient fatty acid hydroxylase CYP102A1 from Bacillus
megaterium, a fusion protein of a monooxygenase and a CPR-
type reductase, and the bacterial, camphor-oxidizing CYP101A1
from P. putida, and variants with widened substrate scope have
been developed. Often, however, substrates are converted with
low regioselectivity or the product of interest is not obtained.
Therefore, a general engineering strategy is needed to improve
regio- and chemoselectivity of CYPs. But how to identify among
many potential substrate-interacting residues those positions that
mediate selectivity of a broad range of substrates?
The Cytochrome P450 Engineering Database (www.CYPED.uni-
stuttgart.de) was designed as a navigation tool for the compre-
hensive and systematic analysis of CYP sequences and structures
[72]. A systematic analysis of the database to identify conserved
and variable positions in CYPs was combined with molecular
dynamics (MD) simulations to study enzyme–substrate interactions
in atomistic detail, starting from the X-ray structure of human
CYP2C9 in complex with its substrate warfarin bound far away from
the catalytic heme [66]. In total, 16 simulations were performed.
In only one simulation, the substrate approached the heme, and a
reactive substrate orientation was obtained. In all other simulations,
the substrate was binding in a large substrate cavity with a large
distance from the heme oxygen. In the reactive orientation, the
6 and 7 positions were close to the active heme oxygen (Fig.
9.5). Because short distances between the substrate and the heme
oxygen are prerequisite to a reactive substrate orientation, the
experimentally observed high selectivity of CYP2C9 toward the
7 and 6 positions (product ratio 71:22) was in agreement with
the simulation results. Thus, CYP2C9-catalyzed hydroxylation of
warfarin is highly regioselective and mainly depends on the shape
of the substrate-binding cavity, because other CYPs such as CYP3A4
hydroxylate warfarin at positions 4 and 10 [73]. In CYP2C9, the
orientation of warfarin was caused by a narrow hydrophobic funnel
between the heme group and the substrate-binding cavity. Similarly,
the experimentally observed regioselectivity of bacterial CYP102A1-
catalyzed oxidation of α-pinene was modeled using MD simulations
to monitor the preferred orientation of the substrate toward heme
[74].

Figure 9.5 Warfarin conformers with the 6 and 7 positions (black) or the
4 position (white) close to heme as compared to the warfarin orientation
in the crystal structure (gray). Arrows mark the 4-, 6-, and 7-hydroxylation
sites of warfarin. Reprinted with permission from Ref. [73]. Copyright ©
2006 WILEY-VCH.

Since for the great majority of CYP enzymes only sequence
information is available, it is of broad interest to identify selectivity-
determining residues from sequence alone. On the basis of the
analysis of 31 CYP crystal structures and over 6300 CYP sequences,
it was predicted that in 98.4% of all CYPs the residue at position
5 after the highly conserved ExxR motif is preferentially involved
in substrate binding during oxidation and, therefore, crucial for
selectivity control [75]. A further selectivity-determining residue
originates from the loop between the B-helix and the C-helix (BC-
loop). The ability to identify selectivity-determining positions from
CYP sequence further allowed us to analyze the distribution of
amino acids in these positions throughout the CYP family and re-
vealed a strong preference for hydrophobic residues (alanine, valine,
leucine, isoleucine, and phenylalanine), while charged residues are
rare exceptions [75, 76].
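The sequence-based rule for the first selectivity-determining position can be sketched as a simple lookup. The example assumes that "position 5 after the ExxR motif" is counted from the arginine of the motif. This reading, the function name, and the toy sequence are assumptions, and the published prediction relies on structure-guided alignments rather than on a bare pattern search.

```python
import re

# Lookahead so that overlapping ExxR occurrences are all found.
EXXR = re.compile(r"(?=E..R)")

def residues_after_exxr(sequence, offset=5):
    """Return (position, residue) located `offset` residues after each ExxR.

    Positions are 1-based. The offset is counted from the arginine of the
    motif, which is an assumption about how the rule is applied.
    """
    seq = sequence.upper()
    hits = []
    for match in EXXR.finditer(seq):
        arginine_index = match.start() + 3   # 0-based index of the R
        target = arginine_index + offset
        if target < len(seq):
            hits.append((target + 1, seq[target]))
    return hits

# Hypothetical toy sequence with a single ExxR motif (E4 ... R7).
toy = "MSTEALRPGWKAVLL"
print(residues_after_exxr(toy))
```

Since an ExxR pattern may occur more than once in a full-length sequence, each hit is only a candidate until it is confirmed against an alignment of the family.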
The prediction of selectivity-determining residues was applied
to develop selective CYP catalysts for biooxidation. The major dis-
advantage of approaches that are based on random mutagenesis is
the large number of variants that have to be generated and screened
[77]. Applying such methods to improve regio-,
chemo-, and stereoselectivity is time and resource intensive, since
high-throughput screening assays for selectivity would typically
require the extensive quantification of oxidation products by the use
of chromatographic methods. Therefore, the screening effort was
reduced by designing a small, highly enriched CYP102A1 variant
library by focusing mutagenesis on only two selectivity-determining
positions. The library size was further reduced by limiting the
number of amino acids to the five most abundant amino acids at
these positions, the hydrophobic amino acids A, V, I, L, and F [75,
76]. Thus, a focused CYP102A1 variant library was constructed
consisting of 25 combinations of 5 hydrophobic residues in the two
hotspot positions, F87 and A328 (Fig. 9.6) [78]. This focused library
was screened with four terpenes (geranylacetone, nerylacetone,
(4R)-limonene, and (+)-valencene). While the wild-type enzyme
converted all 4 terpenes with poor regioselectivity, 11 variants
demonstrated either a strong shift or an improved regio- or
stereoselectivity during oxidation of at least one substrate. Only
3 variants showed no activity toward any of the tested terpenes.
In a subsequent study, the effect of the two hotspot positions on
regioselectivity toward cyclic and acyclic alkanes was investigated
by screening the focused CYP102A1 variant library [79]. The
double variant F87V/A328F hydroxylated n-octane to 2-octanol
with higher regioselectivity (92%) than the wild type (15%).

Figure 9.6 Schematic view of the binding pocket of GX-type hydrolases.
The carbonyl oxygen of the oxyanion hole residue faces inward the binding
pocket of the alcohol moiety. Thus interaction of this oxygen with the
quaternary Cα of a TAE disables an appropriate binding of the substrate.
Reprinted with permission from Ref. [83]. Copyright © 2002 WILEY-VCH.

This double variant also efficiently hydroxylated cyclooctane at
a 100-fold increased oxidation rate. Other variants were highly
active toward cyclodecane and cyclododecane (A328V and
F87A/A328V, respectively).
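The size of the focused library follows directly from the design: five residues at each of two positions give 5 × 5 = 25 combinations, one of which is the wild type because F and A are both in the allowed set. The enumeration below is a minimal sketch; the variant naming and the function name are chosen for illustration only.

```python
from itertools import product

# Allowed residues and wild-type residues at the two hotspot positions,
# as given in the text (F87 and A328 of CYP102A1).
RESIDUES = "AVILF"
WILD_TYPE = {87: "F", 328: "A"}

def focused_library(wild_type=WILD_TYPE, residues=RESIDUES):
    """Enumerate all residue combinations at the hotspot positions."""
    positions = sorted(wild_type)
    variants = []
    for combo in product(residues, repeat=len(positions)):
        substitutions = [
            f"{wild_type[pos]}{pos}{aa}"
            for pos, aa in zip(positions, combo)
            if aa != wild_type[pos]
        ]
        variants.append("/".join(substitutions) if substitutions else "wild type")
    return variants

library = focused_library()
print(len(library))
print(library)
```

The list contains the variants named in the text, such as F87V/A328F, A328V, and F87A/A328V, which is a quick consistency check on the naming.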
To assess whether the concept of regioselectivity hotspots near
the heme was transferrable, residues that are equivalent to the two
hotspot positions in CYP102A1 were mutated in CYP153A from
Marinobacter aquaeolei. In the fatty acid ω-hydroxylase CYP153A,
L354 corresponds to A328 in CYP102A1 [80] and was shown to
be a crucial determinant of ω versus ω-1 regioselectivity. While the
wild-type enzyme was highly ω-selective toward nonanoic acid with
a ratio of 97:3 between the ω and the ω-1 product, the variant L354I
preferably hydroxylated nonanoic acid at the ω-1 position (24:76).
Thus, screening a focused variant library generated from a
variation of two regioselectivity-mediating hotspot positions offered
a promising first step to identify variants with changed
regioselectivity. In the second step, the best variant was further improved
by combining molecular modeling and variant screening. Because
the 7-hydroxylation product of limonene, perillyl alcohol, has shown
promising antitumor activity, a highly selective conversion of the
readily available limonene to perillyl alcohol would be desirable.
However, the regioselective oxidation at the C7 position represents
a central chemical challenge, because the unactivated sp3 C–H is less
reactive than alternate sites (three allylic ring positions and two
double bonds) and because it is not positioned adjacent to a direct-
ing group. Indeed, while the wild-type CYP102A1 converted (4R)-
limonene to four different oxidation products, terminal hydroxyla-
tion at the unactivated C7 position was not observed. However, upon
screening of the focused CYP102A1 variant library for conversion of
(4R)-limonene, one variant (A328V) was identified, which resulted
in 27% of the product being perillyl alcohol. In two subsequent
rounds of modeling and variant screening, two additional positions
were identified, which mediated regioselectivity toward the C7
position [81]. The triple variant A328V/L437F/A264V showed the
highest regioselectivity of 97% of the product being perillyl alcohol
and was obtained by screening of only 29 variants in total. The triple
variant was highly selective not only toward (4R)-limonene but also
toward molecules with a similar shape such as (4S)-limonene, p-
cymene, and trans-4-isopropyl-1-methylcyclohexane [81].

9.7 Substrate Specificity–Determining Positions: The GX/GGGX Motif in Lipases

Although all lipases catalyze the same reaction, the hydrolysis or
formation of an ester bond, their sequences vary widely. All lipases
belong to the same fold family, the α/β hydrolases. Their catalytic
machinery consists of two elements: a structurally conserved
catalytic triad (Ser-His-Asp/Glu) and an oxyanion hole formed by
two backbone amides of a residue in the N-terminal region of the
lipase and the C-terminal neighbor of the catalytic serine. While the
latter is located in the structurally conserved nucleophilic elbow, the
former is located in a variable loop. However, a systematic analysis
of the Lipase Engineering Database (www.LED.uni-stuttgart.de)
revealed that all lipases can be assigned to either of two types, the
GX type and the GGGX type [82]. Most lipases belong to the GX
type, where the position of the backbone amide of the oxyanion hole
residue X is highly conserved, while the side chain of X is hydrophilic
or hydrophobic. A few lipase and esterase superfamilies such as
the carboxylesterases belong to the GGGX type, where the oxyanion
hole-forming glycine (or alanine) is preceded by two glycines
and followed by a conserved hydrophobic residue X. This local
difference in sequence has a major impact on substrate specificity,
as concluded from modeling the binding of tertiary alcohol esters. In
GX types, the backbone carbonyl oxygen of the residue X pointed
into the binding pocket, resulting in a repulsive interaction with
the spherically arranged substituents of the quaternary Cα [83]. In
contrast, in GGGX types the carbonyl oxygen is arranged parallel to
the binding pocket, thus providing sufficient space for the sterically
demanding tertiary alcohol group. This analysis was confirmed by
measuring hydrolytic activity of GX- and GGGX-type lipases toward
esters of tertiary alcohols. While the GX-type esterases from P.
fluorescens and B. stearothermophilus showed no activity toward
selected tertiary alcohol esters, a GGGX-type acetylcholinesterase,
p-nitrobenzyl esterase, and pig liver esterase were active and even
stereoselective, as shown for p-nitrobenzyl esterase from B. subtilis
[84]. Thus, the presence of a short characteristic sequence motif
GGGX in a variable sequence environment was found to be highly
indicative for substrate specificity.
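The GX/GGGX distinction can be expressed as a small rule on the residues around the oxyanion hole. The sketch below assumes that the position of the oxyanion hole residue is already known, for example from an alignment to a family with a solved structure. The hydrophobic residue set, the function name, and the toy fragments are assumptions for illustration.

```python
# Assumed set of hydrophobic residues; the chapter does not define one.
HYDROPHOBIC = set("AVILMFW")

def oxyanion_type(sequence, position):
    """Classify a hydrolase as 'GGGX', 'GX', or 'unassigned'.

    `position` is the 1-based position of the oxyanion hole residue.
    GGGX type: the oxyanion hole residue is G (or A), preceded by two
    glycines and followed by a hydrophobic residue X.
    GX type: the oxyanion hole residue X is preceded by a glycine.
    """
    seq = sequence.upper()
    i = position - 1
    if not 0 <= i < len(seq):
        raise ValueError("position outside of sequence")
    if (
        i >= 2
        and i + 1 < len(seq)
        and seq[i - 2:i] == "GG"
        and seq[i] in "GA"
        and seq[i + 1] in HYDROPHOBIC
    ):
        return "GGGX"
    if i >= 1 and seq[i - 1] == "G":
        return "GX"
    return "unassigned"

# Hypothetical fragments around the oxyanion hole.
print(oxyanion_type("TTHGGGFAAA", 6))
print(oxyanion_type("PLSGTLKK", 5))
```

The GGGX test is applied first because a GGGX site also satisfies the weaker GX condition. Scanning a sequence without a known oxyanion hole position would produce many chance matches and is therefore not attempted here.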

9.8 Conclusion

There are several lessons that we learned from studying sequences
and structures of protein families. Although more than 50 million
protein sequences are already known [3], the viable sequence space
is largely unknown. The number of different contemporary protein
sequences was estimated to be 10^34 [85]. Thus we only know a
tiny fraction of the total sequence space of contemporary proteins.
Although it is difficult to predict its structure, the amazingly rich
microdiversity seen in a few instances (such as the β-lactamases)
indicates that the sequence space of existing, functional proteins is
highly interlinked. It was estimated that each day, 600 new variants
of every microbial gene contained in a 100 g soil sample emerge
[86]. Despite the overwhelming diversity of protein sequences, we
might rely on the intrinsic simplicity of protein architecture and
its high level of structural modularity. Homologous modules and
domains are found in all protein families and have been exchanged,
rearranged, fine-tuned, and reused during evolution. The domains
are amazingly conserved in structure and carry function, which is
encoded in a limited number of structural and sequence motifs,
though hidden in a confusing microdiversity. Thus, we can learn
about the relationship between sequence and function by collecting
available sequence and structure information, identifying domains,
analyzing conservation and variation, and relating sequence and
structure information to known biochemical properties on a
domain level. And we can apply this knowledge for the design of
novel enzymes with desired properties by recombining domains,
transplanting functionally relevant motifs, and fine-tuning variable
positions to optimize activity, specificity, selectivity, or stability.

Acknowledgments

Financial support from Deutsche Forschungsgemeinschaft
(FOR1296, SFB706, and SFB716) is gratefully acknowledged.

References

1. Sanger, F., and Tuppy, H. (1951). The amino-acid sequence in the
phenylalanyl chain of insulin. I. The identification of lower peptides
from partial hydrolysates, Biochem. J., 49, pp. 463–481.
2. Benson, D. A., Cavanaugh, M., Clark, K., Karsch-Mizrachi, I., Lipman, D. J.,
Ostell, J., and Sayers, E. W. (2013). GenBank, Nucleic Acids Res., 41, pp.
D36–D42.
3. Apweiler, R., Bateman, A., Martin, M. J., O’Donovan, C., Magrane, M., Alam-
Faruque, Y., Alpi, E., Antunes, R., Arganiska, J., Casanova, E. B., et al.
(2014). Activities at the universal protein resource (UniProt), Nucleic
Acids Res., 42, pp. D191–D198.

4. Bilu, Y., Agarwal, P. K., and Kolodny, R. (2006). Faster algorithms for
optimal multiple sequence alignment based on pairwise comparisons,
IEEE-ACM Trans. Comput. Biol. Bioinform., 3, pp. 408–422.
5. Gong, Z., Li, F. Z., and Dong, L. H. (2010). Performance assessment of
protein multiple sequence alignment algorithms based on permutation
similarity measurement, Biochem. Biophys. Res. Commun., 399, pp. 470–
474.
6. Qian, Z., and Lutz, S. (2005). Circular permutation of Candida antarctica
lipase B, Abstr. Pap. Am. Chem. Soc., 230, pp. U631–U631.
7. Dai, X. F., Zhu, M. L., and Wang, Y. P. (2014). Circular permutation of E.
coli EPSP synthase: increased inhibitor resistance, improved catalytic
activity, and an indicator for protein fragment complementation, Chem.
Commun., 50, pp. 1830–1832.
8. Guntas, G., Kanwar, M., and Ostermeier, M. (2012). Circular permutation
in the Ω-loop of TEM-1 β-lactamase results in improved activity and
altered substrate specificity, PLOS ONE, 7, p. e35998.
9. Zhang, Y., Sun, Y. N., and Cole, J. R. (2013). A sensitive and accurate
protein domain classification tool (SALT) for short reads, Bioinformatics,
29, pp. 2103–2111.
10. Xu, Q. F., and Dunbrack, R. L. (2012). Assignment of protein sequences
to existing domain and family classification systems: Pfam and the
PDB, Bioinformatics, 28, pp. 2763–2772.
11. Widmann, M., Radloff, R., and Pleiss, J. (2010). The thiamine
diphosphate dependent enzyme engineering database: a tool for the
systematic analysis of sequence and structure relations, BMC Biochem.,
11, p. 9.
12. Marston, F. Y., Grainger, W. H., Smits, W. K., Hopcroft, N. H., Green, M.,
Hounslow, A. M., Grossman, A. D., Craven, C. J., and Soultanas, P. (2010).
When simple sequence comparison fails: the cryptic case of the shared
domains of the bacterial replication initiation proteins DnaB and DnaD,
Nucleic Acids Res., 38, pp. 6930–6942.
13. Fischer, M., Thai, Q. K., Grieb, M., and Pleiss, J. (2006). DWARF: a data
warehouse system for analyzing protein families, BMC Bioinformatics,
7, p. 495.
14. Vogel, C., Widmann, M., Pohl, M., and Pleiss, J. (2012). A standard num-
bering scheme for thiamine diphosphate-dependent decarboxylases,
BMC Biochem., 13, p. 24.
15. Llarrull, L. I., Testero, S. A., Fisher, J. F., and Mobashery, S. (2010). The
future of the beta-lactams, Curr. Opin. Microbiol., 13, pp. 551–557.
16. Ambler, R. P. (1980). The structure of beta-lactamases, Philos. Trans. R.
Soc. Lond. B Biol. Sci., 289, pp. 321–331.
17. Bush, K., Jacoby, G. A., and Medeiros, A. A. (1995). A functional
classification scheme for beta-lactamases and its correlation with
molecular structure, Antimicrob. Agents Chemother., 39, pp. 1211–1233.
18. Fisher, J. F., Meroueh, S. O., and Mobashery, S. (2005). Bacterial resis-
tance to beta-lactam antibiotics: compelling opportunism, compelling
opportunity, Chem. Rev., 105, pp. 395–424.
19. Crowder, M. W., Spencer, J., and Vila, A. J. (2006). Metallo-beta-
lactamases: novel weaponry for antibiotic resistance in bacteria, Acc.
Chem. Res., 39, pp. 721–728.
20. Walsh, T. R., Toleman, M. A., Poirel, L., and Nordmann, P. (2005). Metallo-
beta-lactamases: the quiet before the storm, Clin. Microbiol. Rev., 18, pp.
306–325.
21. Oelschlaeger, P., Ai, N., Duprez, K. T., Welsh, W. J., and Toney, J. H. (2010).
Evolving carbapenemases: can medicinal chemists advance one step
ahead of the coming storm, J. Med. Chem., 53, pp. 3013–3027.
22. Cornaglia, G., Giamarellou, H., and Rossolini, G. M. (2011). Metallo-beta-
lactamases: a last frontier for beta-lactams, Lancet Infect. Dis., 11, pp.
381–393.
23. Kumarasamy, N., Venkatesh, K. K., Devaleenal, B., Poongulali, S.,
Yepthomi, T., Solomon, S., Flanigan, T. P., and Mayer, K. H. (2011).
Safety, tolerability, and efficacy of second-line generic protease inhibitor
containing HAART after first-line failure among South Indian HIV-
infected patients, J. Int. Assoc. Physicians AIDS Care (Chic), 10, pp. 71–75.
24. Osano, E., Arakawa, Y., Wacharotayankun, R., Ohta, M., Horii, T., Ito,
H., Yoshimura, F., and Kato, N. (1994). Molecular characterization of
an enterobacterial metallo beta-lactamase found in a clinical isolate
of Serratia marcescens that shows imipenem resistance, Antimicrob.
Agents Chemother., 38, pp. 71–78.
25. Laraki, N., Galleni, M., Thamm, I., Riccio, M. L., Amicosante, G., Frere,
J. M., and Rossolini, G. M. (1999). Structure of In31, a blaIMP-
containing Pseudomonas aeruginosa integron phyletically related to
In5, which carries an unusual array of gene cassettes, Antimicrob. Agents
Chemother., 43, pp. 890–901.
26. Koh, T. H., Babini, G. S., Woodford, N., Sng, L. H., Hall, L. M.,
and Livermore, D. M. (1999). Carbapenem-hydrolysing IMP-1 beta-
lactamase in Klebsiella pneumoniae from Singapore, Lancet, 353, p.
2162.

27. Oelschlaeger, P. (2008). Outsmarting metallo-beta-lactamases by
mimicking their natural evolution, J. Inorg. Biochem., 102, pp. 2043–2051.
28. Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J., and Sayers, E.
W. (2011). GenBank, Nucleic Acids Res., 39, pp. D32–D37.
29. Garau, G., Garcia-Saez, I., Bebrone, C., Anne, C., Mercuri, P., Galleni, M.,
Frere, J. M., and Dideberg, O. (2004). Update of the standard numbering
scheme for class B beta-lactamases, Antimicrob. Agents Chemother., 48,
pp. 2347–2349.
30. Widmann, M., Pleiss, J., and Oelschlaeger, P. (2012). Systematic analysis
of metallo-beta-lactamases using an automated database, Antimicrob.
Agents Chemother., 56, pp. 3481–3491.
31. Oelschlaeger, P., Mayo, S. L., and Pleiss, J. (2005). Impact of remote
mutations on metallo-beta-lactamase substrate specificity: implications
for the evolution of antibiotic resistance, Protein Sci., 14, pp. 765–774.
32. Oelschlaeger, P., and Mayo, S. L. (2005). Hydroxyl groups in the
(beta)beta sandwich of metallo-beta-lactamases favor enzyme activity:
a computational protein design study, J. Mol. Biol., 350, pp. 395–401.
33. Mendes, R. E., Castanheira, M., Toleman, M. A., Sader, H. S., Jones, R.
N., and Walsh, T. R. (2007). Characterization of an integron carrying
blaIMP-1 and a new aminoglycoside resistance gene, aac(6’)-31, and
its dissemination among genetically unrelated clinical isolates in
a Brazilian hospital, Antimicrob. Agents Chemother., 51, pp. 2611–
2614.
34. Liu, Y., Zhang, B., Cao, Q., Huang, W., Shen, L., and Qin, X. (2009).
Two clinical strains of Klebsiella pneumoniae carrying plasmid-
borne blaIMP-4, blaSHV-12, and armA isolated at a pediatric center
in Shanghai, China, Antimicrob. Agents Chemother., 53, pp. 1642–
1644.
35. Toleman, M. A., Biedenbach, D., Bennett, D., Jones, R. N., and Walsh, T.
R. (2003). Genetic characterization of a novel metallo-beta-lactamase
gene, blaIMP-13, harboured by a novel Tn5051-type transposon
disseminating carbapenemase genes in Europe: report from the SEN-
TRY worldwide antimicrobial surveillance programme, J. Antimicrob.
Chemother., 52, pp. 583–590.
36. Mazzariol, A., Mammina, C., Koncan, R., Di Gaetano, V., Di Carlo, P.,
Cipolla, D., Corsello, G., and Cornaglia, G. (2011). A novel VIM-type
metallo-beta-lactamase (VIM-14) in a Pseudomonas aeruginosa clinical
isolate from a neonatal intensive care unit, Clin. Microbiol. Infect., 17, pp.
722–724.
37. Anderson, A. J., and Dawes, E. A. (1990). Occurrence, metabolism,
metabolic role, and industrial uses of bacterial polyhydroxyalkanoates,
Microbiol. Rev., 54, pp. 450–472.
38. Jendrossek, D., and Handrick, R. (2002). Microbial degradation of
polyhydroxyalkanoates, Annu. Rev. Microbiol., 56, pp. 403–432.
39. Jaeger, K. E., Steinbüchel, A., and Jendrossek, D. (1995). Substrate
specificities of bacterial polyhydroxyalkanoate depolymerases and
lipases: bacterial lipases hydrolyze poly(omega-hydroxyalkanoates),
Appl. Environ. Microbiol., 61, pp. 3113–3118.
40. Behrends, A., Klingbeil, B., and Jendrossek, D. (1996).
Poly(3-hydroxybutyrate) depolymerases bind to their substrate by a
C-terminal located substrate binding site, FEMS Microbiol. Lett., 143,
pp. 191–194.
41. Knoll, M., Hamm, T. M., Wagner, F., Martinez, V., and Pleiss, J. (2009). The
PHA depolymerase engineering database: a systematic analysis tool for
the diverse family of polyhydroxyalkanoate (PHA) depolymerases, BMC
Bioinformatics, 10, p. 89.
42. Jendrossek, D., Frisse, A., Behrends, A., Andermann, M., Kratzin, H. D.,
Stanislawski, T., and Schlegel, H. G. (1995). Biochemical and molecular
characterization of the Pseudomonas lemoignei polyhydroxyalkanoate
depolymerase system, J. Bacteriol., 177, pp. 596–607.
43. Tokiwa, Y., and Calabia, B. P. (2004). Degradation of microbial polyesters,
Biotechnol. Lett., 26, pp. 1181–1189.
44. Hisano, T., Kasuya, K., Tezuka, Y., Ishii, N., Kobayashi, T., Shiraki, M.,
Oroudjev, E., Hansma, H., Iwata, T., Doi, Y., et al. (2006). The crystal
structure of polyhydroxybutyrate depolymerase from Penicillium fu-
niculosum provides insights into the recognition and degradation of
biopolyesters, J. Mol. Biol., 356, pp. 993–1004.
45. Papageorgiou, A. C., Hermawan, S., Singh, C. B., and Jendrossek, D.
(2008). Structural basis of poly(3-hydroxybutyrate) hydrolysis by
PhaZ7 depolymerase from Paucimonas lemoignei, J. Mol. Biol., 382, pp.
1184–1194.
46. Handrick, R., Reinhardt, S., Focarete, M. L., Scandola, M., Adamus, G.,
Kowalczuk, M., and Jendrossek, D. (2001). A new type of thermoalka-
lophilic hydrolase of Paucimonas lemoignei with high specificity for
amorphous polyesters of short chain-length hydroxyalkanoic acids, J.
Biol. Chem., 276, pp. 36215–36224.
47. Enders, D., Niemeier, O., and Henseler, A. (2007). Organocatalysis by
N-heterocyclic carbenes, Chem. Rev., 107, pp. 5606–5655.

48. Zeitler, K. (2005). Extending mechanistic routes in heterazolium
catalysis-promising concepts for versatile synthetic methods, Angew.
Chem., Int. Ed., 44, pp. 7506–7510.
49. Demir, A. S., Ayhan, P., and Sopaci, S. B. (2007). Thiamine pyrophosphate
dependent enzyme catalyzed reactions: stereoselective C-C bond
formations in water, Clean-Soil Air Water, 35, pp. 406–412.
50. Müller, M., Gocke, D., and Pohl, M. (2009). Thiamin diphosphate in
biological chemistry: exploitation of diverse thiamin diphosphate-
dependent enzymes for asymmetric chemoenzymatic synthesis, FEBS
J., 276, pp. 2894–2904.
51. Pohl, M., Sprenger, G. A., and Muller, M. (2004). A new perspective on
thiamine catalysis, Curr. Opin. Biotechnol., 15, pp. 335–342.
52. Berthold, C. L., Gocke, D., Wood, D., Leeper, F. J., Pohl, M., and Schneider, G.
(2007). Structure of the branched-chain keto acid decarboxylase (KdcA)
from Lactococcus lactis provides insights into the structural basis for
the chemoselective and enantioselective carboligation reaction, Acta
Crystallogr. D, 63, pp. 1217–1224.
53. Iding, H., Siegert, P., Mesch, K., and Pohl, M. (1998). Application of alpha-
keto acid decarboxylases in biotransformations, Biochim. Biophys. Acta,
1385, pp. 307–322.
54. Stillger, T., Pohl, M., Wandrey, C., and Liese, A. (2006). Reaction
engineering of benzaldehyde lyase from Pseudomonas fluorescens
catalyzing enantioselective C-C bond formation, Org. Process Res. Dev.,
10, pp. 1172–1177.
55. Casteels, M., Foulon, V., Mannaerts, G. P., and Van Veldhoven, P. P. (2003).
Alpha-oxidation of 3-methyl-substituted fatty acids and its thiamine
dependence, Eur. J. Biochem., 270, pp. 1619–1627.
56. Bornemann, S., Crout, D. H. G., Dalton, H., Hutchinson, D. W., Dean,
G., Thomson, N., and Turner, M. M. (1993). Stereochemistry of the
formation of lactaldehyde and acetoin produced by the pyruvate
decarboxylases of yeast (Saccharomyces sp) and Zymomonas-mobilis:
different Boltzmann distributions between bound forms of the
electrophile, acetaldehyde, in the 2 enzymatic-reactions, J. Chem. Soc.,
Perkin Trans. 1, pp. 309–311.
57. Costelloe, S. J., Ward, J. M., and Dalby, P. A. (2008). Evolutionary analysis
of the TPP-dependent enzyme family, J. Mol. Evol., 66, pp. 36–49.
58. Duggleby, R. G. (2006). Domain relationships in thiamine diphosphate-
dependent enzymes, Acc. Chem. Res., 39, pp. 550–557.
59. Wang, J. J. L., Martin, P. R., and Singleton, C. K. (1997). Aspartate 155 of
human transketolase is essential for thiamine diphosphate magnesium
binding, and cofactor binding is required for dimer formation, Biochim.
Biophys. Acta, 1341, pp. 165–172.
60. Werther, T., Zimmer, A., Wille, G., Golbik, R., Weiss, M. S., and
König, S. (2010). New insights into structure–function relationships of
oxalyl CoA decarboxylase from Escherichia coli, FEBS J., 277, pp. 2628–
2640.
61. Lindqvist, Y., and Schneider, G. (1993). Thiamin diphosphate dependent
enzymes: transketolase, pyruvate oxidase and pyruvate decarboxylase,
Curr. Opin. Struct. Biol., 3, pp. 896–901.
62. Brovetto, M., Gamenara, D., Mendez, P. S., and Seoane, G. A. (2011). C-
C bond-forming lyases in organic synthesis, Chem. Rev., 111, pp. 4346–
4403.
63. Knoll, M., Muller, M., Pleiss, J., and Pohl, M. (2006). Factors mediating ac-
tivity, selectivity, and substrate specificity for the thiamin diphosphate-
dependent enzymes benzaldehyde lyase and benzoylformate decar-
boxylase, ChemBioChem, 7, pp. 1928–1934.
64. Demir, A. S., Sesenoglu, O., Eren, E., Hosrik, B., Pohl, M., Janzen,
E., Kolter, D., Feldmann, R., Dunkelmann, P., and Muller, M. (2002).
Enantioselective synthesis of alpha-hydroxy ketones via benzaldehyde
lyase-catalyzed C-C bond formation reaction, Adv. Synth. Catal., 344, pp.
96–103.
65. Gocke, D., Walter, L., Gauchenova, E., Kolter, G., Knoll, M., Berthold, C. L.,
Schneider, G., Pleiss, J., Muller, M., and Pohl, M. (2008). Rational pro-
tein design of ThDP-dependent enzymes-engineering stereoselectivity,
ChemBioChem, 9, pp. 406–412.
66. Rother, D., Kolter, G., Gerhards, T., Berthold, C. L., Gauchenova, E.,
Knoll, M., Pleiss, J., Muller, M., Schneider, G., and Pohl, M. (2011). S-
selective mixed carboligation by structure-based design of the pyruvate
decarboxylase from Acetobacter pasteurianus, ChemCatChem, 3, pp.
1587–1596.
67. Westphal, R., Waltzer, S., Mackfeld, U., Widmann, M., Pleiss, J., Beigi, M.,
Muller, M., Rother, D., and Pohl, M. (2013). (S)-Selective MenD variants
from Escherichia coli provide access to new functionalized chiral alpha-
hydroxy ketones, Chem. Commun. (Camb.), 49, pp. 2061–2063.
68. Hailes, H. C., Rother, D., Muller, M., Westphal, R., Ward, J. M., Pleiss, J.,
Vogel, C., and Pohl, M. (2013). Engineering stereoselectivity of ThDP-
dependent enzymes, FEBS J., 280, pp. 6374–6394.

69. Kurutsch, A., Richter, M., Brecht, V., Sprenger, G. A., and Muller, M. (2009).
MenD as a versatile catalyst for asymmetric synthesis, J. Mol. Catal. B:
Enzym., 61, pp. 56–66.
70. Engel, S., Vyazmensky, M., Geresh, S., Barak, Z., and Chipman, D. M.
(2003). Acetohydroxyacid synthase: a new enzyme for chiral synthesis
of R-phenylacetylcarbinol, Biotechnol. Bioeng., 83, pp. 833–840.
71. Urlacher, V. B., and Girhard, M. (2012). Cytochrome P450
monooxygenases: an update on perspectives for synthetic application,
Trends Biotechnol., 30, pp. 26–36.
72. Fischer, M., Knoll, M., Sirim, D., Wagner, F., Funke, S., and Pleiss, J.
(2007). The cytochrome P450 engineering database: a navigation and
prediction tool for the cytochrome P450 protein family, Bioinformatics,
23, pp. 2015–2017.
73. Seifert, A., Tatzel, S., Schmid, R. D., and Pleiss, J. (2006). Multiple
molecular dynamics simulations of human p450 monooxygenase
CYP2C9: the molecular basis of substrate binding and regioselectivity
toward warfarin, Proteins, 64, pp. 147–155.
74. Branco, R. J., Seifert, A., Budde, M., Urlacher, V. B., Ramos, M. J., and
Pleiss, J. (2008). Anchoring effects in a wide binding pocket: the
molecular basis of regioselectivity in engineered cytochrome P450
monooxygenase from B. megaterium, Proteins, 73, pp. 597–607.
75. Seifert, A., and Pleiss, J. (2009). Identification of selectivity-determining
residues in cytochrome P450 monooxygenases: a systematic analysis of
the substrate recognition site 5, Proteins, 74, pp. 1028–1035.
76. Sirim, D., Widmann, M., Wagner, F., and Pleiss, J. (2010). Prediction and
analysis of the modular structure of cytochrome P450 monooxygenases,
BMC Struct. Biol., 10, p. 34.
77. Chen, M. M., Snow, C. D., Vizcarra, C. L., Mayo, S. L., and Arnold, F. H.
(2012). Comparison of random mutagenesis and semi-rational designed
libraries for improved cytochrome P450 BM3-catalyzed hydroxylation
of small alkanes, Protein Eng. Des. Sel., 25, pp. 171–178.
78. Seifert, A., Vomund, S., Grohmann, K., Kriening, S., Urlacher, V. B., Laschat,
S., and Pleiss, J. (2009). Rational design of a minimal and highly
enriched CYP102A1 mutant library with improved regio-, stereo- and
chemoselectivity, ChemBioChem, 10, pp. 853–861.
79. Weber, E., Seifert, A., Antonovici, M., Geinitz, C., Pleiss, J., and Urlacher,
V. B. (2011). Screening of a minimal enriched P450 BM3 mutant library
for hydroxylation of cyclic and acyclic alkanes, Chem. Commun. (Camb.),
47, pp. 944–946.
80. Malca, S. H., Scheps, D., Kuhnel, L., Venegas-Venegas, E., Seifert, A., Nestl,
B. M., and Hauer, B. (2012). Bacterial CYP153A monooxygenases for the
synthesis of omega-hydroxylated fatty acids, Chem. Commun., 48, pp.
5115–5117.
81. Seifert, A., Antonovici, M., Hauer, B., and Pleiss, J. (2011). An efficient
route to selective bio-oxidation catalysts: an iterative approach com-
prising modeling, diversification, and screening, based on CYP102A1,
ChemBioChem, 12, pp. 1346–1351.
82. Pleiss, J., Fischer, M., Peiker, M., Thiele, C., and Schmid, R. D. (2000).
Lipase engineering database: understanding and exploiting sequence-
structure-function relationships, J. Mol. Catal. B: Enzym., 10, pp. 491–
508.
83. Henke, E., Pleiss, J., and Bornscheuer, U. T. (2002). Activity of lipases
and esterases towards tertiary alcohols: insights into structure-function
relationships, Angew. Chem., Int. Ed., 41, pp. 3211–3213.
84. Henke, E., Bornscheuer, U. T., Schmid, R. D., and Pleiss, J. (2003). A
molecular mechanism of enantiorecognition of tertiary alcohols by
carboxylesterases, ChemBioChem, 4, pp. 485–493.
85. Dryden, D. T. F., Thomson, A. R., and White, J. H. (2008). How much of
protein sequence space has been explored by life on earth, J. R. Soc.
Interface, 5, pp. 953–956.
86. Gabor, E., Niehaus, F., Aehle, W., and Eck, J. (2012). Zooming in on
metagenomics: molecular microdiversity of subtilisin Carlsberg in soil,
J. Mol. Biol., 418, pp. 16–20.

Chapter 10

Bioinformatic Analysis of Protein Families to Select Function-Related Variable Positions

Dmitry Suplatov, Evgeny Kirilin, and Vytas Švedas


Belozersky Institute of Physicochemical Biology, Lomonosov Moscow State University,
Moscow 119991, Russia
d.a.suplatov@belozersky.msu.ru

Understanding how changes in protein sequence affect biological
function is one of the most challenging problems of modern structural
biology. During the evolution of proteins from a common ancestor,
one functional property can be preserved while others vary,
leading to functional diversity. For example, homologous enzymes
within a superfamily can share a common structural framework and
overall reaction chemistry but differ in other catalytic properties
such as substrate specificity, enantioselectivity, and regioselectivity
or even catalyze promiscuous chemical conversions. Why do similar
active sites in homologous enzymes perform different chemical
transformations? How should we study the structure–function
relationship and predict particular structural changes that lead to
functional diversity?
Our understanding of the fundamental molecular mechanisms can, in
practice, be evaluated by the ability to rationally design protein
functions that have important biotechnological applications.
However, as yet no clear methodology has been suggested to
change the structure of an enzyme in order to improve its catalytic
properties or modify its function.

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

While sequence data and crystallographic structures of pro-
teins and protein complexes reveal amino acid composition and
organization of the active sites, as well as binding interfaces, they
do not provide information about the importance of individual
residues. Consequently, in this chapter we suggest a systematic
analysis of adaptive mutations that survived evolution as a key
to understanding the impact of amino acid substitutions on enzyme
function. An overview is given of bioinformatic methods that
can be used to identify variable amino acid residues that seem to
play an important role in protein function. We describe the algorithmic
assumptions behind them and outline the biological problems to which
they can be applied. We then focus on the most recent advances
in the field and discuss the bioinformatic methods that can help to
select a focused set of variable positions to be used as hotspots for
directed evolution or rational design of the protein function.

10.1 Introduction

Proteins are linear polymers built of amino acids with different
physicochemical properties. The unique functions of enzymes as
catalytically active proteins are determined by a precise sequence
of amino acid residues that allows them to fold into complex 3D
structures that include active site residues under certain functional
constraints. Over millions of years the change of a protein sequence
was driven by mutations that led to a new function and provided
selective advantage to the host organism. Consequently, evolution
has optimized catalytic capabilities of enzymes by sequence
modifications—substitutions, deletions, or insertions of residues—
diverging them into families of homologous proteins.
The recent growth of protein sequence and structural databases
has highlighted functional variations of homologous enzymes [1–3].
Evolutionary distance between relatives defines the degree to
which their sequence, structural, and, consequently, functional
features have changed. Therefore, similarity of primary and tertiary
structures can be used to classify proteins at different levels of
functional categories. Remote homologs that have little sequence
similarity but whose structural organization and major functional
features suggest common evolutionary origin can be grouped into
superfamilies. They are assumed to have diverged from a very distant
common ancestor and are likely to have broad functional variability,
for example, differing in both the reaction chemistry and substrate
profile. Closer homologs with high sequence similarity usually share
a similar reaction mechanism but may act on diverse substrates
and can be grouped into families assuming a closer common
ancestor and, as a consequence, lower functional diversity. Finally,
we consider a subfamily as a group of proteins with high sequence
and structural identity and functional homogeneity. In summary, the
term superfamily can be used to describe a common evolutionary
origin of functionally diverse enzymes on the basis of significant
structural similarity and common major functional features (e.g.,
common topology and role of the catalytic residues). Assignment
of families and subfamilies within a superfamily would then refer
to groups of evolutionarily related enzymes with similar properties
and would be subject to our understanding of the hierarchy of
functional categories. This sequence-structure-function classification
of proteins is a very important tool to organize information
about different enzymes and study them systematically in order
to understand the molecular mechanisms of their functioning. It is
much more versatile than the empirical EC classification of
enzymes, which is based on quite limited experimental data and thus
has some well-known drawbacks [3].
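The family and subfamily levels described above rest on pairwise sequence similarity, which can be illustrated with a minimal sketch. It assumes that the sequences are already aligned to equal length with "-" as the gap character. The sequence names, the toy sequences, and the identity thresholds (80 and 65) are hypothetical; the chapter gives no numerical cutoffs. Superfamilies are not covered, because they are defined by structural similarity rather than by sequence identity.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def group_by_identity(aligned, threshold):
    """Single-linkage grouping: sequences end up in one group when a chain of
    pairs with identity >= threshold connects them."""
    names = list(aligned)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links up to the representative of the group.
        while parent[n] != n:
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if percent_identity(aligned[a], aligned[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy alignment (not real enzymes).
aligned = {
    "enzA": "MKTAYIAKQR",
    "enzB": "MKTAYIAKQK",
    "enzC": "MKSAYLAKHR",
    "enzD": "MR-GWLPEHN",
}
print(group_by_identity(aligned, 80))  # stricter, subfamily-like grouping
print(group_by_identity(aligned, 65))  # looser, family-like grouping
```

Lowering the threshold merges groups, which mirrors the hierarchy of subfamilies nested within families.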
During the past decades enzymes have attracted considerable
scientific attention due to their catalytic potential and various
biotechnological applications [4]. Consequently, significant protein
engineering efforts have been undertaken to study the catalytic
mechanisms in order to enrich our understanding of how enzymes
function and to find the relationship between sequence
and structure. Stochastic techniques such as experimental directed
evolution, which combines random mutagenesis with screening and
selection for a desired phenotype to mimic the Darwinian process,
have been developed to produce enzymes with improved functional
properties [5–9]. However, the total number of possible mutations
in a protein structure is astronomical. Thus, completely random
approaches were resource-demanding and required very large
mutant libraries and efficient screening and selection techniques
but still were able to scan only a small part of the sequence
space. Moreover, these experiments limited the evolutionary process
to incorporating only point mutations rather than insertions or
deletions. At the same time it is known that extra loops (or their
absence) in enzymes, especially in the active sites, can have a great
impact on the type or efficiency of chemical transformation (see Ref.
[10] as an example).
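To give a sense of scale for the size of the sequence space mentioned above, the following sketch counts the variants that differ from a parent sequence at exactly k positions, C(L, k) x 19^k for a protein of length L, assuming 19 alternative amino acids per position and substitutions only. The length of 300 residues is an arbitrary illustrative value, not a figure from this chapter.

```python
from math import comb


def count_variants(length, k, alphabet=20):
    """Number of sequences that differ from the parent at exactly k positions
    (substitutions only; insertions and deletions are not counted)."""
    return comb(length, k) * (alphabet - 1) ** k


length = 300  # hypothetical enzyme length
for k in (1, 2, 3, 5):
    print(f"{k} substitutions: {count_variants(length, k):.3e} variants")
```

Even a handful of simultaneous substitutions exceeds what a screening campaign can cover, which is the motivation for the focused strategies discussed next.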
Rapid expansion of computer technologies and the availability of
quickly growing genomic and structural databases marked a
new trend away from unguided evolutionary approaches [11,
12]. This triggered the development of focused stochastic
methods that attempt to improve efficiency by
decreasing the size of libraries and increasing the frequency of
functionally beneficial phenotypes. These methods implement different
optimization strategies, sometimes combined within one method,
which can be grouped into three main classes.
The first class of methods applies saturation mutagenesis
only to those parts of the protein that seem most likely to
lead to improvements. These hotspots can be identified from
a prior round of unfocused directed evolution and subsequent
selection of those mutations that significantly improved the desired
properties [13]. The growing availability of protein structural
information has inspired more knowledge-based strategies. As
far as catalytic activity or substrate specificity is concerned,
the amino acid residues that are close to the active site in
3D space were suggested as the most interesting targets for
saturation mutagenesis [14]. Combinatorial Active-Site Saturation
Testing (CASTing) is a further development of this idea. It
chooses several nearby residues from the active site and performs
simultaneous mutagenesis to take advantage of the additive effect
of multiple mutations [15]. Iterative approaches that combine
selected beneficial active site mutations in several consecutive
rounds of mutagenesis have led to further improvements [16–
18]. All these methods are strongly dependent on the accuracy of
the hotspot selection. Indeed, it is known from experimental
enzymology that mutations of the active site residues can change
enantioselectivity, substrate specificity, and catalytic promiscuity
more effectively than distant ones. However, interacting residues
are not the sole determinant of substrate specificity as residues
distributed over the fold can provide shape and flexibility to the
active site for complementary interactions with the substrate. Thus,
both close and distant mutations can be important for substrate
specificity and catalytic activity [19]. Moreover, stability can be a
crucial issue in the evolution of functional diversity by allowing a
protein to tolerate destabilizing mutations that confer functionally
beneficial phenotypes [20]. These facts highlight the complexity of the
evolutionary adaptation and suggest that functionally important
residues should not be associated only with the active site area.
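The structure-guided selection of hotspots close to the active site can be sketched as a simple distance filter. The example below is a minimal illustration, assuming that atomic coordinates (in Å) have already been extracted from a structure file into plain Python containers. The residue numbers, the coordinates, and the 6 Å cutoff are hypothetical, and the function name is ours. As noted above, such a filter captures only residues near the active site and will miss distant positions that also influence function.

```python
from math import dist


def residues_near_site(residue_atoms, site_atoms, cutoff=6.0):
    """Return ids of residues with at least one atom within `cutoff`
    of any atom that defines the active site (or a bound ligand)."""
    near = []
    for res_id, atoms in residue_atoms.items():
        if any(dist(a, s) <= cutoff for a in atoms for s in site_atoms):
            near.append(res_id)
    return sorted(near)


# Hypothetical toy coordinates: residue number -> list of atom positions.
residue_atoms = {
    45: [(0.0, 0.0, 0.0)],
    102: [(3.0, 4.0, 0.0)],
    210: [(20.0, 0.0, 0.0)],
}
site_atoms = [(0.0, 0.0, 1.0)]  # e.g., an atom of the catalytic nucleophile

print(residues_near_site(residue_atoms, site_atoms))
```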
The second optimization strategy is to limit the alphabet of
substitutions at each position. Typically, the NNK codon (N = GATC
and K = GT) is used for saturation mutagenesis to produce random
peptides; it comprises a redundant set of 32 possible triplets
that encode 20 amino acids. This results in a large fraction of
duplicate mutants in the random library. To overcome this problem
the use of the NDT codon (D = GAT) has been suggested instead to
code for 12 different amino acids [21]. This eliminates duplicates
but omits the eight other possible substitutions. A more complex
approach based on the 3DM database of multiple alignments built
from structurally equivalent positions within a superfamily has
been suggested [22]. First, the hotspots can be selected among the
catalytic site residues to design enantioselectivity [23] or on the
basis of the crystallographic B-factor to change thermostability [24].
Then, a library of mutants can be prepared by introducing only
those amino acids that appear frequently within the superfamily
alignment at the selected hotspot positions. Such a back-to-
consensus approach reduces the amino acid alphabet used for
mutagenesis and thus is more efficient compared to fully random
approaches. The strategy was successfully applied in several studies
but has some limitations. First, homologous enzymes diverged
from a common ancestor not only by substitutions but also by
deletions and insertions of residues [3]. Therefore, by focusing only
on the structurally conserved positions we are likely to miss
the functionally important residues that do not have an equivalent
in all homologous proteins. Second, it cannot be expected that
the most frequently occurring amino acids at certain positions are
responsible for some particular or enhanced properties. Thus, by
implementing the back-to-consensus approach we are introducing
the most frequently occurring property within the superfamily
without being able to control its value. Finally, the available
sequence and structural databases are likely to be biased and
should not be expected to accurately represent all the different
functional types according to their evolutionary fitness. For example,
the lipase/esterase (EC 3.1.1.-) catalytic function is one of the most
widely represented within the α/β-hydrolase superfamily. However, only a few
hydroxynitrile lyases (EC 4.1.2.-) are known to have the same fold.
Consequently, identification of mutations essential to convert an
esterase into a hydroxynitrile lyase [25] will not be possible by the
back-to-consensus approach.
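The codon arithmetic quoted for the NNK and NDT degenerate codons in the paragraph above can be checked by enumeration. The sketch below expands a degenerate codon using the IUPAC nucleotide symbols and translates each triplet with the standard genetic code. The function names are ours and serve only as an illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# (first, second, and third base each cycling through T, C, A, G).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC symbols needed for the degenerate codons discussed in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "K": "GT", "D": "AGT"}


def expand(degenerate_codon):
    """List all concrete triplets encoded by a degenerate codon such as 'NNK'."""
    return ["".join(c) for c in product(*(IUPAC[s] for s in degenerate_codon))]


def summarize(degenerate_codon):
    """Count triplets, distinct amino acids, and stop codons."""
    encoded = [CODON_TABLE[c] for c in expand(degenerate_codon)]
    return {
        "codons": len(encoded),
        "amino_acids": sorted(set(encoded) - {"*"}),
        "stops": encoded.count("*"),
    }


for name in ("NNK", "NDT"):
    s = summarize(name)
    print(name, s["codons"], "codons,", len(s["amino_acids"]),
          "amino acids,", s["stops"], "stop codon(s)")
```

The enumeration also shows that NNK still contains one stop codon (TAG), whereas NDT encodes each of its 12 amino acids exactly once.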
The third class of smart stochastic approaches focuses on the
selection of functionally beneficial phenotypes during iterative screening.
The ProSAR approach has been suggested to improve catalytic function
by identification of additive mutations [26]. In the first stage
different mutagenesis strategies are combined to generate diversity
libraries, which are screened for activity. Then, statistical analysis
is applied to assess the contribution of each mutation to
protein function and to select the potentially beneficial substitutions.
At the end of the first round the best variants are selected
for the next round in order to identify positive cooperative
effects. The process terminates when the achieved results
meet the predefined design criteria. Another approach suggested
accumulating only those random mutations that maintain the
original function, resulting in stable and properly folded variants
[27]. This process mimics the genetic drift that naturally takes place
in populations due to random sampling. It reduces the library size
and should result in variants that are functional and evolvable and
may have improved properties.
The focused stochastic approaches are much faster than they were
decades ago owing to the use of statistical analysis and
computational tools to assist protein engineering. However,
they remain resource-demanding and still require large mutant
libraries in order to identify functionally positive phenotypes. To
sum up, the evolutionary stochastic methods are hampered by
a high frequency of deleterious mutations and low frequency of
functionally beneficial phenotypes.
Finally, the empiric rational design of enzymes has become more
popular in recent years due to implementation of computational
tools. Yet the weak side of both the focused stochastic approaches
and the empiric rational design of enzymes is the starting point.
The choice of hotspots for mutations is very subjective and
begins from the visual expert inspection of sequence or structural
information. Results of this expert analysis can then be used, for
example, to design the specific interaction profile in the active
site to accommodate a selected substrate or introduce stabilizing
interactions in the protein fold [28, 29]. Another strategy is to
compare a few evolutionarily related proteins with different properties
and determine crucial amino acid residues that can be switched
to transfer the functional property from one enzyme into its
homolog [25, 30, 31]. Molecular modeling techniques can then be
applied to study the impact of suggested substitutions [12]. These
empiric rationalized strategies can indeed reduce the experimental
evaluation to a smaller number of mutants; however, their efficiency
greatly depends on the expertise of a particular researcher.
In general, the whole process is hardly reproducible. A more
systematic and reproducible approach is needed to study the structure–
function relationship in enzymes and to design their functional
properties.
Accommodation of diverse functions within a common structural
scaffold is a complex process that can include global structural
variations in domain organization and subunit assembly. However,
it seems that incremental mutations of the key amino acid residues
were a major mechanism to achieve functional diversity of enzyme
superfamilies [3, 32]. Continuous improvement of computational
resources and expansion of protein databases now offer a key to
break this nature code with bioinformatic analysis of the adaptive
mutations that survived evolution by providing enhanced or novel
functional phenotypes.
The relatively new discipline of bioinformatics has great implications
for the study of protein structure and function [33]. The potential of
bioinformatic methods was amplified by the emergence of new
and low-cost high-throughput sequencing technologies. Therefore,
multiple alignments of homologous proteins represent both affordable
and powerful tools to study functionally important residues
by tracing their evolutionary history within a superfamily. It is
widely accepted that the importance of a residue for maintaining the
structure and function can be inferred from how conserved it appears
in a multiple sequence alignment (MSA) [34]. Scoring residue
conservation, however, is not a trivial task: it requires advanced
schemes that capture subtle sequence patterns and quantify evolutionary
restraints accurately by considering physicochemical properties,
gap inclusion, sequence weighting to exclude redundancy, and
other criteria [35]. Several tools have been developed to detect
functional groups of amino acids using different conservation
scoring techniques combined with structural analysis [36, 37].
Highly conserved positions seem to appear during evolution as a
result of the selective pressure and therefore are very useful to
indicate properties common for the entire family. As a matter of
fact, residues directly involved in the catalytic machinery are usually
found as highly conserved [38]. However, conservation does not
explain functional diversity. More importantly, replacement of amino
acid residues at conserved positions usually diminishes catalytic
function. Thus, information about conserved positions can help to
understand how an enzyme performs an existing function but hardly
can help changing it.
Consequently, it is important to identify positions that demonstrate
a specific pattern of conservation and variability inside
enzyme families and superfamilies. One option would be to identify
positions that are conserved only within subfamilies but are
different between subfamilies—the subfamily-specific positions, or
SSPs—and that seem to play an important role in functional
discrimination [39–42]. An SSP does not necessarily have to be
totally conserved within subfamilies; instead it may contain residues
with similar physicochemical properties. A subfamily can also
contain a gap to outline functionally important deletion or insertion
of residues during evolution from a common ancestor (Fig. 10.1).
Another option would be to identify the co-evolving positions, or
CEPs, that compensate a mutation in one region by a mutation in
another region, for example, to maintain energetically favorable
interactions. CEPs can be used to highlight structurally important
residues that should be taken into account in structure prediction and
docking studies [43, 44]. Several techniques have been developed
in the field that rely on different assumptions, require different
input, and implement different statistical models to identify these
types of variable residues. In the next section we will discuss how
to select variable positions and study their implications for protein
function.

Figure 10.1 Patterns of variable positions in protein superfamilies: a high-entropy position, a pair of co-evolving positions (CEPs), subfamily-specific positions (SSPs), and a conserved position.

10.2 Bioinformatic Analysis of Evolutionary Information to Identify Function-Related Variable Positions

10.2.1 Problem Definition


Structural genomics initiatives and the high-throughput sequencing
of genomes provide a huge amount of raw information for public
use. More value is added by numerous experimental studies of
enzyme properties that are carried out in both academia and biotech
companies all over the world. But how should we combine and
systematically evaluate the accumulated data in order to generate
new knowledge and enrich our understanding of the structure–
function relationship in enzymes? Homologous proteins within a
superfamily can be compared by means of sequence and structural
alignment. This reduces the multidimensional sequence-structure-
function space to a 2D or 3D superimposition of amino acids.
Residues with a common structural localization in different enzymes
are expected to make up a column of this multiple alignment,
suggesting their involvement in the same functional and mechanistic
processes. The goal of the bioinformatic analysis is to select a subset
of columns with relevant features. In the context of this chapter
we are interested in positions that have a variable amino acid
composition organized according to a particular pattern that implies
their functional role within the superfamily. This problem is an
instance of a more general class of algorithms known as Feature
Selection that is widely used in computer science [45]. Feature
Selection is a dimensionality reduction technique to find a subset of
features that are the most relevant for a particular model building by
eliminating redundant or irrelevant features that provide no useful
information.
A subclass within Feature Selection—the so-called filter-based
methods—is generally used to search for functional residues in
proteins. In order to discriminate between functionally important
and nonimportant residues a scoring function is defined and
relevance scores are calculated for every feature—a column of a
multiple alignment. The low-scoring features are then removed and
the biological relevance of the top n positions can be evaluated by the
user, for example, by point mutagenesis experiments. Filter-based
algorithms are computationally fast and easily scalable for large
biological datasets and can implement various scoring schemes.
Consequently, numerous tools have been developed utilizing this
approach to select function-related variable positions. They are
based on different scoring schemes, require different types of input
data (e.g., may or may not use structural information), and may
implement statistical analysis to improve the prediction accuracy.
Next we discuss the most prominent algorithms and biological
problems to which they can be applied.
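The filter-based selection described above can be sketched in a few lines: every column is scored independently, the columns are ranked, and the top n are kept. The scoring function used here (the number of distinct residue types) is only a stand-in chosen for illustration, and the names `variability_score` and `select_top_columns` are hypothetical.

```python
def variability_score(column, gap="-"):
    """Toy relevance score: number of distinct residue types in the column."""
    return len(set(column) - {gap})


def select_top_columns(alignment, score, n):
    """Score every column and return the top-n (column index, score) pairs."""
    scored = [(idx, score(col)) for idx, col in enumerate(zip(*alignment))]
    # Sort by score, highest first; Python's sort is stable, so columns with
    # equal scores keep their alignment order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Toy alignment (illustrative data only).
toy_alignment = ["MKVLA", "MKILA", "MRVLG", "MKVLA"]
print(select_top_columns(toy_alignment, variability_score, 2))  # [(1, 2), (2, 2)]
```

Because the scoring function is passed in as an argument, any of the schemes discussed in the next section could be plugged into the same ranking step.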

10.2.2 Scoring Schemes in the Variable Position Selection: High-Entropy, Subfamily-Specific, and Co-Evolving Positions
Conserved residues that can define properties common for the
entire superfamily are usually directly involved in the catalytic
machinery, and functional property is diminished when they are
changed or modified. Therefore, information about evolutionary
conservation of particular residues in proteins can be used as an
indicator of their mutability. This strategy has been implemented
in the HotSpot Wizard that targets highly variable positions located
in functionally important regions, for example, in enzyme
active sites [46]. High entropy of the selected positions implies a
better chance that the amino acid replacement will not compromise
a general function and may explain functional diversity. On the
other hand, such variability can be a result of a low or unclear impact
of the selected amino acid residue on protein structure and function.
To overcome this uncertainty more sophisticated and structured
variability patterns associated with functional subfamilies should be
implemented and studied.
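As an illustration of what a high-entropy position means in practice, the sketch below computes the Shannon entropy of a single alignment column. This is a generic textbook measure used here as an assumption; it is not claimed to be the exact variability score implemented in HotSpot Wizard, and the function name `column_entropy` is hypothetical.

```python
import math
from collections import Counter


def column_entropy(column, gap="-"):
    """Shannon entropy (in bits) of the residue distribution in one column."""
    residues = [res for res in column if res != gap]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


print(column_entropy("AAAA") == 0.0)  # a fully conserved column has zero entropy
print(column_entropy("AACC"))         # two equally frequent residues: 1.0
```

A column with 20 equally frequent amino acids would reach the maximum of log2(20), about 4.32 bits, but a high value alone does not say whether the variability is functionally meaningful.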
The particular distribution of amino acid residues in a column
of a multiple alignment can possess an important value. The
most well-known sequence pattern implies amino acid residues
conserved within subfamilies but different between them and
is used to identify the SSPs that seem to be responsible for
functional discrimination [39–42]. The pioneer work implementing
this idea used predefined classification of functional subfamilies
to identify SSPs [47]. At the first step proteins were grouped
according to sequence identity, functional similarity, taxonomic
origin, and other criteria. Given such classification the hierarchical
conservation analysis was applied by using a simplified scheme of
physicochemical similarity between amino acid residues within one
column. The most informative positions based on simple threshold
criteria were then selected. A similar idea was further explored
in the evolutionary trace method [48]. Protein sequences were
grouped on the basis of a phylogenetic tree constructed from
an MSA. By implementing different cutoffs to split the tree into
subfamilies a list of SSPs was obtained that are totally conserved
within every subfamily and different between them. The identified
positions were then mapped to a 3D protein structure to select
residues that form surface clusters. These were finally used to
describe different substrate specificities among subfamilies. These
methods made the first important step toward the development
of more sophisticated approaches and were also applied to study
several protein families. At the same time they lack the unified
scoring function and thus are sensitive to minor substitutions in a
column. No implementation of statistical analysis to filter out the
background noise was reported. Moreover, the evolutionary trace
will miss important buried specific positions that are not grouped
into surface clusters.
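The strict pattern used by these early methods (total conservation within every subfamily, with differences between subfamilies) can be sketched as follows. The subfamilies are assumed to be given in advance as a dictionary of aligned sequences; the tree cutting and the mapping onto a 3D structure are not reproduced, and `trace_positions` is a hypothetical name.

```python
def trace_positions(subfamilies):
    """Return column indices that are invariant within every subfamily
    but are not identical across all subfamilies."""
    first_group = next(iter(subfamilies.values()))
    length = len(first_group[0])
    hits = []
    for idx in range(length):
        residues_per_group = [{seq[idx] for seq in seqs} for seqs in subfamilies.values()]
        conserved_within = all(len(residues) == 1 for residues in residues_per_group)
        distinct_overall = set().union(*residues_per_group)
        if conserved_within and len(distinct_overall) > 1:
            hits.append(idx)
    return hits


# Toy subfamilies (illustrative data only).
toy_subfamilies = {
    "A": ["MKSLA", "MKSIA"],
    "B": ["MRTLA", "MRTVA"],
}
print(trace_positions(toy_subfamilies))  # [1, 2]
```

The sketch also shows the sensitivity noted above: a single deviating residue in one subfamily removes a column from the list, because there is no graded score.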
These drawbacks have been addressed by implementation of
the relative entropy (RE) measure, also known as Kullback–Leibler
distance [49] that estimates a distance between two probability
distributions P(x) and Q(x):

\[
\mathrm{RE} = \sum_{x \in X} P(x) \log \frac{P(x)}{Q(x)}. \tag{10.1}
\]
RE is nonnegative and is equal to zero if and only if the two
distributions are identical. Consequently, it has been suggested to
estimate specificity of a column by calculating RE between the amino
acid frequencies corresponding to subfamilies and to the whole
column [50]. Alternatively, the mutual information (MI) has been
proposed as an equivalent measure [51]:

\[
\mathrm{MI} = \sum_{i=1}^{N} \sum_{\alpha=1}^{20} f(\alpha, i) \log \frac{f(\alpha, i)}{f(\alpha)\, f(i)} \tag{10.2}
\]

where α = 1, ..., 20 represents the amino acid types, i = 1, ..., N
denotes the functional groups, f(α, i) is the frequency of a residue
α within a column in a group i, f(α) is the frequency of a residue
α in the whole alignment column, and f(i) is the fraction of
proteins belonging to a group i. The MI measures the RE between
the joint distribution and the product distribution of two discrete
random variables α and i , which reflects the statistical association
between them. In other words the MI quantifies the amount of
information that one random variable contains about the other. It is
nonnegative and is equal to zero if and only if α and i are statistically
independent. Higher values of MI reflect strong association between
the amino acid types (α) and subfamilies (i ) and thus indicate a
higher specificity. These classical entropy-based methods quantify
whether the amino acid content of one subfamily can be used to
predict the amino acid content of the other subfamilies. In other
words, they treat different amino acids as different symbols and do
not account for physicochemical relationship between them.
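A minimal sketch of Eqs. 10.1 and 10.2 for a single alignment column is given below. It assumes that f(α, i) is the joint frequency (the fraction of all sequences that belong to group i and carry residue α), uses the natural logarithm, and treats a gap as just another symbol; the function names are hypothetical.

```python
import math
from collections import Counter


def frequencies(residues):
    """Relative frequency of every symbol in a sequence of residues."""
    total = len(residues)
    return {res: n / total for res, n in Counter(residues).items()}


def relative_entropy(p, q):
    """Kullback-Leibler distance (Eq. 10.1) between distributions p and q."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)


def subfamily_re(column, labels, group):
    """RE between the residue frequencies of one subfamily and the whole column."""
    p = frequencies([res for res, lab in zip(column, labels) if lab == group])
    q = frequencies(column)
    return relative_entropy(p, q)


def mutual_information(column, labels):
    """MI (Eq. 10.2) between residue types and subfamily labels in one column."""
    n = len(column)
    joint = Counter(zip(column, labels))
    f_res = Counter(column)
    f_grp = Counter(labels)
    return sum(
        (c / n) * math.log((c / n) / ((f_res[a] / n) * (f_grp[g] / n)))
        for (a, g), c in joint.items()
    )


# Toy column with four sequences in two subfamilies (illustrative data only).
print(mutual_information("KKRR", "AABB"))  # perfectly specific column: log(2)
print(mutual_information("KRKR", "AABB"))  # residues independent of subfamily: 0.0
```

Replacing K and R by any other pair of symbols gives the same values, which illustrates that these measures ignore the physicochemical relationship between residues.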
Further development of novel SSP prediction strategies was
largely based on simple but powerful entropy functions and
introduced new improvements [52, 53]. For example, to improve
the RE performance the SPEER program applies a similarity-based
weighted Euclidean distances to quantify the difference between
any two sequences in a given column and a phylogeny-based
estimation of the site heterogeneity [54]. Another strategy has been
implemented in Sequence Harmony to calculate a sum of entropy-
based measures aiming to score compositional differences between
subfamilies without imposing conservation within subfamilies [55].
However, some modern methods exist that are based on
completely different algorithmic assumptions. For example, the
GroupSim program does not implement any measures derived from
the information theory but instead suggests a new scoring scheme
based on the sum of pairs calculated, by default, with the identity
matrix [56]. Another novelty takes into account the conserved
neighbors by sequence to estimate the position specificity. Similarly,
3D structure information has been exploited by a machine-learning
approach multi-RELIEF to increase the weight of specific residues
that have high-weight-specific neighbors [57].
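The sum-of-pairs idea can be illustrated with the simplified sketch below, which scores a column as the mean within-subfamily similarity minus the mean between-subfamily similarity, using the identity matrix. This is an assumption made for illustration in the spirit of the approach; it is not the GroupSim implementation and omits the conserved-neighbor step. All function names are hypothetical.

```python
from itertools import combinations, product


def identity(a, b):
    """Identity 'matrix': 1 for identical residues, 0 otherwise."""
    return 1.0 if a == b else 0.0


def mean_pair_similarity(pairs, sim=identity):
    """Average similarity over an iterable of residue pairs (0.0 if empty)."""
    pairs = list(pairs)
    return sum(sim(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0


def sum_of_pairs_specificity(column, labels, sim=identity):
    """Mean within-group similarity minus mean between-group similarity."""
    groups = {}
    for res, lab in zip(column, labels):
        groups.setdefault(lab, []).append(res)
    within = [mean_pair_similarity(combinations(members, 2), sim)
              for members in groups.values()]
    between = [mean_pair_similarity(product(g1, g2), sim)
               for g1, g2 in combinations(groups.values(), 2)]
    between_mean = sum(between) / len(between) if between else 0.0
    return sum(within) / len(within) - between_mean


print(sum_of_pairs_specificity("KKRR", "AABB"))  # 1.0, fully subfamily-specific
print(sum_of_pairs_specificity("KKKK", "AABB"))  # 0.0, conserved across the family
```

Passing a substitution-matrix lookup as `sim` instead of `identity` would let the same code account for physicochemical similarity between residues.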
Some other methods implement complex classification schemes
like the principal component analysis (PCA) [39, 58]. It treats
sequences in a multiple alignment as vector points in a multidimensional
space, with residue positions and residue types as the
basic dimensions. Most populated principal axes are then selected
along with the corresponding characteristic residues of the different
subfamilies. However, PCA does not account for similarities between
amino acid types and might miss some functionally relevant splits
in the data if used in the unsupervised manner. Another study
was intended to identify the minimal groups of positions that
would discriminate protein alignment into functional subtypes
based on the scoring function provided by the support vector
machine classifier [59]. This approach uses structure information
to focus on the surface patches only and thus will miss functionally
important internal residues. In general, these approaches are
resource demanding and implement heuristics to perform the
search. Moreover, they may require interactive manual inspection
of the PCA spaces by an expert in order to accurately identify
subfamilies and the corresponding SSPs.
Each of the outlined methods has its own strengths and
weaknesses. First, the majority of methods require a user-defined
classification of sequences into functional subfamilies. Although
functional annotation can be obtained from experimental studies,
such information is not yet available for many protein families,
thus leaving the user without a clear guideline on how to apply
the method. Second, it seems that the majority of approaches
intentionally avoided the use of structural information in their algorithms
due to insufficient data in protein databases. The rare
methods that implemented structural information used it mostly
to limit the search to surface patches only and thus missed out
the important buried residues. However, nowadays the amount
of protein structural data is steadily growing and can be used
to provide useful information about the relationship between
residues in the 3D space. Third, the amino acid similarity was
usually quantified on the basis of sequence (alphabet) conservation
rather than physicochemical similarity. Finally, many methods do
not implement any statistical analysis of the results, making the
selection of the most significant positions for further evaluation very
subjective.
Another class of algorithms that implements similar calculation
schemes is intended for identification of the co-evolving residue
pairs or CEPs. This pattern corresponds to variable positions at
two different columns of a multiple alignment that can be located
in spatial proximity in protein structures and co-evolve to maintain
functionally and structurally favorable interactions. In other words,
changes at one site are treated as compensatory to mutations
at another site. Consequently, identification of the co-evolution in
proteins can provide information about residues that are close
in the 3D space and can therefore be used in molecular modeling
and structure prediction [60, 61]. At the same time, co-evolution
does not always correspond to amino acids that are in close structural
proximity and may highlight more complex biological constraints as
a result of evolutionary adaptation.
Initially the residue co-evolution was assessed by detecting
correlation between amino acid frequencies at two alignment
positions i and j:

\[
r_{ij} = \frac{1}{N^2} \sum_{k,l} \frac{(s_{ikl} - \bar{s}_i)(s_{jkl} - \bar{s}_j)}{\sigma_i \sigma_j} \tag{10.3}
\]

where $s_{ikl}$ is the similarity between the amino acids of proteins k
and l at a position i, $\sigma_i$ is the standard deviation of $s_{ikl}$ around
the mean $\bar{s}_i$, and the indices k and l iterate from 1 to the
number of sequences N in the alignment. This approach, also known
as the McLachlan-based substitution correlation [62], estimates
the similarity between amino acid substitution matrices by linear
correlation. It has been extensively implemented in various studies
and still serves as a baseline to mark the performance of the state-
of-the-art algorithms [63, 64]. However, the traditional correlation
approaches have a number of shortcomings. In particular, they
do not take into account the biochemical nature of the observed
compensations and cannot distinguish between directly and indirectly
correlated residues. Since CEPs are widely used for structure
prediction by imposing additional geometric restraints to fold up
the target protein, identification of the closely located co-evolving
residue pairs has become the major priority in the development
of new methods. Certain progress has been achieved by applying
corrections for the background phylogenetic co-evolution and
using weighted comparison of divergence between amino acid
sites [65] or combining covariance analysis with global inference
analysis adopted from statistical physics [66]. Entropy-based
measures were also applied to search for co-evolving residue pairs
but have not been as successful as in SSP prediction
[63]. Further improvements to the contact prediction accuracy
have been achieved by offsetting the evolutionary background
[43, 67].
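Equation 10.3 can be sketched as a Pearson correlation between the pairwise similarity values of two columns. In the sketch below the identity function stands in for the McLachlan substitution matrix (an assumption made to keep the example short), all ordered pairs (k, l) are included, and a correlation of 0 is returned for an invariant column, where the ratio is undefined. The function names are hypothetical.

```python
import math


def identity(a, b):
    """Stand-in similarity: 1 for identical residues, 0 otherwise."""
    return 1.0 if a == b else 0.0


def pair_similarities(column, sim=identity):
    """s_ikl for all ordered pairs (k, l) of sequences at one position."""
    return [sim(a, b) for a in column for b in column]


def substitution_correlation(col_i, col_j, sim=identity):
    """Correlation r_ij between two alignment columns, following Eq. 10.3."""
    s_i = pair_similarities(col_i, sim)
    s_j = pair_similarities(col_j, sim)
    n_pairs = len(s_i)  # equals N^2
    mean_i = sum(s_i) / n_pairs
    mean_j = sum(s_j) / n_pairs
    sd_i = math.sqrt(sum((s - mean_i) ** 2 for s in s_i) / n_pairs)
    sd_j = math.sqrt(sum((s - mean_j) ** 2 for s in s_j) / n_pairs)
    if sd_i == 0 or sd_j == 0:
        return 0.0  # invariant column: correlation is undefined, reported as 0
    cov = sum((a - mean_i) * (b - mean_j) for a, b in zip(s_i, s_j)) / n_pairs
    return cov / (sd_i * sd_j)


# Columns that change together across sequences are strongly correlated.
print(round(substitution_correlation("AAGG", "LLVV"), 6))  # 1.0
# Columns that change independently are uncorrelated.
print(round(substitution_correlation("AAGG", "LVLV"), 6))  # 0.0
```

The first example also shows why such a score cannot separate direct from indirect correlation: any third column with the same substitution pattern would correlate equally well with both.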

10.2.3 Association of the Variable Positions with Functional Subfamilies
Identification of the CEPs is not associated with functional subfamilies
and therefore does not require a corresponding classification.
On the contrary, the successful identification of SSPs relies on
accurate definition of functional subfamilies within a superfamily.
Most programs in the field can accept the user-defined functional
classification that can be based on experimental studies, for
example, can discriminate enzymes by their substrate specificities
or other functional properties. However, such a classification is
not yet available for many protein families due to limited experimental
data. Some other programs predict functional classifications by
chopping a phylogenetic tree of the corresponding proteins into
clusters [68]. This can result in a redundant set of classifications to
be analyzed and make the groups dependent on the clade topology
that is specific to a particular tree-building algorithm. Alternatively,
it has been proposed that functional properties are conserved
among orthologous proteins that diverged after speciation and are
different in paralogous proteins [40, 51]. Thus, a protein family
can be divided into ortholog groups as functional subfamilies.
However, it is known that functional conservation among orthologs
is not necessarily true and it is clearly possible for orthologs to
diverge functionally [69]. In addition, a new method to predict functional
subfamilies in a phylogeny-independent stochastic manner has been
reported [70]. However, it tends to predict an unreasonably large
number of subfamilies [41]. Finally, and most importantly,
current algorithms provide only a single functional classification of
a superfamily into subfamilies, while it is well known that more than
one level of differentiation can exist due to the complexity of functions
in protein superfamilies.

10.2.4 How to Select Functionally Important Positions as Hotspots for Further Evaluation: Implementation of Statistical Analysis
Evaluation of statistical significance is important when selecting
both SSPs and CEPs. In the case of SSPs the
functional subfamilies would generally include recently diverged
proteins that are expected to be closer in sequence than distant
relatives. Thus, similarity within subfamilies would be naturally
higher than between subfamilies. Therefore, a statistical test has
to be developed that will select the SSPs that are more strongly associated
with the functional classification than with the background
similarities of complete protein sequences. In the case of CEPs
the statistical correction is important to filter out the phylogeny-
based co-evolution noise and to discriminate between directly and
indirectly correlated residues. Speaking in general, the statistical
analysis is needed to separate the functional signal associated with a
particular variability pattern from the background noise associated
with evolutionary relationship and other constraints.
The pioneer algorithms of SSP prediction did not implement any
statistical evaluation leaving the selection of important positions
at the discretion of the user. For example, the evolutionary trace
method suggested the use of a 3D protein structure to select
residues that form clusters on the protein surface as functionally
important SSPs [48]. Since then the statistical analysis has become
a necessary part of almost every bioinformatic algorithm. However,
even some modern programs (e.g., Sequence Harmony and GroupSim)
lack this important routine.
The first step toward statistical assessment of the SSPs was made
by converting the RE values obtained for each position into Z -scores
on the basis of the average distribution of entropies in all alignment
columns:

\[
Z_i = \frac{\mathrm{RE}_i - \mu}{\sigma} \tag{10.4}
\]
where REi is the RE at a column i and μ and σ are the mean
and the standard deviation, respectively. Positions with Z -score
exceeding 3.0 were suggested to be functionally important [50].
In an independent study the P -values of individual columns were
similarly used to assess significance; however, a different statistical
test to calculate Z -scores was implemented [40]. In this case the
random shuffling was applied to every column of the MSA to
rearrange amino acids while retaining the size of columns and
subfamilies. Therefore, background specificities were estimated
for every column independently. A similar phylogeny-based random
evolution model has been implemented in the SPEL method [71].
These methods suffer from a similar drawback: the
significance threshold to select the best hits for further evaluation
is subjective and left at the discretion of the user. Z-scores are
assumed to follow the standard normal distribution and indicate how far
a random variable lies from the mean in units of standard
deviation. The corresponding distribution function can be used to
assign the so-called P-values (probabilities of observing by chance
a result with at least as large a Z-score) to every column of a multiple
alignment. A Z-score of 3.0 roughly corresponds to a one-sided P-value of
0.00135 (the probability of obtaining by chance a result at least as good as the
observed one is 0.135%). Is this significant enough to
select a functionally important SSP?
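The conversion from column scores to Z-scores (Eq. 10.4) and then to one-sided P-values can be checked with a short sketch. It assumes the population standard deviation over all columns and a standard normal background; both are assumptions, since the text does not fix these details, and the function names are hypothetical.

```python
import math


def z_scores(values):
    """Z-score of every value relative to the mean and standard deviation
    of the whole list (assumes the values are not all identical)."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    return [(v - mu) / sigma for v in values]


def upper_tail_p(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


print(round(upper_tail_p(3.0), 5))  # 0.00135
```

For an alignment with several hundred columns, a per-column probability of 0.00135 still leaves a noticeable chance that at least one column passes the threshold by chance, which is one way to read the question raised above.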
To overcome this uncertainty a new B-cutoff method was
proposed that calculated P -values not for individual columns but
for sets of positions [51, 72]. The obtained Z -scores of column
specificities are sorted in decreasing order, and then a cutoff rank
k is computed so that the first k scores represent a set of hits the
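least probable to be observed by chance:

\[
k^{*} = \arg\min_{k} \sum_{i=0}^{n-k} C_{n}^{\,n-i}\, p^{\,n-i}\, q^{\,i} \tag{10.5}
\]

where n is the total number of computed Z-scores and

\[
p = P(Z \ge Z_k) = \frac{1}{\sqrt{2\pi}} \int_{Z_k}^{\infty} \exp\!\left(-\frac{Z^2}{2}\right) dZ, \qquad q = 1 - p. \tag{10.6}
\]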
The obtained P -value minima were suggested as thresholds to select
sets of highly significant SSPs.
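A minimal sketch of the cutoff selection in Eqs. 10.5 and 10.6 is given below. For every rank k it computes the probability that at least k out of n independent standard normal scores reach the k-th largest observed Z-score, and returns the k that minimizes this probability. The independence of columns and the function names are assumptions made for illustration.

```python
import math


def upper_tail_p(z):
    """P(Z >= z) for a standard normal variable (Eq. 10.6)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def b_cutoff(z_values):
    """Return (k, probability) for the set of top-ranked scores that is
    least likely to be observed by chance (Eq. 10.5)."""
    ranked = sorted(z_values, reverse=True)
    n = len(ranked)
    best_k, best_prob = None, None
    for k in range(1, n + 1):
        p = upper_tail_p(ranked[k - 1])
        q = 1.0 - p
        # Probability that at least k of the n scores are >= Z_k by chance.
        prob = sum(math.comb(n, n - i) * p ** (n - i) * q ** i
                   for i in range(n - k + 1))
        if best_prob is None or prob < best_prob:
            best_k, best_prob = k, prob
    return best_k, best_prob


# Three clear outliers among ten toy Z-scores (illustrative data only).
toy_z = [6.0, 5.5, 5.0, 0.3, 0.1, -0.2, -0.5, -1.0, -1.2, -1.5]
k, prob = b_cutoff(toy_z)
print(k)  # 3
```

In this toy case the minimum falls exactly at the gap between the three high scores and the rest, so the threshold is derived from the data rather than chosen by the user.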
Considering the CEPs, co-evolution between amino acid residues
can be due to their structural/functional interaction, the stochastic
covariation, and phylogenetic convergence [61]. Thus, it is important
to discriminate the functional/structural signal from the others.
The stochastic covariation is addressed by using Z -score analysis
with mean and variance measured over randomly sampled pairs
of sites. Then, to filter out the phylogenetic background the tree-
specific clades are removed and CEPs that are no longer detected are
dismissed. The co-evolving sites left after filtration are suggested as
structural/functional co-evolving sites [65].

As was previously mentioned, the correlated amino acid positions
in proteins can provide information about proximity of
residues in space. It was shown that basic correlation measures
alone are not sufficient to construct residue contact maps [43].
The problem arises from the transitivity of pair correlations when
more than two positions show coordinated substitution patterns,
resulting in the so-called indirect couplings. To overcome these
limitations, sophisticated probabilistic models have been developed
[73–75]. In general, these methods search for a set of directly
coupled positions, which seeds a larger set of indirectly coupled
residues. For example, the direct coupling analysis method [75] uses
a maximum entropy model of the protein sequence, constrained by
compositional statistics of the columns in the multiple alignment.
With the growing amount of genomic information these methods
seem to be very attractive and promising for in silico folding
experiments.

10.3 The Bioinformatic Analysis of Diverse Protein Superfamilies

10.3.1 Bioinformatic Challenges at Studying Enzymes


We have so far discussed the development of different methods to
select the functionally important variable amino acid residues in
protein superfamilies. We have focused on the subfamily-specific
and co-evolving residues that can be characterized by specialized
content patterns that are associated with particular functional roles.
We have shown that the main trend in the CEPs prediction methods
is moving toward the discrimination of directly and indirectly
correlated residues to predict molecular contacts between different
protein residues and protein–protein interactions. Consequently, the
identified co-evolving amino acid pairs are likely to play important
structural roles and can be implemented in molecular modeling and
structure prediction. However, their use in protein engineering, in
particular as hotspots to modify functional properties, seems to
be rather limited for several reasons. First, the CEPs are identified
as pairs or larger networks. Therefore, the standalone functionally
important residues can be missed out. However, it is known that
even a single substitution can enhance existing or introduce new
functions into enzymes [19]. Second, modification of interacting
residues is more likely to decrease/destroy a function rather than to
enhance it. Of course, analysis of the CEPs can identify functionally
related residues that have different structural localization but
nevertheless reflect true evolutionary constraints. Yet the problem
is that CEPs are not associated with functional subfamilies. This
makes it hard to systematically study the role of CEPs in protein
function.
In contrast, the SSP prediction methods fully benefit from
sequence-structure-function data about protein superfamilies. The
results can be presented as both individual positions and specific
patches that can regulate different functional properties. The amino
acid content of SSPs is strongly associated with the functional
subfamilies; thus swapping amino acid types in SSPs seems to
be a promising strategy for rational design of protein functional
properties. Consequently, a corresponding method to perform the
bioinformatic analysis of SSPs from sequence and structural data has
to be developed and implemented in common laboratory practice
to study the structure–function relationship in proteins and develop
novel protein engineering strategies. This promising trend in protein
bioinformatics is considered in more detail below.

10.3.2 Zebra: A New Algorithm to Select Functionally Important
Subfamily-Specific Positions from Sequence and Structural Data

Various bioinformatic approaches have been developed to date on
the basis of different algorithmic strategies. Each program has its
own benefits and shortcomings, which have been discussed previously.
These drawbacks have been addressed, and new features added,
in the state-of-the-art algorithm Zebra for the bioinformatic analysis
of diverse protein superfamilies. Zebra takes into account sequence
and structural information and the physicochemical properties of amino
acid side chains, performs statistical ranking of selected SSPs, and
can operate in a flexible subfamily classification mode [41, 42]. Zebra
takes as input an MSA and, optionally, a 3D structure corresponding
to one of the protein family members. The algorithm has three basic
steps. First, the functional subfamilies are predicted automatically or
provided by the user. Second, the proposed subfamily classifications
are used to calculate the SSPs using sequence and structural
information. Finally, statistical analysis is implemented to select
the most significant SSPs. These steps are described in more detail
below.
Zebra automatically provides a functional classification of
protein superfamilies on the basis of both sequence and structural
information. Similarity-based graph clustering at different pairwise
identity thresholds is used to predict independent subfamily
classifications. Similar and identical classifications are eliminated,
and a nonredundant set of subfamily definitions is further used
to identify the SSPs (see below). Subfamily classifications are
finally ranked by the significance of the SSPs they produce (the lowest
global P-value). Consequently, Zebra is the first application that
provides specificity determinants at different levels of functional
classification, therefore addressing complex functional diversity of
large superfamilies (Fig. 10.2).
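
The classification step can be illustrated with a minimal Python sketch. It assumes that the similarity graph links two aligned sequences whenever their pairwise identity reaches a threshold, and that subfamilies are the connected components of that graph. The function names, the gap handling, and the rule that discards trivial classifications are illustrative assumptions, not the Zebra implementation. Only identical classifications are merged here, whereas Zebra also eliminates similar ones.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Fraction of identical residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def cluster_at_threshold(msa, threshold):
    """Connected components of the graph linking sequences with identity >= threshold."""
    n = len(msa)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if pairwise_identity(msa[i], msa[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Canonical form, so that identical classifications compare equal.
    return tuple(sorted(tuple(g) for g in groups.values()))


def nonredundant_classifications(msa, thresholds):
    """One classification per threshold; trivial and duplicate ones are dropped."""
    kept = []
    for t in thresholds:
        classification = cluster_at_threshold(msa, t)
        # Assumption: a single cluster or all singletons carries no subfamily signal.
        if len(classification) < 2 or len(classification) == len(msa):
            continue
        if classification not in kept:
            kept.append(classification)
    return kept


toy_msa = ["AAAAAA", "AAAAAC", "CCCCCC", "CCCCCA"]
print(nonredundant_classifications(toy_msa, [0.5, 0.8, 0.9]))
```

In this toy alignment the thresholds 0.5 and 0.8 give the same two subfamilies, and 0.9 splits every sequence into its own cluster, so a single nonredundant classification remains.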
A new scoring function is implemented in Zebra that takes
into account structural information as well as physicochemical and
residue conservation in functional subfamilies. RE is intended to
measure the association of amino acid types with subfamilies, while
the sum-of-pairs term accounts for physicochemical similarity:

\[
S_i = \frac{\sum_{G}\sum_{AB} M(AB) \times q_i(AB, G) \times \sum_{G}\sum_{A} q_i(A, G) \times \log\bigl(q_i(A, G)/q_i(A)\bigr)}{n_G \times \sum_{G} \log(N/N_G)}
\tag{10.7}
\]

where A and B are the amino acid types; q_i(AB, G) is the frequency
of pair AB in subfamily G of column i; q_i(A) and q_i(A, G) are the
frequencies of residue A in column i and in subfamily G
of this column, respectively; N is the total number of sequences;
N_G is the number of sequences in a particular subfamily; n_G is the
total number of subfamilies; and M(AB) is a normalized measure
of physicochemical similarity between A and B calculated from the
BLOSUM matrix series. S_i takes values from 0 to 1, with larger
values indicating higher specificity. Then, random permutations are
performed independently for every column by shuffling its amino acid
content, while retaining subfamilies in their original proportions,
to calculate Z-scores of statistical significance.

Figure 10.2 Subfamily-specific positions at different levels of functional
classification predicted by the Zebra web server in the glutathione S-transferase
superfamily. Greek letters represent different functional classes within the
superfamily. Position numbering as in the μ-class enzyme from rat (PDB:
2GST).

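
To make Eq. 10.7 and the permutation test concrete, the following Python sketch scores a single alignment column. It reads Eq. 10.7 as the product of the sum-of-pairs term and the RE term divided by the normalizing factor, which is an interpretation of the formula. The identity-based similarity function is a placeholder for the BLOSUM-derived M(AB). Treating a single-sequence subfamily as fully conserved, counting every character (gaps included) as a residue type, and the number of permutations are further assumptions made for illustration.

```python
import math
import random
from collections import Counter
from itertools import combinations


def identity_similarity(a, b):
    """Placeholder for the normalized BLOSUM-based similarity M(AB)."""
    return 1.0 if a == b else 0.0


def specificity_score(column, groups, similarity=identity_similarity):
    """S_i for one column; column[k] is the residue and groups[k] the subfamily of sequence k."""
    n_total = len(column)
    overall = Counter(column)
    members = {}
    for residue, group in zip(column, groups):
        members.setdefault(group, []).append(residue)

    sp_sum = re_sum = norm = 0.0
    for residues in members.values():
        n_g = len(residues)
        pairs = list(combinations(residues, 2))
        # Sum-of-pairs term: mean similarity over all sequence pairs of the subfamily.
        sp_sum += sum(similarity(a, b) for a, b in pairs) / len(pairs) if pairs else 1.0
        # RE term: sum over A of q_i(A, G) * log(q_i(A, G) / q_i(A)).
        for residue, count in Counter(residues).items():
            q_ag = count / n_g
            q_a = overall[residue] / n_total
            re_sum += q_ag * math.log(q_ag / q_a)
        norm += math.log(n_total / n_g)

    denominator = len(members) * norm
    return sp_sum * re_sum / denominator if denominator > 0 else 0.0


def permutation_z_score(column, groups, n_perm=200, seed=0, similarity=identity_similarity):
    """Z-score of S_i against shuffles of the column; subfamily labels stay fixed."""
    rng = random.Random(seed)
    observed = specificity_score(column, groups, similarity)
    shuffled = list(column)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(specificity_score(shuffled, groups, similarity))
    mean = sum(null) / n_perm
    sd = math.sqrt(sum((s - mean) ** 2 for s in null) / n_perm)
    return (observed - mean) / sd if sd > 0 else 0.0


groups = [0] * 5 + [1] * 5
print(specificity_score("AAAAACCCCC", groups))  # perfectly subfamily-specific column
print(specificity_score("AAAAAAAAAA", groups))  # fully conserved column
print(permutation_z_score("AAAAACCCCC", groups))
```

Because only the residues are shuffled, each subfamily keeps its original size in every permutation, which corresponds to retaining the subfamilies in their original proportions.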
The vast majority of SSP prediction methods rely on
sequence information only. However, incorporation of
3D data into the algorithm no longer seems to be a limitation
[76]. Therefore, structural information can be used to evaluate the
relationships between residues in the 3D space and considered
for bioinformatic analysis of functionally important positions with
different structural localization. In order to do so, the obtained
specificity Z-scores are corrected using structural information to
favor SSPs that assemble into clusters with other subfamily-specific
and conserved positions:

\[
Z_i = \gamma \times Z_i^{SSP} + (1 - \gamma) \times \frac{\sum_{j} \max\left(Z_j^{CNS}, Z_j^{SSP}\right)}{N}
\tag{10.8}
\]

where Z^SSP is the normalized measure of column specificity,
Z^CNS is the normalized measure of column conservation
[77], j runs over the neighboring residues of position i in the
representative 3D structure, N is the total number of neighbors, and
γ weights a position's own specificity against that of its neighbors.
Thus, the structural information is directly incorporated into Zebra
calculations to identify clusters of closely located conserved and
specific residues. At the same time this approach does not lose
information about highly specific but stand-alone SSPs. It has been
shown that the consideration of only conserved or only specific
positions as structural neighbors significantly reduces accuracy of
predictions.
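
The correction in Eq. 10.8 can be sketched in a few lines of Python. The distance cutoff that defines structural neighbors, the value of γ, and the choice to give a position without neighbors a zero neighborhood term are assumptions made for this illustration; the text does not specify them.

```python
import math


def neighbors_within(coords, cutoff):
    """For each position, the indices of all other positions within `cutoff` angstroms."""
    n = len(coords)
    return [
        [j for j in range(n) if j != i and math.dist(coords[i], coords[j]) <= cutoff]
        for i in range(n)
    ]


def structure_corrected_z(z_ssp, z_cns, neighbors, gamma=0.5):
    """Eq. 10.8: blend a position's own specificity with its structural neighborhood."""
    corrected = []
    for i, nbrs in enumerate(neighbors):
        if nbrs:
            # Each neighbor contributes its specificity or its conservation, whichever is larger.
            neighborhood = sum(max(z_cns[j], z_ssp[j]) for j in nbrs) / len(nbrs)
        else:
            neighborhood = 0.0  # assumption: isolated positions get no neighborhood bonus
        corrected.append(gamma * z_ssp[i] + (1 - gamma) * neighborhood)
    return corrected


# Hypothetical representative-atom coordinates for three alignment positions.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
z_ssp = [4.0, 0.0, 2.0]
z_cns = [0.0, 3.0, 1.0]
print(structure_corrected_z(z_ssp, z_cns, neighbors_within(coords, cutoff=5.0)))
```

The third position has no neighbors and keeps a nonzero score through the γ-weighted first term alone, which reflects the statement that highly specific stand-alone SSPs are not lost.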
Finally, for practical purposes, especially when SSPs are used
to study the structure–function relationship in proteins, it
is important to perform statistical ranking of SSPs and select the
top-ranked results for further study. Zebra implements the previously
discussed B-cutoff method [72] to automatically select the most
significant SSPs.
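
As an illustration of how such an automatic cutoff can work, the sketch below implements a Bernoulli-estimator style selection. It assumes that Z-scores of nonspecific columns follow a standard normal distribution and, for each rank k, computes the probability of observing at least k columns scoring as high as the k-th best by chance; the rank that minimizes this probability sets the cutoff. This is one reading of the B-cutoff idea and may differ in detail from the procedure in [72] or in Zebra; all function names are hypothetical.

```python
import math


def normal_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p); a direct sum, adequate for modest n."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def b_cutoff(z_scores):
    """Return the indices of the selected columns and the minimal chance probability."""
    ranked = sorted(range(len(z_scores)), key=lambda i: z_scores[i], reverse=True)
    n = len(ranked)
    best_k, best_p = 0, 1.0
    for k, idx in enumerate(ranked, start=1):
        p_single = normal_tail(z_scores[idx])
        p_at_least_k = binomial_tail(n, k, p_single)
        if p_at_least_k < best_p:
            best_k, best_p = k, p_at_least_k
    return ranked[:best_k], best_p


z = [5.0, 4.5, 0.1, -0.3, 0.5, -1.0, 0.2, -0.7, 1.0, 0.0]
selected, p_value = b_cutoff(z)
print(selected, p_value)
```

In this toy example two columns stand far above the background, and the cutoff falls right after them.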
The Zebra web server, user manuals, and examples are publicly
available at http://biokinet.belozersky.msu.ru/zebra. Zebra provides
three input modes that differ in complexity and in the type of data
required to start the analysis. The QuickZebra mode is the most
straightforward way to run the bioinformatic analysis:
it requires only an MSA as input, and functional subfamily
classifications are predicted automatically. The QuickZebra+3D
mode operates with structural information in addition to sequence
information and requires a PDB structure file that
corresponds to one of the MSA sequences. Incorporation of
3D information can significantly improve Zebra predictions [41].
Thus, Zebra does not require information about
the 3D protein structure but benefits substantially when it is available.
Finally, the Manual mode provides the ability to edit algorithm
parameters that control the classification and identification of SSPs.
Zebra results are provided in two ways: as a single all-in-one
parsable text file and as PyMOL sessions with structural
representation of SSPs. The output text file contains a list of
subfamily classifications ranked by decreasing significance and,
for each of them, a detailed list of SSPs, including results of
statistical evaluation and amino acid content of the corresponding
columns. For each subfamily classification the results of the
statistical analysis are provided as a guideline to select hotspots
for primary consideration. Moreover, if a PDB file has been
submitted to the server, it is used to produce PyMOL session files
(.pse) with structural representations of SSPs for each subfamily
classification. Specific residues are gradient-painted according to
the calculated specificity Z-scores (Fig. 10.3). These files can be used to
qualitatively analyze the distribution of SSPs in the structure.
Detailed practical guidelines for bioinformatic analysis of diverse
protein superfamilies with Zebra have been presented elsewhere
[78].

Figure 10.3 The structural representation of the subfamily-specific
positions in the glutathione S-transferase superfamily (automatically produced
by the Zebra web server). Gradient paint of the protein backbone indicates
the estimated specificity of corresponding residues: red stands for highly
significant SSPs and cyan for nonspecific positions. The product of the
enzymatic reaction is shown in yellow with balls and sticks to highlight
the active site. The most significant hits are located in the active site loop
important for binding structurally diverse xenobiotic substances and also
in surrounding regions (domain–domain, subunit–subunit, and possibly
dimer–dimer interfaces) that provide shape and flexibility to the active site
for complementarity with the substrate [78].

To sum up, Zebra can incorporate sequence, structural, and
functional information about protein superfamilies to generate a list
of statistically significant SSPs that are supposed to be responsible
for functional diversity at different levels of functional classification.
These positions can be used as hotspots for directed evolution or
rational design experiments and analyzed to study the
structure–function relationship. In other words, Zebra bioinformatic analysis
provides reproducible and experimentally testable hypotheses for
further evaluation.

10.4 Subfamily-Specific Positions as a Tool for Enzyme
Engineering

We have discussed different types of variable positions in
evolutionarily related proteins and their implications for protein
function. We gave an overview of different algorithms that can
be used to identify the corresponding variable residues in protein
superfamilies using sequence and structural information. Finally, we
have emphasized the value of statistical approaches applied to select
a set of the most promising positions for further evaluation. Now,
how can we put these computational results into action and use our
knowledge about SSPs to study enzyme mechanisms or design novel
enzymes? One option would be to produce focused mutant libraries
by targeting only the selected hotspots and reducing the amino acid
alphabet in each position. The obtained variants can then be studied
experimentally to evaluate the impact of the introduced mutations on
structure and function. In this way it would be possible to
dramatically decrease the screening effort in directed evolution
or random mutagenesis experiments and produce enzyme variants
with new or enhanced properties more effectively. Another, more
rational option would be to construct a library of enzyme mutants
by switching the amino acid types at the SSPs predicted by the
bioinformatic analysis and to use molecular modeling
to screen the obtained variants against an in silico library of
substrates of interest. The most promising enzyme variants
can then be produced and characterized experimentally. A corresponding
example is discussed below.
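
The effect of restricting both the positions and the amino acid alphabet can be illustrated with a short Python sketch that enumerates a focused in silico library. The toy sequence, the positions, and the allowed residue types are hypothetical; in practice the allowed types would be those observed at each SSP in the subfamilies of interest. Variant names follow the dot-separated style used in this chapter.

```python
import math
from itertools import product


def enumerate_library(wild_type, allowed):
    """All combinations of allowed residues at the chosen 1-based positions.

    Returns a dict mapping a variant name such as "G2A.W4F" to its sequence.
    """
    positions = sorted(allowed)
    variants = {}
    for choice in product(*(allowed[p] for p in positions)):
        sequence = list(wild_type)
        mutations = []
        for pos, residue in zip(positions, choice):
            original = wild_type[pos - 1]
            if residue != original:
                mutations.append(f"{original}{pos}{residue}")
                sequence[pos - 1] = residue
        variants[".".join(mutations) or "WT"] = "".join(sequence)
    return variants


def library_size(allowed):
    """Number of distinct variants, the wild type included."""
    return math.prod(len(set(residues)) for residues in allowed.values())


wild_type = "MGTWL"                              # hypothetical toy sequence
allowed = {2: "GA", 3: "TG", 4: "WF", 5: "LA"}   # hypothetical reduced alphabets
library = enumerate_library(wild_type, allowed)
print(len(library), library["G2A.T3G.W4F.L5A"])
print("full saturation of the same four positions:", 20 ** len(allowed))
```

With two residue types at each of four positions the library contains 16 variants, compared with 160,000 for full saturation of the same positions, which is the kind of reduction in screening effort discussed above.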
The superfamily of α/β-hydrolases is one of the largest groups
of enzymes with diverse catalytic functions that share a common
structural framework and a topologically conserved nucleophile–
histidine–acid catalytic triad [79, 80]. At the same time, members
of this group have lost sequence similarity during natural selection
and specialization from a common ancestor to efficiently transform
various substrates of different structure and physicochemical
properties. Consequently, the α/β-hydrolases provide an excellent
model to study the structure–function relationship of multiple
catalytic capabilities within active sites of a common structural
organization. Lipases are members of the α/β-hydrolase fold
superfamily that catalyze hydrolysis of poorly water-soluble
long-chain acylglycerols. However, lipases were shown to be very poor
catalysts for hydrolysis of amides [81] despite the wide spread of
protease activities within the α/β-hydrolase fold. Therefore,
bioinformatic analysis has been used to study how lipase
and amidase catalytic activities are implemented in the same
structural framework [82, 83].
SSPs that are conserved within the subfamilies of lipases and
peptidases but differ between them, and were therefore assumed
to be responsible for functional discrimination, have been identified.
The most statistically significant SSPs were selected for further
consideration with molecular modeling and used as hotspots to
introduce amidase activity into Candida antarctica lipase B (CaLB).
The structural analysis has shown that substitution of selected SSPs
supports the near-to-attack conformation of the amide substrate
in the CaLB active site, introduces more space, and decreases
hydrophobicity of the leaving group binding subsite. An in silico
library of single and multiple CaLB variants was constructed by
introducing mutations into the SSPs. Computer screening was
applied to evaluate the influence of the selected residues on binding
and catalytic conversion of the amide substrate and select reactive
enzyme–substrate complexes that satisfy the knowledge-based
criteria of amidase catalytic activity [84]. Four distance criteria were
implemented to describe the near-to-attack conformation of the
substrate in the active site: the distance between the γ-oxygen of the
catalytic Ser105 and the substrate carbonyl carbon, and the distances
from the substrate carbonyl oxygen to the backbone nitrogen atoms
of Thr40 and Gln106 and to the γ1-oxygen of Thr40, all limited
to at most 3.5 Å. Additionally, the angle and the dihedral angle of
the nucleophilic attack trajectory were measured and required to be
close to 90°. The structural analysis showed that substitution of
the selected SSPs supported the near-to-attack conformation of the
amide substrate in the active site of CaLB mutants, introduced more
space, and decreased hydrophobicity of the leaving group-binding
subsite. The selected CaLB variants were produced and showed
significant improvement of the experimentally measured amidase
activity. The amidase activity was improved
in mutants that stabilized the near-to-attack conformation of the
substrate in the molecular modeling studies (Fig. 10.4), while no
improvement was found in those mutants where the near-to-attack
conformation was not observed [83].

Figure 10.4 Amide substrate (N-benzylacetamide) bound in the active site
of the CaLB mutant G39A.T103G.W104F.L278A satisfies the near-to-attack
conformation criteria during molecular dynamics simulation.
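
The geometric filter described above can be written as a small Python check. The 3.5 Å distance limit comes from the text. The tolerance that defines "close to 90°" is an assumption, and the attack angle and dihedral are passed in as precomputed values because the atoms that define them are not specified here. The coordinates in the example are invented.

```python
import math


def is_near_attack(ser_og, carbonyl_c, carbonyl_o, oxyanion_donors,
                   attack_angle, attack_dihedral,
                   max_dist=3.5, angle_tol=20.0):
    """True if one structure or MD frame satisfies all near-to-attack criteria.

    ser_og          -- coordinates of the gamma-oxygen of the catalytic Ser105
    carbonyl_c/_o   -- coordinates of the substrate carbonyl carbon and oxygen
    oxyanion_donors -- coordinates of Thr40 N, Gln106 N, and Thr40 gamma1-oxygen
    attack_angle, attack_dihedral -- degrees, measured elsewhere
    angle_tol       -- assumed tolerance around 90 degrees
    """
    distances = [math.dist(ser_og, carbonyl_c)]
    distances += [math.dist(carbonyl_o, donor) for donor in oxyanion_donors]
    distances_ok = all(d <= max_dist for d in distances)
    angles_ok = (abs(attack_angle - 90.0) <= angle_tol
                 and abs(attack_dihedral - 90.0) <= angle_tol)
    return distances_ok and angles_ok


def reactive_fraction(frames):
    """Share of frames (each a dict of keyword arguments) that pass the filter."""
    if not frames:
        return 0.0
    return sum(is_near_attack(**frame) for frame in frames) / len(frames)


frame = dict(
    ser_og=(0.0, 0.0, 0.0),
    carbonyl_c=(3.0, 0.0, 0.0),
    carbonyl_o=(3.0, 1.2, 0.0),
    oxyanion_donors=[(3.0, 4.0, 0.0), (5.0, 2.0, 0.0), (1.5, 3.5, 0.0)],
    attack_angle=95.0,
    attack_dihedral=85.0,
)
distant = dict(frame, ser_og=(-1.0, 0.0, 0.0))  # nucleophile 4.0 angstroms away
print(is_near_attack(**frame), reactive_fraction([frame, distant]))
```

Applied to every frame of a molecular dynamics trajectory, such a filter gives a simple way to compare how well different in silico variants maintain a reactive enzyme–substrate complex.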

10.5 Conclusion

The steady growth of sequence-structure-function databases highlights
the functional and structural variability of enzymes and fuels
the ongoing discussion about the structure–function relationship in
diverse protein superfamilies. How do changes in protein sequence
affect the biological function and structure, and how can we change
enzyme function and create novel biocatalysts? In recent years
there have been significant advances in modifying enzymes by
so-called stochastic approaches such as experimental directed
evolution. At the same time, there is a great need to limit the
size of mutant libraries by making the selection of hotspots more
knowledge based.
Analysis of sequence and structure information provides
opportunities to study the structure–function relationship in enzymes
and rationalize protein engineering by moving away from unguided
evolutionary stochastic approaches. During enzyme evolution one
functional property can be preserved because of the strong selective
pressure, while others can vary. This phenomenon seems to be
programmed by the different susceptibility to mutations of certain
positions in protein structures and can be studied by means of
sequence and structure comparison. In this chapter we have focused
on variable positions that can accommodate different amino acid
types in homologous proteins. The SSPs are especially interesting as
their amino acid composition is associated with functional
subfamilies, which makes them an important tool to study the
structure–function relationships in proteins. Moreover, swapping amino acid
types in SSPs seems to be a promising strategy for rational design
of the corresponding functional properties. Currently available
computer programs allow users to perform
the bioinformatic analysis of diverse superfamilies and identify
functionally important SSPs to be used as hotspots for directed
evolution or rational design experiments. Despite some progress
in understanding the influence of SSPs on enzyme function, this is still
a poorly explored area that needs further development. We hope
that systematic analysis of SSPs and their role in enzyme catalytic
mechanisms will help to better understand enzymes, establish
the structure–function relationships, and facilitate development of
rational enzyme engineering technologies.

References

1. Murzin, A. G., Brenner, S. E., Hubbard, T., and Chothia, C. (1995). SCOP:
a structural classification of proteins database for the investigation of
sequences and structures, J. Mol. Biol., 247(4), pp. 536–540.
2. Todd, A. E., Orengo, C. A., and Thornton, J. M. (1999). Evolution of protein
function, from a structural perspective, Curr. Opin. Chem. Biol., 3(5), pp.
548–556.
3. Todd, A. E., Orengo, C. A., and Thornton, J. M. (2001). Evolution of
function in protein superfamilies, from a structural perspective, J. Mol.
Biol., 307(4), pp. 1113–1144.
4. Bornscheuer, U. T., Huisman, G. W., Kazlauskas, R. J., Lutz, S., Moore, J.
C., and Robins, K. (2012). Engineering the third wave of biocatalysis,
Nature, 485(7397), pp. 185–194.
5. Arnold, F. H. (1996). Directed evolution: creating biocatalysts for the
future, Chem. Eng. Sci., 51(23), pp. 5091–5102.
6. Arnold, F. H. (1998). Design by directed evolution, Acc. Chem. Res., 31,
pp. 125–131.
7. Chen, K., and Arnold, F. H. (1993). Tuning the activity of an enzyme for
unusual environments: sequential random mutagenesis of subtilisin E
for catalysis in dimethylformamide, Proc. Natl. Acad. Sci. U S A, 90(12),
pp. 5618–5622.
8. Reetz, M. T., Zonta, A., Schimossek, K., Jaeger, K. E., and Liebeton, K.
(1997). Creation of enantioselective biocatalysts for organic chemistry
by in vitro evolution, Angew. Chem., Int. Ed., 36(24), pp. 2830–2832.
9. Stemmer, W. P. (1994). Rapid evolution of a protein in vitro by DNA
shuffling, Nature, 370(6488), pp. 389–391.
10. Lauble, H., Miehlich, B., Förster, S., Wajant, H., and Effenberger, F.
(2002a). Crystal structure of hydroxynitrile lyase from Sorghum bicolor
in complex with the inhibitor benzoic acid: a novel cyanogenic enzyme,
Biochemistry, 41(40), pp. 12043–12050.
11. Kazlauskas, R. J., and Bornscheuer, U. T. (2009). Finding better protein
engineering strategies, Nat. Chem. Biol., 5(8), pp. 526–529.
12. Pleiss, J. (2012). Chapter 4, Rational design of enzymes. In Enzyme
Catalysis in Organic Synthesis, 3rd ed., Drauz, K., Gröger, H., and May, O.,
eds. (Wiley-VCH, Weinheim) pp. 89–117.
13. Miyazaki, K., and Arnold, F. H. (1999). Exploring nonnatural evolution-
ary pathways by saturation mutagenesis: rapid improvement of protein
function, J. Mol. Evol., 49(6), pp. 716–720.
14. Hayes, R. J., Bentzien, J., Ary, M. L., Hwang, M. Y., Jacinto, J. M., Vielmetter,
J., Kundu, A., and Dahiyat, B. I. (2002). Combining computational and
experimental screening for rapid optimization of protein properties,
Proc. Natl. Acad. Sci. U S A, 99(25), pp. 15926–15931.
15. Reetz, M. T., Bocola, M., Carballeira, J. D., Zha, D., and Vogel, A. (2005).
Expanding the range of substrate acceptance of enzymes: combinatorial
active-site saturation test, Angew. Chem., Int. Ed., 44(27), pp. 4192–
4196.
16. Privett, H. K., Kiss, G., Lee, T. M., Blomberg, R., Chica, R. A., Thomas, L.
M., Hilvert, D., Houk, K. N., and Mayo, S. L. (2012). Iterative approach to
computational enzyme design, Proc. Natl. Acad. Sci. U S A, 109(10), pp.
3790–3795.
17. Reetz, M. T., Carballeira, J. D., Peyralans, J., Höbenreich, H., Maichele,
A., and Vogel, A. (2006a). Expanding the substrate scope of enzymes:
combining mutations obtained by CASTing, Chem.-Eur. J., 12(23), pp.
6031–6038.
18. Reetz, M. T., Wang, L. W., and Bocola, M. (2006b). Directed evolution
of enantioselective enzymes: iterative cycles of CASTing for probing
protein-sequence space, Angew. Chem., Int. Ed., 45, pp. 1236–1241.
19. Morley, K., and Kazlauskas, R. J. (2005). Improving enzyme properties:
when are closer mutations better? Trends Biotechnol., 23(5), pp. 231–
237.
20. Bloom, J. D., Labthavikul, S. T., Otey, C. R., and Arnold, F. H. (2006). Protein
stability promotes evolvability, Proc. Natl. Acad. Sci. U S A, 103(15), pp.
5869–5874.
21. Reetz, M. T., Kahakeaw, D., and Lohmer, R. (2008). Addressing the
numbers problem in directed evolution, ChemBioChem, 9(11), pp.
1797–1804.
22. Kuipers, R. K., Joosten, H. J., van Berkel, W. J., Leferink, N. G., Rooijen,
E., Ittmann, E., van Zimmeren, F., Jochens, H., Bornscheuer, U., Vriend,
G., Martins dos Santos, V. A. P., and Schaap, P. J. (2010). 3DM:
systematic analysis of heterogeneous superfamily data to discover
protein functionalities, Proteins, 78(9), pp. 2101–2113.
23. Jochens, H., and Bornscheuer, U. T. (2010). Natural diversity to guide
focused directed evolution, ChemBioChem, 11(13), pp. 1861–1866.
24. Jochens, H., Aerts, D., and Bornscheuer, U. T. (2010). Thermostabilization
of an esterase by alignment-guided focused directed evolution, Protein
Eng. Des. Sel., 23(12), pp. 903–909.
25. Padhi, S. K., Fujii, R., Legatt, G. A., Fossum, S. L., Berchtold, R., and
Kazlauskas, R. J. (2010). Switching from an esterase to a hydroxynitrile
lyase mechanism requires only two amino acid substitutions, Chem.
Biol., 17(8), pp. 863–871.
26. Fox, R. J., Davis, S. C., Mundorff, E. C., Newman, L. M., Gavrilovic, V., Ma, S.
K., Chung, L. M., Ching, C., Tam, S., Muley, S., Grate, J., Gruber, J., Whitman,
J. C., Sheldon, R. A., and Huisman, G. W. (2007). Improving catalytic
function by ProSAR-driven enzyme evolution, Nat. Biotechnol., 25(3),
pp. 338–344.
27. Gupta, R. D., and Tawfik, D. S. (2008). Directed enzyme evolution via
small and effective neutral drift libraries, Nat. Methods, 5(11), pp. 939–
942.
28. Chen-Goodspeed, M., Sogorb, M. A., Wu, F., and Raushel, F. M.
(2001). Enhancement, relaxation, and reversal of the stereoselectivity
for phosphotriesterase by rational evolution of active site residues,
Biochemistry, 40(5), pp. 1332–1339.
29. Lauble, H., Miehlich, B., Förster, S., Kobler, C., Wajant, H., and Effen-
berger, F. (2002b). Structure determinants of substrate specificity of
hydroxynitrile lyase from Manihot esculenta, Protein Sci., 11(1), pp. 65–
71.
30. Bernhardt, P., Hult, K., and Kazlauskas, R. J. (2005). Molecular basis of
perhydrolase activity in serine hydrolases, Angew. Chem., 117(18), pp.
2802–2806.
31. Jochens, H., Stiba, K., Savile, C., Fujii, R., Yu, J. G., Gerassenkov, T.,
Kazlauskas, R. J., and Bornscheuer, U. T. (2009). Converting an esterase
into an epoxide hydrolase, Angew. Chem., Int. Ed., 48(19), pp. 3532–
3535.
32. Martin, A. C., Orengo, C. A., Hutchinson, E. G., Jones, S., Karmirantzou, M.,
Laskowski, R. A., Mitchell, J. B., Taroni, C., and Thornton, J. M. (1998).
Protein folds and functions, Structure, 6(7), pp. 875–884.
33. Koonin, E. V., and Galperin, M. Y. (2003). Sequence-Evolution-Function:
Computational Approaches in Comparative Genomics (Kluwer Acad-
emic, Boston).
34. Zuckerkandl, E., and Pauling, L. (1965). Molecules as documents of
evolutionary history, J. Theor. Biol., 8, pp. 357–366.
35. Valdar, W. S. (2002). Scoring residue conservation, Proteins, 48(2), pp.
227–241.
36. Cheng, G., Qian, B., Samudrala, R., and Baker, D. (2005). Improvement
in protein functional site prediction by distinguishing structural and
functional constraints on protein family evolution using computational
design, Nucleic Acids Res., 33(18), pp. 5861–5867.
37. Pupko, T., Bell, R. E., Mayrose, I., Glaser, F., and Ben-Tal, N. (2002).
Rate4Site: an algorithmic tool for the identification of functional regions
in proteins by surface mapping of evolutionary determinants within
their homologues, Bioinformatics, 18(suppl 1), pp. 71–77.
38. Bartlett, G. J., Porter, C. T., Borkakoti, N., and Thornton, J. M. (2002).
Analysis of catalytic residues in enzyme active sites, J. Mol. Biol., 324(1),
pp. 105–122.
39. Casari, G., Sander, C., and Valencia, A. (1995). A method to predict
functional residues in proteins, Nat. Struct. Biol., 2, pp. 171–178.
40. Mirny, L., and Gelfand, M. (2002). Using orthologous and paralogous
proteins to identify specificity-determining residues in bacterial tran-
scription factors, J. Mol. Biol., 321(1), pp. 7–20.
41. Suplatov, D., Shalaeva, D., Kirilin, E., Arzhanik, V., and Švedas, V. (2014a).
Bioinformatic analysis of protein families for identification of variable
amino acid residues responsible for functional diversity, J. Biomol.
Struct. Dyn., 32(1), pp. 75–87.
42. Suplatov, D., Kirilin, E., Arbatsky, M., Takhaveev, V., and Švedas, V. (2014b).
pocketZebra: a web-server for automated selection and classification
of subfamily-specific binding sites by bioinformatic analysis of diverse
protein families, Nucleic Acids Res., 42(W1), pp. W344–W349.
43. Marks, D. S., Colwell, L. J., Sheridan, R., Hopf, T. A., Pagnani, A.,
Zecchina, R., and Sander, C. (2011). Protein 3D structure computed from
evolutionary sequence variation, PLOS ONE, 6(12), p. e28766.
44. Tress, M., de Juan, D., Graña, O., Gómez, M. J., Gómez-Puertas, P., González,
J. M., López, G., and Valencia, A. (2005). Scoring docking models with
evolutionary information, Proteins, 60(2), pp. 275–280.
45. Saeys, Y., Inza, I., and Larrañaga, P. (2007). A review of feature selection
techniques in bioinformatics, Bioinformatics, 23(19), pp. 2507–2517.
46. Pavelka, A., Chovancova, E., and Damborsky, J. (2009). HotSpot wizard: a
web server for identification of hot spots in protein engineering, Nucleic
Acids Res., 37(suppl 2), pp. 376–383.
47. Livingstone, C. D., and Barton, G. J. (1993). Protein sequence alignments:
a strategy for the hierarchical analysis of residue conservation, Comput.
Appl. Biosci., 9(6), pp. 745–756.
48. Lichtarge, O., Bourne, H. R., and Cohen, F. E. (1996). An evolutionary
trace method defines binding surfaces common to protein families, J.
Mol. Biol., 257(2), pp. 342–358.
49. Cover, T. M., and Thomas, J. A. (2012). Elements of Information Theory,
2nd ed. (Wiley-Interscience, Hoboken).
50. Hannenhalli, S. S., and Russell, R. B. (2000). Analysis and prediction
of functional sub-types from protein sequence alignments, J. Mol. Biol.,
303(1), pp. 61–76.
51. Kalinina, O. V., Mironov, A. A., Gelfand, M. S., and Rakhmaninova, A.
B. (2004). Automated selection of positions determining functional
specificity of proteins by comparative analysis of orthologous groups in
protein families, Protein Sci., 13(2), pp. 443–456.
52. Del Sol Mesa, A., Pazos, F., and Valencia, A. (2003). Automatic methods
for predicting functionally important residues, J. Mol. Biol., 326(4), pp.
1289–1302.
53. Sankararaman, S., and Sjölander, K. (2008). INTREPID: information-
theoretic tree traversal for protein functional site identification, Bioin-
formatics, 24(21), pp. 2445–2452.
54. Chakrabarti, S., Bryant, S. H., and Panchenko, A. R. (2007). Functional
specificity lies within the properties and evolutionary changes of amino
acids, J. Mol. Biol., 373(3), pp. 801–810.
55. Brandt, B. W., Feenstra, K. A., and Heringa, J. (2010). Multi-harmony:
detecting functional specificity from sequence alignment, Nucleic Acids
Res., 38(suppl 2), pp. 35–40.
56. Capra, J. A., and Singh, M. (2008). Characterization and prediction
of residues determining protein functional specificity, Bioinformat-
ics, 24(13), pp. 1473–1480.
57. Ye, K., Feenstra, K. A., Heringa, J., IJzerman, A. P., and Marchiori, E. (2008).
Multi-RELIEF: a method to recognize specificity determining residues
from multiple sequence alignments using a Machine-Learning approach
for feature weighting, Bioinformatics, 24(1), pp. 18–25.
58. Wallace, I. M., and Higgins, D. G. (2007). Supervised multivariate analysis
of sequence groups to identify specificity determining residues, BMC
Bioinformatics, 8(1), p. 135.
59. Yu, G. X., Park, B. H., Chandramohan, P., Munavalli, R., Geist, A.,
and Samatova, N. F. (2005). In silico discovery of enzyme–substrate
specificity-determining residue clusters, J. Mol. Biol., 352(5), pp. 1105–
1117.
60. Altschuh, D., Vernet, T., Berti, P., Moras, D., and Nagai, K. (1988).
Coordinated amino acid changes in homologous protein families, Prot.
Eng., 2(3), pp. 193–199.
61. De Juan, D., Pazos, F., and Valencia, A. (2013). Emerging methods in
protein co-evolution, Nat. Rev. Genet., 14, pp. 249–261.
62. Olmea, O., and Valencia, A. (1997). Improving contact predictions by
the combination of correlated mutations and other sources of sequence
information, Fold. Des., 2, pp. 25–32.
63. Fodor, A. A., and Aldrich, R. W. (2004). Influence of conservation
on calculations of amino acid covariance in multiple sequence align-
ments, Proteins, 56(2), pp. 211–221.
64. Göbel, U., Sander, C., Schneider, R., and Valencia, A. (1994). Correlated
mutations and residue contacts in proteins, Proteins, 18(4), pp. 309–
317.
65. Fares, M. A., and McNally, D. (2006). CAPS: coevolution analysis using
protein sequences, Bioinformatics, 22(22), pp. 2821–2822.
66. Weigt, M., White, R. A., Szurmant, H., Hoch, J. A., and Hwa, T. (2009).
Identification of direct residue contacts in protein–protein interaction
by message passing, Proc. Natl. Acad. Sci. U S A, 106(1), pp. 67–72.
67. Dunn, S. D., Wahl, L. M., and Gloor, G. B. (2008). Mutual information
without the influence of phylogeny or entropy dramatically improves
residue contact prediction, Bioinformatics, 24(3), pp. 333–340.
68. Kalinina, O. V., Gelfand, M. S., and Russell, R. B. (2009). Combining
specificity determining and conserved residues improves functional site
prediction, BMC Bioinformatics, 10(1), p. 174.
69. Theißen, G. (2002). Orthology: secret life of genes, Nature, 415(6873),
p. 741.
70. Mazin, P. V., Gelfand, M. S., Mironov, A. A., Rakhmaninova, A. B., Rubinov,
A. R., Russell, R. B., and Kalinina, O. V. (2010). An automated stochastic
approach to the identification of the protein specificity determinants
and functional subfamilies, Algorithms Mol. Biol., 5, p. 29.
71. Pei, J., Cai, W., Kinch, L. N., and Grishin, N. V. (2006). Prediction of
functional specificity determinants from protein sequences using log-
likelihood ratios, Bioinformatics, 22(2), pp. 164–171.
72. Vinogradov, D. V., and Mironov, A. A. (2002). Siteprob: yet another
algorithm to find regulatory signals in nucleotide sequences, Proc. 3rd
Int. Conf. Bioinf. Genome Regulat. Struct., BGRS 2002, 1, pp. 30–32.
73. Jones, D. T., Buchan, D. W., Cozzetto, D., and Pontil, M. (2012). PSICOV:
precise structural contact prediction using sparse inverse covariance es-
timation on large multiple sequence alignments, Bioinformatics, 28(2),
pp. 184–190.
74. Lapedes, A. S., Giraud, B. G., Liu, L., and Stormo, G. D. (1999). Correlated
mutations in models of protein sequences: phylogenetic and structural
effects, Lect. Notes: Monograph Ser., 33, pp. 236–256.
75. Morcos, F., Pagnani, A., Lunt, B., Bertolino, A., Marks, D. S., Sander, C.,
Zecchina, R., Onuchic, J., Hwa, T., and Weigt, M. (2011). Direct-coupling
analysis of residue coevolution captures native contacts across many
protein families, Proc. Natl. Acad. Sci. U S A, 108(49), pp. E1293–E1301.
76. Dutta, S., Zardecki, C., Goodsell, D. S., and Berman, H. M. (2010). Pro-
moting a structural view of biology for varied audiences: an overview
of RCSB PDB resources and experiences, J. Appl. Crystallogr., 43(5), pp.
1224–1229.

77. Valdar, W. S. J., and Thornton, J. M. (2001). Protein-protein interfaces:
analysis of amino acid conservation in homodimers, Proteins, 42, pp.
108–124.
78. Suplatov, D., Kirilin, E., Takhaveev, V., and Švedas, V. (2014c). Zebra: a
web-server for bioinformatic analysis of diverse protein families, J. Biomol.
Struct. Dyn., 32(11), pp. 1752–1758.
79. Holmquist, M. (2000). Alpha beta-hydrolase fold enzymes structures,
functions and mechanisms, Curr. Protein Pept. Sci., 1(2), pp. 209–235.
80. Ollis, D. L., Cheah, E., Cygler, M., Dijkstra, B., Frolow, F., Franken, S.
M., Harel, M., Remington, S. J., Silman, I., Schrag, J., Sussman, J. L.,
Verschueren, K. H. G., and Goldman, A. (1992). The α/β hydrolase fold,
Protein Eng., 5(3), pp. 197–211.
81. Henke, E., and Bornscheuer, U. T. (2003). Fluorophoric assay for the
high-throughput determination of amidase activity, Anal. Chem., 75(2),
pp. 255–260.
82. Suplatov, D. A., Arzhanik, V. K., and Švedas, V. K. (2011). Comparative
bioinformatic analysis of active site structures in evolutionarily remote
homologues of α, β-hydrolase superfamily enzymes, Acta Naturae, 3(1),
pp. 93–98.
83. Suplatov, D. A., Besenmatter, W., Švedas, V. K., and Svendsen, A. (2012).
Bioinformatic analysis of alpha/beta-hydrolase fold enzymes reveals
subfamily-specific positions responsible for discrimination of amidase
and lipase activities, Protein Eng. Des. Sel., 25(11), pp. 689–697.
84. Radisky, E. S., and Koshland, D. E. (2002). A clogged gutter mechanism
for protease inhibitors, Proc. Natl. Acad. Sci. U S A, 99(16), pp. 10316–
10321.
March 23, 2016 12:53 PSP Book - 9in x 6in 11-Allan-Svendsen-c11

Chapter 11

Decoding Life Secrets in Sequences by Chemicals

Zizhang Zhang
Department of Chemistry, Zhejiang University, Hangzhou 310027, China
zhangzz@zju.edu.cn

Entering the 21st century, scientific frontiers have moved toward a
new horizon: understanding entire biological systems. Revealing
enzymes’ functions on the basis of their primary sequences and
substrate structures is a new approach to reaching this goal. It
presents a great opportunity and also an unparalleled challenge
because of the vast volume of data. Biocatalysis has supplied
seminal knowledge. On this foundation, combining databases built
from experimental enzyme-activity data with computational methods
such as molecular modeling and bioinformatics tools makes it
possible to establish the relationship between enzymes’ primary
sequences and their catalytic functions. In this chapter, a
discussion is provided and a model is proposed on the basis of a
review of the literature, with emphasis on potentially useful
technologies.

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

11.1 Introduction

A new era of big data has arrived with the 21st century. The volume
of data and information acquired from biological and chemical
studies has been increasing exponentially in recent years. As of
January 20, 2015, there were 17.9 million gene sequences in GenBank,
more than 97,000 protein structures in the Protein Data Bank (PDB),
and 71 million chemicals with unique structures in the CAS Registry.
Understanding enzymes’ functions on the basis of their primary
sequences and substrate structures appears to be an important step
toward better understanding biological systems. This presents a
great opportunity to reveal the secrets embedded in the sequences,
but also unparalleled challenges, given the technical difficulties
and the vast volume of data and information to be dealt with. A new
strategy with a global vision is needed.
Fortunately, a variety of reliable investigational methods and
technologies have been established and a large volume of knowledge
has been compiled. Briefly:

• Enzyme-catalyzed reactions often show high stereoselectivity,
enantioselectivity, and regioselectivity, which can be
characterized quantitatively and qualitatively [1–4].
• Enzymes are known to accept substrates structurally
similar to their natural ones; some of them have a
particularly broad substrate scope.
• Designed substrates may be used for characterization of
enzymes.
• Computational methods, such as molecular modeling based
on protein structures and machine-learning methods based
on statistics, have been developed to predict substrates
and selectivity.
• Sequencing technology has improved tremendously and
continues to get faster.

Integration of all the above makes it possible to establish the
relationship between enzymes’ primary sequences and their catalytic
function. Doing so may constitute a distinct new science that rests
on the foundation of biocatalysis. It is referred to here as
genomic chemistry, or genochemistry for short, a study that bridges
sequence information and enzyme-catalyzed chemistry. It may serve
as a strategy and a technological platform for biocatalytic process
development for pharmaceuticals, agrochemicals, and biofuels [5–7]
and for gaining new insight into biological systems for medicinal
research, complementing pharmacogenomics [8], metabolomics [9],
chemogenomics [10], chemical genomics [11], and systems biology
[12].

11.2 Linking an Enzyme’s Activity to Its Sequence

Enzymes, proteins with catalytic function, are polypeptides
consisting of amino acids in sequences encoded by their genes.
They exert their catalytic activity through specifically folded 3D
structures. According to Anfinsen’s principle [13], the
information that defines the 3D structures, and hence the functions,
is contained solely in the amino acid sequences. Within this
well-accepted assumption, there are two contrasting scenarios. In
the first scenario, similar structures, and therefore functions, can
be defined by genes with very low sequence similarity. For example,
epoxide hydrolases (EHs) from Frankia sp., Rhodococcus erythropolis,
Burkholderia cenocepacia, and B. pseudomallei, which have sequence
similarities as low as 0.6%–36%, were classified in the same class
(unpublished result). This is not surprising, especially since the
local structure of the active site responsible for the catalytic
activity may be similar. In the second scenario, the enzyme’s
functions can be largely altered by single-nucleotide polymorphisms
(SNPs) [14, 15]. Numerous examples show that the substitution of one
amino acid in a protein sequence may drastically change the
substrate specificity of the enzyme. For instance, the enzyme
activity of EH in humans shows high interindividual variation (e.g.,
500-fold in the liver). A variety of polymorphisms have been found,
six of which result in amino acid substitutions in the human
population, and at least four of these may influence human soluble
epoxide hydrolase (hsEH)-mediated metabolism of exogenous and
endogenous epoxide substrates in vivo [16]. A genome-wide
association study (GWAS) conducted by Sanger’s group showed that
variants in the genes for three enzymes (vitamin K epoxide reductase
complex subunit 1, or VKORC1; cytochrome P450 2C9, or CYP2C9; and
CYP4F2) are linked to a 20-fold variation in warfarin dose response
[17].
In the first scenario, the large sequence space of the enzymes
converges upon a specific 3D architecture to perform the enzyme’s
specific function, while the remaining 74%–99% of the difference in
the sequence is not reflected in the function. In the second
scenario, by contrast, a slight difference in sequence may be
greatly amplified into a difference in function, or may even diverge
into unrelated functions, that is, promiscuous activity of enzymes.
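Both scenarios turn on how different two sequences are. As a minimal sketch, and not the method used for the figures quoted above, percent identity between two pre-aligned sequences can be computed as follows. Identity is used here only as a simple stand-in for "similarity," whose scoring scheme the text does not specify; the function name, the gap convention, and the toy sequences are illustrative assumptions.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap are ignored (an assumed
    convention; other definitions divide by the full alignment length).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy fragments, not real enzyme sequences.
    wild_type = "MKTAYIAKQR"
    variant = "MKTAYIAKQW"  # a single substitution at the last position
    print(percent_identity(wild_type, variant))
```

A single substitution barely changes this number, which is why an identity score alone cannot distinguish the second scenario (small change, large functional effect) from a neutral variant.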
Convergence (reduction of differences) on one side and divergence
(amplification) on the other potentially reflect the complexity and
flexibility of the strategies that living systems have adopted to
face the perturbations of the chemical world. What, then, is the
common ground at the molecular level between these two apparently
contradictory phenomena?
From the chemical point of view, careful analysis shows that the
convergence in the first scenario might not relate to the same focal
point, the same substrate, and/or the same kinetic parameters. The
criteria used for assessment are not exactly identical, and many
features, such as the enzyme’s substrate specificity and specific
kinetics, are not taken into account. In other words, our knowledge
is still vague at this point. The intricacies of the differences in
gene sequences are overshadowed by the ambiguous term “similarity”
in the structures of proteins.
In the second scenario, divergence may not exclude the involvement
of other factors. For instance, Kimchi-Sarfaty et al. reported that
a “silent” SNP that does not change the amino acid sequence may
dramatically alter the substrate specificity of the enzyme [14, 15],
demonstrating that naturally occurring variations in synonymous
codons in a defined gene can give rise to a protein product with the
same amino acid sequence but different structural or functional
features. In addition, altered translation kinetics of mRNA might
affect the final protein conformation; this hypothesis is supported
by artificial site-directed silent mutagenesis of synonymous codons
(changing from infrequent to frequent) in certain genes [18, 19].
In addition to these hypotheses, cellular processes, including
alternative RNA splicing and posttranslational protein modification,
may create more than one protein product from a given sequence in
the genome, yet their regulation is poorly understood [20].
Thus, our current understanding provides a more concrete
picture, as follows:
One gene, one protein => several conformations => several functions
One gene (SNP) => one different protein or different conformation => different function
One gene (silent SNP, siSNP), one protein => different conformation => different function
One gene, several proteins => different conformations => multiple functions
In any case, it can be seen that one gene may give rise to more than
one protein product or function.
These hypotheses are technically very difficult to prove in the
absence of in vivo enzyme structural information, because numerous
quality-control mechanisms exist in the cellular environment to
eliminate incorrectly folded proteins. Nevertheless, they point to
the same blind spot: whether subtle changes in the genetic sequence
may alter the structure, and hence the function, of a protein. It is
now clear that the controversy will continue to puzzle us without
more descriptive information on the collective interaction between
an enzyme of a specific sequence and the structural diversity of its
substrates. That might be the central issue that the scientific
communities in all of the life sciences are facing in the 21st
century. It is here that genochemistry is positioned. Given the
nature of the problems to be faced and the volume of data to be
dealt with, a new strategy of integration is called for, and it
necessitates the construction of databases on the basis of
specifically designed experimental data and the use of computational
methods and bioinformatics tools. Fortunately, the tools,
methodologies, and technologies are largely available now. On the
basis of those, a framework for the genochemical approach is
proposed, as shown in Fig. 11.1.

[Figure 11.1: schematic. Sequencing and data mining yield sequence
profiles (A, C, ...); activity tests and data mining yield substromes
(A, C, ...). Both are digitized (D) into digital models of enzyme A,
enzyme B, enzyme C, ..., enzyme N, with promiscuous activity (P)
linking the enzymes.]

Figure 11.1 Illustration of the genochemistry framework: the digital
model of any living system is composed of digital modules of enzymes
built on the basis of experimental data and sequences acquired from
sequencing technology.

As mentioned previously, the number of enzyme types required
to maintain the basic functions in living systems is estimated to be
in the range of 2000–3000, which is a relatively small number
compared to the volume of genetic sequencing data. From the
viewpoint of a small molecule traveling through it, a living system
can be viewed as a machine comprising these enzyme functions.
Similarly, a genochemistry model consists of a series of enzyme
modules, for example, enzyme A: cytochrome P450; enzyme B: EH; etc.
Each module is represented as a digital model made up of the
enzyme’s essential sequences and functions. The model of enzyme A
is built on the basis of sequence profile A and substrome A through
digitization of experimental data. The promiscuous activity of
enzymes is taken into account. To make a genochemistry study
successful and useful, the experiments must be sufficiently fast and
must provide sufficiently detailed information and data sharable
with other -omics approaches. Consequently, it requires
high-throughput methods for screening; standardized data collections
of enzyme activity, allowing digitization; and molecular modeling
linking the enzyme’s sequence information to its substrome.

[Figure 11.2: schematic. Substrate -> enzyome 1 -> product 1;
substrate -> enzyome 2 -> product 2; substrate -> enzyome 3 ->
product 3.]

Figure 11.2 Illustration of genochemical pathways: a substrate can be
transformed to product 1 via enzyome 1 or product 2 via enzyome 2,
and so on.
In keeping with the genomic approach, we coin two terms here for
clarity of description: (1) substrome, which stands collectively
for all of the substrates of an enzyme, complementing terms such as
“substrate specificity,” “substrate scope,” and “substrate
profile,” and, reciprocally, (2) enzyome (by analogy with
proteome), which stands for a group of enzymes collectively involved
in catalyzing a cascade reaction of a specific substrate. The same
substrate can be transformed to product 1 by enzyome 1, product 2 by
enzyome 2, and so on (Fig. 11.2).
In this way, genochemistry differs from other -omics approaches
in its strategy. It divides the living system into constituent
modules. Each module comprises digital models of an enzyme (or
a family of enzymes), depending on the enzyme’s sequences and
functions. The modules can be studied and built independently, yet
they can function either individually or in synergy, forming cascade
reaction pipelines and networks (enzyomes) when the product of one
module serves as the substrate of another.
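The module and cascade idea can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the chapter's implementation: the class name `EnzymeModule`, the function `run_enzyome`, the representation of a substrome as a substrate-to-product mapping, and the generic compound classes used as examples are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EnzymeModule:
    """Hypothetical digital module: a sequence profile plus a substrome."""
    name: str
    sequence_profile: str  # placeholder for a real profile, e.g. an alignment
    substrome: Dict[str, str] = field(default_factory=dict)  # substrate -> product

    def transform(self, substrate: str) -> Optional[str]:
        """Return the recorded product, or None if the substrate is not in the substrome."""
        return self.substrome.get(substrate)


def run_enzyome(modules: List[EnzymeModule], substrate: str) -> List[str]:
    """Pass a substrate through the modules in order.

    The product of each module becomes the substrate of the next, which is
    how the text defines an enzyome. Returns the full path of compounds.
    """
    path = [substrate]
    current = substrate
    for module in modules:
        product = module.transform(current)
        if product is None:
            raise ValueError(f"{module.name} has no recorded activity on {current}")
        path.append(product)
        current = product
    return path


if __name__ == "__main__":
    # Generic compound classes stand in for real substrates.
    p450 = EnzymeModule("cytochrome P450", "profile A", {"alkene": "epoxide"})
    eh = EnzymeModule("epoxide hydrolase", "profile B", {"epoxide": "diol"})
    print(run_enzyome([p450, eh], "alkene"))
```

Keeping each module self-contained mirrors the point made above: modules can be built and tested independently, and a cascade only requires that one module's product appear in the next module's substrome.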
Unlike other -omics investigations, genochemistry cannot be
accomplished through any single technological platform. It is an
integrated strategy aimed at revealing the complex interactive
relationship between biodiversity and chemical structural diversity,
and both the complexity of the problems and the volume of
information to be dealt with are considerable.
When these modules are made available, it can ultimately be
expected that new substrate structures can be discriminated and
predicted from a given sequence of an enzyme (or a gene), and
vice versa. This is similar to the metabolic pathways that
biochemists built up throughout the 20th century. Such
relationships can be easily indexed with other phenotypes and
biomarkers, such as diseases, toxicity, and metabolites, from other
-omics approaches and traditional studies, making them widely useful
across the life sciences.

[Figure 11.3: schematic. Enzymes of diverse sequences converge
(convergence) on a substrate; an enzyme diverges (divergence) toward
diverse substrates (substrome).]

Figure 11.3 Three aspects to be envisioned in building genochemical
modules: convergence of sequences on a specific substrate, divergence
of the enzyme’s activity toward diverse substrates, and integration
of both.
To build models, three major aspects are to be envisioned:

• The convergence of a large sequence space on a specific
substrate
• The divergence of an enzyme (or a family of enzymes)
toward substrates of diverse structures
• The integration and digitization of both (Fig. 11.3)

In addition, a prerequisite is that the models must be progressive
and inclusive, so that newly added sequence and substrate
information will constantly improve their precision and accuracy.
Fortunately, current scientific advances have made numerous tools
available, and they are routinely used in biocatalysis and related
investigations. A vast volume of sequence data is available and can
be mined freely. Meanwhile, sequencing technology is improving
constantly and has become much faster and more cost effective than
it was just a few years ago. Currently, sequencing the whole human
genome takes only a few hours instead of the several years it used
to take [21]. Digitization of protein sequences from a variety of
species to build digital models of enzymes is a routinely used
strategy in bioinformatics. Molecular modeling is well developed and
widely used in the investigation of molecular recognition, for
example in virtual screening in pharmaceutical research. Both
convergence and divergence strategies are routinely used in
biocatalysis, in which a number of sophisticated tools and criteria
to measure and evaluate an enzyme’s properties have been developed.
These tools allow not only the proper characterization of enzymes
but also the optimization of enzymes through mutagenesis of their
genes and the elucidation of the mechanisms of enzyme-mediated
catalytic reactions. By integrating these tools, genochemistry is
expected to supply more detailed and descriptive information in
order to understand the relationship between enzymes and their
substromes and to annotate their genomic sequences. For
clarification, a list of terms and their definitions is given in
Table 11.1. The candidate technologies are briefly described through
explicit examples in the following sections.

Table 11.1 Terms and definitions in this chapter

Term              Definition
Genochemistry     The study of chemistry mediated by enzymes, relating
                  their sequence information to the structural
                  diversity of their substromes.
Enzyme            A protein with a specific sequence and a specific
                  catalytic function, for example, lipase EC 3.1.1.3
                  from Chromobacterium viscosum.
Enzyme variants   Enzyme mutants with sequence variations, similar to
                  the native enzyme.
Enzyme family     The same type of enzymes from various origins.
Enzyme model      A digital model built with various computational
                  methods based on either the protein structures or
                  the sequences.
Enzyme module     The practically functional model that represents the
                  enzyme properties and functions in an organelle or a
                  cell in a living system. An enzyme module is made of
                  constituent digital models or, collectively, all
                  digital models of the enzyme.
Enzyome           A group of enzymes involved in the cascade
                  transformations of a chemical. The product of one
                  enzyme is the substrate of the next.
Substrome         The collection of substrates of a specific enzyme
                  that have diversified structures. A substrome
                  provides information not only on the substrates but
                  also on the shape and space of the active site of
                  the enzyme.

11.3 Refining the Sequence Space to a Specific Function by Directed Evolution

Since the mid-1990s, Arnold and Reetz have pioneered a powerful
technology, namely directed evolution (DE). From the very beginning,
DE has played a central role in biocatalysis, owing to its
groundbreaking power in regulating and improving an enzyme’s
properties through sequence optimization. DE mimics natural
evolution in the test tube. Through recursive rounds of creation of
genetic sequence diversity and selection/screening, the enzyme’s
activities and properties are improved. Typically, genetic variants
are created from the gene(s) of an enzyme (or a family of enzymes)
by various methods of mutagenesis, either randomly or by design.
These variants are incorporated into plasmids and expressed in a
bacterial host, for example, Escherichia coli. When a dilute culture
of bacteria containing the genetic variants is plated on the surface
of solid media, single clones grow. Each clone contains an enzyme
encoded by a specific gene. The clones are then picked and incubated
separately to form a library of mutants; the mutant enzymes are
screened in whole cells or in crude preparations made from cell
lysates. High-throughput technologies are mandatory for each step of
the screening operation at this stage. The best clones obtained from
the selection/screening process are further mutated for the next
round of selection, or the next generation of evolution.
The process iterates, and the enzyme evolves.
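The iterate-and-select loop described above can be sketched in silico. This is a toy model only, under loudly stated assumptions: sequences are plain strings, mutagenesis is random point substitution, and `toy_assay` (counting matches to an arbitrary target string) is a hypothetical stand-in for a real activity screen. None of the names or parameters come from the source.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq: str, rng: random.Random, rate: float = 0.05) -> str:
    """Random point mutagenesis: each position is resampled with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else residue
        for residue in seq
    )


def toy_assay(seq: str, target: str = "MKVLAAGIW") -> int:
    """Hypothetical screen: number of positions matching an arbitrary target."""
    return sum(a == b for a, b in zip(seq, target))


def directed_evolution(parent, assay, rounds=10, library_size=200, seed=0):
    """Repeat: build a mutant library, screen it, carry the best clone forward."""
    rng = random.Random(seed)
    best, best_score = parent, assay(parent)
    history = [best_score]
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        top = max(library, key=assay)        # screening step
        top_score = assay(top)
        if top_score > best_score:           # selection step: keep only improvements
            best, best_score = top, top_score
        history.append(best_score)
    return best, history


if __name__ == "__main__":
    best, history = directed_evolution("A" * 9, toy_assay)
    print(best, history)
```

The loop never needs to know why a variant scores better, which is the sense in which directed evolution sidesteps the sequence-to-function problem; the cost is that every round requires a library-sized number of assays, hence the emphasis on high-throughput methods.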
Directed evolution circumvents our profound ignorance about
how a protein’s sequence encodes its function by using iterative
rounds of random mutation and artificial selection to discover
new and useful proteins. Proteins can be tuned to adapt to new
functions or environments by simple adaptive walks involving small
numbers of mutations. DE studies have provided new insight into
the relationship between sequence and function. DE has also shown
how silent mutations that are functionally neutral can set the stage
for further adaptation [22].
In one investigation conducted by Arnold’s group, P450BM3,
which mediates the oxidation of fatty acids, was evolved to accept
an extended substrome, including nonpolar alkanes, gaseous alkanes
and alkenes, and aromatic compounds (Fig. 11.4). The reaction types
were also expanded from simple hydroxylation to epoxidation, and
the sequence/substrome relationship is well established [22–25].
[Figure 11.4: chemical structures. P450BM3’s preferred substrates;
1. linear alkane; 2. gaseous alkane; 3. alkene and arene epoxidation;
4. synthetic intermediates (phenylacetic acid ester,
2-cyclopentylbenzoxazole); 5. drug compounds (propranolol,
buspirone).]

Figure 11.4 P450BM3 hydroxylation and epoxidation activities achieved
by directed evolution.

Dexterous use of methods of mutagenesis and selection/screening
makes DE an efficient and powerful strategy to optimize an enzyme’s
properties through convergence of genetic sequences. Currently, DE
technology based on test tube evolution experiments has been applied
to many biological systems, including enzymes, metabolic pathways,
genetic circuits, and ecosystems, and has created a great deal of
knowledge and many breakthroughs, moving scientific frontiers
substantially forward. A brief outline is given as follows:

• DE generates novel and useful enzymes and organisms for
applications in medicine and in alternative energy.
• DE allows us to study structure–function relationships free
from constraints of natural selection.
• DE helps to elucidate the principles of biological design.
• DE enables us to construct new biosynthetic pathways and
circuits for controlling gene expression and intercellular
signaling capabilities.

DE has been the most fruitful field in biocatalysis research. For
more detailed information, readers are referred to related reviews,
in particular the pioneering work of Arnold’s group and Reetz’s
group.
DE technology draws on many disciplines, including chemistry,
bioengineering, biochemistry, molecular biology, microbiology,
chemical engineering, applied physics, and, last but not least,
sequencing technology. This powerful technology can play a
central role in genochemistry investigations.

11.4 Linking Chemistry to -Omics with High-Throughput Screening Methods

High-throughput screening (HTS) methods are crucial for
genochemistry in order to acquire large, high-quality datasets for
revealing enzyme activity and for modeling. Enzymes are all
asymmetric by nature and possess active sites that are also
asymmetric in shape [25]. A quantitative description of such an
asymmetric cavity is still a challenge at present. Qualitative and
empirical terms such as large, medium, small, hydrophobic, and
hydrophilic are often used in molecular modeling. Depending on how
well they fit in the active-site pocket of an enzyme, the two
mirror-image enantiomers of a racemic substrate usually behave
differently, with few exceptions [26]. That is the origin of
enantioselectivity and the reason why enzymes can differentiate the
two enantiomers. For example, in the hydrolysis of an acylated amino
acid mediated by L-aminoacylase, the enzyme preferentially
hydrolyzes the L-enantiomer in the racemic mixture, leaving the
D-enantiomer unreacted and thereby transforming the two enantiomers
into different and separable forms. In this kinetic resolution
process, the two enantiomers react at different rates: the
L-enantiomer reacts faster than the D-enantiomer. The composition
of the unreacted substrate and of the product formed, as well as
their enantiomeric excesses (ee_s for the substrate and ee_p for
the product), varies as a function of the progress of the reaction.
Given enough reaction time, both enantiomers will be deacylated, and
ee_p will reach zero. Sih and coworkers developed equations for the
quantitative description of this type of kinetic resolution process
and identified its governing factor, which is often referred to as
the enantioselectivity, or simply the E value. The relationship
between the variables is expressed in the following equations
[27, 28] (Fig. 11.5).
\[
E = \frac{\ln[(1 - c)(1 - ee_s)]}{\ln[(1 - c)(1 + ee_s)]}
\quad\text{or}\quad
E = \frac{\ln[1 - c(1 + ee_p)]}{\ln[1 - c(1 - ee_p)]}
\]

[Scheme: racemic N-acetyl amino acid, R-CH(NHAc)-COOH, + H2O with
L-aminoacylase gives the amino acid, R-CH(NH2)-COOH (product, ee_p),
plus the unreacted N-acetyl amino acid, R-CH(NHAc)-COOH (substrate,
ee_s).]

Figure 11.5 Sih’s quantitative expression of enantioselectivity, the
E value.

E is a constant for a given enzyme and substrate under given
conditions. Along with KM and kcat, it has been the most important
criterion for characterizing and assessing enzymes and the reactions
they catalyze. From a genochemical point of view, the E value is a
very interesting factor because it contains not only kinetic
information but also structural information (symmetry) about the
substrate and the interaction between the substrate and the enzyme.
In practice, determining the E value, which depends on measuring
ee_s (or ee_p) at a certain degree of conversion (c), is a tedious
task. It requires physical separation of the two enantiomers for
quantification, which often relies on chromatography with chiral
columns. Measuring ee in a high-throughput manner is therefore a
serious challenge.
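To make the E value concrete, the sketch below evaluates the two standard forms of Sih’s equations, E = ln[(1 − c)(1 − ee_s)] / ln[(1 − c)(1 + ee_s)] and E = ln[1 − c(1 + ee_p)] / ln[1 − c(1 − ee_p)], and checks them against a simulated resolution. The simulation assumes an irreversible reaction starting from a racemate in which the fast-reacting enantiomer is consumed E times faster than the slow one; the function names and the example numbers are illustrative assumptions, not values from the source.

```python
import math


def e_from_substrate(c: float, ee_s: float) -> float:
    """E value from conversion c and the ee of the remaining substrate (both as fractions)."""
    return math.log((1 - c) * (1 - ee_s)) / math.log((1 - c) * (1 + ee_s))


def e_from_product(c: float, ee_p: float) -> float:
    """E value from conversion c and the ee of the product (both as fractions)."""
    return math.log(1 - c * (1 + ee_p)) / math.log(1 - c * (1 - ee_p))


def simulate_resolution(e_value: float, slow_remaining: float):
    """Return (c, ee_s, ee_p) for a racemate at a chosen point of the reaction.

    `slow_remaining` is the fraction of the slow enantiomer still unreacted.
    With competing first-order consumption, the fast enantiomer's remaining
    fraction is slow_remaining ** e_value.
    """
    fast_remaining = slow_remaining ** e_value
    c = 1 - (fast_remaining + slow_remaining) / 2
    ee_s = (slow_remaining - fast_remaining) / (slow_remaining + fast_remaining)
    ee_p = (slow_remaining - fast_remaining) / (2 - slow_remaining - fast_remaining)
    return c, ee_s, ee_p


if __name__ == "__main__":
    c, ee_s, ee_p = simulate_resolution(e_value=20.0, slow_remaining=0.8)
    print(f"c = {c:.3f}, ee_s = {ee_s:.3f}, ee_p = {ee_p:.3f}")
    print(e_from_substrate(c, ee_s), e_from_product(c, ee_p))
```

Both formulas recover the E used in the simulation, which shows why a single pair of measurements (c with ee_s, or c with ee_p) suffices in principle. In practice the logarithms become ill-conditioned near c = 0 and c = 1, so measurements at intermediate conversion are generally preferred.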
To overcome this bottleneck, Reetz et al. devised an HTS method
using a pseudoracemate, an equimolar mixture of one enantiomer and
the isotopically labeled form of the other, its pseudo-enantiomer
(G: deuterium, Fig. 11.6). The pseudo-enantiomers in the substrate
differ in mass but are the same (or nearly the same) in structure
and size, minimizing isotope effects on the substrate specificity.
When the reaction occurs, the products produced from the two
pseudo-enantiomers differ from each other in mass, allowing easy
quantification by mass spectrometry [29, 30].

[Figure 11.6: reaction scheme. The labeled ester (OAcG3) is
hydrolyzed to the alcohol and G3AcOH; the unlabeled pseudo-enantiomer
(OAc) is hydrolyzed to the alcohol and AcOH. G: 1. D; 2. F.]

Figure 11.6 Kinetic resolution of pseudo-enantiomers. G is deuterium
in the screening method and fluorine in the selection method.
As mass spectrometry is a fast technology, it allows rapid
determination of ee_p and the conversion, and hence of the relative
reaction rates of the two pseudo-enantiomers. Subsequently, the E
values can be obtained in a high-throughput manner. In this way,
the authors were able to determine 1000 ee values per day and to
screen large libraries of variants for the evolution of
enantioselective enzymes. The method presents great potential and is
ideal for establishing the relationship between the sequences of
enzymes and their substrates.
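The arithmetic behind the mass-spectrometric readout is simple enough to sketch. The example below assumes that the two mass-distinguishable species ionize equally well, so that their signal intensities are proportional to their amounts; that assumption, the function names, the sign convention, and the well data are all hypothetical and are not taken from the cited work.

```python
from typing import Dict, List, Tuple


def ee_from_intensities(i_unlabeled: float, i_labeled: float) -> float:
    """Signed ee from two MS intensities.

    Positive values mean the unlabeled species dominates. Assumes equal
    ionization response for the two isotopologues.
    """
    total = i_unlabeled + i_labeled
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return (i_unlabeled - i_labeled) / total


def rank_wells(plate: Dict[str, Tuple[float, float]]) -> List[Tuple[str, float]]:
    """Rank wells of a screening plate by ee, most selective for the unlabeled form first."""
    scored = [(well, ee_from_intensities(*signals)) for well, signals in plate.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical intensities (unlabeled, labeled) for three library variants.
    plate = {"A1": (90.0, 10.0), "A2": (55.0, 45.0), "A3": (20.0, 80.0)}
    for well, ee in rank_wells(plate):
        print(well, round(ee, 2))
```

Because each well needs only two peak intensities rather than a chiral separation, the measurement reduces to one mass spectrum per variant, which is what makes throughput on the order of the 1000 ee values per day quoted above plausible.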
The authors also devised a selection method by further exploiting
the concept of pseudo-enantiomers. They labeled one enantiomer with
fluorine instead of deuterium (G: F, Fig. 11.6). With the
pseudo-enantiomers added as substrates to the growth media in
Petri dishes, the variants that favor the fluorine-labeled
enantiomer are killed because the product they release,
trifluoroacetic acid, is strongly toxic to the bacteria. The
variants that favor the nonlabeled enantiomer, however, live and
grow as single clones, benefiting from the acetic acid they release.
Thus, the preference between the two enantiomers becomes a matter of
life or death, allowing easy selection of the variants with the
desired enantiopreference. In this way, the selection method greatly
reduces the number of screening measurements, sorting useful
variants far more efficiently [29–32]. Combined with sequencing
technologies, these screening and selection methods, and the ideas
behind them, will certainly be useful in linking chemistry to other
-omics approaches, as required by genochemistry investigations.

11.5 Finding Large Sequence Space of a Specific Function from Microbial Diversity

As estimated by Cowan et al., there are 10⁶–10⁸ strains of
microorganisms in nature, and the majority of this rich resource
(>90%) has not been investigated [33]. In particular, microorganisms
from extreme environmental conditions or from evolutionarily
unrelated species present greater gene sequence variation. For the
modules built through the genochemical approach to be inclusive and
practically useful, a large sequence space is a prerequisite.
Therefore, the screening of large collections of microorganisms and
data mining from gene databases (e.g., GenBank) are instrumental.
This is also a process of discovering new selectivities, new
activities, and new enzymes within the large sequence space.
Biocatalysis has benefited from the systematic use of nonnatural
substrates of diverse structures and of various physical instruments,
which can serve as a strategy in genochemical approaches to map the
shape and space of enzymes’ active sites [33, 34]. Such an approach
may be more successful if enhanced by HTS and by computational
methods that can deal with the large numbers of substrates tested.
As discussed previously, convergence and divergence strategies are
both used routinely, and often alternately, in biocatalysis.
Typically, for a specific substrate of interest, an assay method is
designed and a collection of enzymes or cultures of microorganisms is
screened. Separation technologies, such as the various
chromatographic techniques, especially those using chiral columns,
are commonly used to monitor the conversion of the substrate and the
formation of intermediates and products and to guide their isolation.
Various technologies are available for determining the structures and
physical properties of the intermediates and products, including
nuclear magnetic resonance (NMR), mass spectrometry, infrared (IR)
and UV-Vis spectroscopy, X-ray crystallography, and circular
dichroism. The following example shows how these technologies are
used in a typical biocatalysis study.
Lipases are the most studied enzymes in biocatalysis. They are
known to have a broad substrate scope and high enantioselectivity.
Zhang et al. used enzymes to resolve a bulky industrial chemical,
compound 1, (S)-(–)- and
(R)-(+)-2,3-dihydro-3-(4-hydroxyphenyl)-1,1,3-trimethyl-1H-inden-5-ol
[2]. They synthesized the 1-diester as substrate and screened 21
hydrolases. The enzymes, although similar in structure, differed
remarkably in their activity toward the nonnatural structure. Most of
them actively removed the ester groups with low or no selectivity,
yielding a mixture of 1-diesters, 1-monoesters, and the two
enantiomers of 1 in various compositions. C. viscosum lipase (CVL),
in contrast, catalyzes only the removal of the acyl group at the α
position through path a, among all possible paths, with surprisingly
high chemo-, regio-, and enantioselectivity, yielding the
(S)-1-4'-monoester and unreacted (R)-1-diester as products
(Fig. 11.7a).
In this work, enzyme activities were tested by automated titration,
and E values were determined by high-performance liquid
chromatography (HPLC) on chiral columns. The structures were
elucidated by proton nuclear magnetic resonance (¹H NMR), 2D
correlation spectroscopy (COSY), and gas chromatography–mass
spectrometry (GC-MS). The regioselectivity and chemoselectivity were
determined by comparing the chemical shifts of corresponding protons
in the diol, monoester, and diester. The chemical shifts of the
protons ortho to a phenol group fall in the range δ 6.47–6.73 ppm,
and those of the protons ortho to an ester group in the range δ
6.77–7.00 ppm (Fig. 11.7b). The absolute configuration
(enantiopreference) of the resolved product was assigned
independently by circular dichroism and X-ray crystallography. The
experimentally observed features of CVL were confirmed by molecular
modeling using Swiss-Model and Discover (Biosym/MSI) (Fig. 11.7c)
[35].
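
The way the two chemical shift windows distinguish a free phenol from an esterified one can be sketched in a few lines. This is only an illustration of the assignment logic under the stated ranges. The function name is hypothetical, and the sketch assumes the shift passed in has already been identified as belonging to an ortho proton.

```python
# Shift windows (ppm) for ortho protons, as quoted in the text.
PHENOL_ORTHO_H = (6.47, 6.73)
ESTER_ORTHO_H = (6.77, 7.00)


def classify_ortho_h(shift_ppm: float) -> str:
    """Assign an ortho-proton shift to a free phenol or an ester ring."""
    if PHENOL_ORTHO_H[0] <= shift_ppm <= PHENOL_ORTHO_H[1]:
        return "phenol"
    if ESTER_ORTHO_H[0] <= shift_ppm <= ESTER_ORTHO_H[1]:
        return "ester"
    return "unassigned"


# Hypothetical readings from a monoester spectrum.
for shift in (6.50, 6.92):
    print(shift, classify_ortho_h(shift))
```

A monoester gives one ring in each window, so which ring falls in the ester window reveals which acyl group the enzyme left in place.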
The following observations are noteworthy:
• Regioselectivity: CVL favors the acyl group on the more hindered
phenol.
• Chemoselectivity: Monoesters are not substrates.
• Enantioselectivity: It is controlled by the remote acyl group.
• The E value depends on the length of the side chain.
The maximum was found with butyl, and E decreases with
shorter or longer side chains [2, 35].
This example illustrates the power of current biocatalysis
technology and also reveals the limits of our understanding of
enzymes and the frontier of current knowledge. The main challenge
lies in increasing the throughput of this process. Unlike in the case
of DE, the screening of natural microbial diversity and nonnatural
chemicals suffers from the heterogeneity of growth conditions and the
lack of high-throughput detection methods.

[Figure 11.7 structures not reproduced. Panel (a): paths a–d and
a'–d' from the (S)- and (R)-1-diesters through the 4'- and
5-monoesters to (S)-1 and (R)-1. Panel (b): annotated ¹H NMR chemical
shifts of the diester, diol, and monoester. Panel (c): active-site
model showing the large pocket, the medium pocket, and Ser87.]

Figure 11.7 (a) An enantio-, regio-, and chemoselective kinetic
resolution of indane monomer by CVL, (b) determination of the
regioselectivity by ¹H NMR, and (c) molecular modeling of the
enantio-, regio-, and chemoselective kinetic resolution of indane
monomer mediated by CVL.

11.6 Linking Sequences to Substromes at the Molecular Level

Biocatalysis has built a great wealth of knowledge, which has
benefited from the systematic use of nonnatural chemicals as
substrates and the screening of enzymes from diverse origins, in
particular from microbial diversity, owing to the persistent efforts
of scientists in the biocatalysis community. At the molecular level,
the catalytic function of enzymes is governed by their 3D structures
and is intimately linked to the substrates the enzymes transform.
Substrate specificity, selectivity, and kinetics are probably the
best ways to distinguish the interspecies and interindividual
differences between enzymes with genetic variation. This is
particularly true when structural information on the enzyme is
absent.
With the genochemical approach, the interactive relationship
between an enzyme and its substromes is studied and correlated at the
molecular level, that is, at the level of amino acid sequence and
substrate structure, with annotation of the genetic sequence
information and the origins. In addition, the relationship can easily
be indexed to biomarkers and to the physiological conditions of the
hosts. The approach therefore provides a common ground and a
technological platform for creating knowledge that can be shared with
many other disciplines and approaches, forming a solid foundation for
all the life sciences. We take the EH family as an example to
describe the principles, with emphasis on the various technologies
used in its study.
Cytochrome P450 enzymes and EHs are ubiquitous in living systems.
Together, they catalyze the oxygenation and subsequent hydrolysis of
exogenous and endogenous chemicals in the process of degradation. In
humans, they are also the key enzymes responsible for drug metabolism
and/or activation. Their substromes include a broad range of natural
and nonnatural chemicals, such as steroids and aliphatic and aromatic
compounds. The reactions catalyzed by cytochrome P450s often occur at
nonactivated positions with some degree of regio- and
stereoselectivity, which is seldom predictable. According to the
mechanism by which molecular oxygen is introduced into the substrate,
oxygenases are classified into monooxygenases and dioxygenases. The
former insert one atom of molecular oxygen into a C–H bond and/or a
C–C bond, with the other atom ending up in water; the latter
incorporate both oxygen atoms onto two vicinal atoms of the
substrate. Monooxygenases and EHs often act sequentially in cascade
reactions, rendering the products more hydrophilic so that they are
eventually degraded. Overall, these enzymes have three main
functions: detoxification, catabolism, and regulation of signaling
molecules. The degradation, however, is not always beneficial to the
health of the host (see the discussion of hsEH below).

11.6.1 Biocatalytic Study of EHs


Although an EH was reported more than 30 years ago in the
plant-pathogenic fungus Fusarium solani pisi [36], the major
breakthroughs came from findings in Aspergillus niger [37, 38].
Zhang et al. incubated A. niger with nonnatural derivatives of
geraniol under an atmosphere of isotopically labeled oxygen (¹⁸O₂).
The biotransformation yielded the (S)-diol (>95% ee) at pH 2 and the
(R)-diol (>95% ee) at pH 7. Both products show [M+2] peaks in mass
spectrometry, indicating that each contains one isotopic oxygen atom.
The peaks of the daughter ions formed by fragmentation show that the
isotopic oxygen atom is on C6 in the (S)-diol and on C7 in the
(R)-diol (Fig. 11.8).
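
The reasoning behind the fragment analysis can be sketched as follows. The m/z values (309 for M+2, and the fragment pairs 250/59 and 248/61) are read from Fig. 11.8b. The interpretation that the small fragment carries the C7 oxygen, and is therefore 2 mass units heavier when C7 bears the label, is an assumption consistent with the assignment stated in the text. The function name is hypothetical.

```python
M_PLUS_2 = 309             # labeled molecular ion, from Fig. 11.8b
C7_FRAGMENT_UNLABELED = 59  # small fragment with 16-O on C7
ISOTOPE_SHIFT = 2           # mass difference between 18-O and 16-O


def label_position(small_fragment_mz: int, large_fragment_mz: int) -> str:
    """Locate the single 18-O atom from a complementary fragment pair."""
    if small_fragment_mz + large_fragment_mz != M_PLUS_2:
        raise ValueError("fragments do not add up to the M+2 ion")
    if small_fragment_mz == C7_FRAGMENT_UNLABELED:
        return "C6"  # C7 fragment is unlabeled, so the label is on C6
    if small_fragment_mz == C7_FRAGMENT_UNLABELED + ISOTOPE_SHIFT:
        return "C7"  # C7 fragment carries the label
    raise ValueError("unexpected fragment mass")


print(label_position(59, 250))  # fragment pair of the (S)-diol
print(label_position(61, 248))  # fragment pair of the (R)-diol
```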
The result unambiguously demonstrated the two-step cascade
mechanism of the dihydroxylation. First, one atom of molecular ¹⁸O₂
is incorporated into the double bond by the monooxygenase with high
stereoselectivity, yielding the 6(S)-epoxide (>95% ee). Second, at pH
7.0, the epoxide intermediate is attacked at the less substituted
carbon by an EH (see also the discussion of the EH mechanism later)
and undergoes a trans ring opening with inversion of the absolute
configuration. The product of the first enzyme serves as the
substrate of the second (an enzyome consisting of two enzymes). The
work led to the discovery of an EH in a strain of A. niger and
spurred enthusiasm for more research in the area [39, 40].
[Figure 11.8 structures not reproduced. Panel (a): A. niger and ¹⁸O₂
convert the geraniol derivative (R: -CONHPh) to the ¹⁸O-labeled
epoxide, which gives the 6(S)-diol by acidic opening at pH 2.0 and
the 6(R)-diol by the epoxide hydrolase at pH 7.0. Panel (b): fragment
ions at m/z 250 and 59 for the 6(S)-diol and at m/z 248 and 61 for
the 6(R)-diol; M+2 = 309 for both.]

Figure 11.8 (a) Stereoselective dihydroxylation of the nonactivated
double bond of geraniol by A. niger. (b) Determination of the
position of isotopic labeling by mass spectrometry.

The substrome of the two-enzyme system includes the
N-phenylcarbamate of citronellol and geraniol coumarin [41, 42].
Following this finding, EHs from diverse origins have been reported
and have become a focus of biocatalysis. The gene and cDNA for the EH
from A. niger were isolated by inverse polymerase chain reaction
(PCR). Recombinant expression of the isolated cDNA in E. coli
yielded a fully active EH with characteristics similar to those of
the fungal enzyme [43]. The substrate specificity and regio- and
enantioselectivity of this family of enzymes have been studied
against a variety of substrates, and a rich body of information has
been built [39–51].
EH from A. niger is now commercially available.
The crystal structure of the EH from A. niger was solved at 1.8 Å
resolution. It is the first structure of an EH with strong
relationships to the most important enzyme of human epoxide
metabolism, the microsomal EH. It has been used as a template in
bioinformatics for homology modeling of the EH family (see later).

11.6.2 Pharmacological Study of EHs


In parallel, Hammock et al. pioneered the study of EHs in insects,
animals, and humans. This sustained effort has established hsEH as a
therapeutic target related to several conditions, such as
inflammation, cardiovascular diseases, and hypertension [44–46]. hsEH
metabolizes a variety of epoxides to the corresponding vicinal diols.
It plays its physiological role through small molecules in a rather
complex way: cytochrome P450 enzymes transform their natural
substrome (endogenous substrates), including arachidonic and linoleic
acids, into various biologically active compounds, such as
epoxyeicosatrienoic acids (EETs) or hydroxyeicosatetraenoic acids
(HETEs) and epoxyoctadecenoic acids (EpOMEs), respectively. EETs and
EpOMEs are further metabolized by sEH to their corresponding diols,
dihydroxyeicosatrienoic acids (DHETs; also known as DiHETs) and
dihydroxyoctadecenoic acids (DiHOMEs). Among these metabolites, EETs
are endothelium-derived hyperpolarizing factor candidates that
mediate vascular relaxation responses and possess anti-inflammatory
properties. Inhibition of sEH would increase cellular EETs and
decrease the inflammatory effects of acute exposure to endotoxin
(i.e., lipopolysaccharide [LPS]) (Fig. 11.9).
[Figure 11.9 structures not reproduced: 14,15-epoxyeicosatrienoic
acids and 14,15-dihydroxyeicosatrienoic acids; epoxyoctadecenoic
acids and hydroxyoctadecenoic acids.]

Figure 11.9 Endogenous substrates of human soluble epoxide hydrolase
and the products into which the enzyme transforms them.

The EH family is classified in the α-/β-hydrolase fold family
of proteins [47, 48]. Most of the enzymes in this family are
characterized by a nucleophile–histidine–acid catalytic triad and
have a two-step mechanism involving the formation of a covalent
intermediate.

11.6.3 Mechanistic Study of EHs

Several mechanisms of EH-mediated epoxide ring opening have been
proposed, namely covalent, noncovalent, and one-step concerted
mechanisms.

(a) Ring opening of styrene oxide by the EH from Agrobacterium
radiobacter proceeds in two steps:
Step 1: Nucleophilic attack of Asp107, yielding a covalent
intermediate
Step 2: Hydrolysis of the covalent intermediate by water,
assisted by His275 (Fig. 11.10a)
(b) Leukotriene A4 hydrolase is well known to act through a
noncovalent catalytic mechanism. Ring opening is facilitated by a
zinc atom (Fig. 11.10b).

Besides the above two mechanisms, a protein crystallographic
study showed that limonene EH has a novel fold and acts through
a one-step mechanism involving the protonation of the epoxide ring
by aspartate 101 with concomitant attack of a water molecule on a
carbon atom of the epoxide ring [49].
[Figure 11.10 structures not reproduced. Panel (a): two-step covalent
mechanism (Step 1 and Step 2) involving Asp107, His275, and Asp248.
Panel (b): zinc-assisted noncovalent mechanism.]

Figure 11.10 Reaction mechanisms of epoxide-converting enzymes.
Excerpt from Current Opinion in Biotechnology.

Among the mechanisms proposed, the covalent mechanism is the best
studied and is discussed below in more detail. It is supported by
several studies [49–51]. Lacourciere and Armstrong demonstrated the
formation of a covalent intermediate for the microsomal epoxide
hydrolase (mEH) with a single-turnover experiment (excess of enzyme)
in H₂¹⁸O, showing that the ¹⁸O was incorporated not into the glycol
formed but into the protein. A second step was shown to incorporate
the ¹⁸O into the product, even in H₂¹⁶O. Further evidence was gained
through the isolation of the covalent intermediates of sEH and mEH
[50, 51]. Chemical characterization of the enzyme–product
intermediate indicated a structure consistent with an α-hydroxyalkyl
enzyme [47].
Sequence analysis showed that both the mammalian mEH and
sEH are similar to a bacterial haloalkane dehalogenase and other
related proteins, suggesting that both EHs have a similar mechanism
to the bacterial enzyme [52]. Interestingly, the origin of the hsEH
gene is thought to be the result of an early fusion of genes encoding
these two bacterial enzymes [53]. For more detailed information
about the mechanistic aspects of EH-mediated epoxide ring opening,
the readers are referred to Refs. [54, 55].

11.6.4 What We Have Learned from the Studies of EH


The convergent study of EHs from diverse origins has produced
valuable knowledge about this family of enzymes. As a result, the
search for inhibitors of hsEH now presents a real opportunity, not
only to enrich our understanding of the role of this important enzyme
in metabolism but also to find therapeutic agents directly beneficial
to human health. It is noteworthy that these remarkable advances are
the result of several decades of effort. With a global vision and a
systematic approach, genochemistry is expected to reveal the
relationships between enzymes and their substromes on a molecular
basis more efficiently, serving as a powerful tool for the life
sciences.

11.6.5 Technologies with Potentials in Genochemistry Approach

Numerous technologies, such as NMR, IR, circular dichroism, UV-Vis,
fluorescence, X-ray crystallography, mass spectrometry, and various
assay methods, have been used at various stages of studies in
biocatalysis, biochemistry, pharmacology, etc., from the discovery
of new types of enzymes to the determination of catalytic
mechanisms. In particular, labeling techniques with stable isotopes
coupled with tracing by mass spectrometry have shown great
potential. They not only allow the qualitative determination of
reaction mechanisms [38, 56] but also serve as quantitative methods
for HTS.
The success of the genochemical approach will rely on integrating
these technologies, which biocatalysis uses with dexterity to create
useful data, and on coupling them further with computational
methods.

11.7 Correlating with Computational Methods

Our knowledge of biocatalysis is both deep and broad. However,
segmented biocatalysis data do not by themselves constitute an
understanding of living systems. To serve as a technology platform
for genochemistry and as tools for gaining insight into the role of
enzymes and the reactions they carry out in living systems, the
activities of enzymes toward their substromes acquired from
biocatalysis studies must first be correlated and integrated not only
with the enzymes’ structures and the structural profiles of their
substromes but also with their primary sequences and genetic
variations, including SNPs, in the context of genochemical networks
(corresponding to biochemical pathways). Bearing in mind the vast
biodiversity and individual variation as well as the structural
diversity of substromes, the technologies traditionally used in
biocatalysis studies must be enhanced to create useful data of large
volume and high quality, in line with all other -omics approaches.
They must also be empowered by modeling through digitization and
computational methodologies, in addition to the building of large
databases [57, 58].
To make sense of the models and to make them comprehensive
tools that are practical and useful, they must (1) be provable with
experimental data, (2) be predictive for new substrate and new
activity [59], and (3) be progressive and inclusive such that accuracy
and precision can be improved constantly with the addition of new
sequences, new protein crystal structures, and new substrates, while
taking into account the fast and growing volume of sequence data
from biology and a number of chemicals.
Sequencing technology is constantly improving, and the amount of
sequence data has increased exponentially. In contrast, the increase
in the number of resolved protein structures has been much slower and
seems to have reached a plateau. It has been about 6000 per year
since 2006, for a total of 60,000 as of October 2009, compared with
150 billion in gene sequence data. Consequently, the majority of
genetic sequences and the functions of their protein products must be
studied in the absence of structural information on the proteins.
Consistent with this trend, bioinformatics based on computational
methodology has developed tremendously. An exhaustive discussion of
these approaches is beyond the scope of this review. The interested
reader is referred to related books and reviews [60].
Two major computational approaches are discussed briefly through
examples: first, molecular modeling based on protein crystal
structures [61] and, second, machine-learning approaches based on
statistics. The former has been used to map the enzyme active site
through docking with substrate transition-state analogues. This
approach is used routinely for virtual screening in pharmaceutical
research and also for predicting enzyme substrate specificity and
selectivity [62]. The latter has been used for the annotation of
genetic sequences in bioinformatics. Trained on positive and negative
datasets, the computer acquires the capacity to discriminate and
recognize new enzymes among unknown sequences. The accuracy of the
models depends on the extraction of feature vectors and on the size
of the datasets. This capability can be exploited to correlate an
enzyme’s function and structure in genochemistry studies. Numerous
methods are available, including multisequence alignment, neural
networks, hidden Markov models, support vector machines, and
position-specific iterated BLAST (PSI-BLAST).
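
The idea of training on positive and negative datasets can be illustrated with a deliberately simple sketch. It is not one of the methods listed above: it uses amino acid composition as the feature vector and a nearest-centroid rule as the classifier, both chosen here only for brevity. The sequences are hypothetical toy strings, not real enzymes.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq: str) -> list[float]:
    """Feature vector: fraction of each amino acid in the sequence."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def centroid(vectors: list[list[float]]) -> list[float]:
    """Component-wise mean of a set of feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def distance(u: list[float], v: list[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def train(positives: list[str], negatives: list[str]):
    """Return the centroids of the positive and negative training sets."""
    return (centroid([composition(s) for s in positives]),
            centroid([composition(s) for s in negatives]))


def predict(model, seq: str) -> bool:
    """True if the sequence lies closer to the positive centroid."""
    positive_centroid, negative_centroid = model
    x = composition(seq)
    return distance(x, positive_centroid) < distance(x, negative_centroid)


# Hypothetical toy training data.
model = train(positives=["GHSDGSHGDS", "HGSDGGSHDG"],
              negatives=["KKLLEEKLEK", "LEKKLLEEKL"])
print(predict(model, "GSHDGHSGDA"), predict(model, "KLEEKLLKEE"))
```

In practice the feature extraction and the size of the training sets matter far more than the classifier, which is the point made in the text about the accuracy of such models.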
Pleiss and coworkers developed a high-throughput method for
in silico biochemical profiling of enzyme families. It was based
on covalent docking of potential substrates into the binding sites
of target enzymes. The method has been tested by systematically
docking transition-state analogous intermediates of 12 substrates
into the binding sites of 20 α-/β-hydrolases from 15 homologous
families. One hundred thirty-seven crystal structures were included
in the analysis. The modeling results in general reproduced
experimental data on substrate specificity and stereoselectivity.
It provides a time-efficient and cost-saving protocol for virtual
screening to identify the potential substrates of the members of a
large enzyme family from a library of molecules [60].
Feenstra et al. developed and applied new methods of modeling
to predict the substrate specificity and regioselectivity of large
protein families (α-/β-hydrolases, P450 monooxygenases) on the
basis of the biochemical properties of enzymes, like stability, activity,
stereoselectivity, and substrate specificity [63].
Pleiss and coworkers developed a method based on the analysis
of multisequence alignments of homologous protein families to
predict rare codons with a potential impact on protein expression.
The method was able to predict the functionally relevant codons,
which had been determined experimentally, in proteins that bind fatty
acids and in chloramphenicol acetyltransferase. The analysis of 16
homologous protein families belonging to the α-/β-hydrolase fold
showed that functional rare codons share no common location with
respect to tertiary and secondary structure [64]. These methods hold
great promise owing to their versatility in tackling a broad array of
challenging problems.
Computational methods have recently made tremendous progress in
predicting the reactivity of enzymes from their structural features,
and work in this field was recognized with the 2013 Nobel Prize in
Chemistry. These methods rely either on the physical properties of
local atoms or on layer-by-layer grouping using the protein’s
structural, evolutionary, and functional information. For instance,
the structural classification of proteins (SCOP) method uses
sequence, structural, and functional phylogeny to classify
evolutionary families [65]. The majority of the trees drawn by these
methods show a very high degree of consistency; however, they are not
always consistent with the SCOP groupings at the family level,
especially with respect to function [66].

11.8 Problems That Genochemistry Can Potentially Tackle

Recently, questions were raised about the effects of silent SNPs on
substrate specificity [14–19]. How can rare codons that do not alter
the amino acid sequence still affect enzyme activity? Is it through
posttranslational kinetics or protein folding?
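
What makes a SNP silent can be made concrete with a short sketch that checks whether a single-base change alters the encoded amino acid. It uses the standard genetic code. The function name and the nine-base coding sequence are hypothetical and chosen only for illustration.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def is_silent(cds: str, position: int, new_base: str) -> bool:
    """True if replacing the base at a 0-based position of an in-frame
    coding sequence leaves the encoded amino acid unchanged."""
    start = position - position % 3
    codon = cds[start:start + 3]
    offset = position % 3
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    return CODON_TABLE[codon] == CODON_TABLE[mutated]


cds = "ATGGGTGAA"  # hypothetical sequence encoding Met-Gly-Glu
print(is_silent(cds, 5, "C"))  # GGT -> GGC, both glycine
print(is_silent(cds, 8, "C"))  # GAA -> GAC, glutamate to aspartate
```

A check of this kind identifies the silent changes but says nothing about codon rarity or folding, which is exactly the gap the questions above point to.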
In addition, cellular processes, including alternative RNA splicing
and posttranslational protein modification, may create more than one
protein product from a given sequence in the genome, yet their
regulation is poorly understood [20].
These questions are difficult to investigate with other
technologies because in vivo structural and conformational
information on the enzyme is lacking. However, an enzyme’s features
are linked to its activity, which can be revealed by its substrates,
products, and intermediates. Thus, the genochemical approach, built
on state-of-the-art biocatalysis processes and empowered by the
computational methodologies described in this chapter, might supply a
unique approach to these challenging problems [22].

11.9 Conclusion

With the 21st century and the genomic era, the availability of
vast volumes of sequence data has opened new avenues for research in
the life sciences. Many new disciplines have emerged in the -omics
cascade, echoing this trend. However, gaps remain in the study of the
interaction between large biomolecules and small molecules, in
particular between enzymes and their substrates and inhibitors, which
was a central focus during the last century and will remain so in the
21st century. We refer to such a study as genochemistry. Its subject
is the chemistry mediated by enzymes, correlating the sequences of
enzymes with their substromes. Genochemistry is built on the
knowledge and state-of-the-art technologies employed in biocatalysis,
biochemistry, and bioinformatics. It can serve as a comprehensive
strategy to tackle the challenges that the scientific communities at
the junction of the life sciences face today.
In this review, we have described such a perspective, distilled
from the state of the art, with emphasis on potential technologies.
It is not, however, an exhaustive survey of technologies, because
such a survey would itself require several volumes of reviews.
Skilled readers may easily find opportunities to contribute their
expertise to this promising area, strengthening the foundation of
genochemistry.

Acknowledgments

The authors thank Dr. Allan Svendsen for helpful discussions. This
work was supported in part by IUPAC Project 2009-021-3-300
(http://www.iupac.org/web/ins/2009-021-3-300).

References

1. Reetz, M. T. (2004). Controlling the enantioselectivity of enzymes by
directed evolution: practical and theoretical ramifications, Proc. Natl.
Acad. Sci. U S A, 101, pp. 5716–5722.
2. Zhang, M., and Kazlauskas, R. (1999). First preparation of
enantiopure indane monomer, (S)-(-)- and
(R)-(+)-2,3-dihydro-3-(4'-hydroxyphenyl)-1,1,3-trimethyl-1H-inden-5-ol,
via a unique enantio- and regioselective enzymatic kinetic resolution,
J. Org. Chem., 64, pp. 7498–7503.
3. Pasteur, M. L. (1858). Memoire sur la fermentation de l’acide tartarique,
C. R. Hebd. Seances Acad. Sci., 46, pp. 615–618.
4. Federsel, H.-J. (1994). The impact of chirality on drug manufacturing on
the industrial scale, Endeavour, 18, pp. 163–172.
5. Pollard, D. J., and Woodley, J. M. (2006). Biocatalysis for pharmaceutical
intermediates: the future is now, Trends Biotechnol., 25, pp. 66–74.
6. Aleu, J., Bustillo, A. J., Hernandez-Galan, R., Collado, I. G. (2006).
Biocatalysis applied to the synthesis of agrochemicals, Curr. Org. Chem.,
10, pp. 2037–2054.
7. Kim, J., Jia, H., and Wang, P. (2006). Challenges in biocatalysis for
enzyme-based biofuel cells, Biotechnol. Adv., 24, pp. 296–308.
8. Guidance for Industry Pharmacogenomic Data Submissions (2005). U.S.
Food and Drug Administration. http://www.fda.gov/cder/guidance/
6400fnl.pdf
9. German, J. B., Hammock, B. D., and Watkins, S. M. (2005). Metabolomics:
building on a century of biochemistry to guide human health,
Metabolomics, 1, pp. 3–9.
10. Kubinyi, H., Müller, G., Mannhold, R., and Folkers, G., eds. (2004).
Chemogenomics in Drug Discovery: A Medicinal Chemistry Perspective
(Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim).
11. MacBeath, G. (2001). Chemical genomics: what will it take and who gets
to play? Genome Biol., 2, pp. 1–6.
12. Snoep, J. L., Westerhoff, H. V., Alberghina, L., and Westerhoff, H. V., eds.
(2005). From isolation to integration, a systems biology approach for
building the Silicon Cell, Systems Biology: Definitions and Perspectives
(Springer-Verlag, Berlin), p. 7.
13. Anfinsen, C. B. (1973). Principles that govern the folding of protein
chains, Science, 181, pp. 223–230.
14. Kimchi-Sarfaty, C., Oh, J. M., Kim, I.-W., Sauna, Z. E., Calcagno, A.
M., Ambudkar, S. V., and Gottesman, M. M. A. (2007). A “silent”
polymorphism in the MDR1 gene changes substrate specificity, Science,
315, pp. 525–528.
15. Sauna, Z. E., Kimchi-Sarfaty, C., Ambudkar, S. V., and Gottesman, M.
M. (2007). Silent polymorphisms speak: how they affect
pharmacogenomics and the treatment of cancer, Cancer Res., 67, pp.
9609–9612.
16. Przybyla-Zawislak, B. D., Srivastava, P. K., Vázquez-Matías, J.,
Mohrenweiser, H. W., Maxwell, J. E., Hammock, B. D., Bradbury, J. A.,
Enayetallah, A. E., Zeldin, D. C., and Grant, D. F. (2003).
Polymorphisms in human soluble epoxide hydrolase, Mol. Pharmacol., 64,
pp. 482–490.
17. Takeuchi, F., McGinnis, R., Bourgeois, S., Barnes, C., Eriksson, N., Soranzo,
N., Whittaker, P., Ranganath, V., Kumanduri, V., McLaren, W., Holm, L.,
Lindh, J., Rane, A., Wadelius, M., and Deloukas, P. (2009). A genome-wide
association study confirms VKORC1, CYP2C9, and CYP4F2 as principal
genetic determinants of warfarin dose, PLOS Genet., 5, p. e1000433.
18. Komar, A. A., Lesnik, T., and Reiss, C. (1999). Synonymous codon
substitutions affect ribosome traffic and protein folding during in vitro
translation, FEBS Lett., 462, pp. 387–391.
19. Komar, A. A. (2007). Genetics. SNPs, silent but not invisible, Science, 315,
pp. 466–467.
20. Luco, R. F., Pan, Q., Tominaga, K., Blencowe, B. J., Pereira-Smith, O. M.,
and Misteli, T. (2010). Regulation of alternative splicing by histone
modifications, Science, 327, pp. 996–1000.
21. Bonetta, L. (2006). Genome sequencing in the fast lane, Nat. Methods, 3,
pp. 141–147.
22. Romero, P. A., and Arnold, F. H. (2009). Exploring protein fitness
landscapes by directed evolution, Nat. Rev. Mol. Cell Biol., 10, pp. 866–
876.
23. Lewis, J. C., and Arnold, F. H. (2009). Oxidations by laboratory-evolved
cytochrome P450 BM3, Chimia, 63, pp. 309–312.
24. Kubo, T., Peters, M. W., Meinhold, P., and Arnold, F. H. (2006).
Enantioselective epoxidation of terminal alkenes to (R)- and (S)-
epoxides by engineered cytochromes P450 BM-3, Chemistry, 12, pp.
1216–1220.
25. Meinhold, P., Peters, M. W., Chen, M. M. Y., Takahashi, K., and Arnold,
F. H. (2005). Direct conversion of ethane to ethanol by engineered
cytochrome P450 BM3, ChemBioChem, 6, pp. 1765–1768.

26. Dunbar, J. C., Holmquist, B., and Johansen, J. T. (1984). Asymmetric
active site structures in yeast copper, zinc superoxide dismutase.
Reconstruction of apo-superoxide dismutase, Biochemistry, 23, pp.
4324–4329.
27. Mentel, M., Blankenfeldt, W., and Breinbauer, R. (2009). The active
site of an enzyme can host both enantiomers of a racemic ligand
simultaneously, Angew. Chem., Int. Ed., 48, pp. 9084–9087.
28. Chen, C. S., Fujimoto, Y., Girdaukas, G., and Sih, C. J. (1982). Quantitative
analyses of biochemical kinetic resolutions of enantiomers, J. Am. Chem.
Soc., 104, pp. 7294–7299.
29. Chen, C. S., Wu, S. H., Girdaukas, G., and Sih, C. J. (1987). Quantitative
analyses of biochemical kinetic resolution of enantiomers. 2. Enzyme-
catalyzed esterifications in water-organic solvent biphasic systems, J.
Am. Chem. Soc., 109, pp. 2812–2817.
30. Reetz, M. T., Becker, M. H., Klein, H.-W., and Stöckigt, D. (1999). A
method for high-throughput screening of enantioselective catalysts,
Angew. Chem., Int. Ed., 38, pp. 1758–1761.
31. Reetz, M. T. (2002). New methods for the high-throughput screening of
enantioselective catalysts and biocatalysts, Angew. Chem., Int. Ed., 41, pp.
1335–1338.
32. Becker, S., Höbenreich, H., Vogel, A., Knorr, J., Wilhelm, S., Rosenau,
F., Jaeger, K.-E., Reetz, M. T., and Kolmar, H. (2008). Single-cell
high-throughput screening to identify enantioselective hydrolytic enzymes,
Angew. Chem., Int. Ed., 47, pp. 5085–5088.
33. (a) Cowan, D. A. (2000). Microbial genomes: the untapped resource,
Trends Biotechnol., 18, pp. 14–16. (b) Cowan, D., Meyer, Q., Stafford,
W., Muyanga, S., Cameron, R., and Wittwer, P. (2005). Metagenomic gene
discovery: past, present and future, Trends Biotechnol., 23, pp. 321–
329.
34. Yang, Y., Zhu, D., Piegat, T. J., and Hua, L. (2007). Enzymatic ketone
reduction: mapping the substrate profile of a short-chain alcohol
dehydrogenase (YMR226c) from Saccharomyces cerevisiae, Tetrahedron:
Asymmetry, 18, pp. 1799–1803.
35. Lemke, K., Lemke, M., and Theil, F. (1997). A three-dimensional
predictive active site model for lipase from Pseudomonas cepacia, J. Org.
Chem., 62, pp. 6268–6273.
36. Gascoyne, D. G., Finkbeiner, H. L., Chan, K. P., Gordon, J. L., Stewart,
K. R., and Kazlauskas, R. J. (2001). Molecular basis for enantioselectivity
of lipase from Chromobacterium viscosum toward the diesters
of 2,3-dihydro-3-(4′-hydroxyphenyl)-1,1,3-trimethyl-1H-inden-5-ol, J.
Org. Chem., 66, pp. 3041–3048.
37. Kolattukudy, P. E., and Brown, L. (1975). Fate of naturally occurring
epoxyacids: a soluble epoxide hydrase, which catalyzes cis-hydration,
from Fusarium solani pisi, Arch. Biochem. Biophys., 166, pp. 599–
607.
38. Archelas, A., and Furstoss, R. (2001). Synthetic applications of epoxide
hydrolases, Curr. Opin. Chem. Biol., 5, pp. 112–119.
39. Zhang, X. M., Archelas, A., and Furstoss, R. (1991). Microbiological
transformation. 19. Asymmetric dihydroxylation of the remote double
bond of geraniol: a unique stereochemical control allowing easy access
to both enantiomers of geraniol-6,7-diol, J. Org. Chem., 56, pp. 3814–
3817.
40. Arand, M., Grant, D. F., Beetham, J. K., Friedberg, T., Oesch, F., and Hammock,
B. D. (1994). Sequence similarity of mammalian epoxide hydrolases
to the bacterial haloalkane dehalogenase and other related proteins:
implication for the potential catalytic mechanism of enzymatic epoxide
hydrolysis, FEBS Lett., 338, pp. 251–256.
41. Zou, J., Hallberg, B., Bergfors, T., Oesch, F., Arand, M., Mowbray, S., and
Jones, T. (2000). Structure of Aspergillus niger epoxide hydrolase at
1.8 Å resolution: implications for the structure and function of the
mammalian microsomal class of epoxide hydrolases, Structure, 8, pp.
111–122.
42. Zhang, X. M., Archelas, A., and Furstoss, R. (1991). Microbiological
transformation. 21. An expedient route to both enantiomers of
marmins and epoxyayauraptens via microbial dihydroxylation of 7-
geranyloxycoumarin, Tetrahedron: Asymmetry, 2, p. 247.
43. Zhang, X. M., Archelas, A., and Furstoss, R. (1992). Microbiological trans-
formation. 24. Synthesis of chiral building blocks via stereoselective
dihydroxylation of citronellol enantiomers, Tetrahedron: Asymmetry, 3,
p. 1373.
44. Arand, M., Hemmer, H., Dürk, H., Baratti, J., Archelas, A., Furstoss, R., and
Oesch, F. (1999). Cloning and molecular characterization of a soluble
epoxide hydrolase from Aspergillus niger that is related to mammalian microsomal
epoxide hydrolase, Biochem. J., 344, pp. 273–280.
45. Schmelzer, R., Kubala, L., Newman, J. W., Kim, I.-H., Eiserich, J. P., and
Hammock, B. D. (2005). Soluble epoxide hydrolase is a therapeutic
target for acute inflammation, Proc. Natl. Acad. Sci. U S A, 102, pp. 9772–
9777.

46. Gross, G. J., and Nithipatikom, K. (2009). Soluble epoxide hydrolase: a
new target for cardioprotection, Curr. Opin. Investig. Drugs, 10, pp. 253–
258.
47. Imig, J. D., and Hammock, B. D. (2009). Soluble epoxide hydrolase as a
therapeutic target for cardiovascular diseases, Nat. Rev. Drug Discovery,
8, pp. 794–805.
48. Ollis, D. L., Cheah, E., Cygler, M., Dijkstra, B., Frolow, F., et al. (1992). The
α/β hydrolase fold, Protein Eng., 5, pp. 197–211.
49. Holmquist, M. (2000). Alpha/beta-hydrolase fold enzymes: structures,
functions and mechanisms, Curr. Protein Pep. Sci., 1, pp. 209–235.
50. Lacourciere, G. M., and Armstrong, R. N. (1993). The catalytic mechanism
of microsomal epoxide hydrolase involves an ester intermediate, J. Am.
Chem. Soc., 115, pp. 10466–10467.
51. Hammock, B. D., Pinot, F., Beetham, J. K., Grant, D. F., Arand, M. E., et al.
(1994). Isolation of the putative hydroxyacyl enzyme intermediate of an
epoxide hydrolase, Biochem. Biophys. Res. Commun., 198, pp. 850–856.
52. Muller, F., Arand, M., Frank, H., Seidel, A., Hinz, W., et al. (1997).
Visualization of a covalent intermediate between microsomal epoxide
hydrolase, but not cholesterol epoxide hydrolase, and their substrates,
Eur. J. Biochem., 245, pp. 490–496.
53. Verschueren, K. H. G., Seijée, F., Rozeboom, H. J., Kalk, K. H., and Dijkstra,
B. W. (1993). Crystallographic analysis of the catalytic mechanism of
haloalkane dehalogenase, Nature, 363, pp. 693–698.
54. Argiriadi, M. A., Morisseau, C., Hammock, B. D., and Christianson, D.
W. (1999). Detoxification of environmental mutagens and carcinogens:
structure, mechanism, and evolution of liver epoxide hydrolase, Proc.
Natl. Acad. Sci. U S A, 96, pp. 10637–10642.
55. de Vries, E. J., and Janssen, D. B. (2003). Biocatalytic conversion of
epoxides, Curr. Opin. Biotechnol., 14, pp. 414–420.
56. Morisseau, C., and Hammock, B. D. (2005). Epoxide hydrolases: mech-
anisms, inhibitor designs, and biological roles, Annu. Rev. Pharmacol.
Toxicol., 45, pp. 311–333.
57. Kurtz, K. A., Rishavy, M. A., Cleland, W. W., and Fitzpatrick, P. F. (2000).
Nitrogen isotope effects as probes of the mechanism of D-amino acid
oxidase, J. Am. Chem. Soc., 122, pp. 12896–12897.
58. Knoll, M., Hamm, T. M., Wagner, F., Martinez, V., and Pleiss, J. (2009). The
PHA depolymerase engineering database: a systematic analysis tool for
the diverse family of polyhydroxyalkanoate (PHA) depolymerases, BMC
Bioinformatics, 10, p. 89.
59. Sirim, D., Wagner, F., Lisitsa, A., and Pleiss, J. (2009). The cytochrome
P450 engineering database: integration of biochemical properties, BMC
Biochem., 10, p. 27.
60. Tyagi, S., and Pleiss, J. (2006). Biochemical profiling in silico: predicting
substrate specificities of large enzyme families, J. Biotechnol., 124, pp.
108–116.
61. Baldi, P., and Brunak, S. (2001). Bioinformatics: The Machine Learning
Approach (Massachusetts Institute of Technology Press, Cambridge).
62. Seifert, A., Tatzel, S., Schmid, R. D., and Pleiss, J. (2006). Multiple
molecular dynamics simulations of human P450 monooxygenase
CYP2C9: the molecular basis of substrate binding and regioselectivity
toward warfarin, Proteins, 64, pp. 147–155.
63. Bös, F., and Pleiss, J. (2009). Multiple molecular dynamics simulations
of TEM beta-lactamase: dynamics and water binding of the Omega-loop,
Biophys. J., 97, pp. 2550–2558.
64. Feenstra, K. A., Starikov, E. B., Urlacher, V. B., Commandeur, J. N., and
Vermeulen, N. P. (2007). Combining substrate dynamics, binding statistics, and
energy barriers to rationalize regioselective hydroxylation of octane and
lauric acid by CYP102A1 and mutants, Protein Sci., 16, pp. 420–431.
65. Pethica, R. B., Levitt, M., and Gough, J. (2012). Evolutionarily consistent
families in SCOP: sequence, structure and function, BMC Struct. Biol., 12,
p. 27.
66. Widmann, M., Clairo, M., Dippon, J., and Pleiss, J. (2008). Analysis of the
distribution of functionally relevant rare codons, BMC Genomics, 9, p.
207.

March 23, 2016 12:54 PSP Book - 9in x 6in 12-Allan-Svendsen-c12

Chapter 12

Role of Tunnels and Gates in Enzymatic Catalysis

Sérgio M. Marques, Jan Brezovsky, and Jiri Damborsky


Loschmidt Laboratories, Department of Experimental Biology,
Research Centre for Toxic Compounds in the Environment RECETOX, Faculty of Science,
Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
smarques@mail.muni.cz, brezovsky@mail.muni.cz, jiri@chemi.muni.cz

12.1 Introduction

Enzymes, as natural catalysts, have evolved over millions of years
to perform specific reactions within living organisms. Because of
their great complexity and variability, the structural basis for their
efficiency and specificity is not fully understood. At the same
time, there is an increasing demand to engineer enzymes for the
reactions needed for production of chemicals, pharmaceuticals,
food, agricultural additives, and fuels [1–3].

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

Many known enzymes have their active sites buried inside
their protein core, rather than exposed to the bulk solvent at the
protein surface [4–6]. This may be due to several reasons, such as the
need to exclude solvent in order to carry out a specific chemical reaction, a
means of controlling substrate specificity, or a means of regulating
the release of products to the surrounding solvent. These buried
active sites are connected to the bulk solvent through tunnels, which
act as exchange pathways for substances between the bulk solvent and
the active site. Hence, taking into account the very complex mixture
of proteins colocalized within a living cell, the tunnels can be very
important systems for accomplishing the enzyme functions. Thus,
in addition to the simplistic Fischer’s lock-and-key model [7] or
the more realistic Koshland’s induced-fit model [8], the enzymes
bearing tunnels can be described by a lock-keyhole-key model [4].
This model takes into account that the key (substrate) needs to pass
through a keyhole (tunnel) in order to reach the lock (active site).
Considering this model, it becomes intuitive that access tunnels
represent important structural features for regulating enzymatic
functions.
Enzymes frequently possess structural elements for controlling
the transport of substances through tunnels and channels, called
gates [9]. Protein gates are dynamic systems that can reversibly
switch between open and closed states through conformational
changes and in this way control the passage of molecules into
and out of the protein. The gates provide a privileged mode for
selecting the molecules that are allowed to enter the structure, as
well as the frequency with which they can pass through. Protein
gates have been described and studied before [9–13], but the
knowledge remained widely dispersed until Gora et al. [14] recently
surveyed the literature and systematized the information using a
newly established classification system.
Due to their primary importance for enzyme structure and
function, tunnels, channels, and gates have shown good potential
for the engineering of enzyme properties. There are many examples
showing how mutations in the key residues defining the tunnel
geometry or gate mobility have changed the activity,
specificity, and stability of enzymes. The grand challenge in this
context is to understand the structural basis and underlying
mechanisms that will allow rational engineering of fully functional
access pathways in the future.

12.2 Protein Tunnels

12.2.1 Structural Basis and Function


Many of the known enzymes possess buried active sites, and one
of the possible reasons is to regulate substrate specificity or to
create a suitable environment for their chemical reaction. Because
the terminology in the literature is diverse, herein we define protein
tunnels as transport pathways between the surface and the active
sites that are buried inside the protein structures or connect
different active sites within the proteins or protein complexes;
we define channels as conduits connecting different parts of the
protein surface through which the molecules may pass without
transformation [4–6, 15].
Structurally, a protein tunnel often contains a bottleneck, which
is its narrowest part and is a determinant of tunnel selectivity.
Bottlenecks are often controlled by gates that open and close
the narrowest part of the tunnels with certain frequencies. The
existence of tunnels and channels is not restricted to a small group
of enzymes; rather, it is widespread and can be found in all
six enzyme classes. There are proteins containing (1) channels
passing through the structure and connecting two different parts of
the protein surface, (2) a single tunnel connecting the surface with
the buried active site cavity, (3) more than one tunnel connecting
the surface with the buried active site, and (4) more than one active
site, all connected with one another and with the surface by several
tunnels (Fig. 12.1).
In the first class (1), the channels serve as a pathway for the
substances to cross the protein structures, for which there is usually
a well-regulated mechanism. We may find them, for example, in ion
channels [16, 17], which allow the crossing of specific ions through
membrane proteins (K+ channel, Ca2+ channel, etc.); in ion
pumps (Na+/K+-adenosine triphosphatase [18], neurotransmitter
transporter [19], etc.); or in the porins [20].
In the second class (2), a single tunnel connecting a deeply
buried active site with the surface has the role of exchange of
the substrates, products, and solvent molecules throughout the
catalytic cycle. Many enzymes possess one permanent tunnel as
well as several transient tunnels, which can be revealed only by
studying protein dynamics (Fig. 12.2). Transient tunnels occur upon
dynamic conformational changes or protein gating mechanisms, and
their emergence may be stochastic or induced by the binding of a
substrate or the presence of a ligand molecule to be transported [6].
Examples of enzymes with only one tunnel are oxidoreductases
(e.g., cytosolic sheep liver aldehyde dehydrogenase [21], pyruvate
oxidase [22], amine oxidase [23], 4-hydroxybenzoate hydroxylase
[24]), transferases (glutathione S-transferase [25], lipoate-protein
ligase A [26]), hydrolases (Candida antarctica lipase A [27],
Candida rugosa lipase [28], Agrobacterium radiobacter epoxide
hydrolase [29, 30], neurolysin [31]), lyases (β-hydroxydecanoyl
thiol ester dehydrase [32]), and isomerases (glutamate racemase
[33, 34]).

Figure 12.1 Channels and tunnels in proteins. Examples of proteins
containing (1) a channel, (2) a single tunnel, (3) multiple tunnels connecting
the active site cavity with the bulk solvent, and (4) multiple tunnels
connecting several active sites. The channels and tunnels are shown in
orange, while the active sites are shown in purple.

Figure 12.2 Permanent and transient tunnels in proteins. Example of
a protein containing permanent and transient tunnels, shown by the
dynamics of their bottleneck radii: tunnel 1 is permanently open with a
bottleneck radius >1.4 Å most of the time; tunnel 2 has closed and open
periods, representing a gated tunnel; and tunnel 3 is permanently closed
with a bottleneck radius <1.4 Å most of the time. Only tunnels wider than
1 Å radius are displayed.
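
The classification used in Fig. 12.2 can be illustrated with a minimal Python sketch. It assumes that a bottleneck radius has already been computed for every snapshot of a trajectory. The 1.4 Å threshold is taken from the caption, whereas the 0.8 cutoff for "most of the time", the function names, and the radii are hypothetical values chosen only for illustration.

```python
from typing import Sequence

PROBE_RADIUS = 1.4  # Å, the open/closed threshold quoted in the caption of Fig. 12.2


def open_fraction(radii: Sequence[float], threshold: float = PROBE_RADIUS) -> float:
    """Fraction of snapshots in which the bottleneck radius exceeds the threshold."""
    if not radii:
        raise ValueError("need at least one snapshot")
    return sum(r > threshold for r in radii) / len(radii)


def classify_tunnel(radii: Sequence[float],
                    threshold: float = PROBE_RADIUS,
                    mostly: float = 0.8) -> str:
    """Label a tunnel 'permanent', 'gated', or 'closed'.

    `mostly` is an assumed cutoff for "most of the time"; the caption gives no number.
    """
    frac = open_fraction(radii, threshold)
    if frac >= mostly:
        return "permanent"
    if 1.0 - frac >= mostly:
        return "closed"
    return "gated"


if __name__ == "__main__":
    # Invented bottleneck radii (Å), one value per snapshot, for three hypothetical tunnels.
    trajectories = {
        "tunnel 1": [1.8, 1.7, 1.9, 1.6, 1.8, 1.5, 1.7, 1.9, 1.3, 1.8],
        "tunnel 2": [1.1, 1.2, 1.6, 1.7, 1.0, 1.8, 1.1, 1.6, 1.2, 1.7],
        "tunnel 3": [1.0, 1.1, 1.2, 1.0, 1.1, 1.5, 1.2, 1.0, 1.1, 1.2],
    }
    for name, radii in trajectories.items():
        print(name, open_fraction(radii), classify_tunnel(radii))
```

A stricter or looser value of `mostly` shifts tunnels between the gated and the permanent or closed categories, so the cutoff should be reported together with any such classification.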

In the third class (3), several tunnels connect the buried
active site with the surface, and they may or may not serve
an equivalent purpose in the catalytic process. In some cases
their roles are distinct. One such example is cytochrome P450,
which has a main 22 Å long hydrophobic tunnel with a role in
substrate access and product egress, while 12 other secondary
tunnels allow the exchange of oxygen and solvent molecules and
also provide alternative pathways for the product release. Similarly,
the haloalkane dehalogenases possess a main tunnel, used for the
halogenated substrate, alcohol, and halide product exchange, and
several secondary tunnels are used for the alcohol release and water
solvent exchange [35]. Multiple tunnels have been identified in the
structures of oxidoreductases (cytochromes [36], catalase [37],
Ni-Fe hydrogenase [38], lipoxygenase 12/15 [24], L-amino acid
oxidase [39]), hydrolases (haloalkane dehalogenases DhaA [35,
40, 41] and LinB [40, 42, 43], acetylcholinesterase [44]), and
isomerases (Δ5-3-ketosteroid isomerase [45]).
Finally, in the fourth class (4), we find multifunctional enzymes
and multienzyme complexes that contain separate active sites
interconnected by the tunnels. These enzymes are able to carry
out sequential reactions, in which an internal pathway conducts
the intermediate products from one catalytic site to another. This
mechanism may be necessary to increase the enzyme’s efficiency,
namely to (i) prevent potentially toxic intermediates from being released into
the medium, (ii) keep labile intermediates from being released into the
medium and undergoing side reactions, or (iii) reduce the transfer
time between different catalytic sites. Such a transfer process is
always tightly regulated, often through molecular gating mechanisms.
Among enzymes of this type we find oxidoreductases (glutamate
synthase [6, 46]), transferases (glucosamine 6-phosphate synthase
[6, 47], glutamine phosphoribosylpyrophosphate amidotransferase
[6, 48], acetyl-CoA synthase [49], and imidazole glycerol phosphate
synthase [6, 50]), lyases (tryptophan synthase [6, 51, 52]), and
ligases (carbamoyl phosphate synthetase [6, 53–55], asparagine
synthetase [6, 56]).
The single most important function of protein tunnels is to
control the ligands’ entry to the active site. The selection of the
ligands that may pass through the tunnels prevents the formation
of nonproductive complexes in the binding site, which would reduce
the enzyme efficiency. It may also prevent certain compounds from
poisoning the active center and thereby completely inactivating
the catalyst, as can happen with transition-metal-ion-dependent
metalloenzymes. Tunnels connecting multiple active sites may also
prevent the release of toxic intermediate products or metabolites
into the cell. The same class of tunnels provides an excellent way
of synchronizing reactions that require the contact of multiple
substrates or cofactors, controlling the order of multistep catalytic
reactions, and providing an environment for carrying out reactions
that require the absence of water. These functions are particularly
important, considering the thousands of proteins and ligands that
are simultaneously colocalized within a living cell.

12.2.2 Identification Methods


Two primary experimental methods that allow direct identification
of the tunnels and channels within protein structures to atomic
resolution are X-ray crystallography and nuclear magnetic
resonance (NMR) spectroscopy. These experimental techniques are
often followed by theoretical analyses using molecular dynamics
(MD) simulations.
In the last few decades, advances in biological chemistry were
boosted by the determination of high-resolution 3D structures
of proteins by X-ray diffraction, which allowed a deeper under-
standing of the underlying catalytic mechanism of enzymes at
the atomic level [6]. Likewise, the tunnels of some enzymes
began to be described in greater detail and their functions
to be better understood. The more crystal structures are solved
for an enzyme, the more conformational states are sampled and
the deeper the knowledge attained about it. However,
crystallography only supplies static structures and cannot show
metastable conformations, and hence the insight it gives into
transient tunnels can be limited.
NMR spectroscopy can also provide the 3D structures of
proteins, either from solution or from solid-state studies. But
rather than a single structure, NMR analysis supplies results in
the form of ensembles. Modern protein NMR spectroscopy has
largely developed in the last 10–15 years, and currently it is
possible to observe the dynamics of proteins at the atomic level
by using specific methods, in timescales ranging from picoseconds
to seconds [57–59]. Therefore, NMR spectroscopy has the potential
to supply relevant information regarding not only the permanent
tunnels but also the transient ones. In some cases, the transient
states of tunnels and channels have indirectly been investigated by
using particular methods of NMR. For example, water magnetic
relaxation dispersion (MRD) has been used, in combination with
MD simulations, to track the internal water molecules buried inside
myoglobin [60] and the bovine pancreatic trypsin inhibitor [61];
solid-state NMR spectroscopy (ssNMR) has been used to track
the buried water molecules within a K+ channel in different gating
modes [62].
MD simulation is the method par excellence for detecting
and studying the transient tunnels in proteins. MD simulates
the behavior of a molecule under certain conditions of pressure
and temperature, preferably in the presence of explicit solvent
molecules. AMBER [63], CHARMM [64], GROMACS [65], and NAMD
[66] are among the most used software packages to perform such
calculations. However, this important and widespread method still
has its limitations. One of them is the timescale that can be
surveyed. Long timescales are very demanding in terms of calculation
time and computational resources. Unless one has access to very
expensive resources, such as the ANTON supercomputer [67], it is
currently possible to reach only timescales of hundreds of nanoseconds
or a few microseconds by using graphics processing units
[68]. Several techniques have recently emerged to overcome such
limitations and sample a broader conformation space: accelerated
MD [69, 70], conformational flooding [71, 72], hyperdynamics
[73, 74], and metadynamics [75, 76], which allow sampling of a
conformational space comparable to the millisecond timescale.
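
The cost of long timescales can be made concrete with a back-of-the-envelope calculation, sketched below in Python. Two assumptions are made that do not come from the text: an integration timestep of 2 fs, a common choice when bonds to hydrogen are constrained, and a purely hypothetical throughput of 100 ns of simulated time per day.

```python
def md_steps(simulated_ns: float, timestep_fs: float = 2.0) -> int:
    """Number of integration steps needed to cover `simulated_ns` nanoseconds."""
    return round(simulated_ns * 1_000_000 / timestep_fs)  # 1 ns = 10^6 fs


def wall_clock_days(simulated_ns: float, ns_per_day: float) -> float:
    """Days of computing needed at a given throughput (ns of simulation per day)."""
    return simulated_ns / ns_per_day


if __name__ == "__main__":
    throughput = 100.0  # ns/day; hypothetical value, depends on system size and hardware
    for label, ns in [("100 ns", 1e2), ("1 µs", 1e3), ("1 ms", 1e6)]:
        print(label, md_steps(ns), "steps,", wall_clock_days(ns, throughput), "days")
```

Under these assumptions one microsecond already requires 5 × 10^8 force evaluations, and one millisecond a thousand times more, which is why enhanced sampling methods are attractive.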
Tunnel dynamics can also effectively be studied by simulating
protein–ligand complexes. Ligands may induce conformational
changes in the protein and in this way affect the shape of the
tunnel and/or the frequency of gating. This can be studied by MD
simulations that make use of extra forces to let the ligand move
through the tunnels, either randomly in random acceleration MD
(RAMD) [77, 78] or with directed forces in steered MD (SMD) [79–
81]. These procedures can also elucidate the energy profile along the
ligand pathway, the preference of a certain ligand for one or another
tunnel, or the tunnel specificity toward particular ligands.
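
The idea behind constant-velocity steered MD can be sketched with a one-dimensional toy model in Python. This is not the implementation of any MD package: the ligand is a single coordinate along a tunnel axis, the bottleneck is represented by an assumed Gaussian energy barrier, the motion follows overdamped Langevin dynamics, and all parameters are in arbitrary reduced units chosen for illustration. The ligand is dragged by a harmonic spring whose anchor moves at constant speed, and the work done by the spring is accumulated along the way.

```python
import math
import random


def barrier_force(x, height=5.0, center=5.0, width=1.0):
    """Force -dU/dx of a Gaussian barrier U(x) standing in for a tunnel bottleneck."""
    u = height * math.exp(-((x - center) ** 2) / (2.0 * width ** 2))
    return u * (x - center) / width ** 2


def steered_pull(n_steps=10_000, dt=0.01, k=10.0, v=0.1, gamma=1.0, kT=0.6, seed=0):
    """Drag a ligand coordinate through the barrier with a moving harmonic spring.

    Returns the final position, the accumulated pulling work, and the peak pulling force.
    """
    rng = random.Random(seed)
    x, work, peak_force = 0.0, 0.0, 0.0
    noise = math.sqrt(2.0 * kT * dt / gamma)  # thermal kick per step
    for step in range(n_steps):
        anchor = v * step * dt                # spring anchor moves at constant speed
        f_pull = k * (anchor - x)             # directed (steering) force on the ligand
        work += f_pull * v * dt               # work done by the moving spring
        peak_force = max(peak_force, f_pull)
        x += (f_pull + barrier_force(x)) * dt / gamma + noise * rng.gauss(0.0, 1.0)
    return x, work, peak_force


if __name__ == "__main__":
    x_end, work, peak = steered_pull()
    print(f"final position {x_end:.2f}, work {work:.2f}, peak force {peak:.2f}")
```

In this toy model the pulling force peaks while the ligand climbs the barrier, which is how a force or work profile along the pathway points to the position of a bottleneck. Random acceleration MD differs in that the direction of the extra force is chosen at random rather than prescribed.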
Because of the high complexity of most systems, visual inspection
is unlikely to be sufficient for identifying the voids in protein
structures, such as clefts, pockets, pores, tunnels, and channels.
Several specialized software tools are currently available to accu-
rately calculate those voids. The most commonly used programs
for calculation of tunnels and channels in proteins are CAVER [15],
MOLAXIS [82], and MOLE [83]. These programs mainly differ in the
model used to describe the protein and the boundary between the
surface and the bulk solvent, the algorithms used to calculate the
tunnels and the cost of individual tunnels, and the way of treating
multiple tunnels and the ability to analyze multiple structures. All of
them can identify the static tunnels in single structures, but only the
first two can handle ensembles of structures to calculate dynamic
tunnels. Detailed description and comparison of these tools are
provided in the recent review by Brezovsky et al. [84].
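
The geometric quantity that such tools report, the bottleneck radius, can be illustrated with a simplified Python sketch. It is not the algorithm of any of the programs named above: it assumes that a tunnel centerline is already known and that atoms are hard spheres, and it simply measures the largest probe sphere that fits at each point of the path. The atom coordinates and the 1.7 Å radius are invented for the example.

```python
import math


def clearance(point, atoms):
    """Radius of the largest sphere centred at `point` that touches no atom."""
    return min(math.dist(point, centre) - radius for centre, radius in atoms)


def tunnel_profile(path, atoms):
    """Clearance at every point of a precomputed tunnel centerline."""
    return [clearance(p, atoms) for p in path]


def bottleneck(path, atoms):
    """Index along the path and radius of the narrowest point of the tunnel."""
    profile = tunnel_profile(path, atoms)
    index = min(range(len(profile)), key=profile.__getitem__)
    return index, profile[index]


def ring(z, distance, radius=1.7):
    """Four hypothetical atoms placed around the z axis at height z."""
    return [((distance, 0.0, z), radius), ((-distance, 0.0, z), radius),
            ((0.0, distance, z), radius), ((0.0, -distance, z), radius)]


if __name__ == "__main__":
    # Toy tunnel along the z axis lined by three rings of atoms; the middle ring is tighter.
    atoms = ring(0.0, 4.0) + ring(3.0, 3.0) + ring(6.0, 4.5)
    path = [(0.0, 0.0, float(z)) for z in range(7)]
    index, radius = bottleneck(path, atoms)
    print(f"bottleneck at path point {index}, radius {radius:.2f} Å")
```

Applying the same function to every snapshot of an MD trajectory would give the kind of bottleneck radius time series discussed for permanent and transient tunnels.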

12.2.3 Molecular Engineering


It has been observed in many cases that mutations of residues
far from the active site have led to important enhancements in
enzymatic properties such as activity, specificity, enantioselectivity,
and stability. Whereas for most enzymes with solvent-exposed
catalytic sites the mutations in or near the substrate-binding
residues have been more successful [85], for enzymes with buried
active sites the situation can be different. It may be difficult to
find residues susceptible to mutagenesis without disrupting the
active site architecture. On the other hand, mutations targeting
residues far away from the active site are more likely to be accepted
without loss of function. Considering the important roles of access
pathways, their modification appears to be an attractive possibility
for generating functional variants with rationally tuned properties.
For enzymes with buried active sites and rate limitation at
the substrate entry or product release, the catalytic activity
can be effectively engineered by pathway modification. The most
promising targets for beneficial mutations are the residues
forming bottlenecks. There are a number of reports on activity
improvements by modification of tunnel residues. Among these
we may find oxidoreductases (cholesterol oxidase [86, 87],
pyruvate dehydrogenase [88], ferredoxin glutamate synthase [89],
carbon monoxide dehydrogenase [90], catalase [91–93], toluene-
o-xylene monooxygenase [94, 95], toluene 4-monooxygenase [96–
98], 4-hydroxybenzoate hydroxylase [24], cytochrome P450 [99–
104]), transferases (glucosamine-6-phosphate synthase [105],
β-ketoacyl-acyl-carrier-protein synthase [106], undecaprenyl py-
rophosphate synthase [107], RNA-dependent RNA polymerase
[108]), hydrolases (lipase [109, 110], acetylcholinesterase [111,
112], epoxide hydrolase [113], haloalkane dehalogenases [114,
115]), lyases (tryptophan synthase [116, 117], 3-hydroxydecanoyl-acyl
carrier protein dehydratase [118], halohydrin dehalogenase
[119]), isomerases (squalene-hopene cyclase [120]), and ligases
(asparagine synthetase [121], carbamoyl phosphate synthetase [54,
122]). In our laboratory, Pavlova et al. [115] performed saturation
mutagenesis of the tunnel residues of the haloalkane dehalogenase
DhaA, aiming at increasing its ability to degrade the toxic anthropogenic
compound 1,2,3-trichloropropane. These efforts resulted
in the discovery of a variant, DhaA31, containing five mutations,
four of them located in the access tunnels. DhaA31 showed a 32-fold
enhancement in the overall catalytic activity due to an increase in
the rate of carbon–halogen bond cleavage and a shift of the
rate-limiting step to the product release. Modeling studies revealed
that the origin of the enhanced activity is the lower number of
water molecules in the active site, which would otherwise hinder the
formation of the activated complex [115].
Mutations in the tunnel residues may also modulate the
enzyme specificity or the enantioselectivity. This can be rationalized
by considering that the access tunnels are the first sieves that
molecules encounter before reaching the active site. Hence, by changing their
physicochemical properties or stereochemistry, one may tune the
type of substrates or stereoisomers that are able to pass through
and enter the active site. Examples can be found among oxidoreduc-
tases (aminoaldehyde dehydrogenase [123], amine oxidase [124],
toluene-4-monooxygenase [98], 4-hydroxybenzoate hydroxylase
[24], cytochrome P450 [99], alkane hydroxylase [125]), trans-
ferases (chalcone synthase [126, 127], polyketide synthases [128],
cellobiose phosphorylase [129], octaprenyl pyrophosphate synthase
[130], undecaprenyl pyrophosphate synthase [107]), hydrolases
(arylesterase [131], lipase [110, 132, 133], epoxide hydrolase [113,
134], haloalkane dehalogenase [114, 135]), lyases (hydroxynitrile
lyase [136, 137]), and isomerases (squalene-hopene cyclase [120]).
Chaloupkova et al. [114] from our lab designed and constructed a
complete set of single-point mutants of the haloalkane dehalogenase
LinB at position L177, which is the residue located near the entrance
to the main tunnel. Fifteen active variants showed activities and
specificities toward the halogenated substrates very different from
those of the wild type, and the activities correlated with the size and the
hydrophobicity of the amino acid introduced.

Improvements in protein stability may also be achieved through
tunnel engineering. This may occur when the permeability of the
tunnels to water or organic solvent is changed in such a way
that the hydrophobic packing of the protein is enhanced. In this
case the protein becomes less affected by the presence of an organic
cosolvent or a temperature rise, which increases its general stability.
This was the case for the haloalkane dehalogenase variant
DhaA85, which carried four mutations in the tunnel-lining residues
[138]. Compared to the wild-type enzyme, this variant showed an
increase in its melting temperature of 19 °C in aqueous buffer and a
rise in half-life from minutes to weeks in 40% dimethyl sulfoxide.
One last example of how significantly enzyme catalytic properties
can be affected by modifications in the access tunnel is the change
of the mechanism of the catalytic cycle. Biedermannova et al.
[139] observed a change in the kinetic mechanism for
the conversion of 1,2-dibromoethane by the LinB L177W variant,
associated with a dramatic change in substrate specificity. The
substitution of the tunnel-lining leucine at position 177 with a bulkier
tryptophan changed the bromide ion binding kinetics from a one-step
to a two-step mechanism and resulted in a significant drop in
the bromide release rate (from >500 to 0.8 s⁻¹).
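
The effect of slowing a single step on overall turnover can be illustrated with a minimal sketch. It assumes, as a simplification that is not taken from the cited study, that the catalytic cycle behaves as a series of irreversible first-order steps at saturating substrate, so that the turnover number is the reciprocal of the summed step times. Only the two bromide release rates come from the text; the other step rates and the function name are hypothetical.

```python
def turnover_number(step_rates):
    """Return k_cat (s^-1) for irreversible steps in series: 1 / sum(1 / k_i)."""
    if not step_rates or any(k <= 0 for k in step_rates):
        raise ValueError("step_rates must be a non-empty list of positive rates")
    return 1.0 / sum(1.0 / k for k in step_rates)


# Hypothetical rates (s^-1) for the steps other than bromide release.
other_steps = [50.0, 20.0]

# 500 s^-1 is used as a stand-in for the ">500 s^-1" reported before the substitution.
k_cat_fast_release = turnover_number(other_steps + [500.0])
# 0.8 s^-1 is the bromide release rate reported for the L177W variant.
k_cat_slow_release = turnover_number(other_steps + [0.8])

print(f"k_cat with fast release: {k_cat_fast_release:.2f} s^-1")
print(f"k_cat with slow release: {k_cat_slow_release:.2f} s^-1")
```

Under these assumptions the slowest step dominates: once product release drops to 0.8 s⁻¹, the turnover number cannot exceed that value, whatever the rates of the other steps.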

12.3 Protein Gates

12.3.1 Structural Basis and Function


Many enzymes possessing tunnels or channels also contain some
type of gate, since the traffic of ligands, ions, and solvent in those
pathways is susceptible to regulation. However, the gates are not
limited to the enzymes containing tunnels. Molecular gates can be
found in a wide variety of biological systems, such as enzymes,
ion channels, protein–protein complexes, and protein–nucleic acid
complexes. The gates present in enzymes may have three major
roles: (1) control the access of the substrate to the active site,
(2) control the access of the solvent to the active site, and (3)
synchronize molecular events occurring at different locations of the
protein.
Considering substrate access (role 1), the gates may account
for the substrate specificity of the enzyme. On the basis of
physicochemical (polarity, lipophilicity, charge, polarizability, etc.) and
geometric (bulkiness, length, stereochemistry, etc.) properties, gates
can act as filters that control which compounds can pass through
and which cannot. Examples of such cases are NiFe hydrogenase,
which blocks the access of oxygen in preference to carbon monoxide [140,
141]; catalase, which is more permeable to hydrogen
peroxide than to water [142, 143]; cytochromes P450 [36, 144];
epoxide hydrolases [113]; undecaprenyl-pyrophosphate synthases
[107]; and cellobiohydrolase I [145].
Concerning solvent access (role 2), in some cases the catalytic
reaction requires a reduced number or the absence of water
molecules in the active site. In those cases it is fundamental to
control the access of solvent to the catalytic site by a gate. The
mechanisms regulating solvent accessibility can permit entry of
the solute alone, allow entry of only a limited number of water
molecules, or even restrict the access of water molecules to some
parts of the active site cavity, for example, in cytochrome P450
[36], carbamoyl phosphate synthetase [122], imidazole glycerol
phosphate synthase [146], and glutamine amidotransferase [105,
147]. In other cases, water molecules are not allowed to enter the
active site unless the substrate or cofactor is present, such as in
rabbit 20α-hydroxysteroid dehydrogenase [148].
Synchronization of reactions (role 3) may occur in enzymes
with more than one active site interconnected by tunnels. In
this case only the proper intermediate from the first reaction is
allowed to cross the gate and access the second site. This may be
necessary when the intermediates are unstable or toxic or to
avoid their unfavorable hydration. Examples of enzymes with gates
involved in synchronization of reactions are carbamoyl phosphate
synthetase [122], asparagine synthetase [121], glucosamine 6-phosphate
synthase [105], and glutamate synthase [149], all of
which possess tunnels for ammonia transport (the first
one also for carbamate); tryptophan synthase, for indole [117]; and
carbon monoxide dehydrogenase/acetyl coenzyme A synthase, for
carbon monoxide transport [150].

Molecular gates can have very diverse
structural bases, involving side-chain conformational changes of
one or more residues, movement of the backbones of a few
residues, of longer peptide chains, loops, or other secondary structure
elements, or even the motions of entire domains. Gates have been
classified according to their structural basis: (1) wings (single-residue
motion), (2) swinging doors (two-residue motion), (3)
apertures (backbone motion of several residues), (4) single drawbridges
and (5) double drawbridges (motions of loops and secondary structure elements),
and (6) shells (motion of a domain) (Fig. 12.3) [14].
Wing (1) corresponds to the side-chain rotation of a single
residue. It is the simplest of all gating mechanisms
and also the most common [14]. The movements of wing gates
are small in amplitude and have quite low activation
barriers. Wing gates are typically located at the bottlenecks of tunnels or
channels. Interactions with certain residues, termed anchoring
residues, stabilize each state. Such interactions can be
H-bonds, salt bridges, π–π contacts, etc. The most common amino
acids involved in this type of gating are W, F, and Y [14]. Examples
of enzymes containing wing gates are imidazole glycerol
phosphate synthase [146], cytidine triphosphate synthetase [151],
methane monooxygenase hydroxylase [152], FabZ β-hydroxyacyl-acyl
carrier protein dehydratase [118], cytochrome P450 [144, 153],
and cellobiohydrolase I [145].
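
Because the open and closed states of a wing gate differ mainly in a side-chain torsion, one simple way to follow such a gate is to track a side-chain dihedral angle (for example χ1, defined by the N, Cα, Cβ, and Cγ atoms) across structures or trajectory frames. The sketch below computes a signed dihedral from four Cartesian coordinates using only the standard library. The coordinates, the angular window used to label a frame as open, and the function names are all illustrative assumptions and do not come from any structure discussed here.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points, e.g. N, CA, CB, CG."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("the two central points must differ")
    axis = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Components of the two outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, axis) * c for c in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * c for c in axis))
    return math.degrees(math.atan2(_dot(_cross(axis, v), w), _dot(v, w)))


def gate_state(chi_deg, open_window):
    """Label a frame 'open' if the angle falls inside a user-chosen window."""
    low, high = open_window
    return "open" if low <= chi_deg <= high else "closed"


# Made-up coordinates (angstroms) for four atoms, for illustration only.
atoms = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.5), (0.0, 1.0, 1.5)]
chi1 = dihedral_deg(*atoms)
state = gate_state(chi1, open_window=(60.0, 120.0))  # hypothetical window
print(chi1, state)
```

In practice the window separating the two states would have to be chosen for each gate, for instance from the distribution of the angle observed in crystal structures or simulations.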
Swinging door (2) corresponds to the side chains of two amino acids
moving in a synchronized manner and is the second most
frequent type of gating. In this case, the closed state involves a close
interaction between the gating residues, through
π–π stacking (F–F, F–Y, and W–F pairs), ionic interactions (R–E
and R–D pairs), aromatic–aliphatic hydrophobic contacts (F–I, F–V, and F–L
pairs), aliphatic interactions (L–I, L–V, and R–L pairs), or H-bonds
(R–S pair). The most common interacting pair in the
swinging-door type of gate is F–F [14]. Examples of enzymes bearing
this type of gate are acetylcholinesterase [154], toluene-4-monooxygenase
[155], and the cytochromes P450 3A4 [102, 156],
P450cam, P450BM3, and P450eryF [144, 153].
Aperture (3) corresponds to another type of residue
movement, this time involving the backbone atoms of two to
four residues without the need for side-chain movements. In
this case it is common to observe several of the bottleneck
residues of a tunnel performing a synchronized motion toward
each other. Enzymes containing this type of gating are, for
example, carbamoyl phosphate synthetase [122], choline oxidase
[157], glutamate synthases [158], the extradiol dioxygenase
homoprotocatechuate 2,3-dioxygenase [159], cytochrome P450eryF
[144], and acetylcholinesterase [160].
Single drawbridge (4) and double drawbridge (5) function by
the motions of loops or secondary structure elements, involving one
or two elements, respectively. These types of gates are well suited
to controlling the access of large ligands, which cannot
be accomplished by the previously described types of gates. The
loops may even be involved in the formation of the binding cavity
for the substrate or cofactor. In some cases, the gating can also be
part of complex machinery that controls the opening and closing of
different tunnels, merges several tunnels, or even forms smaller and
more selective gates. The cytochrome P450 family is a good example
of such a large complex gating system [36, 102, 161, 162].

Figure 12.3 Classification of molecular gates. Examples of protein gates and
their schematic representation: (1) wing, (2) swinging door, (3) aperture,
(4) single drawbridge, (5) double drawbridge, and (6) shell. The gating
elements are shown in red and the access tunnels in orange.

Shell (6) is represented by large motions of entire protein
domains. It can be found in ion channels [12] but also in
enzymes that catalyze reactions with very large substrates, such as
RNA polymerase [163]. This type of domain-displacement gating
may also serve the purpose of a stricter control of the tunnel
networks in order to prevent substrate leakage, for example, in
dehydrogenase/acetyl coenzyme A synthase [90], epoxide hydrolase
from Mycobacterium tuberculosis [164], phospholipase A2 [165],
and prolyl oligopeptidase [166].
Concerning the location, the enzyme gates can be found at (1)
the entrance to or even at the active site itself, (2) the mouth or the
bottleneck of the access tunnel, and (3) the interface between the
active site and the cofactor binding site (Fig. 12.2).
The entrance to the active site (1) is a very suitable location
for a gate that directly controls the access of the substrate to the
active site. It can either prevent the entry of the substrate before
the catalytic residues are properly oriented or act as a substrate
sieve to control selectivity. In particular cases, the gating may
even be operated by the residues which are part of the active site
[167]. Examples of enzymes bearing gates at the catalytic site or
its entrance are acetylcholinesterase [154], imidazole glycerol
phosphate synthase [146], glutamate synthase [149], toluene-
o-xylene monooxygenase [167], monooxygenase [168], choline
oxidase [157], NiFe hydrogenases [141], carbonic anhydrases [169],
formiminotransferase-cyclodeaminase [170], type III polyketide
synthases [171], and FabZ β-hydroxyacyl-acyl carrier protein
dehydratase [118].
The mouth or bottleneck of tunnels (2) is the most common
location of a gate in enzymes [14]. Tunnels connecting buried
active sites with a protein surface are privileged structures to
control the access of ligands and solvents. Therefore, they are
naturally privileged locations for the gates. The mouth of the
tunnel is the first barrier that a molecule faces before entering
into a tunnel. On the other hand, the bottleneck is the narrowest
part of a tunnel, and it may be regarded as one of the easiest
hotspots for controlling the molecules being transferred. Figure 12.2
presents different dynamics of the bottleneck radius in a tunnel
with and without a gate. Examples of enzymes containing gates
located in tunnels are cholesterol oxidase type I [172],
toluene-4-monooxygenase [155], undecaprenyl-pyrophosphate synthase
[107], homoprotocatechuate 2,3-dioxygenase [159],
4-hydroxy-2-ketovalerate aldolase/acylating acetaldehyde dehydrogenase [173],
epoxide hydrolase from Aspergillus niger M200 [113],
FabZ β-hydroxyacyl-acyl carrier protein dehydratase [118],
glucosamine 6-phosphate synthase [105], imidazole glycerol phosphate synthase
[146], cytidine triphosphate synthetase [151], carbamoyl phosphate
synthetase [122], and glutamate synthases [149].
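
The idea of a gated bottleneck can be made concrete with a small sketch that takes a time series of bottleneck radii (such as one extracted from a set of structures or simulation frames) and reports how often the bottleneck is wide enough for a probe of a given radius, and how many times it opens. The radii, the 1.4 Å probe radius, and the function name are hypothetical choices made for illustration only.

```python
def gate_statistics(bottleneck_radii, probe_radius):
    """Return (open_fraction, n_opening_events) for a bottleneck-radius time series.

    A frame counts as open when the bottleneck radius is at least the probe radius.
    An opening event is a closed frame immediately followed by an open frame.
    """
    if not bottleneck_radii:
        raise ValueError("bottleneck_radii must not be empty")
    is_open = [r >= probe_radius for r in bottleneck_radii]
    open_fraction = sum(is_open) / len(is_open)
    openings = sum(1 for prev, cur in zip(is_open, is_open[1:]) if cur and not prev)
    return open_fraction, openings


# Hypothetical bottleneck radii (angstroms), one value per frame.
radii = [0.8, 0.9, 1.5, 1.6, 0.7, 0.9, 1.4, 1.5, 1.6, 0.8]
open_fraction, openings = gate_statistics(radii, probe_radius=1.4)
print(open_fraction, openings)
```

A tunnel without a gate would be expected to give an open fraction close to 1 with few or no opening events, whereas a gated tunnel alternates between the two states.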
The entrance to the cofactor cavity (3) is also a suitable
location for a gate, since the interface between the cofactor cavity
and the active site often plays a critical role in the catalytic
process. The gates in these locations may control the binding
rate of the cofactor to the enzyme. Examples are the NADH
oxidase [174], 3-hydroxybenzoate hydroxylase [175], 4-hydroxy-2-
ketovalerate aldolase/acylating acetaldehyde dehydrogenase [173],
and cholesterol oxidase type I [172] and type II [176]. In more
specific cases, the cofactor may perform the gating function
by opening the access tunnels for the substrates, such as in
digeranylgeranylglycerophospholipid reductase [177].

12.3.2 Identification Methods


Identification and description of a gating process is not a simple
task because of its complexity, and hence experimental and
modeling techniques are usually combined. X-ray
crystallography can provide important insights into the possible
presence of gating mechanisms in some proteins. The existence of
different crystal structures with amino acid residues, or even larger
elements, in different conformations may be a good indication of a
gating process occurring in that system. Through X-ray ensembles,
for instance, it is possible to draw inferences about protein flexibility, dynamics,
and function [178]. However, both the open
and the closed states must be captured and adequately represented.
Examples of enzymes with gates identified by this technique are
tryptophan synthase [179], haloalkane dehalogenase LinB [43],
L-amino acid oxidase [180], toluene-o-xylene monooxygenase [181],
acetylcholinesterase [160], and phospholipase A2 [165].
Similarly to the identification of protein tunnels, NMR spectroscopy
may supply important evidence regarding the presence
of gates in proteins. Advanced solution or solid-state NMR methods
may be used to study the dynamics of protein systems and give
insight into the conformational changes. These methods may survey
very different timescales, ranging from picoseconds to seconds [57–59],
thus allowing one to study different types of gating mechanisms.
It is even possible to study the less populated conformations and
the exchange rate between different conformations [62, 182, 183].
NMR spectroscopy has been used, for instance, to investigate the
conformational changes in the gating of triosephosphate isomerase
[184–186], HIV-1 protease [187], and dihydrofolate reductase
[188].
Fluorescence-based methods have become very popular for
investigating biomolecular systems, particularly for detecting and
characterizing conformational changes in proteins. Hence, although not
often used for that purpose, they have great potential for investigating
protein gates. Fluorescence emission by fluorophore groups
depends on the immediate surrounding environment, namely the
polarity of the neighboring molecules or residues. It can therefore
be used to detect changes in the microenvironment of fluorescent
residues within proteins, thus allowing possible
conformational changes to be tracked. Intrinsic tryptophan fluorescence emission
(ITFE), for instance, has been used to study the closed and
open conformations of a dimeric phospholipase A2 homolog [165]
and cytochrome c oxidase [189]. Other methods employ unnatural
fluorescent probes to track the dynamics of proteins. These can
be covalently attached [190] or incorporated into the protein by
mutagenesis with unnatural fluorescent amino acids [191]. Time-resolved
fluorescence spectroscopy can be implemented with different
methods to assess the dynamics of events occurring on timescales
ranging from femtoseconds to nanoseconds [192]. This technique
has been used, for example, to study the hydration and protein
dynamics at the tunnel mouth of haloalkane dehalogenases [190,
193]. Förster resonance energy transfer (FRET) is a method
that makes use of two fluorophores, a donor and an acceptor, which
perform nonradiative energy transfer with each other and are
bound to the protein at a certain distance. The rate of this
energy transfer is inversely proportional to the sixth power of the distance
between the two fluorophores, so the transfer efficiency drops steeply as
they move apart, and it is reflected in changes
of their fluorescence spectra. Since conformational changes in
the protein will affect the distance between the two fluorophores,
this method can report on the dynamics of specific parts of the
protein. FRET is a versatile technique that can be used not only
intramolecularly but also intermolecularly to study protein
functions and interactions with other proteins, even in living cells
[191, 194–196]. It can be applied in ensemble-averaged or single-molecule
studies, giving great insight into the dynamics and kinetics
of conformational changes occurring on timescales ranging from
nanoseconds to seconds or even minutes [197–199].
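
The distance dependence of FRET is commonly written as E = 1 / (1 + (r/R0)^6), where r is the donor–acceptor distance and R0 is the Förster radius, the distance at which the efficiency is 50%. The following sketch evaluates this relation and its inverse. The Förster radius of 50 Å and the distances are hypothetical values chosen only to show how sharply the efficiency responds to a change in distance, as might accompany the opening or closing of a gate.

```python
def fret_efficiency(distance, forster_radius):
    """FRET efficiency E = 1 / (1 + (r / R0)**6); both lengths in the same unit."""
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius must be > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


def distance_from_efficiency(efficiency, forster_radius):
    """Invert the relation above: r = R0 * (1 / E - 1)**(1 / 6)."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


R0 = 50.0  # hypothetical Foerster radius in angstroms
for r in (30.0, 50.0, 70.0):  # hypothetical donor-acceptor distances
    print(f"r = {r:.0f} A -> E = {fret_efficiency(r, R0):.3f}")
```

Because of the sixth-power term, the efficiency is most sensitive to distance changes when r is close to R0, which is why the fluorophore pair and the labeling positions are usually chosen with the expected distance range in mind.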
MD simulation is a very important theoretical technique for
identifying and characterizing protein gates. This method has been
described in Section 12.2.2, and it can be used to sample the different
conformational states of a protein gate, their respective energies,
and interconversion frequencies. It may be difficult to survey the
timescales of certain gates involving larger movements by using
classical MD, namely apertures (nanoseconds to microseconds),
drawbridges and double drawbridges (nanoseconds to microseconds),
or shell gates (milliseconds to seconds) [14]. In these
cases, enhanced-sampling techniques, such as accelerated MD,
conformational flooding, hyperdynamics, and metadynamics,
must be used. In addition to these, Brownian dynamics simulations
have also been used to investigate gates of enzymes [157, 200],
while several other methods may be used to study the dynamics
of ligand binding to proteins and give important insight into
gating processes [201]. For the proteins bearing gates at their
tunnels, the study of tunnel dynamics is essential. For that, the
use of specialized programs for performing tunnel analysis in the
MD trajectory is necessary to determine bottleneck residues and
potential key residues involved in the gates. CAVER 3.0 [15, 84] is
currently the only software tool that can handle analysis of large
trajectories and supply information about various time-dependent
tunnel properties.
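
Once each frame of a trajectory has been labeled as open or closed (for instance from a bottleneck radius or a gating dihedral), the relative populations and the interconversion frequency of a gate can be estimated with very little code. The sketch below assumes a simple two-state picture and a well-converged, unbiased trajectory, which is often not the case for slow gates. The labels, the frame interval, and the function name are hypothetical. The free energy difference is obtained from the Boltzmann relation ΔG = −RT ln(p_open / p_closed).

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def two_state_summary(states, frame_interval_ns, temperature_k=300.0):
    """Return (delta_g_open, switches_per_ns) from per-frame 'open'/'closed' labels.

    delta_g_open is the free energy of the open state relative to the closed
    state in kcal/mol. Any label other than 'open' is counted as closed.
    """
    n_open = sum(1 for s in states if s == "open")
    n_closed = len(states) - n_open
    if n_open == 0 or n_closed == 0:
        raise ValueError("both states must be sampled at least once")
    delta_g_open = -R_KCAL * temperature_k * math.log(n_open / n_closed)
    switches = sum(1 for a, b in zip(states, states[1:]) if a != b)
    total_time_ns = (len(states) - 1) * frame_interval_ns
    return delta_g_open, switches / total_time_ns


# Hypothetical per-frame labels saved every 2 ns.
labels = ["closed", "closed", "closed", "open", "open", "open"]
delta_g_open, switches_per_ns = two_state_summary(labels, frame_interval_ns=2.0)
print(delta_g_open, switches_per_ns)
```

Estimates of this kind are only meaningful when many transitions have been observed. For gates that open on timescales longer than the simulation, the enhanced-sampling methods mentioned above require their own reweighting procedures, which this sketch does not attempt.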

12.3.3 Molecular Engineering


Three main approaches to gate engineering can be followed:

(1) Gates can be modified by mutating the residues of existing gates,
the hinge, or the anchoring residues. In this way it may be
possible to rationally modify the gate amplitude, frequency, or
the affinity toward certain substrates or solvent.
(2) Gates can be removed by mutating the gating residues in order
to leave the pathway permanently open. This could not only lead
to an increase in the ligand exchange rate but also change the
access of the solvent.
(3) Gates can be introduced into enzyme pathways that were
originally open, thus providing control over the transport
of substances. It can be achieved by mutating tunnel-lining
residues, preferably in the bottleneck or in the entrance.

Modification of catalytic activity by gate engineering can be
easily rationalized by considering that substrates and
products are exchanged at gate-controlled rates, which can limit the overall catalytic
cycle. Gates can also control the access of water molecules to the
active site, making the chemical reaction more or less favored.
There are many examples wherein enzyme activity has been
changed by mutation of gate residues. Oxidation of p-nitrophenol
by toluene-o-xylene monooxygenase was improved 15-fold by the
E214G mutation [94]. Also, in the lipase from Burkholderia cepacia
an overall 15-fold increase in specific activity toward (R,S)-2-chloro
ethyl 2-bromophenylacetate was observed with the double mutation L17S+L287I
[110]. Several mutations at V74 and at V74+L122
in NiFe hydrogenase reduced the transport rates of
CO and O2 molecules through the tunnel, thereby increasing the
resistance of that enzyme to inhibition by these molecules
[141]. Gate removal in tryptophan synthase with the F280C or
F280S mutations led to an increase in the rate of indole binding
[116]. In imidazole glycerol phosphate synthase, gate removal by the
T78A mutation increased the ammonia transfer rate and also the
overall enzyme activity [146]. On the other hand, gate disruption
by R5A led to increased access of water to the active site,
which impaired the enzyme activity. The same mutation also caused
ammonia to leak through the interdomain tunnel to the bulk solvent,
resulting in a 10³-fold decrease of the cyclase reaction rate [146].
A similar ammonia leakage was observed with G359F and G359Y
mutations in carbamoyl phosphate synthetase [202, 203]. The gate
removal in FabZ β-hydroxyacyl-acyl carrier protein dehydratase by
mutation Y100A leaves the active site exposed to the bulk solvent
and results in a much stronger binding of the product to the active
site, reducing the enzyme activity by 50% [118]. The DhaA31 mutant
with enhanced activity for 1,2,3-trichloropropane, developed in
our laboratory by Pavlova et al. [35, 115], contains substitutions
at the main tunnel residues C176Y and V245F. Unpublished MD
simulations demonstrated the existence of a gating mechanism
involving these residues, which controls the access of ligands and
solvent to the tunnel. This is a case of gate insertion that resulted in
improvements of the enzyme activity.
Substrate specificity of enzymes has also been modified by gate
engineering. This can be rationalized by the fact that gates control the
nature and geometry of the substrates accessing the active site, so
that a change in the gate may result in a shift of substrate specificity.
An increase in the specificity of toluene-o-xylene monooxygenase toward
the oxidation of p-nitrophenol was observed upon the
E214G mutation of a gating residue [94]. A double mutant (L17S+L287I) of B. cepacia
lipase showed a 178-fold improvement of the E-value toward
(R,S)-2-chloro ethyl 2-bromophenylacetate compared to the wild
type [110]. The Q230P mutation at the hinge region of rabbit 20α-hydroxysteroid
dehydrogenase decreased the flexibility of loop B,
which narrowed its otherwise broad substrate
specificity [148]. Gate removal from toluene-4-monooxygenase
by the D285I and D285Q mutations at the tunnel entrance increased
its ability to hydroxylate bulkier substrates such as 2-phenylethanol
and methyl-p-tolyl sulphide by 8- and 11-fold, respectively [96].
Mutations in the access tunnel of C. rugosa lipase changed its
substrate specificity in terms of the chain lengths of the fatty acids
accepted. Introduction of bulkier aromatic residues at the entrance
or inside the tunnel changed the substrate specificity profile. We
speculate that these mutations are likely to have introduced some
type of gating common for F and W residues, for example, wing or
swinging-door gates [132].
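
The mutation labels used throughout this section (for example L177W or L17S+L287I) follow the usual convention of wild-type residue, position, and new residue, with "+" joining multiple substitutions. As a small illustration, the sketch below parses such labels and flags substitutions that introduce an aromatic residue (F, W, or Y), the residue types most often found in wing and swinging-door gates. This flag is only a crude heuristic assumed here for illustration; it says nothing about whether a gate is actually formed, and the function names are hypothetical.

```python
import re

_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")
AROMATIC = {"F", "W", "Y"}


def parse_mutations(label):
    """Split a label such as 'L17S+L287I' into (wild_type, position, mutant) tuples."""
    parsed = []
    for part in label.split("+"):
        match = _MUTATION.match(part.strip())
        if match is None:
            raise ValueError(f"unrecognized mutation: {part!r}")
        wild_type, position, mutant = match.groups()
        parsed.append((wild_type, int(position), mutant))
    return parsed


def introduces_aromatic(label):
    """True if any substitution replaces a non-aromatic residue with F, W, or Y."""
    return any(mutant in AROMATIC and wild_type not in AROMATIC
               for wild_type, _, mutant in parse_mutations(label))


for label in ("L177W", "C176Y", "V245F", "E214G", "L17S+L287I"):
    print(label, parse_mutations(label), introduces_aromatic(label))
```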
Modulation of product specificity by mutation of gate residues
has been reported in a few cases. Escherichia coli undecaprenyl-pyrophosphate
synthase condenses isopentenyl pyrophosphate
with allylic pyrophosphate units to generate linear isoprenyl
polymers. It was found that a gate formed by a flexible loop
controls the extent of the reaction and the product release and thus
the length of the polymer formed. The L137A mutation, located at
the bottom of the tunnel, led to formation of the C70 polymer rather
than the smaller C55 formed by the wild type. On the other hand, the A69L
mutant produces the smaller C30 polymer instead [107].
To the best of our knowledge, there have been no reports regard-
ing protein stability enhancements achieved by gate modification
to date. However, considering the structural basis for improving
enzyme stability [204–206], it should be in principle possible to
construct stable mutants by engineering enzyme gates.

12.4 Conclusions

Molecular tunnels, channels, and gates are structural features widely
represented in the protein world. Protein tunnels can be found in
any enzyme containing a buried active site, in which they serve as a
pathway connecting the active site with bulk solvent or connecting
multiple active sites. We note that proteins containing tunnels of
some type can be found in all six classes of enzymes, and the same
is true also for enzyme gates. The gates can be very diverse in
terms of localization, involved structural elements, amplitudes, and
frequencies of motion.
Tunnels, channels, and gates play important roles and in
some cases are essential for enzymatic catalysis. They control the
transport of small ligands and solvent molecules to and from the
active site and in this way modulate enzyme activity and substrate
specificity. Furthermore, they enable synchronization of molecular
events taking place in distinct parts of the protein structure and
control properties of the active site environment during individual
phases of a catalytic cycle.
We have surveyed existing methods suitable for the study
of tunnels, channels, and gates in protein structures. X-ray
crystallography is one of the most important experimental tools for
identification of permanent tunnels and channels, while it is less
suitable for visualization of transient structures and gates. NMR
spectroscopy may supply missing information on dynamical protein
structures. Fluorescence spectroscopic methods may provide additional
evidence for conformational changes in proteins and reveal
details about the gating processes. Classical and enhanced variants
of MD are currently among the most useful tools for identifying
and characterizing transient tunnels and gates. Combined with
specialized void detection tools, MD allows exploration of a large
conformational space and study of the mechanisms involved in
dynamical processes.
Due to their structural and functional importance, molecular
tunnels and gates appear to be very attractive targets for protein
engineering. The provided examples demonstrate that, by changing
a specific tunnel or gate residue, it is possible to modify enzyme
properties such as activity, specificity, enantioselectivity, and stabil-
ity. It is already possible to tune properties of a target enzyme by
performing such modifications in a rational manner. With further
expansion of our knowledge on enzyme tunnels and gates, it will be
possible to design these fascinating structural features de novo in
the future.

Acknowledgments

The authors would like to thank the Grant Agency of the Czech
Republic (P503/12/0572) and the Ministry of Education
of the Czech Republic (LO1214, LH14027, and LQ1605) for
financial support. S.M. is supported by the SoMoPro II Programme
(project BIOGATE, Nr. 4SGA8519), which is co-financed by the
People Programme (Marie Curie action) of the Seventh Framework
Programme of the European Union according to the REA Grant
Agreement No. 291782, and the South Moravian Region. J.B. is
supported by the project REGPOT financed by the European Union
(316345). Computational resources were provided by CESNET
LM2015042 and the CERIT Scientific Cloud LM2015085, under the
program “Projects of Large Research, Development, and Innovations
Infrastructures.” This text reflects only the authors’
view, and the European Union is not liable for any use that may be
made of the information contained herein.

References

1. Dalby, P. A. (2007). Engineering enzymes for biocatalysis, Recent Pat.
Biotechnol., 1, pp. 1–9.
2. Kaul, P., and Asano, Y. (2012). Strategies for discovery and improve-
ment of enzyme function: state of the art and opportunities, Microb.
Biotechnol., 5, pp. 18–33.
3. Davids, T., Schmidt, M., Böttcher, D., and Bornscheuer, U. T. (2013).
Strategies for the discovery and engineering of enzymes for biocataly-
sis, Curr. Opin. Chem. Biol., 17, pp. 215–220.
4. Prokop, Z., Gora, A., Brezovsky, J., Chaloupkova, R., Stepankova, V., and
Damborsky, J. (2012). Engineering of protein tunnels: keyhole-lock-
key model for catalysis by the enzymes with buried active sites. In
Protein Engineering Handbook, Lutz, S., and Bornscheuer, U. T., eds.
(Wiley-VCH, Weinheim), 3, pp. 421–464.
5. Huang, X., Holden, H. M., and Raushel, F. M. (2001). Channeling of
substrates and intermediates in enzyme-catalyzed reactions, Annu.
Rev. Biochem., 70, pp. 149–180.
6. Raushel, F. M., Thoden, J. B., and Holden, H. M. (2003). Enzymes with
molecular tunnels, Acc. Chem. Res., 36, pp. 539–548.
7. Fischer, E. (1894). Einfluss der configuration auf die wirkung der
enzyme, Berichte Dtsch. Chem. Ges., 27, pp. 2985–2993.
8. Koshland, D. E. (1958). Application of a theory of enzyme specificity to
protein synthesis, Proc. Natl. Acad. Sci. U S A, 44, pp. 98–104.
9. McCammon, J. A., and Northrup, S. H. (1981). Gated binding of ligands
to proteins, Nature, 293, pp. 316–317.
10. Zhou, H.-X., Wlodek, S. T., and McCammon, J. A. (1998). Conformation
gating as a mechanism for enzyme specificity, Proc. Natl. Acad. Sci. U S
A, 95, pp. 9280–9283.
11. Sheu, S.-Y. (2006). Selectivity principle of the ligand escape process
from a two-gate tunnel in myoglobin: molecular dynamics simulation,
J. Chem. Phys., 124, pp. 154711–154711–9.
12. Zhou, H.-X., and McCammon, J. A. (2010). The gates of ion channels and
enzymes, Trends Biochem. Sci., 35, pp. 179–185.
13. McCammon, J. A. (2011). Gated diffusion-controlled reactions, BMC
Biophys., 4, p. 4.
14. Gora, A., Brezovsky, J., and Damborsky, J. (2013). Gates of enzymes,
Chem. Rev., 113, pp. 5871–5923.
15. Chovancová, E., Pavelka, A., Benes, P., Strnad, O., Brezovsky, J.,
Kozlikova, B., Gora, A., Sustr, V., Klvana, M., Medek, P., et al. (2012).
CAVER 3.0: a tool for the analysis of transport pathways in dynamic
protein structures, PLOS Comput. Biol., 8, p. e1002708.
16. Littleton, J. T., and Ganetzky, B. (2000). Ion channels and synaptic
organization: analysis of the Drosophila genome, Neuron, 26, pp. 35–43.
17. MacKinnon, R. (2003). Potassium channels, FEBS Lett., 555, pp. 62–65.
18. Morth, J. P., Pedersen, B. P., Buch-Pedersen, M. J., Andersen, J. P., Vilsen,
B., Palmgren, M. G., and Nissen, P. (2011). A structural overview of the
plasma membrane Na+,K+-ATPase and H+-ATPase ion pumps, Nat.
Rev. Mol. Cell Biol., 12, pp. 60–70.
19. Iversen, L. (2000). Neurotransmitter transporters: fruitful targets for
CNS drug discovery, Mol. Psychiatry, 5, pp. 357–362.
20. Klebba, P. E., and Newton, S. M. (1998). Mechanisms of solute transport
through outer membrane porins: burning down the house, Curr. Opin.
Microbiol., 1, pp. 238–247.
21. Moore, S. A., Baker, H. M., Blythe, T. J., Kitson, K. E., Kitson, T. M., and
Baker, E. N. (1998). Sheep liver cytosolic aldehyde dehydrogenase: the
structure reveals the basis for the retinal specificity of class 1 aldehyde
dehydrogenases, Structure, 6, pp. 1541–1551.
22. Juan, E. C. M., Hoque, M. M., Hossain, M. T., Yamamoto, T., Imamura,
S., Suzuki, K., Sekiguchi, T., and Takénaka, A. (2007). The structures of
pyruvate oxidase from Aerococcus viridans with cofactors and with a
reaction intermediate reveal the flexibility of the active-site tunnel for
catalysis, Acta Crystallogr. Sect. F Struct. Biol. Cryst. Commun., 63, pp.
900–907.
23. Wilce, M. C., Dooley, D. M., Freeman, H. C., Guss, J. M., Matsunami,
H., McIntire, W. S., Ruggiero, C. E., Tanizawa, K., and Yamaguchi, H.
(1997). Crystal structures of the copper-containing amine oxidase
from Arthrobacter globiformis in the holo and apo forms: implications
for the biogenesis of topaquinone, Biochemistry, 36, pp.
16116–16133.
24. Wang, J., Ortiz-Maldonado, M., Entsch, B., Massey, V., Ballou, D., and
Gatti, D. L. (2002). Protein and ligand dynamics in 4-hydroxybenzoate
hydroxylase, Proc. Natl. Acad. Sci. U S A, 99, pp. 608–613.
25. Rossjohn, J., McKinstry, W. J., Oakley, A. J., Verger, D., Flanagan, J.,
Chelvanayagam, G., Tan, K. L., Board, P. G., and Parker, M. W. (1998).
Human theta class glutathione transferase: the crystal structure
reveals a sulfate-binding pocket within a buried active site, Structure,
6, pp. 309–322.
26. Kim, D. J., Kim, K. H., Lee, H. H., Lee, S. J., Ha, J. Y., Yoon, H. J., and Suh, S.
W. (2005). Crystal structure of lipoate-protein ligase A bound with the
activated intermediate: insights into interaction with lipoyl domains, J.
Biol. Chem., 280, pp. 38081–38089.
27. Ericsson, D. J., Kasrayan, A., Johansson, P., Bergfors, T., Sandström, A.
G., Bäckvall, J.-E., and Mowbray, S. L. (2008). X-ray structure of Candida
antarctica lipase A shows a novel lid structure and a likely mode of
interfacial activation, J. Mol. Biol., 376, pp. 109–119.
28. Grochulski, P., Bouthillier, F., Kazlauskas, R. J., Serreqi, A. N., Schrag, J.
D., Ziomek, E., and Cygler, M. (1994). Analogs of reaction intermediates
identify a unique substrate binding site in Candida rugosa lipase,
Biochemistry, 33, pp. 3494–3500.
29. Nardini, M., Ridder, I. S., Rozeboom, H. J., Kalk, K. H., Rink, R., Janssen, D.
B., and Dijkstra, B. W. (1999). The X-ray structure of epoxide hydrolase
from Agrobacterium radiobacter AD1. An enzyme to detoxify harmful
epoxides, J. Biol. Chem., 274, pp. 14579–14586.
30. Nardini, M., Rink, R., Janssen, D. B., and Dijkstra, B. W. (2001). Structure
and mechanism of the epoxide hydrolase from Agrobacterium
radiobacter AD1, J. Mol. Catal. B Enzym., 11, pp. 1035–1042.
31. Brown, C. K., Madauss, K., Lian, W., Beck, M. R., Tolbert, W. D., and
Rodgers, D. W. (2001). Structure of neurolysin reveals a deep channel
that limits substrate access, Proc. Natl. Acad. Sci. U S A, 98, pp. 3127–
3132.
32. Leesong, M., Henderson, B. S., Gillig, J. R., Schwab, J. M., and Smith,
J. L. (1996). Structure of a dehydratase-isomerase from the bacterial
pathway for biosynthesis of unsaturated fatty acids: two catalytic
activities in one active site, Structure, 4, pp. 253–264.
33. May, M., Mehboob, S., Mulhearn, D. C., Wang, Z., Yu, H., Thatcher, G. R. J.,
Santarsiero, B. D., Johnson, M. E., and Mesecar, A. D. (2007). Structural
and functional analysis of two glutamate racemase isozymes from
Bacillus anthracis and implications for inhibitor design, J. Mol. Biol.,
371, pp. 1219–1237.
34. Mehboob, S., Guo, L., Fu, W., Mittal, A., Yau, T., Truong, K., Johlfs, M.,
Long, F., Fung, L. W.-M., and Johnson, M. E. (2009). Glutamate racemase
dimerization inhibits dynamic conformational flexibility and reduces
catalytic rates, Biochemistry, 48, pp. 7045–7055.

35. Klvana, M., Pavlova, M., Koudelakova, T., Chaloupkova, R., Dvorak, P.,
Prokop, Z., Stsiapanava, A., Kuty, M., Kuta-Smatanova, I., Dohnalek, J.,
et al. (2009). Pathways and mechanisms for product release in the
engineered haloalkane dehalogenases explored using classical and
random acceleration molecular dynamics simulations, J. Mol. Biol., 392,
pp. 1339–1356.
36. Cojocaru, V., Winn, P. J., and Wade, R. C. (2007). The ins and outs of
cytochrome P450s, Biochim. Biophys. Acta, 1770, pp. 390–401.
37. Chelikani, P., Carpena, X., Fita, I., and Loewen, P. C. (2003). An electrical
potential in the access channel of catalases enhances catalysis, J. Biol.
Chem., 278, pp. 31290–31296.
38. Fontecilla-Camps, J. C., Volbeda, A., Cavazza, C., and Nicolet, Y. (2007).
Structure/function relationships of [NiFe]- and [FeFe]-hydrogenases,
Chem. Rev., 107, pp. 4273–4303.
39. Moustafa, I. M., Foster, S., Lyubimov, A. Y., and Vrielink, A. (2006).
Crystal structure of LAAO from Calloselasma rhodostoma with an L-
phenylalanine substrate: insights into structure and mechanism, J. Mol.
Biol., 364, pp. 991–1002.
40. Petrek, M., Otyepka, M., Banas, P., Kosinová, P., Koca, J., and Damborsky,
J. (2006). CAVER: a new tool to explore routes from protein clefts,
pockets and cavities, BMC Bioinformatics, 7, p. 316.
41. Newman, J., Peat, T. S., Richard, R., Kan, L., Swanson, P. E., Affholter,
J. A., Holmes, I. H., Schindler, J. F., Unkefer, C. J., and Terwilliger, T.
C. (1999). Haloalkane dehalogenases: structure of a Rhodococcus
enzyme, Biochemistry, 38, pp. 16105–16114.
42. Marek, J., Vévodová, J., Smatanová, I. K., Nagata, Y., Svensson, L. A.,
Newman, J., Takagi, M., and Damborský, J. (2000). Crystal structure of
the haloalkane dehalogenase from Sphingomonas paucimobilis UT26,
Biochemistry, 39, pp. 14082–14086.
43. Oakley, A. J., Klvana, M., Otyepka, M., Nagata, Y., Wilce, M. C. J., and
Damborský, J. (2004). Crystal structure of haloalkane dehalogenase
LinB from Sphingomonas paucimobilis UT26 at 0.95 Å resolution:
dynamics of catalytic residues, Biochemistry, 43, pp. 870–
878.
44. Sanson, B., Colletier, J.-P., Xu, Y., Lang, P. T., Jiang, H., Silman, I.,
Sussman, J. L., and Weik, M. (2011). Backdoor opening mechanism
in acetylcholinesterase based on X-ray crystallography and molecular
dynamics simulations, Protein Sci., 20, pp. 1114–
1118.
45. Westbrook, E. M., Piro, O. E., and Sigler, P. B. (1984). The 6-Å crystal
structure of delta 5-3-ketosteroid isomerase. Architecture and location
of the active center, J. Biol. Chem., 259, pp. 9096–9103.
46. Binda, C., Bossi, R. T., Wakatsuki, S., Arzt, S., Coda, A., Curti, B., Vanoni,
M. A., and Mattevi, A. (2000). Cross-talk and ammonia channeling
between active centers in the unexpected domain arrangement of
glutamate synthase, Structure, 8, pp. 1299–1308.
47. Teplyakov, A., Obmolova, G., Badet, B., and Badet-Denisot, M. A. (2001).
Channeling of ammonia in glucosamine-6-phosphate synthase, J. Mol.
Biol., 313, pp. 1093–1102.
48. Krahn, J. M., Kim, J. H., Burns, M. R., Parry, R. J., Zalkin, H.,
and Smith, J. L. (1997). Coupled formation of an amidotransferase
interdomain ammonia channel and a phosphoribosyltransferase active
site, Biochemistry, 36, pp. 11061–11068.
49. Maynard, E. L., and Lindahl, P. A. (2001). Catalytic coupling of the active
sites in acetyl-CoA synthase, a bifunctional CO-channeling enzyme,
Biochemistry, 40, pp. 13262–13267.
50. Hyde, C. C., Ahmed, S. A., Padlan, E. A., Miles, E. W., and Davies, D. R.
(1988). Three-dimensional structure of the tryptophan synthase alpha
2 beta 2 multienzyme complex from Salmonella typhimurium, J. Biol.
Chem., 263, pp. 17857–17871.
51. Miles, E. W. (2001). Tryptophan synthase: a multienzyme complex with
an intramolecular tunnel, Chem. Rec., 1, pp. 140–151.
52. Chaudhuri, B. N., Lange, S. C., Myers, R. S., Chittur, S. V., Davisson,
V. J., and Smith, J. L. (2001). Crystal structure of imidazole glycerol
phosphate synthase: a tunnel through a (beta/alpha)8 barrel joins two
active sites, Structure, 9, pp. 987–997.
53. Thoden, J. B., Holden, H. M., Wesenberg, G., Raushel, F. M., and Rayment,
I. (1997). Structure of carbamoyl phosphate synthetase: a journey of
96 Å from substrate to product, Biochemistry, 36, pp. 6305–
6316.
54. Kim, J., and Raushel, F. M. (2004). Access to the carbamate tunnel of
carbamoyl phosphate synthetase, Arch. Biochem. Biophys., 425, pp. 33–
41.
55. Kim, J., and Raushel, F. M. (2004). Perforation of the tunnel wall
in carbamoyl phosphate synthetase derails the passage of ammonia
between sequential active sites, Biochemistry, 43, pp. 5334–
5340.

56. Larsen, T. M., Boehlein, S. K., Schuster, S. M., Richards, N. G., Thoden, J.
B., Holden, H. M., and Rayment, I. (1999). Three-dimensional structure
of Escherichia coli asparagine synthetase B: a short journey from
substrate to product, Biochemistry, 38, pp. 16146–16157.
57. Mittermaier, A. K., and Kay, L. E. (2009). Observing biological dynamics
at atomic resolution using NMR, Trends Biochem. Sci., 34, pp. 601–611.
58. Kleckner, I. R., and Foster, M. P. (2011). An introduction to NMR-based
approaches for measuring protein dynamics, Biochim. Biophys. Acta,
1814, pp. 942–968.
59. Haller, J. D., and Schanda, P. (2013). Amplitudes and time scales of
picosecond-to-microsecond motion in proteins studied by solid-state
NMR: a critical evaluation of experimental approaches and application
to crystalline ubiquitin, J. Biomol. NMR, 57, pp. 263–280.
60. Kaieda, S., and Halle, B. (2013). Internal water and microsecond
dynamics in myoglobin, J. Phys. Chem. B, 117, pp. 14676–14687.
61. Persson, F., and Halle, B. (2013). Transient access to the protein
interior: simulation versus NMR, J. Am. Chem. Soc., 135, pp. 8735–
8748.
62. Weingarth, M., van der Cruijsen, E. A. W., Ostmeyer, J., Lievestro, S.,
Roux, B., and Baldus, M. (2014). Quantitative analysis of the water
occupancy around the selectivity filter of a K+ channel in different
gating modes, J. Am. Chem. Soc., 136, pp. 2000–2007.
63. Case, D. A., Darden, T. A., Cheatham, III, T. E., Simmerling, C. L., Wang, J.,
Duke, R. E., Luo, R., Walker, R. C., Zhang, W., Merz, K. M., et al. (2012).
AMBER 12 (University of California, San Francisco).
64. Brooks, B. R., Brooks, C. L., 3rd, Mackerell, A. D., Jr, Nilsson, L., Petrella,
R. J., Roux, B., Won, Y., Archontis, G., Bartels, C., Boresch, S., et al. (2009).
CHARMM: the biomolecular simulation program, J. Comput. Chem., 30,
pp. 1545–1614.
65. Hess, B., Kutzner, C., van der Spoel, D., and Lindahl, E. (2008).
GROMACS 4: algorithms for highly efficient, load-balanced, and
scalable molecular simulation, J. Chem. Theory Comput., 4, pp. 435–
447.
66. Phillips, J. C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa,
E., Chipot, C., Skeel, R. D., Kalé, L., and Schulten, K. (2005). Scalable
molecular dynamics with NAMD, J. Comput. Chem., 26, pp. 1781–1802.
67. Shaw, D. E., Chao, J. C., Eastwood, M. P., Gagliardo, J., Grossman, J. P., Ho,
C. R., Ierardi, D. J., Kolossváry, I., Klepeis, J. L., Layman, T., et al. (2008).
Anton, a special-purpose machine for molecular dynamics simulation,
Commun. ACM, 51, p. 91.
68. Friedrichs, M. S., Eastman, P., Vaidyanathan, V., Houston, M., Legrand,
S., Beberg, A. L., Ensign, D. L., Bruns, C. M., and Pande, V. S. (2009).
Accelerating molecular dynamic simulation on graphics processing
units, J. Comput. Chem., 30, pp. 864–872.
69. Hamelberg, D., Mongan, J., and McCammon, J. A. (2004). Accelerated
molecular dynamics: a promising and efficient simulation method for
biomolecules, J. Chem. Phys., 120, pp. 11919–11929.
70. Pierce, L. C. T., Salomon-Ferrer, R., de Oliveira, C. A. F., McCammon,
J. A., and Walker, R. C. (2012). Routine access to millisecond
time scale events with accelerated molecular dynamics, J. Chem. Theory
Comput., 8, pp. 2997–3002.
71. Grubmüller, H. (1995). Predicting slow structural transitions in
macromolecular systems: conformational flooding, Phys. Rev. E, 52, pp.
2893–2906.
72. Lange, O. F., Schäfer, L. V., and Grubmüller, H. (2006). Flooding in
GROMACS: accelerated barrier crossings in molecular dynamics, J.
Comput. Chem., 27, pp. 1693–1702.
73. Voter, A. F. (1997). Hyperdynamics: accelerated molecular dynamics of
infrequent events, Phys. Rev. Lett., 78, pp. 3908–3911.
74. Voter, A. F. (1997). A method for accelerating the molecular dynamics
simulation of infrequent events, J. Chem. Phys., 106, pp. 4665–
4677.
75. Bussi, G., Laio, A., and Parrinello, M. (2006). Equilibrium free energies
from nonequilibrium metadynamics, Phys. Rev. Lett., 96, p. 090601.
76. Darve, E., and Pohorille, A. (2001). Calculating free energies using
average force, J. Chem. Phys., 115, pp. 9169–9183.
77. Lüdemann, S. K., Lounnas, V., and Wade, R. C. (2000). How do
substrates enter and products exit the buried active site of cytochrome
P450cam? 1. Random expulsion molecular dynamics investigation of
ligand access channels and mechanisms, J. Mol. Biol., 303, pp. 797–
811.
78. Schleinkofer, K., Sudarko, Winn, P. J., Lüdemann, S. K., and Wade, R. C.
(2005). Do mammalian cytochrome P450s show multiple ligand access
pathways and ligand channelling? EMBO Rep., 6, pp. 584–589.
79. Izrailev, S., Stepaniants, S., Balsera, M., Oono, Y., and Schulten, K. (1997).
Molecular dynamics study of unbinding of the avidin-biotin complex,
Biophys. J., 72, pp. 1568–1581.

80. Park, S., and Schulten, K. (2004). Calculating potentials of mean force
from steered molecular dynamics simulations, J. Chem. Phys., 120, pp.
5946–5961.
81. Lüdemann, S. K., Lounnas, V., and Wade, R. C. (2000). How do
substrates enter and products exit the buried active site of cytochrome
P450cam? 2. Steered molecular dynamics and adiabatic mapping of
substrate pathways, J. Mol. Biol., 303, pp. 813–830.
82. Yaffe, E., Fishelovitch, D., Wolfson, H. J., Halperin, D., and Nussinov, R.
(2008). MolAxis: efficient and accurate identification of channels in
macromolecules, Proteins, 73, pp. 72–86.
83. Petrek, M., Kosinová, P., Koca, J., and Otyepka, M. (2007). MOLE: a
Voronoi diagram-based explorer of molecular channels, pores, and
tunnels, Structure, 15, pp. 1357–1363.
84. Brezovsky, J., Chovancova, E., Gora, A., Pavelka, A., Biedermannova,
L., and Damborsky, J. (2013). Software tools for identification,
visualization and analysis of protein tunnels and channels, Biotechnol.
Adv., 31, pp. 38–49.
85. Morley, K. L., and Kazlauskas, R. J. (2005). Improving enzyme
properties: when are closer mutations better? Trends Biotechnol., 23,
pp. 231–237.
86. Chen, L., Lyubimov, A. Y., Brammer, L., Vrielink, A., and Sampson, N. S.
(2008). The binding and release of oxygen and hydrogen peroxide are
directed by a hydrophobic tunnel in cholesterol oxidase, Biochemistry,
47, pp. 5368–5377.
87. Piubelli, L., Pedotti, M., Molla, G., Feindler-Boeckh, S., Ghisla, S.,
Pilone, M. S., and Pollegioni, L. (2008). On the oxygen reactivity of
flavoprotein oxidases: an oxygen access tunnel and gate in Brevibac-
terium sterolicum cholesterol oxidase, J. Biol. Chem., 283, pp. 24738–
24747.
88. Frank, R. A. W., Titman, C. M., Pratap, J. V., Luisi, B. F., and Perham, R.
N. (2004). A molecular switch and proton wire synchronize the active
sites in thiamine enzymes, Science, 306, pp. 872–876.
89. Dossena, L., Curti, B., and Vanoni, M. A. (2007). Activation and coupling
of the glutaminase and synthase reaction of glutamate synthase is
mediated by E1013 of the ferredoxin-dependent enzyme, belonging to
loop 4 of the synthase domain, Biochemistry, 46, pp. 4473–
4485.
90. Tan, X., Loke, H.-K., Fitch, S., and Lindahl, P. A. (2005). The tunnel
of acetyl-coenzyme A synthase/carbon monoxide dehydrogenase
regulates delivery of CO to the active site, J. Am. Chem. Soc., 127, pp.
5833–5839.
91. Zamocky, M., Herzog, C., Nykyri, L. M., and Koller, F. (1995). Site-
directed mutagenesis of the lower parts of the major substrate channel
of yeast catalase A leads to highly increased peroxidatic activity, FEBS
Lett., 367, pp. 241–245.
92. Sevinc, M. S., Maté, M. J., Switala, J., Fita, I., and Loewen, P. C. (1999).
Role of the lateral channel in catalase HPII of Escherichia coli, Protein
Sci., 8, pp. 490–498.
93. Melik-Adamyan, W., Bravo, J., Carpena, X., Switala, J., Maté, M. J., Fita, I.,
and Loewen, P. C. (2001). Substrate flow in catalases deduced from the
crystal structures of active site variants of HPII from Escherichia coli,
Proteins, 44, pp. 270–281.
94. Vardar, G., and Wood, T. K. (2005). Alpha-subunit positions methionine
180 and glutamate 214 of Pseudomonas stutzeri OX1 toluene-o-
xylene monooxygenase influence catalysis, J. Bacteriol., 187, pp. 1511–
1514.
95. Notomista, E., Cafaro, V., Bozza, G., and Di Donato, A. (2009).
Molecular determinants of the regioselectivity of toluene/o-xylene
monooxygenase from Pseudomonas sp. strain OX1, Appl. Environ.
Microbiol., 75, pp. 823–836.
96. Brouk, M., Derry, N.-L., Shainsky, J., Zelas, Z. B.-B., Boyko, Y., Dabush, K.,
and Fishman, A. (2010). The influence of key residues in the tunnel
entrance and the active site on activity and selectivity of toluene-4-
monooxygenase, J. Mol. Catal. B Enzym., 66, pp. 72–80.
97. Fishman, A., Tao, Y., Bentley, W. E., and Wood, T. K. (2004). Protein
engineering of toluene 4-monooxygenase of Pseudomonas mendocina
KR1 for synthesizing 4-nitrocatechol from nitrobenzene, Biotechnol.
Bioeng., 87, pp. 779–790.
98. Feingersch, R., Shainsky, J., Wood, T. K., and Fishman, A. (2008).
Protein engineering of toluene monooxygenases for synthesis of chiral
sulfoxides, Appl. Environ. Microbiol., 74, pp. 1555–1566.
99. Smith, G., Modi, S., Pillai, I., Lian, L. Y., Sutcliffe, M. J., Pritchard, M. P.,
Friedberg, T., Roberts, G. C., and Wolf, C. R. (1998). Determinants of
the substrate specificity of human cytochrome P-450 CYP2D6: design
and construction of a mutant with testosterone hydroxylase activity,
Biochem. J., 331 (Pt 3), pp. 783–792.
100. Nguyen, T.-A., Tychopoulos, M., Bichat, F., Zimmermann, C., Flinois, J.-P.,
Diry, M., Ahlberg, E., Delaforge, M., Corcos, L., Beaune, P., et al. (2008).
Improvement of cyclophosphamide activation by CYP2B6 mutants:
from in silico to ex vivo, Mol. Pharmacol., 73, pp. 1122–1133.
101. Khan, K. K., He, Y. Q., Domanski, T. L., and Halpert, J. R. (2002). Mida-
zolam oxidation by cytochrome P450 3A4 and active-site mutants: an
evaluation of multiple binding sites and of the metabolic pathway that
leads to enzyme inactivation, Mol. Pharmacol., 61, pp. 495–506.
102. Fishelovitch, D., Shaik, S., Wolfson, H. J., and Nussinov, R. (2009).
Theoretical characterization of substrate access/exit channels in the
human cytochrome P450 3A4 enzyme: involvement of phenylalanine
residues in the gating mechanism, J. Phys. Chem. B, 113, pp. 13018–
13025.
103. Carmichael, A. B., and Wong, L. L. (2001). Protein engineering of
Bacillus megaterium CYP102. The oxidation of polycyclic aromatic
hydrocarbons, Eur. J. Biochem., 268, pp. 3117–3125.
104. Wen, Z., Baudry, J., Berenbaum, M. R., and Schuler, M. A. (2005).
Ile115Leu mutation in the SRS1 region of an insect cytochrome P450
(CYP6B1) compromises substrate turnover via changes in a predicted
product release channel, Protein Eng. Des. Sel., 18, pp. 191–199.
105. Floquet, N., Mouilleron, S., Daher, R., Maigret, B., Badet, B., and Badet-
Denisot, M.-A. (2007). Ammonia channeling in bacterial glucosamine-
6-phosphate synthase (Glms): molecular dynamics simulations and
kinetic studies of protein mutants, FEBS Lett., 581, pp. 2981–2987.
106. Xu, Z., Metsä-Ketelä, M., and Hertweck, C. (2009). Ketosynthase III as
a gateway to engineering the biosynthesis of antitumoral benastatin
derivatives, J. Biotechnol., 140, pp. 107–113.
107. Ko, T. P., Chen, Y. K., Robinson, H., Tsai, P. C., Gao, Y. G., Chen, A. P.,
Wang, A. H., and Liang, P. H. (2001). Mechanism of product chain
length determination and the role of a flexible loop in Escherichia coli
undecaprenyl-pyrophosphate synthase catalysis, J. Biol. Chem., 276,
pp. 47474–47482.
108. Labonté, P., Axelrod, V., Agarwal, A., Aulabaugh, A., Amin, A., and
Mak, P. (2002). Modulation of hepatitis C virus RNA-dependent RNA
polymerase activity by structure-based site-directed mutagenesis, J.
Biol. Chem., 277, pp. 38838–38846.
109. Qian, Z., Horton, J. R., Cheng, X., and Lutz, S. (2009). Structural redesign
of lipase B from Candida antarctica by circular permutation and
incremental truncation, J. Mol. Biol., 393, pp. 191–201.
110. Lafaquière, V., Barbe, S., Puech-Guenot, S., Guieysse, D., Cortés, J.,
Monsan, P., Siméon, T., André, I., and Remaud-Siméon, M. (2009).
Control of lipase enantioselectivity by engineering the substrate
binding site and access channel, ChemBioChem, 10,
pp. 2760–2771.
111. Boublik, Y., Saint-Aguet, P., Lougarre, A., Arnaud, M., Villatte, F., Estrada-
Mondaca, S., and Fournier, D. (2002). Acetylcholinesterase engineering
for detection of insecticide residues, Protein Eng., 15, pp. 43–50.
112. Harel, M., Sussman, J. L., Krejci, E., Bon, S., Chanal, P., Massoulié,
J., and Silman, I. (1992). Conversion of acetylcholinesterase to
butyrylcholinesterase: modeling and mutagenesis, Proc. Natl. Acad. Sci.
U S A, 89, pp. 10827–10831.
113. Kotik, M., Stepánek, V., Kyslík, P., and Maresová, H. (2007). Cloning of an
epoxide hydrolase-encoding gene from Aspergillus niger M200, over-
expression in E. coli, and modification of activity and enantioselectivity
of the enzyme by protein engineering, J. Biotechnol., 132, pp. 8–15.
114. Chaloupkova, R., Sýkorová, J., Prokop, Z., Jesenská, A., Monincova,
M., Pavlová, M., Tsuda, M., Nagata, Y., and Damborsky, J. (2003).
Modification of activity and specificity of haloalkane dehalogenase
from Sphingomonas paucimobilis UT26 by engineering of its entrance
tunnel, J. Biol. Chem., 278, pp. 52622–52628.
115. Pavlova, M., Klvana, M., Prokop, Z., Chaloupkova, R., Banas, P., Otyepka,
M., Wade, R. C., Tsuda, M., Nagata, Y., and Damborsky, J. (2009).
Redesigning dehalogenase access tunnels as a strategy for degrading
an anthropogenic substrate, Nat. Chem. Biol., 5, pp. 727–733.
116. Ruvinov, S. B., Yang, X. J., Parris, K. D., Banik, U., Ahmed, S. A., Miles, E. W.,
and Sackett, D. L. (1995). Ligand-mediated changes in the tryptophan
synthase indole tunnel probed by nile red fluorescence with wild type,
mutant, and chemically modified enzymes, J. Biol. Chem., 270, pp.
6357–6369.
117. Brzović, P. S., Sawa, Y., Hyde, C. C., Miles, E. W., and Dunn, M. F.
(1992). Evidence that mutations in a loop region of the alpha-subunit
inhibit the transition from an open to a closed conformation in the
tryptophan synthase bienzyme complex, J. Biol. Chem., 267, pp. 13028–
13038.
118. Zhang, L., Liu, W., Hu, T., Du, L., Luo, C., Chen, K., Shen, X., and Jiang,
H. (2008). Structural basis for catalytic and inhibitory mechanisms
of beta-hydroxyacyl-acyl carrier protein dehydratase (FabZ), J. Biol.
Chem., 283, pp. 5370–5379.
119. Tang, L., Lutje Spelberg, J. H., Fraaije, M. W., and Janssen, D. B. (2003).
Kinetic mechanism and enantioselectivity of halohydrin dehalogenase
from Agrobacterium radiobacter, Biochemistry, 42, pp. 5378–
5386.
120. Dang, T., and Prestwich, G. D. (2000). Site-directed mutagenesis of
squalene-hopene cyclase: altered substrate specificity and product
distribution, Chem. Biol., 7, pp. 643–649.
121. Meyer, M. E., Gutierrez, J. A., Raushel, F. M., and Richards, N. G. J. (2010).
A conserved glutamate controls the commitment to acyl-adenylate
formation in asparagine synthetase, Biochemistry, 49, pp.
9391–9401.
122. Fan, Y., Lund, L., Shao, Q., Gao, Y.-Q., and Raushel, F. M. (2009). A
combined theoretical and experimental study of the ammonia tunnel
in carbamoyl phosphate synthetase, J. Am. Chem. Soc., 131, pp. 10211–
10219.
123. Kopečný, D., Tylichová, M., Snegaroff, J., Popelková, H., and Šebela,
M. (2011). Carboxylate and aromatic active-site residues are determinants
of high-affinity binding of ω-aminoaldehydes to plant
aminoaldehyde dehydrogenases, FEBS J., 278, pp. 3130–3139.
124. Matsuzaki, R., and Tanizawa, K. (1998). Exploring a channel to
the active site of copper/topaquinone-containing phenylethylamine
oxidase by chemical modification and site-specific mutagenesis,
Biochemistry, 37, pp. 13947–13957.
125. Van Beilen, J. B., Smits, T. H. M., Roos, F. F., Brunner, T., Balada,
S. B., Röthlisberger, M., and Witholt, B. (2005). Identification of an
amino acid position that determines the substrate range of integral
membrane alkane hydroxylases, J. Bacteriol., 187, pp. 85–91.
126. Jez, J. M., Bowman, M. E., and Noel, J. P. (2002). Expanding the
biosynthetic repertoire of plant type III polyketide synthases by
altering starter molecule specificity, Proc. Natl. Acad. Sci. U S A, 99, pp.
5319–5324.
127. Abe, I., Utsumi, Y., Oguro, S., Morita, H., Sano, Y., and Noguchi, H.
(2005). A plant type III polyketide synthase that produces pentaketide
chromone, J. Am. Chem. Soc., 127, pp. 1362–1363.
128. Sankaranarayanan, R., Saxena, P., Marathe, U. B., Gokhale, R. S.,
Shanmugam, V. M., and Rukmini, R. (2004). A novel tunnel in
mycobacterial type III polyketide synthase reveals the structural basis
for generating diverse metabolites, Nat. Struct. Mol. Biol., 11, pp. 894–
900.
129. De Groeve, M. R. M., Remmery, L., Van Hoorebeke, A., Stout, J., Desmet,
T., Savvides, S. N., and Soetaert, W. (2010). Construction of cellobiose
phosphorylase variants with broadened acceptor specificity towards
anomerically substituted glucosides, Biotechnol. Bioeng., 107, pp. 413–
420.
130. Guo, R.-T., Kuo, C.-J., Ko, T.-P., Chou, C.-C., Liang, P.-H., and Wang, A. H.-J.
(2004). A molecular ruler for chain elongation catalyzed by octaprenyl
pyrophosphate synthase and its structure-based engineering to produce
unprecedented long chain trans-prenyl products, Biochemistry,
43, pp. 7678–7686.
131. Hidalgo, A., Schliessmann, A., Molina, R., Hermoso, J., and Bornscheuer,
U. T. (2008). A one-pot, simple methodology for cassette randomisa-
tion and recombination for focused directed evolution, Protein Eng.
Des. Sel., 21, pp. 567–576.
132. Schmitt, J., Brocca, S., Schmid, R. D., and Pleiss, J. (2002). Blocking the
tunnel: engineering of Candida rugosa lipase mutants with short chain
length specificity, Protein Eng., 15, pp. 595–601.
133. Marton, Z., Léonard-Nevers, V., Syrén, P.-O., Bauer, C., Lamare, S., Hult,
K., Tranc, V., and Graber, M. (2010). Mutations in the stereospecificity
pocket and at the entrance of the active site of Candida antarctica
lipase B enhancing enzyme enantioselectivity, J. Mol. Catal. B Enzym.,
65, pp. 11–17.
134. Reetz, M. T., Wang, L.-W., and Bocola, M. (2006). Directed evolution
of enantioselective enzymes: iterative cycles of CASTing for probing
protein-sequence space, Angew. Chem., Int. Ed., 45, pp. 1236–1241.
135. Prokop, Z., Sato, Y., Brezovsky, J., Mozga, T., Chaloupkova, R., Koude-
lakova, T., Jerabek, P., Stepankova, V., Natsume, R., van Leeuwen, J. G. E.,
et al. (2010). Enantioselectivity of haloalkane dehalogenases and its
modulation by surface loop engineering, Angew. Chem., Int. Ed.,
49, pp. 6111–6115.
136. Bühler, H., Effenberger, F., Förster, S., Roos, J., and Wajant, H. (2003).
Substrate specificity of mutants of the hydroxynitrile lyase from
Manihot esculenta, ChemBioChem, 4, pp. 211–
216.
137. Lauble, H., Miehlich, B., Förster, S., Kobler, C., Wajant, H., and
Effenberger, F. (2002). Structure determinants of substrate specificity
of hydroxynitrile lyase from Manihot esculenta, Protein Sci.,
11, pp. 65–71.
138. Koudelakova, T., Chaloupkova, R., Brezovsky, J., Prokop, Z., Sebestova,
E., Hesseler, M., Khabiri, M., Plevaka, M., Kulik, D., Kuta Smatanova, I.,
et al. (2013). Engineering enzyme stability and resistance to an organic
cosolvent by modification of residues in the access tunnel, Angew.
Chem., Int. Ed., 52, pp. 1959–1963.
139. Biedermannová, L., Prokop, Z., Gora, A., Chovancová, E., Kovács, M.,
Damborsky, J., and Wade, R. C. (2012). A single mutation in a tunnel to
the active site changes the mechanism and kinetics of product release
in haloalkane dehalogenase LinB, J. Biol. Chem., 287, pp. 29062–29074.
140. Volbeda, A., Martin, L., Cavazza, C., Matho, M., Faber, B. W., Roseboom,
W., Albracht, S. P. J., Garcin, E., Rousset, M., and Fontecilla-Camps, J. C.
(2005). Structural differences between the ready and unready oxidized
states of [NiFe] hydrogenases, J. Biol. Inorg. Chem.,
10, pp. 239–249.
141. Liebgott, P.-P., Leroux, F., Burlat, B., Dementin, S., Baffert, C., Lautier, T.,
Fourmond, V., Ceccaldi, P., Cavazza, C., Meynial-Salles, I., et al. (2010).
Relating diffusion along the substrate tunnel and oxygen sensitivity in
hydrogenase, Nat. Chem. Biol., 6, pp. 63–70.
142. Kalko, S. G., Gelpí, J. L., Fita, I., and Orozco, M. (2001). Theoretical study
of the mechanisms of substrate recognition by catalase, J. Am. Chem.
Soc., 123, pp. 9665–9672.
143. Amara, P., Andreoletti, P., Jouve, H. M., and Field, M. J. (2001). Ligand
diffusion in the catalase from Proteus mirabilis: a molecular dynamics
study, Protein Sci., 10, pp. 1927–1935.
144. Winn, P. J., Lüdemann, S. K., Gauges, R., Lounnas, V., and Wade, R. C.
(2002). Comparison of the dynamics of substrate access channels in
three cytochrome P450s reveals different opening mechanisms and a
novel functional role for a buried arginine, Proc. Natl. Acad. Sci. U S A,
99, pp. 5361–5366.
145. Textor, L. C., Colussi, F., Silveira, R. L., Serpa, V., de Mello, B. L.,
Muniz, J. R. C., Squina, F. M., Pereira, N., Skaf, M. S., and Polikarpov,
I. (2013). Joint X-ray crystallographic and molecular dynamics study
of cellobiohydrolase I from Trichoderma harzianum: deciphering the
structural features of cellobiohydrolase catalytic activity, FEBS J., 280,
pp. 56–69.
146. Amaro, R. E., Myers, R. S., Davisson, V. J., and Luthey-Schulten, Z. A.
(2005). Structural elements in IGP synthase exclude water to optimize
ammonia transfer, Biophys. J., 89, pp. 475–487.
147. Badet, B., Vermoote, P., Haumont, P. Y., Lederer, F., and LeGoffic,
F. (1987). Glucosamine synthetase from Escherichia coli: purification,
properties, and glutamine-utilizing site location, Biochemistry,
26, pp. 1940–1948.
148. Couture, J.-F., Legrand, P., Cantin, L., Labrie, F., Luu-The, V., and
Breton, R. (2004). Loop relaxation, a mechanism that explains the
reduced specificity of rabbit 20alpha-hydroxysteroid dehydrogenase,
a member of the aldo-keto reductase superfamily, J. Mol. Biol., 339, pp.
89–102.
149. Van den Heuvel, R. H. H., Svergun, D. I., Petoukhov, M. V., Coda, A.,
Curti, B., Ravasio, S., Vanoni, M. A., and Mattevi, A. (2003). The active
conformation of glutamate synthase and its binding to ferredoxin, J.
Mol. Biol., 330, pp. 113–128.
150. Darnault, C., Volbeda, A., Kim, E. J., Legrand, P., Vernède, X., Lindahl, P. A.,
and Fontecilla-Camps, J. C. (2003). Ni-Zn-[Fe4-S4] and Ni-Ni-[Fe4-S4]
clusters in closed and open α subunits of acetyl-CoA synthase/carbon
monoxide dehydrogenase, Nat. Struct. Mol. Biol., 10, pp. 271–279.
151. Endrizzi, J. A., Kim, H., Anderson, P. M., and Baldwin, E. P. (2005).
Mechanisms of product feedback regulation and drug resistance
in cytidine triphosphate synthetases from the structure of a CTP-
inhibited complex, Biochemistry, 44, pp. 13491–13499.
152. Sazinsky, M. H., and Lippard, S. J. (2005). Product bound structures of
the soluble methane monooxygenase hydroxylase from Methylococcus
capsulatus (Bath): protein motion in the alpha-subunit, J. Am. Chem.
Soc., 127, pp. 5814–5825.
153. Oprea, T. I., Hummer, G., and Garcia, A. E. (1997). Identification of
a functional water channel in cytochrome P450 enzymes, Proc. Natl.
Acad. Sci. U S A, 94, pp. 2133–2138.
154. Niu, C., Xu, Y., Xu, Y., Luo, X., Duan, W., Silman, I., Sussman, J. L., Zhu, W.,
Chen, K., Shen, J., et al. (2005). Dynamic mechanism of E2020 binding
to acetylcholinesterase: a steered molecular dynamics simulation, J.
Phys. Chem. B, 109, pp. 23730–23738.
155. Lountos, G. T., Mitchell, K. H., Studts, J. M., Fox, B. G., and Orville,
A. M. (2005). Crystal structures and functional studies of T4moD,
the toluene 4-monooxygenase catalytic effector protein, Biochemistry
(Moscow), 44, pp. 7131–7142.
156. Zawaira, A., Coulson, L., Gallotta, M., Karimanzira, O., and Blackburn, J.
(2011). On the deduction and analysis of singlet and two-state gating-
models from the static structures of mammalian CYP450, J. Struct. Biol.,
173, pp. 282–293.
157. Xin, Y., Gadda, G., and Hamelberg, D. (2009). The cluster of hydrophobic
residues controls the entrance to the active site of choline oxidase,
Biochemistry (Moscow), 48, pp. 9599–9605.

www.ebook3000.com
March 23, 2016 12:54 PSP Book - 9in x 6in 12-Allan-Svendsen-c12

References 459

158. Vanoni, M. A., and Curti, B. (2005). Structure–function studies on the


iron-sulfur flavoenzyme glutamate synthase: an unexpectedly complex
self-regulated enzyme, Arch. Biochem. Biophys., 433, pp. 193–211.
159. Xu, L., Zhao, W., and Wang, X. (2010). Finding molecular dioxygen
tunnels in homoprotocatechuate 2,3-dioxygenase: implications for
different reactivity of identical subunits, Eur. Biophys. J. EBJ, 39, pp.
327–336.
160. Xu, Y., Colletier, J.-P., Weik, M., Jiang, H., Moult, J., Silman, I., and Sussman,
J. L. (2008). Flexibility of aromatic residues in the active-site gorge of
acetylcholinesterase: X-ray versus molecular dynamics, Biophys. J., 95,
pp. 2500–2511.
161. Wester, M. R., Johnson, E. F., Marques-Soares, C., Dansette, P. M.,
Mansuy, D., and Stout, C. D. (2003). Structure of a substrate complex
of mammalian cytochrome P450 2C5 at 2.3 A resolution: evidence
for multiple substrate binding modes, Biochemistry (Moscow), 42, pp.
6370–6379.
162. Fishelovitch, D., Shaik, S., Wolfson, H. J., and Nussinov, R. (2010). How
does the reductase help to regulate the catalytic cycle of cytochrome
P450 3A4 using the conserved water channel? J. Phys. Chem. B, 114,
pp. 5964–5970.
163. Sevostyanova, A., Belogurov, G. A., Mooney, R. A., Landick, R., and
Artsimovitch, I. (2011). The  subunit gate loop is required for RNA
polymerase modification by RfaH and NusG, Mol. Cell, 43, pp. 253–
262.
164. Biswal, B. K., Morisseau, C., Garen, G., Cherney, M. M., Garen, C., Niu, C.,
Hammock, B. D., and James, M. N. G. (2008). The molecular structure of
epoxide hydrolase B from Mycobacterium tuberculosis and its complex
with a urea-based inhibitor, J. Mol. Biol., 381, pp. 897–912.
165. Da Silva Giotto, M. T., Garratt, R. C., Oliva, G., Mascarenhas, Y. P., Giglio, J.
R., Cintra, A. C. O., de Azevedo, W. F., Arni, R. K., and Ward, R. J. (1998).
Crystallographic and spectroscopic characterization of a molecular
hinge: conformational changes in bothropstoxin I, a dimeric Lys49-
phospholipase A2 homologue, Proteins Struct. Funct. Bioinform., 30,
pp. 442–454.
166. Kaszuba, K., Róg, T., Danne, R., Canning, P., Fülöp, V., Juhász, T., Szeltner,
Z., St. Pierre, J.-F., Garcı́a-Horsman, A., Männistö, P. T., et al. (2012).
Molecular dynamics, crystallography and mutagenesis studies on the
substrate gating mechanism of prolyl oligopeptidase, Biochimie, 94, pp.
1398–1411.
March 23, 2016 12:54 PSP Book - 9in x 6in 12-Allan-Svendsen-c12

460 Role of Tunnels and Gates in Enzymatic Catalysis

167. Murray, L. J., Garcı́a-Serres, R., McCormick, M. S., Davydov, R., Naik,
S. G., Kim, S.-H., Hoffman, B. M., Huynh, B. H., and Lippard, S. J.
(2007). Dioxygen activation at non-heme diiron centers: oxidation
of a proximal residue in the I100W variant of toluene/o-xylene
monooxygenase hydroxylase, Biochemistry (Moscow), 46, pp. 14795–
14809.
168. Sciara, G., Kendrew, S. G., Miele, A. E., Marsh, N. G., Federici, L.,
Malatesta, F., Schimperna, G., Savino, C., and Vallone, B. (2003). The
structure of ActVA-Orf6, a novel type of monooxygenase involved in
actinorhodin biosynthesis, EMBO J., 22, pp. 205–215.
169. Rowlett, R. S. (2010). Structure and catalytic mechanism of the beta-
carbonic anhydrases, Biochim. Biophys. Acta, 1804, pp. 362–373.
170. Kohls, D., Sulea, T., Purisima, E. O., MacKenzie, R. E., and Vrielink,
A. (2000). The crystal structure of the formiminotransferase domain
of formiminotransferase-cyclodeaminase: implications for substrate
channeling in a bifunctional enzyme, Structure, 8, pp. 35–46.
171. Abe, I., and Morita, H. (2010). Structure and function of the chalcone
synthase superfamily of plant type III polyketide synthases, Nat. Prod.
Rep., 27, pp. 809–838.
172. Lario, P. I., Sampson, N., and Vrielink, A. (2003). Sub-atomic resolution
crystal structure of cholesterol oxidase: what atomic resolution
crystallography reveals about enzyme mechanism and the role of the
FAD cofactor in redox activity, J. Mol. Biol., 326, pp. 1635–1650.
173. Manjasetty, B. A., Powlowski, J., and Vrielink, A. (2003). Crystal
structure of a bifunctional aldolase-dehydrogenase: sequestering a
reactive and volatile intermediate, Proc. Natl. Acad. Sci. U S A, 100, pp.
6992–6997.
174. Tóth, K., Sedlák, E., Sprinzl, M., and Zoldák, G. (2008). Flexibility and
enzyme activity of NADH oxidase from thermus thermophilus in the
presence of monovalent cations of hofmeister series, Biochim. Biophys.
Acta, 1784, pp. 789–795.
175. Hiromoto, T., Fujiwara, S., Hosokawa, K., and Yamaguchi, H. (2006).
Crystal structure of 3-hydroxybenzoate hydroxylase from comamonas
testosteroni has a large tunnel for substrate and oxygen access to the
active site, J. Mol. Biol., 364, pp. 878–896.
176. Vrielink, A., and Ghisla, S. (2009). Cholesterol oxidase: biochemistry
and structural features, FEBS J., 276, pp. 6826–6843.
177. Xu, Q., Eguchi, T., Mathews, I. I., Rife, C. L., Chiu, H.-J., Farr, C.
L., Feuerhelm, J., Jaroszewski, L., Klock, H. E., Knuth, M. W., et al.

www.ebook3000.com
March 23, 2016 12:54 PSP Book - 9in x 6in 12-Allan-Svendsen-c12

References 461

(2010). Insights into substrate specificity of geranylgeranyl reductases


revealed by the structure of digeranylgeranylglycerophospholipid
reductase, an essential enzyme in the biosynthesis of archaeal
membrane lipids, J. Mol. Biol., 404, pp. 403–417.
178. Kohn, J. E., Afonine, P. V., Ruscio, J. Z., Adams, P. D., and Head-
Gordon, T. (2010). Evidence of functional protein dynamics from X-ray
crystallographic ensembles, PLOS Comput. Biol., 6, p. e1000911.
179. Rhee, S., Parris, K. D., Ahmed, S. A., Miles, E. W., and Davies, D. R.
(1996). Exchange of K+ or Cs+ for Na+ induces local and long-
range changes in the three-dimensional structure of the tryptophan
synthase alpha2beta2 complex, Biochemistry (Moscow), 35, pp. 4211–
4221.
180. Pawelek, P. D., Cheah, J., Coulombe, R., Macheroux, P., Ghisla, S., and
Vrielink, A. (2000). The structure of L-amino acid oxidase reveals the
substrate trajectory into an enantiomerically conserved active site,
EMBO J., 19, pp. 4204–4215.
181. Sazinsky, M. H., Bard, J., Di Donato, A., and Lippard, S. J. (2004).
Crystal structure of the toluene/o-xylene monooxygenase hydroxylase
from Pseudomonas stutzeri OX1. Insight into the substrate specificity,
substrate channeling, and active site tuning of multicomponent
monooxygenases, J. Biol. Chem., 279, pp. 30600–30610.
182. Boehr, D. D., Dyson, H. J., and Wright, P. E. (2006). An NMR perspective
on enzyme dynamics, Chem. Rev., 106, pp. 3055–3079.
183. Loria, J. P., Berlow, R. B., and Watt, E. D. (2008). Characterization of
enzyme motions by solution NMR relaxation dispersion, Acc. Chem.
Res., 41, pp. 214–221.
184. Rozovsky, S., Jogl, G., Tong, L., and McDermott, A. E. (2001). Solution-
state NMR investigations of triosephosphate isomerase active site loop
motion: ligand release in relation to active site loop dynamics, J. Mol.
Biol., 310, pp. 271–280.
185. Massi, F., Wang, C., and Palmer, A. G., 3rd. (2006). Solution NMR
and computer simulation studies of active site loop motion in
triosephosphate isomerase, Biochemistry (Moscow), 45, pp. 10787–
10794.
186. Kempf, J. G., Jung, J.-Y., Ragain, C., Sampson, N. S., and Loria, J. P. (2007).
Dynamic requirements for a functional protein hinge, J. Mol. Biol., 368,
pp. 131–149.
187. Katoh, E., Louis, J. M., Yamazaki, T., Gronenborn, A. M., Torchia, D. A., and
Ishima, R. A (2003). Solution NMR study of the binding kinetics and the
March 23, 2016 12:54 PSP Book - 9in x 6in 12-Allan-Svendsen-c12

462 Role of Tunnels and Gates in Enzymatic Catalysis

internal dynamics of an HIV-1 protease-substrate complex, Protein Sci.


Publ. Protein Soc., 12, pp. 1376–1385.
188. Boroujerdi, A. F. B., and Young, J. K. (2009). NMR-derived folate-bound
structure of dihydrofolate reductase 1 from the halophile haloferax
volcanii, Biopolymers, 91, pp. 140–144.
189. Copeland, R. A., Smith, P. A., and Chan, S. I. (1987). Cytochrome c
oxidase exhibits a rapid conformational change upon reduction of CuA:
a tryptophan fluorescence study, Biochemistry (Moscow), 26, pp. 7311–
7316.
190. Jesenská, A., Sýkora, J., Olzyñska, A., Brezovský, J., Zdráhal, Z.,
Damborský, J., and Hof, M. (2009). Nanosecond time-dependent stokes
shift at the tunnel mouth of haloalkane dehalogenases, J. Am. Chem.
Soc., 131, pp. 494–501.
191. Speight, L. C., Muthusamy, A. K., Goldberg, J. M., Warner, J. B., Wissner,
R. F., Willi, T. S., Woodman, B. F., Mehl, R. A., and Petersson, E. J. (2013).
Efficient synthesis and in vivo incorporation of acridon-2-ylalanine,
a fluorescent amino acid for lifetime and förster resonance energy
transfer/luminescence resonance energy transfer studies, J. Am. Chem.
Soc., 135, pp. 18806–18814.
192. Stolow, A., Bragg, A. E., and Neumark, D. M. (2004). Femtosecond time-
resolved photoelectron spectroscopy, Chem. Rev., 104, pp. 1719–1758.
193. Amaro, M., Brezovsky, J., Kováèová, S., Maier, L., Chaloupkova, R.,
Sýkora, J., Paruch, K., Damborsky, J., and Hof, M. (2013). Are time-
dependent fluorescence shifts at the tunnel mouth of haloalkane
dehalogenase enzymes dependent on the choice of the chromophore?
J. Phys. Chem. B, 117, pp. 7898–7906.
194. Chamberlain, C., and Hahn, K. M. (2000). Watching proteins in the wild:
fluorescence methods to study protein dynamics in living cells, Traffic,
1, pp. 755–762.
195. Chen, S., Fahmi, N. E., Bhattacharya, C., Wang, L., Jin, Y., Benkovic,
S. J., and Hecht, S. M. (2013). Fluorescent biphenyl derivatives
of phenylalanine suitable for protein modification, Biochemistry
(Moscow), 52, pp. 8580–8589.
196. Cai, D., Hoppe, A. D., Swanson, J. A., and Verhey, K. J. (2007). Kinesin-1
structural organization and conformational changes revealed by FRET
stoichiometry in live cells, J. Cell Biol., 176, pp. 51–63.
197. Dietrich, A., Buschmann, V., Müller, C., and Sauer, M. (2002). Fluo-
rescence resonance energy transfer (FRET) and competing processes
in donor–acceptor substituted DNA strands: a comparative study of

www.ebook3000.com
March 23, 2016 12:54 PSP Book - 9in x 6in 12-Allan-Svendsen-c12

References 463

ensemble and single-molecule data, Rev. Mol. Biotechnol., 82, pp. 211–
231.
198. Haustein, E., and Schwille, P. (2004). Single-molecule spectroscopic
methods, Curr. Opin. Struct. Biol., 14, pp. 531–540.
199. Schuler, B. (2013). Single-molecule FRET of protein structure and
dynamics: a primer, J. Nanobiotechnol., 11, p. S2.
200. Chang, C.-E. A., Trylska, J., Tozzini, V., and Andrew McCammon, J.
(2007). Binding pathways of ligands to HIV-1 protease: coarse-grained
and atomistic simulations, Chem. Biol. Drug Des., 69, pp. 5–13.
201. Arroyo-Mañez, P., Bikiel, D. E., Boechi, L., Capece, L., Di Lella, S., Estrin,
D. A., Martı́, M. A., Moreno, D. M., Nadra, A. D., and Petruk, A. A.
(2011). Protein dynamics and ligand migration interplay as studied by
computer simulation, Biochim. Biophys. Acta, 1814, pp. 1054–1064.
202. Huang, X., and Raushel, F. M. (2000). An engineered blockage within
the ammonia tunnel of carbamoyl phosphate synthetase prevents
the use of glutamine as a substrate but not ammonia, Biochemistry
(Moscow), 39, pp. 3240–3247.
203. Thoden, J. B., Huang, X., Raushel, F. M., and Holden, H. M. (2002).
Carbamoyl-phosphate synthetase. Creation of an escape route for
ammonia, J. Biol. Chem., 277, pp. 39722–39727.
204. Eijsink, V. G. H., Bjørk, A., Gåseidnes, S., Sirevåg, R., Synstad, B., Burg,
B. van den, and Vriend, G. (2004). Rational engineering of enzyme
stability, J. Biotechnol., 113, pp. 105–120.
205. Bommarius, A. S., and Paye, M. F. (2013). Stabilizing biocatalysts, Chem.
Soc. Rev., 42, pp. 6534–6565.
206. Woodley, J. M. (2013). Protein engineering of enzymes for process
applications, Curr. Opin. Chem. Biol., 17, pp. 310–316.
This page intentionally left blank

www.ebook3000.com
March 23, 2016 12:57 PSP Book - 9in x 6in 13-Allan-Svendsen-c13

Chapter 13

Molecular Descriptors for the Structural Analysis of Enzyme Active Sites

Valerio Ferrario,a Lydia Siragusa,b Cynthia Ebert,a Gabriele Cruciani,b and Lucia Gardossi a
a Department of Scienze Chimiche e Farmaceutiche, Università degli Studi di Trieste, P.le Europa 1, Trieste (TS), 34127, Italy
b Laboratory for Chemometrics and Molecular Modeling, Dipartimento di Chimica, Biologia e Biotecnologie, Università degli Studi di Perugia, Via Elce di Sotto 10, 06123 Perugia (PG), Italy
gardossi@units.it

13.1 Introduction: Molecular Descriptors for Investigation of Enzyme Catalysis

Enzymes are increasingly used to perform a range of chemical reactions. These catalysts from nature are sustainable, selective, and efficient and offer a variety of benefits such as environmentally friendly manufacturing processes, reduced use of solvents, lower energy requirement, high atom efficiency, and reduced cost. However, natural biocatalysts are often not optimally suited for industrial applications. To boost the use of enzymes in industrial processes, it is important to expand the range of reactions catalyzed by
Understanding Enzymes: Function, Design, Engineering, and Analysis


Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

enzymes and to improve their properties for industrial applications. Traditionally, new enzymes for desired reactions were obtained by tedious and time-consuming screening of microbial cultures, often following enrichment and isolation of new cultures. Thanks to the genomics revolution, massive sequencing combined with the appropriate use of databases and efficient predictive bioinformatics tools has the potential to replace the current laborious screening approaches. The technological advances in the field offer an array of tools whose full potential has yet to be realized. Time-consuming, expensive, and investment-intensive screening in the laboratory is expected to be replaced by in silico screening using computer programs, ranking, design, and automated DNA synthesis, thus allowing a much shorter time from process idea to feasibility judgment, with considerable savings on research costs.
To fully exploit the enormous developments in life sciences, technologies and information must be used according to more effective and integrated strategies so that designing, developing, and applying new and better enzymes for industrial processes become a faster and more effective practice. The achievement of this goal is of crucial importance for the technological and economic competitiveness of industrial biotechnological processes. During the last 40 years, rigorous quantum mechanics (QM)-based computational methodologies have been developed and applied to investigate the physical-chemical features, thermodynamic parameters, and electrostatic contributions of enzyme active sites in order to fully understand the source of the catalytic power of enzymes. QM simulations are very expensive in terms of the computational power required because of the high level of theory at which the system is defined. Therefore, usually only the catalytic machinery, or a limited portion of the enzyme corresponding to the active site, is defined at the QM level, while the remaining part of the system is defined at the molecular mechanics (MM) level of theory [1]. While the oversimplification inherent in MM methods makes quantitative predictions unfeasible, QM-based methods are far too time consuming to be attractive as predictive tools and, above all, they often still provide unsatisfactory quantitative accuracy [2].


Recent advances in computer science have led to sophisticated and refined molecular descriptors able to describe quantitatively the features of target molecules and macromolecules [3]. A molecular descriptor is the final result of a logical and mathematical procedure that transforms chemical information encoded within a symbolic representation of a molecule into useful numbers. This makes it possible to analyze and compare, without bias, a series of different objects or molecules in the data set under investigation. When active sites of enzymes are investigated, it must be taken into account that each amino acid residue can establish multiple interactions and can also cause multiple short- and long-range perturbations. Consequently, informative molecular descriptors are required to extract the evidence necessary for a comprehensive analysis of active site properties and, possibly, to correlate them with enzyme functions. Different kinds of molecular descriptors are currently available, and they are classified according to the type of information they provide [4]. However, the following sections offer selected examples illustrating only the application of descriptors based on the computation of molecular interaction fields (MIFs) and, more specifically, GRID-derived molecular descriptors [5]. These were originally developed for drug design applications, where the final goal is to increase the energies of interaction between the target, which is usually a protein receptor, and a candidate drug. In the study of enzyme catalysis such descriptors can be exploited not only for investigating enzyme specificity but also for identifying and quantitatively evaluating the interactions that can stabilize the relevant transition state (TS) of a reaction of interest.

13.2 Molecular Descriptors Based on Molecular Interaction Fields

MIFs consist of the interaction energies computed between a target molecule and a small chemical probe (e.g., a chemical functional group). The GRID computational method [5] allows one to calculate interaction energies between any type of target molecule and a wide range of probes, as described in Table 13.1.

Table 13.1 Some examples of probes employed for GRID analysis and MIF computation

Probe    Description

Hydrophobic
C3       Aliphatic carbon
C1+      Aromatic carbon

H bond donor
OH       Phenol
O1       Alkyl OH
S1       Thiol
N1       Amidic nitrogen

H bond acceptor
O        Carbonyl oxygen
O::      Carboxyl oxygen
NO       Nitro oxygen

Halogens
BR       Bromine
CL       Chlorine
F        Fluorine

Energies are calculated at all the nodes of a 3D grid that spans the structure of the target considered (Fig. 13.1).
The computed interaction energies form an MIF, which can be visualized as an isopotential surface.
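The construction of an MIF can be illustrated with a minimal Python sketch. It does not reproduce the GRID energy function [5]: it assumes a toy 12-6 Lennard-Jones term plus a Coulomb term with a fixed dielectric constant, and the names `Atom`, `probe_energy`, and `compute_mif`, as well as all parameter values, are hypothetical choices made only for illustration.

```python
import itertools
import math
from typing import NamedTuple


class Atom(NamedTuple):
    x: float
    y: float
    z: float
    charge: float  # partial charge in e (illustrative value)


def probe_energy(node, atoms, probe_charge=-0.5, epsilon=0.2, r_min=3.5):
    """Toy probe-target energy at one grid node (NOT the GRID force field)."""
    energy = 0.0
    for atom in atoms:
        r = math.dist(node, (atom.x, atom.y, atom.z))
        r = max(r, 0.5)  # avoid a division by zero when a node sits on an atom
        ratio6 = (r_min / r) ** 6
        energy += epsilon * (ratio6 ** 2 - 2.0 * ratio6)  # steric term
        energy += 332.0 * probe_charge * atom.charge / (4.0 * r)  # Coulomb, dielectric 4
    return energy


def compute_mif(atoms, spacing=1.0, margin=4.0):
    """Return {node: energy} for a regular 3D grid enclosing the target."""
    axes = []
    for i in range(3):
        lo = min(a[i] for a in atoms) - margin
        hi = max(a[i] for a in atoms) + margin
        n_points = int(round((hi - lo) / spacing)) + 1
        axes.append([lo + k * spacing for k in range(n_points)])
    return {node: probe_energy(node, atoms) for node in itertools.product(*axes)}


# Two-atom toy "target": a positive and a negative centre 3 A apart.
target = [Atom(0.0, 0.0, 0.0, 0.4), Atom(3.0, 0.0, 0.0, -0.4)]
mif = compute_mif(target)
best_node = min(mif, key=mif.get)  # most favourable position for the probe
```

Thresholding the returned dictionary at a chosen energy level gives the set of nodes enclosed by the corresponding isopotential surface.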
It should be emphasized that 3D MIFs generally contain a large amount of data, some of which are redundant or not relevant for a given problem. Therefore, specific algorithms and statistical tools are required to extract relevant descriptors from the extensive data matrices of 3D MIFs.
Various probes are available for computing the energies of different types of interactions [5]. The most widely used take into account hydrophobicity (DRY probe), hydratability (WATER probe), H bond donor properties of the target (O probe, representing a carbonyl oxygen), and H bond acceptor properties of the target (N probe, representing an amide nitrogen). Finally, the global shape of the target molecule can also be computed by means of the H probe.
Figure 13.1 A schematic illustration of the GRID computational method: interaction energies are computed between a chemical group (PROBE) and the target at each GRID node spanning the structure (or a selected part of it) of the target molecule. The matrix of the interaction energies forms the so-called MIF. In this example the target is ampicillin and the figure illustrates the MIF calculated using the WATER probe. The MIF indicates the areas of the molecule that are prone to be hydrated.

The use of GRID descriptors relies upon an alignment step that is often time consuming and can introduce arbitrary choices, the resultant model being dependent upon and sensitive to the alignment procedure used. GRID molecular descriptors form the basis of a series of GRID-based molecular descriptors that were developed to extract and describe the relevant information from different target molecules more efficiently and to overcome the problems related to the alignment of molecules.
One advance in that respect is represented by GRid-INdependent descriptors (GRIND), which are alignment-independent descriptors computed by the software Almond [6]. The GRIND method transforms the information contained in the MIF into alignment-independent descriptors (correlograms) able to describe the different chemical groups in the molecule and their relative spatial positions. The calculation of GRIND is a two-step procedure. First, the hundreds of thousands of variables that constitute the original MIF are filtered to select the most relevant groups of nodes and to discard redundant variables. The chosen nodes must fulfil the requirements of having low energy values (corresponding to favorable interactions of the target with a given probe) and being as distant as possible from each other (Table 13.2).
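The filtering step can be sketched as a greedy selection, shown here in Python. This is an illustrative heuristic under stated assumptions, not the optimization implemented in Almond [6]: the function name `select_nodes`, the energy cutoff, and the weight that balances energy against distance are all hypothetical.

```python
import math


def select_nodes(mif, n_nodes=4, energy_cutoff=-0.5, weight=1.0):
    """Greedily pick favourable MIF nodes that are far from each other.

    mif maps (x, y, z) -> interaction energy. Hypothetical heuristic only.
    """
    candidates = [node for node, energy in mif.items() if energy <= energy_cutoff]
    if not candidates:
        return []
    # Start from the most favourable (lowest-energy) node.
    selected = [min(candidates, key=mif.get)]
    while len(selected) < min(n_nodes, len(candidates)):
        def score(node):
            # Reward low energy and a large distance to the closest selected node.
            nearest = min(math.dist(node, chosen) for chosen in selected)
            return -mif[node] + weight * nearest

        remaining = [node for node in candidates if node not in selected]
        selected.append(max(remaining, key=score))
    return selected


# Toy field: two neighbouring strong nodes, two distant weaker ones, one unfavourable.
toy_mif = {
    (0, 0, 0): -3.0,
    (1, 0, 0): -2.9,
    (5, 0, 0): -2.0,
    (0, 5, 0): -1.0,
    (9, 9, 9): 0.5,
}
picked = select_nodes(toy_mif, n_nodes=3)
```

In this toy field the node at (1, 0, 0) is skipped although it is the second most favourable, because it lies next to a node that was already chosen; this is the redundancy the filtering step is meant to remove.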
Table 13.2 Examples of interactions that can be studied using GRIND

PROBE 1   PROBE 2   Type of interaction

DRY       DRY       Hydrophobic
O         O         H bond acceptor
N1        N1        H bond donor
TIP       TIP       Molecular shape
DRY       O         Hydrophobic–H bond acceptor
DRY       N1        Hydrophobic–H bond donor
DRY       TIP       Hydrophobic–molecular shape
O         N1        H bond acceptor–H bond donor
O         TIP       H bond acceptor–shape
N1        TIP       H bond donor–shape

The second step is the so-called maximum auto- and cross-covariance (MACC) transformation. It is an autocorrelation procedure [7] in which the nodes selected in the first step are screened to identify pairs of nodes located at a defined distance from each other. The algorithm calculates the product of the energies of each pair of MIF nodes; each pair is represented by the vector connecting the two nodes. When more than one pair of nodes fulfils the distance requirement, only the vector with the maximum energy product is retained.
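The MACC transformation can be sketched as follows. The function `macc_correlogram`, the bin width, and the toy node energies are assumptions introduced for illustration; only the rule of keeping the maximum energy product within each distance range is taken from the description above.

```python
import itertools
import math


def macc_correlogram(nodes_a, nodes_b, bin_width=1.0, n_bins=10):
    """Maximum energy product per distance bin between two sets of MIF nodes.

    Each node is ((x, y, z), energy). Passing the same list twice gives an
    auto-correlogram; lists from two different probes give a cross-correlogram.
    """
    correlogram = [0.0] * n_bins
    if nodes_a is nodes_b:
        pairs = itertools.combinations(nodes_a, 2)
    else:
        pairs = itertools.product(nodes_a, nodes_b)
    for (pos_i, e_i), (pos_j, e_j) in pairs:
        k = int(math.dist(pos_i, pos_j) // bin_width)
        if k < n_bins:
            # Keep only the largest product found for this distance range.
            correlogram[k] = max(correlogram[k], e_i * e_j)
    return correlogram


# Hypothetical filtered nodes for two probes (coordinates in angstrom).
dry = [((0.0, 0.0, 0.0), -2.0), ((3.0, 0.0, 0.0), -1.0), ((0.0, 3.5, 0.0), -3.0)]
tip = [((0.0, 0.0, 1.0), -1.5)]

auto = macc_correlogram(dry, dry)    # DRY-DRY auto-correlogram
cross = macc_correlogram(dry, tip)   # DRY-TIP cross-correlogram

# Translating the molecule leaves the descriptor unchanged.
shifted = [((x + 10.0, y - 4.0, z + 2.0), e) for (x, y, z), e in dry]
```

Because only distances between nodes enter the calculation, the correlogram does not depend on where the molecule sits in space, which is why no alignment step is needed.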
The correlation can be performed between nodes belonging to the same MIF (generated by the same probe) or to different MIFs (generated by two different probes), resulting in auto- or cross-correlation, respectively (Fig. 13.2).

Figure 13.2 Grid-independent descriptors calculated by the ALMOND software. They correspond to vectors that, in the present case, determine autocorrelograms connecting points of the MIFs calculated with the DRY probe (top) and the TIP probe (bottom) (see Table 13.2). The crosses represent the points of the MIFs selected for the generation of the descriptors.

Therefore, at the end of this procedure each molecule of the data set is described by (i) a number of vectors, which link pairs of original MIF nodes, and (ii) their energy products. The descriptors can be plotted in correlogram profiles, where the energy product appears on the y axis (Fig. 13.3). By selecting a given point of the correlogram, it is possible to analyze the properties of the corresponding vector (energy and geometry) and the original MIF nodes.
Each correlogram constitutes a sort of fingerprint of the molecule and represents the molecule independently of its position in space. Ultimately, the correlograms form the input matrix for the multivariate analysis or the construction of the regression models.
These graphical diagrams [6] (Fig. 13.3) allow for an intuitive interpretation of the descriptors. Overall, the GRIND method, besides being an alignment-independent procedure, has the advantage of requiring only modest computational power for analyzing and comparing the structural properties of large sets of molecules [6]. While the GRIND method is suitable for studying geometric and steric features of molecules, the VolSurf method was originally developed for the prediction of pharmacokinetic properties of drugs, and it has demonstrated impressive performance in the prediction of drug solubility, membrane permeation, and intestinal absorption [8]. Being designed to describe interactions between molecules and biological membranes, VolSurf analysis is able to provide information on the contribution of physical-chemical phenomena such as the solvation and desolvation of molecules.
Figure 13.3 Auto- and cross-correlograms of the probes DRY and TIP. Each point corresponds to a vector.

The VolSurf method first calculates the MIFs and then converts this massive amount of information into simple quantitative molecular descriptors. However, the input for the calculation of the VolSurf descriptors can be of a different nature, for instance, various MM calculations or semi-empirical data (e.g., electrostatic potential). The resulting VolSurf molecular descriptors refer to properties such as molecular size, shape, extension of both hydrophilic and hydrophobic regions, and amphiphilic moments; Fig. 13.4 gives a schematic description of some of the most relevant ones.
Exhaustive information on the molecular descriptors used by the VolSurf method can be found in the original work of Cruciani et al. [6].
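As a rough illustration of how an MIF can be condensed into a few numbers, the sketch below computes two VolSurf-like quantities from a dictionary of node energies: the volume enclosed at several energy levels, and a moment vector pointing from a reference centre to the energy-weighted barycentre of the favourable region. The function names, the energy levels, and the exact definitions are assumptions made for this example and do not reproduce the VolSurf implementation.

```python
def volume_descriptors(mif, spacing, energy_levels=(-1.0, -2.0, -4.0)):
    """Volume of the region where the probe energy is at or below each level."""
    node_volume = spacing ** 3  # each grid node stands for one small cube
    return {
        level: node_volume * sum(1 for energy in mif.values() if energy <= level)
        for level in energy_levels
    }


def interaction_moment(mif, centre, level=-1.0):
    """Vector from `centre` to the energy-weighted barycentre of favourable nodes."""
    favourable = [(node, -energy) for node, energy in mif.items() if energy <= level]
    total_weight = sum(weight for _, weight in favourable)
    if total_weight == 0:
        return (0.0, 0.0, 0.0)
    return tuple(
        sum(node[i] * weight for node, weight in favourable) / total_weight - centre[i]
        for i in range(3)
    )


# Hypothetical one-dimensional row of grid nodes with WATER-like probe energies.
toy_mif = {
    (0.0, 0.0, 0.0): -5.0,
    (1.0, 0.0, 0.0): -2.5,
    (2.0, 0.0, 0.0): -1.5,
    (3.0, 0.0, 0.0): 0.3,
}
volumes = volume_descriptors(toy_mif, spacing=0.5)
moment = interaction_moment(toy_mif, centre=(0.0, 0.0, 0.0))
```

A long moment vector indicates that the favourable region is concentrated on one side of the molecule, whereas a short one indicates an even distribution around the chosen centre.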

13.3 Multivariate Statistical Analysis for Processing and Interpretation of Molecular Descriptors

Figure 13.4 The VolSurf descriptors are computed starting from the MIFs and they account for different physical-chemical and structural properties. The figure reports some representative VolSurf descriptors.

GRID-based descriptors are able to generate a huge number of variables from the analysis of each target molecule. When a comparison of different target molecules is the objective, adequate statistical tools are necessary for the interpretation and simplification of the information.
Multivariate statistical methods share the ability to simplify the analysis of complex systems by extracting the relevant information contained in extensive matrices of variables [3]. The information is represented in a space of reduced dimensionality, thus making its interpretation possible.
Multivariate statistical methods are founded on different mathematical bases. Principal component analysis (PCA) extracts relevant information from data tables: variables that are correlated with one another are combined into a principal component (PC), or latent variable, so that objects are projected onto a space of reduced dimensionality (i.e., a reduced number of independent variables corresponding to the new components) [9] (Fig. 13.5).
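A minimal sketch of how a first principal component can be extracted from a small descriptor matrix is given here. It uses power iteration on the covariance matrix in plain Python; the function name, the starting vector, and the toy data are assumptions, and practical work would normally rely on a numerical library.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (loadings, scores, explained variance fraction) of PC1."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    x = [[r[j] - means[j] for j in range(m)] for r in rows]  # column-centred data
    cov = [
        [sum(x[i][a] * x[i][b] for i in range(n)) / (n - 1) for b in range(m)]
        for a in range(m)
    ]
    # Power iteration: repeated multiplication converges to the leading
    # eigenvector, provided the start vector is not orthogonal to it.
    v = [1.0 / (j + 1) for j in range(m)]
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    scores = [sum(x[i][j] * v[j] for j in range(m)) for i in range(n)]
    eigenvalue = sum(s * s for s in scores) / (n - 1)
    explained = eigenvalue / sum(cov[j][j] for j in range(m))
    return v, scores, explained


# Hypothetical descriptor table: four objects, three variables.
# The second variable is exactly twice the first; the third is small noise.
descriptors = [
    [1.0, 2.0, 0.1],
    [2.0, 4.0, -0.1],
    [3.0, 6.0, 0.1],
    [4.0, 8.0, -0.1],
]
loadings, scores, explained = first_principal_component(descriptors)
```

The two correlated variables collapse into a single component that carries almost all of the variance, so each object can be described by one score instead of three descriptor values.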
Figure 13.5 Applicability of MIF-derived descriptors to the study of enzyme properties. The molecular descriptors can be computed by considering either the substrates or the enzymes as targets. Multivariate statistical analysis provides the tools for processing the massive amount of data and information coming from the GRID analysis, providing mathematical models usable for a simple interpretation and representation of enzyme properties.

When a certain response (Y variable) of the system must be modeled and optimized, the new variables or components are extracted to give the best fit of both the Y and X variable matrices. This is accomplished by applying partial least squares (PLS) analysis, where the Y latent variables are correlated to the X latent variables [10]. PLS analysis is convenient when the aim of the investigation is to correlate structural features with enzyme function or experimentally determined properties. In particular, quantitative structure–activity relationships (QSARs) use the structural properties of molecules (generally acquired by molecular simulation methods) as input for the PLS analysis, which eventually finds correlations between the molecular structure and experimentally measurable responses. It follows from the inherent nature of regression models that they are able to predict the effect of a variable (e.g., a specific structural feature of a molecule) only as long as such a variable is somehow represented in the training set. As a consequence, models built on a training set of structurally homogeneous molecules are expected to have excellent predictivity, but only with respect to molecules that fall within the structural features accounted for by the training set.
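The idea behind PLS can be sketched for a single response and a single latent variable, which corresponds to the first step of a NIPALS-type PLS1 algorithm. The function names and the data are hypothetical, and a real QSAR model would extract further components and validate them.

```python
import math


def pls1_fit(x_rows, y):
    """Fit a one-component PLS1 model; returns (x_mean, y_mean, weights, q)."""
    n, m = len(x_rows), len(x_rows[0])
    x_mean = [sum(r[j] for r in x_rows) / n for j in range(m)]
    y_mean = sum(y) / n
    xc = [[r[j] - x_mean[j] for j in range(m)] for r in x_rows]
    yc = [value - y_mean for value in y]
    # Weights: direction in X space with maximum covariance with y.
    w = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = math.sqrt(sum(c * c for c in w))
    w = [c / norm for c in w]
    # Scores of the objects on the latent variable, then regression of y on them.
    t = [sum(xc[i][j] * w[j] for j in range(m)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(s * s for s in t)
    return x_mean, y_mean, w, q


def pls1_predict(model, x_new):
    """Predict the response of a new object from its descriptors."""
    x_mean, y_mean, w, q = model
    t_new = sum((x_new[j] - x_mean[j]) * w[j] for j in range(len(w)))
    return y_mean + q * t_new


# Made-up training set: two collinear descriptors and a response y = 2 * x1 + 1.
x_train = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
y_train = [3.0, 5.0, 7.0, 9.0]
model = pls1_fit(x_train, y_train)
prediction = pls1_predict(model, [5.0, 10.0])
```

The new object is predicted well here only because it lies along the same structural trend as the training objects, which reflects the limitation on predictivity discussed above.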


The comparison of the different objects of a data set and the extraction of relevant information are made possible by analyzing the score and loading plots, which reveal how the target molecules are described by each PC and provide information on how the variables contribute to each PC [10].

13.4 GRIND Descriptors for the Study of Substrate Specificity

One example of the application of molecular descriptors combined with multivariate statistics comes from the study of the substrate specificity of an alkanesulfonate monooxygenase (SsuD) from Escherichia coli [11]. SsuD catalyzes the oxidation of a broad range of structurally different aliphatic sulfonates having the general structure R-CH2-SO3H, and the study aimed at understanding the structural basis of enzyme–substrate recognition and possibly at developing a predictive model of substrate specificity. For this purpose, a set of substrates accepted by SsuD was described by means of the GRIND method. The procedure involved docking each substrate inside the enzyme active site, followed by molecular dynamics (MD) simulations for system equilibration and for selecting the conformers having the lowest potential energy under the experimental conditions simulated by MD. Interestingly, the GRID method was also exploited for guiding the docking of the substrates into the active site of SsuD. Considering the nature of the natural substrates of SsuD, the GRID sulfone/sulfoxide probe (OS probe) was applied to identify the regions of the active site able to establish interactions with the probe. Then, visual analysis allowed the selection of those conformations of the docked substrate in which the sulfonic group was correctly placed in the areas corresponding to the OS MIF (Fig. 13.6).
The corresponding substrate specificity value of each substrate
(experimentally determined kcat /KM ) was used as input for the PLS
analysis that correlated the structural features of the substrates with
the kcat /KM values [11]. A 3D-QSAR model was finally constructed
where two PCs were sufficient to describe both steric and electronic
factors and, more importantly, their interactions. The mathematical
March 23, 2016 12:57 PSP Book - 9in x 6in 13-Allan-Svendsen-c13

476 Molecular Descriptors for the Structural Analysis of Enzyme Active Sites

Figure 13.6 GRID analysis of the active site of SsuD with the sul-
fone/sulfoxide probe (OS probe). The gray surfaces correspond to the MIFs
calculated using the OS probe and then indicate areas where the interactions
with the sulfonic group of substrates will be favored.

model explained 62% of the whole variance and had a predictive


coefficient q 2 (calculated by means of the leave-one-out [LOO]
cross-validation procedure) of 0.86 on the second PC. A q 2 of 0.4
generally indicates that the predictivity of the model is adequate
[10] (Table 13.3).
Interactions of factors appeared to affect significantly the ability
of SsuD to transform a substrate efficiently. It must be underlined
that the information coming from the PLS analysis goes beyond the
simple ability of the enzyme to recognize the substrate and includes
the factors that affect the capacity of the enzyme to reduce the
activation energy of the rate-determining step of the reaction.

Table 13.3 Comparison of the experimentally measured kcat/KM values of 10
substrates of SsuD and the kcat/KM values predicted by the 3D-QSAR model

Compound   Experimental kcat/KM (min−1 μM−1)   Predicted kcat/KM by LOO (min−1 μM−1)

1          6.7                                 5.7
2          6.1                                 6.1
3          6.0                                 4.4
4          4.6                                 3.6
5          3.2                                 3.4
6          2.7                                 3.4
7          1.8                                 2.1
8          1.1                                 1.1
9          0.6                                 1.0
10         0.4                                 1.8
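As a worked check, the cross-validated coefficient can be recomputed from the values listed in Table 13.3. The sketch below assumes the common definition q2 = 1 − PRESS/SS, where PRESS is the sum of squared LOO prediction errors and SS is the sum of squared deviations of the experimental values from their overall mean; the exact convention used in the original study is not stated here, so this is an assumption.

```python
# Values transcribed from Table 13.3 (kcat/KM in min^-1 uM^-1).
experimental = [6.7, 6.1, 6.0, 4.6, 3.2, 2.7, 1.8, 1.1, 0.6, 0.4]
predicted_loo = [5.7, 6.1, 4.4, 3.6, 3.4, 3.4, 2.1, 1.1, 1.0, 1.8]

def q_squared(observed, predicted):
    """Cross-validated q2 = 1 - PRESS / SS (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    # PRESS: sum of squared differences between observed and LOO-predicted values.
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # SS: total sum of squares of the observed values around their mean.
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_total

q2 = q_squared(experimental, predicted_loo)
print(round(q2, 2))  # prints 0.86
```

Under this definition the tabulated predictions give a q2 of about 0.86, in line with the value reported for the model.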

13.5 VolSurf Descriptors for the Modeling of Substrate Specificity

As previously mentioned, GRIND descriptors derive from vectors that also
provide geometric and steric information on each target molecule.
Consequently, a conformational analysis of the substrate inside the active
site is generally carried out to select the most representative
conformer for each object of the GRIND analysis.
However, a study of the substrate specificity of penicillin G amidase
(PGA) revealed that, in some cases, the physical-chemical properties
of the substrates (e.g., hydrophilic-hydrophobic balance, solvation and
desolvation phenomena) are the driving factors in determining
substrate specificity.
A set of 10 substrates of PGA was analyzed by two different
molecular descriptors, namely GRIND and VolSurf [8]. The experimentally
measured kcat/KM values concerning the hydrolysis of
amides and esters were used as the training set for the construction
of PLS models using either GRIND or VolSurf descriptors, which were
computed according to two different strategies. In the first case, an accurate
conformational analysis of each substrate within the active site of
the enzyme was performed. The molecules were docked into the
active site of PGA and then each enzyme–substrate complex was
equilibrated by performing MD simulations. The second strategy
was developed with the aim of verifying whether the predictivity
of the models relies strictly on the knowledge of the correct
conformation of each substrate upon its binding into the active
site. Therefore, conformations of the molecules were randomly
generated. The subsequent procedure for the calculation of the
3D-QSAR models was the same for both approaches.
Unexpectedly, the predictivity of the models calculated on the basis
of random conformations was comparable to the predictivity of the models
based on conformational analysis (Table 13.4). The models were
also validated by using substrates external to the training set.
The good results obtained with the VolSurf models are somewhat
surprising. VolSurf descriptors are appropriate for a detailed
description of physical-chemical properties, whereas the information
they carry on the steric and geometric features of the molecules
is very limited. Nevertheless, the predictivity is high both for the
random and for the conformation-based models. This unexpected
result can be interpreted in terms of the purpose for which the
VolSurf descriptors were originally designed, namely to account for
interactions between molecules and biological membranes and, more
generally, for physical-chemical phenomena [12]. On this basis, VolSurf
analysis should be able to provide information on the contribution to
kcat/KM of physical-chemical phenomena occurring in the biocatalyzed
system such as, for instance, solvation and desolvation of the substrates upon
leaving the bulk medium and entering the active site. Indeed, it is
widely recognized that the solvent can change enzyme selectivity
as a result of variations that are largely ascribable to differences in the
solvation/desolvation energies of the substrates [13].

Table 13.4 Quality of the 3D-QSAR models predicting the substrate specificity
of penicillin G amidase (PGA), calculated using VolSurf and GRIND descriptors,
respectively. The data indicate that the predictivity of the models is equally
high when the mathematical model is constructed without taking into account
the conformation adopted by the substrates inside the enzyme active site.
Therefore, in the case of PGA, the data suggest that substrate specificity is
driven more strongly by physical-chemical properties than by steric factors

                          VolSurf          GRIND
                          r2      q2       r2      q2
Conformational analysis   0.96    0.93     0.98    0.76
Random conformation       0.94    0.82     0.96    0.78

This observation suggests that, at least in the case of PGA,
it is feasible to create predictive models starting from a set of
substrate structures and the corresponding kcat/KM measurements,
without requiring the enzyme structure. This would
translate into an enormous reduction of the operational time (from a
few days to a few hours) and make the methods competitive with
experimental screening procedures. Of course, while this works
well in the case of PGA, the extension of this finding to different
biocatalytic systems should be verified.
It must be underlined that, as expected, the VolSurf approach
failed in predicting PGA enantioselectivity, since an accurate
conformational analysis and a geometric description of the enantiomers are
required [8]. Computational methods and descriptors applicable to
the prediction of enzyme enantiodiscrimination are described in the
following section.

13.6 Differential MIFs Descriptors for the Study of Enantioselectivity

Methods able to quantitatively predict enantioselectivity would
have a great practical and theoretical impact in the pharmaceutical
and fine-chemical industries. Enzyme enantiodiscrimination can
be defined as the ratio of the specificity constants of the two
enantiomers (Eq. 13.1).

$$\text{Enantioselectivity} = \frac{(k_{cat}/K_M)_R}{(k_{cat}/K_M)_S} \tag{13.1}$$

The two enantiomers react at different rates
according to their specificity constants kcat/KM.
Enantioselectivity can be evaluated quantitatively by measuring
experimentally the enantiomeric ratio E, which is given by Eq. 13.2.

$$E = \frac{\ln[(1 - c)(1 - ee(L))]}{\ln[(1 - c)(1 + ee(L))]} = \frac{\ln[1 - c\,(1 + ee(P))]}{\ln[1 - c\,(1 - ee(P))]} \tag{13.2}$$

where c is the degree of conversion, ee is the enantiomeric excess,
L is the unreacted substrate, and P is the product.
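A short numerical sketch may help in using Eq. 13.2. It assumes the commonly used form of the relation, in which the substrate expression is ln[(1 − c)(1 − ee(L))]/ln[(1 − c)(1 + ee(L))] and the product expression is ln[1 − c(1 + ee(P))]/ln[1 − c(1 − ee(P))], with c and ee expressed as fractions between 0 and 1. The function names and the ee values are hypothetical and serve only to show that both expressions return the same E when c is consistent with the two ee values.

```python
import math

def e_from_substrate(c, ee_l):
    """Enantiomeric ratio from conversion c and ee of the unreacted substrate L."""
    return math.log((1 - c) * (1 - ee_l)) / math.log((1 - c) * (1 + ee_l))

def e_from_product(c, ee_p):
    """Enantiomeric ratio from conversion c and ee of the product P."""
    return math.log(1 - c * (1 + ee_p)) / math.log(1 - c * (1 - ee_p))

# Hypothetical measurements (fractions, not percentages).
ee_l, ee_p = 0.6, 0.9
# For a kinetic resolution, mass balance links conversion to the two ee values.
c = ee_l / (ee_l + ee_p)

e_s = e_from_substrate(c, ee_l)
e_p = e_from_product(c, ee_p)
print(round(e_s), round(e_p))
```

With these illustrative numbers the conversion is 0.4 and both routes give an E of about 35.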
From the computational point of view, quantitative prediction of
E requires the calculation of ΔG‡ (activation free energy) for
the two enantiomers. This objective remains a formidable challenge
for computational methods, which cannot process such complex
molecular systems without severe approximations or extensive
computational power. Moreover, the outcome of these calculations
carries a significant margin of error [14].
To overcome the problem of ΔG‡ calculation, strategies based
on 3D-QSAR models have also been developed, where mathematical
models correlate the structure of enantiomers with experimentally
measured data [2].
In principle, a 3D-QSAR approach similar to that described above for
the study of substrate specificity can be applied. However,
the E value is an intrinsic property of a pair of molecules,
whereas a MIF calculated for each enantiomer separately does not
correspond to such a pairwise property. Therefore,
new GRID-based descriptors were developed for analyzing the
structural information related to pairs of enantiomers, namely the
differential molecular interaction fields (dMIFs). These descriptors
merge the information contained in the MIFs of both enantiomers and
were used for the construction of 3D-QSAR models able to predict
quantitatively the enantioselectivity of Candida antarctica lipase B
(CaLB) [2].
The construction of the 3D-QSAR model involved a protocol that
is divided into four principal stages: (a) definition of the training
set and refinement of the structures of both the enzyme and the
substrates, (b) docking and calculation of the active conformers of
the substrates inside the active site by MD simulations, (c) generation
of the molecular descriptors for the pairs of enantiomers by
means of GRID analysis and dMIF calculation, and (d) multivariate
statistical analysis of the data and generation of the mathematical
predictive model. Stages (a) and (b) were performed by means
of MM simulations, whereas for stages (c) and (d), GRID analysis
was employed in combination with chemometric tools. In particular,
the PLS method (projection to latent structures) was used for the
calculation of the regression model.
The calculation of dMIFs was performed by a matrix differential
procedure in which each variable of the MIF of the slow-reacting
enantiomer was subtracted from the corresponding
variable of the MIF of the fast-reacting enantiomer (i.e.,
dMIF = MIF_R − MIF_S). This procedure led to the quantitative evaluation
of the differences in interactions between the two enantiomers
and the enzyme active site. To take into account the most important
noncovalent interactions, the WATER and DRY probes were used for
computing the MIFs. The WATER probe describes and quantifies
dipolar interactions and hydrogen bond formation, whereas the DRY
probe considers all hydrophobic interactions, so that together they provide an
adequate description of the most relevant interactions taking place
between the substrates and the active site. The model was tested
against a validation set of pairs of enantiomers, demonstrating a
predictivity coefficient q2 of 0.78.
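The differential procedure can be sketched in a few lines. The example assumes that each MIF is stored as a flat list of interaction energies computed on the same grid for both enantiomers, and that the dMIFs of the two probes are concatenated into one descriptor row per pair of enantiomers; the energies, the four-node grid, and the function name are hypothetical.

```python
def differential_mif(mif_fast, mif_slow):
    """Node-by-node difference: fast-reacting minus slow-reacting enantiomer."""
    if len(mif_fast) != len(mif_slow):
        raise ValueError("both MIFs must be computed on the same grid")
    return [fast - slow for fast, slow in zip(mif_fast, mif_slow)]

# Hypothetical interaction energies (kcal/mol) at four grid nodes per probe.
mifs_fast = {"WATER": [-2.0, -0.5, 0.0, -1.5], "DRY": [-0.5, -1.0, 0.0, -0.25]}
mifs_slow = {"WATER": [-1.0, -0.5, -0.25, -3.0], "DRY": [-0.5, -0.25, 0.0, -1.0]}

# One descriptor row per enantiomer pair: the dMIFs of all probes, concatenated.
descriptor_row = []
for probe in ("WATER", "DRY"):
    descriptor_row.extend(differential_mif(mifs_fast[probe], mifs_slow[probe]))

print(descriptor_row)  # [-1.0, 0.0, 0.25, 1.5, 0.0, -0.75, 0.0, 0.75]
```

Nodes where both enantiomers interact identically give zero and therefore carry no information on enantiodiscrimination, which is the point of working with differences rather than with the individual MIFs.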
Concerning the timescale of the whole computational procedure,
a complete PLS model based on a data set of about 20 compounds
can be developed in about one week using standard low-end
computational facilities; once the model is available, screening of
substrates requires approximately one hour per molecule. However,
these times can be substantially reduced by increasing the computational
power, since the conformational analysis represents the most
time-consuming step of the protocol.

13.7 Hybrid MIFs Descriptors for the Computation of Entropic Contribution to Enantiodiscrimination

The identification of enzyme structural features that affect
enantiodiscrimination is of fundamental importance for designing
mutagenesis strategies aiming at improving the enantioselectivity of
biocatalysts.
As discussed above, quantitative prediction of E requires the
calculation of ΔG‡ for the two enantiomers, since enantioselectivity
depends on the difference in free energy of activation
Δ(S−R)G‡ of the reactions of the two enantiomers according to
Eq. 13.3:

$$\Delta_{S-R}G^{\ddagger} = \Delta_{S-R}H^{\ddagger} - T\,\Delta_{S-R}S^{\ddagger} \tag{13.3}$$

where ΔH‡ is the enthalpy of activation, T represents the
temperature, and ΔS‡ is the entropy of activation [15]. The
studies of Karl Hult demonstrated that the entropic contribution
to enantiodiscrimination is quite relevant in different reactions
catalyzed by CaLB [16, 17]. However, the evaluation of the entropic
contribution to enzymatic catalysis is a complex task.
Some studies support the idea that the binding of a
substrate in the enzyme active site freezes the motion of the
reacting fragments and minimizes their entropic contributions to
the energy of activation, but Villa et al. [18] demonstrated
that the reacting moieties of substrates maintain some mobility
even after the formation of the enzyme–substrate complex. Although
the inclusion of entropy in protein modeling remains a challenge,
some very rigorous approaches have been published [18, 19], which
are based on the comparison of the enzymatic reaction with the same
reaction in solution without any catalyst.
To develop quantitative computational models able to account
for the entropic contribution to enzyme enantioselectivity, new
GRID-based molecular descriptors have been conceived and validated
using experimental data published by the group of Karl Hult [20].
The study addressed the enantioselectivity of the W104A mutant of
CaLB, which is endowed with an enlarged stereoselectivity pocket.
The new molecular descriptors were named differential hybrid
molecular interaction fields (dH-MIFs) and are based on the concept
that entropy is correlated with the conformational freedom of the
tetrahedral intermediate (TI) of the reaction catalyzed by the lipase
(the TI closest to the TS was used as an approximation of the TS). Because of the
increased conformational freedom of substrates, the entropic
contribution to enantiodiscrimination is particularly relevant in the kinetic
resolution of alcohols catalyzed by this enzyme. By combining MD
simulations and GRID analysis, the new dH-MIF descriptors proved
to be able to extract both enthalpic and entropic information from
models of the tetrahedral intermediates of enantiomers.
The protocol for the computation of the dH-MIFs can be
schematized as follows:

(a) Construction of the TI for each enantiomer
(b) Generation of an array of conformers for the TI
(c) Selection of the 10 most representative conformers
(d) Calculation of the MIFs for each conformer by means of the GRID
method
(e) Generation of the H-MIF descriptor for each enantiomer by
calculating the average of the MIFs of the 10 selected conformers
(f) Computation of the dH-MIFs for each pair of enantiomers

The H-MIF can be considered as an average MIF of each
enantiomer, and it can be calculated according to Eq. 13.4:

$$(\text{H-MIF})_{xyz} = \frac{\sum_{i=1}^{n} (\text{MIF}_i)_{xyz}}{n} \tag{13.4}$$

where n is the number of conformers for each enantiomer (10 in this
case) and MIF_i is the MIF computed for the i-th conformer of the
enantiomer at each point xyz in 3D space.
As already pointed out (Section 13.6), each H-MIF contains the
thermodynamic contributions of only one enantiomer, and
this information cannot be correlated directly with the experimentally
measured E value. Therefore, the concept of differential MIFs must
be applied: the dH-MIFs are obtained by taking, value by value, the
difference between the H-MIF matrix of the R enantiomer and that of
the S enantiomer [2].
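Steps (e) and (f) of the protocol can be sketched as follows. The example assumes that every conformer MIF is a flat list of energies on a common grid, uses four conformers and three grid nodes instead of the 10 conformers of the study, and takes the difference as R minus S by analogy with the dMIF of Section 13.6; the energies and function names are hypothetical.

```python
def hybrid_mif(conformer_mifs):
    """Eq. 13.4: average the MIFs of the n conformers node by node."""
    n = len(conformer_mifs)
    return [sum(node_values) / n for node_values in zip(*conformer_mifs)]

def differential_hybrid_mif(h_mif_r, h_mif_s):
    """dH-MIF as the node-by-node difference (assumed here as R minus S)."""
    return [r - s for r, s in zip(h_mif_r, h_mif_s)]

# Hypothetical energies (kcal/mol): 4 conformers x 3 grid nodes per enantiomer.
conformers_r = [[-2.0, 0.0, -1.0], [-1.0, -0.5, -1.0],
                [-3.0, -1.0, -1.0], [-2.0, -0.5, -1.0]]
conformers_s = [[-1.0, 0.0, -2.0], [-1.0, 0.0, -1.0],
                [-1.0, -1.0, -2.0], [-1.0, -1.0, -1.0]]

h_mif_r = hybrid_mif(conformers_r)
h_mif_s = hybrid_mif(conformers_s)
dh_mif = differential_hybrid_mif(h_mif_r, h_mif_s)

print(h_mif_r)  # [-2.0, -0.5, -1.0]
print(h_mif_s)  # [-1.0, -0.5, -1.5]
print(dh_mif)   # [-1.0, 0.0, 0.5]
```

Because each H-MIF is averaged over an ensemble of conformers, it reflects how the interaction at a given node changes as the intermediate moves, which is the route by which conformational freedom enters the descriptor.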
The dH-MIFs were computed by applying the DRY probe
(accounting for hydrophobic interactions) and the WATER probe (which
evaluates hydrophilic interactions, H-bond properties, and
hydratability).
PLS analysis was used to correlate the descriptors to the
experimental data available from the literature, and the good fitting
and predictivity of the mathematical models demonstrate that the
new molecular descriptors can be used effectively for correlating the
structure of substrates and CaLB enantioselectivity [21, 22].
The contributions of differential activation enthalpy and entropy
to enantioselectivity can be evaluated from the dependence
of E on temperature, namely by carrying out each resolution
reaction at different temperatures. In this way, the differences in
activation enthalpy and activation entropy between the enantiomers
of each pair were calculated.
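The temperature analysis can be illustrated with a small sketch. It assumes the transition-state relation RT ln E = Δ(S−R)G‡ (with E defined as in Eq. 13.1), so that, combined with Eq. 13.3, ln E = Δ(S−R)H‡/(RT) − Δ(S−R)S‡/R and a straight-line fit of ln E against 1/T yields both terms. The E values below are synthetic, generated from assumed enthalpy and entropy differences simply to show that the fit recovers them; they are not experimental data.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of y versus x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def activation_differences(temps_k, e_values):
    """Return (ddH, ddS) from ln E = ddH/(R*T) - ddS/R (assumed relation)."""
    slope, intercept = fit_line([1.0 / t for t in temps_k],
                                [math.log(e) for e in e_values])
    return slope * R, -intercept * R

# Synthetic example: assumed differences used to generate E at four temperatures.
ddh_true = 20000.0  # J mol^-1 (hypothetical)
dds_true = 40.0     # J mol^-1 K^-1 (hypothetical)
temps = [283.15, 298.15, 313.15, 328.15]
e_values = [math.exp((ddh_true - t * dds_true) / (R * t)) for t in temps]

ddh_fit, dds_fit = activation_differences(temps, e_values)
```

In this synthetic case E decreases as the temperature rises, because the entropic term opposes the enthalpic one; the fitted values reproduce the assumed 20 kJ/mol and 40 J/(mol K).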
The resulting QSAR models (one for each temperature applied for
the determination of E) allow for the identification of the variables
responsible for the differential free energy of activation, which can
be visualized within the active site of the CaLB mutant (Fig. 13.7).

Figure 13.7 Schematic representation of the active site of CaLB. (A)
The pink circle delimits the stereospecificity pocket; the yellow shadow
outlines the oxyanion hole. These are the regions where the selected
variables were concentrated. The catalytic His is highlighted in blue at
the bottom of the active site. The tetrahedral intermediate of the reaction
is depicted in purple. (B) Details of the variables selected by the PLS
analysis that correlated dH-MIFs and the entropy values. Variables located
next to the oxyanion hole and at the entrance and at the bottom of
the stereospecificity pocket indicate those interactions that might limit
the conformational freedom of the substrate. Globally, these interactions,
by causing steric hindrance, can translate into differences in the entropy
contributions of the two enantiomers.

It is interesting to underline that, for all the models, the selected
variables are concentrated in those areas of the enzyme active
site that are most important for enantiodiscrimination, namely
the catalytic His, the oxyanion hole, and the stereospecificity
pocket (Fig. 13.7). Indeed, in the case of CaLB, the amino acids
able to determine a variation in the entropic contribution to
the enantiodiscrimination of the enzyme are mainly localized in the
tunnel that leads to the stereospecificity pocket [23].
Therefore, the localization of the variables allows the identification of
mutation hotspots that will affect stereoselectivity by changing the
entropic contribution and ultimately E.

13.8 Analysis of Enzyme Active Sites for Rational Enzyme Engineering

All the examples reported above concern the analysis of
the properties of a single enzyme toward different substrates. In
principle, similar procedures can be employed to compare different
enzyme structures, considering the active sites of enzymes as the
targets of the GRID analysis (Fig. 13.8).

Figure 13.8 The GRID analysis can be performed on the enzyme structure,
for instance, to obtain MIFs that describe the physical-chemical nature of
the active site and the nature of the interactions that can be established
inside it.

This strategy has been applied for the rational design of mutants
of CaLB endowed with improved amidase activity [24].
In principle, the rational redesign of enzyme catalytic properties
should be driven by fundamental knowledge of the whole array
of structural, electronic, and functional factors that determine
the stabilization of the TS of the reaction of interest. Current
computational studies of enzyme activity, as measured by the
activation free energy, generally restrict their focus to the wild-type
(WT) enzyme and a limited number of mutants, which have been
described with a comparatively high [25, 26] or moderately high
[27] level of theory. However, the computational demand of these
methods makes it difficult to apply them to the actual design of new
enzymatic catalysts, where the activity of hundreds of mutants has to
be evaluated.
As an alternative approach, GRID-based descriptors, molecular
modeling, experimental studies, and multivariate statistical analysis
can be merged to develop 3D-QSAR models that not only screen virtual
mutants in silico but also provide information on hotspots for
mutagenesis.
This computationally undemanding strategy can be schematized
in the following four main steps:
1. Generation and selection of the mutants constituting the training
set
2. Modeling of the mutants
3. GRID analysis of each mutant and extraction of molecular
descriptors
4. Statistical analysis and generation of the 3D-QSAR model using
PLS analysis
More importantly, it has been demonstrated that all computational
steps can be integrated within a workflow, thus making the
whole procedure automatic and fast. The validated 3D-QSAR model
was also integrated in the workflow as a scoring function,
enabling the screening of new generations of virtual mutants.
The main concept at the basis of this hybrid strategy is that
the catalytic efficiency of enzymes depends on multiple
factors that are not independent but rather interact with one another,
thus resulting in a complex behavior. Consequently, it is mandatory
to identify and analyze not only the effect of each single variable
but also their interactions. To limit the computational cost, the
method does not aim at computing each contribution but rather at
finding an empirical equation suitable to correlate enzyme structure
with its experimentally evaluated catalytic property (e.g., the rate of
hydrolysis of a model amide).
As a general rule, 3D-QSAR models should be constructed by
using training sets of mutants of modest size (e.g., 15–30) but
expressing a wide variability in terms of enzymatic activity and
structural properties. Models trained on structurally homogeneous
objects are expected to have excellent predictivity but only within a
restricted range of structural properties, because regression models
are able to predict the effect of a variable X (i.e., a specific structural
feature of an enzyme/mutant) on amidase activity (Y variable) only as
long as such a variable is somehow represented in the training set.
As an example, in the study of the engineering of amidase
activity into the CaLB scaffold, 30 mutants and their experimentally
determined activities were used as the training set. The mutants carried
mutations at 17 residues, giving 19 monovariants and 11
multivariants, the latter resulting from combinations of the former [24].
When considering the ability of enzymes to catalyze a given
reaction, it is crucial to analyze not only the geometric and electronic
features of the residues located in close proximity to the
catalytic residues but also properties such as the ability to establish
H bonds or hydrophobic interactions. Therefore, the probes
used in the mapping of the active sites included not only the
WATER and DRY probes but also the O:: probe (which represents
an H bond acceptor carbonyl oxygen and therefore accounts
for interactions with H bond donors) and the N1 probe (which
represents an H bond donor group and describes interactions with
H bond acceptors). The extensive matrices of variables calculated
for each object (i.e., mutant) were filtered by applying suitable
statistical procedures (e.g., the D-Optimal preselection and FFD
variable selection algorithm) [28] before applying PLS analysis. The
predictivity (q2 of 0.79) and the validation of the model using
mutants external to the training set confirmed that the model is able
to discriminate between poor and good mutants, thus allowing an in
silico preselection of virtual candidate structures (Table 13.5).
However, the information coming from the PLS model goes
beyond the structure selection, because the selected variables can
be projected on the 3D structure of CaLB to visualize promising
hotspots (Fig. 13.9).
In Fig. 13.9 it appears evident that amidase activity is the
complex result of various variables and their interactions. This is
accounted for mathematically by the 3D-QSAR model, whereas any
interpretation of the effect of the selected variables cannot rely
on a simple visual inspection of the enzyme structure. Indeed, the
PLS analysis fully exploits fundamental knowledge while avoiding
conceptual biases and selecting variables that correlate with the
desired enzymatic activity.

Table 13.5 Experimentally determined activities of a set of
mutants external to the training set and the corresponding
activities predicted by the 3D-QSAR model. The amidase activities
are expressed as an improvement factor (IF) referred to CaLB
wild-type activity: IF = amidase activity of mutant/amidase
activity of CaLB wild type

Mutation   Predicted activity (IF)   Experimental activity (IF)

I189N      0.82                      0.95
D223N      2.08                      1.30
T42S       1.47                      1.21
G39S       0.71                      0.55
A225L      0.95                      1.19
A225F      0.99                      0.75
A225G      1.13                      1.26
CaLB WT    -                         1
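The external validation in Table 13.5 can be summarized numerically. The sketch below assumes, purely for illustration, that a mutant counts as "good" when its improvement factor exceeds 1 (i.e., it is more active than the wild type); this threshold is not stated in the study, and the variable names are hypothetical.

```python
# (predicted IF, experimental IF) transcribed from Table 13.5.
validation = {
    "I189N": (0.82, 0.95),
    "D223N": (2.08, 1.30),
    "T42S": (1.47, 1.21),
    "G39S": (0.71, 0.55),
    "A225L": (0.95, 1.19),
    "A225F": (0.99, 0.75),
    "A225G": (1.13, 1.26),
}

def is_improved(improvement_factor, threshold=1.0):
    """Assumed criterion: better than wild type when IF exceeds the threshold."""
    return improvement_factor > threshold

# Mutants for which prediction and experiment fall on the same side of IF = 1.
agree = [m for m, (pred, exp) in validation.items()
         if is_improved(pred) == is_improved(exp)]
disagree = [m for m in validation if m not in agree]

print(len(agree), disagree)  # 6 ['A225L']
```

Under this assumed criterion the model classifies six of the seven external mutants on the correct side of the wild-type activity, with A225L as the only mismatch.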

Figure 13.9 An example of relevant variables selected by the statistical
analysis and projected on the 3D structure of the wild-type lipase. O::
variables (on the left) and N1 variables (on the right) are colored as
a function of statistical loadings: green for negative loadings and red
for positive loadings. Loadings provide information on how the variables
contribute to each principal component. A positive loading indicates that the
variable is directly correlated with kcat/KM. Hotspot residues are highlighted
in stick mode.

Because the credibility of any in silico engineering strategy at
the industrial scale depends strongly on the timescale of its
application, computational methods should ideally require moderate
computational power and work within a high-throughput frame. In
that respect, the automation of different procedures or software can
be implemented by using multi-objective optimization software [29]
that creates workflows integrating the following steps:

(1) Selection of mutations and generation of the virtual mutants
(2) Equilibration of each mutant by minimization and an MD
simulation procedure
(3) Superimposition of each mutant to the WT enzyme structure
(4) Automatic calculation of the molecular descriptors
(5) Scoring of each virtually generated mutant by applying the
3D-QSAR model to the matrix of the descriptors

In conclusion, the multi-objective optimization software
randomly generates a first generation of mutants (the number of
mutants in each generation can be set by the user), which
is then scored. The scoring results of the first generation are
exploited for the calculation of the next 20 mutants using any
suitable algorithm [30]. The automatic workflow generates and
scores each virtual mutant in 2 h on a normal workstation and,
in principle, the multi-objective optimization software can compute
generations of mutants until the established convergence criteria
are met.
Table 13.6 provides an idea of the
computational resources required and the timescale of the overall
computational process for the optimization of mutant libraries by
combining modeling and statistical analysis.
Overall, it must be underlined once again, that the regression
model can predict a property as long as it has been trained with
the information relevant for describing such property. Accordingly,
highly active mutants can be predicted and selected only by starting
from a training set of mutants with a broad range of activity
values. Therefore, the methodology appears particularly suited for
optimizing and tuning engineering strategies once an appropriate
library of mutants is made available.

Table 13.6 Integration of computational procedures and steps within an automatic workflow for in silico
evolution and screening of virtual mutants: computational resources required for each step (example referred
to about 30–40 mutants)

Computational step                   Timescale                        Software                  Open source?

Generation of the mutant             1 hour on a 4 core processor     PyMOL or SCAP             Yes
structure
Mutant structure equilibration       1 week on a 4 core processor     GROMACS                   Yes
by MD simulations
Structure superimposition            1 hour on a 4 core processor     PyMOL                     Yes
Molecular descriptor                 5 hours on a 4 core processor    GRID                      No (GRID is commercial software)
calculation by GRID mapping
Construction of the 3D-QSAR model    10 days on a 4 core processor    GOLPE                     No (GOLPE is commercial software);
correlating mutant structure                                                                    open source alternative: R
and activity, and variable                                                                      (http://www.r-project.org/)
analysis
Automation of the procedures         10 days on a 4 core processor    modeFRONTIER              No (modeFRONTIER is commercial
and integration of the                                                                          software); open source
different software                                                                              alternative: KNIME
                                                                                                (http://www.knime.org/)
In silico design and screening       6 to 8 mutants can be            modeFRONTIER and all      As above
                                     evaluated within a day on a      the integrated software
                                     4 core processor

13.9 BioGPS Descriptors for in silico Rational Design and Screening of Enzymes

The rational redesign of the active site of an enzyme requires
effective computational strategies able to correlate the structure of
enzymes with their ability to stabilize the TS of a given reaction.
Computational approaches are expected not only to explore the
structural complexity of enzymes but also to disclose factors
that, by exerting their effect jointly, produce an optimized,
preorganized reaction environment [31]. As shown in the previous
sections, molecular descriptors based on MIFs allow for the
analysis of key structural and physical-chemical properties that
are at the basis of the catalytic power of an enzyme. A novel
type of GRID-based molecular descriptor named BioGPS (global
positioning system in biological space) has recently been applied
to develop the first example of a bioinformatic methodology that
relies entirely on the 3D structure analysis of enzymes rather
than on simple sequence or structure alignment [32]. The BioGPS
tool brings two major advances: a function-based classification
of enzymes or mutants and an effective rational guide
for enzyme engineering. By combining BioGPS descriptors with
multivariate statistical analysis, the method allows for focused
studies of structure–function correlations and identifies
physical-chemical and electrostatic factors that affect the catalytic activity of
enzymes.
From a methodological point of view, the BioGPS algorithm
[33, 34] first identifies the active site of each enzyme structure
and then analyzes the active site properties through a GRID mapping
procedure (see Section 13.2). The output depends on the nature of
the selected GRID probes and consists of electron density-like
fields centered on each active site atom (pseudo-MIFs).
Subsequently, the algorithm reduces the complexity of the
pseudo-MIFs by selecting a number of representative points using a
weighted energy-based and space coverage function. These selected
points are used to generate all possible geometrical combinations
of four points, termed “quadruplets,” which are mathematically
associated with bitstrings or biofingerprints [32]. The information
contained in the quadruplets can then be compared within a common
reference framework by searching for similar quadruplets with an
“all against all” approach. The procedure involves the alignment of
the corresponding quadruplets and avoids the problem of arbitrary
superimposition of the protein structures. The output is a series
of similarity scores (one for each original GRID probe) together
with a global score.
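The quadruplet comparison described above can be illustrated with a minimal Python sketch. It is not the BioGPS or FLAP implementation: the encoding of a quadruplet by its six binned pairwise distances, the bin width of 1.0, the use of a set of keys in place of a true bitstring, and the Tanimoto comparison are all assumptions made for illustration, and the function names and coordinates are hypothetical. The sketch is only meant to show why such an encoding needs no superimposition of the structures: the keys depend on internal distances alone.

```python
from itertools import combinations
import math


def quadruplet_key(points, bin_width=1.0):
    """Encode four 3D points by their six binned, sorted pairwise distances."""
    distances = sorted(math.dist(a, b) for a, b in combinations(points, 2))
    return tuple(int(d // bin_width) for d in distances)


def fingerprint(points, bin_width=1.0):
    """Collect the keys of all quadruplets that can be built from the points.

    The set of keys stands in for the bitstring: each distinct key would
    correspond to one bit switched on.
    """
    return {quadruplet_key(quad, bin_width) for quad in combinations(points, 4)}


def tanimoto(fp_a, fp_b):
    """Shared keys divided by all distinct keys (0 = disjoint, 1 = identical)."""
    union = fp_a | fp_b
    return len(fp_a & fp_b) / len(union) if union else 0.0


# Hypothetical representative points of one active site (arbitrary units).
site = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (0, 0, 5), (2, 2, 2)]
# The same points after a translation: internal distances are unchanged.
shifted = [(x + 10, y - 3, z + 7) for x, y, z in site]
# A differently shaped point set.
other = [(0, 0, 0), (9, 0, 0), (0, 9, 0), (0, 0, 9), (9, 9, 9)]

print(tanimoto(fingerprint(site), fingerprint(shifted)))
print(tanimoto(fingerprint(site), fingerprint(other)))
```

Because only internal distances enter the keys, the translated copy scores as identical to the original point set, whereas the differently shaped point set shares no keys with it.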

The efficiency of this new bioinformatic strategy was demonstrated
by considering the same engineering objective also described in
Section 13.8, namely the introduction of amidase activity into the
scaffold of CaLB. The problem was approached initially by comparing
the properties of a series of 42 serine hydrolases. As shown in
Table 13.7, they are classified into four different classes
(lipases, amidases, proteases, esterases) on the basis of their
ability to catalyze the hydrolysis of ester or amide bonds in
different natural substrates. In principle, revealing the factors
that make such structurally similar enzymes able to catalyze the
hydrolysis of different functional groups is of major importance
for guiding the rational design of a promiscuous active site
endowed with amidase activity.
For that purpose, the active sites of the enzymes in all four classes
were scanned using the BioGPS descriptors and then compared using
the computed biofingerprints [32]. Finally, the algorithm generated
the similarity scores that reflect differences related to each original
GRID probe and the global score that provides the comprehensive
picture of differences/similarities in terms of all the properties
addressed in the GRID analysis.
The BioGPS scores were used as input for unsupervised pattern
cognition analysis (UPCA; a statistical procedure similar to the
PCA described in Section 13.3), which allowed for the unbiased
identification and quantification of differences among hydrolase
enzymes and, consequently, for their grouping into clusters in 2D
space, corresponding to the first two PCs shown in Fig. 13.10.
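As a rough illustration of how a table of similarity scores can be reduced to a 2D map, the following Python sketch applies ordinary PCA (through a singular value decomposition) to a small all-against-all score matrix. Plain PCA is used here only as a stand-in for UPCA, which the text describes as similar to PCA; the function name, the toy scores, and the choice of mean-centering without scaling are assumptions and do not come from Ref. [32].

```python
import numpy as np


def pca_scores(score_matrix, n_components=2):
    """Project the rows of a score matrix onto its first principal components."""
    x = np.asarray(score_matrix, dtype=float)
    x_centered = x - x.mean(axis=0)  # center each column
    # Rows of vt are the principal axes; s holds the singular values.
    _, s, vt = np.linalg.svd(x_centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # fraction of variance per component
    return x_centered @ vt[:n_components].T, explained[:n_components]


# Invented similarity scores for six structures: rows 0-2 resemble each
# other, and rows 3-5 resemble each other.
toy_scores = np.array([
    [1.0, 0.9, 0.8, 0.1, 0.2, 0.1],
    [0.9, 1.0, 0.9, 0.2, 0.1, 0.2],
    [0.8, 0.9, 1.0, 0.1, 0.2, 0.1],
    [0.1, 0.2, 0.1, 1.0, 0.8, 0.9],
    [0.2, 0.1, 0.2, 0.8, 1.0, 0.9],
    [0.1, 0.2, 0.1, 0.9, 0.9, 1.0],
])

coords, explained = pca_scores(toy_scores)
for row, (pc1, pc2) in enumerate(coords):
    print(f"structure {row}: PC1 = {pc1:+.2f}, PC2 = {pc2:+.2f}")
```

In this toy case the two groups of rows end up on opposite sides of the first component, which is the kind of separation a score plot such as Fig. 13.10 displays for real enzyme classes.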
Figure 13.10 shows the clustering of the 42 enzymes on the basis
of the structural and physical-chemical similarities, as computed
by the BioGPS descriptors. More specifically, four different GRID
probes were employed for mapping the active sites: an H probe
taking into account the shape, an O probe that evaluates H bond
donor properties, an N1 probe that considers the H bond acceptor
capabilities, and a DRY probe accounting for hydrophobic
interactions. The magnitude of the interaction of the N1 and O
probes also implicitly includes information about the charge
contribution, since these probes already carry a partial positive
and negative charge, respectively. As a result of UPCA, the four
classes of serine hydrolases are clearly grouped into clusters,
hence confirming that the method is able to extract from the
chemical structures the factors responsible for the different
catalytic properties of hydrolase active sites.

Table 13.7 Serine hydrolases analyzed by means of BioGPS descriptors:
PDB code of crystal structures, enzyme classes, sources, and natural
substrates

Enzyme class  PDB code  Source                          Substrate
Lipases       1CRL      Candida rugosa                  Triacylglycerol
              1DTE      Humicola lanuginosa             Triacylglycerol
              1ETH      Sus scrofa                      Triacylglycerol
              1EX9      Pseudomonas aeruginosa          Triacylglycerol
              1GPL      Cavia porcellus                 Triacylglycerol
              1K8Q      Canis lupus familiaris          Triacylglycerol
              1LPB      Homo sapiens                    Triacylglycerol
              1TCA      Candida antarctica              Triacylglycerol
              2FX5      Pseudomonas mendocina           Triacylglycerol
              2NW6      Burkholderia cepacia            Triacylglycerol
              2W22      Geobacillus thermocatenulatus   Triacylglycerol
Esterases     1AUO      Pseudomonas fluorescens         Broad specificity
              1BS9      Penicillium purpurogenum        Xylan acetates
              1C7J      Bacillus subtilis               p-Nitrobenzyl esters
              1CLE      Candida cylindracea             Cholesterol esters
              1JU3      Rhodococcus sp.                 Cocaine
              1QOZ      Trichoderma reesei              Xylan acetates
              1USW      Aspergillus niger               Feruloyl-polysaccharide
              2ACE      Torpedo californica             Acetylcholine
              2H7C      Homo sapiens                    CoA, palmitate, and taurocholate
              2WFL      Rauvolfia serpentina            Polyneuridine aldehyde
              3KVN      Pseudomonas aeruginosa          Rhamnolipids
Proteases*    1GVK      Sus scrofa                      Ala-|-Xaa
              1NPM      Mus musculus                    Lys/Arg-|-Xaa
              1PPB      Homo sapiens                    Arg-|-Gly fibrinogen
              1QFM      Sus scrofa                      Pro-|-Xaa (∼30 aa)
              1TAW      Bos taurus                      Lys/Arg-|-Xaa
              1TM1      Bacillus amyloliquefaciens      Uncharged P1
              1YU6      Bacillus licheniformis          Uncharged P1
              2XE4      Leishmania major                Oligopeptides
              3F7O      Paecilomyces lilacinus          Peptides
Amidases      1AZW      Xanthomonas campestris          NH-Pro-|-Xaa
              1GM9      Escherichia coli                Penicillin
              1HL7      Microbacterium sp.              γ-lactam
              1M21      Stenotrophomonas maltophilia    C terminal amide
              1MPL      Streptomyces sp.                L-Lys-D-Ala-|-D-Ala
              1MU0      Thermoplasma acidophilum        NH-Pro-|-Xaa
              1QTR      Serratia marcescens             NH-Pro-|-Xaa
              3A2P      Arthrobacter sp.                6-Amino hexanoate dimer
              3K3W      Alcaligenes faecalis            Penicillin
              3K84      Rattus norvegicus               Fatty acid amide
              3NWO      Mycobacterium smegmatis         NH-Pro-|-Xaa

Figure 13.10 Unsupervised pattern cognition analysis of 42 serine
hydrolases on the basis of BioGPS descriptors (global score). The
enzymes are labeled according to their PDB code. Lipases are
indicated in blue, esterases in green, amidases in red, and
proteases in cyan. Improved mutants are reported as black triangles
(amidase activity 3- to 11-fold higher than wild-type CaLB),
whereas poor mutants are in pink. WT CaLB (1TCA) is indicated by
the blue arrow.

With respect to the objective of designing amidase activity inside
the scaffold of CaLB, the engineering operation can be rationally
guided by disclosing the details of the information contained in
the BioGPS descriptors computed for each GRID probe. For instance,
Fig. 13.11a illustrates the superimposition of pseudo-MIFs
calculated by using the O probe, driven by the alignment of the
corresponding quadruplets. The extension and position of the
pseudo-MIFs provide a visual description of the differences between
the two enzymes in terms of their ability to donate H bonds. This
information can be exploited as a roadmap for defining hotspots and
for the in silico generation of virtual mutants with increased
amidase activity.


Figure 13.11 (a) Superimposition of pseudo-MIFs calculated using the
O probe on 1GM9 (amidase, cyan surfaces) and 2W22 (lipase, blue
surfaces). Protein structures are represented in cartoon mode in
green and magenta, respectively. (b) Comparison of pseudo-MIFs
calculated using the DRY (hydrophobic) probe: 1GM9 pseudo-MIFs are
represented as green surfaces, whereas 2W22 pseudo-MIFs are in
orange. Protein structures are represented in cartoon mode: green
for 1GM9 and magenta for 2W22.

Finally, the computational tool allows for in silico screening of
virtual mutants: by analyzing each mutant by means of the
BioGPS-UPCA method, it is possible to observe which CaLB mutants
have moved from the lipase group toward the amidase cluster and
then to generate experimentally only such promising variants [32].
Indeed, Fig. 13.10 shows the projection of eight CaLB mutants whose
amidase activity had previously been determined experimentally
[32]. Interestingly, only variants displaying an increased amidase
activity (3- to 11-fold as compared to the WT CaLB) fall inside the
amidase cluster (black triangles in Fig. 13.10). On the contrary,
mutants with amidase activity comparable to or lower than that of
the WT CaLB remain inside the lipase area.
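The screening step can be pictured with a small Python sketch that assigns each projected mutant to the class whose cluster centroid lies closest in the score plot. This nearest-centroid rule is an assumption chosen for simplicity, not the criterion used in Ref. [32]; the coordinates, mutant labels, and function name are invented for the example.

```python
import numpy as np


def nearest_cluster(point, clusters):
    """Return the name of the cluster whose centroid is closest to `point`."""
    centroids = {
        name: np.mean(np.asarray(members, dtype=float), axis=0)
        for name, members in clusters.items()
    }
    target = np.asarray(point, dtype=float)
    return min(centroids, key=lambda name: np.linalg.norm(target - centroids[name]))


# Invented (PC1, PC2) coordinates of reference enzymes in the score plot.
clusters = {
    "lipase": [(-2.0, 0.1), (-1.8, -0.2), (-2.2, 0.3)],
    "amidase": [(2.1, 0.0), (1.9, 0.4), (2.3, -0.3)],
}
# Invented projections of two virtual mutants.
mutants = {"mutant_A": (1.5, 0.2), "mutant_B": (-1.7, 0.0)}

# Keep only the mutants that have moved closer to the amidase cluster.
promising = [
    name for name, xy in mutants.items()
    if nearest_cluster(xy, clusters) == "amidase"
]
print(promising)
```

Only the variants selected in this way would then be produced and assayed experimentally, which is the point of screening in silico first.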
Therefore, this first application of BioGPS-UPCA to bioinformatic
analysis opens new perspectives toward 3D phylogenetic analysis
and toward the in silico design and screening of virtual mutants
endowed with activities of interest.

13.10 Conclusions

The catalytic activity and selectivity of each enzyme are the result of
an ensemble of structural, steric, electronic, and physical-chemical
factors that constitute a complex and pre-organized reaction
environment able to stabilize the TS of a given reaction.
The unbiased analysis of all these parameters and their
interactions requires informative molecular descriptors. In
particular, descriptors based on MIFs can be applied for analyzing
structural, electronic, and physical-chemical properties of
macromolecules and, consequently, for rational enzyme engineering
or even structure–function classification of enzymes. In that
respect, BioGPS-UPCA descriptors allow for the implementation of
the first example of a bioinformatic methodology that relies
entirely on the 3D structure analysis of enzymes and does not stem
from simple sequence or structure alignment.
Adequate multivariate statistical tools are needed for processing
the huge amount of information contained in the descriptors, as
well as for correlating structures and functions of enzymes [3]. The
versatility and the modest computational cost and operational time
make these hybrid approaches broadly accessible and applicable to
the study of multiple enzymatic properties.
More importantly, it has been shown that methodologies
exploiting descriptors based on MIFs and involving multivariate
statistical analyses are suitable for integration within automatic
computational workflows. In that respect, the first example of an
automated workflow for in silico design and screening of virtual
mutants opens new perspectives toward the rational engineering of
enzymes within a high-throughput scheme.

References

1. Warshel, A., and Levitt, M. (1976). Theoretical studies of enzymic
reactions: dielectric, electrostatic and steric stabilization of the
carbonium ion in the reaction of lysozyme, J. Mol. Biol., 103, pp.
227–249.
2. Braiuca, P., Knapic, L., Ferrario, V., Ebert, C., and Gardossi, L. (2009).
A three-dimensional structure activity-relationship (3D-QSAR) model
for predicting the enantioselectivity of Candida antarctica lipase B, Adv.
Synth. Catal., 351, pp. 1293–1302.
3. Braiuca, P., Ebert, C., Basso, A., Linda, P., and Gardossi, L. (2006).
Computational methods to rationalize experimental strategies in
biocatalysis, Trends Biotechnol., 24, pp. 419–425.
4. Todeschini, R., and Consonni, V. (2009). Molecular Descriptors for
Chemometrics (Wiley-VCH, Weinheim).
5. Goodford, P. J. (1985). A computational procedure for determining
energetically favorable binding sites on biologically important
macromolecules, J. Med. Chem., 28, pp. 849–857.
6. Pastor, M., Cruciani, G., McLay, I., Pickett, S., and Clementi, S. (2000).
GRid-INdependent descriptors (GRIND): a novel class of alignment-
independent three-dimensional molecular descriptors, J. Med. Chem.,
43, pp. 3233–3243.
7. Clementi, S., Cruciani, G., Riganelli, D., Valigi, R., Costantino, G., Baroni,
M., and Wold, S. (1993). Autocorrelation as a tool for a congruent
description of molecules in 3D QSAR studies, Pharm. Pharmacol. Lett.,
3, pp. 5–8.
8. Braiuca, P., Boscarol, L., Ebert, C., Linda, P., and Gardossi, L. (2006).
3D-QSAR applied to the quantitative prediction of penicillin G amidase
selectivity, Adv. Synth. Catal., 348, pp. 773–780.
9. Wold, S. (1987). Principal component analysis, Chemom. Intell. Lab. Syst.,
2, pp. 37–52.
10. Wold, S. (2001). PLS-regression: a basic tool of chemometrics, Chemom.
Intell. Lab. Syst., 58, pp. 109–130.
11. Ferrario, V., Braiuca, P., Tessaro, P., Knapic, L., Gruber, C., Pleiss, J., Ebert,
C., Eichhorn, E., and Gardossi, L. (2012). Elucidating the structural
and conformational factors responsible for the activity and substrate
specificity of alkanesulfonate monooxygenase, J. Biomol. Struct. Dyn., 30,
pp. 74–88.
12. Cruciani, G., Crivori, P., Carrupt, P. A., and Testa, B. (2000). Molecular
fields in quantitative structure–permeation relationships: the VolSurf
approach, J. Mol. Struct., 503, pp. 17–30.
13. Klibanov, A. M. (2001). Improving enzymes by using them in organic
solvents, Nature, 409, pp. 241–246.
14. Felluga, F., Pitacco, G., Valentin, E., Coslanich, A., Fermeglia, M., Ferrone,
M., and Pricl, S. (2003). Studying enzyme enantioselectivity using
combined ab initio and free energy calculations: α-chymotrypsin
and methyl cis- and trans-5-oxo-2-pentylpyrrolidine-3-carboxylates,
Tetrahedron Asymmetry, 14, pp. 3385–3399.
15. Philips, R. S. (1992). Temperature effect on stereochemistry of enzyme
reactions, Enzyme Microb. Technol., 14, pp. 417–419.
16. Ottosson, J., Rotticci-Mulder, J. C., Rotticci, D., and Hult, K. (2001).
Rational design of enantioselective enzymes requires consideration of
entropy, Protein Sci., 10, pp. 1769–1774.
17. Ottosson, J., Fransson, L., and Hult, K. (2002). Substrate entropy in
enzyme enantioselectivity: an experimental and molecular modeling
study of a lipase, Protein Sci., 11, pp. 1462–1471.
18. Villa, J., Strajbl, M., Glennon, T. M., Sham, Y. Y., Chu, Z. T., and Warshel, A.
(2000). How important are entropic contributions to enzyme catalysis?
Proc. Natl. Acad. Sci. U S A, 97, pp. 11899–11904.
19. Strajbl, M., Sham, Y. Y., Villa, J., Chu, Z. T., and Warshel A. (2000).
Calculations of activation entropies of chemical reactions in solution, J.
Phys. Chem. B, 104, pp. 4578–4584.
20. Vallin, M., Syren, P. O., and Hult, K. (2010). Mutant lipase-catalyzed
kinetic resolution of bulky phenyl alkyl sec-alcohols: a thermodynamic
analysis of enantioselectivity, ChemBioChem, 11, pp. 411–416.
21. Ferrario, V., Foscato, M., Ebert, C., and Gardossi, L. (2013). Thermody-
namic analysis of enzyme enantioselectivity: a statistical approach by
means of new differential HybridMIF descriptors, Biocatal. Biotrans-
form., 31, pp. 272–280.
22. Straathof, A. J. J., and Jongejan, J. A. (1997). The enantiomeric ratio:
origin, determination and prediction, Enzyme Microb. Technol., 21, pp.
559–571.
23. Uppenberg, J., Ohrner, N., Norin, M., Hult, K., Kleywegt, G. J., Patkar, S.,
Waagen, V., Anthonsen, T., and Jones, T. A. (1995). Crystallographic and
molecular-modeling studies of lipase B from Candida antarctica reveals
a stereospecificity pocket for secondary alcohols, Biochemistry, 34, pp.
16838–16851.
24. Ferrario, V., Ebert, C., Svendsen, A., Besenmatter, W., and Gardossi, L.
(2014). An integrated platform for automatic design and screening of
virtual mutants based on 3D-QSAR analysis, J. Mol. Catal. B, 101, pp.
7–15.
25. Claeyssens, F., Harvey, J., Manby, F., Mata, R., Mulholland, A., Ranaghan, K.
E., Schutz, M., Thiel, S., Thiel, W., and Werner, H. J. (2006). High-accuracy
computation of reaction barriers in enzymes, Angew. Chem., Int. Ed.,
118, pp. 7010–7013.
26. Parks, J., Hu, H., Rudolph, J., and Yang, W. (2009). Mechanism of
Cdc25B phosphatase with the small molecule substrate p-nitrophenyl
phosphate from QM/MM-MFEP calculations, J. Phys. Chem. B, 113, pp.
5217–5224.
27. Noodleman, L., Lovell, T., Han, W. G., Li, J., and Himo, F. (2004). Quantum
chemical studies of intermediates and reaction pathways in selected
enzymes and catalytic synthetic systems, Chem. Rev., 104, pp. 459–508.
28. Cruciani, G., Clementi, S., and Baroni, M. (1993). 3D QSAR in Drug Design:
Theory, Methods and Applications (ESCOM, Leiden), pp. 567–582.
29. modeFRONTIER, ESTECO s.p.a., www.esteco.com.
30. Deb, K., Pratap, A., Agarwal, S., and Meyarivan, T. (2002). A fast and elitist
multiobjective genetic algorithm: NSGA-II, IEEE Trans. Evol. Comput., 6,
pp. 182–197.
31. Warshel, A., Sharma, P. K., Kato, M., Xiang, Y., Liu, H., and Olsson, M. H.
(2006). Electrostatic basis for enzyme catalysis, Chem. Rev., 106, pp.
3210–3235.
32. Ferrario, V., Siragusa, L., Ebert, C., Baroni, M., Foscato, M., Cruciani, G.,
and Gardossi, L. (2014). BioGPS descriptors for rational engineering of
enzyme promiscuity and structure based bioinformatic analysis, PLOS
ONE, 9, p. e109354.
33. Baroni, M., Cruciani, G., Sciabola, S., Perruccio, F., and Mason, J. S. (2007).
A common reference framework for analyzing/comparing proteins and
ligands. Fingerprints for Ligands and Proteins (FLAP): theory and
application, J. Chem. Inf. Model., 47, pp. 279–294.
34. Sciabola, S., Santon, R. V., Mills, J. E., Flocco, M. M., Baroni, M., et al.
(2010). High-throughput virtual screening of proteins using GRID
molecular interaction fields, J. Chem. Inf. Model., 50, pp. 150–169.
35. Brincat, J. P., Carosati, E., Sabatini, S., Manfroni, G., Fravolini, A., et al.
(2011). Discovery of novel inhibitors of the NorA multidrug transporter
of Staphylococcus aureus, J. Med. Chem., 54, pp. 354–365.
February 2, 2016 16:49 PSP Book - 9in x 6in 14-Allan-Svendsen-c14

Chapter 14

Hydration Effects on Enzyme Properties in Nonaqueous Media Analyzed by MD Simulations

Diana Lousa, António M. Baptista, and Cláudio M. Soares

Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa,
Av. da República, 2780-157 Oeiras, Portugal
dlousa@itqb.unl.pt, claudio@itqb.unl.pt

The discovery that enzymes can work in nonaqueous solvents
opened the way for the development of a vast number of enzymatic
processes with tremendous technological potential. The advantages
of nonaqueous biocatalysis include the possibility of controlling
enzyme selectivity, using hydrophobic substrates, avoiding unwanted
side reactions, and preventing microbial contamination [1].
Even when the reactions take place in nonaqueous media, they
rarely occur in a completely anhydrous environment. In most cases
a certain amount of water is required for the enzyme to function,
and there is an optimum amount of water that maximizes its
activity [2]. Therefore, hydration plays a crucial role in nonaqueous
biocatalysis.
Molecular dynamics (MD) simulations have provided major
contributions to the elucidation of the molecular determinants
of hydration effects on enzyme properties in nonaqueous media,
including organic solvents, ionic liquids, and supercritical fluids.
This chapter provides an overview of the major findings of these
studies.

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

14.1 Enzyme Reactions in Nonaqueous Solvents

Water is the most abundant solvent on our planet and the major
constituent of living cells. Therefore, most enzymes have evolved in
an aqueous environment and the common belief was that they could
not catalyze reactions in nonaqueous media. However, this notion
was proven wrong when it was shown that enzymes could function
in a biphasic system comprising water and a water-immiscible
organic solvent [3]. This finding paved the way for nonaqueous
enzymology, which emerged as a promising field.
Since the 1980s, several studies have shown that many enzymes
could not only function in organic solvents but also display
interesting novel properties [1, 4, 5]. These properties include
altered substrate specificity and enantioselectivity, suppression
of unwanted hydrolysis side reactions, increased stability, and
molecular memory. Additionally, solvents that are less polar than
water can solvate hydrophobic substrates and/or products. Organic
solvents are also considerably less prone to microbial contamination
than water. Due to all these advantages, nonaqueous enzymology
has a large technological potential and has been applied in several
industrial processes (see Ref. [6] for further information).
In addition to its biotechnological potential, nonaqueous
enzymology is also interesting from a fundamental point of view.
When analyzing the behavior of proteins, scientists often pay
little attention to the role played by the solvent (in most cases
water). Nevertheless, enzyme properties (e.g., native fold, stability,
flexibility, activity, protonation and redox state, and interaction
with counterions) are strongly influenced by the environment that
surrounds them. Studying enzymes in nonaqueous solvents enables
the comparison of their properties in different media and the
analysis of how these properties depend on the characteristics
of the solvent (polarity, hydrophilicity, density, viscosity, etc.).
Additionally, in this type of medium, the amount of water present
can be controlled, and by varying this property, one can determine
the role of water in enzyme structure and function [2].

14.2 Classes of Nonaqueous Solvents

In a broad sense, the term “nonaqueous” can be applied to any
medium containing a considerable percentage of solvent(s) other
than water and having properties distinct from those of pure
water. A large number of nonaqueous solvents, with very different
characteristics, have been used in enzyme catalysis, and they can be
divided into three main classes: organic solvents, ionic liquids (ILs),
and supercritical fluids.
Organic solvents are carbon-containing compounds that exist
in liquid form at room temperature and are usually volatile.
They can be further subdivided into polar and apolar according
to their dielectric constant. Polar organic solvents can be either
protic or aprotic. Protic solvents have a labile hydrogen atom
attached to a strongly electronegative atom (O or N), which
can be donated in a hydrogen bond. Additionally, these O–H or
N–H bonds can serve as a source of protons, although many protic
solvents (e.g., ethanol) are very weak acids. Aprotic solvents, on the
other hand, lack O–H or N–H bonds and, therefore, do not act as
hydrogen donors in hydrogen bonds or behave as acids. Another
property that is used to classify organic solvents is hydrophilicity,
which refers to the tendency of a molecule to be solvated by water.
ILs are salts that are liquid at, or close to, room temperature.
These salts are usually composed of an organic cation and an
inorganic anion. Through the combination of different cations and
anions the properties of the liquid can be tuned. This makes ILs very
powerful solvents. Additionally, they have negligible vapor pressure
and are nonflammable, which can make them more environmentally
friendly than common organic solvents.
Supercritical fluids are substances that are above their critical
temperature and pressure values. Under these conditions, they
combine liquid and gas properties. These fluids are very good
solvents because they are able to diffuse like gases and dissolve
solutes like liquids. Supercritical fluids like supercritical CO2
(scCO2), which are nontoxic and nonflammable, are emerging as
promising new solvents and are being used in several industrial
applications, including enzymatic processes [7].
Given the diversity of nonaqueous solvents and their divergent
properties, enzymes display different behaviors in different types
of nonaqueous solvents. The biotechnological potential of these
catalysts can be increased by taking advantage of this fact, that
is, one can choose the most appropriate solvent (and control
other reaction conditions) to obtain the desired enzyme properties
(medium engineering).

14.3 The Role of Water in Nonaqueous Biocatalysis

One of the most important factors controlling enzyme function in
nonaqueous solvents is water, or, more precisely, the amount of
water that is available to lubricate the protein [2]. It has long been
recognized that enzyme properties, such as activity, selectivity, and
stability, vary with the water content of the media [2]. Therefore, this
parameter can be adjusted to obtain the desired enzyme properties.
Several experimental studies have shown that there is an optimal
water amount that maximizes the activity of a given enzyme in a
given solvent [8–12]. The optimum value varies with the type of
solvent that is used, with polar solvents requiring higher water
amounts than nonpolar ones. This is due to the fact that polar
solvents can “strip” water molecules from the protein surface.
Interestingly, when enzyme activity is measured as a function of
water activity, the optimum value is similar for different solvents
[11]. Thus, water activity seems to be more relevant than water
concentration [2, 13, 14].

14.4 Effect of Water Content on Enzyme Structure and Dynamics

To find a molecular explanation for the observed dependence
of enzyme activity on the hydration conditions of the protein,
researchers have turned to MD simulations. This technique has
proved to be particularly useful because it enables the analysis of the
structural and dynamic properties of molecules in atomic detail.
In one of the pioneering studies addressing this question, MD
simulations of cutinase and ubiquitin in hexane were performed
under different hydration conditions in order to analyze the
influence of water content on the enzyme structure and dynamics
[15]. This analysis revealed that there is an optimum water
concentration (∼10% w/w) at which the enzyme structural properties
in hexane are most similar to the ones found in water. The
comparison of protein fluctuations at different hydration levels
showed that the protein becomes more flexible as the water
concentration increases [15]. These results shed light on the
experimental findings showing that there is a bell-shaped
dependence of enzymatic activity on hydration [8, 11, 12]. When the
amount of water present in the medium is below the optimum, the
enzyme is too rigid and, thus, cannot efficiently catalyze the
reaction. When the amount of water is above the optimum, the enzyme
starts to deviate from its native structure, which compromises its
function.
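Protein fluctuations of the kind compared above are commonly quantified as the root-mean-square fluctuation (RMSF) of each atom around its average position. The Python sketch below is a generic illustration of that quantity, not the analysis protocol of Ref. [15]; it assumes that the trajectory has already been fitted to a reference structure, and the function name and toy coordinates are hypothetical.

```python
import numpy as np


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation around the mean position.

    trajectory: array of shape (n_frames, n_atoms, 3), assumed to be
    already superimposed on a common reference structure.
    """
    coords = np.asarray(trajectory, dtype=float)
    mean_positions = coords.mean(axis=0)                      # (n_atoms, 3)
    sq_dev = np.sum((coords - mean_positions) ** 2, axis=2)   # (n_frames, n_atoms)
    return np.sqrt(sq_dev.mean(axis=0))                       # (n_atoms,)


# Toy trajectory with two frames and two atoms: atom 0 is fixed,
# atom 1 oscillates by one length unit around the origin along x.
toy_trajectory = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
]
print(rmsf(toy_trajectory))
```

Averaging such per-atom values over the protein for simulations run at different water contents gives a simple measure of how overall flexibility changes with hydration.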
MD simulations have also provided molecular insights into the
factors that explain the dependence of enzyme properties on the
water content of the media. One factor that seems to be particularly
important is the effect that the hydration level has on the stability
of intramolecular hydrogen bonds. Several MD simulation studies
have shown that the number of persistent intraprotein hydrogen
bonds decreases when the water content of the medium increases
(Fig. 14.1) [15–18]. As the water amount increases, there are more
water molecules available to form hydrogen bonds with the protein,
replacing intraprotein hydrogen bonds. This is one of the reasons
why proteins are very rigid in the absence of water and become
more flexible in the presence of higher amounts of water. In the
absence of water the protein is strongly constrained by the large
number of intramolecular hydrogen bonds. As the water content
increases, the intramolecular interactions become weaker and the
protein becomes less restricted.
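One simple way to count persistent hydrogen bonds is to record, for every frame, which intraprotein hydrogen bonds are present and then keep those found in at least a given fraction of the frames. The Python sketch below assumes that this presence/absence table has already been produced by some geometric hydrogen bond criterion, which is not shown; the 75% threshold, the function name, and the toy data are illustrative assumptions rather than the definitions used in Refs. [15–18].

```python
import numpy as np


def count_persistent_bonds(occupancy, min_fraction=0.75):
    """Count hydrogen bonds present in at least `min_fraction` of the frames.

    occupancy: boolean array of shape (n_frames, n_bonds); entry [f, b]
    is True when bond b is formed in frame f.
    """
    present = np.asarray(occupancy, dtype=bool)
    fraction_of_frames = present.mean(axis=0)  # per-bond occupancy in [0, 1]
    return int(np.sum(fraction_of_frames >= min_fraction))


# Toy table with five frames and three bonds: bond 0 is always formed,
# bond 1 is formed in four frames, and bond 2 in two frames only.
toy_occupancy = [
    [True, True, False],
    [True, True, True],
    [True, True, False],
    [True, False, True],
    [True, True, False],
]
print(count_persistent_bonds(toy_occupancy))
```

Repeating the count for simulations at increasing water content would give a curve of the type shown in Fig. 14.1.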
Another issue that was analyzed concerns the distribution
of water molecules between the protein surface and the bulk
solution. As expected, MD simulations of enzymes in nonpolar
solvents revealed that water molecules tend to stay bound to the
protein surface, interacting with polar and charged residues. At
low hydration levels, water molecules form many isolated clusters
around the polar regions of the protein. As the water percentage
increases, these clusters grow until the whole hydrophilic surface of
the protein is covered. In some cases, when the amount of water is
very high, some clusters detach from the protein surface and migrate
to solution [15–18].

Figure 14.1 Effect of water percentage on the number of hydrogen bonds
observed in MD simulations of cutinase in hexane. Reprinted from Ref. [15],
Copyright (2003), with permission from Elsevier.

Other enzyme properties that are affected by the water content
include the radius of gyration and the solvent-accessible surface.
Both of these properties increase with the addition of water. This
is due to the fact that in nonpolar solvents, polar residues tend to
form interactions with each other or turn to the protein interior in
order to avoid being exposed to the solvent, whereas in water they
turn to solution, leading to the expansion of the protein. Moreover,
as expected, the hydrophobic/hydrophilic ratio decreases as the
hydration level increases [15, 18].
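The radius of gyration mentioned above is the mass-weighted root-mean-square distance of the atoms from the center of mass, so an expanding protein gives a larger value. The following Python sketch computes it for a single set of coordinates; the function name and the toy coordinates are hypothetical, and equal masses are assumed when none are supplied.

```python
import numpy as np


def radius_of_gyration(coords, masses=None):
    """Mass-weighted RMS distance of the atoms from their center of mass."""
    xyz = np.asarray(coords, dtype=float)
    weights = np.ones(len(xyz)) if masses is None else np.asarray(masses, dtype=float)
    center_of_mass = np.average(xyz, axis=0, weights=weights)
    sq_distances = np.sum((xyz - center_of_mass) ** 2, axis=1)
    return float(np.sqrt(np.average(sq_distances, weights=weights)))


# Toy "compact" and "expanded" arrangements of four equal-mass atoms.
compact = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
expanded = [(2, 0, 0), (-2, 0, 0), (0, 2, 0), (0, -2, 0)]
print(radius_of_gyration(compact), radius_of_gyration(expanded))
```

Applied frame by frame to a trajectory, the same function gives the time course of the protein's compactness at each hydration level.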

The interaction between the protein and counterions is also
influenced by the amount of water present in the medium. In
simulations performed in pure nonpolar solvents, counterions tend
to remain associated with the protein, in the positions where they
were placed at the beginning of the simulations [15, 16]. When the
water percentage increases, ions start to form ion pairs and the
protein tends to follow these ions, which may be one of the reasons
underlying the structural disruption observed at high hydration
levels [15, 16].

14.5 Effect of Water Content on Enzyme Selectivity

One of the major goals of nonaqueous biocatalysis is to increase
enzyme enantioselectivity, since there is a great technological
interest in obtaining enantiopure products. It has been shown that
enzyme enantioselectivity changes in different solvents, and for the
same solvent, it is also dependent on the water content of the media
[1, 19].
Several theoretical works have focused on the analysis of enzyme
enantioselectivity in nonaqueous media [20–23]. However, to our
knowledge, the role of hydration has only been extensively analyzed
in one MD simulation study, where cutinase was used as a model
enzyme and hexane as a model solvent [22]. The transesterification
reactions between vinyl butyrate and two aromatic alcohols
(1-phenylethanol and 2-phenyl-1-propanol) were studied.
The rate-limiting step of transesterification reactions catalyzed
by serine proteases is the formation of the second tetrahedral
intermediate [24]. It is generally assumed that the structure of
the transition state that leads to the tetrahedral intermediate
is similar to the tetrahedral intermediate itself [24]. Thus, the
enantioselectivity can be measured by calculating the free-energy
difference between the R and the S enantiomers of the tetrahedral
intermediate. In this study, the free-energy difference between the
two enantiomers was obtained by the thermodynamic integration
method [22]. The free-energy calculations indicated that the R
enantiomer is preferred in the two reactions analyzed, which is
in agreement with the experimentally measured enantioselectivity
[22].
The free-energy difference between the two enantiomers was
affected by the water percentage of the media, displaying higher
values at 5%–10% (w/w), which means that the enantioselectivity
of the enzyme is higher in this hydration range [22]. According
to the simulations, the interaction between the catalytic histidine
and the R enantiomer of the tetrahedral intermediate is stabilized
at this hydration level, whereas the same is not observed for
the S enantiomer [22]. This increases the stability of the R
enantiomer relative to the S enantiomer, which explains why the
enantioselectivity is higher in these conditions.
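The link between the computed free-energy difference and the enantioselectivity can be illustrated with a short sketch. It assumes, as the text does, that the free-energy difference between the R and S tetrahedral intermediates approximates the difference in activation free energies, so that the enantiomeric ratio is E = exp(−ΔΔG/RT). The function names, the sign convention (ΔΔG = G(R) − G(S)), and the trapezoidal integration over the coupling parameter are illustrative assumptions, not the protocol of Ref. [22].

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def ti_free_energy(lambdas, dh_dlambda):
    """Thermodynamic integration: trapezoidal estimate of the integral of
    <dH/dlambda> over lambda (hypothetical helper, units of kJ/mol)."""
    if len(lambdas) != len(dh_dlambda) or len(lambdas) < 2:
        raise ValueError("need matching lists with at least two points")
    total = 0.0
    for i in range(1, len(lambdas)):
        width = lambdas[i] - lambdas[i - 1]
        total += 0.5 * (dh_dlambda[i] + dh_dlambda[i - 1]) * width
    return total


def enantiomeric_ratio(ddg_kj_mol, temperature_k=298.15):
    """E = exp(-ddG/RT), with ddG = G(R) - G(S) (assumed sign convention).
    A negative ddG therefore corresponds to a preference for R (E > 1)."""
    return math.exp(-ddg_kj_mol / (R_KJ * temperature_k))


# Illustrative (made-up) ensemble averages at three lambda windows.
ddg = ti_free_energy([0.0, 0.5, 1.0], [0.0, -4.0, -8.0])
print(f"ddG = {ddg:.2f} kJ/mol, E = {enantiomeric_ratio(ddg):.2f}")
```

In practice many more λ windows would be used, and the statistical error of each ensemble average would be propagated into ΔΔG before interpreting E.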

14.6 Hydration Mechanisms of Enzymes in Polar and Nonpolar Solvents

As has been mentioned above, one of the properties that is used
to classify organic solvents is polarity. This property is especially
relevant in the field of nonaqueous biocatalysis, given that proteins
display rather different behaviors in polar and nonpolar solvents.
Several experimental studies have shown that the optimal
amount of water for a given enzyme to function in a nonaqueous
solvent is strongly dependent on solvent polarity [11, 25–27], and
this issue has also been analyzed using MD simulations [16, 28, 29].
In accordance with experimental findings, the simulations showed
that polar solvents often replace water molecules at the protein
surface [16, 28, 29]. In these media, protein–water interactions are
weak and transient and water molecules frequently migrate to the
bulk solution [16, 28, 29]. This is in contrast with what happens in
nonpolar solvents, where water molecules remain firmly attached to
the protein throughout the simulations [15, 16, 18, 28, 29].
MD simulations of cutinase in different solvents have also shed
light on the effect of different solvents on the localization of water
molecules [16]. The probability density maps obtained from the
simulations revealed that, independently of the solvent that is used,
it is possible to identify regions on the enzyme surface where there
is an accumulation of water molecules (Fig. 14.2) [16]. Interestingly,
the most populated regions are the same in all the solvents tested,
independently of their polarity. As expected, the most populated
areas are composed of polar and charged residues. The main
difference observed among the different solvents is that water
covers a larger fraction of the protein surface in nonpolar than in
polar solvents. However, even in the former case, some regions of
the protein rarely interact with water [16]. These results explain the
experimental observations, indicating that the optimum hydration
level required for catalysis is shifted toward higher values in the case
of polar/hydrophilic solvents. Given that these solvents are able to
strip water molecules from the protein surface, one needs to add
large amounts of water to be able to cover the protein surface and
obtain functional enzymes.

Figure 14.2 Spatial distribution probability density of water in MD simulations of cutinase in different solvents. From Ref. [16]. Copyright © 2007. Reproduced with permission of John Wiley & Sons, Inc.

The effect of water concentration on the structural properties of
enzymes is also influenced by the nature of the solvent. In nonpolar
solvents, there is an optimum amount of water (which depends on
the solvent) at which the enzyme structure is more similar to the one
observed in water [16]. In polar solvents such as acetonitrile and
ethanol this trend is not observed, probably because only a small
amount of water is found at the enzyme surface [16].
The dependence of enzyme flexibility on the hydration condi-
tions also varies with the type of solvent [16, 28, 29]. Trodler
and Pleiss found that the model enzyme Candida antarctica lipase
B (CaLB) displays greater conformational mobility in simulations
performed in polar media than in those performed in
nonpolar media [28]. Their results indicate that this is due to the
formation of a network of water molecules with long residence times
in nonpolar solvents, which makes the protein more rigid [28].

14.7 Enzyme Behavior as a Function of Water Activity

As discussed in the previous section, the amount of water that
binds to the protein surface depends on the polarity of the organic
solvent used. At the same hydration percentage, the number of
water molecules that interact with the protein is larger in nonpolar
than in polar solvents. It is, thus, more useful to describe the
hydration conditions of a protein using water activity (aw), because
this parameter is a measure of the available water in the system.
The use of this parameter has proven to be quite effective, and
several studies have shown that the value of aw that maximizes
the activity of a given enzyme does not depend on the nature of the
solvent [2, 13, 30]. Thus, it has become common practice to control
aw in nonaqueous enzymology experiments.
The value of aw is not easy to determine in MD simulations, and
therefore, in the pioneering simulations of enzymes in nonaqueous
media, the hydration conditions were described in terms of water
percentage. Nevertheless, strategies to calculate water activities in
MD simulations have been developed in recent years.
One of these strategies is based on MD simulations
performed in the gas phase at different water activities, which are
used to obtain water adsorption/desorption isotherms. Branco et
al. used this methodology to analyze the hydration mechanism and
the effect of water activity (aw) on the structure and dynamics of
the model enzyme CaLB [31]. The water adsorption/desorption
isotherms obtained in the simulations were in good agreement
with the experimental counterparts measured by Branco et al. They
observed that the water adsorption increases linearly with aw for
water activity values below 0.4. At larger values of aw, the number of
water molecules bound to the protein shows a sharp increase.
The profile of the isotherms could be explained by the analysis
of the simulations. When the water activity is low, water molecules
accumulate around the exposed polar groups of the protein. The
number of water molecules bound to the protein increases linearly
until the whole hydrophilic surface of the protein is covered. After
the saturation of the first hydration shell, the water starts forming
new layers around the protein, which explains the sharp increase in
the adsorption isotherms [31].
In this study the authors did not find a correlation between
the enzyme structural properties and aw. The conformational
flexibility of CaLB, however, was affected by the water activity of the medium,
with some regions of the protein becoming more flexible and others
more rigid upon an increase in aw [31].
Other researchers have used a completely different approach to
calculate and control the water activity in MD simulations [32]. In
this approach the water activity in a given solvent is calculated using
the relationship aw = γw(xw) · xw, where xw is the mole fraction of bulk
water in the system and γw(xw) is the water activity coefficient in
the water/organic solvent mixture.
Wedberg et al. simulated CaLB in five different solvents (water,
methanol, tert-butyl alcohol, methyl tert-butyl ether [MTBE], and
hexane) [32]. For the nonpolar solvents hexane and MTBE, the value
of γw(xw) could be approximated by the water activity coefficient at infinite
dilution (γw∞), because the mole fraction of bulk water was smaller than
0.02 in all their tests. The value of γw∞ can be obtained by using
the free energy of solvation of water, calculated through free-energy
perturbation methods. For the polar solvents, the authors had to
combine free-energy calculations with Kirkwood–Buff theory to
obtain the values of γw (xw ).
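The relationship between water activity, mole fraction, and activity coefficient can be made concrete with a minimal sketch. The expression used for γw∞ is the standard thermodynamic relation between an infinite-dilution activity coefficient and solvation free energies (with molar densities entering through the change of reference state). It is offered as an assumption about how such a calculation could be set up, not as the exact procedure of Ref. [32]. All function names and numerical inputs are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def gamma_inf_from_solvation(dg_solv_in_solvent, dg_solv_in_water,
                             rho_solvent, rho_water, temperature_k=298.15):
    """Infinite-dilution activity coefficient of water in an organic solvent.

    dg_solv_*: solvation free energies of one water molecule (kJ/mol) in the
               organic solvent and in pure water, respectively.
    rho_*:     molar densities (mol/L) of the pure solvent and of pure water.
    Assumes the pure-liquid (Raoult's law) reference state for water.
    """
    rt = R_KJ * temperature_k
    return (rho_solvent / rho_water) * math.exp(
        (dg_solv_in_solvent - dg_solv_in_water) / rt)


def water_activity(x_w, gamma_w):
    """aw = gamma_w(x_w) * x_w, with x_w the mole fraction of bulk water."""
    if not 0.0 <= x_w <= 1.0:
        raise ValueError("mole fraction must lie between 0 and 1")
    return gamma_w * x_w


# Hypothetical inputs, chosen only to illustrate the arithmetic.
gamma = gamma_inf_from_solvation(-8.0, -26.0, rho_solvent=7.6, rho_water=55.3)
print(f"gamma_inf = {gamma:.1f}, aw at x_w = 0.0005: {water_activity(0.0005, gamma):.3f}")
```

The sketch shows why a very small mole fraction of water in a nonpolar solvent can still correspond to a high water activity: the weak solvation of water in such media makes γw∞ very large.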
After calculating the water activity, they analyzed the effect of this
parameter on the hydration level of the protein. Interestingly, their
analysis revealed that at the same water activity, the hydration level
(defined as the number of water molecules whose oxygen atom is
within 3.5 Å of any nonhydrogen atom of the protein) is similar for all
the solvents tested. These findings are in excellent agreement with
previous experimental observations [30] and corroborate the notion
that water activity is a good parameter to evaluate the hydration
conditions of a protein.
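The hydration-level definition given above translates directly into a distance count. The following is a minimal sketch for a single simulation frame, assuming Cartesian coordinates in ångströms and ignoring periodic boundary conditions. The function name and the toy coordinates are hypothetical and are not taken from Ref. [32].

```python
import math


def hydration_level(water_oxygens, protein_heavy_atoms, cutoff=3.5):
    """Number of water molecules whose oxygen lies within `cutoff` (in
    angstroms) of any nonhydrogen protein atom. Coordinates are (x, y, z)
    tuples; periodic images are not considered in this sketch."""
    count = 0
    for oxygen in water_oxygens:
        if any(math.dist(oxygen, atom) <= cutoff for atom in protein_heavy_atoms):
            count += 1
    return count


# Toy frame: two protein heavy atoms and three water oxygens.
protein = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
waters = [(0.0, 0.0, 3.0), (5.0, 3.4, 0.0), (0.0, 0.0, 10.0)]
print(hydration_level(waters, protein))
```

For a trajectory, the count would be averaged over frames, and a neighbor-search structure would replace the double loop for systems of realistic size.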
In the same study, the authors also observed that the structural
and dynamical properties of CaLB (measured using the root-mean-
square deviation [RMSD], surface-accessible areas, and protein
fluctuations) are influenced by the water activity of the system [32].
In accordance with other studies [15, 16], in nonpolar solvents,
the variation of the RMSD with aw displayed a U-shaped profile,
indicating that there is an optimum water activity at which the
enzyme is most native-like. The enzyme flexibility increased linearly
with aw for all the solvents [32].
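How a U-shaped RMSD profile identifies the most native-like condition can be sketched as follows. The example assumes that each simulated structure has already been superimposed on the reference structure, so no fitting step is included, and the RMSD values in the profile are invented for illustration only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized coordinate
    sets that are assumed to be already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def most_native_like(profile):
    """Given a mapping {water activity: RMSD}, return the water activity
    with the lowest RMSD, i.e., the bottom of the U-shaped profile."""
    return min(profile, key=profile.get)


# Hypothetical RMSD values (nm) at three water activities.
profile = {0.1: 0.30, 0.5: 0.18, 0.9: 0.27}
print(most_native_like(profile))
```

In a real analysis the RMSD at each water activity would be an average over the equilibrated part of several replicate trajectories.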

14.8 Hydration Effects on Enzyme Reactions in Ionic Liquids

In the last decade, ILs have emerged as promising alternative
media for biocatalytic processes. They offer many advantages over
conventional organic solvents, such as the ability to enhance enzyme
activity and stability, the capacity to dissolve a wide range of solutes,
and a low vapor pressure (which can make them less toxic than
organic solvents). The most interesting characteristic of ILs is their
versatility: by combining different cations and anions, one can
obtain ILs with different physicochemical properties, which is
extremely valuable for their application in enzyme catalysis.
Exploring the potential of ILs as reaction media for biocatalysis
requires a detailed knowledge of the interaction between these
solvents and enzymes. Molecular simulations have been able to shed
light on this matter [33–35]. The general conclusion that emerged
from these studies was that protein–anion interactions play a crucial
role in enzyme stability in ILs. Smaller anions, with higher charge
densities, tend to have a deleterious effect on enzyme structure,
because they compete with intramolecular hydrogen bonds [33–35].
Understanding the role of hydration in these media is also crucial,
and this subject has likewise been analyzed by MD simulations. In the
first study of this kind, the serine hydrolase cutinase was simulated
in two ILs ([BMIM][PF6] and [BMIM][NO3]), which contain the same
cation paired with different anions, at two different temperatures
(298 and 343 K) [35]. Given that no parameters were
available for these solvents, they had to be parameterized in a
preliminary work [36]. The role of hydration was analyzed
by performing simulations at 11 different water percentages [35].
These simulations showed that, given that ILs are quite hydrophilic,
most water molecules tend to go into the bulk solution, similar to
what happens in polar organic solvents. The water molecules tend to
interact preferentially with the anion species of the IL via hydrogen
bonds. The hydration profile of the enzyme is quite different in
the two ILs. The amount of water that is retained at the enzyme
surface at a given water percentage in [BMIM][PF6] is considerably
larger than in [BMIM][NO3], indicating that the latter IL has a
higher tendency to strip water molecules from the protein surface
(Fig. 14.3) [35].
The results also revealed that the structural properties of the
enzyme (measured by RMSD from the X-ray structure) are strongly
dependent on the amount of water present in the media. In the
case of [BMIM][PF6] at room temperature, the plot of enzyme RMSD
versus water percentage has a U-shaped form, with the enzyme
being more native-like when the percentage of water is 5%–10%
(w/w) [35]. This U-shaped profile is similar to the one observed
in other nonaqueous solvents [15, 16]. The deviation from the
X-ray structure found when there is no water in solution can be
due to the reduced flexibility of the enzyme, which prevents it from
reaching a native-like structure. It can also be a consequence of the
lack of essential water molecules, which are necessary to maintain
the structure. The addition of only a small amount of water (5%–
10% w/w) is sufficient to make the enzyme more native like. When
the water percentage is above this level some regions of the protein
undergo large conformational changes and the protein structure
becomes less native-like [35].

Figure 14.3 Average number of water molecules at a distance of less than 0.25 nm from the enzyme surface of cutinase in simulations performed in [BMIM][PF6] and [BMIM][NO3] at 298 and 343 K. The curve representing simulations in acetonitrile at 298 K is shown as a reference [16]. Reprinted with permission from Ref. [35], Copyright 2008 American Chemical Society.

Curiously, in the case of [BMIM][NO3], the plot of RMSD versus
water percentage displays an inverted-U profile, with the enzyme
being least native-like at 35% water. This trend is quite uncommon
for enzymes in nonaqueous solvents [35]. One possible explanation
is that at this hydration level the enzyme is flexible
enough to allow the penetration of NO3− ions, which disrupt
the protein structure, whereas at lower water percentages the
enzyme is less flexible and, therefore, less permeable to the
anions. At higher hydration levels, the number of water
molecules bound to the protein surface is larger, which can help to
prevent the access of NO3−.

14.9 Hydration Effects on Enzyme Reactions in Supercritical Fluids

The use of supercritical fluids in biocatalytic processes has also
become common in recent years, and a considerable number of
experimental studies have helped to elucidate how enzymes behave
in these unconventional media. On the other hand, the number of
theoretical studies focusing on this subject is still small, although it
has increased in recent years [37–39].
In one of the few simulation studies of enzymes in supercritical fluids,
the authors used CaLB as a model to study enzyme stability
in scCO2/liquid water biphasic mixtures. Particular attention was
given to the role of hydration, which was analyzed by performing
simulations using different amounts of water, varying between
0 and 3000 water molecules for 10,000 CO2 molecules [39]. To
simulate CO2 in the supercritical state, the temperature and pressure
were maintained at 328.15 K and 100 bar, respectively. In these
conditions, CO2 is in a supercritical state, whereas the TIP3P water
model used is well below its critical point, which means that water
is in the liquid state. Under these conditions, the two solvents do not
mix, forming interfacial biphasic systems consisting of liquid water
and scCO2 .
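The phase assignment described above can be checked with a few lines of code. The sketch below uses the experimental critical constants of CO2 and water as a stand-in. This is an assumption, because the critical points of the simulation models (e.g., TIP3P water) differ somewhat from the experimental values. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriticalPoint:
    temperature_k: float
    pressure_bar: float


# Experimental critical constants (approximate literature values).
CO2 = CriticalPoint(temperature_k=304.13, pressure_bar=73.77)
WATER = CriticalPoint(temperature_k=647.10, pressure_bar=220.64)


def is_supercritical(fluid, temperature_k, pressure_bar):
    """A fluid is supercritical when both temperature and pressure exceed
    their critical values."""
    return (temperature_k > fluid.temperature_k
            and pressure_bar > fluid.pressure_bar)


# Simulation conditions quoted in the text: 328.15 K and 100 bar.
for name, fluid in (("CO2", CO2), ("water", WATER)):
    print(name, is_supercritical(fluid, 328.15, 100.0))
```

At 328.15 K and 100 bar only CO2 satisfies both conditions, which is consistent with the biphasic scCO2/liquid water system described in the text.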
The simulations revealed that the RMSD of the protein increases
sharply in the absence of water, indicating that the protein is
very unstable in these conditions. When water is added to the
system, the protein becomes considerably more stable [39]. This is
consistent with previous experimental studies showing that CaLB
is inactive in anhydrous CO2 . Interestingly, only a small amount of
water is necessary to maintain the protein structure, and increasing
the water content does not have a significant effect. The authors
observed that water molecules bind to specific regions of the protein
and prevent CO2 molecules from penetrating into the core of the
protein, which is the main cause of the unfolding observed in
anhydrous conditions. This explains why only a small amount of
water is required to maintain enzyme stability [39].
On the one hand, the simulations indicate that the most flexible
regions of the protein are the same in pure water and in the water/
scCO2 biphasic mixture and are mainly located in the active site,
which is consistent with the observation that the enzyme maintains
its activity in these mixtures. On the other hand, in the absence of
water, some regions of the protein far from the active site undergo
large conformational changes. Moreover, the results indicate that the
interaction between the catalytic residues is severely compromised
in anhydrous scCO2, which explains why the protein becomes
inactive in these conditions [39].
Another aspect that was analyzed in this study concerns the
distribution of CO2 and water molecules on the protein surface. The
two solvents are heterogeneously distributed around the protein,
with water interacting with hydrophilic regions and CO2 accumulating
around hydrophobic areas. Access to the active site occurs
through a nonpolar tunnel, which is mainly surrounded by
CO2 molecules. Given that lipases, such as CaLB, catalyze reactions
involving hydrophobic substrates, the layer of CO2 surrounding the
entrance to the active site may facilitate the access of the substrates,
increasing the enzymatic activity [39].

14.10 Conclusions

Hydration plays a major role in nonaqueous enzymology, and this role has
been extensively investigated in molecular simulation studies. These
studies showed that enzyme properties are strongly dependent on
the water content of the media. There is an optimum amount of
water at which the enzyme structural and dynamical properties in
a given nonaqueous solvent are more similar to the ones found in
pure water. At very low hydration levels the protein is very rigid
and, therefore, less efficient. As the amount of water increases, the
protein becomes more flexible and more active. At a certain point,
it becomes so flexible that organic molecules are able to penetrate
its core, leading to unfolding. Enzyme selectivity displays a similar
dependence on the water content of the media.
The optimum water percentage for enzyme activity depends
on the nature of the solvent, with polar solvents requiring higher
water amounts. This is due to the fact that in polar media water
molecules tend to be in the bulk solution, which means that higher
amounts of water have to be added to the system to lubricate
the protein. Both experimental and theoretical studies have shown
that if enzyme properties are measured as a function of water
activity instead of water percentage, the optimum value is similar
for different solvents, although there are some exceptions. Water
activity is, therefore, an appropriate parameter to describe the
hydration conditions of an enzyme.
In recent years, two new classes of nonaqueous solvents have
emerged as promising alternatives to organic solvents: ILs and
supercritical fluids. ILs are salts that are liquid at (or close to)
room temperature. The main advantage of these solvents is that
they can be tuned by combining different cations and anions.
Simulation studies have shown that protein–anion interactions have
a strong effect on enzyme properties, with smaller anions causing
protein unfolding. These studies also revealed that the amount of
water dissolved in an IL influences the behavior of the enzyme. In
accordance with what is observed in organic solvents, there is an
optimum water amount for enzyme activity in ILs. As these media
are very polar, they tend to strip water molecules from the enzyme
surface.
Supercritical fluids are substances held at a temperature and pressure above their critical
point. Under these conditions, these fluids have both gas and
liquid properties. They diffuse fast, like gases, and they are able to
efficiently dissolve solutes, like liquids, which is very useful for their
application in biocatalysis. Recently, MD simulation studies have
been used to elucidate the molecular details of enzyme behavior
in these media. One of the most relevant findings of these studies
was that adding small amounts of water is necessary and sufficient
to enable enzymes to function in sc fluids. Water molecules bind
to specific regions of the protein, preventing CO2 molecules from
penetrating into the protein core and destroying its structure. These
studies also revealed that water and CO2 form a heterogeneous layer
around the protein, which may help to increase the activity of the
enzyme.
The findings obtained in simulation studies of enzymes in
nonaqueous solvents have played and will certainly continue to play
an important part in the development of nonaqueous biocatalysis.
The knowledge that has been obtained on the role of hydration
will be particularly useful, since this parameter is a determining
factor for enzyme activity and stability in nonaqueous solvents.
This knowledge can be used to choose the appropriate reaction
conditions (solvent and hydration level) for a given process (media
engineering). Some studies have also provided valuable insights into
the mechanisms of enzyme hydration in different solvents, which
can be used to find mutations that increase enzyme activity and/or
stability in these media (protein engineering).
Due to the ongoing developments in computer hardware and MD
simulation software, we are confident that new findings will arise
from future simulation studies of enzymes in nonaqueous solvents.
In particular, new studies of enzymes in ILs and supercritical fluids
will be particularly useful, since the number of simulations in these
media, which are the solvents of the future, is still small.

References

1. Klibanov, A. M. (2001). Improving enzymes by using them in organic
solvents, Nature, 409, pp. 241–245.
2. Halling, P. J. (2004). What can we learn by studying enzymes in non-
aqueous media?, Philos. Trans. R. Soc. Lond. B: Biol. Sci., 359, pp. 1287–
1296.
3. Klibanov, A. M., Samokhin, G. P., Martinek, K., and Berezin, I. V. (1977). A
new approach to preparative enzymatic synthesis, Biotechnol. Bioeng.,
19, pp. 1351–1361.
4. Zaks, A., and Klibanov, A. M. (1985). Enzyme-catalyzed processes in
organic solvents, Proc. Natl. Acad. Sci. U S A, 82, pp. 3192–3196.
5. Klibanov, A. M. (1989). Enzymatic catalysis in anhydrous organic
solvents, Trends Biochem. Sci., 14, pp. 141–144.
6. Krishna, S. H. (2002). Developments and trends in enzyme catalysis in
nonconventional media, Biotechnol. Adv., 20, pp. 239–267.
7. Orlova, E. V., Dube, P., Beckmann, E., Zemlin, F., Lurz, R., Trautner, T. A.,
Tavares, P., and van Heel, M. (1999). Structure of the 13-fold symmetric
portal protein of bacteriophage SPP1, Nat. Struct. Biol., 6, pp. 842–846.
8. Bell, G., Halling, P. J., Moore, B. D., Partridge, J., and Rees, D. G. (1995).
Biocatalyst behaviour in low-water systems, Trends Biotechnol., 13, pp.
468–473.
9. Fontes, N., Almeida, M. C., Peres, C., Garcia, S., Grave, J., Aires-Barros, M.
R., Soares, C. M., Cabral, J. M. S., Maycock, C. D., and Barreiros, S. (1998).
Cutinase activity and enantioselectivity in supercritical fluids, Ind. Eng.
Chem. Res., 37, pp. 3189–3194.
10. Zaks, A., and Klibanov, A. M. (1988). The effect of water on enzyme action
in organic media, J. Biol. Chem., 263, pp. 8017–8021.
11. Valivety, R. H., Halling, P. J., and Macrae, A. R. (1992). Reaction rate with
suspended lipase catalyst shows similar dependence on water activity
in different organic solvents, Biochim. Biophys. Acta, 1118, pp. 218–222.
12. Affleck, R., Xu, Z. F., Suzawa, V., Focht, K., Clark, D. S., and Dordick, J. S.
(1992). Enzymatic catalysis and dynamics in low-water environments,
Proc. Natl. Acad. Sci. U S A, 89, pp. 1100–1104.
13. Halling, P. J. (2000). Biocatalysis in low-water media: understanding
effects of reaction conditions, Curr. Opin. Chem. Biol., 4, pp. 74–80.
14. Halling, P. J. (1990). High-affinity binding of water by proteins is similar
in air and in organic solvents, Biochim. Biophys. Acta, 1040, pp. 225–228.
15. Soares, C. M., Teixeira, V. H., and Baptista, A. M. (2003). Protein structure
and dynamics in nonaqueous solvents: insights from molecular dynam-
ics simulation studies, Biophys. J., 84, pp. 1628–1641.
16. Micaelo, N. M., and Soares, C. M. (2007). Modeling hydration mecha-
nisms of enzymes in nonpolar and polar organic solvents, FEBS J., 274,
pp. 2424–2436.
17. Trodler, P., Schmid, R. D., and Pleiss, J. (2009). Modeling of solvent-
dependent conformational transitions in Burkholderia cepacia lipase,
BMC Struct. Biol., 9, pp. 38–50.
18. Diaz-Vergara, N., and Pineiro, A. (2008). Molecular dynamics study of
triosephosphate isomerase from Trypanosoma cruzi in water/decane
mixtures, J. Phys. Chem. B, 112, pp. 3529–3539.
19. Ducret, A., Trani, M., and Lortie, R. (1998). Lipase-catalyzed enantiose-
lective esterification of ibuprofen in organic solvents under controlled
water activity, Enzyme Microb. Technol., 22, pp. 212–216.
20. Colombo, G., and Carrea, G. (2002). Modeling enzyme reactivity in
organic solvents and water through computer simulations, J. Biotechnol.,
96, pp. 23–35.
21. Colombo, G., Toba, S., and Merz, K. M. (1999). Rationalization of the
enantioselectivity of subtilisin in DMF, J. Am. Chem. Soc., 121, pp. 3486–
3493.
22. Micaelo, N. M., Teixeira, V. H., Baptista, A. M., and Soares, C. M. (2005).
Water dependent properties of cutinase in nonaqueous solvents: a
computational study of enantioselectivity, Biophys. J., 89, pp. 999–1008.
23. Schulz, T., Pleiss, J., and Schmid, R. D. (2000). Stereoselectivity of
Pseudomonas cepacia lipase toward secondary alcohols: a quantitative
model, Prot. Sci., 9, pp. 1053–1062.
24. Warshel, A., Naray-Szabo, G., Sussman, F., and Hwang, J. K. (1989). How
do serine proteases really work?, Biochemistry, 28, pp. 3629–3637.
25. Bell, G., Janssen, A. E. M., and Halling, P. J. (1997). Water activity fails
to predict critical hydration level for enzyme activity in polar organic
solvents: interconversion of water concentrations and activities, Enzyme
Microb. Technol., 20, pp. 471–477.
26. Broos, J., Visser, A. J. W. G., Engbersen, J. F. J., Verboom, W., van Hoek,
A., and Reinhoudt, D. N. (1995). Flexibility of enzymes suspended
in organic solvents probed by time-resolved fluorescence anisotropy.
Evidence that enzyme activity and enantioselectivity are directly related
to enzyme flexibility, J. Am. Chem. Soc., 117, pp. 12657–12663.
27. Klibanov, A. M. (1990). Asymmetric transformations catalyzed by
enzymes in organic-solvents, Acc. Chem. Res., 23, pp. 114–120.
28. Trodler, P., and Pleiss, J. (2008). Modeling structure and flexibility
of Candida antarctica lipase B in organic solvents, BMC Struct. Biol.,
8, pp.
29. Yang, L., Dordick, J. S., and Garde, S. (2004). Hydration of enzyme in
nonaqueous media is consistent with solvent dependence of its activity,
Biophys. J., 87, pp. 812–821.
30. Valivety, R. H., Halling, P. J., Peilow, A. D., and Macrae, A. R. (1992). Lipases
from different sources vary widely in dependence of catalytic activity on
water activity, Biochim. Biophys. Acta, 1122, pp. 143–146.
31. Branco, R. J. F., Graber, M., Denis, V., and Pleiss, J. (2009). Molecular
mechanism of the hydration of Candida antarctica lipase B in the
gas phase: water adsorption isotherms and molecular dynamics
simulations, ChemBioChem, 10, pp. 2913–2919.
32. Wedberg, R., Abildskov, J., and Peters, G. H. (2012). Protein dynamics in
organic media at varying water activity studied by molecular dynamics
simulation, J. Phys. Chem. B, 116, pp. 2575–2585.
33. Klahn, M., Lim, G. S., Seduraman, A., and Wu, P. (2011). On the different
roles of anions and cations in the solvation of enzymes in ionic liquids,
Phys. Chem. Chem. Phys., 13, pp. 1649–1662.
34. Klahn, M., Lim, G. S., and Wu, P. (2011). How ion properties determine
the stability of a lipase enzyme in ionic liquids: a molecular dynamics
study, Phys. Chem. Chem. Phys., 13, pp. 18647–18660.
35. Micaelo, N. M., and Soares, C. M. (2008). Protein structure and dynamics
in ionic liquids. Insights from molecular dynamics simulation studies, J.
Phys. Chem. B, 112, pp. 2566–2572.
36. Micaelo, N. M., Baptista, A. M., and Soares, C. M. (2006). Parametrization
of 1-butyl-3-methylimidazolium hexafluorophosphate/nitrate ionic liquid
for the GROMOS force field, J. Phys. Chem. B, 110, pp. 14444–14451.
37. Housaindokht, M. R., Bozorgmehr, M. R., and Monhemi, H. (2012).
Structural behavior of Candida antarctica lipase B in water and
supercritical carbon dioxide: a molecular dynamic simulation study, J.
Supercrit. Fluids, 63, pp. 180–186.
38. Monhemi, H., Housaindokht, M. R., Bozorgmehr, M. R., and Googheri, M.
S. S. (2012). Enzyme is stabilized by a protection layer of ionic liquids
in supercritical CO2: insights from molecular dynamic simulation, J.
Supercrit. Fluids, 69, pp. 1–7.
39. Silveira, R. L., Martinez, J., Skaf, M. S., and Martinez, L. (2012). Enzyme
microheterogeneous hydration and stabilization in supercritical carbon
dioxide, J. Phys. Chem. B, 116, pp. 5671–5678.
March 21, 2016 13:47 PSP Book - 9in x 6in 15-Allan-Svendsen-c15

Chapter 15

Understanding Esterase and Amidase Reaction Specificities by Molecular Modeling

Per-Olof Syrén
School of Biotechnology, Science for Life Laboratory, KTH Royal Institute of Technology,
171 21 Solna, Sweden
per-olof.syren@biotech.kth.se

15.1 Introduction

In silico computational chemistry constitutes an appealing method
to enhance our understanding of how enzymes work [1–5].
These tools provide researchers with a unique opportunity to
reveal details of catalysis at the molecular level that would be
difficult, or even impossible, to access by experimental methods
alone. Computational chemistry, in combination with laboratory
work, has become a cornerstone in today’s efforts to improve
promiscuous activities displayed by enzymes and to design novel
enzymes catalyzing hitherto unknown reactions [6–9]. The aim of
this chapter is to demonstrate the potential of using molecular
modeling to shed light on fundamental aspects of catalysis displayed
by amidases/proteases and esterases/lipases.a The potential of
using computer simulations to clarify the effects of introduced
mutations on enzyme catalysis was realized early on [10]. In silico
computational strategies for an elevated atomistic understanding of
enzymatic reaction mechanisms can be founded on [10–12]:

(i) Quantum mechanics (QM)
(ii) Molecular dynamics (MD) and force-field methods
(iii) Hybrid methods (QM/MM [molecular mechanics])
(iv) Empirical valence bond (EVB) methods

Pioneering work on the basis of EVB calculations [13] and
MD simulations [14, 15] has paved the way for the development
of in silico methodologies into powerful and widespread tools
amenable for the study of enzyme catalysis. High-level first-principle
methods provide researchers with tools that can approach chemical
accuracy. However, even with the computational power available
today, QM calculations using a high level of theory are limited to
systems containing a couple of hundred atoms. Such small models
of enzymes have been termed “theozymes” [16]. Representing
enzymes by carving out carefully chosen active site models that are
used for quantum mechanical calculations has also been referred
to as the cluster approach [17]. In contrast to QM, MD simulations
are fast but chemical bonds are represented by springs. This
oversimplification of reality makes force-field-dependent methods
unable to represent bond-breaking and bond-forming processes.
In the hybrid QM/MM approach, the reacting atoms and a small
portion of the active site are treated quantum mechanically [18–20]
and capping atoms allow for interactions between the QM and
MM parts. The different theories listed above have their pros and
cons in achieving a trade-off between accuracy and speed in the
quest to increase our general understanding of how enzymes work.
Accessible time scales range from femtosecond for QM (i.e., bond
vibrations) to millisecond for MD (i.e., conformational changes) [21].
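
The statement that spring-like bond terms cannot describe bond breaking
can be illustrated with a minimal sketch. The parameters below
(equilibrium length, well depth, and width) are illustrative assumptions
and are not taken from any force field or from the cited studies. The
Morse potential is used here only as a simple stand-in for a dissociative
bond energy curve.

```python
import math


def harmonic_energy(r, r0, k):
    """Harmonic bond term used in typical force fields: 0.5 * k * (r - r0)**2."""
    return 0.5 * k * (r - r0) ** 2


def morse_energy(r, r0, d_e, a):
    """Morse potential: d_e * (1 - exp(-a * (r - r0)))**2, which levels off at d_e."""
    return d_e * (1.0 - math.exp(-a * (r - r0))) ** 2


# Illustrative values only (assumptions, not fitted parameters).
R0 = 1.5     # equilibrium bond length in angstrom
D_E = 85.0   # well depth in kcal/mol
A = 2.0      # Morse width parameter in 1/angstrom
K = 2.0 * D_E * A ** 2  # spring constant matching the Morse curvature at R0

if __name__ == "__main__":
    for r in (1.5, 1.6, 2.0, 3.0, 5.0):
        print(f"r = {r:.1f}  harmonic = {harmonic_energy(r, R0, K):8.1f}  "
              f"Morse = {morse_energy(r, R0, D_E, A):6.1f}")
```

Near the minimum the two curves agree, but at large separations the
harmonic energy keeps rising without bound while the Morse energy
approaches the well depth, which is why a purely harmonic bond can never
dissociate.
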
After a short summary of fundamental aspects associated with
enzyme-catalyzed ester and amide bond hydrolysis, in silico efforts
within the last 10 years centered on shedding light on these reaction
mechanisms will be discussed. The exploitation of the acquired
knowledge from such mechanistic studies in enzyme engineering
and design will be highlighted, as biocatalytic amide bond synthesis
is of high relevance in the chemical and pharmaceutical industrial
fields [22, 23].

a The term “amidase” will from now on refer to both amidases and proteases.
Likewise, in the term “esterase,” lipases are included as well.

15.2 Fundamental Catalytic Concepts

15.2.1 Fundamental Chemistry of Amides and Esters


Peptide bonds are inherently stable and display half-lives of 500
years at neutral pH and ambient temperature in aqueous solution
[24]. Base-catalyzed hydrolysis of alkyl esters proceeds at a rate
three orders of magnitude faster than that of the corresponding amide
substrates [25]. These facts highlight the stability of the amide bond
in comparison to an ester, an effect caused by delocalization of the
nitrogen lone pair. The consequence of this resonance, facilitated
by the lower electronegativity of the nitrogen atom as compared
to oxygen, is that the carbonyl carbon–nitrogen bond displays
significant sp2 character.
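
To put the quoted half-life into kinetic terms, the following minimal
sketch converts a half-life into a first-order rate constant. It assumes
simple first-order decay and a year of 365.25 days; the function names
are hypothetical and chosen for illustration only.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed length of a year


def first_order_rate_constant(half_life_s):
    """Rate constant k (1/s) of a first-order process: k = ln(2) / half-life."""
    return math.log(2) / half_life_s


def fraction_remaining(k, t_s):
    """Fraction of intact bonds after time t_s for first-order decay."""
    return math.exp(-k * t_s)


if __name__ == "__main__":
    half_life = 500 * SECONDS_PER_YEAR  # half-life of the peptide bond quoted above
    k_uncat = first_order_rate_constant(half_life)
    print(f"k(uncatalyzed) = {k_uncat:.2e} 1/s")
    print(f"fraction left after 1 year = {fraction_remaining(k_uncat, SECONDS_PER_YEAR):.4f}")
```

Under these assumptions the uncatalyzed rate constant is on the order of
10^-11 per second, which gives a feeling for how slow the background
reaction is.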

15.2.2 Esterases and Amidases and Their Metabolic Significance

Esterases and lipases are hydrolytic enzymes of high relevance in
energy storage. These biocatalysts typically display a Ser/His/Asp
catalytic triad confined within an α-/β-hydrolase fold [26]. They are
associated with a range of diseases, including obesity and diabetes
[27]. Amidases and proteases display a variety of arrangements of
catalytic dyads and triads as well as different folds [28]. They have
essential functions in metabolism, homeostasis, protein and viral
maturation, analgesia, immune response, apoptosis, regeneration,
and signal transduction [29–35]. These enzymes furthermore
catabolize neuroactive peptides and hormones involved in nociception,
aging, memory, and learning [36, 37]. Substitutions in
protease-encoding genes that reduce or abolish enzymatic activity
[38–40] are closely associated with a plethora of genetic diseases
[41–45]. Cascades of proteases are key players in inflammation, cancer
and tumor progression [46, 47], Alzheimer’s disease [48–50], and
cardiovascular diseases [36]. Therefore it is of the highest
significance to shed light on the reaction mechanisms displayed by
these biocatalysts.

15.2.3 Fundamental Chemical Aspects of Amidase and Esterase Catalysis

Ester bonds are readily hydrolyzed by both esterases and amidases
[25, 51]. To achieve efficient hydrolysis of the highly stable amide
bond [24], amidases capitalize on proximity effects, display general
acid/base catalysis, and provide stabilization of the formed oxyanion
(see Fig. 15.1a) [52]. These catalytic strategies are illuminated
by the well-known Ser/His/Asp catalytic triad of the serine
proteases for which the oxyanion hole is formed by backbone
amide groups (Fig. 15.1a) [52]. Other amidase families have more
exotic catalytic motifs, including N-terminal nucleophilic amino
acids (i.e., monads), the Asp dyads of the aspartic proteases, and the
Ser/cis-Ser/Lys triads of the AS-signature enzymes [28]. The latter
are powerful catalysts that display similar amidase and esterase
reaction specificities (see footnote a) [57].
Esterases have also evolved to harness the same catalytic triad as
serine proteases and an oxyanion hole. Whereas amidases display
rate enhancements in the range of 10^13-fold [24], esterases and
lipases are very poor catalysts in the hydrolysis and synthesis
of amide bonds. This sharp contrast cannot be explained by a
difference in the spatial arrangement, or in the order in the primary
sequence, of the amino acids constituting the catalytic machineries.
Even more intriguing is the fact that some proteases, such as
prolyl oligopeptidase, share the same α-/β-hydrolase fold typically
displayed by esterases [26].

a Reaction specificity is herein referred to as the relative capability of the enzyme
catalyst to hydrolyze an ester and an amide substrate that share the same molecular
architecture. It is given by [kcat/KM (ester)]/[kcat/KM (amide)].
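
The ratio defined in this footnote can be sketched in a few lines. The
function names and the kinetic constants in the usage example are
hypothetical and serve only to show how the definition is applied; they
are not measured values from the chapter.

```python
import math


def specificity_constant(kcat, km):
    """Catalytic efficiency kcat/KM (for example in 1/(M*s))."""
    return kcat / km


def reaction_specificity(kcat_ester, km_ester, kcat_amide, km_amide):
    """Reaction specificity: [kcat/KM (ester)] / [kcat/KM (amide)]."""
    return (specificity_constant(kcat_ester, km_ester)
            / specificity_constant(kcat_amide, km_amide))


if __name__ == "__main__":
    # Hypothetical constants: kcat in 1/s, KM in M.
    ratio = reaction_specificity(kcat_ester=100.0, km_ester=1e-3,
                                 kcat_amide=0.1, km_amide=1e-3)
    print(f"reaction specificity = {ratio:.3g} "
          f"(about {math.log10(ratio):.1f} orders of magnitude)")
```

A value far above 1 indicates an enzyme that strongly prefers the ester,
whereas a value near 1 corresponds to similar esterase and amidase
efficiencies.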

Figure 15.1 Serine-hydrolase-catalyzed hydrolysis. (a) Reaction
mechanism for the acylation step of amide bond hydrolysis. Nucleophilic
attack on the substrate carbonyl in the first transition state (TS1)
generates a tetrahedral intermediate (TI1). Due to stereoelectronic
effects [53, 54], the
lone pair of the reacting nitrogen atom will be situated antiperiplanar
to the bond formed between the Oγ of the catalytic Ser and the former
carbonyl carbon of the substrate (indicated by the arrows). This spatial
arrangement corresponds to an unproductive conformation, since the lone
pair is pointing away from the catalytic histidine, which prohibits catalysis
from proceeding. Thus nitrogen inversion via TSinv (shown), or possibly
rotation (not shown), is required to produce a catalytically competent
second tetrahedral intermediate (TI2) [55, 56]. Proton transfer to the
reacting nitrogen atom and the generation of the acyl enzyme proceed via
the collapse of the second tetrahedral intermediate (via TS2). During the
deacylation step, the N-terminal part of the original peptide substrate is
released by the nucleophilic attack of a water molecule. (b) For an ester
substrate, the additional oxygen lone pair can accept a proton from the
catalytic base with only small rearrangements of the corresponding first
tetrahedral intermediate formed during acylation (TI1). (c) Facilitation of
nitrogen inversion through hydrogen bond formation in TSinv (indicated
by the arrow). (d) Proton shuttle mechanism for the acylation step of
hydrolysis of functionalized amides makes nitrogen inversion obsolete. The
mechanism is shown starting from the first tetrahedral intermediate of an
amide of (R)-alaninol as example. Two protons are in flight via participation
of the additional hydroxyl group of the substrate in the transition state
(TSproton shuttle). The Ser/His/Asp catalytic triad is labeled. Bonds that are
formed/broken are shown with dashed lines, and hydrogen bonds are
shown with dotted lines.

15.2.4 Impact of Stereoelectronic Effects on the Enzymatic Reaction Mechanism

Regardless of the composition of the catalytic machinery, or the
protein fold, stereoelectronic effects are of fundamental mechanistic
importance for enzyme-catalyzed amide bond hydrolysis and
synthesis. It is well-recognized that the single lone pair, sitting on
the reacting nitrogen atom of the amide substrate, will be situated
antiperiplanar to the bond
formed/broken between the catalytic nucleophile and the former
carbonyl carbon of the substrate [53–55] (see TI1 in Fig. 15.1a). This
spatial arrangement allows for a stabilizing n–σ* orbital overlap,
worth several kilocalories per mole, in the transition state (TS)
involving nucleophilic attack on the substrate carbonyl [53]. As
a consequence, the nitrogen lone pair will point away from the
catalytic base in the first tetrahedral intermediate formed during the
acylation step of enzyme-catalyzed amide bond hydrolysis (see TI1
in Fig. 15.1). Hence, rotation around the C–N bond and/or nitrogen
inversion is necessary [56] to rearrange the nitrogen lone pair into a
catalytically competent conformation for proton abstraction (i.e., TI2 in
Fig. 15.1).
Stereoelectronic effects have less dramatic consequences for
enzyme-catalyzed ester bond hydrolysis since the ester oxygen has
two lone pairs. Hence proton abstraction can occur with only subtle
rearrangements of the corresponding first tetrahedral intermediate
formed during enzyme-catalyzed ester bond hydrolysis (Fig. 15.1b).

15.3 Molecular Modeling of Fundamental Catalytic Concepts

15.3.1 QM Calculations on Amidases and Esterases


Quantum mechanical studies of reference reactions for serine-
protease-catalyzed hydrolysis established that acylation is rate
limiting for amide bond hydrolysis [58]. Protonation of the lone
pair of the reacting nitrogen atom was found to occur prior to
the breaking of the bond between the former carbonyl carbon
and the nitrogen [58]. On the basis of density functional theory
(DFT) calculations on a small active site model of chymotrypsin that
accounted for the catalytic triad and oxyanion hole, it was suggested
that the acylation step is rate limiting for serine-hydrolase-catalyzed
ester hydrolysis [59]. One important finding from this early study
was that DFT-optimized TS geometries could be superimposed on
the catalytic triad of a crystal structure of chymotrypsin [59]. In
another study, DFT calculations on serine-hydrolase-catalyzed ester
bond hydrolysis demonstrated that multistep catalysis is associated
with minimal conformational rearrangement [60]. These findings
[59, 60] corroborate the importance of enzyme preorganization in
achieving optimal TS stabilization [10].
Although quantum mechanical calculations on enzyme active
site models in the gas phase represent a simplification of reality,
they have a formidable potential in applied chemical catalysis.
Calculations at a lower level of theory can be used for in silico
screening of enzyme variants [61]. QM and theozyme calculations
to obtain geometries of TS structures, and to elucidate reaction
mechanisms, have been explored in enzyme design [62, 63].
Researchers were able to design a papain-like dyad and an oxyanion
hole with one hydrogen bonding capability into catalytically inactive
protein scaffolds [64]. The best de novo designed variant displayed
second-order rate constants of 400 M^-1·s^-1 for p-nitrophenyl ester
hydrolysis [64]. Crystal structure analysis of the designed
biocatalysts revealed that the introduced dyad displayed a suboptimal
spatial arrangement [64]. A serine hydrolase catalytic triad and
oxyanion hole were subsequently designed in inert protein scaffolds
[65]. After directed evolution of the initial design, a variant with high
reactivity toward fluorophosphonates was obtained [65]. The most
active design comprised a hydrogen bonding interaction between
the Ser Oγ and the Nγ of the catalytic base [65]. This is in
contrast with serine hydrolases for which the histidine Nε2 activates
the catalytic nucleophile [52]. Peptidomimetic scaffolds, mimicking
the general acid/base catalysis of proteases and esterases, were
synthesized on the basis of quantum mechanical calculations
[66]. The obtained organocatalysts were called spiroligozymes
and displayed rate enhancements of 1000-fold compared to the
background reaction in the transacylation of esters [66].
Aminopeptidases contain two catalytic Zn2+ ions that bridge a
catalytic water molecule [67]. This is in contrast to
carboxypeptidases that are mononuclear zinc proteases. DFT calculations
on an active site model of leucine aminopeptidase demonstrated
that the mechanism involving a bridged hydroxide ion had a
significantly lower barrier than the mechanism involving a bound
water molecule [68]. DFT calculations were used to shed light on
the reaction mechanism of amide bond hydrolysis displayed by
the aminopeptidase from Aeromonas proteolytica [69]. By using
the quantum chemical cluster approach, it was found that one
of the catalytic zinc atoms anchors the terminal amino group of
the substrate, whereas the second zinc provides stabilization of
the oxyanion [69]. The overall rate-limiting step at the
B3LYP/6-311+G(2d,2p) level of theory was found to be the protonation
of the reacting nitrogen atom of the substrate [69].
DFT calculations were used to elucidate the productive orientation
of the substrate lactone ring in the active site of
N-acyl-L-homoserine hydrolase [70]. In particular, the reactive substrate
conformation, found by DFT calculations, was flipped compared
to that found in a crystal structure of the enzyme with a bound
inhibitor. The substrate orientation, which was viable for catalysis
according to the QM calculations, afforded significant electrostatic
stabilization in the TS through interactions with the catalytic zinc
atoms [70]. Likewise, ab initio calculations were used to study
the productive substrate conformation during esterase-catalyzed
transacylation of acrylates [71]. Acylation of the catalytic serine was
found to involve a least-motion mechanism, for which the planar
ground-state conformation of the α,β-unsaturated ester substrate
directed catalysis [71]. This spatial arrangement, for which the
π-electron orbital overlap in the substrate is maximized, was found to
impose severe steric constraints on the native enzyme catalysts [71].
DFT calculations were used to study the catalytic promiscuity
displayed by the aryl sulfatase from Pseudomonas aeruginosa [72].
It was found that the natural sulfate monoesterase activity involved
a different reaction mechanism compared to that for promiscuous
phosphate monoester substrates [72]. The difference in charge
between a sulfate and a phosphate monoester thus significantly
affected enzyme catalysis and changed the overall rate-limiting step
[72]. Clear physical differences that could be expected to influence
the enzymatic reaction specificity exist between amide and ester
substrates. QM calculations using the MP2/6-311+G(2d,2p) level
of theory were used to shed light on the importance of the spatial
arrangement of the single lone pair of the reacting nitrogen atom
during amide bond hydrolysis [73]. The QM model accounted for the
oxyanion hole and the catalytic triad of Candida antarctica lipase
B [73]. It was found that nitrogen inversion (TSinv shown in Fig.
15.1a) is rate limiting for the modeled serine-hydrolase-catalyzed
acylation of amides (represented by N-methyl acetamide) [73]. The
fact that nitrogen inversion becomes overall rate limiting can readily
be understood in terms of net forward flux rate constants [51, 73].
The reacting nitrogen atom was found to be of sp2 character in the
TS for inversion and the O–C–N–H dihedral angle was 155°, with
the reacting –NH group pointing away from the catalytic nucleophile
(Fig. 15.1a). Vibrational analysis demonstrated that, when using
the MP2 level of theory, nitrogen inversion is a discrete TS that
interconverts the first and the second tetrahedral intermediates
along the reaction coordinate [73] (TI1 and TI2 in Fig. 15.1a). This
study represents the first case for which the energetic implications
of nitrogen inversion were explicitly addressed [73]. As expected,
the calculated barrier for acylation of the ester methyl acetate was
found to be significantly lower than that for acylation of N-methyl
acetamide [73].
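
Geometric descriptors such as the O–C–N–H dihedral angle quoted above
can be extracted from optimized or simulated coordinates. The following
is a minimal sketch of a standard dihedral calculation; the coordinates
in the usage example are hypothetical points constructed to give a known
angle and do not come from the cited study. The 30° window used for the
antiperiplanar check follows the common convention of torsions between
150° and 180° in magnitude and is stated here as an assumption.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees, in the range [-180, 180]."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    b2_len = math.sqrt(_dot(b2, b2))
    y = _dot(_cross(n1, n2), b2) / b2_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def is_antiperiplanar(angle_deg, tolerance_deg=30.0):
    """True if the torsion lies within tolerance_deg of 180 degrees."""
    return 180.0 - abs(angle_deg) <= tolerance_deg


if __name__ == "__main__":
    # Hypothetical atom positions (angstrom) built to give a 155 degree torsion.
    t = math.radians(155.0)
    o, c, n = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
    h = (math.cos(t), math.sin(t), 1.0)
    angle = dihedral_deg(o, c, n, h)
    print(f"O-C-N-H dihedral = {angle:.1f} degrees, "
          f"antiperiplanar: {is_antiperiplanar(angle)}")
```
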
The TS for nitrogen inversion was found to reside 4.6 kcal/mol
higher in energy than the TS of the second-highest energy (TS1
in Fig. 15.1a, which was found to have an energy around 14
kcal/mol higher than the ES-complex) [73]. In relation to the
first tetrahedral intermediate (TI1 in Fig. 15.1), the calculated cost
of nitrogen inversion was around 4.5 kcal/mol [73]. This is in
excellent agreement with the well-known activation barriers of
inversion for simple alkylamines [74, 75]. This extra energetic
penalty of 4.5 kcal/mol, which is put on top of the high-energy
tetrahedral intermediate, corresponds to 3 orders of magnitude in
rate. Hence, it could be expected that amidases would facilitate the
nitrogen inversion process to reduce the associated extra energetic
burden during amide bond hydrolysis (or synthesis). An ideal
strategy would be to exploit the additional proton sitting on the
amide nitrogen, which is absent in ester substrates, to achieve TS
stabilization and enhanced reaction specificity (Fig. 15.1c).
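
The link between an energetic penalty and a rate factor can be checked
with a short calculation. The sketch below assumes transition state
theory with identical pre-exponential factors, so that the rate ratio is
exp(ΔΔG‡/RT), and a temperature of 298.15 K; the temperature and the
function names are assumptions made for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def rate_ratio(ddg_kcal, temperature_k=298.15):
    """Factor by which a rate constant drops when the barrier rises by ddg_kcal."""
    return math.exp(ddg_kcal / (R_KCAL * temperature_k))


def barrier_difference(fold_change, temperature_k=298.15):
    """Barrier difference in kcal/mol that corresponds to a given rate ratio."""
    return R_KCAL * temperature_k * math.log(fold_change)


if __name__ == "__main__":
    penalty = 4.5  # kcal/mol, the calculated cost of nitrogen inversion
    factor = rate_ratio(penalty)
    print(f"{penalty} kcal/mol corresponds to a rate factor of {factor:.0f} "
          f"(10^{math.log10(factor):.1f})")
```

Under these assumptions, 4.5 kcal/mol gives a factor on the order of
10^3, consistent with the three orders of magnitude stated above. The
inverse function can be used to translate an observed fold change in
rate into an apparent barrier difference.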

The mechanistic importance of nitrogen inversion was corroborated
in a later study based on DFT calculations at the
B3LYP/6-311+G(2d,2p) level of theory [76]. A large model of the
active site of C. antarctica lipase B comprising 205 atoms was
employed to represent serine-hydrolase-catalyzed transacylation
[76]. Interestingly, when using DFT, nitrogen inversion was found
to be concerted with nucleophilic attack of the catalytic serine
on the carbonyl of the amide substrate during acylation [76]. As
would perhaps be expected, the nucleophilic attack (i.e., TS1 in Fig.
15.1a), concomitant with nitrogen inversion (TSinv in Fig. 15.1a), was
associated with a TS structure of high energy (the barrier was found
to be around 20 kcal/mol [76]). The performed vibrational analysis
and calculation of the intrinsic reaction coordinate established that
nitrogen inversion is associated with a movement of the catalytic
histidine [76]. This allows the Nε2 proton of the catalytic base to
switch the hydrogen bonding partner from the Oγ of the catalytic
serine to the developed lone pair of the reacting nitrogen atom of
the substrate [76].
A novel proton shuttle mechanism for esterase-catalyzed
transacylation, with difunctionalized amines as acyl acceptors, was
discovered on the basis of DFT calculations [76]. The mechanism
is founded on substrate-assisted catalysis for which the additional
substituent participates in proton shuttling (the mechanism is
shown in the hydrolytic direction in Fig. 15.1d). The hydrogen
shuttling delivers a proton to the substrate nitrogen lone pair in
the antiperiplanar spatial arrangement (Fig. 15.1d), which makes
nitrogen inversion obsolete. The barrier for the new proton shuttle
mechanism was indeed calculated to be 1.1 and 5 kcal/mol lower
than that of the traditional mechanism with nitrogen inversion
(for 1,2-aminoalcohols and diamines as acyl acceptors, respectively)
[76]. The calculations were in accordance with the experimental
observations that esterase-catalyzed amide bond formation, with
amino alcohols and diamines as nucleophiles, proceeds with
rates 10- to 40-fold faster than when using monofunctionalized
amines. The substrate-assisted proton shuttle mechanism was
further supported by linear free-energy relationship analysis and
experimental data [76]. Importantly, the proton shuttle mechanism
shown in Fig. 15.1d was found to be beneficial for amides only, since
the additional lone pair of an ester oxygen would lead to electrostatic
repulsion in the enzyme active site [76].
The importance of other concerted proton shuttle mechanisms
in biology has been recognized previously in the literature [77]. On
the basis of calculations at a low level of theory, it was suggested
that hydrolysis of penicillin substrates by class A β-lactamases
involves a concerted proton shuttle mechanism [77]. It was found
that the two catalytic serines of the Ser/Ser/Lys catalytic triad
were well positioned to participate in such proton shuttling [77].
The suggested mechanism allows for nucleophilic attack of the
catalytic serine on the carbonyl carbon of the substrate, concomitant
with the protonation of the reacting nitrogen atom of the β-lactam
ring [77]. Two protons are in flight in the TS according to this
concerted mechanism of acylation that does not involve formation
of a tetrahedral intermediate [77]. A later QM study suggested that
the penicillin carboxylate group could actively participate in proton
shuttling during the acylation step [78].
Other proton shuttle mechanisms in enzyme-catalyzed acyl
transfer reactions are founded on substrate-assisted catalysis [79].
A six-membered ring TS structure, involving the pro-S phosphate
oxygen of the histidyl-adenylate substrate as base, was found to
account for efficient catalysis displayed by histidyl-tRNA synthetase
[80]. The feasibility of substrate-assisted catalysis during ribosome-
catalyzed peptide bond formation has been investigated on the
basis of DFT calculations of the aminolysis of 1,2-diol monoesters
[79]. On the basis of the QM calculations it was found that the
additional hydroxyl group in the diol substrate, which would mimic
the 2’-OH of the A76 in the ribosome P-site, participated in proton
shuttling [79]. By calculations at the B3LYP/6-31G* level of theory
on a model consisting of 76 atoms of the ribosome peptidyl
transfer center, it was found that an eight-membered ring TS
structure was associated with the lowest barrier for amide bond
synthesis [81]. The A76 2’-hydroxyl group and two water molecules
allowed for a concerted mechanism for which three protons were
in flight in the TS for peptidyl transfer [81]. According to this
mechanism, protonation of the leaving oxygen group (i.e., the A76 3’-
OH) occurs concomitantly with nucleophilic attack by the α-amino
group of the aminoacyl-tRNA [81]. By using B3LYP/6-31+G(d,p)
calculations on models of the ribosome consisting of around 50
atoms, a four-membered ring proton shuttle has been suggested
[82]. In another study, protonation of the substrate carbonyl oxygen,
through shuttling involving the A76 2’-OH group, was suggested to
be the mechanism associated with the lowest energy barrier [83].
Subsequent calculations using the M06-2X/6-311+G(d,p) level of
theory demonstrated the importance of additional water molecules
for efficient proton shuttling and the catalytic importance of the 2’-
OH group of A2451 [84].

15.3.2 MD Simulations on Amidases and Esterases


MD was used to study the tetrahedral intermediate that could
be formed during peptide elongation catalyzed by the ribosome
[85]. From the simulations of the modeled zwitterionic intermediate,
it was concluded that ribosomal groups capable of general
acid/base catalysis were lacking and that a substrate-assisted
proton shuttle mechanism was feasible [85]. Interactions between
serine penicillin-recognizing enzymes and different antibiotics were
studied by MD simulations [86]. Common mechanistic themes for
the acylation step of amide bond hydrolysis were analyzed for
penicillin-binding proteins and β-lactamases [86]. The simulations
revealed that the β-lactam carboxylate group resided in a spatial
arrangement that would allow it to participate in general acid/base
catalysis during acylation [86].
Resistance-causing mutations in the human immunodeficiency
virus (HIV)-protease gene were studied by MD simulations of a
wild-type and a mutated enzyme complexed with
anti-HIV drugs [87]. On the basis of the simulations, it was concluded
that an altered binding pocket of the studied variant could explain
the decreased binding affinity toward the antiviral drug [87].
A transition from binding studies to analysis of the chemical
steps is hampered by the fact that MD simulations are unable to
capture bond-breaking and bond-forming processes. However, TS
structures can be represented by obtaining geometries, point charges,
and force-field parameters from quantum mechanical calculations.
Using this approach, researchers employed MD simulations to
study the reaction mechanism of amide bond hydrolysis displayed
by amidase signature enzymes [88]. Analysis of MD trajectories
elucidated the probability of the formation of the so-called
near-attack complex (NAC), for which the angle of nucleophilic attack on
the substrate carbonyl is within an optimal range and the reacting
atoms are within the van der Waals distance. On the basis of the NAC
criteria [88], two possible protonation states of the
Ser/cis-Ser/Lys catalytic machinery were identified [88]. Interestingly,
at lower pH values, water molecules were found to participate in
catalysis by deprotonating the catalytic serine [88].
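
How a NAC population could be estimated from trajectory snapshots is
sketched below. The numeric cutoffs are assumptions chosen for
illustration (a contact distance near the sum of the O and C van der
Waals radii and an attack angle window around the Bürgi-Dunitz angle),
since the text gives no explicit values. The frame format and the
function names are likewise hypothetical, and this is not the protocol
used in the cited study.

```python
import math

# Assumed NAC thresholds for illustration only.
MAX_DISTANCE = 3.2                  # angstrom, nucleophile O to carbonyl C
ATTACK_ANGLE_RANGE = (90.0, 120.0)  # degrees, O(nucleophile)...C=O angle


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    cos_theta = sum(p * q for p, q in zip(v1, v2)) / (distance(a, b) * distance(c, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def is_nac(nucleophile_o, carbonyl_c, carbonyl_o):
    """True if one snapshot satisfies both the distance and the angle criterion."""
    d = distance(nucleophile_o, carbonyl_c)
    theta = angle_deg(nucleophile_o, carbonyl_c, carbonyl_o)
    return d <= MAX_DISTANCE and ATTACK_ANGLE_RANGE[0] <= theta <= ATTACK_ANGLE_RANGE[1]


def nac_fraction(frames):
    """Fraction of frames, each a (nucleophile_o, carbonyl_c, carbonyl_o) tuple, that are NACs."""
    if not frames:
        return 0.0
    return sum(is_nac(*frame) for frame in frames) / len(frames)
```

Applied to coordinates extracted from each saved frame, nac_fraction
gives the share of the simulation time spent in a reactive geometry,
which can then be compared between protonation states or enzyme
variants.
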
Another MD approach would be to model intermediates along
the reaction coordinate as representative TS structures. It is
well known that the tetrahedral intermediates formed during
serine hydrolase catalysis are high-energy structures that resemble
the corresponding TS (both in terms of geometry and energy)
[89, 90]. By modeling the tetrahedral intermediate, researchers
studied esterase-catalyzed transacylation of acyl donors bearing a
methoxy substituent [91]. The experimentally determined change
in reaction specificity was found to be up to 300-fold in favor of
aminolysis over alcoholysis using difunctionalized acyl donors [91].
A hydrogen bond, donated by the reacting –NH group of the
1-phenylethylamine nucleophile and accepted by the methoxy group
of the acyl donor, could form in the modeled TS structure [91].
Since alcohol nucleophiles lack this hydrogen-bonding capability in
the TS, the hydrogen bond could explain the experimentally observed
rate enhancement for esterase-catalyzed aminolysis [91]. In a later
study, MD simulations identified an analogous hydrogen bond in
the modeled tetrahedral intermediate for Bacillus subtilis esterase-
catalyzed amide bond hydrolysis [92]. In this case, the acceptor in
the TS was found to be a water molecule [92].
MD was used to study the stabilization of the TS for nitrogen
inversion in amidases [73]. The second tetrahedral intermediate
(TI2 in Fig. 15.1a) was used to represent the rate-limiting TS for
the protease-catalyzed amide bond hydrolysis [73]. The proton
sitting on the reacting nitrogen atom occupies an analogous spatial
arrangement in TI2 and in the TS for nitrogen inversion (i.e.,
the –NH group is pointing away from the bond formed between
the catalytic nucleophile and the former carbonyl carbon of the
substrate) [73].

Figure 15.2 A snapshot from an MD simulation of the second tetrahedral
intermediate reveals substrate-assisted catalysis displayed by plasmin. The
key hydrogen bond, involving a seven-membered ring TS, is shown by the
arrow. The residues forming the catalytic triad are labeled (H57, S195,
D102). The oxyanion hole is formed by the backbone of G193 and S195. The
scissile bond is located at Arg15 (bold sticks) confined within the so-called
plasminogen activation loop [93].

Interestingly, a key hydrogen bond donated by the
reacting –NH group was found in the modeled TS structure for
all 16 analyzed amidases. They represented 10 different reaction
mechanisms and 11 different folding families [73]. The investigated
amidases included classical proteases, such as trypsin, as well as
important enzymes in the metabolism of neuroactive peptides (e.g.,
fatty acid amide hydrolase) and in disease (e.g., HIV-protease). The
acceptor of the hydrogen bond was found to be a backbone carbonyl
in the protein (i.e., enzyme-assisted catalysis) or the P2 C=O of
the substrate itself (i.e., substrate-assisted catalysis) [73]. The latter
involves a characteristic seven-membered ring TS (see Fig. 15.2).
The only exception was carboxypeptidase A for which a side chain
(a tyrosine) functioned as the acceptor in the modeled TS structure
[73]. In all investigated amidases, the acceptor resided in a spatial
arrangement to provide optimal stabilization of the TS for nitrogen
inversion by hydrogen bond formation (Fig. 15.1c) [73]. Importantly,
it was found that an enzyme-assisted hydrogen bond acceptor for
inversion did not exist in esterases [73]. Hence this key interaction
is of high importance for the reaction specificity, which corroborates
the key role of a preorganized active site in achieving optimal TS
stabilization. Esterase-catalyzed transacylation, using amines and
alcohols as nucleophiles, was performed to analyze the impact of the
hydrogen bond experimentally. Acyl donors containing an additional
functional group, which would mimic the P2 C=O of peptide
substrates, resulted in significantly elevated rates of amide bond
formation [73]. Furthermore, the reaction specificity shifted 100-fold in
favor of aminolysis over alcoholysis using designed substrates [73].
The repertoire of investigated enzymes was expanded in a
later study to include important amidases, such as the proteasome
and bimetal amidohydrolases [94]. Remarkably, all investigated
amidases formed the key hydrogen bond with a spatial arrangement
to stabilize the TS for nitrogen inversion [94]. By MD analysis of
27 different proteases, it was found that substrate-assisted catalysis
was more common for larger peptide substrates, whereas enzyme-
assisted catalysis prevailed for smaller amides [94]. For
fatty acid amides, which lack a peptide backbone, substrate-assisted
catalysis via a P2 C=O group
is not possible. The formation of the key hydrogen bond in the TS
donated by a terminal substrate amino group is mechanistically
interesting. This is because rotation around the C–N bond could be
possible for the small –NH2 group of fatty acid amides. Nevertheless,
evolution seems to have favored stabilization of nitrogen inversion
as a mechanistically viable path (Fig. 15.1c) [94].
The potential of using MD simulations to study reaction mech-
anisms, as well as to discover important discriminants between
esterase and amidase reaction specificities, is furthermore demon-
strated for plasmin in this chapter. Plasmin is a key player in fibri-
nolysis, inflammation, and cancer [93]. On the basis of the crystal
structure 1BUI [93], a model of the second tetrahedral intermediate
for the hydrolysis of plasminogen catalyzed by staphylokinase-
activated plasmin can be constructed (see Fig. 15.2). In this case,
the substrate is a zymogen (i.e., plasminogen) that becomes active
upon hydrolysis by plasmin, thus forming an activation cascade. MD
simulations, performed as previously described [73, 94], reveal that
plasmin relies on substrate-assisted catalysis (Fig. 15.2),
which was not previously known.

Researchers used MD simulations to study the promiscuous
amidase activity displayed by Pseudomonas fluorescens esterase I
toward Vince lactams [95]. In the wild-type enzyme, there is a water
molecule in close proximity to the reacting –NH group that could
potentially function as a hydrogen bond acceptor. Interestingly, the
L29P variant that displayed 100-fold higher amidase activity toward
the γ-lactam cis-amides [95] had a backbone carbonyl in close
proximity to the reacting –NH group of the substrate [95]. From the
generated simulation snapshots [95], it can be seen that the C=O
acceptor in the L29P variant resides in a spatial arrangement that
can stabilize nitrogen inversion by hydrogen bond formation.
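As an illustration of how snapshots of this kind can be screened, the following minimal Python sketch counts the fraction of frames in which a donor–hydrogen–acceptor triplet satisfies a simple geometric hydrogen bond criterion. The function name, the array layout, and the cutoffs (3.5 Å donor–acceptor distance, 120° angle) are assumptions made here for illustration; they are not taken from [95].

```python
import numpy as np

def hbond_fraction(donor, hydrogen, acceptor, max_da=3.5, min_angle=120.0):
    """Fraction of frames in which a D-H...A hydrogen bond is formed.

    donor, hydrogen, acceptor: arrays of shape (n_frames, 3), in angstroms.
    max_da and min_angle are illustrative cutoffs (assumptions).
    """
    donor = np.asarray(donor, dtype=float)
    hydrogen = np.asarray(hydrogen, dtype=float)
    acceptor = np.asarray(acceptor, dtype=float)

    # Donor-acceptor heavy-atom distance in each frame
    d_da = np.linalg.norm(acceptor - donor, axis=1)

    # D-H...A angle, measured at the hydrogen, in each frame
    v_hd = donor - hydrogen
    v_ha = acceptor - hydrogen
    cos_theta = np.sum(v_hd * v_ha, axis=1) / (
        np.linalg.norm(v_hd, axis=1) * np.linalg.norm(v_ha, axis=1)
    )
    angle = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

    formed = (d_da <= max_da) & (angle >= min_angle)
    return float(formed.mean())

# Two synthetic frames: a short, linear contact and a long contact
N = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
H = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
O = [[2.9, 0.0, 0.0], [5.0, 0.0, 0.0]]
print(hbond_fraction(N, H, O))
```

In practice the coordinates would be extracted from the MD trajectory for the reacting –NH group and the candidate acceptor, and the cutoffs would be chosen to match the TS geometry of interest.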
MD simulations attributed the enhanced promiscuous amidase activity
of a double variant (I270F, F314Y) of Bacillus subtilis esterase
to an introduced π-stacking interaction [96]. Interestingly, from
the generated snapshots it can be seen that the F314Y substitution
brings a polar group in close proximity to the reacting –NH group in
the modeled tetrahedral intermediate. It is possible that the
established π-network orients the introduced Tyr in a productive
conformation for hydrogen bond formation by its hydroxyl, thus
facilitating nitrogen inversion.
MD simulations were used to study the proton shuttle mech-
anism for esterase-catalyzed transacylation using difunctional-
ized amines as nucleophiles (Fig. 15.1d) [76]. The simulations
demonstrated that the frozen quantum mechanically computed TS
structure can indeed form when taking the dynamics of the whole
enzyme into account [76]. The modeled zwitterionic tetrahedral
intermediate that represented the TS for the proton shuttle in Fig.
15.1d had a higher abundance of productive conformations for
smaller amino alcohols as compared to larger substrates [76]. This
finding was in accordance with the experimental results that longer
amino alcohols displayed lower rate enhancements of amide bond
synthesis [76].

15.3.3 QM/MM Simulations on Amidases and Esterases


The geometry of the key hydrogen bond acceptor (Fig. 15.1c),
obtained from a snapshot of an MD simulation of protease-catalyzed
amide bond hydrolysis, was inserted in a theozyme model of an
esterase [73]. QM was subsequently used to investigate the impact
of the added amidase-like hydrogen bond acceptor on the reaction
mechanism for amide bond hydrolysis [73]. It was found that the key
hydrogen bond, donated by the reacting –NH group, contributed 4
kcal/mol to the stabilization of the TS for nitrogen inversion in the
gas phase
[73]. This demonstrates the importance of combining QM and MD
to decipher enzyme catalysis. When the hydrogen bond interaction
was added to the QM calculations on the esterase, a flat energy
surface was obtained, with the associated TS structures (TS1, TSinv,
and TS2 depicted in Fig. 15.1a) being close to isoenergetic [73]. The
importance of isoenergetic TS structures as a catalytic strategy for
efficient enzyme catalysis has been discussed in the literature [51].
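To put the energies discussed above on a kinetic scale, the short Python sketch below converts between a barrier difference and a rate ratio using the exponential dependence of transition state theory, k1/k2 = exp(ΔΔG‡/RT). It assumes 298.15 K, equal pre-exponential factors, and that the stabilization applies to the rate-limiting TS; the function names are illustrative only. A gas-phase stabilization does not translate directly into an observed rate enhancement in the enzyme, so the numbers are order-of-magnitude guides.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def rate_factor(ddg_kcal, temp_k=298.15):
    """Rate ratio implied by lowering a barrier by ddg_kcal (kcal/mol)."""
    return math.exp(ddg_kcal / (R_KCAL * temp_k))

def barrier_difference(fold, temp_k=298.15):
    """Barrier difference (kcal/mol) implied by a fold change in rate."""
    return R_KCAL * temp_k * math.log(fold)

# 4 kcal/mol TS stabilization, as computed for the added hydrogen bond [73]
print(f"{rate_factor(4.0):.0f}-fold")
# 100-fold shift in reaction specificity, as observed experimentally [73]
print(f"{barrier_difference(100.0):.2f} kcal/mol")
```

Under these assumptions, 4 kcal/mol corresponds to a rate factor of several hundred at room temperature, and a 100-fold shift in specificity corresponds to somewhat less than 3 kcal/mol.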
In a later study, AM1, combined with the CHARMM force field,
was used to study amide bond hydrolysis catalyzed by the hepatitis
C virus NS3 serine protease [97], which is key for viral maturation.
Interestingly, a spatial rearrangement of the reacting nitrogen atom
in the first tetrahedral intermediate was suggested to produce a
second tetrahedral intermediate that could abstract a proton from
the catalytic histidine [97]. The TS for this interconversion (i.e.,
corresponding to the transformation of TI1 to TI2 in Fig. 15.1a) was
not reported [97]. The importance of a structural reconfiguration
at the reacting nitrogen atom during catalysis has likewise been
discussed for thermolysin on the basis of QM/MM calculations [98].
The hybrid QM/MM approach was used to study the acylation
step of serine-protease-catalyzed hydrolysis [99]. The quantum
mechanical region of the active site was treated using the Hartree–
Fock method with the 6-31G* basis set. The formation of the first
tetrahedral intermediate (corresponding to TI1 in Fig. 15.1a) was
found to be rate limiting [99]. The importance of a structural
reorganization to funnel the first intermediate into a second
tetrahedral intermediate (i.e., TI1 to TI2) was discussed [99].
Rotation around the scissile C–N bond was suggested to allow for
proton abstraction from the catalytic base. However, no TS structure
for the interconversion of TI1 and TI2 was reported [99]. In another
study, the deacylation step for elastase was studied using QM/MM
at the HF/3-21G//CHARMM level of theory [100]. The nucleophilic
attack by the catalytic water molecule was found to be the rate-
limiting step for deacylation [100].

QM/MM simulations at the B3LYP/6-31G(d)//AM1 level of
theory were used to study the reaction mechanism of L-asparaginase
from Escherichia coli [101]. On the basis of the simulations it was
suggested that an unprotonated Lys residue activates a catalytic
water molecule, rather than an active site Thr that is part of a
Thr/Lys/Asp triad. According to this mechanism, catalysis proceeds
by water attack on the amide bond and does not involve the
formation of an acyl–enzyme complex [101].
QM/MM calculations were used to study the reaction mechanism
of inhibition of the proteasome by the drug epoxomicin [102]. The
proteasome is a multisubunit N-terminal hydrolase for which the
terminal Thr of the catalytic domains (referred to as β-subunits)
constitutes the nucleophile. It was found that the first step of the
acylation process consisted of the generation of a negatively charged
Thr1 Oγ, for which the N-terminal amino group functioned as
the general base [102]. The rate-limiting step of inhibition was
found to be the ring opening of the epoxide, confined within
the inhibitor skeleton, by nucleophilic attack of the deprotonated
enzyme nucleophile on the β-carbon of the drug [102]. Interestingly,
it was found that active participation of water molecules in catalysis,
through proton shuttling, did not afford a catalytic advantage for
the acylation step [102]. The potential of using QM/MM calculations
in elucidating the reaction mechanism of protease inhibition by
drugs has furthermore been corroborated for fatty acid amide
hydrolase [103]. By studying the deacylation of the catalytic Ser, it
was concluded that the spatial arrangement of the carbamoyl moiety
of the inhibitor obstructed the stabilization that would otherwise
be provided by the oxyanion hole [103]. Another study focused on
unraveling the catalytic mechanism for fatty acid amide hydrolase-
catalyzed peptide bond hydrolysis [104]. Interestingly, both esterase
and amidase activities were modeled for the wild-type enzyme and
a variant, for which the catalytic Lys had been mutated to Ala.
Acylation was found to be rate limiting for the wild-type enzyme
and the catalytic Lys was found to participate in general acid/base
catalysis [104]. It was found that a new reaction mechanism was
introduced in the catalytically impaired dyad variant, for which
nucleophilic attack and proton transfer to the reacting nitrogen
group of the substrate were essentially concerted [104].
Sedolisins comprise an exotic Ser/Glu/Asp catalytic machinery
that gives rise to maximal activity in the low-pH regime [28]. By
utilizing QM/MM calculations, it was found that the substrate
carbonyl oxygen gets protonated by an Asp (D164) through general
acid/base catalysis upon formation of the tetrahedral intermediate
[105]. Moreover, a histidine in the P1 position of the substrate was
suggested to change conformation. This would allow for substrate-
assisted catalysis by stabilizing the developing negative charge on
the catalytic Asp [106]. In a later study, the importance of such
substrate-assisted catalysis was found to be less pronounced during
deacylation catalyzed by Kumamolisin-As [107]. Researchers used
QM/MM methods to study amide bond hydrolysis catalyzed by the
cysteine protease cathepsin K [108]. It was concluded that acylation
was the rate-limiting step and that protonation of the reacting
nitrogen atom of the substrate occurs in concert with a nucleophilic
attack by the thiolate anion [108]. Interestingly, a weak hydrogen
bond donated by the reacting nitrogen atom and accepted by the
backbone of N161 was observed [108].
By employing QM/MM calculations, researchers studied acetyl-
choline esterase-catalyzed ester bond hydrolysis [109]. A low-
barrier hydrogen bond that resulted in a proton being shared
between the catalytic histidine and the catalytic glutamate was
proposed [109]. The low-barrier hydrogen bond aspect of the
reaction mechanism has previously been discussed in the literature
for serine proteases [13]. The rate-limiting step for ester bond
hydrolysis by the Ser/His/Glu triad of acetylcholine esterase was
found to be the deacylation step [109].
Histone-deacetylases are active toward acetyl-lysine scaffolds in
a biochemical process controlling the structure of the nucleosome.
QM/MM calculations were exploited to study the acylation step of
amide bond hydrolysis catalyzed by histone-deacetylase-like protein
[110]. On the basis of the simulations, for which the catalytic metal
was treated with a pseudopotential and the rest of the QM system
with DFT, nucleophilic attack by a water molecule on the carbonyl
carbon of the substrate was found to be rate limiting [110]. In
the proposed reaction mechanism, which involved two catalytic
histidines, the role of the metal was found to be the polarization of
the carbonyl carbon–oxygen bond [110].

Binuclear metallohydrolases are key players in the metabolism
of small peptide substrates. Glutamate carboxypeptidase II contains
two catalytic zinc atoms bridged by a nucleophilic hydroxide. The
water nucleophile is activated by a general base (typically a
glutamate), which is not directly bound to the catalytic metals.
By utilizing B3LYP/def2-TZVP//Amber calculations, it was found
that protonation of the reacting nitrogen atom and collapse of the
second tetrahedral intermediate were rate limiting [111]. This is in
accordance with previous quantum mechanical DFT calculations on
a small model of the active site of a binuclear zinc aminopeptidase
[69]. Interestingly, on the basis of the QM/MM model, a hydrogen
bond donated by the reacting –NH group and accepted by the
backbone of G518 was observed [111]. Researchers used QM/MM
simulations at the SCC-DFTB//CHARMM level to investigate the
ring-opening step of β-lactam antibiotics catalyzed by bimetal-
dependent metallo-β-lactamases [112]. Catalysis through the formation
of a negatively charged reacting nitrogen atom was proposed
[112]. The coordination of the reacting nitrogen atom of the
β-lactam ring to one of the catalytic zinc atoms was found to be
crucial for catalysis [112].
Other mechanisms for enzyme-catalyzed β-lactam hydrolysis
are founded on utilizing a serine as the catalytic nucleophile.
Researchers used QM/MM simulations to study the acylation step
of penicillin hydrolysis catalyzed by class A β-lactamases [113].
These enzymes contain a Ser/Ser/Lys triad and an additional Glu.
The acidic amino acid is central for the activation of a water molecule
during the deacylation step of penicillin hydrolysis. By performing
geometry optimizations at the AM1//CHARMM level of theory, a
proton shuttle mechanism was suggested to allow for efficient
acylation [113]. More specifically, two protons were suggested to
be in flight in the first TS [113]. Key for this suggested mechanism
is the catalytic water molecule, essential for the deacylation step,
that functions as a shuttling residue (accepting a proton from
the nucleophilic Ser and synchronously delivering a proton to the
catalytic Glu) [113]. The Lys in the catalytic Ser/Ser/Lys triad
was protonated during the simulations [113], reflecting catalysis
relevant for physiological pH values. In a more recent study, the
atomistic details of carbapenemase activity, displayed by evolved
class A β-lactamases, were studied by QM/MM simulations [114]. On
the basis of these simulations on recently solved crystal structures,
it was found that bacterial resistance occurred through an evolved
protein–antibiotic hydrogen bond [114]. This interaction replaced a
hydrogen bond between the hydroxyethyl moiety of the meropenem
antibiotic and the catalytic water molecule, which would impair the
deacylation step in meropenem-sensitive bacteria [114].
By using the EVB approach to study the reaction mechanism of
aspartic proteases, the importance of enzyme preorganization in
achieving electrostatic stabilization of TSs was demonstrated [115].
Protonation of the reacting nitrogen atom, prior to the collapse of
the tetrahedral intermediate, was simulated [115]. On the basis of
the calculations, a linear free-energy relationship was established
for the three different enzymes studied [115].

15.4 Outlook and Implications for Enzyme Design

Biochemical experiments in concert with computer simulations
allow researchers to meticulously describe enzyme catalysis at
the atomistic level. From the examples discussed herein, it can
be concluded that QM calculations, MD simulations, and QM/MM
hybrid techniques all have important roles to play in enhancing our
understanding of how enzymes work. These techniques have beau-
tifully shed light on a number of enzymatic catalytic mechanisms for
ester and amide bond hydrolysis and synthesis. General acid/base
catalysis, the oxyanion hole, and electrostatic stabilization of TSs
are of paramount importance for high activities. Importantly, these
catalytic attributes are displayed by both amidases and esterases.
Therefore, a key to understanding their reaction specificities is to
account for the spatial arrangement of the single lone pair, sitting on
the reacting nitrogen atom of the amide substrate, during catalysis
(Fig. 15.1a) [94]. Molecular modeling is very well suited to study
this biochemical aspect. The recently discovered key hydrogen bond
in amidases, donated by the reacting –NH group of the substrate,
has a geometry to stabilize the TS for nitrogen inversion [73].
In particular, an enzyme-assisted hydrogen bond, which would
facilitate the nitrogen inversion step, is not possible in the active
site of esterase enzymes [73]. The discovery of this important and
general determinant of reaction specificity stresses the usefulness
of combining quantum mechanical calculations and MD simulations
[73, 94]. It should be noted that a hydrogen bond donated by
the reacting –NH group of the amide substrate has been observed
previously in a few cases [116–119]. A phosphonamide inhibitor,
which was found to form a hydrogen bond to the backbone of A113
in thermolysin, displayed a binding energy 4.1 kcal/mol higher than
that of the corresponding ester [120]. This value is very close to
the QM calculated stabilization of the TS for nitrogen inversion
upon introducing a hydrogen bond acceptor in a small model of an
esterase active site [73].
In addition to the key hydrogen bond donated by the substrate
amide group (Fig. 15.1c), two additional catalytic strategies for
enhanced amidase activity are known to date. These are
centered on avoiding the nitrogen inversion step. The first additional
strategy consists of protonation of the reacting nitrogen
atom of the substrate, prior to nucleophilic attack on the carbonyl carbon by the
enzyme nucleophile. This mechanism has been suggested to be
relevant for β-lactamases [121]. The other strategy is founded on a
substrate-assisted proton shuttle mechanism capable of delivering
a proton to the reacting nitrogen atom in the antiperiplanar
orientation (Fig. 15.1d).
Rational enzyme engineering holds great potential for
the future design of biocatalysts with improved properties [122].
MD simulations can be exploited to study the probability of the
formation of the “near-attack conformation,” a term coined by
Bruice [123]. Since the near-attack conformation structure displays
favorable distances and angles for catalysis, some atomistic details
of the relevant TS structure can be captured [123]. Importantly, the
near-attack conformation concept can be used to design enzyme
variants that display higher probabilities of TS formation in
classical MD simulations. This approach was followed by researchers to predict
biocatalyst variants that could accommodate substituted acrylic
acid esters in the TS for esterase-catalyzed transacylation [71]. The
near-attack conformation was defined on the basis of quantum
mechanical calculations [71]. In silico–designed variants displayed
a threefold absolute increase in kcat for the transacylation of the
industrially relevant building block methyl methacrylate [124].
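The probability of forming a near-attack conformation can be estimated from an MD trajectory as the fraction of frames that satisfy distance and angle criteria. The Python sketch below is a minimal illustration of that idea, assuming a nucleophile-to-carbonyl-carbon distance cutoff of 3.2 Å and an approach angle (Nu···C=O) between 90° and 120°. These cutoffs, the function name, and the synthetic coordinates are assumptions made here; in [71] the near-attack conformation was defined on the basis of quantum mechanical calculations.

```python
import numpy as np

def nac_fraction(nuc, carbon, oxygen, max_dist=3.2, angle_range=(90.0, 120.0)):
    """Fraction of MD frames that satisfy a near-attack geometry.

    nuc, carbon, oxygen: (n_frames, 3) coordinates in angstroms of the
    nucleophile atom, the carbonyl carbon, and the carbonyl oxygen.
    The cutoffs are illustrative assumptions, not those used in [71].
    """
    nuc, carbon, oxygen = (
        np.asarray(a, dtype=float) for a in (nuc, carbon, oxygen)
    )
    v_cn = nuc - carbon      # carbonyl carbon -> nucleophile
    v_co = oxygen - carbon   # carbonyl carbon -> carbonyl oxygen

    # Attack distance and Nu...C=O approach angle in each frame
    dist = np.linalg.norm(v_cn, axis=1)
    cos_theta = np.sum(v_cn * v_co, axis=1) / (
        dist * np.linalg.norm(v_co, axis=1)
    )
    angle = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

    lo, hi = angle_range
    in_nac = (dist <= max_dist) & (angle >= lo) & (angle <= hi)
    return float(in_nac.mean())

# Two synthetic frames: one well-aligned close contact, one distant contact
theta = np.radians(105.0)
carbon = np.zeros((2, 3))
oxygen = np.array([[1.2, 0.0, 0.0], [1.2, 0.0, 0.0]])
nuc = np.array([
    [3.0 * np.cos(theta), 3.0 * np.sin(theta), 0.0],  # 3.0 A, about 105 degrees
    [-4.0, 0.0, 0.0],                                 # 4.0 A, 180 degrees
])
print(nac_fraction(nuc, carbon, oxygen))
```

Comparing this fraction between the wild type and a candidate variant, computed over trajectories of equal length, gives a simple ranking criterion of the kind described above.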
Novel biocatalysts capable of the green synthesis of amide
bonds have been evaluated to be particularly important in the
pharmaceutical industry [22]. The α-/β-hydrolase fold represents
a scaffold that could be suitable for such applications. Substrate-
assisted catalysis is possible for esterase-catalyzed amide bond hy-
drolysis/synthesis, which has been exploited in multiton industrial
processes [23]. However, designed enzymes that would accept a
range of acyl donors in synthesis would be highly desirable. We were
able to take a first step in this direction by introducing hydrogen
bond acceptors for facilitated nitrogen inversion into two esterases
[125]. MD simulations indicated that the inserted amidase-like
hydrogen bond was nonoptimal in the TS, which was in accordance with
the fact that expressed variants displayed only up to a tenfold absolute
increase in amidase activity [125]. The reaction specificity of the
designed esterase variants was shifted up to 50-fold in favor of
amidase over esterase activity [125]. A perfect hydrogen bond in the
TS could allow for designer enzymes with important applications in
the chemical and pharmaceutical industrial fields.
Water molecules can be key players in enzyme catalysis,
which is stressed in the examples given in this chapter. Thus an
understanding of water patterns and solvation of active sites would
be highly desirable. This could allow researchers to exploit water
molecules in novel and innovative design strategies, such as in
rationally engineered hydrogen bonding stabilization of TSs.

15.5 Additional Comments

Upon completion of this chapter, a work centered on computer
simulations on fatty acid amide hydrolase appeared [126]. The
calculations corroborated the importance of nitrogen inversion.

Acknowledgments

P.-O. Syrén is grateful for a young investigator grant from the
Swedish Research Council (VR).

References

1. Garcia-Viloca, M., Gao, J., Karplus, M., and Truhlar, D. G. (2004).
How enzymes work: analysis by modern rate theory and computer
simulations, Science, 303, pp. 186–195.
2. Ringe, D., and Petsko, G. A. (2008). Biochemistry. How enzymes work,
Science, 320, pp. 1428–1429.
3. Karplus, M. (2014). Development of multiscale models for complex
chemical systems: from H+H2 to biomolecules (Nobel Lecture), Angew.
Chem., Int. Ed., 53, pp. 9992–10005.
4. Levitt, M. (2014). Birth and future of multiscale modeling for
macromolecular systems (Nobel Lecture), Angew. Chem., Int. Ed., 53, pp.
10006–10018.
5. Warshel, A. (2014). Multiscale modeling of biological functions: from
enzymes to molecular machines (Nobel Lecture), Angew. Chem., Int. Ed.,
53, pp. 10020–10031.
6. Jiang, L., Althoff, E. A., Clemente, F. R., Doyle, L., Roethlisberger, D.,
Zanghellini, A., Gallaher, J. L., Betker, J. L., Tanaka, F., Barbas, C. F., III,
Hilvert, D., Houk, K. N., Stoddard, B. L., and Baker, D. (2008). De novo
computational design of retro-aldol enzymes, Science, 319, pp. 1387–
1391.
7. Siegel, J. B., Zanghellini, A., Lovick, H. M., Kiss, G., Lambert, A. R., St. Clair,
J. L., Gallaher, J. L., Hilvert, D., Gelb, M. H., Stoddard, B. L., Houk, K. N.,
Michael, F. E., and Baker, D. (2010). Computational design of an enzyme
catalyst for a stereoselective bimolecular Diels-Alder reaction, Science,
329, pp. 309–313.
8. Hoehne, M., and Bornscheuer, U. T. (2014). Protein engineering from
“scratch” is maturing, Angew. Chem., Int. Ed., 53, pp. 1200–1202.
9. Blomberg, R., Kries, H., Pinkas, D. M., Mittl, P. R. E., Gruetter, M. G.,
Privett, H. K., Mayo, S. L., and Hilvert, D. (2013). Precision is essential
for efficient catalysis in an evolved Kemp eliminase, Nature, 503, pp.
418–421.
10. Warshel, A., Sussman, F., and Hwang, J. K. (1988). Evaluation of catalytic
free energies in genetically modified proteins, J. Mol. Biol., 201, pp.
139–159.
11. Noodleman, L., Lovell, T., Han, W.-G., Li, J., and Himo, F. (2004). Quantum
chemical studies of intermediates and reaction pathways in selected
enzymes and catalytic synthetic systems, Chem. Rev., 104, pp. 459–
508.
12. Leach, A. (2001). Molecular Modelling: Principles and Applications, 2nd
ed. (Prentice Hall, New Jersey).
13. Warshel, A., and Russell, S. (1986). Theoretical correlation of structure
and energetics in the catalytic reaction of trypsin, J. Am. Chem. Soc.,
108, pp. 6569–6579.
14. Levitt, M. (1976). A simplified representation of protein conformations
for rapid simulation of protein folding, J. Mol. Biol., 104, pp. 59–107.
15. McCammon, J. A., and Karplus, M. (1979). Dynamics of activated
processes in globular proteins, Proc. Natl. Acad. Sci. U S A, 76, pp. 3585–
3589.
16. Tantillo, D. J., Chen, J., and Houk, K. N. (1998). Theozymes and
compuzymes: theoretical models for biological catalysis, Curr. Opin.
Chem. Biol., 2, pp. 743–750.
17. Siegbahn, P. E. M., and Himo, F. (2009). Recent developments of the
quantum chemical cluster approach for modeling enzyme reactions,
J. Biol. Inorg. Chem., 14, pp. 643–651.
18. Senn, H. M., and Thiel, W. (2009). QM/MM methods for biomolecular
systems, Angew. Chem., Int. Ed., 48, pp. 1198–1229.
19. Field, M. J., Bash, P. A., and Karplus, M. (1990). A combined quantum
mechanical and molecular mechanical potential for molecular dynam-
ics simulations, J. Comput. Chem., 11, pp. 700–733.
20. Warshel, A., and Levitt, M. (1976). Theoretical studies of enzymic
reactions: dielectric, electrostatic and steric stabilization of the
carbonium ion in the reaction of lysozyme, J. Mol. Biol., 103, pp. 227–
249.
21. Shaw, D. E., Maragakis, P., Lindorff-Larsen, K., Piana, S., Dror, R. O.,
Eastwood, M. P., Bank, J. A., Jumper, J. M., Salmon, J. K., Shan, Y., and
Wriggers, W. (2010). Atomic-level characterization of the structural
dynamics of proteins, Science, 330, pp. 341–346.
22. Constable, D. J. C., Dunn, P. J., Hayler, J. D., Humphrey, G. R., Leazer, J.
L., Jr., Linderman, R. J., Lorenz, K., Manley, J., Pearlman, B. A., Wells, A.,
Zaks, A., and Zhang, T. Y. (2007). Key green chemistry research areas-
a perspective from pharmaceutical manufacturers, Green Chem., 9, pp.
411–420.
23. Hanefeld, U. (2003). Reagents for (ir)reversible enzymatic acylations,
Org. Biomol. Chem., 1, pp. 2405–2415.
24. Radzicka, A., and Wolfenden, R. (1996). Rates of uncatalyzed peptide
bond hydrolysis in neutral solution and the transition state affinities
of proteases, J. Am. Chem. Soc., 118, pp. 6105–6109.

25. Zerner, B., Bond, R. P. M., and Bender, M. L. (1964). The mechanism
of action of proteolytic enzymes. XXVIII. Kinetic evidence for the
formation of acyl-enzyme intermediates in the α-chymotrypsin-
catalyzed hydrolyses of specific substrates, J. Am. Chem. Soc., 86, pp.
3674–3679.
26. Ollis, D. L., Cheah, E., Cygler, M., Dijkstra, B., Frolow, F., Franken, S. M.,
Harel, M., Remington, S. J., Silman, I. et al. (1992). The α/β hydrolase
fold, Protein Eng., 5, pp. 197–211.
27. Dolinsky, V. W., Gilham, D., Alam, M., Vance, D. E., and Lehner, R. (2004).
Triacylglycerol hydrolase: role in intracellular lipid metabolism, Cell.
Mol. Life Sci., 61, pp. 1633–1651.
28. Ekici, O. D., Paetzel, M., and Dalbey, R. E. (2008). Unconventional serine
proteases: variations on the catalytic Ser/His/Asp triad configuration,
Protein Sci., 17, pp. 2023–2037.
29. Werb, Z. (1997). ECM and cell surface proteolysis: regulating cellular
ecology, Cell, 91, pp. 439–442.
30. Debouck, C., Gorniak, J. G., Strickler, J. E., Meek, T. D., Metcalf, B. W.,
and Rosenberg, M. (1987). Human immunodeficiency virus protease
expressed in Escherichia coli exhibits autoprocessing and specific
maturation of the gag precursor, Proc. Natl. Acad. Sci. U S A, 84, pp.
8903–8906.
31. Neurath, H. (1984). Evolution of proteolytic enzymes, Science, 224, pp.
350–357.
32. Fitzgerald, P. M., and Springer, J. P. (1991). Structure and function of
retroviral proteases, Annu. Rev. Biophys. Biophys. Chem., 20, pp. 299–
320.
33. Neurath, H., and Walsh, K. A. (1976). Role of proteolytic enzymes in
biological regulation (a review), Proc. Natl. Acad. Sci. U S A, 73, pp.
3825–3832.
34. Vollmer, W., Joris, B., Charlier, P., and Foster, S. (2008). Bacterial
peptidoglycan (murein) hydrolases, FEMS Microbiol. Rev., 32, pp. 259–
286.
35. Wyckoff, T. J., Taylor, J. A., and Salama, N. R. (2012). Beyond growth:
novel functions for bacterial cell wall hydrolases, Trends Microbiol., 20,
pp. 540–547.
36. Bachovchin, D. A., and Cravatt, B. F. (2012). The pharmacological
landscape and therapeutic potential of serine hydrolases, Nat. Rev.
Drug Discovery, 11, pp. 52–68.
37. Roques, B. P., Fournie-Zaluski, M.-C., and Wurm, M. (2012). Inhibiting
the breakdown of endogenous opioids and cannabinoids to alleviate
pain, Nat. Rev. Drug Discovery, 11, pp. 292–310.
38. Bonten, E., van der Spoel, A., Fornerod, M., Grosveld, G., and d’Azzo, A.
(1996). Characterization of human lysosomal neuraminidase defines
the molecular basis of the metabolic storage disorder sialidosis, Genes
Dev., 10, pp. 3156–3169.
39. Kokame, K., Matsumoto, M., Soejima, K., Yagi, H., Ishizashi, H.,
Funato, M., Tamai, H., Konno, M., Kamide, K., Kawano, Y., Miyata,
T., and Fujimura, Y. (2002). Mutations and common polymorphisms
in ADAMTS13 gene responsible for von Willebrand factor-cleaving
protease activity, Proc. Natl. Acad. Sci. U S A, 99, pp. 11902–11907.
40. Jones, J. M., Datta, P., Srinivasula, S. M., Ji, W., Gupta, S., Zhang, Z., Davies,
E., Hajnoczky, G., Saunders, T. L., Van Keuren, M. L., Fernandes-Alnemri,
T., Meisler, M. H., and Alnemri, E. S. (2003). Loss of Omi mitochondrial
protease activity causes the neuromuscular disorder of mnd2 mutant
mice, Nature, 425, pp. 721–727.
41. Chai, Y., Koppenhafer, S. L., Shoesmith, S. J., Perez, M. K., and
Paulson, H. L. (1999). Evidence for proteasome involvement in
polyglutamine disease: localization to nuclear inclusions in SCA3/MJD
and suppression of polyglutamine aggregation in vitro, Hum. Mol.
Genet., 8, pp. 673–682.
42. Stefanis, L., Larsen, K. E., Rideout, H. J., Sulzer, D., and Greene, L. A.
(2001). Expression of A53T mutant but not wild-type α-synuclein in
PC12 cells induces alterations of the ubiquitin-dependent degradation
system, loss of dopamine release, and autophagic cell death, J.
Neurosci., 21, pp. 9549–9560.
43. Bross, P., Corydon, T. J., Andresen, B. S., Jorgensen, M. M., Bolund, L., and
Gregersen, N. (1999). Protein misfolding and degradation in genetic
diseases, Hum. Mutat., 14, pp. 186–198.
44. Pennacchio, L. A., Bouley, D. M., Higgins, K. M., Scott, M. P., Noebels, J.
L., and Myers, R. M. (1998). Progressive ataxia, myoclonic epilepsy and
cerebellar apoptosis in cystatin B-deficient mice, Nat. Genet., 20, pp.
251–258.
45. Chavanas, S., Bodemer, C., Rochat, A., Hamel-Teillac, D., Ali, M., Irvine,
A. D., Bonafe, J.-L., Wilkinson, J., Taieb, A., Barrandon, Y., Harper, J. I.,
de Prost, Y., and Hovnanian, A. (2000). Mutations in SPINK5, encoding a
serine protease inhibitor, cause Netherton syndrome, Nat. Genet., 25,
pp. 141–142.

46. Duffy, M. J., McGowan, P. M., and Gallagher, W. M. (2008). Cancer invasion
and metastasis: changing views, J. Pathol., 214, pp. 283–293.
47. Van Kempen, L. C. L., de Visser, K. E., and Coussens, L. M. (2006).
Inflammation, proteases and cancer, Eur. J. Cancer, 42, pp. 728–734.
48. Hafez, D., Huang, J. Y., Huynh, A. M., Valtierra, S., Rockenstein, E.,
Bruno, A. M., Lu, B., DesGroseillers, L., Masliah, E., and Marr, R. A.
(2011). Neprilysin-2 is an important β-amyloid degrading enzyme, Am.
J. Pathol., 178, pp. 306–312.
49. Leissring, M. A. (2008). The AβCs of Aβ-cleaving proteases, J. Biol.
Chem., 283, pp. 29645–29649.
50. Malito, E., Hulse, R. E., and Tang, W. J. (2008). Amyloid β-degrading
cryptidases: insulin degrading enzyme, presequence peptidase, and
neprilysin, Cell. Mol. Life Sci., 65, pp. 2574–2585.
51. Fersht, A. (1999). Structure and Mechanism in Protein Science: A Guide
to Enzyme Catalysis and Protein Folding (W.H. Freeman, New York).
52. Hedstrom, L. (2002). Serine protease mechanism and specificity, Chem.
Rev., 102, pp. 4501–4524.
53. Deslongchamps, P. (1975). Stereoelectronic control in the cleavage
of tetrahedral intermediates in the hydrolysis of esters and amides,
Tetrahedron, 31, pp. 2463–2490.
54. Deslongchamps, G., and Deslongchamps, P. (2011). Bent bonds, the
antiperiplanar hypothesis and the theory of resonance. A simple model
to understand reactivity in organic chemistry, Org. Biomol. Chem., 9, pp.
5321–5333.
55. Bizzozero, S. A., and Dutler, H. (1981). Stereochemical aspects of
peptide hydrolysis catalyzed by serine proteases of the chymotrypsin
type, Bioorg. Chem., 10, pp. 46–62.
56. Liu, B., Schofield, C. J., and Wilmouth, R. C. (2006). Structural analyses
on intermediates in serine protease catalysis, J. Biol. Chem., 281, pp.
24024–24035.
57. Ahn, K., Johnson, D. S., Fitzgerald, L. R., Liimatta, M., Arendse, A.,
Stevenson, T., Lund, E. T., Nugent, R. A., Nomanbhoy, T. K., Alexander, J.
P., and Cravatt, B. F. (2007). Novel mechanistic class of fatty acid amide
hydrolase inhibitors with remarkable selectivity, Biochemistry, 46, pp.
13019–13030.
58. Strajbl, M., Florian, J., and Warshel, A. (2000). Ab initio evaluation
of the potential surface for general base-catalyzed methanolysis
of formamide: a reference solution reaction for studies of serine
proteases, J. Am. Chem. Soc., 122, pp. 5354–5366.
March 21, 2016 13:47 PSP Book - 9in x 6in 15-Allan-Svendsen-c15

552 Understanding Esterase and Amidase Reaction Specificities by Molecular Modeling

59. Hu, C.-H., Brinck, T., and Hult, K. (1998). Ab initio and density
functional theory studies of the catalytic mechanism for ester
hydrolysis in serine hydrolases, Int. J. Quantum Chem., 69, pp. 89–103.
60. Smith, A. J. T., Muller, R., Toscano, M. D., Kast, P., Hellinga, H. W.,
Hilvert, D., and Houk, K. N. (2008). Structural reorganization and
preorganization in enzyme active sites: comparisons of experimental
and theoretically ideal active site geometries in the multistep serine
esterase reaction cycle, J. Am. Chem. Soc., 130, pp. 15361–15373.
61. Hediger, M. R., De Vico, L., Svendsen, A., Besenmatter, W., and Jensen, J.
H. (2012). A computational methodology to screen activities of enzyme
variants, PLOS ONE, 7, p. e49849.
62. Hilvert, D. (2013). Design of protein catalysts, Annu. Rev. Biochem., 82,
pp. 447–470.
63. Kiss, G., Celebi-Oelcuem, N., Moretti, R., Baker, D., and Houk, K. N.
(2013). Computational enzyme design, Angew. Chem., Int. Ed., 52, pp.
5700–5725.
64. Richter, F., Blomberg, R., Khare, S. D., Kiss, G., Kuzin, A. P., Smith, A.
J. T., Gallaher, J., Pianowski, Z., Helgeson, R. C., Grjasnow, A., Xiao, R.,
Seetharaman, J., Su, M., Vorobiev, S., Lew, S., Forouhar, F., Kornhaber,
G. J., Hunt, J. F., Montelione, G. T., Tong, L., Houk, K. N., Hilvert, D.,
and Baker, D. (2012). Computational design of catalytic dyads and
oxyanion holes for ester hydrolysis, J. Am. Chem. Soc., 134, pp. 16197–
16206.
65. Rajagopalan, S., Wang, C., Yu, K., Kuzin, A. P., Richter, F., Lew, S., Miklos, A.
E., Matthews, M. L., Seetharaman, J., Su, M., Hunt, J. F., Cravatt, B. F., and
Baker, D. (2014). Design of activated serine-containing catalytic triads
with atomic-level accuracy, Nat. Chem. Biol., 10, pp. 386–391.
66. Kheirabadi, M., Celebi-Olcum, N., Parker, M. F. L., Zhao, Q., Kiss,
G., Houk, K. N., and Schafmeister, C. E. (2012). Spiroligozymes for
transesterifications: design and relationship of structure to activity, J.
Am. Chem. Soc., 134, pp. 18345–18353.
67. Wilcox, D. E. (1996). Binuclear metallohydrolases, Chem. Rev., 96, pp.
2435–2458.
68. Zhu, X., Barman, A., Ozbil, M., Zhang, T., Li, S., and Prabhakar, R.
(2012). Mechanism of peptide hydrolysis by co-catalytic metal centers
containing leucine aminopeptidase enzyme: a DFT approach, J. Biol.
Inorg. Chem., 17, pp. 209–222.
69. Chen, S.-L., Marino, T., Fang, W.-H., Russo, N., and Himo, F. (2008).
Peptide hydrolysis by the binuclear zinc enzyme aminopeptidase from

www.ebook3000.com
March 21, 2016 13:47 PSP Book - 9in x 6in 15-Allan-Svendsen-c15

References 553

aeromonas proteolytica: a density functional theory study, J. Phys.


Chem. B, 112, pp. 2494–2500.
70. Liao, R.-Z., Yu, J.-G., and Himo, F. (2009). Reaction mechanism of
the dinuclear zinc enzyme N-acyl-L-homoserine lactone hydrolase: a
quantum chemical study, Inorg. Chem., 48, pp. 1442–1448.
71. Syren, P.-O., and Hult, K. (2010). Substrate conformations set the
rate of enzymatic acrylation by lipases, ChemBioChem, 11, pp. 802–
810.
72. Marino, T., Russo, N., and Toscano, M. (2013). Catalytic mechanism of
the arylsulfatase promiscuous enzyme from Pseudomonas aeruginosa,
Chem. Eur. J., 19, pp. 2185–2192.
73. Syren, P.-O., and Hult, K. (2011). Amidases have a hydrogen bond that
facilitates nitrogen inversion, but esterases have not, ChemCatChem, 3,
pp. 853–860.
74. Evans, J. C. (1960). Vibrational assignments and configurations of
aniline, aniline-NHD, and aniline-ND2, Spectrochim. Acta, 16, pp. 428–
442.
75. Belostotskii, A. M., Aped, P., and Hassner, A. (1997). MM3 force field
as a tool in mechanistic studies of nitrogen inversion processes
for alkylamines, J. Mol. Struct. THEOCHEM, 398–399, pp. 427–
434.
76. Syren, P.-O., Le Joubioux, F., Ben Henda, Y., Maugard, T., Hult, K., and
Graber, M. (2013). Proton shuttle mechanism in the transition state of
lipase-catalyzed N-acylation of amino alcohols, ChemCatChem, 5, pp.
1842–1853.
77. Dive, G., and Dehareng, D. (1999). Serine peptidase catalytic machin-
ery: cooperative one-step mechanism, Int. J. Quantum Chem., 73, pp.
161–174.
78. Diaz, N., Suarez, D., Sordo, T. L., and Merz, K. M., Jr. (2001). Acylation
of class A β-lactamases by penicillins: a theoretical examination of the
role of serine 130 and the β-lactam carboxylate group, J. Phys. Chem. B,
105, pp. 11302–11313.
79. Rangelov, M. A., Vayssilov, G. N., Yomtova, V. M., and Petkov, D. D.
(2006). The syn-oriented 2-OH provides a favorable proton transfer
geometry in 1,2-diol monoester aminolysis: implications for the
ribosome mechanism, J. Am. Chem. Soc., 128, pp. 4964–4965.
80. Liu, H., and Gauld, J. W. (2008). Substrate-assisted catalysis in the
aminoacyl transfer mechanism of histidyl-tRNA synthetase: a density
functional theory study, J. Phys. Chem. B, 112, pp. 16874–16882.
March 21, 2016 13:47 PSP Book - 9in x 6in 15-Allan-Svendsen-c15

554 Understanding Esterase and Amidase Reaction Specificities by Molecular Modeling

81. Wallin, G., and Aaqvist, J. (2010). The transition state for peptide bond
formation reveals the ribosome as a water trap, Proc. Natl. Acad. Sci. U
S A, 107, pp. 1888–1893.
82. Gindulyte, A., Bashan, A., Agmon, I., Massa, L., Yonath, A., and Karle, J.
(2006). The transition state for formation of the peptide bond in the
ribosome, Proc. Natl. Acad. Sci. U S A, 103, pp. 13327–13332.
83. Wang, Q., Gao, J., Liu, Y., and Liu, C. (2010). Validating a new
proton shuttle reaction pathway for formation of the peptide bond in
ribosomes: a theoretical investigation, Chem. Phys. Lett., 501, pp. 113–
117.
84. Acosta-Silva, C., Bertran, J., Branchadell, V., and Oliva, A. (2012).
Quantum-mechanical study on the mechanism of peptide bond
formation in the ribosome, J. Am. Chem. Soc., 134, pp. 5817–5831.
85. Trobro, S., and Aaqvist, J. (2005). Mechanism of peptide bond synthesis
on the ribosome, Proc. Natl. Acad. Sci. U S A, 102, pp. 12395–12400.
86. Oliva, M., Dideberg, O., and Field, M. J. (2003). Understanding
the acylation mechanisms of active-site serine penicillin-recognizing
proteins: a molecular dynamics simulation study, Proteins Struct.
Funct. Bioinf., 53, pp. 88–100.
87. Hou, T., and Yu, R. (2007). Molecular dynamics and free energy studies
on the wild-type and double mutant HIV-1 protease complexed with
amprenavir and two amprenavir-related inhibitors: mechanism for
binding and drug resistance, J. Med. Chem., 50, pp. 1177–1188.
88. Valina, A. L. B., Mazumder-Shivakumar, D., and Bruice, T. C. (2004).
Probing the Ser-Ser-Lys catalytic triad mechanism of peptide amidase:
computational studies of the ground state, transition state, and
intermediate, Biochemistry, 43, pp. 15657–15672.
89. Kazlauskas, R. J. (2000). Molecular modeling and biocatalysis: expla-
nations, predictions, limitations, and opportunities, Curr. Opin. Chem.
Biol. 4, pp. 81–88.
90. Guthrie, J. P. (1974). Hydration of carboxamides. Evaluation of the
free energy change for addition of water to acetamide and formamide
derivatives, J. Am. Chem. Soc., 96, pp. 3608–3615.
91. Cammenberg, M., Hult, K., and Park, S. (2006). Molecular basis for the
enhanced lipase-catalyzed N-acylation of 1-phenylethanamine with
methoxyacetate, ChemBioChem, 7, pp. 1745–1749.
92. Kourist, R., Bartsch, S., Fransson, L., Hult, K., and Bornscheuer, U. T.
(2008). Understanding promiscuous amidase activity of an esterase
from Bacillus subtilis, ChemBioChem, 9, pp. 67–69.

www.ebook3000.com
March 21, 2016 13:47 PSP Book - 9in x 6in 15-Allan-Svendsen-c15

References 555

93. Parry, M. A. A., Fernandez-Catalan, C., Bergner, A., Huber, R., Hopfner,
K.-P., Schlott, B., Guhrs, K.-H., and Bode, W. (1998). The ternary
microplasmin-staphylokinase-microplasmin complex is a proteinase-
cofactor-substrate complex in action, Nat. Struct. Biol., 5, pp. 917–923.
94. Syren, P.-O. (2013). The solution of nitrogen inversion in amidases,
FEBS J., 280, pp. 3069–3083.
95. Torres, L. L., Schliessmann, A., Schmidt, M., Silva-Martin, N., Hermoso,
J. A., Berenguer, J., Bornscheuer, U. T., and Hidalgo, A. (2012). Promis-
cuous enantioselective (-)-γ -lactamase activity in the Pseudomonas
fluorescens esterase I, Org. Biomol. Chem., 10, pp. 3388–3392.
96. Hackenschmidt, S., Moldenhauer, E. J., Behrens, G. A., Gand, M., Pavlidis,
I. V., and Bornscheuer, U. T. (2014). Enhancement of promiscuous
amidase activity of a Bacillus subtilis esterase by formation of a π -π
network, ChemCatChem, 6, pp. 1015–1020.
97. Oliva, C., Rodriguez, A., Gonzalez, M., and Yang, W. (2007). A quantum
mechanics/molecular mechanics study of the reaction mechanism
of the hepatitis C virus NS3 protease with the NS5A/5B substrate,
Proteins, 66, pp. 444–455.
98. Blumberger, J., Lamoureux, G., and Klein, M. L. (2007). Peptide
hydrolysis in thermolysin: ab initio QM/MM investigation of the
Glu143-assisted water addition mechanism, J. Chem. Theory Comput.,
3, pp. 1837–1850.
99. Ishida, T., and Kato, S. (2003). Theoretical perspectives on the reaction
mechanism of serine proteases: the reaction free energy profiles of the
acylation process, J. Am. Chem. Soc., 125, pp. 12035–12048.
100. Topf, M., and Richards, W. G. (2004). Theoretical studies on the
deacylation step of serine protease catalysis in the gas phase, in
solution, and in elastase, J. Am. Chem. Soc., 126, pp. 14631–14641.
101. Gesto, D. S., Cerqueira, N. M. F. S. A., Fernandes, P. A., and Ramos, M. J.
(2013). Unraveling the enigmatic mechanism of L-asparaginase II with
QM/QM calculations, J. Am. Chem. Soc., 135, pp. 7146–7158.
102. Wei, D., Lei, B., Tang, M., and Zhan, C.-G. (2012). Fundamental reaction
pathway and free energy profile for inhibition of proteasome by
epoxomicin, J. Am. Chem. Soc., 134, pp. 10436–10450.
103. Lodola, A., Capoferri, L., Rivara, S., Tarzia, G., Piomelli, D., Mulholland,
A., and Mor, M. (2013). quantum mechanics/molecular mechanics
modeling of fatty acid amide hydrolase reactivation distinguishes
substrate from irreversible covalent inhibitors, J. Med. Chem., 56, pp.
2500–2512.
March 21, 2016 13:47 PSP Book - 9in x 6in 15-Allan-Svendsen-c15

556 Understanding Esterase and Amidase Reaction Specificities by Molecular Modeling

104. Tubert-Brohman, I., Acevedo, O., and Jorgensen, W. L. (2006). Elucida-


tion of hydrolysis mechanisms for fatty acid amide hydrolase and its
Lys142Ala variant via QM/MM simulations, J. Am. Chem. Soc., 128, pp.
16904–16913.
105. Guo, H., Wlodawer, A., and Guo, H. (2005). A General acid-base
mechanism for the stabilization of a tetrahedral adduct in a serine-
carboxyl peptidase: a computational study, J. Am. Chem. Soc., 127, pp.
15662–15663.
106. Xu, Q., Guo, H., Wlodawer, A., and Guo, H. (2006). The importance of
dynamics in substrate-assisted catalysis and specificity, J. Am. Chem.
Soc., 128, pp. 5994–5995.
107. Xu, Q., Li, L., and Guo, H. (2010). Understanding the mechanism
of deacylation reaction catalyzed by the serine carboxyl peptidase
kumamolisin-As: insights from QM/MM free energy simulations, J.
Phys. Chem. B, 114, pp. 10594–10600.
108. Ma, S., Devi-Kesavan, L. S., and Gao, J. (2007). Molecular dynamics
simulations of the catalytic pathway of a cysteine protease: a combined
QM/MM study of human cathepsin K, J. Am. Chem. Soc., 129, pp. 13633–
13645.
109. Zhou, Y., Wang, S., and Zhang, Y. (2010). Catalytic reaction mechanism
of acetylcholinesterase determined by Born-Oppenheimer ab initio
QM/MM molecular dynamics simulations, J. Phys. Chem. B, 114, pp.
8817–8825.
110. Corminboeuf, C., Hu, P., Tuckerman, M. E., and Zhang, Y. (2006). Un-
expected deacetylation mechanism suggested by a density functional
theory QM/MM study of histone-deacetylase-like protein, J. Am. Chem.
Soc., 128, pp. 4530–4531.
111. Klusak, V., Barinka, C., Plechanovova, A., Mlcochova, P., Konvalinka,
J., Rulisek, L., and Lubkowski, J. (2009). reaction mechanism of
glutamate carboxypeptidase II revealed by mutagenesis, X-ray crys-
tallography, and computational methods, Biochemistry, 48, pp. 4126–
4138.
112. Zheng, M., and Xu, D. (2013). New Delhi betallo-β-lactamase I:
substrate binding and catalytic mechanism, J. Phys. Chem. B, 117, pp.
11596–11607.
113. Hermann, J. C., Hensen, C., Ridder, L., Mulholland, A. J., and Hoeltje, H.-
D. (2005). Mechanisms of antibiotic resistance: QM/MM modeling of
the acylation reaction of a class A β-lactamase with benzylpenicillin, J.
Am. Chem. Soc., 127, pp. 4454–4465.

www.ebook3000.com
March 21, 2016 13:47 PSP Book - 9in x 6in 15-Allan-Svendsen-c15

References 557

114. Fonseca, F., Chudyk, E. I., van, d. K. M. W., Correia, A., Mulholland, A. J.,
and Spencer, J. (2012). The basis for carbapenem hydrolysis by class
A β-lactamases: a combined investigation using crystallography and
simulations, J. Am. Chem. Soc., 134, pp. 18275–18285.
115. Bjelic, S., and Aaqvist, J. (2006). Catalysis and linear free energy
relationships in aspartic proteases, Biochemistry, 45, pp. 7709–
7723.
116. Suguna, K., Padlan, E. A., Smith, C. W., Carlson, W. D., and Davies,
D. R. (1987). Binding of a reduced peptide inhibitor to the aspartic
proteinase from Rhizopus chinensis: implications for a mechanism of
action, Proc. Natl. Acad. Sci. U S A, 84, pp. 7009–7013.
117. Fulop, V., Szeltner, Z., Renner, V., and Polgar, L. (2001). Structures of
prolyl oligopeptidase substrate/inhibitor complexes: use of inhibitor
binding for titration of the catalytic histidine residue, J. Biol. Chem.,
276, pp. 1262–1266.
118. Drenth, J., Kalk, K. H., and Swen, H. M. (1976). Binding of chloromethyl
ketone substrate analogs to crystalline papain, Biochemistry, 15, pp.
3731–3738.
119. Matthews, B. W. (1988). Structural basis of the action of thermolysin
and related zinc peptidases, Acc. Chem. Res., 21, pp. 333–340.
120. Tronrud, D. E., Holden, H. M., and Matthews, B. W. (1987). Structures of
two thermolysin-inhibitor complexes that differ by a single hydrogen
bond, Science, 235, pp. 571–574.
121. Atanasov, B. P., Mustafi, D., and Makinen, M. W. (2000). Protonation
of the β-lactam nitrogen is the trigger event in the catalytic action
of class A β-lactamases, Proc. Natl. Acad. Sci. U S A, 97, pp. 3160–
3165.
122. Wijma, H. J., and Janssen, D. B. (2013). Computational design gains
momentum in enzyme catalysis engineering, FEBS J., 280, pp. 2948–
2960.
123. Lightstone, F. C., and Bruice, T. C. (1994). Geminal dialkyl substitution,
intramolecular reactions, and enzyme efficiency, J. Am. Chem. Soc., 116,
pp. 10789–10790.
124. Syren, P.-O., Lindgren, E., Hoeffken, H. W., Branneby, C., Maurer,
S., Hauer, B., and Hult, K. (2010). Increased activity of enzymatic
transacylation of acrylates through rational design of lipases, J. Mol.
Catal. B: Enzym., 65, pp. 3–10.
125. Syren, P.-O., Hendil-Forssell, P., Aumailley, L., Besenmatter, W., Gounine,
F., Svendsen, A., Martinelle, M., and Hult, K. (2012). Esterases with an
March 21, 2016 13:47 PSP Book - 9in x 6in 15-Allan-Svendsen-c15

558 Understanding Esterase and Amidase Reaction Specificities by Molecular Modeling

introduced amidase-like hydrogen bond in the transition state have


increased amidase specificity, ChemBioChem, 13, pp. 645–648.
126. Palermo, G., Campomanes, P., Cavalli, A., Rothlisberger, U., and De Vivo,
M. (2014). Anandamide hydrolysis in FAAH reveals a dual strategy for
efficient enzyme-assisted amide bond cleavage via nitrogen inversion,
J. Phys. Chem. B., 119, pp. 789–801.

www.ebook3000.com
March 23, 2016 13:4 PSP Book - 9in x 6in 16-Allan-Svendsen-c16

PART III

ENZYME DIVERSITY


Chapter 16

Toward New Nonnatural TIM-Barrel Enzymes Using Computational Design
and Directed Evolution Approaches

Mirja Krause (a) and Rik K. Wierenga (b)

(a) Faculty of Biotechnology, Department of Bioprocess Engineering,
Technical University Berlin, Ackerstrasse 76, 13355 Berlin, Germany
(b) Faculty of Biochemistry and Molecular Medicine, Biocenter Oulu,
University of Oulu, 90014 Oulu, Finland

rik.wierenga@oulu.fi

The fascinating biocatalytic properties of enzymes are briefly
introduced, taking triosephosphate isomerase (TIM) as an example.
Recent developments in computational design and directed evolu-
tion approaches for generating nonnatural enzymes are reviewed.
Many examples of successful protein engineering toward optimized
enzyme activities are discussed. Such studies done with TIM-barrel
enzymes are discussed in greater detail, including the development
of a highly proficient nonnatural Kemp eliminase. The common
active site features of Kemp eliminase and TIM, such as the catalytic
base and the oxyanion hole, are discussed. The advances in this field
suggest that other nonnatural enzymes with tailor-made catalytic
properties will become available in the near future. These enzymes
with optimal stability and substrate specificity will be of great value
in many applications.

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

16.1 Introduction

Enzymes are proteins with catalytic properties. These catalytic
properties have evolved by natural selection as needed to permit
survival of the host organism. The catalytic efficiencies of enzymes
allow for enormous rate enhancements with respect to the
uncatalyzed reactions, as has been documented in particular by
Radzicka and Wolfenden [1]. They have shown that some enzymes
can enhance the reaction rate 10^17-fold compared to the
uncatalyzed reaction. This makes enzymes important for all living
organisms and also attractive for applications in industrial and
lab-scale production systems of different kinds. Enzymes have
the ability to catalyze a tremendous spectrum of reactions, from
simple hydrolysis to more complex reactions such as the synthesis
of larger complex compounds. Additionally, because of their
usually high chemo-, regio-, and stereospecificities, which remove
the need for protecting groups, they have a significant advantage
over conventional catalysts [2, 3]. The catalytic
properties of enzymes have fascinated researchers ever since they
were discovered [4] and also provide a strong incentive for using
enzymes in practical applications [5]. Nowadays many enzymes are
already applied in industrial processes for the production of phar-
maceuticals, renewable fuels, fine chemicals, and other products.
They also play an important role in bulk processes like brewing and
waste and water treatment and in the degradation of toxic
substances [6, 7].
In recent years computational and experimental approaches
toward obtaining engineered nonnatural enzymes have made very
significant advances, as will be discussed in this chapter. Currently,
natural enzymes are already used in many large-scale industrial
processes. Enzymes are also frequently used in lab-scale synthesis
[8], for example, to make labile substrates [9] or to make substrates
that are not commercially available [10, 11]. It is expected that the
use of enzymes will also increase in the synthesis routes by which
complicated pharmaceuticals are made [12, 13].


Figure 16.1 Transition-state stabilization of the catalytic cycle of triose-
phosphate isomerase. The continuous line concerns the uncatalyzed
reaction. The dotted line concerns the enzyme-catalyzed reaction. In this
graph the variation of the free energy (G, Y axis) is plotted as a function of
the reaction coordinate (X axis). The binding event and the product release
step of the enzyme-catalyzed reaction are included in the free-energy
profile. With kind permission from Springer Science+Business Media: From
Ref. [15].

The basic mechanism of the biocatalytic properties of enzymes,
as originally proposed by Linus Pauling [14], concerns the stabiliza-
tion of the enzyme–transition-state complex relative to the enzyme–
substrate complex. Through this mechanism the high free-energy
barriers of the reaction cycle are reduced significantly, and therefore,
the reaction rates are increased greatly. This is visualized in Fig. 16.1
for the relatively simple reaction catalyzed by triosephosphate
isomerase (TIM). The most difficult step in this reaction (Fig. 16.2)
is the abstraction of a proton from a carbon. In this example
the free-energy barrier is about 26.0 kcal/mol for a simple base
catalysis, whereas for the reaction catalyzed by the catalytic base (a
glutamate), it is about 13.5 kcal/mol. The lowering of the transition-
state barrier by 12.5 kcal/mol makes the catalyzed reaction about
10^9 times faster than the uncatalyzed reaction [16].
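
The link between the barrier lowering and the rate enhancement quoted
above can be checked with a short calculation. The sketch below is an
illustration only and is not taken from Ref. [16]: it assumes a simple
transition-state-theory picture in which both reactions share the same
pre-exponential factor, so that the rate ratio is exp(ΔΔG‡/RT), and it
assumes a temperature of 298.15 K. The function name and the constant
name are chosen here for illustration.

```python
import math

# Molar gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def rate_enhancement(barrier_uncat, barrier_cat, temperature=298.15):
    """Estimate k_cat / k_uncat from two free-energy barriers (kcal/mol).

    Assumes identical pre-exponential factors for both reactions, so only
    the difference in barrier height matters. The default temperature of
    298.15 K is an assumption, not a value given in the text.
    """
    barrier_lowering = barrier_uncat - barrier_cat
    return math.exp(barrier_lowering / (R_KCAL * temperature))


# Barriers quoted in the text for the TIM proton abstraction step.
ratio = rate_enhancement(26.0, 13.5)
print(f"estimated rate enhancement: {ratio:.1e}")
```

Under these assumptions a lowering of 12.5 kcal/mol corresponds to a
rate ratio on the order of 10^9, consistent with the value stated in
the text.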

Figure 16.2 Catalytic mechanism of TIM. Visualization of the chemistry of
the reaction catalyzed by TIM, using DHAP as the substrate. The catalytic
glutamate is Glu167 in trypanosomal TIM (in other well-studied TIMs, like
chicken and yeast TIM, it is Glu165). In the TIM reaction the catalytic base
abstracts the proton from the C1 atom of DHAP (generating the cis-enediolate
intermediate), and subsequently it delivers the proton to the adjacent C2
atom, generating the product D-GAP. The comparison of the available atomic
resolution structures of TIM [15] has shown that the glutamate side chain
adopts slightly different conformations in the different complexes,
facilitating the proton-shuttling functionality. The negative charge that
develops on the O2 atom is stabilized by the side chains of Lys13 and His95,
functioning as the oxyanion hole. With kind permission from Springer
Science+Business Media: From Ref. [15].

Transition-state theory implies that the enzyme binds the transition
state much more strongly than the substrate. Consequently,
it is predicted that the preorganized geometry of the active site
of enzymes has a very high affinity for molecules that capture
the key molecular and atomic features of these transition states
[17]. Such molecules are referred to as transition-state analogs.
The work by Schramm and coworkers [11] has indeed shown
that this hypothesis is correct, using as an example an
enzyme of nucleotide metabolism (5′-methylthioadenosine/S-
adenosylhomocysteine nucleosidase [MTAN]), for which femtomo-
lar inhibitors could be developed using key structural data of the
transition state for the design of these compounds. These molecules
have 10^8-fold higher affinities for the enzyme than the substrate. The
implications of this insight into enzyme reaction mechanisms also
affect the possible strategies to develop new nonnatural enzymes
in silico and in vitro. At the molecular level the problem of enzyme
design can conceptually be simplified to a docking exercise in which,
on a suitable protein framework, a molecular pocket has to be
designed that is optimally complementary to the transition-state
analog. It is generally recognized that electrostatic properties
of active sites of enzymes are important in this respect [18]. The
required docking exercise is a major computational challenge in
itself. In addition, incorporating into this approach the
dynamical properties of the enzyme, which are also known to be
key features of naturally evolved enzymes [19, 20], is
challenging. This naturally evolved flexibility explains
why mutations far away from the catalytic site are important for the
biocatalytic properties of enzymes, as has been described for several
systems. In particular, dihydrofolate reductase (DHFR) has been
well studied in this respect [21].
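
To give a feeling for the size of the 10^8-fold affinity difference
mentioned above for transition-state analogs, it can be expressed as a
difference in binding free energy, ΔΔG = RT ln(ratio). The sketch below
is an illustration under the stated assumptions, not a calculation from
Ref. [11]: the temperature of 298.15 K is assumed, and the function and
constant names are hypothetical.

```python
import math

# Molar gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def binding_free_energy_gain(affinity_ratio, temperature=298.15):
    """Convert an affinity ratio into a binding free-energy difference.

    affinity_ratio is how many times more tightly one ligand binds than
    another (for example, K_d of the substrate divided by K_d of the
    analog). The result is in kcal/mol; 298.15 K is an assumed default.
    """
    if affinity_ratio <= 0:
        raise ValueError("affinity_ratio must be positive")
    return R_KCAL * temperature * math.log(affinity_ratio)


# The 10^8-fold tighter binding quoted in the text.
gain = binding_free_energy_gain(1e8)
print(f"extra binding free energy: {gain:.1f} kcal/mol")
```

With these assumptions a 10^8-fold affinity gain corresponds to roughly
11 kcal/mol of additional binding free energy.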
Numerous specific active site tools contribute to the catalytic
power of enzymes in various ways [22]:

(i) Classic examples are oxyanion holes, very often formed by main
chain peptide –NH groups, for example, in proteases, stabilizing
the negatively charged tetrahedral intermediates.
(ii) Catalytic bases (like glutamates and aspartates) facilitate the
abstraction of a proton from a carbon, which is the
rate-limiting step in many reactions, as is also the case for the
model enzyme TIM [16].
(iii) Nucleophiles, like the deprotonated cysteine or serine side
chains, are important when covalent intermediates occur in
the catalytic cycle, for example, in the cysteine and serine
proteases, respectively [23].

Studies aimed at obtaining efficient nonnatural enzymes contribute
greatly to our understanding of how the catalytic tools
in enzyme active sites generate the catalytic power of enzymes.
These enzyme engineering studies consist of various experimental
approaches, collectively referred to as protein engineering. In this
chapter we will first summarize the protein engineering approaches
that are currently being used for developing efficient nonnatural
enzymes. Subsequently, we discuss in some detail the various
enzyme engineering attempts that have been made using TIM-
barrel frameworks. In that section we will also briefly discuss the
importance and challenges of the computational approaches that have
been used in these TIM-barrel studies.

16.2 General Aspects of Protein Engineering

Natural enzymes function optimally in an aqueous environment, under
ambient conditions, and are perfectly adjusted to their physio-
logical roles. Many enzymes therefore function optimally at
atmospheric pressure and room temperature. Such mild conditions
are viewed as environmentally friendly and energy efficient [24].
Many of the processes in which enzymes are used as catalysts
employ unnatural conditions, which differ from the enzyme's
natural habitat and optimal environment and drastically reduce the
efficiency of the employed biocatalysts. Therefore, the adaptation of
natural enzymes to their biological context limits the efficient
practical application of enzymes. Enzymes usually exhibit low
stability (e.g., in terms of temperature, pH, solvent concentrations,
etc.) and are easily inhibited by small amounts of the product or
other substances present in the process. Some enzymes also require
high amounts of a cofactor, which leads to high production costs.
One way to address this problem is bioprospecting, which refers to
the discovery of enzymes with special properties in the genomes
of a wide range of organisms, and the evaluation of how these
enzymes could be exploited in industrial processes.
In particular, enzymes of extremophiles are of interest. These
organisms live and have evolved under extreme environmental
conditions such as high temperature and high pressure. However,
bioprospecting is limited by the diversity of enzymatic reactions
present in nature [25]. Molecular biology and synthetic biology
methodologies have developed greatly in the last 20 years and have
made engineering of proteins possible. Changing (mutating) the DNA
sequence encoding a specific enzyme, and hence modifying the
amino acid composition of the protein sequence, directly influences
the structure and function of a protein. These techniques were first
employed during the early 1980s and have been improved and de-
veloped for the creation of artificial enzymes in the laboratory [26].
Nowadays protein or enzyme engineering protocols consist of
a large toolbox of methodologies. Many attempts aim to improve
enzyme stability, that is, to increase the tolerance toward
substances like detergents and organic solvents and also toward
oxidation, extreme pressures, temperatures, or pH values. Other
approaches target the stereospecificity and catalytic turnover or aim
to broaden or change substrate specificity. However, protein engi-
neering does not only target the features of already existing enzymes;
attempts are also made to create entirely new catalytic
reactions not present in nature, which are interesting for various
large-scale and small-scale applications. Enzymes that can produce
certain modified molecules that cannot be easily obtained by chem-
ical synthesis are important for the production of fine chemicals.
Protein engineering is also employed to investigate fundamental
biological aspects, for example, reaction mechanisms and enzyme–
substrate or protein–protein interactions as well as the relation
between structure and function. The knowledge gathered will address
one of the biggest bottlenecks in protein engineering as it will facilitate
the design of nonnatural enzymes with reaction rates comparable
to those found in nature. Most protein engineering projects start
from a known scaffold. Yet, with an increasing number of fast and
reliable algorithms available, some enzymes are created de novo,
that is, not on the basis of any known structure [27]. These
approaches usually start in silico with substrate-based designs and
are transferred to the laboratory only in later stages. An important
aspect of this approach is that the stability of the new sequence also
needs to be considered, as also discussed in Section 16.3.2.
The very first step of protein engineering, the selection of
the target framework (on which to build the active site), is of
critical importance. Also the mutagenesis method requires careful
consideration as this influences the success of the creation of desired
new enzymes significantly. An important issue is also the assay
system that can be used to assess the properties of the new variants.
Unfortunately, the newly created or redesigned enzymes usually
do not display catalytic activity rates required in industry. As studies
show the success rate of a protein engineering approach is higher
if the targeted enzyme does already display the desired activity,
for example, as a promiscuous activity (usually at a low rate)
or at least catalyzes the desired reaction converting a chemically
and structurally similar substrate [24]. It has even been suggested
that for a successful approach this is absolutely essential, and as
a continuation of the first rule for directed evolution (you get
what you select for) there should be a second rule: “You should
select for what is already there” [28]. The experimental part of
protein engineering always comprises (i) the gene library creation
through mutagenesis and (ii) the subsequent library selection.
These two steps are intrinsically linked to each other, being equally
important for the successful creation of functional nonnatural
enzymes. Usually, they are repeated in iterative cycles before the
desired change in trait is achieved. The methods for obtaining
optimal libraries (Sections 16.2.1 to 16.2.4) and for finding the better variants
of these libraries (Section 16.2.5) are discussed below. The methods for
designing optimal libraries can be divided into three categories,
which can be seen in Fig. 16.3: (1) structure-based design (blue
box on the left), (2) directed evolution (green box on the right)
and (3) data-driven design (orange arrows). A combination of the
structure-based design followed by a directed evolution approach
is referred to as a data-driven design. From the obtained libraries
one can find the desired new variants by employing one of the two
protein engineering methods: (1) protein engineering by selection
or (2) protein engineering by screening (Table 16.1), as discussed in
Section 16.2.5.

Figure 16.3 Overview of the approaches used in protein engineering of
enzymes. The three approaches for a protein engineering project are shown.
The methodologies and procedures are displayed as follows: (left) the
light-blue box shows a structure-based design; (right) the green box shows
directed evolution, and the orange arrows symbolize the data-driven design.
A data-driven design is a combination of rational design followed by
directed evolution. The recent work on the Kemp eliminases, as discussed
in Section 16.3, suggests that the combined approach is the best possible
protocol, whenever suitable structural information is available. Reprinted
from Ref. [29], Copyright (2001), with permission from Elsevier.

Table 16.1 Overview of library selection and screening methodologies

| Technology | Description | Diversities | Amplification/synthesis | Example references |
|---|---|---|---|---|
| In vivo selection | Protein is expressed inside a cell, and the activity of this enzyme is essential for cell survival. | 10⁹ to 10¹⁰ | In vivo/bacteria | [30–33] |
| Cell surface display: (i) bacterial display | Target proteins are displayed on the cell surface of Escherichia coli. | 10⁸ to 10⁹ | In vivo/bacteria | [34, 35] |
| Cell surface display: (ii) yeast display | Target proteins are fused to a yeast surface protein. | 10⁹ | In vivo/yeast | [36, 37] |
| Ribosome/mRNA display | The ribosome is stalled, and mRNA, protein, and ribosome form a ternary complex; after selection RT-PCR allows amplification. | 10¹⁵ | In vitro/cell-free expression | [38–44] |
| Phage display | Target proteins are displayed with phage proteins on the phage surface. | 10⁶ to 10¹¹ | In vivo/bacteria; in vitro/cell-free expression | [45–48] |
| In vitro compartmentalization | Imitates the compartmentalization of living cells, and genes labeled with substrate are expressed. The protein (still bound to the DNA) converts the substrate to the product. | 10⁸ to 10¹¹ | In vitro/cell-free expression | [38, 49, 50, 52, 53] |

16.2.1 Library Creation Methods

In a protein engineering approach the first experimental step is the
creation of a gene library by making mutations into the encoding
DNA sequence. Making mutations randomly over a certain sequence
can result in very large gene libraries. In the past it was thought that
to create the highest variation it is best to create libraries as large
as possible. One bottleneck of such an approach, if in vivo selection
is applied, is the transformation efficiency of the selection strain.
Therefore, nowadays most approaches try to obtain smaller, smarter
libraries. However, to do that it is important to use prediction
tools to optimize and focus the area targeted by mutagenesis [54,
55]. Small libraries are crucial if activities that are challenging to
measure are needed. Nevertheless, large libraries are still thought
to be essential if selecting for high substrate selectivity [50]. One
has to consider not only the size of the gene library created with
a certain methodology but also the quality of the library. Most
mutagenesis methodologies exhibit a certain mutational bias. The
bias can be either an error bias (type of nucleotide mutations
present) [56], a codon bias (amino acid changes observed in the
protein) [57], or an amplification bias (distribution of distinct
sequences in the library) [58]. Over the last decade methods have
been developed that specifically addressed these problems. Some
of them have been reviewed in detail by Shivange et al. [59]
(e.g., error-prone polymerase chain reaction [epPCR] employing
8-hydroxy-dGTP, trinucleotide exchange (TriNEx), transversion-
enriched sequence saturation mutagenesis (SeSaM-Tv), and iterative
saturation mutagenesis (ISM)). A more recent review with a broader
perspective was published in 2013 by Tee and Wong [60].
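The trade-off between library size and transformation efficiency can be made concrete with a rough calculation. The following Python sketch is an illustration added here and is not taken from the cited references. It assumes NNK codon randomization (32 codons per randomized position) and uniform sampling of clones without amplification bias, so that the expected fraction of distinct variants found among T transformants of a library with V variants is 1 − exp(−T/V). All function names are hypothetical.

```python
import math

def nnk_library_size(positions: int) -> int:
    """Number of distinct DNA sequences when `positions` codons are
    randomized with NNK degeneracy (4 * 4 * 2 = 32 codons per position)."""
    return 32 ** positions

def expected_coverage(transformants: float, variants: float) -> float:
    """Expected fraction of distinct variants sampled at least once,
    assuming every variant is equally likely (no amplification bias)."""
    return 1.0 - math.exp(-transformants / variants)

def transformants_needed(variants: float, coverage: float = 0.95) -> float:
    """Transformants required to reach the requested expected coverage."""
    return -variants * math.log(1.0 - coverage)

if __name__ == "__main__":
    for n in (1, 2, 4, 6, 7):
        v = nnk_library_size(n)
        print(f"{n} positions: {v:.2e} variants, "
              f"{transformants_needed(v):.2e} transformants for 95% coverage")
```

Under these assumptions roughly three transformants per variant are needed for 95% coverage. Randomizing seven codons at once would then call for about 10¹¹ transformants, which is beyond the 10⁹ to 10¹⁰ diversities listed for in vivo selection in Table 16.1.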
In conclusion, in a protein engineering approach all parts are linked
tightly with one another. First, the target protein influences the
choice of methodologies that can be applied for library creation
and selection procedure. Second, the decision on how to create
the gene library (the method will influence the library size) is
directly linked to the choice and availability of a selection/screening
system.
The number of studies employing protein engineering to achieve
different goals is large, and many examples can be found in the
literature. The different approaches can be divided into three
categories: structure-based design, directed evolution, and data-
driven design (Fig. 16.3). The latter is a combination or the
subsequent use of the first two. Recent reviews provide an overview
of the methods and techniques available as well as interesting target
proteins [60–64].

16.2.2 Structure-Based Library Design


Structure-based library design (sometimes referred to as rational
design) is used for many different purposes, for example, to
change substrate or cofactor specificity and catalytic mechanism
and also to improve enantioselectivity and enzyme stability [29].
The steps necessary for such an approach and to obtain mutant
variants can be seen in Fig. 16.3 (blue box, left). In contrast to
directed evolution, structure-based design is always based on the
available protein structural and biophysical data [65]. Diversity
is introduced at specific positions within the synthetic DNA. This
can be achieved by site-directed mutagenesis, the incorporation of
partially randomized synthetic DNA cassettes into genes via PCR or
direct cloning [57].
A convenient one-step PCR method for this kind of site-directed
mutagenesis is the megaprimer PCR [66]. In addition to two flanking
primers it requires one mutational primer. This mutational primer
contains undefined bases, the so-called wobbles, instead of the
original bases in the DNA sequence targeted for mutation. If the
flanking primers have restriction site overhangs, the obtained DNA
fragment can be easily cloned into a vector backbone after the PCR,
creating a gene library.
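How much diversity a wobble-containing primer introduces depends on the degenerate codon chosen. As a minimal sketch (added for illustration, not part of the megaprimer protocol in Ref. [66]), the Python code below expands a degenerate codon written in IUPAC nucleotide codes and reports how many codons, amino acids, and stop codons it encodes under the standard genetic code. The helper names `expand_codon` and `summarize` are hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes used in degenerate ("wobble") primers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def expand_codon(degenerate: str) -> list[str]:
    """All concrete codons encoded by a three-letter degenerate codon."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in degenerate))]

def summarize(degenerate: str) -> tuple[int, int, int]:
    """Return (number of codons, distinct amino acids, stop codons)."""
    translated = [CODON_TABLE[c] for c in expand_codon(degenerate)]
    amino_acids = {aa for aa in translated if aa != "*"}
    return len(translated), len(amino_acids), translated.count("*")

if __name__ == "__main__":
    for codon in ("NNN", "NNK", "NDT"):
        print(codon, summarize(codon))
```

Comparing NNN with NNK shows why restricted wobbles are popular: the same 20 amino acids are reached with half the codons and fewer stop codons, which keeps the library smaller.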
If protein structure data are available, one can also apply the B-
factor iterative test (B-FIT) method, which considers the B-factors
of a structure. B-factors account for the thermal motion of atoms.
A high average B-factor of a certain amino acid residue signifies a
higher flexibility, as this residue has a low number of contacts to
other amino acid residues [67]. In a study by Gumulya and Reetz
[68] the B-factors of all residues were analyzed and residues with
the highest B-factors were considered and subsequently targeted
by ISM. This led to increased thermal robustness of an epoxide
hydrolase of Aspergillus niger. The B-FIT method has also been
successfully applied to improve the tolerance of a lipase from
Bacillus subtilis to hostile organic solvents [69].
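The residue-ranking step of such a B-factor analysis can be sketched in a few lines of Python. This is a simplified illustration and not the procedure of Refs. [68, 69]: it assumes that per-atom B-factors have already been read from a structure file into (residue, B-factor) pairs, and the residue names and values below are made up.

```python
from collections import defaultdict

def rank_residues_by_b_factor(atoms, top_n=3):
    """Average the B-factors per residue and return the `top_n` residues
    with the highest averages, i.e., the most flexible ones.

    `atoms` is an iterable of (residue_id, b_factor) pairs, one per atom.
    """
    per_residue = defaultdict(list)
    for residue_id, b_factor in atoms:
        per_residue[residue_id].append(b_factor)
    averages = {res: sum(v) / len(v) for res, v in per_residue.items()}
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]

# Made-up atom records for illustration (not taken from any real structure).
toy_atoms = [
    ("Ala10", 12.0), ("Ala10", 14.0),
    ("Lys57", 41.0), ("Lys57", 45.0), ("Lys57", 49.0),
    ("Gly88", 30.0), ("Gly88", 34.0),
    ("Trp120", 9.0), ("Trp120", 11.0),
]

if __name__ == "__main__":
    for residue, mean_b in rank_residues_by_b_factor(toy_atoms, top_n=2):
        print(residue, round(mean_b, 1))
```

The highest-ranked residues would then be the candidates for saturation mutagenesis in the next ISM round.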
In some cases the mutation of a single amino acid residue is
enough to establish a new activity on an enzyme. To enable the
muconate lactonizing enzyme II from Pseudomonas sp. P51 (MLE
II) and the L-Ala-D/L-Glu epimerase from E. coli (AEE) to catalyze
the o-succinylbenzoate synthase (OSBS) reaction, a single-residue
substitution sufficed. The residue to be mutated was found by
structure comparison of both TIM-barrel enzymes. Neither of the
wild-type enzymes is promiscuous for the OSBS reaction [70].
Structure-based design depends on the availability of methods
to choose the sites to be targeted for randomization. Furthermore,
these methods rely strongly on sequence, biophysical, and structural
data. They usually can yield much smaller and more focused
libraries, maximizing the proportion of functional variants present
[71]. Structure-guided targeted mutagenesis was recently used suc-
cessfully to evolve the stereoselectivity of hydrolases for the detoxifi-
cation of nerve agents; the structure-based design approach made
the creation of smaller and more focused libraries possible [72].
One of the most apparent limitations of these structure-based
methods, though, is that they fail to consider remote mutations.
Mutations far away from the target area (usually a binding or an
active site) can influence the desired feature positively [29, 73]. This
was seen in particular for the DHFR, but many other examples can be
found in literature: for example, the directed evolution of the human
estrogen receptor, where mutations far away from the ligand-binding
pocket greatly influenced the ligand affinity [36].
Over the last decade more complex computer algorithms have
been developed, which mimic natural evolution in its iterative
nature with an enormous throughput. This makes it possible to
access areas in sequence space that are impossible to reach with
purely experimental methodologies [74]. Noteworthy algorithms
are protein sequence–activity relationships (ProSAR), SCHEMA, and
ROSETTA. ProSAR sorts mutations according to their impact on
enzyme activity. The algorithm gives rise to constantly improved
variants by keeping favorable mutations while undergoing iterative
evolution cycles. SCHEMA predicts which polypeptide elements can
be exchanged on the basis of structural information. It estimates
the extent of disruption (breaking of residue–residue contacts)
that is caused upon DNA recombination and crossover events.
Finally, the screening is directed toward those chimeras with lower
structural disruption. Arnold and coworkers have successfully em-
ployed SCHEMA in several studies, confirming that their approach
efficiently generated a family of diverse, folded proteins even with
low sequence identity between parental genes [75–77]. ROSETTA
can be employed to design a novel enzyme from a preexisting
template/scaffold. An active site is designed around a catalytic
reaction’s transition state [54]. This approach will be discussed
in more detail in the last section of this chapter. Structure-based
design is a powerful tool for protein engineering that will improve
with the advancement of the algorithms available.
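The disruption estimate described above for SCHEMA can be illustrated with a toy calculation. The sketch below is a simplified reading of the idea and not the published implementation: it assumes aligned parental sequences of equal length and a precomputed list of contacting position pairs, and it counts a contact as broken when the residue pair found in the chimera does not occur at those positions in any parent. Sequences and contacts are invented.

```python
def schema_disruption(parents, assignment, contacts):
    """Build a chimera and count the residue-residue contacts it breaks.

    parents    : list of aligned, equal-length parental sequences
    assignment : for each position, the index of the parent it is taken from
    contacts   : list of (i, j) position pairs in contact in the structure
    A contact counts as broken when the chimera's residue pair at (i, j)
    does not occur at those positions in any parent.
    """
    chimera = "".join(parents[p][pos] for pos, p in enumerate(assignment))
    broken = 0
    for i, j in contacts:
        pair = (chimera[i], chimera[j])
        if all((p[i], p[j]) != pair for p in parents):
            broken += 1
    return chimera, broken

# Invented example data: two parents that differ at positions 1, 3 and 5.
toy_parents = ["ACDEFG", "AKDQFW"]
toy_contacts = [(1, 3), (3, 5), (0, 2)]

if __name__ == "__main__":
    # Single crossover after position 2: contact (1, 3) now pairs C with Q.
    print(schema_disruption(toy_parents, [0, 0, 0, 1, 1, 1], toy_contacts))
```

Chimeras with a low count would be the ones prioritized for screening. Contacts between positions that are identical in all parents can never be broken, which is why crossovers in conserved regions are cheap in this model.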

16.2.3 Optimal Libraries for Directed Evolution Methods


Directed evolution mimics natural evolution with its iterative cycles
of mutagenesis followed by selection. The steps necessary to obtain
mutant enzymes via directed evolution can be seen in the schematic
overview in the green box of Fig. 16.3. In a directed evolution
approach this is repeated until the desired level of optimization is
reached [60, 78]. For this Darwinian approach structural knowledge
is not necessarily required [79, 80]. The methodologies for obtaining
the best possible libraries for directed evolution can be divided
into two categories: (1) recombination independent (introduction of
random mutations into the sequences, for example, epPCR, chemical
mutagens, or mutator strains) and (2) recombination dependent
(random recombination of a set of related sequences, for example,
DNA shuffling) [81, 82]. These methodologies are well developed
and widely applied [83]. The size of the created libraries is usually
large, 10⁶ to 10¹⁰ variants [84].
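The iterative cycle of mutagenesis and selection can be mimicked in a small simulation. The Python sketch below is a toy model added for illustration only: random point substitutions stand in for mutagenesis, and the "assay" simply counts matches to a hypothetical optimal sequence, which no real experiment would know in advance. All names and parameter values are assumptions.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def mutate(seq, rate, rng):
    """Replace each residue by a random amino acid with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else aa
                   for aa in seq)

def fitness(seq, target):
    """Toy assay: number of positions matching a hypothetical optimum."""
    return sum(a == b for a, b in zip(seq, target))

def directed_evolution(parent, target, rounds=20, library_size=200,
                       rate=0.05, seed=0):
    """Repeat mutagenesis and selection, keeping the best variant each round."""
    rng = random.Random(seed)
    best = parent
    history = [fitness(best, target)]
    for _ in range(rounds):
        library = [mutate(best, rate, rng) for _ in range(library_size)]
        library.append(best)  # keep the parent so fitness never decreases
        best = max(library, key=lambda s: fitness(s, target))
        history.append(fitness(best, target))
    return best, history

if __name__ == "__main__":
    best, history = directed_evolution("A" * 20, "MKTAYIAKQRQISFVKSHFS")
    print(best, history)
```

Because the best variant of each round seeds the next library, the recorded fitness can only stay constant or increase, which mirrors the stepwise accumulation of beneficial mutations.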
Methods that are independent of recombination mainly encom-
pass techniques in which the copying and amplification of a DNA
sequence is deliberately disrupted (e.g., epPCR) [57].
A widely applied method is the epPCR, whose protocol is altered
to enhance the error rate of the employed DNA polymerase [85,
86]. The PCR was developed in the mid-1980s [87, 88]. It was further
improved by the application of a thermostable DNA polymerase
[67]. The epPCR relies on the misincorporation of nucleotides by
the DNA polymerase. Therefore, polymerases, for example, the Taq-
polymerase from Thermus aquaticus, which lacks the proofreading
activity, are employed. They are prone to incorporate wrong
nucleotides at a frequency of 0.1–2 × 10⁻⁴ [89], leading to random
insertion of mutations, which, in turn, can result in the eventual
creation of improved protein variants [84]. However, it is important
to know that the error rate of a certain polymerase can be biased
toward certain sequence changes. For this reason, various protocols
have been developed aiming to optimize the procedure toward
an increased unbiased error rate. The optimizations include the
variation of nucleotide concentrations or triphosphate analogues,
the addition of MgCl₂ and MnCl₂ [90–92], and also a combination
of differently biased polymerases, for example, Mutazyme II DNA
polymerase (developed by Agilent Technologies, USA). To yield even
higher error rates that are unbiased these modifications also have
been applied in combination [84, 93, 94].
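To get a feeling for what such error rates mean for a library, the mutational load per gene can be estimated with a simple Poisson model. The sketch below is an added illustration, not a protocol from the cited work. It assumes that the quoted misincorporation frequency applies per base and per duplication and that errors occur independently; the gene length and number of duplications are example values.

```python
import math

def expected_mutations(error_rate, gene_length, doublings):
    """Mean number of mutations per gene copy after `doublings` duplications,
    assuming independent errors at `error_rate` per base per duplication."""
    return error_rate * gene_length * doublings

def poisson_pmf(k, mean):
    """Probability of exactly k mutations under a Poisson model."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

if __name__ == "__main__":
    # Example values (assumed): upper error rate, 1,000 bp gene, 10 duplications.
    mean = expected_mutations(2e-4, 1000, 10)
    for k in range(5):
        print(f"P({k} mutations) = {poisson_pmf(k, mean):.3f}")
```

With these example values the model gives a mean of two mutations per gene, and about 13.5% of the copies remain unmutated. Such estimates help to decide whether a protocol needs a higher error rate before a library is screened.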
A very prominent example was published in Science in 2007 [77].
The successful evolution of a recombinase able to excise an HIV-1
provirus from the genome of infected cells was achieved via epPCR.
The same research group also successfully evolved a G-protein-
coupled receptor toward more stability and enhanced binding
selectivity [95] and later also increased detergent stability [94, 96].
To avoid the mutational bias of the DNA polymerase, PCR-based
mutagenesis methodologies have been developed, which are com-
pletely independent of it. Most of these methods are very labor
intensive [81]. One noteworthy method, though, is SeSaM, which was
developed as a four-step procedure in 2004 by Wong et al. [97].
The mixture of standard nucleotides employed during the PCR is
supplemented with an α-phosphorothioate nucleotide, an analogue of
dATP. The incorporated analogue is hydrolyzed, and thus, random
length fragments are created. Subsequently these fragments are
elongated with a terminal deoxynucleotidyl transferase, which
incorporates universal bases. Eventually, a final PCR exchanges the
universal bases with standard nucleotides [81, 86, 97].
In contrast to the recombination-independent methods the
principle of recombination-dependent methods is the creation
of new variants by reshuffling existing diversity. DNA shuffling,
published in 1994 by Stemmer, was one of the first methods
employing this principle [98, 99]. DNA shuffling allows the
combination of beneficial mutations from multiple genes in one
step. This is achieved by the recombination of various homologous
parental template genes previously digested with DNase I [81, 100].
Figure 16.4 shows a schematic overview of the methodology. DNA
shuffling was employed by Stemmer and coworkers to enhance the
fluorescence of green fluorescent protein (GFP) 42-fold so that it can be
easily detected under UV light [101]. In general, directed evolution has
played a significant role in the improvement of fluorescent proteins.
Various examples can be found in the literature [73]. At the same
time also many publications can be found that employ an optimized
protocol to achieve a variety of goals. Interesting examples are
the development of transgenic glyphosate-tolerant plants [102], the
establishment of a new substrate specificity for triazine hydrolases
[103], and the increase in diversity of cytochrome P450 [104].

Figure 16.4 DNA shuffling on the basis of the recombination of homologous
DNA fragments. An ancestral gene naturally evolves into a series of
homologous genes. The sequence similarity of these genes can be exploited
by DNA shuffling. The parental DNA is cleaved into small random fragments
by a DNase I digest. The fragments are then mixed in a self-priming
reassembly PCR reaction to reconstruct the full-length genes. During the
PCR the dsDNA fragments are denatured into ssDNA (those fragments
are symbolized here by the thin arrows) so that in the following annealing
step fragments can anneal to each other, either the complementary strands
(same color) or the highly homologous strands (different color). The ssDNA
fragments themselves function as primers (displayed as arrows). In the
first step the so-called heteroduplexes can form due to high homology. The
elongation step follows to create full-length dsDNA fragments. These steps
are repeated in cycles of PCR, resulting in hybrid genes that are now made
of parts of different length from other genes [98, 99].

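The outcome of DNA shuffling, a gene assembled from segments of different parents, can be imitated with a very reduced model. The sketch below is an assumption-laden illustration, not a simulation of the actual fragmentation and reassembly chemistry: parents are taken to be aligned and of equal length, and each crossover simply switches the template at a random position. The function name and example sequences are hypothetical.

```python
import random

def shuffle_genes(parents, crossovers, rng):
    """Build one chimera by switching between aligned parents at random
    crossover points (a simplification of fragment reassembly)."""
    length = len(parents[0])
    points = sorted(rng.sample(range(1, length), crossovers))
    segments, start = [], 0
    current = rng.randrange(len(parents))
    for point in points + [length]:
        segments.append(parents[current][start:point])
        start = point
        # switch to a different parent at each crossover
        current = rng.choice([i for i in range(len(parents)) if i != current])
    return "".join(segments)

if __name__ == "__main__":
    rng = random.Random(1)
    toy_parents = ["AAAAAAAAAAAA", "CCCCCCCCCCCC", "GGGGGGGGGGGG"]
    for _ in range(3):
        print(shuffle_genes(toy_parents, crossovers=3, rng=rng))
```

In a real experiment crossovers occur preferentially where the parents are highly similar, which this sketch ignores. That dependence on sequence similarity is the limitation addressed by the methods discussed next.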
To address the limitations of DNA shuffling several other
methods, for example, random chimeragenesis on transient tem-
plate (RACHITT) and staggered extension process (StEP) were
developed. Due to the fact that gene fragments will only anneal
to another fragment if the sequence similarity is very high the
crossover rate of DNA shuffling can be very low. This was addressed
by RACHITT [105]. Specifically, RACHITT uses a single-stranded
parental template containing uracil. It is used as a scaffold for the
hybridization of second-strand homologous gene fragments [81,
105]. Another method, incremental truncation for the creation of
hybrid enzymes (ITCHY), dissociates the formation of hybrid genes
from sequence homology [106]. In ITCHY fragments are generated
by the truncation of two template sequences by the exonuclease
III, with each template being truncated from opposite ends. These
fragments are directly ligated. The method was further optimized
toward enhanced crossover, resulting in a method called SCRATCHY
[107]. The hybrids created by ITCHY are amplified with skewed
primers (the forward primer and the reverse primer anneal to
different parental sequences). Finally, the amplified fragments are
subjected to DNA shuffling and hybrids with multiple crossovers are
enriched. ITCHY and SCRATCHY were used, for example, to create
hybrid glutathione S-transferases (GSTs). GST enzymes play a crucial
role in cellular detoxification [108].
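The homology-independent fusion produced by ITCHY can be pictured as joining a randomly truncated 5' part of one gene to a randomly truncated 3' part of another. The following Python sketch is only a schematic model added for illustration: the truncation points are drawn uniformly at random, which is an assumption, and the sequences are invented.

```python
import random

def itchy_hybrid(gene_a, gene_b, rng):
    """Fuse a 5' fragment of gene_a to a 3' fragment of gene_b.

    The two truncation points are chosen independently, so the fusion
    does not depend on sequence homology and the hybrid length varies.
    """
    cut_a = rng.randint(1, len(gene_a))      # gene_a truncated from its 3' end
    cut_b = rng.randint(0, len(gene_b) - 1)  # gene_b truncated from its 5' end
    return gene_a[:cut_a] + gene_b[cut_b:], cut_a, cut_b

def keeps_reading_frame(cut_a, cut_b):
    """True if gene_b's codons stay in frame after the fusion point."""
    return (cut_a - cut_b) % 3 == 0

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        hybrid, cut_a, cut_b = itchy_hybrid("ATGAAACCC", "ATGGGGTTT", rng)
        print(hybrid, cut_a, cut_b, keeps_reading_frame(cut_a, cut_b))
```

In this simplified model a fusion preserves the reading frame of the second gene only when both truncation points are congruent modulo 3, so many hybrids are out of frame and must be removed by selection or screening.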
In contrast to DNA shuffling and RACHITT, StEP is a primer-
dependent method [109]. Different-sized DNA fragments are created
by a PCR, which employs a mixture of templates and only end
primers. Additionally, the PCR consists of very short extension
and annealing steps. The DNA fragments can finally anneal to one
another in the next annealing step.
To create libraries from more than two parental genes, methods
like exon shuffling (template only for eukaryotic sources) [110] or
random multirecombinant PCR (RM-PCR, templates from prokary-
otic or eukaryotic sources) [111] can be applied.
Many more methods relying on recombination can be found
in the literature [81, 112]. A recent extensive review covers
many aspects of gene library creation [60]. Many current protein
engineering approaches, however, try to apply a combination of
directed evolution and structure-based design. This so-called data-
driven design yields promising results and will be discussed further
in the following section.

16.2.4 Data-Driven Design (Semirational Design)


Data-driven design (sometimes also referred to as semirational
design) is the combination of structure-based design and directed
evolution and is considered the third phase
of protein engineering [54]. As can be seen from Fig. 16.3, in
data-driven design a structure-based design step usually precedes
a directed evolution step. Data-driven design relies on available
structural and biochemical data and comprises smaller, smarter
libraries than directed evolution [82]. A reduced number of variants
also means that a high-throughput approach to evaluate the libraries
is not necessary [113]. The redesign of enzymes can comprise
approaches using all the available data or only focus on specific data,
for example, sequence-based design and structure-based design.
As was described earlier, there are several algorithms that
can be applied for the creation of smaller and smarter libraries.
For example, SCHEMA was used to create thermostable fungal
cellulases from a library of only 48 chimeras (out of 6,561 possible
combinations) [114].
A remarkable result was achieved by the combination of site-
directed mutagenesis, epPCR and DNA shuffling. Laundry conditions
easily inactivate peroxidases due to usually high pH (10.5), high
temperatures (50 °C or more), and high peroxide concentration (5–
10 mM). Cherry et al. [115] evolved a fungal peroxidase used in
laundry detergent as a dye-transfer inhibitor. Another work by
Hoffmann et al. [116] showed that it is possible to change the
substrate specificity of cytochrome P450 toward diphenylmethane
by site-directed mutagenesis of nine residues, which were selected
on the basis of the protein crystal structure available. In a very recent
work Li and Liao [32] used site-directed saturation mutagenesis
combined with an in vivo auxotroph selection system followed
by structure-based mutagenesis on the basis of the available
structural data. Through this approach they developed a nicotinamide
adenine dinucleotide phosphate (NADPH)-dependent homopheny-
lalanine dehydrogenase on the basis of an NADPH-dependent
glutamate dehydrogenase (GDH) from E. coli.
Many more successful examples can be found in literature and
have been reviewed in detail [113, 117, 118]. In Section 16.3.2 recent
successful examples of this approach are discussed in more detail.

16.2.5 Protein Engineering by Selection and Screening Methods

The experimental methodologies for finding the better variants of
the gene library are categorized into selection and screening, both
of which are subcategorized into in vivo and in vitro systems [24].
These approaches are compared in Table 16.2. For the development
of these selection and screening methods it has to be considered
that proteins perform very diverse reactions under a great variety
of conditions. Therefore, the methods applied are very often specific
for each protein. The identification of variants exhibiting the desired
traits is a challenging task in both approaches [124, 125].
A key difference between selection and screening is that during
a selection the negative variants are not visible, whereas during
screening one sees positive and negative variants. In the in vivo
selection approach a suitable knockout strain needs to be prepared
[119, 120]. In the knockout strain the gene coding for the desired
activity has been deleted. Additionally, some other genes can
be deleted as well in order to allow for optimal properties of
the selection strain when grown under carefully chosen selective
growth conditions. The desired trait of the new enzyme activity
allows for the survival of the selection strain that harbors the
plasmid expressing a more active variant. During the selection,
evolutionary pressure is applied, and thus, the negative variants
die [31, 126]. Reetz et al. successfully used in vivo selection to find
enzyme variants with improved enantioselectivity [33]. In all in vivo
selection systems the genotype is directly linked to the phenotype.
The created libraries need to be transformed into cells, giving rise
to a substantial bottleneck: the transformation efficiency (yeast: 10⁷
to 10⁸, E. coli: 10⁹ to 10¹⁰ cells/μg DNA) [38]. Other bottlenecks
concern, for example, the lack of transport of the substrate into
the cell or cells circumventing the selection pressure by metabolic
adaptation [30].
Screening procedures are mainly performed in vitro, or partially
in vivo. A carefully chosen assay is used to record a certain signal, for
example, kinetic assay, colorimetry, fluorescence, or luminescence.
Prior to the detection the signal is produced by either a reaction
(enzymatic catalysis) or a display of a certain molecule. Many
different methodologies have been developed over the last 20 years:
compartmentalization and partially in vivo methodologies such as
yeast display, cell surface display, and phage display [24, 38].
For in vitro systems the transformation efficiency is not a limiting
factor [30], and much larger libraries can be handled (104 variants)
[38].
Additionally, since the methods are not dependent on viable cells,
an advantage is that it is possible to conduct in vitro selection or
screening under conditions that can be much closer to the actual
industrial process (e.g., extreme pH or temperatures and organic
solvents) [30].
Several examples for selection and screening methodologies
from the literature are discussed in this chapter and are summarized
in Tables 16.1 and 16.2. Experiments have shown that to be
successful it is of crucial importance that the chosen methodology
is relevant, accurate, stringent, and tailored to given circumstances
(type of enzyme and activity) [65]. Therefore, the first law of
directed evolution is termed “You get what you screen/select for”
[127]. Examples can be found in literature where the selection or
screening method was not stringent enough, yielding variants with
undesired characteristics [29, 128].

Table 16.2 Comparison of selection and screening methods

| Approach | System | Methodology | Material | Visibility of result | Bottleneck | Library size | Comments | References |
|---|---|---|---|---|---|---|---|---|
| Selection | In vivo | In vivo selection | Plasmid | Only positive hits (all negative variants die or are washed away) | Depends on transformation efficiency of host cell | Large libraries (>10¹⁰) | Knockout strains need to be prepared before the experiments can start. | [119, 120, 121] |
| Selection | In vivo | Selection for binding | Molecule (protein, peptide, DNA, etc.) | | | | Nonnatural activities cannot be optimized in this approach. | |
| Selection | In vitro | Selection for binding (cell free) | | Only positive hits (all negative variants are washed away) | No dependency on transformation efficiency | Large libraries (>10¹⁰) | Nonnatural activities cannot be optimized in this approach. | [122] |
| Screening | In vivo | Cell surface display (yeast, bacteria, and phages) | Plasmid + displayed molecule | Negative and positive results | Depends on transformation efficiency of host cell | Small, targeted libraries | Robotics for a high-throughput approach is essential. | [34, 38, 48] |
| Screening | In vivo | Agar plate + MTW plate screen | Plasmid + protein | | | | | |
| Screening | In vitro | Phage display (cell free) | DNA + displayed molecule | Negative and positive results | No dependency on transformation efficiency | Small, targeted libraries | Robotics for a high-throughput approach is essential. | [51] |
| Screening | In vitro | mRNA display | mRNA + displayed molecule | | | | | [44] |
| Screening | In vitro | Ribosome display | mRNA + displayed molecule | | | | | [41–43] |
| Screening | In vitro | In vitro compartmentalization | DNA or mRNA + connected molecule | | | | | [49, 52] |

A variety of display methodologies have been developed [38].
One of them, the cell surface display procedure, employs either
bacterial or yeast cells. The latter offers the possibility for
posttranslational modification (e.g., glycosylation) of the introduced
proteins [36, 37]. Signal sequences are used to direct the protein
to the outer membrane and anchor it there for display. Klauser
et al. [35] exploited the ability of the C-terminal domain of the
Neisseria IgA protease precursor to transport covalently attached
proteins across the outer membrane of E. coli. Several studies
employing bacterial surface display and yeast surface display have
been reviewed by Daugherty [34], emphasizing the advantage of
coupling these methods with cell-sorting instrumentation, such as
fluorescence-activated cell sorting (FACS). A display
method coupled with a FACS screen offers high-throughput sample
turnover and hence the possibility to screen large libraries. This
was done in a recent study where an advanced procedure was
used to evolve a bacterial transpeptidase sortase A [36]. Yeast cell
display was used and the desired protein was fused to the Aga2p
cell surface mating factor. Subsequently cells displaying the enzyme
were incubated with a substrate and stained with a fluorescent
molecule. All cells were screened by FACS.
Also phages can be used to display proteins on their surface.
George Smith [129] found that foreign DNA can be inserted into
phage genes encoding the coat protein pIII. The resulting
fusion proteins are then displayed on the phage’s surface. This
so-called phage display was further optimized for the display of
antibody variable domains, making it important for drug design
and antibody discovery [47]. Phage display can be a selection
method (selection for binding) or a screening method (screening
for enzymatic activity). It is also possible to display functionally
active proteins, for example, the E. coli β-galactosidase [46]. A large
number of studies employing this method can be found in literature
[45, 48, 130].
Important methodologies are mRNA and ribosome display.
mRNA display was mainly developed by Roberts and Szostak in
the 1990s [44, 131]. Ribosome display is similar to mRNA display
and is nowadays the preferred methodology. It was first developed
by Mattheakis et al. [40] for the selection of peptides. Later on it
was further improved to select for folded proteins [39, 132]. The
DNA fragments of the library are fused to a noncoding sequence, a
tether. This enables protein folding while the tether remains in the
ribosome tunnel. The construct lacks a stop codon. It is transcribed
into mRNA and translated into protein in vitro. Since the stop codon
is missing the mRNA and the formed polypeptide are not released
from the ribosome, creating a ternary complex. Adjusted conditions
(low temperature and high magnesium concentrations) further
stabilize the complex. It can be directly used for selection binding to
an immobilized target. Finally, the mRNA is purified and transcribed
to DNA via reverse transcription polymerase chain reaction (RT-
PCR), yielding the genetic information of the positively selected
clones [38, 40]. This technique is often used for the selection and
evolution of antibodies [42, 43]. Since a transformation step into
cells is not necessary, very large libraries can be selected [41].
In vitro compartmentalization mimics the compartmentalization
of living cells and ensures the coupling of the genotype to the
phenotype, which is important for library screening. The
transcription and translation of genes take place within a water–oil
emulsion [53]. The substrate is attached to the gene that encodes the
protein, which converts the substrate to product and enriches it in
the droplet. When the gene expresses an inactive protein variant, the
product is not enriched in the droplet. Finally, the droplets are
screened by an activity assay (e.g., formation of NADPH) or by FACS
for an enriched product (coupled to the gene) in the droplet [38, 53].
Bawazer et al. [49] have recently applied this method to create
biomineralizing cells, which were screened by FACS, identifying
silicateins. These biomineralizing enzymes were found in the silica
skeleton of marine sponges. New variants can produce crystalline
silicates, which their parents were not capable of synthesizing. A
variety of studies applying this method can be found in the
literature [50, 52, 133].

16.3 Directed Evolution Studies with TIM-Barrel Enzymes

As many as 10% of the enzymes in the Protein Data Bank (PDB)
[134] have the TIM-barrel fold, making it the most abundant
enzyme framework [134, 135]. Nature clearly favors the TIM-barrel
as a scaffold for enzymatic activity. The fold is also referred to
as the (β/α)8-barrel (Fig. 16.5). This very common and versatile fold
is present in about 20 different classes of enzymes, which very
often catalyze reactions in the energy metabolism of the cell [136].
The TIM-barrel was first observed in the crystal structure of its
prototypical representative, TIM (EC 5.3.1.1) [137], namely in the
structure of chicken TIM. TIM is a dimeric enzyme. Structures
of TIM from more than 20 species have now been determined [15],
including many liganded and mutated structures.

Figure 16.5 The TIM-barrel fold of the trypanosomal TIM subunit (PDB
code: 5TIM). (a) End-on view: The β-strands are numbered and the catalytic
residues (loop 1: N11, K13; loop 4: H95; and loop 6: E167) are highlighted.
(b) Side view: The catalytic loops are part of the C-terminal end of the
β-strands (the catalytic end). The loops at the N-terminal end of the
β-strands are important for protein folding and stability (the stability end).

The classical studies of Jeremy Knowles on the reaction mechanism
of TIM [16] have emphasized that TIM is an example of a
perfect enzyme, since the rate of conversion of
R-glyceraldehyde-3-phosphate into dihydroxyacetone phosphate (DHAP)
is limited only by the diffusion-controlled encounter of enzyme and
substrate, in which case the kcat/KM ratio is near 10⁸ M⁻¹·s⁻¹ [138].
The free-energy profile of the TIM reaction is shown in Fig. 16.1. The
critical step in this reaction is the deprotonation of a carbon atom,
for which a C–H bond needs to be cleaved by a catalytic base, which is
a glutamate in TIM (Fig. 16.2). Other key features of the active site
of TIM are the properly positioned side chains of a lysine (K13)
and a histidine (H95), functioning as an oxyanion hole, as well as
three phosphate binding loops, which anchor the phosphate moiety
and consequently the substrate as a whole with respect to these
catalytic residues. This rigid anchoring of the substrate also
prevents an undesirable side reaction, which is known to occur in the
uncatalyzed reaction [123, 139, 140]. The phosphate binding loops are
loop 6, loop 7, and loop 8. Loop 6 is known to make a large loop
movement (7 Å) from open (unliganded) to closed (liganded). Smaller but
equally important structural changes are the peptide flips of loop
7, which are concerted with the loop 6 closure. The active site is at
the dimer interface, but the catalytic residues of each of the two
active sites are provided by the same subunit. The dimerization
generates a rigid active site geometry.
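
Why the kcat/KM ratio is the quantity compared against the diffusion limit can be seen from the Michaelis–Menten rate law. The following is a minimal sketch under the usual steady-state assumptions; the symbols v (initial rate), [E]_0 (total enzyme concentration), and [S] (substrate concentration) are introduced here for illustration and are not taken from the text.

```latex
v = \frac{k_{\mathrm{cat}}\,[E]_0\,[S]}{K_M + [S]}
\qquad\Longrightarrow\qquad
v \approx \frac{k_{\mathrm{cat}}}{K_M}\,[E]_0\,[S]
\quad \text{for } [S] \ll K_M
```

At low substrate concentration kcat/KM therefore acts as an apparent second-order rate constant for the reaction of free enzyme with free substrate. It cannot exceed the rate constant for their diffusional encounter, which is why a value near 10⁸ M⁻¹·s⁻¹ is read as diffusion control.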

16.3.1 Protein Engineering Studies of TIM-Barrel Proteins


An analysis of the folding and stability of TIM-barrels in terms
of mutability shows that a TIM-barrel is a very stable structure,
which can tolerate many modifications. This was studied extensively
by reshuffling experiments [141], rearrangement of the order of
secondary structure segments by circular permutation [136, 142],
and fragment complementation [143, 144]. The work of Silverman
et al. [145] on yeast TIM shows that certain areas in a TIM-barrel
are more sensitive to mutations than others. For example, residues at
the end or the beginning of β-strands or at the core of the β-barrel
cannot be exchanged easily without a loss in folding and function.
On the other hand, α-helices and loop sequences are highly mutable.
This raises the question of whether the TIM-barrel scaffold could be
easily modified by mutagenesis to improve enzymes to match the needs of
industrial processes or to even create enzymes with completely new
functions. The stability end of the TIM-barrel (Fig. 16.5) appears to
be at the N-terminal ends of the parallel β-strands, and the catalytic
end is at the C-terminal end of the β-strands. The N-terminal end
is also important for its efficient folding [146]. TIM-barrel enzymes
catalyze a huge number of diverse chemical reactions as the fold
is present in all enzyme classes. All these features make it an ideal
scaffold for protein engineering studies [147, 148].
The first directed evolution experiments on TIM-barrel enzymes
were conducted over 20 years ago. In 1989, Hermes et al. [149]
created gene libraries of chicken TIM using a spiked primer method
to demonstrate the superiority of this method compared to chemical
mutagenesis methods. The active site residue Glu165 was mutated
to an asparagine. On the basis of this variant, gene libraries were
prepared. These libraries were analyzed to see if variants could
be found where the residue at position 165 (corresponding to the
position of the catalytic glutamate in chicken TIM) was not
back-mutated to a glutamate. Interestingly, six such variants could be
found. They showed point mutations in highly conserved sequence
areas and displayed an increased catalytic activity.
A study by the research group of Sampson revealed the
relationship between the TIM activity and the C-terminal loop 6
hinge region [150, 151]. The three corresponding codons (of chicken
TIM) were replaced by a combinatorial library comprising all 8000
possible amino acid combinations. Several different combinations of
amino acids were found to lead to active TIM variants. The study was
able to explain the variety of sequence solutions that yield hinge
flexibility, which directly influences catalytic activity.
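
The figure of 8000 combinations follows from randomizing three positions over the 20 canonical amino acids (20³). The sketch below reproduces that count and, as an illustration only, estimates how many clones would have to be sampled to see most of such a library. The coverage estimate assumes that every variant is equally likely and uses the approximation N ≈ −V·ln(1 − F); the NNK codon scheme (32 codons per position) is an assumption added for comparison and is not stated in the text.

```python
import math

AMINO_ACIDS = 20   # canonical amino acids per randomized position
NNK_CODONS = 32    # assumption: NNK degenerate codons, not stated in the text


def library_size(positions: int, alphabet: int = AMINO_ACIDS) -> int:
    """Number of distinct variants when `positions` sites are fully randomized."""
    return alphabet ** positions


def clones_for_coverage(n_variants: int, coverage: float = 0.95) -> int:
    """Clones to sample so that the expected fraction of variants seen
    reaches `coverage`, assuming all variants are equally likely and
    using the approximation 1 - exp(-N / n_variants)."""
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")
    return math.ceil(-n_variants * math.log(1.0 - coverage))


if __name__ == "__main__":
    hinge = library_size(3)
    print(hinge)                                 # 8000 protein-level variants
    print(clones_for_coverage(hinge))            # clones for ~95% expected coverage
    print(library_size(3, alphabet=NNK_CODONS))  # DNA-level variants under NNK
```

Under these assumptions roughly three times as many clones as variants are needed for 95% expected coverage, which is why a three-residue library remains tractable while libraries over many more positions quickly are not.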
One of the earliest success stories with respect to directed
evolution using a monomeric TIM-barrel enzyme was by Sterner and
coworkers [119]. Directed evolution of two (β/α)8-barrel enzymes
catalyzing related reactions in two different metabolic pathways was
conducted. A particular challenge in protein engineering is to obtain
catalytic activity that is comparable to that of natural enzymes.
In 2009 Sterner and coworkers were able to successfully create
enzyme variants (sugar isomerases) with turnover numbers and
substrate affinities similar to those of the wild-type enzymes [120].
Through random mutagenesis and subsequent in vivo selection they
were able to establish activity on a natural TIM-barrel as well as
an artificial chimeric TIM-barrel scaffold. Through their work the
group was further able to elucidate natural enzyme evolution of
TIM-barrels and even mimic it in the laboratory [101].
Work by Saab-Rincon et al., with a focus on an enzyme with a
TIM-barrel scaffold, showed how two different engineering strategies
can lead to the same result [152]. A monomeric trypanosomal
TIM variant (monoTIM) was subjected to randomization of the
whole gene and also of only the loop 2 region. The same variant
was found in both cases. The A43P T44A/S-mutant showed an
11-fold increased kcat and a 4-fold decreased KM value, that is, a
roughly 44-fold higher kcat/KM. From
the same research group another paper was published over 10
years later [121] demonstrating that it is possible to change the
catalytic activity of TIM. A combination of structure-based design
and random mutagenesis employing different mutagenesis methods
was used to create four different libraries. The catalytic activity
of a thiamine phosphate synthase was successfully transplanted
onto monoTIM. The success is largely due to the use of data-driven
design and the stringent selection system applied. Although the new
enzymes exhibit low catalytic efficiency, the interesting part about
this work is that the substrates and the reaction mechanisms of
the two natural enzymes are not similar. In those studies, however, a
common intermediate was always present.
Other successful enzyme conversions on the basis of the TIM-barrel
scaffold have been demonstrated. One example is the directed
evolution of the E. coli 2-dehydro-3-deoxy-phosphogluconate
aldolase (KDPG aldolase) [79], also a TIM-barrel enzyme. The
alteration of enzyme specificity, specifically enantioselectivity, is
of industrial importance. In this work libraries were created by
random mutagenesis and DNA shuffling. They were subsequently
screened for variants that showed altered enantioselectivity toward
D- and L-sugars. The mutations leading to the desired result
were, surprisingly, found far from the active site. In such cases
it is difficult to deduce structure–function relationships from the
obtained results. In another study in 2003 [153] the E. coli
N-acetylneuraminic acid aldolase, an enzyme with a (α/β)8-barrel fold,
was altered to improve its catalytic activity toward enantiomeric
substrates in aldol reactions. A complete reversal and significant
improvement of enantioselectivity could be achieved. Here, too, all
mutations found were far from the active site. Both studies exemplify
that regions far away from the active site can also influence the
catalytic properties of enzymes significantly.
The study of Lee et al. [154] is also of industrial interest. The
TIM-barrel enzyme xylose isomerase is of importance for ethanol
production, but the commonly employed strain Saccharomyces
cerevisiae lacks the native ability to convert pentose sugars. By
directed evolution the catalytic activity of a xylose isomerase from
Piromyces sp. was improved by 77% and the enzyme was subsequently
brought into an engineered yeast strain. The engineered enzyme
significantly increased the aerobic growth rate as well as the xylose
consumption rate and the ethanol production.
Not only is directed evolution used to target the catalytic
performance of TIM-barrel enzymes, but thermostability and cold
adaptation are also of interest. Hyperthermophilic enzymes very often
exhibit low catalytic performance under mild conditions, probably
because of their rigidity at mesophilic temperatures [155]. It was
possible to increase the activity of a xylose isomerase from
T. thermophilus at a lower temperature (60 °C), but only at the cost
of a decrease in protein stability [156]. The study illustrates once
more that a gain in one property is usually paid for with the loss of
another property.
The importance of the dimer interface of TIM for its catalytic
properties has been shown in studies of monomeric TIMs. The initial
monomeric variant (monoTIM) was obtained by deletion of the
dimer interface loop 3 of trypanosomal TIM. The generation of this
variant was guided by a careful modeling exercise [157]. MonoTIM is
catalytically active, but kcat is 1000-fold lower and KM is 10-fold
higher than for wild-type TIM. The kcat and KM values of monoTIM are,
respectively, 1 s⁻¹ and 5 mM for D-GAP as the substrate, which
corresponds to a kcat/KM of about 200 M⁻¹·s⁻¹. The studies of the
monomeric TIMs have been the starting point for structure-based enzyme
engineering approaches aimed at changing the substrate specificity.
The aim has been to widen the substrate binding pocket, which was
achieved by shortening loop 8 by three residues [158]. The variant
obtained after an additional point mutation (V233A), A-TIM, has been
characterized in great depth [159, 160]. It could be shown by various
crystallographic binding studies that the active site is competent. For
example, the transition-state analogue 2-phosphoglycolate binds
to A-TIM in the same way as to wild-type TIM. The mode of
binding of the suicide inhibitor bromohydroxyacetone phosphate
(BHAP) is also identical in A-TIM and wild-type TIM [15, 159]. The
open/closed conformational flexibility of loop 6 and loop 7 is
likewise present in A-TIM. Nevertheless, residual TIM activity of
A-TIM was not detectable, suggesting that the affinity of A-TIM for
the TIM substrate is too low. The studies by Salin et al. [160] have
shown that larger molecules such as 4-phospho-D-erythronohydroxamic
acid (4-PEH, a transition-state analogue of a C5 sugar phosphate)
indeed can bind in the active site of A-TIM. Subsequently, A-TIM has
been subjected to directed evolution approaches in order to evolve
various new substrate specificities (Mirja Krause and coworkers,
PEDS, 2014, in press).

Figure 16.6 The Kemp elimination reaction. (a) Schematic overview of
the Kemp eliminase reaction. The transition state of the reaction is shown
in square brackets. The importance of the base and the proton donor of
the oxyanion hole for stabilization of this transition state is visualized
[161]. (b) Transition-state analogue of the Kemp eliminase reaction.
6-Nitrobenzotriazole has been used as a ligand in the complexed structures
of HG3.17 [123].

16.3.2 The Kemp Eliminases


Monomeric TIM-barrel proteins have also been used in extensive
studies aimed at grafting Kemp eliminase activities on this
framework. As in TIM, the catalyzed reaction does not require cofactors
and is not dependent on catalytic waters, and the framework
consists of approximately 250 residues. Each of these features
also facilitates the computational approaches. The reaction (Fig.
16.6) has emerged as a model system in various enzyme engineering
projects. Transition-state analogues are available (Fig. 16.6b). As
in TIM, the rate-limiting step is a proton transfer step from a
carbon atom, for which a catalytic base is essential. The difficulties
of the design of such a Kemp eliminase active site have been
discussed by Warshel and coworkers [162, 163], emphasizing that
optimal active sites are preorganized so as to selectively stabilize
the charge (re)distribution of the transition state as compared
to the substrate. This usually requires strain in the active site,
for example, in the construction of an effective oxyanion
hole. There are no natural enzymes that catalyze such a Kemp
eliminase reaction. Previously it has been attempted to graft this
activity on an antibody framework [164], using catalytic antibody
technology, as well as to make a computationally designed enzyme
on the framework of a calmodulin domain [165]. These
two approaches have generated enzymes with significant catalytic
properties; however, these new nonnatural enzymes are not as
proficient as is characteristic for naturally evolved enzymes.
The two examples of Kemp eliminases described here in more
detail were obtained using the data-driven design approach
outlined in Fig. 16.3. In each case the first step concerned the
calculation of the structure of the transition state using quantum
mechanics (QM) approaches [140, 166]. In the case of the Kemp
eliminase reaction the stabilization of the transition state requires
a base and an oxyanion hole (Fig. 16.6). With respect to the base
several choices have been considered, such as an aspartate, a
glutamate, or a histidine-aspartate dyad [166]. Concerning the hydrogen
bond donor of the oxyanion hole, side chains of tyrosine, threonine,
serine, lysine, and arginine have been considered in the initial
designs [140, 166]. Subsequently a suitable protein framework
has to be found that can support the preferred geometry of the
transition state and the catalytic side chains. Röthlisberger et al.
[166] have carried out a systematic PDB search, which resulted
in a TIM-barrel framework enzyme as the best possible reference
structure. Subsequent extensive molecular dynamics (MD) runs
provided important suggestions for the optimization of the active
site pocket. The crystal structures of the proposed variants were
also important for further optimizations. For example, in the
design efforts toward HG3 (Table 16.3) it was noted from the
crystal structure of the first designed variant, the inactive HG1
[140], that its active site pocket was too solvent exposed, and MD
simulation studies using this crystal structure showed that the
active site residues were too flexible and did not adopt the required
preorganized active site structure. This then led to a new model,
HG2, having weak catalytic properties, in which the active site
was more buried. Extensive HG2 MD-simulation runs suggested a
further point mutation, leading to HG3, which indeed showed more
catalytic activity than HG2. Building on these computational efforts,
directed evolution approaches have eventually been able to produce
monomeric TIM-barrel enzymes with very high Kemp eliminase
activities (Table 16.3), as described below.

Table 16.3 Catalytic constants of wild-type TIM and of the KE59 and HG3 variants. The catalytic base of TIM and
KE59.13 is a glutamate and for HG3.17 it is an aspartate. The kinetic constants of the two designed variants (KE59 and HG3;
properties of the variants after computational design) are compared with the kinetic constants of the two evolved variants
KE59.13 and HG3.17 (properties of the variants after directed evolution by screening approaches)

                                        After computational design   After directed evolution by screening
Variant                     Starting    kcat/KM      kcat    KM      kcat/KM      kcat    KM      kcat/kuncat  Reference
                            PDB entry   (s⁻¹·M⁻¹)    (s⁻¹)   (mM)    (s⁻¹·M⁻¹)    (s⁻¹)   (mM)
Wild-type TIM               -           -            -       -       440,000      430     1.0     10⁹          [167]
KE59/KE59.13                1A53        160          0.29    1.8     600,000      9.5     0.16    10⁷          [168, 169]
(starting from IGPS)
HG3/HG3.17                  1GOR        430          0.68    1.6     230,000      700     3.0     10⁹          [123, 170]
(starting from a xylanase)

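
The kcat/KM entries of Table 16.3 can be cross-checked against the kcat and KM columns, since kcat/KM in s⁻¹·M⁻¹ is simply kcat divided by KM expressed in molar units. The following is a small sketch of that check with the values transcribed from the table; the dictionary layout and function names are illustrative choices, not part of the original work.

```python
# Values transcribed from Table 16.3:
# kcat in s^-1, KM in mM, tabulated kcat/KM in s^-1 M^-1.
TABLE_16_3 = {
    "Wild-type TIM": {"kcat": 430.0, "km_mM": 1.0, "kcat_km": 440_000.0},
    "KE59": {"kcat": 0.29, "km_mM": 1.8, "kcat_km": 160.0},
    "KE59.13": {"kcat": 9.5, "km_mM": 0.16, "kcat_km": 600_000.0},
    "HG3": {"kcat": 0.68, "km_mM": 1.6, "kcat_km": 430.0},
    "HG3.17": {"kcat": 700.0, "km_mM": 3.0, "kcat_km": 230_000.0},
}


def efficiency(kcat: float, km_mM: float) -> float:
    """Return kcat/KM in s^-1 M^-1 from kcat (s^-1) and KM (mM)."""
    return kcat / (km_mM * 1e-3)


def tabulated_over_recomputed(name: str) -> float:
    """Ratio of the tabulated kcat/KM to the value recomputed from kcat and KM."""
    row = TABLE_16_3[name]
    return row["kcat_km"] / efficiency(row["kcat"], row["km_mM"])


if __name__ == "__main__":
    for name in TABLE_16_3:
        row = TABLE_16_3[name]
        recomputed = efficiency(row["kcat"], row["km_mM"])
        print(f"{name:14s} tabulated {row['kcat_km']:>9.0f}  "
              f"recomputed {recomputed:>9.0f}  "
              f"ratio {tabulated_over_recomputed(name):.2f}")
```

For wild-type TIM, KE59, HG3, and HG3.17 the recomputed values agree with the tabulated ones to within a few percent. For KE59.13 the kcat and KM columns give about 6 × 10⁴ s⁻¹·M⁻¹, a factor of ten below the tabulated 600,000, so one of the printed entries for this variant may contain a transcription error and is best checked against the cited references [168, 169].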
The first example started from the KE59 construct described
by Röthlisberger et al. [166]. This construct was the result of
extensive calculations resulting in a mutated variant of an
indole-3-glycerolphosphate synthase (IGPS, PDB entry 1A53), which is
an enzyme whose active site is built at the C-terminal
end of the TIM-barrel [168]. The designed KE59 variant was an
unstable protein. Therefore the KE59 sequence was compared
with sequences of related IGPSs, identifying 23 possible stabilizing
locations outside the active site. In the first two rounds of
directed evolution these positions were probed, resulting in the
incorporation of stabilizing consensus mutations at 10 positions.
The catalytic properties of the improved version obtained for this
construct (KE59.13), after many subsequent rounds of directed
evolution, are summarized in Table 16.3. The best catalytic
activity was obtained after 13 rounds of directed evolution; 3
additional rounds did not improve the catalytic activity any
further. Notably, during the laboratory evolution the
pK of the catalytic base, Glu230, increased from 6.1 to 6.7 (as
measured from the pH dependency of kcat). The increase in pK can
be attributed to a more hydrophobic pocket in which the catalytic
base is more reactive [169]. A similar strategy has been found in
the naturally evolved TIM [171].
For the second example a TIM-barrel enzyme, a xylanase (1GOR) [170],
was also used as the starting point [140]. In an iterative design
protocol a model (HG3) was created that had some catalytic activity
(Table 16.2). After 17 cycles of directed evolution an efficient
enzyme (HG3.17) was obtained, having a catalytic proficiency similar
to that observed for wild-type enzymes (Table 16.3), although
catalytic perfection was not yet achieved. In any case the results of
this study confirm that it is possible to graft efficient active sites
on a protein that by itself does not have such activity at all. During
the laboratory evolution the recombinant yield (10-fold) and the
protein stability increased significantly, ΔTm = +7 °C (from
approximately 48 °C to 55 °C). The increase in activity of HG3.17
correlates with a much higher affinity for the transition-state
analogue, as the Ki of this compound for HG3
is 310 μM and for HG3.17 is 2 μM.
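
To put the tighter binding of the transition-state analogue and the gain in kcat/KM on a common scale, both ratios can be converted to free-energy differences with ΔΔG = RT·ln(ratio). The sketch below does this for the Ki values quoted above and for the HG3 and HG3.17 kcat/KM values of Table 16.3. The temperature of 298.15 K is an assumption (the assay temperature is not given in the text), and the function name is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def free_energy_from_ratio(ratio: float, temperature_k: float = 298.15) -> float:
    """Return RT*ln(ratio) in kcal/mol.

    temperature_k defaults to 298.15 K, an assumed value that is not
    taken from the text.
    """
    if ratio <= 0.0:
        raise ValueError("ratio must be positive")
    return R_KCAL * temperature_k * math.log(ratio)


if __name__ == "__main__":
    # Ki of the transition-state analogue: 310 uM (HG3) versus 2 uM (HG3.17).
    binding_gain = free_energy_from_ratio(310.0 / 2.0)
    # kcat/KM from Table 16.3: 430 (HG3) versus 230,000 (HG3.17) s^-1 M^-1.
    activity_gain = free_energy_from_ratio(230_000.0 / 430.0)
    print(f"tighter analogue binding: {binding_gain:.2f} kcal/mol")
    print(f"higher kcat/KM:           {activity_gain:.2f} kcal/mol")
```

Under this assumption the 155-fold lower Ki corresponds to roughly 3 kcal/mol of additional binding free energy for the analogue, and the gain in kcat/KM to roughly 3.7 kcal/mol, which is consistent with the correlation between activity and analogue affinity noted above.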

Figure 16.7 Comparison of active sites complexed with a transition-state analogue. (a) The active site of TIM (2VXN) is shown
complexed with phosphoglycolohydroxamic acid (PGH). The active site residues K13 (after β1) (oxyanion hole), H95 (after β4)
(oxyanion hole), and E167 (after β6) (catalytic base) are shown in blue. Also shown is E97, which anchors the K13 side chain.
(b) The active site of the nonnatural Kemp eliminase HG3.17 (4BS0) is shown complexed with 6-nitrobenzotriazole (6NT). The
active site residues Q50 (oxyanion hole) (after β2) and D127 (catalytic base) (after β3) are shown in light blue. In both pictures
important atom–atom interactions are highlighted by dashed lines. Contact distances between atoms are given in ångström (Å).

As in TIM, the catalytic power of the designed and evolved
active site is achieved by carefully positioned side chains,
namely an aspartate side chain and a glutamine side chain in HG3.17
(Fig. 16.7). The glutamine side chain NH2 functions as an oxyanion
hole. In KE59.13 a corresponding oxyanion hole side chain is absent.
Mutating the HG3.17 glutamine to an alanine reduces the catalytic
activity 50-fold. The side chain oxygen of this glutamine is anchored
to two main chain peptide amides. In the TIM active site,
the side chains of a lysine and a histidine make the corresponding
oxyanion hole (Fig. 16.7). These side chains are also appropriately
anchored in the TIM active site. The lysine NZ is anchored to a
glutamate side chain, and the protonation state and the positioning
of the catalytic histidine are controlled by a hydrogen bond between
ND1 and a main chain peptide NH group. In the Kemp
eliminase KE59.13 the base is a glutamate, which protrudes out
of loop 8. In the Kemp eliminase HG3.17 the base is an aspartate
(protruding out of a loop after β-strand 3), and in TIM, it is a
glutamate (protruding out of a loop after β-strand 6). Mutating
the HG3.17 aspartate to an alanine reduced the Kemp eliminase
activity 10⁵-fold. To the best of our knowledge the HG3.17 Kemp
eliminase activity grafted on a TIM-barrel framework is currently
the best example of developing a nonnatural active site with catalytic
proficiency similar to naturally evolved perfect enzymes. It can be
noted that the catalytic proficiency of HG3.17 correlates with a very
precise fit of the transition-state analogue in its binding pocket.
From the comparison of the initial, intermediate, and final structures
of the HG3.17 studies it emerges that in the laboratory evolution the
geometry of the active site improves through small structural
changes, which are currently difficult to model but which appear to
be critical for catalytic proficiency.
In the above two examples expert modeling has provided
the starting model, whereas the experimental directed evolution
approaches have been essential for the optimization of the catalytic
activity. In each of these examples the catalytic tools of the active
site are provided by side chains, not main chains. It can be noted
that in many enzyme active sites the oxyanion holes are shaped
by two main chain peptide NH groups, such as in serine proteases
[172] and in the crotonase fold enzymes [173, 174]. The directed
evolution methodology sometimes allows for large changes in the
structures in the intermediate steps, such as in the directed
evolution optimization of a retroaldolase TIM-barrel protein, also
reported in 2013 [175]. In this example the designed catalytic
residue, a lysine in loop 7 of the TIM-barrel fold, was supplemented
during the initial directed evolution steps by another lysine,
protruding out of loop 2. In the further evolutionary process this
second lysine outcompeted the first lysine with respect to providing
better catalytic efficiency. In this project crystal structures have been
determined for the initial design, intermediate models, and the
most efficient final variant, showing large structural variability of
various TIM-barrel loops at the catalytic end of the (β/α)8-barrel.

16.4 Concluding Remarks

This chapter highlights the importance of advanced expertise
in computational and directed evolution approaches for
developing nonnatural active sites. Although the modeling
calculations provide a good starting point, they cannot yet predict the
detailed interactions at the active site correctly. In naturally evolved,
proficient active sites the catalytic residues are kept in place by
extensive hydrogen bonding networks, which are essential in a subtle,
not fully understood way [176]. The dynamical properties
of the active site are also difficult to model. It is interesting to see that
in these two recent Kemp eliminase examples the computationally
designed active sites have been developed on the TIM-barrel framework,
as in nature, too, this is a favorable framework for a wide range of
active sites. This seems a coincidence, because nature has also evolved
very proficient active sites on other protein folds. It should also be
noted that the nonnatural Kemp eliminases catalyze a relatively simple
reaction. In many enzyme active sites the complete reaction
cycle has multiple high-energy barriers, as shown, for
example, even for the simple TIM reaction (Fig. 16.1), in which the
proton is abstracted from the C1 atom but is delivered to the C2
atom, whereas in the Kemp eliminase reaction only the proton
abstraction step is important. Nevertheless, it seems realistic to
expect that more examples of proficient nonnatural enzymes will
be developed in the coming years, or at least that new enzymes will be
developed that are functional under nonnatural conditions. The insight
obtained through these efforts will greatly increase the possibility of
tailoring enzyme properties toward the needs of specific processes
in many applications, also fulfilling a great desire to develop more
sustainable chemical conversions in large-scale synthetic processes.

Acknowledgments

We thank Profs. Peter Neubauer and Kalervo Hiltunen for their
consistent support of our A-TIM-directed evolution studies.

References

1. Radzicka, A., and Wolfenden, R. (1995). A proficient enzyme, Science,
267, pp. 90–93.
2. Hibbert, E. G., and Dalby, P. A. (2005). Directed evolution strategies for
improved enzymatic performance, Microb. Cell Fact., 4, p. 29.
3. Saiki, R. K., Gelfand, D. H., Stoffel, S., Scharf, S. J., Higuchi, R., Horn, G.
T., Mullis, K. B., and Erlich, H. A. (1988). Primer-directed enzymatic
amplification of DNA with a thermostable DNA polymerase, Science,
239, pp. 487–491.
4. Gerlt, J. A., and Babbitt, P. C. (2009). Enzyme (re)design: lessons from
natural evolution and computation, Curr. Opin. Chem. Biol., 13, pp. 10–
18.
5. Schmid, A., Dordick, J., Hauer, B., Kiener, A., Wubbolts, M., and Witholt,
B. (2001). Industrial biocatalysis today and tomorrow, Nature, 409, pp.
258–268.
6. Turner, N. J. (2009). Directed evolution drives the next generation of
biocatalysts, Nat. Chem. Biol., 5, pp. 567–573.
7. Tarr, M. A. (2003). Chemical Degradation Methods for Wastes and
Pollutants: Environmental and Industrial Applications (CRC Press).
8. Koeller, K. M., and Wong, C.-H. (2001). Enzymes for chemical synthesis,
Nature, 409, pp. 232–240.
9. Rylott, E. L., Eastmond, P. J., Gilday, A. D., Slocombe, S. P., Larson,
T. R., Baker, A., and Graham, I. A. (2006). The Arabidopsis thaliana
multifunctional protein gene (MFP2) of peroxisomal β-oxidation is
essential for seedling establishment, Plant J., 45, pp. 930–941.
10. Burgos, E. S., Vetticatt, M. J., and Schramm, V. L. (2013). Recycling
nicotinamide. The transition-state structure of human nicotinamide
phosphoribosyltransferase, J. Am. Chem. Soc., 135, pp. 3485–3493.
11. Singh, V., Lee, J. E., Núñez, S., Howell, P. L., and Schramm, V.
L. (2005). Transition state structure of 5’-methylthioadenosine/S-
adenosylhomocysteine nucleosidase from Escherichia coli and its
similarity to transition state analogues, Biochemistry, 44, pp. 11647–
11659.
12. Huisman, G. W., Liang, J., and Krebber, A. (2010). Practical chiral alcohol
manufacture using ketoreductases, Curr. Opin. Chem. Biol., 14, pp. 122–
129.
13. Savile, C. K., Janey, J. M., Mundorff, E. C., Moore, J. C., Tam, S., Jarvis,
W. R., Colbeck, J. C., Krebber, A., Fleitz, F. J., and Brands, J. (2010).
Biocatalytic asymmetric synthesis of chiral amines from ketones
applied to sitagliptin manufacture, Science, 329, pp. 305–309.
14. Pauling, L. (1946). Molecular architecture and biological reactions,
Chem. Eng. News, 24, pp. 1375–1377.
15. Wierenga, R. K., Kapetaniou, E. G., and Venkatesan, R. (2010).
Triosephosphate isomerase: a highly evolved biocatalyst, Cell. Mol. Life
Sci., 67, pp. 3961–3982.
16. Knowles, J. R. (1991). Enzyme catalysis: not different, just better,
Nature, 350, pp. 121–124.
17. Wolfenden, R. (1969). Transition state analogues for enzyme catalysis,
Nature, 223, pp. 704–705.
18. Kamerlin, S. C., Chu, Z. T., and Warshel, A. (2010). On catalytic
preorganization in oxyanion holes: highlighting the problems with the
gas-phase modeling of oxyanion holes and illustrating the need for
complete enzyme models, J. Org. Chem., 75, pp. 6391–6401.
19. Francis, K., Stojković, V., and Kohen, A. (2013). Preservation of protein
dynamics in dihydrofolate reductase evolution, J. Biol. Chem., 288, pp.
35961–35968.
20. Hammes, G. G., Benkovic, S. J., and Hammes-Schiffer, S. (2011).
Flexibility, diversity, and cooperativity: pillars of enzyme catalysis,
Biochemistry, 50, pp. 10422–10430.
21. Wang, L., Goodey, N. M., Benkovic, S. J., and Kohen, A. (2006).
Coordinated effects of distal mutations on environmentally coupled
tunneling in dihydrofolate reductase, Proc. Natl. Acad. Sci., 103, pp.
15753–15758.


22. Hegeman, A. D. (2007). Enzymatic Reaction Mechanisms (Oxford
University Press, USA).
23. Tooze, J., and Branden, C. (1999). Introduction to Protein Structure
(Garland Publishing Inc. New York, USA).
24. Leemhuis, H., Kelly, R. M., and Dijkhuizen, L. (2009). Directed evolution
of enzymes: library screening strategies, IUBMB Life, 61, pp. 222–228.
25. Liszka, M. J., Clark, M. E., Schneider, E., and Clark, D. S. (2012). Nature
versus nurture: developing enzymes that function under extreme
conditions, Annu. Rev. Chem. Biomol. Eng., 3, pp. 77–102.
26. Brannigan, J. A., and Wilkinson, A. J. (2002). Protein engineering 20
years on, Nat. Rev. Mol. Cell Biol., 3, pp. 964–970.
27. Li, X., Zhang, Z., and Song, J. (2012). Computational protein design
approaches with significant biological outcomes: progress and challenges,
Comput. Struct. Biotechnol. J., 2, pp. 305–311.
28. Peisajovich, S. G., and Tawfik, D. S. (2007). Protein engineers turned
evolutionists, Nat. Methods, 4, pp. 991–994.
29. Bornscheuer, U. T., and Pohl, M. (2001). Improved biocatalysts by
directed evolution and rational protein design, Curr. Opin. Chem. Biol.,
5, pp. 137–143.
30. Boersma, Y. L., Droge, M. J., van der Sloot, A. M., Pijning, T., Cool, R.
H., Dijkstra, B. W., and Quax, W. J. (2008). A novel genetic selection
system for improved enantioselectivity of Bacillus subtilis lipase A,
ChemBioChem, 9, pp. 1110–1115.
31. Jestin, J. L., and Kaminski, P. A. (2004). Directed enzyme evolution and
selections for catalysis based on product formation, J. Biotechnol., 113,
pp. 85–103.
32. Li, H., and Liao, J. C. (2013). Development of an NADPH-dependent
homophenylalanine dehydrogenase by protein engineering, ACS Synth.
Biol., 3, pp. 13–20.
33. Reetz, M. T., Höbenreich, H., Soni, P., and Fernández, L. (2008). A genetic
selection system for evolving enantioselectivity of enzymes, Chem.
Commun. (Camb.), pp. 5502–5504.
34. Daugherty, P. S. (2007). Protein engineering with bacterial display,
Curr. Opin. Struct. Biol., 17, pp. 474–480.
35. Klauser, T., Pohlner, J., and Meyer, T. F. (1992). Selective extracellular
release of cholera toxin B subunit by Escherichia coli: dissection of
Neisseria Iga beta-mediated outer membrane transport, EMBO J., 11,
pp. 2327–2335.

36. Chen, I., Dorr, B. M., and Liu, D. R. (2011). A general strategy for the
evolution of bond-forming enzymes using yeast display, Proc. Natl.
Acad. Sci. U S A, 108, pp. 11399–11404.
37. Kondo, A., and Ueda, M. (2004). Yeast cell-surface display: applications
of molecular display, Appl. Microbiol. Biotechnol., 64, pp. 28–40.
38. Amstutz, P., Forrer, P., Zahnd, C., and Plückthun, A. (2001). In vitro
display technologies: novel developments and applications, Curr. Opin.
Biotechnol., 12, pp. 400–405.
39. He, M., and Taussig, M. J. (1997). Antibody-ribosome-mRNA (ARM)
complexes as efficient selection particles for in vitro display and
evolution of antibody combining sites, Nucleic Acids Res., 25, pp. 5132–
5134.
40. Mattheakis, L. C., Bhatt, R. R., and Dower, W. J. (1994). An in vitro
polysome display system for identifying ligands from very large
peptide libraries, Proc. Natl. Acad. Sci., 91, pp. 9022–9026.
41. Plückthun, A. (2012). Ribosome display: a perspective. In Ribosome
Display and Related Technologies (Springer), pp. 3–28.
42. Ravn, P. (2012). Selection of lead antibodies from naive ribosome display
antibody libraries. In Ribosome Display and Related Technologies
(Springer), pp. 213–235.
43. Schaffitzel, C., Hanes, J., Jermutus, L., and Plückthun, A. (1999).
Ribosome display: an in vitro method for selection and evolution of
antibodies from libraries, J. Immunol. Methods, 231, pp. 119–135.
44. Keefe, A. D., and Szostak, J. W. (2001). Functional proteins from a
random-sequence library, Nature, 410, pp. 715–718.
45. Glökler, J., Schutze, T., and Konthur, Z. (2010). Automation in the
high-throughput selection of random combinatorial libraries: different
approaches for select applications, Molecules, 15, pp. 2478–2490.
46. Maruyama, I. N., Maruyama, H. I., and Brenner, S. (1994). Lambda foo: a
lambda phage vector for the expression of foreign proteins, Proc. Natl.
Acad. Sci. U S A, 91, pp. 8273–8277.
47. McCafferty, J., Griffiths, A. D., Winter, G., and Chiswell, D. J. (1990). Phage
antibodies: filamentous phage displaying antibody variable domains,
Nature, 348, pp. 552–554.
48. Veleva, A. N., Nepal, D. B., Frederick, C. B., Schwab, J., Lockyer, P., Yuan,
H., Lalush, D. S., and Patterson, C. (2011). Efficient in vivo selection
of a novel tumor-associated peptide from a phage display library,
Molecules, 16, pp. 900–914.


49. Bawazer, L. A., Izumi, M., Kolodin, D., Neilson, J. R., Schwenzer, B.,
and Morse, D. E. (2012). Evolutionary selection of enzymatically
synthesized semiconductors from biomimetic mineralization vesicles,
Proc. Natl. Acad. Sci. U S A, 109, pp. E1705–1714.
50. Gupta, R. D., and Tawfik, D. S. (2008). Directed enzyme evolution via
small and effective neutral drift libraries, Nat. Methods, 5, pp. 939–
942.
51. Miyamoto-Sato, E., Ishizaka, M., Horisawa, K., Tateyama, S., Takashima,
H., Fuse, S., Sue, K., Hirai, N., Masuoka, K., and Yanagawa, H. (2005).
Cell-free cotranslation and selection using in vitro virus for high-
throughput analysis of protein-protein interactions and complexes,
Genome Res., 15, pp. 710–717.
52. Stapleton, J. A., and Swartz, J. R. (2010). Development of an in vitro
compartmentalization screen for high-throughput directed evolution
of [FeFe] hydrogenases, PLOS ONE, 5, p. e15275.
53. Tawfik, D. S., and Griffiths, A. D. (1998). Man-made cell-like
compartments for molecular evolution, Nat. Biotechnol., 16, pp. 652–656.
54. Bommarius, A. S., Blum, J. K., and Abrahamson, M. J. (2011). Status
of protein engineering for biocatalysts: how to design an industrially
useful biocatalyst, Curr. Opin. Chem. Biol., 15, pp. 194–200.
55. Bornscheuer, U., Huisman, G., Kazlauskas, R., Lutz, S., Moore, J., and
Robins, K. (2012). Engineering the third wave of biocatalysis, Nature,
485, pp. 185–194.
56. Miyazaki, K., and Arnold, F. H. (1999). Exploring nonnatural
evolutionary pathways by saturation mutagenesis: rapid improvement of
protein function, J. Mol. Evol., 49, pp. 716–720.
57. Neylon, C. (2004). Chemical and biochemical strategies for the
randomization of protein encoding DNA sequences: library construction
methods for directed evolution, Nucleic Acids Res., 32, pp. 1448–1459.
58. Müller, A., MacCallum, R. M., and Sternberg, M. J. (2002). Structural
characterization of the human proteome, Genome Res., 12, pp. 1625–
1641.
59. Shivange, A. V., Marienhagen, J., Mundhada, H., Schenk, A., and
Schwaneberg, U. (2009). Advances in generating functional diversity
for directed protein evolution, Curr. Opin. Chem. Biol., 13, pp. 19–25.
60. Tee, K. L., and Wong, T. S. (2013). Polishing the craft of genetic diversity
creation in directed evolution, Biotechnol. Adv., 31, pp. 1707–1721.
61. Cobb, R. E., Si, T., and Zhao, H. (2012). Directed evolution: an evolving
and enabling synthetic biology tool, Curr. Opin. Chem. Biol., 16, pp. 285–
291.

62. Cobb, R. E., Sun, N., and Zhao, H. (2013). Directed evolution as a
powerful synthetic biology tool, Methods, 60, pp. 81–90.
63. Gumulya, Y., Sanchis, J., and Reetz, M. T. (2012). Many pathways in
laboratory evolution can lead to improved enzymes: how to escape
from local minima, ChemBioChem, 13, pp. 1060–1066.
64. Wang, M., Si, T., and Zhao, H. (2012). Biocatalyst development by
directed evolution, Bioresour. Technol., 115, pp. 117–125.
65. Goldsmith, M., and Tawfik, D. S. (2013). Enzyme engineering by
targeted libraries, Methods Enzymol., 523, pp. 257–283.
66. Wu, W., Jia, Z., Liu, P., Xie, Z., and Wei, Q. (2005). A novel PCR strategy
for high-efficiency, automated site-directed mutagenesis, Nucleic Acids
Res., 33, p. e110.
67. Reetz, M. T., Carballeira, J. D., and Vogel, A. (2006). Iterative saturation
mutagenesis on the basis of B factors as a strategy for increasing
protein thermostability, Angew. Chem., Int. Ed., 45, pp. 7745–
7751.
68. Gumulya, Y., and Reetz, M. T. (2011). Enhancing the thermal robustness
of an enzyme by directed evolution: least favorable starting points
and inferior mutants can map superior evolutionary pathways,
ChemBioChem, 12, pp. 2502–2510.
69. Reetz, M. T., Soni, P., Fernandez, L., Gumulya, Y., and Carballeira,
J. D. (2010). Increasing the stability of an enzyme toward hostile
organic solvents by directed evolution based on iterative saturation
mutagenesis using the B-FIT method, Chem. Commun. (Camb.), 46, pp.
8657–8658.
70. Schmidt, D. M., Mundorff, E. C., Dojka, M., Bermudez, E., Ness, J. E.,
Govindarajan, S., Babbitt, P. C., Minshull, J., and Gerlt, J. A. (2003).
Evolutionary potential of (beta/alpha)8-barrels: functional
promiscuity produced by single substitutions in the enolase superfamily,
Biochemistry, 42, pp. 8387–8393.
71. Dalby, P. A. (2011). Strategy and success for the directed evolution of
enzymes, Curr. Opin. Struct. Biol., 21, pp. 473–480.
72. Goldsmith, M., Ashani, Y., Simo, Y., Ben-David, M., Leader, H., Silman,
I., Sussman, J. L., and Tawfik, D. S. (2012). Evolved stereoselective
hydrolases for broad-spectrum G-type nerve agent detoxification,
Chem. Biol., 19, pp. 456–466.
73. Romero, P. A., and Arnold, F. H. (2009). Exploring protein fitness
landscapes by directed evolution, Nat. Rev. Mol. Cell Biol., 10, pp. 866–
876.


74. Jäckel, C., Kast, P., and Hilvert, D. (2008). Protein design by directed
evolution, Annu. Rev. Biophys., 37, pp. 153–173.
75. Landwehr, M., Carbone, M., Otey, C. R., Li, Y., and Arnold, F. H. (2007).
Diversification of catalytic function in a synthetic family of chimeric
cytochrome P450s, Chem. Biol., 14, pp. 269–278.
76. Meyer, M. M., Hochrein, L., and Arnold, F. H. (2006). Structure-guided
SCHEMA recombination of distantly related β-lactamases, Protein Eng.
Des. Sel., 19, pp. 563–570.
77. Meyer, M. M., Silberg, J. J., Voigt, C. A., Endelman, J. B., Mayo, S. L., Wang,
Z. G., and Arnold, F. H. (2003). Library analysis of SCHEMA-guided
protein recombination, Protein Sci., 12, pp. 1686–1693.
78. Bloom, J. D., Romero, P. A., Lu, Z., and Arnold, F. H. (2007). Neutral
genetic drift can alter promiscuous protein functions, potentially
aiding functional evolution, Biol. Direct, 2, p. 17.
79. Reetz, M. T. (2004). Controlling the enantioselectivity of enzymes by
directed evolution: practical and theoretical ramifications, Proc. Natl.
Acad. Sci. U S A, 101, pp. 5716–5722.
80. Tobin, M. B., Gustafsson, C., and Huisman, G. W. (2000). Directed
evolution: the ‘rational’ basis for ‘irrational’ design, Curr. Opin. Struct.
Biol., 10, pp. 421–427.
81. McLachlan, M., Sullivan, R. P., and Zhao, H. (2009). Directed enzyme
evolution and high throughput screening. In Biocatalysis for the
Pharmaceutical Industry-Discovery, Development, and Manufacturing
(John Wiley and Sons, Singapore), pp. 45–64.
82. Steiner, K., and Schwab, H. (2012). Recent advances in rational
approaches for enzyme engineering, Comp. Struct. Biotechnol. J., 2, p.
e201209010.
83. Bershtein, S., and Tawfik, D. S. (2008). Advances in laboratory evolution
of enzymes, Curr. Opin. Chem. Biol., 12, pp. 151–158.
84. Jaeger, K.-E., and Reetz, M. T. (2000). Directed evolution of
enantioselective enzymes for organic chemistry, Curr. Opin. Chem. Biol., 4, pp.
68–73.
85. Arnold, F. H., and Georgiou, G. (2003). Directed Evolution Library
Creation, Methods Mol. Biol., 231, p. 232.
86. Wang, T.-W., Zhu, H., Ma, X.-Y., Zhang, T., Ma, Y.-S., and Wei, D.-Z.
(2006). Mutant library construction in directed molecular evolution,
Mol. Biotechnol., 34, pp. 55–68.
87. Mullis, K. B., and Faloona, F. A. (1987). Specific synthesis of DNA in vitro
via a polymerase-catalyzed chain reaction, Methods Enzymol., 155, pp.
335–350.

88. Saiki, R. K., Scharf, S., Faloona, F., Mullis, K. B., Horn, G. T., Erlich, H.
A., and Arnheim, N. (1985). Enzymatic amplification of beta-globin
genomic sequences and restriction site analysis for diagnosis of sickle
cell anemia, Science, 230, pp. 1350–1354.
89. Eckert, K. A., and Kunkel, T. A. (1991). DNA polymerase fidelity and the
polymerase chain reaction, PCR Methods Appl., 1, pp. 17–24.
90. Cadwell, R. C., and Joyce, G. F. (1992). Randomization of genes by PCR
mutagenesis, PCR Methods Appl., 2, pp. 28–33.
91. Zaccolo, M., Williams, D. M., Brown, D. M., and Gherardi, E. (1996).
An approach to random mutagenesis of DNA using mixtures of
triphosphate derivatives of nucleoside analogues, J. Mol. Biol., 255, pp.
589–603.
92. Zhou, Y., Zhang, X., and Ebright, R. H. (1991). Random mutagenesis of
gene-sized DNA molecules by use of PCR with Taq DNA polymerase,
Nucleic Acids Res., 19, p. 6052.
93. McCullum, E. O., Williams, B. A., Zhang, J., and Chaput, J. C. (2010).
Random mutagenesis by error-prone PCR. In In Vitro Mutagenesis
Protocols (Springer), pp. 103–109.
94. Schlinkmann, K. M., and Plückthun, A. (2013). Directed evolution
of G-protein-coupled receptors for high functional expression and
detergent stability, Methods Enzymol., 520, pp. 67–97.
95. Sarkar, C. A., Dodevski, I., Kenig, M., Dudli, S., Mohr, A., Hermans, E.,
and Plückthun, A. (2008). Directed evolution of a G protein-coupled
receptor for expression, stability, and binding selectivity, Proc. Natl.
Acad. Sci., 105, pp. 14808–14813.
96. Dodevski, I., and Plückthun, A. (2011). Evolution of three human
GPCRs for higher expression and stability, J. Mol. Biol., 408, pp. 599–
615.
97. Wong, T. S., Tee, K. L., Hauer, B., and Schwaneberg, U. (2004).
Sequence saturation mutagenesis (SeSaM): a novel method for
directed evolution, Nucleic Acids Res., 32, p. e26.
98. Stemmer, W. P. (1994). DNA shuffling by random fragmentation and
reassembly: in vitro recombination for molecular evolution, Proc. Natl.
Acad. Sci. U S A, 91, pp. 10747–10751.
99. Stemmer, W. P. (1994). Rapid evolution of a protein in vitro by DNA
shuffling, Nature, 370, pp. 389–391.
100. Yuan, L., Kurek, I., English, J., and Keenan, R. (2005). Laboratory-
directed protein evolution, Microbiol. Mol. Biol. Rev., 69, pp. 373–
392.


101. Crameri, A., Whitehorn, E., Tate, E., and Stemmer, W. (1996).
Improved green fluorescent protein by molecular evolution using DNA
shuffling, Nat. Biotechnol., 14, pp. 315–319.
102. Tian, Y. S., Xu, J., Peng, R. H., Xiong, A. S., Xu, H., Zhao, W., Fu, X.
Y., Han, H. J., and Yao, Q. H. (2013). Mutation by DNA shuffling of
5-enolpyruvylshikimate-3-phosphate synthase from Malus domestica
for improved glyphosate resistance, Plant Biotechnol. J., 11, pp. 829–
838.
103. Raillard, S., Krebber, A., Chen, Y., Ness, J. E., Bermudez, E., Trinidad, R.,
Fullem, R., Davis, C., Welch, M., Seffernick, J., Wackett, L. P., Stemmer,
W. P., and Minshull, J. (2001). Novel enzyme activities and functional
plasticity revealed by recombining highly homologous enzymes, Chem.
Biol., 8, pp. 891–898.
104. Rosic, N. N., Huang, W., Johnston, W. A., DeVoss, J. J., and Gillam, E. M.
(2007). Extending the diversity of cytochrome P450 enzymes by DNA
family shuffling, Gene, 395, pp. 40–48.
105. Coco, W. M., Levinson, W. E., Crist, M. J., Hektor, H. J., Darzins, A., Pienkos,
P. T., Squires, C. H., and Monticello, D. J. (2001). DNA shuffling method
for generating highly recombined genes and evolved enzymes, Nat.
Biotechnol., 19, pp. 354–359.
106. Ostermeier, M., Shim, J. H., and Benkovic, S. J. (1999). A combinatorial
approach to hybrid enzymes independent of DNA homology, Nat.
Biotechnol., 17, pp. 1205–1209.
107. Kawarasaki, Y., Griswold, K. E., Stevenson, J. D., Selzer, T., Benkovic, S. J.,
Iverson, B. L., and Georgiou, G. (2003). Enhanced crossover SCRATCHY:
construction and high-throughput screening of a combinatorial library
containing multiple non-homologous crossovers, Nucleic Acids Res., 31,
p. e126.
108. Griswold, K. E., Kawarasaki, Y., Ghoneim, N., Benkovic, S. J., Iverson,
B. L., and Georgiou, G. (2005). Evolution of highly active enzymes by
homology-independent recombination, Proc. Natl. Acad. Sci. U S A, 102,
pp. 10082–10087.
109. Zhao, H., Giver, L., Shao, Z., Affholter, J. A., and Arnold, F. H. (1998).
Molecular evolution by staggered extension process (StEP) in vitro
recombination, Nat. Biotechnol., 16, pp. 258–261.
110. Kolkman, J. A., and Stemmer, W. P. (2001). Directed evolution of
proteins by exon shuffling, Nat. Biotechnol., 19, pp. 423–428.
111. Tsuji, T., Onimaru, M., and Yanagawa, H. (2001). Random multi-
recombinant PCR for the construction of combinatorial protein
libraries, Nucleic Acids Res., 29, p. E97.

112. Jaeger, K.-E., Eggert, T., Eipper, A., and Reetz, M. (2001). Directed
evolution and the creation of enantioselective biocatalysts, Appl.
Microbiol. Biotechnol., 55, pp. 519–530.
113. Lutz, S. (2010). Beyond directed evolution: semi-rational protein
engineering and design, Curr. Opin. Biotechnol., 21, pp. 734–743.
114. Heinzelman, P., Snow, C. D., Wu, I., Nguyen, C., Villalobos, A.,
Govindarajan, S., Minshull, J., and Arnold, F. H. (2009). A family of thermostable
fungal cellulases created by structure-guided recombination, Proc.
Natl. Acad. Sci. U S A, 106, pp. 5610–5615.
115. Cherry, J. R., Lamsa, M. H., Schneider, P., Vind, J., Svendsen, A., Jones, A.,
and Pedersen, A. H. (1999). Directed evolution of a fungal peroxidase,
Nat. Biotechnol., 17, pp. 379–384.
116. Hoffmann, G., Bonsch, K., Greiner-Stoffele, T., and Ballschmiter, M.
(2011). Changing the substrate specificity of P450cam towards
diphenylmethane by semi-rational enzyme engineering, Protein Eng.
Des. Sel., 24, pp. 439–446.
117. Williams, G., Nelson, A., and Berry, A. (2004). Directed evolution of
enzymes for biocatalysis and the life sciences, Cell. Mol. Life Sci., 61,
pp. 3034–3046.
118. Chaparro-Riggers, J. F., Polizzi, K. M., and Bommarius, A. S. (2007).
Better library design: data-driven protein engineering, Biotechnol. J.,
2, pp. 180–191.
119. Jürgens, C., Strom, A., Wegener, D., Hettwer, S., Wilmanns, M., and
Sterner, R. (2000). Directed evolution of a (βα)8-barrel enzyme to
catalyze related reactions in two different metabolic pathways, Proc.
Natl. Acad. Sci., 97, pp. 9925–9930.
120. Claren, J., Malisi, C., Höcker, B., and Sterner, R. (2009). Establishing
wild-type levels of catalytic activity on natural and artificial
(βα)8-barrel protein scaffolds, Proc. Natl. Acad. Sci., 106, pp. 3704–3709.
121. Saab-Rincon, G., Olvera, L., Olvera, M., Rudino-Pinera, E., Benites,
E., Soberon, X., and Morett, E. (2012). Evolutionary walk between
(beta/alpha)(8) barrels: catalytic migration from triosephosphate
isomerase to thiamin phosphate synthase, J. Mol. Biol., 416, pp. 255–
270.
122. Salema, V., Marín, E., Martínez-Arteaga, R., Ruano-Gallego, D., Fraile, S.,
Margolles, Y., Teira, X., Gutierrez, C., Bodelón, G., and Fernández, L. Á.
(2013). Selection of single domain antibodies from immune libraries
displayed on the surface of E. coli cells with two β-domains of opposite
topologies, PLOS ONE, 8, p. e75126.


123. Blomberg, R., Kries, H., Pinkas, D. M., Mittl, P. R., Grütter, M. G., Privett, H.
K., Mayo, S. L., and Hilvert, D. (2013). Precision is essential for efficient
catalysis in an evolved Kemp eliminase, Nature, 503, pp. 418–421.
124. Reetz, M. T., Kahakeaw, D., and Lohmer, R. (2008). Addressing the
numbers problem in directed evolution, ChemBioChem, 9, pp. 1797–
1804.
125. Arnold, F. H., and Volkov, A. A. (1999). Directed evolution of
biocatalysts, Curr. Opin. Chem. Biol., 3, pp. 54–59.
126. Soumillion, P., and Fastrez, J. (2001). Novel concepts for selection of
catalytic activity, Curr. Opin. Biotechnol., 12, pp. 387–394.
127. Schmidt-Dannert, C., and Arnold, F. H. (1999). Directed evolution of
industrial enzymes, Trends Biotechnol., 17, pp. 135–136.
128. Laos, R., Shaw, R., Leal, N. A., Gaucher, E., and Benner, S. (2013).
Directed evolution of polymerases to accept nucleotides with
nonstandard hydrogen bond patterns, Biochemistry, 52, pp. 5288–5294.
129. Smith, G. P. (1985). Filamentous fusion phage: novel expression vectors
that display cloned antigens on the virion surface, Science, 228, pp.
1315–1317.
130. Rothe, C., Urlinger, S., Löhning, C., Prassler, J., Stark, Y., Jäger, U.,
Hubner, B., Bardroff, M., Pradel, I., and Boss, M. (2008). The human
combinatorial antibody library HuCAL GOLD combines diversification
of all six CDRs according to the natural immune system with a novel
display method for efficient selection of high-affinity antibodies, J. Mol.
Biol., 376, pp. 1182–1200.
131. Roberts, R. W., and Szostak, J. W. (1997). RNA-peptide fusions for the
in vitro selection of peptides and proteins, Proc. Natl. Acad. Sci., 94, pp.
12297–12302.
132. Hanes, J., and Plückthun, A. (1997). In vitro selection and evolution of
functional proteins by using ribosome display, Proc. Natl. Acad. Sci., 94,
pp. 4937–4942.
133. Gupta, R. D., Goldsmith, M., Ashani, Y., Simo, Y., Mullokandov, G., Bar,
H., Ben-David, M., Leader, H., Margalit, R., and Silman, I. (2011).
Directed evolution of hydrolases for prevention of G-type nerve agent
intoxication, Nat. Chem. Biol., 7, pp. 120–125.
134. Gerlt, J. A., and Babbitt, P. C. (1998). Mechanistically diverse enzyme
superfamilies: the importance of chemistry in the evolution of
catalysis, Curr. Opin. Chem. Biol., 2, pp. 607–612.
135. Copley, R. R., and Bork, P. (2000). Homology among (βα)8 barrels:
implications for the evolution of metabolic pathways, J. Mol. Biol., 303,
pp. 627–641.

136. Nagano, N., Orengo, C. A., and Thornton, J. M. (2002). One fold with
many functions: the evolutionary relationships between TIM barrel
families based on their sequences, structures and functions, J. Mol.
Biol., 321, pp. 741–765.
137. Banner, D., Bloomer, A., Petsko, G., Phillips, D., Pogson, C., Wilson, I.,
Corran, P., Furth, A., Milman, J., and Offord, R. (1975). Structure of chicken
muscle triose phosphate isomerase determined crystallographically at
2.5 Å resolution: using amino acid sequence data, Nature, 255, pp.
609–614.
138. Fersht, A. (1999). Structure and Mechanism in Protein Science: A Guide
to Enzyme Catalysis and Protein Folding (Macmillan, New York).
139. Richard, J. P. (1984). Acid–base catalysis of the elimination and
isomerization reactions of triose phosphates, J. Am. Chem. Soc., 106,
pp. 4926–4936.
140. Privett, H. K., Kiss, G., Lee, T. M., Blomberg, R., Chica, R. A., Thomas, L.
M., Hilvert, D., Houk, K. N., and Mayo, S. L. (2012). Iterative approach
to computational enzyme design, Proc. Natl. Acad. Sci., 109, pp. 3790–
3795.
141. Shukla, A., and Guptasarma, P. (2004). Folding of β/α-unit scrambled
forms of S. cerevisiae triosephosphate isomerase: evidence for
autonomy of substructure formation and plasticity of hydrophobic and
hydrogen bonding interactions in core of (β/α)8-barrel, Proteins, 55,
pp. 548–557.
142. Luger, K., Hommel, U., Herold, M., Hofsteenge, J., and Kirschner, K.
(1989). Correct folding of circularly permuted variants of a beta alpha
barrel enzyme in vivo, Science, 243, pp. 206–210.
143. Eder, J., and Kirschner, K. (1992). Stable substructures of eightfold
βα-barrel proteins: fragment complementation of
phosphoribosylanthranilate isomerase, Biochemistry, 31, pp. 3617–3625.
144. Tafelmeyer, P., Johnsson, N., and Johnsson, K. (2004). Transforming a
(β/α)8-Barrel Enzyme into a Split-Protein Sensor through Directed
Evolution, Chem. Biol., 11, pp. 681–689.
145. Silverman, J., Balakrishnan, R., and Harbury, P. (2001). Reverse
engineering the (β/α)8 barrel fold, Proc. Natl. Acad. Sci., 98, pp. 3092–
3097.
146. Kursula, I., Partanen, S., Lambeir, A. M., and Wierenga, R. K. (2002).
The importance of the conserved Arg191-Asp227 salt bridge of
triosephosphate isomerase for folding, stability, and catalysis, FEBS
Lett., 518, pp. 39–42.


147. Gerlt, J. A., and Raushel, F. M. (2003). Evolution of function in
(β/α)8-barrel enzymes, Curr. Opin. Chem. Biol., 7, pp. 252–264.
148. Sterner, R., and Höcker, B. (2005). Catalytic versatility, stability, and
evolution of the (βα)8-barrel enzyme fold, Chem. Rev., 105, pp.
4038–4055.
149. Hermes, J. D., Parekh, S. M., Blacklow, S. C., Koster, H., and Knowles, J. R.
(1989). A reliable method for random mutagenesis: the generation of
mutant libraries using spiked oligodeoxyribonucleotide primers, Gene,
84, pp. 143–151.
150. Sun, J., and Sampson, N. S. (1998). Determination of the amino
acid requirements for a protein hinge in triosephosphate isomerase,
Protein Sci., 7, pp. 1495–1505.
151. Kursula, I., Salin, M., Sun, J., Norledge, B. V., Haapalainen, A. M.,
Sampson, N. S., and Wierenga, R. K. (2004). Understanding protein
lids: structural analysis of active hinge mutants in triosephosphate
isomerase, Protein Eng. Des. Sel., 17, pp. 375–382.
152. Saab-Rincon, G., Juarez, V. R., Osuna, J., Sanchez, F., and Soberon,
X. (2001). Different strategies to recover the activity of monomeric
triosephosphate isomerase by directed evolution, Protein Eng., 14, pp.
149–155.
153. Wada, M., Hsu, C.-C., Franke, D., Mitchell, M., Heine, A., Wilson, I., and
Wong, C.-H. (2003). Directed evolution of N-acetylneuraminic acid
aldolase to catalyze enantiomeric aldol reactions, Bioorg. Med. Chem.,
11, pp. 2091–2098.
154. Lee, S. M., Jellison, T., and Alper, H. S. (2012). Directed evolution of
xylose isomerase for improved xylose catabolism and fermentation in
the yeast Saccharomyces cerevisiae, Appl. Environ. Microbiol., 78, pp.
5708–5716.
155. Jaenicke, R., and Böhm, G. (1998). The stability of proteins in extreme
environments, Curr. Opin. Struct. Biol., 8, pp. 738–748.
156. Lönn, A., Gardonyi, M., van Zyl, W., Hahn-Hägerdal, B., and Otero,
R. C. (2002). Cold adaptation of xylose isomerase from Thermus
thermophilus through random PCR mutagenesis, Eur. J. Biochem., 269,
pp. 157–163.
157. Borchert, T. V., Abagyan, R., Kishan, K. V., Zeelen, J. P., and Wierenga,
R. K. (1993). The crystal structure of an engineered monomeric
triosephosphate isomerase, monoTIM: the correct modelling of an
eight-residue loop, Structure, 1, pp. 205–213.

158. Norledge, B. V., Lambeir, A. M., Abagyan, R. A., Rottmann, A., Fernandez,
A. M., Filimonov, V. V., Peter, M. G., and Wierenga, R. K. (2001). Modeling,
mutagenesis, and structural studies on the fully conserved phosphate-
binding loop (loop 8) of triosephosphate isomerase: toward a new
substrate specificity, Proteins, 42, pp. 383–389.
159. Alahuhta, M., Salin, M., Casteleijn, M. G., Kemmer, C., El-Sayed, I.,
Augustyns, K., Neubauer, P., and Wierenga, R. K. (2008). Structure-
based protein engineering efforts with a monomeric TIM variant: the
importance of a single point mutation for generating an active site with
suitable binding properties, Protein Eng. Des. Sel., 21, pp. 257–266.
160. Salin, M., Kapetaniou, E. G., Vaismaa, M., Lajunen, M., Casteleijn, M. G.,
Neubauer, P., Salmon, L., and Wierenga, R. K. (2010). Crystallographic
binding studies with an engineered monomeric variant of
triosephosphate isomerase, Acta Crystallogr. D Biol. Crystallogr., 66, pp. 934–
944.
161. Kemp, D., and Casey, M. L. (1973). Physical organic chemistry of
benzisoxazoles. II. Linearity of the Broensted free energy relation for
the base-catalyzed decomposition of benzisoxazoles, J. Am. Chem. Soc.,
95, pp. 6670–6680.
162. Frushicheva, M. P., Cao, J., Chu, Z. T., and Warshel, A. (2010). Exploring
challenges in rational enzyme design by simulating the catalysis in
artificial kemp eliminase, Proc. Natl. Acad. Sci., 107, pp. 16869–16874.
163. Frushicheva, M. P., Cao, J., and Warshel, A. (2011). Challenges and
advances in validating enzyme design proposals: the case of Kemp
eliminase catalysis, Biochemistry, 50, pp. 3849–3858.
164. Thorn, S. N., Daniels, R. G., Auditor, M.-T. M., and Hilvert, D. (1995).
Large rate accelerations in antibody catalysis by strategic use of
haptenic charge, Nature, 373, pp. 228–230.
165. Korendovych, I. V., Kulp, D. W., Wu, Y., Cheng, H., Roder, H., and DeGrado,
W. F. (2011). Design of a switchable eliminase, Proc. Natl. Acad. Sci.,
108, pp. 6823–6827.
166. Röthlisberger, D., Khersonsky, O., Wollacott, A. M., Jiang, L., DeChancie,
J., Betker, J., Gallaher, J. L., Althoff, E. A., Zanghellini, A., and Dym, O.
(2008). Kemp elimination catalysts by computational enzyme design,
Nature, 453, pp. 190–195.
167. Knowles, J. R., and Albery, W. J. (1977). Perfection in enzyme catalysis:
the energetics of triosephosphate isomerase, Acc. Chem. Res., 10, pp.
105–111.
168. Hennig, M., Darimont, B., Jansonius, J., and Kirschner, K. (2002). The
catalytic mechanism of indole-3-glycerol phosphate synthase: crystal
structures of complexes of the enzyme from Sulfolobus solfataricus
with substrate analogue, substrate, and product, J. Mol. Biol., 319, pp.
757–766.
169. Khersonsky, O., Kiss, G., Röthlisberger, D., Dym, O., Albeck, S., Houk,
K. N., Baker, D., and Tawfik, D. S. (2012). Bridging the gaps in
design methodologies by evolutionary optimization of the stability and
proficiency of designed Kemp eliminase KE59, Proc. Natl. Acad. Sci.,
109, pp. 10358–10363.
170. Lo Leggio, L., Kalogiannis, S., Eckert, K., Teixeira, S., Bhat, M. K., Andrei,
C., Pickersgill, R. W., and Larsen, S. (2001). Substrate specificity and
subsite mobility in T. aurantiacus xylanase 10A, FEBS Lett., 509, pp.
303–308.
171. Malabanan, M. M., Nitsch-Velasquez, L., Amyes, T. L., and Richard, J. P.
(2013). Magnitude and Origin of the Enhanced Basicity of the Catalytic
Glutamate of Triosephosphate Isomerase, J. Am. Chem. Soc., 135, pp.
5978–5981.
172. Pihko, P. M., Rapakko, S., and Wierenga, R. K. (2009). Hydrogen bonding
in organic synthesis: oxyanion holes and their mimics, in Hydrogen
Bonding in Organic Synthesis (Wiley-VCH Verlag, Weinheim, Germany).
173. Hamed, R. B., Batchelar, E. T., Clifton, I., and Schofield, C. J. (2008).
Mechanisms and structures of crotonase superfamily enzymes: how
nature controls enolate and oxyanion reactivity, Cell. Mol. Life Sci., 65,
pp. 2507–2527.
174. Pápai, I., Hamza, A., Pihko, P. M., and Wierenga, R. K. (2011).
Stereoelectronic requirements for optimal hydrogen-bond-catalyzed
enolization, Chem. A Eur. J., 17, pp. 2859–2866.
175. Giger, L., Caner, S., Obexer, R., Kast, P., Baker, D., Ban, N., and Hilvert, D.
(2013). Evolution of a designed retro-aldolase leads to complete active
site remodeling, Nat. Chem. Biol., 9, pp. 494–498.
176. Arnaud, C. H. (2013). Enzyme by design, Chem. Eng. News, 91, pp. 26–
27.
March 21, 2016 13:48 PSP Book - 9in x 6in 17-Allan-Svendsen-c17

Chapter 17

Handling the Numbers Problem in Directed Evolution

Carlos G. Acevedo-Rochaᵃ,ᵇ and Manfred T. Reetzᵃ,ᵇ

ᵃ Fachbereich Chemie, Philipps-Universität, Hans-Meerwein-Strasse,
35032 Marburg, Germany
ᵇ Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1,
45070 Mülheim, Germany

reetz@mpi-muelheim.mpg.de

This chapter summarizes statistical methods and models that serve
as useful guides when designing and generating high-quality
mutant libraries in directed evolution, the emphasis being on
saturation mutagenesis as the preferred gene mutagenesis method.
In particular, the degree of oversampling when assessing mutant
libraries needs to be considered for optimal experimentation, as
screening, compared with genetic selection and display systems,
is the main bottleneck in the overall procedure. Thus, efforts toward
practical applications should be focused on reducing the screening
effort as much as possible. Along these lines, the statistical basis
for choosing appropriate multiresidue randomization sites in
conjunction with reduced amino acid alphabets is illuminated. A
comparison between the traditional Patrick and Firth algorithm
and the recently proposed statistical approach by Nov is presented.
Finally, practical tips are given on how to perform iterative saturation
mutagenesis (ISM) in the quest to evolve enzyme variants with
increased activity, enhanced or reversed stereoselectivity, altered
regioselectivity, and higher thermostability.

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

17.1 Introduction

Directed evolution is now a well-established genetic method for
improving essentially any catalytic property of enzymes as catalysts
in synthetic organic chemistry and biotechnology [1–4]. The long-
standing traditional limitations regarding the use of enzymes in
practical applications such as low activity, narrow substrate scope,
and poor or wrong stereo- and regioselectivity, as well as insufficient
robustness under operating conditions, can be addressed by this
protein engineering method. Other problems, including allostery,
product inhibition, poor expression efficiency, and undesired side
products, can also be solved. Directed evolution involves iterative
cycles of gene mutagenesis, expression, and screening or genetic
selection. It is the evolutionary pressure exerted in each step that
makes this kind of laboratory evolution, as it is sometimes called,
different from the so-called rational design based on traditional
site-directed mutagenesis. The most commonly employed gene
mutagenesis methods are error-prone polymerase chain reaction
(epPCR), saturation mutagenesis, and DNA shuffling [1–4]. Applying
any one of these techniques or combinations thereof generally
leads to success, the degree of catalyst improvement depending
upon the amount of time and resources the researcher is willing to
invest. However, for truly efficient directed evolution, strategies and
techniques that allow for fast and reliable access to highly improved
biocatalysts are necessary. This is especially true when aiming for
industrial applications.
Recalling that the last step in a directed evolution process is
screening [5, 6] or selection [7–9], it is important to differentiate
between the two options. In the former case, analytical tools such
as UV-Vis spectroscopy and fluorescence or multiplexing mass
spectrometry (MS) are used to screen or prescreen all of the
harvested transformants in a given mutant library, typically 10^3–10^5
clones. In numerous studies, a suitable chromophore or fluorophore
is attached covalently to the substrate of interest, but such surrogate
compounds are not likely to be employed as starting materials in
real (industrial) applications. The use of isotope-labeled substrates
when applying multiplexing MS does not entail such problems [10],
but this analytical instrument is quite expensive.
In contrast, genetic selection systems are constructed in such a
way that they ensure cell growth of the host organism (bacterial,
yeast, etc.) because it harbors a given enzyme with the desired
improved or altered catalytic property necessary for survival [7–
9]. At the DNA level, extremely large libraries covering extensive
portions of protein sequence space would in principle be possible
(e.g., 10^9), limited only by factors such as the efficiency of
transformation. The crucial advantage would be that only those
host cells will appear on agar plates that harbor variants with
the desired catalytic profile. Ideally, the so-called junk variants
would not be formed, in contrast to systems based on screening
in which only few transformants harbor improved biocatalysts
(hits). Unfortunately, efficient selection systems for every enzyme
type, reaction category, and nature of substrate have not been
developed to date. Although a few selection systems geared for
directed evolution of enantioselective enzymes have been reported
[11, 12], none have reached a truly practical format [13]. Yet another
alternative is to employ some kind of (surface) display system,
coupled with an ultrahigh-throughput assay based on fluorescence-
activated cell sorting (FACS), allowing typically 10^8 cells to be
evaluated within a few hours [14–18]. Some researchers view these
experimental setups as selection systems, while others call them
screening assays. Irrespective of the nomenclature, these systems
usually utilize surrogate substrates that would not be used in
real applications. Furthermore, progress in directed evolution of
enantioselective enzymes using the FACS-based technology has been
slow [16–18].
All of these approaches need to be viewed in terms of the
numbers problem in directed evolution, which has to do with the
vastness of protein sequence space. Consider, for example, a protein
composed of 300 amino acids. The number (N) of theoretically
possible variants using all 20 canonical amino acids as building
blocks in mutagenesis depends upon the number of point mutations
occurring in a given mutant according to Eq. 17.1:

$$N = \frac{19^{M} \times 300!}{(300 - M)! \times M!} \qquad (17.1)$$

where M is the number of amino acid substitutions per enzyme
molecule. For M = 1, the number of different variants amounts
to 5700, and it reaches astronomical dimensions when M = 2
(∼16 million variants) or M = 3 (∼30 billion variants). Thus, one
possibility would be to strive for libraries as large as possible
by applying screening based on a display system coupled with
FACS, thereby covering large portions of protein sequence space, or
by devising an efficient selection system likewise covering large
portions of DNA sequence space. The alternative would
be to keep the mutant libraries as small as possible so that they
can be handled by a convenient screening technique, while ensuring
that they contain notably improved variants. In this case it would be
necessary to navigate efficiently in protein sequence space, ideally
by restricting the search to a targeted protein region in which the
likelihood of finding sufficiently improved hits is highest.
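
The numbers quoted for Eq. 17.1 can be checked with a few lines of Python. This is only an illustrative sketch; the function name and its parameters are assumptions introduced here, not part of any published tool.

```python
from math import comb


def variant_count(length: int, substitutions: int, alphabet: int = 20) -> int:
    """Number of variants carrying exactly `substitutions` point mutations (Eq. 17.1).

    comb(length, substitutions) counts which positions are mutated, and each
    mutated position can take any of the (alphabet - 1) other amino acids.
    """
    return (alphabet - 1) ** substitutions * comb(length, substitutions)


# A 300-residue protein with one, two, or three simultaneous substitutions.
for m in (1, 2, 3):
    print(f"M = {m}: {variant_count(300, m):,} variants")
```

The binomial coefficient `comb(300, M)` is the same quantity as 300!/((300 − M)! × M!) in Eq. 17.1, evaluated without forming the large factorials explicitly.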
Due to the problems associated with devising efficient selection
systems as in the case of stereoselective enzymes [13], screening
has been used most often in directed evolution [5, 6]. Since this is
the bottleneck of the overall process in terms of time expenditure
and resources, much research has been invested in increasing the
efficacy of directed evolution, not just in devising practical screening
systems. The logical consequence has been the development of
mutagenesis strategies and techniques that generate higher-quality
(smarter) libraries at the DNA and protein levels, thus requiring
a minimum screening effort. The practical goal is to generate
small libraries harboring highly improved variants that can be
screened for activity, stereoselectivity, and/or regioselectivity by
automated gas chromatography (GC) or high-performance liquid
chromatography (HPLC) available in standard laboratories. This
means that individual 2000- to 3000-membered libraries can
be handled within a few days using microtiter plates without
the need for surrogates or expensive analytical tools. Figure 17.1
summarizes the three approaches to handle the numbers problem
in directed evolution: (1) screening based on traditional analytical
instrumentation, (2) screening utilizing display systems, and (3)
selection systems.

Figure 17.1 Scheme illustrating three approaches for handling the numbers
problem in directed evolution.

This chapter reviews recent progress in developing statistical
methods as aids in handling the numbers problem in directed evolution.
Emphasis is on saturation mutagenesis [19, 20] and iterative
saturation mutagenesis (ISM) [4, 20], because these knowledge-
guided methods have emerged as the most general and efficient
approaches to directed evolution for increasing activity, enlarging
or shifting substrate scope, enhancing or reversing stereoselectivity,
altering regioselectivity and/or enhancing thermostability, and
modulating binding affinity and even protein–protein interactions
of a growing number of different enzyme types. If the reader is
interested in statistical models for random methods such as epPCR,
some guidelines can be found elsewhere [21]. At the end of the
chapter, a few tips on how to group residues for ISM and how to
avoid possible pitfalls are included [22].

17.2 Saturation Mutagenesis in Directed Evolution

Saturation mutagenesis, sometimes termed “cassette mutagenesis,”
“combinatorial saturation mutagenesis,” or “site saturation
mutagenesis,” has been used in protein engineering for a long time
[4, 19, 23]. It constitutes a crucial extension of the original site-
specific mutagenesis in proteins, as developed by Smith [24]. The
method involves amino acid randomization at predetermined single
positions or at sites comprising more than one position, with the
introduction of all other 19 canonical amino acids or a subset
thereof. Focused libraries are thereby generated, in contrast to
epPCR, which is a shotgun method addressing the whole gene or
gene regions. One of the earliest applications was the improvement
of oxidative stability of enzymes by focusing on residues that had
previously been subjected to site-directed mutagenesis utilizing
rational design, a procedure that led to new point mutations
imparting greater thermostability than the original amino acid
substitution [25]. Later, saturation mutagenesis at a four-residue
site lining the binding pocket of a lipase was applied for the first
time in order to enhance the enantioselectivity of an enzyme [26]. This
approach was subsequently systematized by considering all possible
first- and second-sphere residues around the binding pocket as
part of the combinatorial active-site saturation test (CAST) (a useful
acronym for describing such a form of saturation mutagenesis) [27].
The individual amino acid positions are then grouped into sites,
termed A, B, C, D, and so on, each composed of one or more residues,
as illustrated in Fig. 17.2. When enhancing thermostability and
robustness in organic solvents, the criterion for choosing optimal
sites for saturation mutagenesis is different: residues having the
highest B-factors, as revealed by X-ray crystallography, are chosen
(B-FIT method) [28].
Another decisive step forward was the development of ISM [22,
28, 29]. The gene of a given hit in one library is used as a template
for saturation mutagenesis at the other sites, and the process is
continued until all identified sites have been visited once. The cases
for two-, three-, and four-site ISM systems are illustrated in Fig. 17.3.
A rapidly growing number of studies have appeared on the basis
of this principle, in all cases relatively small mutant libraries in the
order of 2000–3000 transformants being screened [4, 20].
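
How quickly a complete ISM scheme grows with the number of sites can be sketched by enumerating the visiting orders. The sketch below rests on two assumptions made here for illustration: every pathway is followed to the end, and a library is identified by the ordered list of sites visited so far (so that B randomized on a hit from A differs from A randomized on a hit from B). The function name is hypothetical.

```python
from itertools import permutations


def ism_scheme(sites):
    """Return (pathways, libraries) for a complete ISM scheme over `sites`."""
    # Each pathway visits every site exactly once, in some order.
    pathways = list(permutations(sites))
    # A library is identified by the ordered sites visited so far,
    # e.g. ("A", "B") is site B randomized on a hit taken from library A.
    libraries = {path[:step] for path in pathways for step in range(1, len(path) + 1)}
    return pathways, libraries


for sites in ("AB", "ABC", "ABCD"):
    pathways, libraries = ism_scheme(sites)
    print(f"{len(sites)} sites: {len(pathways)} pathways, {len(libraries)} libraries")
```

Under these assumptions an n-site system has n! pathways, and the number of distinct libraries is the sum of n!/(n − k)! over k = 1 to n.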
In the quest to keep the screening effort at a minimum, it is
helpful to consider factors revolving around such questions as how
to group individual amino acid positions into multiresidue sites,
how to choose the optimal reduced amino acid alphabet, which
upward pathway to choose, and how to escape from possible local
minima (lack of improved variants in a given library). At this
point the reader is referred to useful ISM tutorials for improving
enzyme selectivity [22] or stability [28] and to key papers describing
current optimization and fine-tuning of saturation mutagenesis, all
for reducing the screening effort:

• Application of quick quality control (QQC) before library
screening [30, 31]
• Use of primer combinations instead of NNK to reduce codon bias
[30, 31]
• Effect of primer quality and cost on library quality [31]
• Improved techniques for difficult-to-diversify templates
[32, 33]
• Pooling of transformants for further reducing screening
effort [34, 35]
• Use of reduced amino acid alphabets [36–38]
• Use of bioinformatics guides [39, 40]
• Use of computational biology guides [41–43]

Figure 17.2 Scheme illustrating CASTing with randomization sites A, B, C,
D, and so on, for saturation mutagenesis, each site comprising one or more
amino acid positions. From Ref. [27]. Copyright Wiley-VCH Verlag GmbH &
Co. KGaA. Reproduced with permission.

Figure 17.3 Schematic representation of (a) two-site, (b) three-site, and (c)
four-site ISM systems (from Ref. [53]. Copyright Wiley-VCH Verlag GmbH
& Co. KGaA. Reproduced with permission). Single or multiple-residue sites
are labeled with letters A, B, C, and D, and saturation mutagenesis can be
performed at each site until the desired degree of biocatalyst improvement
has been achieved.

17.3 Statistical Analyses

17.3.1 Conventional Statistics Based on the Patrick and Firth Algorithm

The issue of oversampling in the screening process is of prime
importance, for example, when aiming to cover a library of a given
size with a defined degree of confidence in order to determine a fitness
landscape. It is important to recall that the term “fitness landscape”
was used almost half a century ago by Maynard-Smith to
describe the notion of a protein sequence space, recognizing the
numbers problem in the molecular evolution of proteins [44]. In
the case of directed evolution, practical compromises have to be
made, guided by appropriate statistical analyses. The traditional
Patrick and Firth approach [45] and similar ones [46] are based
on Poisson statistics and on the assumption that amino acid bias
does not exist, but this is often not the case [31]. The algorithm
has been nevertheless used in the development of the CASTER
computer aid [28] (free of charge at the author’s homepage:
www.kofo.mpg.de/de/forschung/organische-synthese), which is a
useful, user-friendly guide for designing saturation mutagenesis
libraries. Other programs for library design are also available,
including GLUE-IT and AA-Calculator [47] as well as TopLib [48].
The Patrick and Firth algorithm [45] for estimating completeness
as a function of the number of transformants screened, T, can be
transformed into Eq. 17.2, with P_i denoting the probability that a
particular sequence occurs in the library and F_i referring to the
frequency [36].

$$T = -\frac{\ln(1 - P_i)}{F_i} \qquad (17.2)$$

The relationship reduces to Eq. 17.3 upon substituting F_i = 1/V, with V
denoting the number of gene mutants comprising a given library.

$$T = -V \ln(1 - P_i) \qquad (17.3)$$

Equation 17.3 reflects the correlation between the number of
mutants, V, of a library and the number of transformants, T, that
need to be screened for a defined degree of completeness. The so-
called oversampling factor O_f has been defined by Eq. 17.4 [36]:

$$O_f = \frac{T}{V} = -\ln(1 - P_i) \qquad (17.4)$$

When the oversampling factor O_f is computed as a function of
the percent library coverage, the curve displayed in Fig. 17.4
emerges. If 95% coverage is aimed for, it is necessary to screen
an approximately threefold excess of transformants (−ln 0.05 ≈ 3.0).
Since the relation has an exponential character, going beyond 95%
coverage would require a vastly higher screening effort. Of course,
full library coverage may not be necessary for obtaining adequately
improved variants,
but Fig. 17.4 serves as a guide in making such decisions [36].
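
Equations 17.3 and 17.4 translate directly into code. The sketch below is illustrative only; the function names are assumptions. The third function is an addition made here for comparison: it replaces the large-V approximation by the exact probability of drawing a given variant at least once in T uniform draws. When rounded to the nearest integer it appears to agree with the entries of Table 17.1 (for example 34 and 430 transformants for one and two NDT positions), which is an observation about the numbers and not a statement about how CASTER is implemented.

```python
from math import log


def oversampling_factor(coverage: float) -> float:
    """O_f = -ln(1 - P_i), Eq. 17.4."""
    return -log(1.0 - coverage)


def transformants_poisson(variants: int, coverage: float) -> float:
    """T = -V ln(1 - P_i), Eq. 17.3 (large-V approximation)."""
    return variants * oversampling_factor(coverage)


def transformants_exact(variants: int, coverage: float) -> int:
    """Solve 1 - (1 - 1/V)**T = P_i for T, rounded to the nearest integer."""
    return round(log(1.0 - coverage) / log(1.0 - 1.0 / variants))


for coverage in (0.50, 0.90, 0.95, 0.99):
    print(f"{coverage:.0%} coverage: O_f = {oversampling_factor(coverage):.2f}")

for variants in (12, 144, 1728):
    print(variants,
          round(transformants_poisson(variants, 0.95), 1),
          transformants_exact(variants, 0.95))
```

The two estimates differ by only one or two transformants, so the choice between them matters only for very small libraries.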

Figure 17.4 Correlation between degree of library coverage and oversampling
of enzyme variants in a given library (plot of oversampling factor, 0–10,
versus library coverage, 0–100%). From Ref. [36]. Copyright Wiley-VCH
Verlag GmbH & Co. KGaA. Reproduced with permission.

In numerous studies utilizing saturation mutagenesis [4, 20],
sites were chosen comprising more than one amino acid position.
CASTER provides immediate information regarding the degree of
oversampling as a function of the number of such positions in a site
and the chosen reduced amino acid alphabet. The smaller the amino
acid alphabet, the lower the structural diversity, but the benefit is
the drastically reduced screening effort. One of many options is
the use of NDT codon degeneracy (N: adenine/cytosine/guanine/thymine;
D: adenine/guanine/thymine; T: thymine), encoding 12 amino acids
(Phe, Leu, Ile, Val, Tyr, His, Asn, Asp, Cys, Arg, Ser, and Gly).
The dramatic statistical difference between NNK (K: guanine/thymine)
and NDT codon degeneracy becomes apparent when
viewing the degree of oversampling necessary for ensuring 95%
coverage (Table 17.1). When considering a randomization site
composed of only one amino acid position, a possibility is to employ
NNK codon degeneracy because very little assaying is necessary
anyway (94 transformants; alternatively, the 19 variants can be
generated individually by site-directed mutagenesis,
but the economic costs have to be considered, especially for high-
throughput applications [31]). However, when opting for sites
comprising two or more amino acid positions, the use of a
reduced amino acid alphabet saves screening. Whereas NNK ensures
maximum structural diversity on the protein level, experimental
data have shown that employing reduced amino acid alphabets such
as NDT is in fact the best choice [20, 36]. Amino acid alphabets much
smaller than the one defined by NDT codon degeneracy can also be
chosen [37, 38], thereby reducing the screening effort even more,
especially when bioinformatics provides a guide for such decisions.
Sequence alignment of homologous enzymes at CAST sites allows
conserved amino acids to be identified, which are then picked as
members of a highly reduced alphabet [39, 40].
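
The codon and amino acid counts behind NNK and NDT can be verified by expanding the degenerate codons against the standard genetic code. The sketch below is self-contained and makes no use of any library design program; the helper names are hypothetical, and only the IUPAC symbols needed here are included.

```python
from itertools import product

# IUPAC nucleotide symbols needed for NNK and NDT.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "D": "AGT", "K": "GT"}

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def expand(degenerate: str):
    """List every concrete codon encoded by a degenerate codon such as 'NDT'."""
    return ["".join(bases) for bases in product(*(IUPAC[s] for s in degenerate))]


def summarize(degenerate: str):
    """Return (number of codons, sorted amino acids, number of stop codons)."""
    codons = expand(degenerate)
    residues = [CODON_TABLE[c] for c in codons]
    amino_acids = sorted(set(residues) - {"*"})
    return len(codons), amino_acids, residues.count("*")


for scheme in ("NNK", "NDT"):
    n_codons, amino_acids, n_stops = summarize(scheme)
    print(scheme, n_codons, "codons,", len(amino_acids), "amino acids,",
          n_stops, "stop codon(s):", "".join(amino_acids))
```

For a site of k positions the library size on the codon level is the single-position codon count raised to the power k, for example 12 × 12 = 144 for two NDT positions.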
It is also of interest to view the whole range of percentage
coverage when choosing a particular reduced amino acid alphabet
and opting for a certain number of positions at a randomization
site, these data also being available from CASTER [28, 36]. In the case
of NNK and NDT codon degeneracy, Figs. 17.5 and 17.6 feature
the respective curves. The comparison nicely illustrates the issue
facing the operator when addressing simultaneously many sites
for NNK saturation. This does not mean that such sites should be
avoided, provided smaller amino acid alphabets are chosen to keep
the screening work as low as possible.

Table 17.1 Oversampling required for 95% library coverage when using
NNK and NDT codon degeneracy as calculated by the CASTER computer
aid on the basis of conventional statistics and assuming 100% yield of
diversification

Target      NNK                           NDT
residues    Codons      Transformants     Codons      Transformants
 1          32          94                12          34
 2          1024        3066              144         430
 3          32,768      98,163            1728        5175
 4          1 × 10^6    3 × 10^6          20,736      62,118
 5          3 × 10^7    1 × 10^8          2 × 10^5    7 × 10^5
 6          1 × 10^9    3 × 10^9          3 × 10^6    9 × 10^6
 7          3 × 10^10   1 × 10^11         3 × 10^7    1 × 10^8
 8          1 × 10^12   3 × 10^12         4 × 10^8    1 × 10^9
 9          3 × 10^13   1 × 10^14         5 × 10^9    1 × 10^10
10          1 × 10^15   3 × 10^15         6 × 10^10   1 × 10^11

Figure 17.5 Library coverage computed for NNK degeneracy at randomization
sites comprising 1, 2, 3, 4, or 5 amino acids (according to
traditional statistics and assuming 100% yield of diversification); plot
of transformants (0–10,000) versus library coverage (0–100%). From
Ref. [36]. Copyright Wiley-VCH Verlag GmbH & Co. KGaA. Reproduced with
permission.

Figure 17.6 Library coverage computed for NDT degeneracy at randomization
sites comprising 1, 2, 3, 4, or 5 amino acid positions (according
to traditional statistics and assuming 100% yield of diversification); plot
of transformants (0–10,000) versus library coverage (0–100%). From
Ref. [36]. Copyright Wiley-VCH Verlag GmbH & Co. KGaA. Reproduced with
permission.

17.3.2 Statistics Based on the Nov Algorithm

A different approach to treating the statistics of saturation mutagenesis
libraries was recently proposed by Nov [48, 49]. Rather than
seeking the best hit at, for example, 95% library coverage, attention
is placed on the second, third, or nth best mutant, the assumption
being that such a variant fulfills all practical requirements defined
by the researcher. It is important to stress that the nth-best-based
algorithm is based on the traditional one by Patrick and Firth, which
means that the effort for screening a given library with 95% chance
for full coverage is equivalent to finding the very best mutant in
that library with the same probability, provided there is no codon
redundancy. In any case, if the second, third, or nth best mutant is
pursued, that would simply correspond to a smaller portion of the
library size covered [50]. For example, for a two-amino-acid library
randomized with NDT, the Patrick and Firth algorithm recommends
screening 430 transformants (Table 17.1), which is the same
number required to find the best mutant (Table 17.2). If, however,
1 of the 3 best mutants is aimed for, only 143 mutants will have to be
screened (Table 17.2), corresponding to a library coverage of about
63%, as calculated by TopLib [48].
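
The two-residue NDT example can be reproduced with a simplified model. The sketch below assumes a library of V equally likely variants without codon redundancy and asks how many draws are needed so that at least one of the n best variants is sampled with a given probability. It is not the TopLib implementation, and the function names are hypothetical.

```python
from math import ceil, exp, log


def sample_size_top_n(variants: int, top_n: int, probability: float = 0.95) -> int:
    """Smallest T such that at least one of the top_n variants is drawn.

    The chance of missing all top_n variants in T uniform draws is
    (1 - top_n / variants) ** T, which must not exceed 1 - probability.
    """
    return ceil(log(1.0 - probability) / log(1.0 - top_n / variants))


def coverage(variants: int, screened: int) -> float:
    """Expected library coverage from Eq. 17.3 solved for P_i."""
    return 1.0 - exp(-screened / variants)


library = 12 ** 2  # two positions randomized with NDT
best = sample_size_top_n(library, 1)
top3 = sample_size_top_n(library, 3)
print(best, top3, f"{coverage(library, top3):.0%}")
```

Under these assumptions the model returns the 430 and 143 transformants quoted in the text, and 143 transformants correspond to roughly 63% coverage. For NNK libraries the codon redundancy has to be taken into account, which this sketch does not attempt.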

Table 17.2 Sampling required for finding with 95% probability the
best and one of the three best mutants when using NNK and NDT
codon degeneracy, respectively, as calculated by TopLib using the
nth-best-based statistical approach and assuming 100% yield of
diversification

Target      NNK                               NDT
residues    1/1 best (95%)   1/3 best (95%)   1/1 best (95%)   1/3 best (95%)
 1          80               22               34               3
 2          2130             534              430              15
 3          55,485           12,374           5175             63
 4          1 × 10^6         3 × 10^5         62,118           255
 5          4 × 10^7         7 × 10^6         7 × 10^5         1023
 6          9 × 10^8         2 × 10^8         9 × 10^6         4090
 7          2 × 10^10        4 × 10^9         1 × 10^8         16,361
 8          6 × 10^11        8 × 10^10        1 × 10^9         65,443
 9          1 × 10^13        2 × 10^12        1 × 10^10        2 × 10^5
10          4 × 10^14        4 × 10^13        1 × 10^11        1 × 10^6

In other words, the nth-best-based algorithm approach proposes
that it is not necessary to explore completely the combinatorial
sequence space of a given library, thereby being more oriented
toward practical goals. Consequently, the following question arises:
Is it then only necessary to screen a limited portion of a large
combinatorial library? The answer depends on whether the right
residues are targeted. If these are correctly chosen for the fitness
parameter under study, it is very likely that only partial screening
(e.g., 10%) of a large library could suffice in reaching the goal. We
have noticed that this is possible in a number of studies [26, 38]
without applying the nth-best-based method, but also when the
library coverage screening is higher (50%–75%) [50]. If, however,
the wrong residues are chosen for improving a desired phenotypic
trait, not even exploring 99% of the combinatorial library would
yield satisfactory results (see later). This can be explained by
the degree to which the targeted residues contribute to the
fitness value. As indicated before, we have developed
guidelines to improve the thermal or chemical stability of enzymes
using the B-FIT method [28] as well as activity, regioselectivity,
and stereoselectivity when using the CASTing approach [22]. These
guidelines help the user to assign to each residue an empirical score
reflecting how likely it is to be involved in the catalytic parameter
of interest, on the basis of evidence from the literature and/or hints
obtained from the 3D protein structure and/or biochemical data in the lab.
It is worth mentioning that both the Patrick and Firth and
the Nov statistical approaches are based on two main assumptions:
(1) the chance of generating each randomized codon per residue
is uniform and (2) libraries are 100% diversified (yield after
saturation mutagenesis). These assumptions are often not fully
correct, depending on various factors such as the sequence of the
targeted position, gene length, GC content, and PCR conditions,
including reaction temperature, primer length and mispriming, etc.
[30, 31]. For this reason, a QQC test is highly recommended as
indicated before: Only one sequencing reaction is performed on the
pool of library plasmids created after saturation mutagenesis to
determine the approximate yield of randomization, which can be
incorporated into the TopLib program [48]. In spite of these two
assumptions, both algorithms provide frameworks to screen with
confidence libraries created by saturation mutagenesis. Additionally,
another assumption that one has to consider when applying the
nth-best-based algorithm is that the fitness values of the variants
of a given library must have a continuous distribution without ties
in the fitness space [48]. However, the distance between the best
variant and the second-best variant may not be the same, depending
on the number of target residues for saturation mutagenesis [49],
so this approach might apply differently to combinatorial libraries
depending on their sizes. Nevertheless, it can be used in various
types of fitness landscapes (see Section 17.5.1).

17.4 How to Group and Randomize Amino Acid Positions

In ISM, the number of theoretically possible upward pathways
depends upon the number of sites to be randomized (Fig. 17.3)
[4, 22]. It is instructive to consider a hypothetical example
that reflects the statistical factors. Consider an enzyme in which
eight individual CAST amino acid positions close to each other
have been identified. These can be grouped, for example, into two
four-residue sites (A and B) or alternatively into four two-residue
sites (A, B, C, and D). In the former case only
two upward pathways exist, which require the generation of four
mutant libraries if both trajectories are to be tested (Fig. 17.3a).
In the case of the four-site system, 24 upward climbs are relevant
involving a total of 64 mutant libraries (Fig. 17.3c). Assuming that
an NDT-reduced amino acid alphabet is chosen and 95% library
coverage as calculated by conventional statistics is desired, while
traversing all relevant upward pathways, reference to Table 17.1
allows fast comparison. When considering both pathways in the
two-site ISM system, a total of 4 × 62,118 = 248,472 transformants
need to be screened. In the case of all 24 pathways of the four-
site system, a total of 64 × 430 = 27,520 transformants would be
required. It is also recommended to calculate the time and costs
needed for generating 4 large versus 64 small libraries, but it is clear
that the four-site ISM system requires much less screening effort.
The situation changes further if the researcher screens with lower
library coverage while still using NDT codon degeneracy. For example, when
settling for only 10% library coverage (which may suffice for finding
sufficiently improved variants), the computed screening effort is
drastically reduced, requiring for the two-site ISM system 8740 (4 ×
2185) transformants and for the four-site ISM system only 960 (64
× 15) transformants. Likewise, when applying the Nov algorithm
and settling for one of the three best mutants (Table 17.2), screening
is also reduced drastically: 82,828 (4 × 20,707) transformants for
the two-site ISM system and 9152 (64 × 143) transformants for the
four-site ISM system.
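
The comparison of the two grouping options can be tabulated in a few lines. The sketch below takes the library counts (4 and 64) from the text and estimates the transformants per library from the coverage relation without the large-V approximation, rounded to the nearest integer. That rounding convention is an assumption chosen here because it matches the per-library values quoted above; the names are hypothetical.

```python
from math import log


def per_library(variants: int, coverage: float) -> int:
    """Transformants needed per library for the requested coverage."""
    return round(log(1.0 - coverage) / log(1.0 - 1.0 / variants))


# (number of libraries for all pathways, codon-level library size with NDT)
SCHEMES = {
    "two four-residue sites": (4, 12 ** 4),
    "four two-residue sites": (64, 12 ** 2),
}


def total_screening(libraries: int, variants: int, coverage: float) -> int:
    """Total transformants when every library is screened to the same coverage."""
    return libraries * per_library(variants, coverage)


for coverage in (0.95, 0.10):
    for name, (libraries, variants) in SCHEMES.items():
        total = total_screening(libraries, variants, coverage)
        print(f"{coverage:.0%} coverage, {name}: {total:,} transformants")
```

The many-small-libraries option wins because the per-library cost grows with 12^k for k positions in a site, whereas the number of libraries grows only from 4 to 64.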
It is interesting to note that the first example of saturation
mutagenesis at the binding pocket of an enzyme (lipase) as the
catalyst in an enantioselective transformation (hydrolytic kinetic
resolution of a chiral carboxylic acid p-nitrophenyl ester) involved
randomization of a four-residue site by saturation mutagenesis
using NNK codon degeneracy [26]. Upon screening only 5000
transformants, a variant was identified showing a notably enhanced
selectivity factor (E = 30 versus E = 1.2 for wild type [WT]).
For a 95% chance of finding the best mutant, 1 × 10^6 transformants
should have been screened, but at that time oversampling was not
considered. Later the experimental result appeared surprising in
light of the fact that only a very tiny portion of the respective protein
sequence space had been explored, and at the time no adequate
explanation was provided [28]. On the basis of the Nov algorithm,
the screening of 5000 mutants from such a large combinatorial
library would correspond to approximately finding 1 of the best
111 mutants with a probability of 95% and assuming a yield of
100% [49]. This corresponds to a portion of less than 0.5% library
coverage using traditional statistics (Fig. 17.5).
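
The coverage figure can be checked with the inverse of the usual oversampling relation, F = 1 - exp(-T/V). The sketch below assumes that the library size is counted at the codon level (32 NNK codons at each of four positions); the function name is purely illustrative.

```python
import math


def expected_coverage(transformants: int, variants: int) -> float:
    """Expected fraction of `variants` sampled at least once when
    `transformants` clones are screened: F = 1 - exp(-T / V)."""
    return 1.0 - math.exp(-transformants / variants)


nnk_four_residues = 32 ** 4  # 1,048,576 codon combinations
coverage = expected_coverage(5000, nnk_four_residues)
print(f"{coverage:.2%}")
```

With 5000 transformants the expected coverage is just under 0.5%, in line with the statement above.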
Although definite recommendations as to the optimal way to
select a limited number of amino acid positions (e.g., CAST type)
and how to group them optimally into randomization sites cannot
be issued presently, current experience points to a preference for
multiresidue sites in combination with properly chosen reduced
amino acid alphabets [20]. The exact number of amino acid positions
in such a site may depend upon the particular enzyme of interest and
the catalytic property to be improved. For a general and useful
ISM guideline, see Ref. [22]. One of several ways to proceed in any
new situation is to perform scanning mutagenesis individually at all
target residues identified via CAST or B-FIT, using all canonical
amino acids or a reduced amino acid alphabet, in order to identify
hotspots before grouping them. One can then choose the best (Patrick
and Firth) or the nth best (Nov) mutant from the initial scan and
address either single- or multiresidue sites, again using all canonical
amino acids or a reduced amino acid alphabet, and so on.

17.5 Fitness Landscapes

17.5.1 Fujiyama vs. Badlands Fitness Landscapes


Since the classical paper of Maynard Smith on natural selection in
relation to the concept of protein sequence space [44], and
Kauffman’s analysis of rugged fitness versus single-peaked Fujiyama-
type fitness landscapes [51] (Fig. 17.7), researchers have tried to
assign the results of some directed evolution studies to one of
these categories [52]. While caution needs to be exercised when
comparing natural Darwinian evolution with protein engineering
[52–54], two types of upward climbs can be imagined: those in
which a single point mutation is introduced at every evolutionary
stage as favored by some researchers [54, 55] and those in which

Figure 17.7 Two main types of fitness landscapes as suggested by some
researchers. (a) The Fujiyama smooth topology is modular and the parts
do not interact (above). (b) Badlands rugged topology is integral and
the parts can be independent (parts do not interact) or dependent
(parts interact). K is the number of parts that each part interacts with
(http://www.terrorismanalysts.com/pt/index.php/pot/article/view/30/html).

multiple point mutations are introduced simultaneously at each
step [20, 28, 38, 52, 56]. In our opinion the decision as to
which strategy should be followed depends upon the type of gene
mutagenesis method that is applied and the fitness parameter to
be improved. If epPCR is used, for example, to enhance activity,
the introduction of single point mutations may be the best option,
which has been associated with Fujiyama landscapes, in contrast
to multiple mutations, which have been postulated to result in
rugged landscapes [54]. In fact, epPCR with a high mutation rate
providing excellent results on activity has been reported but at the
expense of investing a large screening effort [26]. However, the
general advice regarding the superiority of single point mutations
along evolutionary pathways does not apply in some proteins
when using saturation mutagenesis, which has a different statistical
basis [20, 28]. Although ISM has been used successfully in some
studies involving a sequence of upward steps characterized by
single point mutations [35, 57, 58], numerous other studies utilizing
the simultaneous introduction of multiple point mutations at each
evolutionary stage suggest the superiority of such an approach
[20, 28, 53, 56]. If a set of point mutations is introduced in a
single operation, then the respective point mutations can interact
with one another epistatically in a nonadditive manner [59],
enabling synergistic or cooperative effects. Of course, epistasis can
also occur between the accumulating sets of mutations. Indeed,
enormous cooperative effects have been uncovered in ISM studies
by deconvolution [59, 60], which throws light on the efficacy of
the method. Since individual point mutations and sets of point
mutations can always interact with one another, irrespective of the
mutagenesis method, assignment to Fujiyama versus rugged fitness
landscapes may not be as meaningful as is often implied in the
literature [52, 54]. In the following section a different kind of concept
for describing fitness landscapes is introduced, which has proven to
be practical when engaging in directed evolution.

17.5.2 Fitness-Pathway Landscapes and How to Escape from Local Minima

In this section we discuss the concept of fitness-pathway landscape,
which is distinctly different from the classical definition of fitness
landscapes used in traditional evolutionary biology for describing
Darwinian evolution in nature. Irrespective of the mutagenesis
method and the strategy of how to group amino acids for ISM, a
much-feared phenomenon in directed evolution is the occurrence of
libraries that fail to harbor any improved mutants [1–4]. If such an
event is encountered when applying ISM, the operator has several
options [22]:

• Screen more transformants (if the maximum has not
already been reached).
• Change the reduced amino acid alphabet.
• Step back one or two generations and follow a different
upward pathway.
• Step back one generation and utilize a similar or the second
best mutant for climbing up the same pathway.
• Revisit a previously addressed multiresidue site.
• Target new residues for mutagenesis.

Yet another more general trick was recently developed, which is
easy to apply in any iterative mutagenesis method: Simply identify
an inferior variant in the dead-end library and utilize its gene as a
template in the next cycle of mutagenesis, expression, and screening
[53]. This counterintuitive strategy emerged during a study aimed at
exploring all 24 pathways of a four-site ISM system, as illustrated
in Fig. 17.3c. In this study, the hydrolytic kinetic resolution of
glycidyl phenyl ether (rac-1) catalyzed by the epoxide hydrolase
from Aspergillus niger (ANEH) was chosen as the model reaction. WT
ANEH shows low enantioselectivity in slight favor of (S)-2 (E = 4.6),
as shown below [53]:

The mechanism of ANEH involves activation of the epoxide O-atom
by two H-bonds originating from Tyr251 and Tyr314 and
the rate-determining SN2 reaction by Asp192 followed by rapid
hydrolysis of the short-lived enzyme–ester intermediate. On the
basis of the ANEH crystal structure [61] with the substrate modeled
into the binding pocket, six potential CAST sites were identified
as A (I193/S195/F196), B (L215/A217/R219), C (M329/L330), D
(L349/C350), E (T317/T318), and F (F244/M245/L249) (Fig. 17.8).
In an ISM-based investigation preceding the fitness landscape
study, an arbitrary upward pathway B→C→D→F→E had been
chosen that provided the best variant at that time (E = 115) [29]. In
order to keep the experimental work at a minimum when mapping
all pathways of an ISM scheme, the number of sites was reduced
to four, which entails the construction of 24 evolutionary pathways
involving the generation of 64 libraries [53]. The four sites were
chosen as B, D, E, and F, but the three-residue sites B and F were
truncated to two-residue sites B*(L215/R219) and F*(F244/L249).
Thus, all four sites B*, D, E, and F* employed in the study are two-
residue sites. All of the 64 mutant libraries were generated by NDT
codon degeneracy, which required in each case the screening of
about 430 transformants for 95% coverage according to traditional
statistics (CASTER).
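
The counts of 24 pathways and 64 libraries follow from simple combinatorics, which the following sketch makes explicit. It assumes that a library is uniquely identified by the ordered sequence of sites visited up to and including the site being randomized; the function name is illustrative.

```python
from itertools import permutations


def ism_scheme(sites):
    """Return all ISM pathways (orderings of the sites) and the distinct
    libraries, each identified by the ordered sites visited so far."""
    pathways = list(permutations(sites))
    libraries = {
        path[:step]
        for path in pathways
        for step in range(1, len(sites) + 1)
    }
    return pathways, libraries


pathways, libraries = ism_scheme(["B*", "D", "E", "F*"])
print(len(pathways), len(libraries))  # 24 pathways, 64 libraries
print(len(libraries) * 430)           # total transformants at 430 per library
```

For four sites this gives 4 + 12 + 24 + 24 = 64 libraries, and 64 × 430 = 27,520 transformants. A two-site scheme analogously yields 2 pathways and 4 libraries.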
In total, 27,520 transformants were analyzed. Interestingly,
in 16 of the 24 upward pathways, improved hits were found
in all of the respective libraries, meaning the absence of local
minima [53]. In the remaining eight trajectories, libraries were

Figure 17.8 Six CAST sites A, B, C, D, E, and F for potential saturation
mutagenesis on the basis of the X-ray structure of ANEH, PDB: 1QO7. From
Ref. [53]. Copyright Wiley-VCH Verlag GmbH & Co. KGaA. Reproduced with
permission.

encountered, which failed to harbor any improved variants. As
already alluded to, escaping from such local minima proved to be
possible by identifying an inferior or nonimproved variant and
using its gene as a template for saturation mutagenesis at the next
site in the respective pathway. The total ISM system constitutes a
multidimensional energy surface involving all mutational sets at the
four sites B*, D, E, and F* as independent vectors and ΔΔG‡ as
the dependent variable. Since this is difficult to picture graphically,
a fitness-pathway landscape was constructed by considering the
data at every stage of a pathway in a stacking mode linking the
catalytic performance of WT ANEH with the final best mutant in
each pathway [53]. When this is done for all 24 pathways as in the
present case, fitness-pathway landscapes of the type shown in Fig.
17.9 result. Two different types of trajectories can be discerned, one
characterized by a smooth continuous decrease in ΔΔG‡, in which

Figure 17.9 Schematic representation of a fitness-pathway landscape
featuring 24 trajectories leading from (top) WT ANEH to (bottom) the final
enantioselectivity of the last variant at the end of a pathway, as specified
by the respective ΔΔG‡ values that were determined by the measured
enantioselectivity data [53]. Green line: typical pathway lacking any local
minima; red line: pathway in which at least one local minimum occurs. From
Ref. [53]. Copyright Wiley-VCH Verlag GmbH & Co. KGaA. Reproduced with
permission.

all of its libraries contain at least one improved variant (higher
enantioselectivity). Pathway number 20 marked in green in Fig. 17.9
is an example. The other type of pathway is characterized by at
least one library along the trajectory, which lacks improved variants,
causing a local minimum (e.g., red arrow defined by E→F*→B*→D
in Fig. 17.9). In such situations an inferior mutant was chosen for the
subsequent mutagenesis cycle, in all cases leading out of the local
minimum, as shown by the free-energy profiles of all 24 pathways
(Fig. 17.10).
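
The free-energy scale of such profiles can be related to the selectivity factor through the standard relation ΔΔG‡ = -RT ln E. The sketch below assumes that the free-energy quantity plotted in Figs. 17.9 and 17.10 is this difference in activation free energy between the two enantiomers. The temperature of 298.15 K is also an assumption, since the reaction temperature is not stated here, and the function name is illustrative.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def ddg_from_e(e_value: float, temperature_k: float = 298.15) -> float:
    """Difference in activation free energy (kJ/mol) corresponding to a
    selectivity factor E, using ddG = -R * T * ln(E)."""
    return -R * temperature_k * math.log(e_value) / 1000.0


# E values quoted in this section: WT ANEH and the range of final variants.
for label, e_value in [("WT ANEH", 4.6), ("lower end", 28), ("upper end", 158)]:
    print(f"{label}: E = {e_value}, ddG = {ddg_from_e(e_value):.2f} kJ/mol")
```

Because the relation is logarithmic, each step along a pathway adds a roughly comparable free-energy increment only if E is multiplied, not incremented, which is why the profiles are more naturally drawn in energy units than in E values.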
It was discovered that all 24 pathways lead to notably improved
variants showing selectivity factors in the range E = 28–158.
The ISM scheme of the 12 best variants in the upper range of
E = 80–158 is shown in Fig. 17.11. It can be seen that in the
pathway E→B*→D→F*, a local minimum is encountered because
an inferior mutant in library D had to be employed in saturation
mutagenesis at the final site F*, thereby leading to a variant with
E = 95. Several other libraries harbored only mutants that showed
enantioselectivity comparable to that of the respective parent. In
such cases the use of such nonimproved variants likewise enabled
the pathway to be continued with generation of improved variants

Figure 17.10 Schematic representation of the free energy profiles of the
24 pathways from a frontal view of the fitness-pathway landscape; green:
pathway lacking local minima; red: pathways in which at least one local
minimum occurs (lack of improved variants in the respective library). From
Ref. [53]. Copyright Wiley-VCH Verlag GmbH & Co. KGaA. Reproduced with
permission.

in the subsequent cycle. The corresponding scheme featuring the
other 12 less successful pathways (E = 28–80) can be found in
the original study [53]. In summary, 20 of the 24 pathways provided
variants with very high stereoselectivity in the range E = 50–158.
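
The escape strategy can be summarized as a simple template-selection rule. The sketch below is a hypothetical illustration only: the function name, the variant labels, and the E values are invented for the example, and picking the highest-E variant of a dead-end library is an assumption, since the text states only that an inferior or nonimproved variant is used.

```python
def choose_template(parent_e, library):
    """Pick the variant whose gene serves as template for the next ISM cycle.

    `library` maps variant names to measured selectivity factors E.
    Returns the chosen variant name and a flag that is True when the
    library contained no improved variant (a local minimum on the pathway).
    """
    best_variant = max(library, key=library.get)
    at_local_minimum = library[best_variant] <= parent_e
    return best_variant, at_local_minimum


# Hypothetical screening results (variant name -> selectivity factor E).
improved_library = {"v1": 12.0, "v2": 35.0, "v3": 8.0}
dead_end_library = {"v4": 20.0, "v5": 31.0, "v6": 15.0}

print(choose_template(parent_e=10.0, library=improved_library))
print(choose_template(parent_e=35.0, library=dead_end_library))
```

In the second call no variant exceeds the parent, so the pathway continues from the best available inferior variant instead of being abandoned.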
The lesson to be learned in view of the numbers problem
in directed evolution [36] is twofold. First, the probability of
achieving notable success when arbitrarily choosing one of several
possible ISM pathways seems reasonable. Second, the best recipe
for escaping from local minima is simple. To date this is the only
study reporting the exploration of all pathways in an ISM scheme,
each multisite being composed of two target residues [53]. But
it should be mentioned that the selection and grouping of target
residues were crucial because most of the multiresidue sites contain
amino acids adjacent to or at least very close to each other (Fig.
17.8), thereby increasing the probability of cooperative interactions

Figure 17.11 Part of the 24-pathway ISM scheme featuring the best 12
trajectories that lead to ANEH variants displaying E values amounting to
>80 in the hydrolytic kinetic resolution of rac-1. From Ref. [53]. Copyright
Wiley-VCH Verlag GmbH & Co. KGaA. Reproduced with permission.

between the side chains of the mutated amino acids. This may
not necessarily apply to multiresidue sites in which the target
amino acids are located far away from each other, as in large active
sites for improving selectivity of complex enzymes. This question
remains to be addressed experimentally. For practical purposes,
exploring all possible pathways is not advisable because such
systematization would entail excessive screening. Indeed, several
dozen ISM investigations of various enzyme types reported from
different groups involve arbitrarily chosen pathways leading to
practical results [4, 20], although alternative nonchosen pathways
could prove to be even more proficient.
Several kinds of pitfalls in ISM (as in other methods such as
epPCR or DNA shuffling) may arise in addition to the possibility
of encountering a local minimum on the fitness-pathway landscape
[22]. Sometimes the quality of a library may be poor already at
the DNA level. Although the experimenter may have envisioned a
defined diversity, in reality it may not be formed experimentally. In
such cases it does not make sense to waste time and resources by
trying to express and screen something that does not exist. For this
purpose the so-called QQC has been developed, according to which
sequence analysis of pooled plasmids for each library is performed
prior to transformation into an expression strain [30, 31]. Since
then, other ways to measure the quality of saturation mutagenesis
libraries have appeared [31, 62]. Irrespective of the quality control
method of choice, however, if the designed diversity of a given
library has not been achieved, it is necessary to go back to the lab and
optimize the experimental conditions. For example, if the
applied method utilizes PCR, the annealing temperature between
the template and the primers may be suboptimal [30, 31]. Many
other useful tips have been summarized elsewhere [22].

17.6 Conclusions and Perspectives

This chapter highlights several ways to handle the numbers
problem in directed evolution based on saturation mutagenesis,
with emphasis on statistical aspects, specifically when aiming for
increased activity, enhanced or reversed stereoselectivity, altered
regioselectivity, or higher thermostability. Conventional saturation
mutagenesis, including its systematization in the form of ISM, is rec-
ommended as a particularly efficient approach, provided structural
and mechanistic information is available. Fortunately, with powerful
protein 3D structure prediction tools [22], it is now possible to apply
ISM in the absence of de facto structural knowledge. Although many
researchers still consider random directed evolution approaches as
excellent alternatives, several studies comparing various mutagen-
esis techniques have pointed to the superior efficacy of saturation
mutagenesis relative to epPCR and DNA shuffling [63–65], in
addition to scores of reports in which only saturation mutagenesis
was applied with great success [4, 19, 20]. However, projects
based on saturation mutagenesis may not proceed optimally if the
statistical basis and library quality are ignored or if the target
residues are chosen blindly. Hopefully, this chapter will serve as a
useful guide in future directed evolution so that less effort is needed
to advance the practical development of useful biocatalysts.

Acknowledgments

This work was supported by the Max-Planck-Society and the Arthur
C. Cope Fund. We also thank Sabrina Höbenreich (née Kille) and
Yuval Nov for insightful discussions.

References

1. Bommarius, A. S. (2015). Biocatalysis, a status report, Annu. Rev. Chem.
Biomol. Eng., 6, pp. 319–345.
2. Brustad, E. M., and Arnold, F. H. (2011). Optimizing non-natural protein
function with directed evolution, Curr. Opin. Chem. Biol., 15, pp. 201–
210.
3. Bornscheuer, U. T., Huisman, G. W., Kazlauskas, R. J., Lutz, S., Moore, J.
C., and Robins, K. (2012). Engineering the third wave of biocatalysis,
Nature, 485, pp. 185–194.
4. Reetz, M. T. (2013). Biocatalysis in organic chemistry and biotechnology:
past, present, and future, J. Am. Chem. Soc., 135, pp. 12480–12496.
5. Reetz, M. T. (2003). An overview of high-throughput screening systems
for enantioselective enzymatic transformations, Methods Mol. Biol., 230,
pp. 259–282.
6. Reymond, J. L. (2006). Enzyme Assays: High-Throughput Screening,
Genetic Selection and Fingerprinting (Wiley-VCH, Weinheim).
7. Taylor, S. V., Kast, P., and Hilvert, D. (2001). Investigating and engineering
enzymes by genetic selection, Angew. Chem., Int. Ed. Engl., 40, pp. 3310–
3335.
8. Lin, H., and Cornish, V. W. (2002). Screening and selection methods for
large-scale analysis of protein function, Angew. Chem., Int. Ed. Engl., 41,
pp. 4402–4425.
9. Aharoni, A., Griffiths, A. D., and Tawfik, D. S. (2005). High-throughput
screens and selections of enzyme-encoding genes, Curr. Opin. Chem.
Biol., 9, pp. 210–216.
10. Reetz, M. T., Becker, M. H., Klein, H.-W., and Stöckigt, D. (1999). A
method for high-throughput screening of enantioselective catalysts,
Angew. Chem., Int. Ed. Engl., 38, pp. 1758–1761.
11. Reetz, M. T., Höbenreich, H., Soni, P., and Fernandez, L. (2008). A
genetic selection system for evolving enantioselectivity of enzymes,
Chem. Commun. (Camb.), pp. 5502–5504.
12. Boersma, Y. L., Droge, M. J., van der Sloot, A. M., Pijning, T., Cool, R.
H., Dijkstra, B. W., and Quax, W. J. (2008). A novel genetic selection
system for improved enantioselectivity of Bacillus subtilis lipase A,
ChemBioChem, 9, pp. 1110–1115.
13. Acevedo-Rocha, C. G., Agudo, R., and Reetz, M. T. (2014). Directed
evolution of stereoselective enzymes based on genetic selection as
opposed to screening systems, J. Biotechnol., 191, pp. 3–10.
14. Yang, G., and Withers, S.G. (2009). Ultrahigh-throughput FACS-based
screening for directed enzyme evolution, ChemBioChem, 10, pp. 2704–
2715.
15. Levin, A. M., and Weiss, G. A. (2006). Optimizing the affinity and
specificity of proteins with molecular display, Mol. Biosyst., 2, pp. 49–57.
16. Lipovsek, D., Antipov, E., Armstrong, K. A., Olsen, M. J., Klibanov, A. M.,
Tidor, B., and Wittrup, K. D. (2007). Selection of horseradish peroxidase
variants with enhanced enantioselectivity by yeast surface display,
Chem. Biol., 14, pp. 1176–1185.
17. Droge, M. J., Boersma, Y. L., van Pouderoyen, G., Vrenken, T. E.,
Rüggeberg, C. J., Reetz, M. T., Dijkstra, B. W., and Quax, W. J. (2006).
Directed evolution of Bacillus subtilis lipase A by use of enantiomeric
phosphonate inhibitors: crystal structures and phage display selection,
ChemBioChem, 7, pp. 149–157.
18. Becker, S., Höbenreich, H., Vogel, A., Knorr, J., Wilhelm, S., Rosenau,
F., Jaeger, K. E., Reetz, M. T., and Kolmar, H. (2008). Single-cell high-
throughput screening to identify enantioselective hydrolytic enzymes,
Angew. Chem., Int. Ed. Engl., 47, pp. 5085–5088.
19. Valetti, F., and Gilardi, G. (2013). Improvement of biocatalysts for
industrial and environmental purposes by saturation mutagenesis,
Biomolecules, 3, pp. 778–811.
20. Reetz, M. T. (2011). Laboratory evolution of stereoselective enzymes: a
prolific source of catalysts for asymmetric reactions, Angew. Chem., Int.
Ed. Engl., 50, pp. 138–174.
21. Eggert, T., Reetz, M. T., and Jaeger, K. E. (2004). Directed evolution by
random mutagenesis: a critical evaluation. In Enzyme Functionality:
Design, Engineering, and Screening, Svendsen, A., ed. (Marcel Dekker,
New York).
22. Acevedo-Rocha, C. G., Kille, S., and Reetz, M. T. (2014). Iterative
saturation mutagenesis: a powerful approach to engineer proteins
by systematically simulating Darwinian evolution, Methods Mol. Biol.,
1179, pp. 103–128.
23. Siloto, R. M. P., and Weselake, R. J. (2012). Site saturation mutagenesis:
methods and applications in protein engineering, Biocatal. Agric.
Biotechnol., 1, pp. 181–189.
24. Smith, M. (1994). Synthetic DNA and biology (Nobel lecture), Angew.
Chem., Int. Ed. Engl., 33, pp. 1214–1221.
25. Estell, D. A., Graycar, T. P., and Wells, J. A. (1985). Engineering an enzyme
by site-directed mutagenesis to be resistant to chemical oxidation, J.
Biol. Chem., 260, pp. 6518–6521.
26. Reetz, M. T., Wilensek, S., Zha, D., and Jaeger, K. E. (2001). Directed evo-
lution of an enantioselective enzyme through combinatorial multiple-
cassette mutagenesis, Angew. Chem., Int. Ed. Engl., 40, pp. 3589–3591.
27. Reetz, M. T. (2014). One hundred years of the Max-Planck-Institut für
Kohlenforschung, Angew. Chem., Int. Ed. Engl., 53(33), pp. 8562–8586.
28. Reetz, M. T., and Carballeira, J. D. (2007). Iterative saturation mutagene-
sis (ISM) for rapid directed evolution of functional enzymes, Nat. Protoc.,
2, pp. 891–903.
29. Reetz, M. T., Wang, L. W., and Bocola, M. (2006). Directed evolution
of enantioselective enzymes: iterative cycles of CASTing for probing
protein-sequence space, Angew. Chem., Int. Ed. Engl., 45, pp. 1236–1241.
30. Kille, S., Acevedo-Rocha, C. G., Parra, L. P., Zhang, Z. G., Opperman, D. J.,
Reetz, M. T., and Acevedo, J. P. (2013). Reducing codon redundancy and
screening effort of combinatorial protein libraries created by saturation
mutagenesis, ACS Synth. Biol., 2, pp. 83–92.
31. Acevedo-Rocha, C. G., Reetz, M. T., and Nov, Y. (2015). Economical
analysis of saturation mutagenesis experiments, Sci. Rep., 5, p. 10654.
32. Sanchis, J., Fernandez, L., Carballeira, J. D., Drone, J., Gumulya, Y.,
Höbenreich, H., Kahakeaw, D., Kille, S., Lohmer, R., Peyralans, J. J.,
et al. (2008). Improved PCR method for the creation of saturation
mutagenesis libraries in directed evolution: application to difficult-to-
amplify templates, Appl. Microbiol. Biotechnol., 81, pp. 387–397.
33. Acevedo-Rocha, C. G., and Reetz, M. T. (2014). Assembly of designed
oligonucleotides: a useful tool in synthetic biology for creating high
quality combinatorial DNA libraries, Methods Mol. Biol., 1179, pp. 189–
206.
34. Polizzi, K. M., Parikh, M., Spencer, C. U., Matsumura, I., Lee, J. H., Realff,
M. J., and Bommarius, A. S. (2006). Pooling for improved screening of
combinatorial libraries for directed evolution, Biotechnol. Prog., 22, pp.
961–967.
35. Bougioukou, D. J., Kille, S., Taglieber, A., and Reetz, M. T. (2009). Directed
evolution of an enantioselective enoate-reductase: testing the utility
of iterative saturation mutagenesis, Adv. Synth. Catal., 351, pp. 3287–
3305.
36. Reetz, M. T., Kahakeaw, D., and Lohmer, R. (2008). Addressing the
numbers problem in directed evolution, ChemBioChem, 9, pp. 1797–
1804.
37. Sun, Z., Lonsdale, R., Kong, X.-D., Xu, J.-H., Zhou, J., and Reetz, M. T.
(2015). Reshaping an enzyme binding pocket for enhanced and inverted
stereoselectivity: use of smallest amino acid alphabet in directed
evolution, Angew. Chem., Int. Ed., 54, pp. 12410–12415.
38. Sandström, A. G., Wikmark, Y., Engström, K., Nyhlen, K., and Bäckvall, J.-
E. (2012). Combinatorial reshaping of the Candida antarctica lipase A
substrate pocket for enantioselectivity using an extremely condensed
library, Proc. Natl. Acad. Sci. U S A, 109, pp. 78–83.
39. Nobili, A., Gall, M. G., Pavlidis, I. V., Thompson, M. L., Schmidt, M., and
Bornscheuer, U. T. (2013). Use of ’small but smart’ libraries to enhance
the enantioselectivity of an esterase from Bacillus stearothermophilus
towards tetrahydrofuran-3-yl acetate, FEBS J., 280, pp. 3084–3093.
40. Reetz, M. T., and Wu, S. (2008). Greatly reduced amino acid alphabets in
directed evolution: making the right choice for saturation mutagenesis
at homologous enzyme positions, Chem. Commun. (Camb.), pp. 5499–
5501.
41. Wedge, D. C., Rowe, W., Kell, D. B., and Knowles, J. (2009). In silico
modelling of directed evolution: implications for experimental design
and stepwise evolution, J. Theor. Biol., 257, pp. 131–141.
42. Feng, X., Sanchis, J., Reetz, M. T., and Rabitz, H. (2012). Enhancing
the efficiency of directed evolution in focused enzyme libraries by the
adaptive substituent reordering algorithm, Chemistry, 18, pp. 5646–
5654.
43. Ferrario, V., Ebert, C., Svendsen, A., Besenmatter, W., and Gardossi, L.
(2014). An integrated platform for automatic design and screening of
virtual mutants based on 3D-QSAR analysis, J. Mol. Catal. B: Enzym., 101,
pp. 7–15.
44. Smith, J. M. (1970). Natural selection and the concept of a protein space,
Nature, 225, pp. 563–564.
45. Patrick, W. M., and Firth, A. E. (2005). Strategies and computational tools
for improving randomized protein libraries, Biomol. Eng., 22, pp. 105–
112.
46. Bosley, A. D., and Ostermeier, M. (2005). Mathematical expressions
useful in the construction, description and evaluation of protein
libraries, Biomol. Eng., 22, pp. 57–61.
47. Firth, A. E., and Patrick, W. M. (2008). GLUE-IT and PEDEL-AA: new
programmes for analyzing protein diversity in randomized libraries,
Nucleic Acids Res., 36, pp. W281–285.
48. Nov, Y. (2012). When second best is good enough: another probabilistic
look at saturation mutagenesis, Appl. Environ. Microbiol., 78, pp. 258–
262.
49. Nov, Y. (2013). Fitness loss and library size determination in saturation
mutagenesis, PLOS ONE, 8, p. e68069.
50. Höbenreich, S., Zilly, F., Acevedo-Rocha, C. G., Zilly, M., and Reetz, M.
T. (2015). Speeding up directed evolution: combining the advantages
of solid-phase combinatorial gene synthesis with statistically guided
reduction of screening effort, ACS Synth. Biol., 4, pp. 317–331.
51. Kauffman, S. A., and Weinberger, E. D. (1989). The NK model of rugged
fitness landscapes and its application to maturation of the immune
response, J. Theor. Biol., 141, pp. 211–245.
52. Poelwijk, F. J., Kiviet, D. J., Weinreich, D. M., and Tans, S. J. (2007).
Empirical fitness landscapes reveal accessible evolutionary paths,
Nature, 445, pp. 383–386.
53. Gumulya, Y., Sanchis, J., and Reetz, M. T. (2012). Many pathways in
laboratory evolution can lead to improved enzymes: how to escape from
local minima, ChemBioChem, 13, pp. 1060–1066.
54. Romero, P. A., and Arnold, F. H. (2009). Exploring protein fitness
landscapes by directed evolution, Nat. Rev. Mol. Cell Biol., 10, pp. 866–
876.
55. Tracewell, C. A., and Arnold, F. H. (2009). Directed enzyme evolution:
climbing fitness peaks one amino acid at a time, Curr. Opin. Chem. Biol.,
13, pp. 3–9.
56. Kwan, D. H., Constantinescu, I., Chapanian, R., Higgins, M. A., Kötzler, M.
P., Samain, E., Boraston, A. B., Kizhakkedathu, J. N., and Withers, S. G.
(2015). Toward efficient enzymes for the generation of universal blood
through structure-guided directed evolution, J. Am. Chem. Soc., 137, pp.
5695–5705.
57. Yang, Y., Liu, J., and Li, Z. (2014). Engineering of P450pyr hydroxylase
for the highly regio- and enantioselective subterminal hydroxylation of
alkanes, Angew. Chem., Int. Ed. Engl., 53, pp. 3120–3124.
58. Ji, J., Fan, K., Tian, X., Zhang, X., Zhang, Y., and Yang, K. (2012). Iterative
combinatorial mutagenesis as an effective strategy for generation of
deacetoxycephalosporin C synthase with improved activity toward
penicillin G, Appl. Environ. Microbiol., 78, pp. 7809–7812.
59. Reetz, M. T. (2013). The importance of additive and non-additive
mutational effects in protein engineering, Angew. Chem., Int. Ed. Engl.,
52, pp. 2658–2666.
60. Zhang, Z. G., Lonsdale, R., Sanchis, J., and Reetz, M. T. (2014). Extreme
synergistic mutational effects in the directed evolution of a Baeyer-
Villiger monooxygenase as catalyst for asymmetric sulfoxidation, J. Am.
Chem. Soc., 136, pp. 17262–17272.
61. Zou, J., Hallberg, B. M., Bergfors, T., Oesch, F., Arand, M., Mowbray, S. L.,
and Jones, T. A. (2000). Structure of Aspergillus niger epoxide hydrolase
at 1.8 Å resolution: implications for the structure and function of the
mammalian microsomal class of epoxide hydrolases, Structure, 8, pp.
111–122.
62. Sullivan, B., Walton, A. Z., and Stewart, J. D. (2013). Library construction
and evaluation for site saturation mutagenesis, Enzyme. Microb.
Technol., 53, pp. 70–77.
63. Parikh, M. R., and Matsumura, I. (2005). Site-saturation mutagenesis is
more efficient than DNA shuffling for the directed evolution of beta-
fucosidase from beta-galactosidase, J. Mol. Biol., 352, pp. 621–628.
64. Reetz, M. T., Prasad, S., Carballeira, J. D., Gumulya, Y., and Bocola, M.
(2010). Iterative saturation mutagenesis accelerates laboratory evolu-
tion of enzyme stereoselectivity: rigorous comparison with traditional
methods, J. Am. Chem. Soc., 132, pp. 9144–9152.
65. Reetz, M. T. (2012). Directed evolution of enzymes. In Enzyme Catalysis
in Organic Synthesis, Drauz, K., Gröger, H., and May, O., eds. (Wiley-VCH
Verlag GmbH & Co. KGaA, Weinheim), pp. 119–190.

March 17, 2016 16:42 PSP Book - 9in x 6in 18-Allan-Svendsen-c18

Chapter 18

Hints from Nature: Metagenomics in Enzyme Engineering

Esther Gabor, Birgit Heinze, and Jürgen Eck


Biotechnology Research and Information Network, Darmstädter Straße 34-36,
D-64673 Zwingenberg, Germany
je@brain-biotech.de

It is nowadays generally agreed that the metagenome constitutes
a virtually unlimited source of microbial genes with industrial
potential. While the metagenome used to be viewed as rather
static, the plasticity of this genomic resource is now increasingly
recognized. The composition of microbial communities,
for instance, changes during the course of the year according to
variations in temperature, humidity, and nutrients. Furthermore,
genes are transferred, recombined, and mutated by physicochemical
or genetic processes—a phenomenon called evolution.
From an enzyme engineer’s point of view this observation raises
the question of how to make use of this phenomenon in the quest
for the ideal enzyme for a given biocatalytic process. In this chapter,
we want to highlight some topical approaches for harvesting tailor-
made enzymes from the metagenome and for using metagenomic
information in the design of optimized enzyme catalysts.

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

18.1 Metagenomics and the Ideal Enzyme

It was only about 25 years ago that researchers started to realize
that prokaryotic diversity may be much higher than previously
perceived. Increasingly sophisticated and powerful molecular
techniques made it clear that not only common natural habitats such as
forest or farmland soil but also marine sediments regularly comprise
thousands of different bacterial and even archaeal species [1], more
than 99% of which cannot be cultivated in the laboratory by standard
techniques.
In this period, Handelsman et al. coined the term “metagenome” for the
complex microbial community DNA that is present in a given natural
habitat [2]. In view of the >10⁹ genes that may be present in less
than 1 cm³ of soil [3], it appeared feasible to find a suitable enzyme
catalyst for virtually any reaction type and constraint. Since then, a
large number of studies have used the gigantic metagenomic reservoir,
terrestrial and marine, as a resource for enzyme discovery (for a
review from an industrial point of view, see Lorenz and Eck [4]; a
more academic compilation is provided by Kennedy et al. [5]).
Concerning the diversity of recovered sequences, many of these
projects have been very successful, with sequence identities to
existing database entries generally in the range of only 30%–70%.
Unfortunately, coverage of sequence space does not always translate
into coverage of performance space; that is, the recovered enzymes
often do not match the profile of the desired ideal biocatalyst [6].
One explanation for this observation could be the incomplete recovery
of the potential genes of interest. This may partly be caused by the
expression bias encountered when screening for enzyme activity in a
specific heterologous host system. Each host allows the expression,
and consequently the detection, of only a specific and limited set of
enzymes, thereby restricting the accessible performance space [7].
Performance space may furthermore be affected by the sequence bias
introduced by polymerase chain reaction (PCR)- or probe-based
screening techniques, which rely on known sequence motifs of a
specific group of enzymes.
On the other hand, survival in natural, highly competitive
environments is likely to select for enzyme properties very different
from those that would be useful in an artificial, man-made process
environment. Stability in the presence of elevated concentrations of
surfactants, as required for detergent proteases, for instance, is a
feature unlikely to be relevant in a natural soil environment. In any
case, metagenomic as well as classically discovered enzymes often
need further optimization to fit a given application.
In the first instance, random in vitro mutagenesis methods such as
error-prone PCR can be helpful in this situation by efficiently
creating sequence variation. This introduction of changes by
deliberately faulty sequence amplification in principle mimics the
natural occurrence of point mutations due to replication errors or
physicochemical modifications. In contrast to natural evolution,
however, poorly performing enzyme variants are not eliminated by
counterselection, nor are beneficial changes rewarded by higher
reproduction rates. Therefore, the frequencies of functional, let
alone improved, mutants are generally extremely low, for example, <5%
in the case of triple-site saturation mutants of galactose oxidase
[8], which makes library screening rather cumbersome.
Similarly, a systematic evaluation of the sequence space of a
subtilisin by site saturation mutagenesis at each amino acid position
of the enzyme, followed by biochemical characterization of all
mutants, revealed that the number of mutants with improved properties
is surprisingly low [9]. Fewer than 20% of the mutants were better for
a given property, indicating that the majority were similar to or
worse than the wild type. To make matters worse, there was no
correlation between improving mutations for any two properties. Thus,
a mutation that was positive for one property had only a <20% chance
of being better for another property as well. This has severe
consequences for random libraries with multiple mutations, because
the vast majority of combined mutants are statistically worse than the
parent enzyme. Assuming that the mutations act independently, a triple
mutant has a probability of less than 0.2 × 0.2 × 0.2 = 0.008 of being
better than the wild type for a given property (e.g., activity toward
substrate A), while it has a probability of at least
0.8 × 0.8 × 0.8 ≈ 0.5 of being worse than the parent in any other
property (e.g., thermal stability or activity toward substrate B).
Huge libraries of mutants therefore need to be screened in order to
identify a few good ones.
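
The arithmetic behind this estimate can be reproduced in a few lines.
The following is a minimal sketch, assuming that the individual
mutations act independently and that each has a 20% chance of
improving a given property (the upper bound quoted above). The
function names are illustrative only, and the library-size estimate
is an added assumption (every screened clone is treated as an
independent draw) that is not taken from the cited study.

```python
import math


def p_all_improve(p_better: float, n_mutations: int) -> float:
    """Chance that each of n independent mutations improves the property."""
    return p_better ** n_mutations


def p_none_improve(p_better: float, n_mutations: int) -> float:
    """Chance that none of n independent mutations improves the property."""
    return (1.0 - p_better) ** n_mutations


def clones_to_screen(p_hit: float, confidence: float = 0.95) -> int:
    """Smallest library size n for which 1 - (1 - p_hit)**n >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_hit))


if __name__ == "__main__":
    p_hit = p_all_improve(0.2, 3)
    print(f"triple mutant, all three improving: {p_hit:.3f}")
    print(f"triple mutant, none improving:      {p_none_improve(0.2, 3):.3f}")
    print(f"clones to screen for one hit (95%): {clones_to_screen(p_hit)}")
```

Under these assumptions, several hundred clones have to be screened
to find a single improved triple mutant with reasonable confidence,
which illustrates why the screening effort grows so quickly with the
number of combined mutations.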

In contrast to classical random in vitro evolution experiments, which
require a lot of practical effort, in vivo adaptation processes can be
surprisingly effective, even when based only on point mutations as a
driving force. The seminal study of Rainey and Travisano [10]
demonstrated that a clonal population of Pseudomonas cells can
differentiate into distinct phenotypes in as few as 30 cell division
events. In principle, each division cycle is comparable to one round
of error-prone PCR, with a selection step for better-performing
variants in between.
To mimic natural evolution in the laboratory, molecular biological
methods termed “directed (molecular) evolution” have been developed.
Both processes, in vivo and in vitro evolution, constitute a
continuous cycle of genetic variation and phenotypic selection [11].
Methods for the creation of protein-encoding DNA libraries are random
mutagenesis, as mentioned above, and gene recombination. Gene
recombination involves the exchange of larger portions of genetic
material and is generally a more potent directed evolution strategy.
Homologous recombination, nonhomologous recombination, reciprocal
recombination, and site-specific recombination are formats that have
been applied for the rapid improvement of biological systems [12].
Homologous recombination is an important source of genetic variation,
whereby new sequences are generated by the exchange of related
segments of genes and genomes [13]. It promotes the combination of
traits and is a useful tool for protein engineering. DNA shuffling,
the first homologous in vitro recombination method, which generates
new genetic sequences by both random mutation and recombination, was
developed by Stemmer [14]. The method has been used to fragment and
reassemble a single gene (gene shuffling) or a family of related genes
(family shuffling) [15]. Homologous recombination methods, however,
depend on sequence similarity between the parental genes and are
biased accordingly [16].
Alternatively, several nonhomologous recombination methods have been
designed to facilitate the shuffling of genes with insufficient
sequence identity. Benkovic and coworkers introduced incremental
truncation for the creation of hybrid enzymes (ITCHY) [17], a very
efficient technique for the random fusion of domains from two parent
enzymes. The combinatorial approach is independent of DNA homology but
is limited by the random fusion of truncated fragments from parents
with different gene lengths and by recombination sites that are not
structurally related, which results in libraries containing a large
number of inactive clones [12]. Nevertheless, a large number of
directed evolution methods have been developed to date, and they have
delivered insights into the sequence and structure of protein
chimeras.
In addition to these sophisticated, nature-inspired in vitro
approaches, another straightforward option exists. Why not directly
use natural (micro-) sequence space that is constantly generated—
and functionally evaluated—in vivo? To access this particular
portion of the metagenome, dedicated methodologies are, however,
required. In the following, we will present two different approaches
that make use of evolutionary snapshots for discovering and
designing industrially relevant enzymes.

18.2 Molecular Microdiversity

Referring to “the” metagenome may be misleading in this context,
since the genetic composition of this resource depends on the studied
habitat as well as on the time of sampling. Nutrient state,
temperature, aeration, and humidity affect the microbial population
present, that is, the metagenome. At the same time, the genetic
“hardware” of the community is constantly altered by mutation and
recombination. As a consequence, metagenomic DNA samples taken from
the same location but a few months apart may be substantially
different.
Considering point mutations as the elementary source of genetic
novelty, it can be projected that in a limited volume of only 100 cm³
of soil, about 600 variants of a given (ubiquitous) gene of interest
emerge every single day [18]. Of course, the majority of this
diversity will quickly vanish due to detrimental effects on enzyme
functionality. However, even if only 1 in 150 mutations were
beneficial, as estimated by Perfeito et al. [19], some four to five
viable variants would still be generated on a daily basis.
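
As a quick check of this projection, the two figures quoted above
(about 600 new variants per day in 100 cm³ of soil and roughly one
beneficial mutation in 150) can simply be multiplied. This is a
back-of-the-envelope sketch only; the function name is illustrative
and not taken from the cited studies.

```python
def beneficial_variants_per_day(new_variants_per_day: float,
                                beneficial_fraction: float) -> float:
    """Expected number of beneficial variants arising per day."""
    return new_variants_per_day * beneficial_fraction


# Figures quoted in the text: ~600 new variants per day in 100 cm^3 of
# soil, and roughly 1 in 150 mutations being beneficial.
expected = beneficial_variants_per_day(600, 1 / 150)
print(f"expected beneficial variants per day: {expected:.1f}")
```

The product comes out at about four variants per day, the lower end
of the range given above.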
To explore to what extent different variants of a given gene can
persist, and coexist, in a single habitat, we studied the molecular
microdiversity of an abundant enzyme in soil: subtilisin Carlsberg
(subC). Subtilisins are secreted serine proteases that are involved in
the degradation of proteinaceous nutrients, sporulation, and defense
mechanisms of many bacteria [20]. Commercially, subtilisins play an
important role, most prominently in the detergent industry [21]. For
this reason, subtilisins have been extensively studied, leading to a
wealth of information about structure–function relationships and the
effects of specific mutations (most of them described and claimed in
the patent literature). This makes subtilisins an excellent example
for comparing the in vitro and in vivo inventories of functional
mutations.
In an initial effort to identify metagenomic mutations, we determined
the full-length sequences of 120 subC genes present in garden soil
(Fig. 18.1). Of these sequences, 94 were unique at the DNA level,
encoding 51 different mature protein variants. Remarkably, the two to
eight amino acid changes detected in each protein variant compared to
subC did not lead to inactivation of the respective protein. The
observation that all protein variants were still functional (when
expressed in protease-deficient Bacillus subtilis DB104) supports the
hypothesis that the respective genes do not constitute evolutionary
junk but rather a repository of different functionalities.
Accordingly, temperature stability and substrate specificity, which we
chose as indicators of diversity in enzyme characteristics, were found
to differ considerably among the enzyme variants. In our view,
enzymatic microdiversity (and functionality) may therefore be a tool
that allows microbial populations to respond readily to changing
environmental conditions, for example, altered nutrient supply or
seasonal temperature variations.

Figure 18.1 The microdiversity approach. On the basis of metagenomic
DNA, a library of full-length sequences of the gene of interest is
established. Individual clones are high-fidelity-sequenced to identify
mutated residues.

But how can this functional repository serve industrial purposes? On
the one hand, microdiversity may be directly exploited as a source of
similar but functionally distinct enzymes. Compared to enzyme variants
created by mutagenesis techniques, natural variants have the advantage
that they have already been activity-checked by nature, so that the
screening effort decreases dramatically. Full-length sequences can be
directly cloned into optimized expression systems, which allow
high-level enzyme expression and avoid the great screen anomaly [22]
that often hampers metagenomic approaches to enzyme discovery.
Admittedly, however, we expect that microdiversity can only be found
for highly abundant or very common proteins that are not heavily
conserved, such as catabolic enzymes.
When targeting suitable enzymes, fundamental molecular lessons may be
learned from the recovered persistent gene variants. In the case of
subC, 45 of its 274 amino acid positions were found to be altered in
our study (Fig. 18.2). Most of these positions lie on the outer shell
of the enzyme in two mutational hot spot regions on opposite sides of
the molecule. Since the effects of amino acid changes distant from the
substrate-binding pockets and the active site are usually difficult to
predict, these positions would most likely never be targeted in a
rational enzyme engineering approach. On the other hand, we observed
that many amino acid exchanges found in the metagenomic gene variants
have previously been described as mutations useful for increasing the
performance of subtilisins in specific applications. Interestingly, at
position 147, located in the hydrophobic core of the enzyme, we found
one of the few mutations (V147I) that had been found in vitro to be
acceptable at this location with respect to the remaining activity
[9]. In this respect, mutational hot spots identified in the natural
microdiversity of a given enzyme indicate which amino acids may be
altered without destroying enzyme function, thereby providing an
effective starting point for enzyme optimization, for example, by site
saturation mutagenesis. This appears to be particularly attractive
when studying enzymes for which no 3D structure information is
available.

Figure 18.2 Mutational hot spots in subC variants. Color code: dark
blue, no mutation; light blue, 1 mutation; green, 2–5 mutations;
yellow, 8 mutations; orange, 10–20 mutations; and red, 51 mutations.
The active site is indicated by whitish amino acid side chains of the
catalytic triad (D32, H64, and S221).
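
The idea of reading mutational hot spots from a set of natural
variants can be illustrated with a small script. This is a minimal
sketch under stated assumptions: the sequences are short invented toy
strings (not subC), they are assumed to be pre-aligned to the
reference without gaps, and the function names and the threshold are
hypothetical rather than part of the study described above.

```python
from collections import Counter


def mutated_positions(reference: str, variants: list[str]) -> Counter:
    """Count, per 1-based position, how many variants differ from the reference."""
    counts: Counter = Counter()
    for variant in variants:
        if len(variant) != len(reference):
            raise ValueError("variants must be aligned to the reference (equal length)")
        for pos, (ref_aa, var_aa) in enumerate(zip(reference, variant), start=1):
            if ref_aa != var_aa:
                counts[pos] += 1
    return counts


def hot_spots(counts: Counter, min_variants: int) -> list[int]:
    """Positions that are altered in at least `min_variants` variants."""
    return sorted(pos for pos, n in counts.items() if n >= min_variants)


# Toy data: one reference and four hypothetical natural variants.
reference = "MKVLAAGT"
variants = ["MKVLAAGT", "MRVLAAGT", "MRVLASGT", "MKVIAAGT"]

counts = mutated_positions(reference, variants)
print("altered positions:", sorted(counts))
print("hot spots (>= 2 variants):", hot_spots(counts, min_variants=2))
```

Positions that are altered in several functional variants would then
be natural candidates for site saturation mutagenesis, even when no
structure is available to guide the choice.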

18.3 Metagenomic Enzyme Chimera

Since its introduction in 1994, DNA shuffling has been the most
common method for recombining genes and has become a powerful tool in
protein engineering. In the past decade, additional methods have been
established that aim to circumvent the sequence bias inherent in
recombination-based directed evolution methods. In the field of
metagenomics, for instance, truncated metagenomic gene-specific PCR
(TMGS-PCR) was developed [23]. As a proof of concept, a lipase gene
isolated by functional metagenome screening was used as the starting
gene. On the basis of its sequence, truncated gene-specific primers
were designed and used for the amplification of homologous genes from
different environmental soil samples. The retrieved genes were
digested with DNase I and recombined by PCR. A functionally diverse
set of chimeric genes was generated, but in general, the amplification
of full-length genes remains a hurdle for shuffling.
To overcome the existing drawbacks of purely recombinatorial methods,
particularly when working with complex DNA, we constructed a library
of diverse metagenomic gene/enzyme chimeras. In this library, a parent
enzyme served as the enzyme backbone, while predefined sequence
segments were replaced and diversified by metagenomic DNA fragments.
As a lead enzyme, we chose an O-succinyl homoserine sulfhydrylase that
functions as a homocysteine synthase in the methionine biosynthesis
pathway. Methionine is an essential amino acid for many higher
organisms and finds a variety of applications as an animal feed and
food additive, as well as a component of medical products. In
microorganisms, there are two parallel routes for sulfur assimilation
into the methionine backbone, transsulfuration and direct
sulfhydrylation, with most microorganisms synthesizing methionine via
one of these pathways. The direct sulfhydrylation pathway utilizes
inorganic sulfide as the sulfur donor, which is incorporated into the
homoserine ester by sulfhydrylases, leading directly to homocysteine,
the immediate precursor of L-methionine [24]. In our enzymatic
conversion, O-succinyl homoserine was reacted with methyl mercaptan to
produce L-methionine directly.
On the basis of a CLANS analysis [25], 23 nearest neighbors of the
lead candidate were identified and used for a global alignment. In a
straightforward approach, regions of highest sequence identity were
defined as (degenerate) primer sites for metagenomic DNA amplification
and, in the second stage, as crossover sites for recombination of the
metagenomic DNA into the parent backbone. In this way, it is possible
to recombine gene segments with low sequence similarity and to
preserve activity in the resulting chimeric enzymes. However, the
actual position as well as the number of crossover points in a protein
may strongly affect the functionality of the resulting protein
variants [13]. Therefore, more sophisticated approaches appear to be
useful for defining primer sites, for instance, the computational
algorithm SCHEMA [26]. On the basis of the 3D structures of the parent
proteins, this algorithm identifies interactions between residues and
determines the number of interactions that are disrupted by the
creation of a hybrid protein. In this way, the algorithm can help to
identify suitable recombination sites that maximize the generation of
proteins with preserved functionality.
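
The principle of counting disrupted interactions can be sketched as
follows. This is a simplified illustration in the spirit of SCHEMA,
not the published implementation: the parent sequences are invented
toy strings assumed to be aligned without gaps, the contact list is
hypothetical (in practice it would be derived from a 3D structure),
and all function names are placeholders. A contact between positions
i and j is counted as disrupted when the residue pair found in the
chimera does not occur at these positions in any parent.

```python
def make_chimera(parent_a: str, parent_b: str, crossover: int) -> str:
    """Fuse the first `crossover` residues of parent A to the rest of parent B."""
    return parent_a[:crossover] + parent_b[crossover:]


def disrupted_contacts(chimera: str, parents: list[str],
                       contacts: list[tuple[int, int]]) -> int:
    """Count contacts whose residue pair in the chimera occurs in no parent.

    Sequences are aligned strings of equal length; contacts are pairs of
    0-based positions.
    """
    disrupted = 0
    for i, j in contacts:
        pair = (chimera[i], chimera[j])
        if not any((parent[i], parent[j]) == pair for parent in parents):
            disrupted += 1
    return disrupted


# Toy example: two aligned parents and a hypothetical contact map.
parent_a = "ACDEFGHI"
parent_b = "AKDQFGWI"
contacts = [(1, 3), (1, 6), (3, 6), (0, 7)]

for crossover in range(1, len(parent_a)):
    chimera = make_chimera(parent_a, parent_b, crossover)
    score = disrupted_contacts(chimera, [parent_a, parent_b], contacts)
    print(crossover, chimera, score)
```

Crossover positions with a low count would be preferred, since they
leave most residue-residue interactions of the parents intact.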
In our study, metagenomic DNA was isolated from sulfur-rich soil,
which can be expected to be a good source of enzymes involved in
direct sulfhydrylation. The isolated DNA was used as a template in
PCRs with the three best-performing primer pairs, that is, those
yielding PCR product of the highest purity without background signals.
For each primer pair, the resulting mixture of gene fragments was
subjected to recombination into the backbone of the lead candidate
gene. In a total of 83 different metagenomic chimeric genes, the
percentage of exchanged gene length was generally at least 40% and, as
expected, crossover events occurred at the predefined primer
amplification sites.
Although the accessible sequence space is obviously reduced by using
defined recombination sites, the fraction of correctly folded and
functional proteins can be strongly increased in this way. In our
library, more than 65% of the metagenomic chimeras were found to be
active. In an initial effort, we expressed and characterized 18 of
these full-length chimeric genes. Biotransformations were performed
with O-succinyl-L-homoserine as a standard substrate, and product
formation was confirmed by high-performance liquid chromatography
(HPLC) analysis. All studied candidates were active and showed
biocatalytic properties different from those of the parent enzyme.
Accordingly, phylogenetic analysis revealed access to a group of new
enzymes that occupy a distinct and rather well-defined portion of
sulfhydrylase sequence space, as can be seen in Fig. 18.3.
The partial replacement of gene segments by metagenomic DNA overcomes
the typical shuffling problems associated with full-length gene
reconstruction as well as with in-frame cloning. Importantly,
limitations of gene expression, as frequently encountered when
functionally screening metagenomic DNA libraries, are circumvented as
well. Consequently, the creation of metagenomic enzyme chimeras
constitutes a promising tool to functionally explore metagenomic
sequence space that would otherwise remain elusive.

Figure 18.3 CLANS analysis of sulfhydrylase chimeras compared to the
parental enzyme. The chimeras occupy a new region of sequence space.

18.4 Outlook

In the past two decades, functional metagenomics has seen a rapid
rise as a powerful methodology in enzyme discovery, yielding numerous
biocatalysts that had not been described before. However, it is
increasingly realized that a major portion of the genes present in a
metagenome still fails to be accessed by conventional DNA library
screening methods. One major reason for this is the limited expression
capacity of the expression hosts established so far. Consequently,
many efforts are currently being undertaken to identify and optimize
new expression systems, comprising potent screening hosts as well as
versatile, broad-host-range vector systems [7, 27]. However, there are
also options other than this major route for extracting and using the
genetic information of noncultured microbial genomes. As shown in this
chapter, molecular microdiversity offers interesting insights into
mutational hot spots and molecular evolution. At the same time,
metagenomic gene variants constitute a pool of functional enzymes that
may be useful in research-based or industrial applications. Similarly,
libraries of metagenomic enzyme chimeras are enriched in functional
gene variants covering a sequence space that could not be accessed
otherwise. These examples underline the continuing need for
methodological advances to successfully mine the vast, and still
mostly unexplored, microbial diversity that is already present and
that is constantly emerging in nature.

References

1. Torsvik, V., Daae, F. L., Sandaa, R. A., and Ovreas, L. (1998). Novel
techniques for analysing microbial diversity in natural and perturbed
environments, J. Biotechnol., 64, pp. 53–62.
2. Handelsman, J., Rondon, M. R., Brady, S. F., Clardy, J., and Goodman, R.
M. (1998). Molecular biological access to the chemistry of unknown soil
microbes: a new frontier for natural products, Chem. Biol., 5, pp. R245–
R249.
3. Gans, J., Wolinsky, M., and Dunbar, J. (2005). Computational improve-
ments reveal great bacterial diversity and high metal toxicity in soil,
Science, 309, pp. 1387–1390.
4. Lorenz, P., and Eck, J. (2005). Metagenomics and industrial applications,
Nat. Rev. Microbiol., 3, pp. 510–515.
5. Kennedy, J., O’Leary, N. D., Kiran, G. S., Morrissey, J. P., O’Gara, F., Selvin,
J., and Dobson, A. D. (2011). Functional metagenomic strategies for the
discovery of novel enzymes and biosurfactants with biotechnological
applications from marine ecosystems, J. Appl. Microbiol., 111, pp. 787–
799.
6. Burton, S. G., Cowan, D. A., and Woodley, J. M. (2002). The search for the
ideal biocatalyst, Nat. Biotechnol., 20, pp. 37–45.
7. Craig, J. W., Chang, F. Y., Kim, J. H., Obiajulu, S. C., and Brady, S. F.
(2010). Expanding small-molecule functional metagenomics through parallel
screening of broad-host-range cosmid environmental DNA libraries in
diverse proteobacteria, Appl. Environ. Microbiol., 76, pp. 1633–1641.
8. Sun, L., Bulter, T., Alcalde, M., Petrounia, I. P., and Arnold, F. H.
(2002). Modification of galactose oxidase to introduce glucose 6-oxidase
activity, ChemBioChem, 3, pp. 781–783.
9. Aehle, W., Cascao-Pereira, L. G., Estell, D. A., Goudegebuur, F., Kellis, J.
T., Poulose, A. J., and Schmidt, B. F. (2010). Compositions and methods
comprising protease variants. Danisco US Inc. WO 2010/056640A2.
10. Rainey, P. B., and Travisano, M. (1998). Adaptive radiation in a
heterogeneous environment, Nature, 394, pp. 69–72.
11. Zhang, Y. X., Perry, K., Vinci, V. A., Powell, K., Stemmer, W. P., and del
Cardayre, S. B. (2002). Genome shuffling leads to rapid phenotypic
improvement in bacteria, Nature, 415, pp. 644–646.
12. Rubin-Pitel, B., Cho, C. M.-H., Chen, W., and Zhao, H. (2006). Chapter 3,
Directed evolution tools in bioproduct and bioprocess development. In
Bioprocessing for Value-Added Products from Renewable Resources: New
Technologies and Applications, Yang, S.-T., ed. (Elsevier, Amsterdam), pp.
49–72.
13. Trudeau, D. L., Smith, M. A., and Arnold, F. H. (2013). Innovation by
homologous recombination, Curr. Opin. Chem. Biol., 17, pp. 902–909.
14. Stemmer, W. P. (1994). DNA shuffling by random fragmentation and
reassembly: in vitro recombination for molecular evolution, Proc. Natl.
Acad. Sci. U S A, 91, pp. 10747–10751.
15. Ness, J. E., Welch, M., Giver, L., Bueno, M., Cherry, J. R., Borchert, T. V.,
Stemmer, W. P., and Minshull, J. (1999). DNA shuffling of subgenomic
sequences of subtilisin, Nat. Biotechnol., 17, pp. 893–896.
16. Neylon, C. (2004). Chemical and biochemical strategies for the ran-
domization of protein encoding DNA sequences: library construction
methods for directed evolution, Nucleic Acids Res., 32, pp. 1448–1459.
17. Lutz, S., Ostermeier, M., and Benkovic, S. J. (2001). Rapid generation of
incremental truncation libraries for protein engineering using alpha-
phosphothioate nucleotides, Nucleic Acids Res., 29, p. E16.
18. Gabor, E., Niehaus, F., Aehle, W., and Eck, J. (2012). Zooming in on
metagenomics: molecular microdiversity of subtilisin Carlsberg in soil,
J. Mol. Biol., 418, pp. 16–20.
19. Perfeito, L., Fernandes, L., Mota, C., and Gordo, I. (2007). Adaptive
mutations in bacteria: high rate and small effects, Science, 317, pp. 813–
815.
20. Tripathi, L. P., and Sowdhamini, R. (2008). Genome-wide survey of
prokaryotic serine proteases: analysis of distribution and domain
architectures of five serine protease families in prokaryotes, BMC
Genomics, 9, p. 549.
21. Gupta, R., Beg, Q. K., and Lorenz, P. (2002). Bacterial alkaline proteases:
molecular approaches and industrial applications, Appl. Microbiol.
Biotechnol., 59, pp. 15–32.
22. Ekkers, D. M., Cretoiu, M. S., Kielak, A. M., and van Elsas, J. D. (2012).
The great screen anomaly: a new frontier in product discovery through
functional metagenomics, Appl. Microbiol. Biotechnol., 93, pp. 1005–1020.
23. Wang, Q., Wu, H., Wang, A., Du, P., Pei, X., Li, H., Yin, X., Huang, L., and
Xiong, X. (2010). Prospecting metagenomic enzyme subfamily genes for
DNA family shuffling by a novel PCR-based approach, J. Biol. Chem., 285,
pp. 41509–41516.
24. Hacham, Y., Gophna, U., and Amir, R. (2003). In vivo analysis of
various substrates utilized by cystathionine gamma-synthase and O-
acetylhomoserine sulfhydrylase in methionine biosynthesis, Mol. Biol.
Evol., 20, pp. 1513–1520.
25. Frickey, T., and Lupas, A. (2004). CLANS: a Java application for
visualizing protein families based on pairwise similarity, Bioinformatics,
20, pp. 3702–3704.
26. Voigt, C. A., Martinez, C., Wang, Z. G., Mayo, S. L., and Arnold, F. H. (2002).
Protein building blocks preserved by recombination, Nat. Struct. Biol., 9,
pp. 553–558.
27. Cheng, J., Pinnell, L., Engel, K., Neufeld, J. D., and Charles, T. C.
(2014). Versatile broad-host-range cosmids for construction of high
quality metagenomic libraries, J. Microbiol. Methods, 99, pp. 27–34.

February 2, 2016 16:50 PSP Book - 9in x 6in 19-Allan-Svendsen-c19

Chapter 19

A Functional and Structural Assessment of Circularly Permuted Bacillus circulans Xylanase and Candida antarctica Lipase B

Stephan Reitingerᵃ and Ying Yuᵇ

ᵃ Institute for Biomedical Aging Research, University of Innsbruck, Rennweg 10,
6020 Innsbruck, Austria
ᵇ Perinatal Institute, Cincinnati Children’s Hospital Medical Center, Cincinnati,
OH 45242, USA
stephan.reitinger@uibk.ac.at, ying.yu@cchmc.org

19.1 Introduction

The process of connecting a protein’s original amino- and
carboxy-termini with a peptide linker, along with the placement of new
termini elsewhere in the protein structure by disrupting an existing
peptide bond in its backbone, results in circular permutation (CP).
Except for the additional amino acid residues constituting the peptide
linker that joins the old ends, the amino acid composition of the
original protein is unaltered. Thus, circularly permuted protein
variants (permutants) are predominantly characterized by a
rearrangement of the protein sequence (Fig. 19.1). Although the
overall three-dimensional structures of active permutants are largely
unchanged compared to the native state of the original protein,
modified protein dynamics can be observed, depending on the position
of the introduced backbone fissure. In particular, relocation of the
new termini to the proximity of the active center or substrate-binding
sites of an enzyme affects the conformational flexibility of the
polypeptide chain and accordingly governs catalytic performance or
protein stability. Because of these characteristics, not only nature
but also protein engineers have taken up this intriguing concept to
evolve protein function.

Figure 19.1 Scheme of circular permutation. Native N- and C-termini are
joined with a peptide linker, and new N′- and C′-termini are generated
elsewhere in the protein.
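
At the sequence level, the rearrangement can be written down in a few
lines. This is a minimal sketch, assuming a toy sequence and a
hypothetical three-residue linker; the function name and its
parameters are illustrative and not taken from the studies discussed
in this chapter.

```python
def circular_permutant(sequence: str, new_start: int, linker: str = "") -> str:
    """Return the circularly permuted sequence.

    `new_start` is the 1-based position of the residue that becomes the new
    N-terminus; the original C- and N-termini are joined by `linker`.
    """
    if not 2 <= new_start <= len(sequence):
        raise ValueError("new_start must lie between 2 and the sequence length")
    cut = new_start - 1
    return sequence[cut:] + linker + sequence[:cut]


# Toy example: an 8-residue sequence, new N-terminus at position 4,
# old ends joined by a hypothetical Gly-Gly-Ser linker.
original = "MKTAYIAK"
permutant = circular_permutant(original, new_start=4, linker="GGS")
print(permutant)
```

Apart from the linker residues, the permutant contains exactly the
same amino acids as the original sequence; only their order along the
chain has changed.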

19.2 Naturally Occurring Circular Permutations: Selected Examples

The prevalence of naturally occurring CP is still under discussion
[1–3], and novel computational approaches are being pursued to
identify natural CPs [4–6]. Nevertheless, a large number of natural
examples have been described so far [7]. In this chapter, we will
touch on three early examples, describing the discovery of natural CP
and an initial attempt to copy nature.
In 1979, researchers from the Rockefeller University published the
first evidence of circularly permuted proteins [8, 9]. The two plant
lectins favin and concanavalin A (Con A) were found not only to show
similar biological activities but also to share permuted homologous
sequences. Other studies confirmed the high three-dimensional
structural similarity of the two legume lectins [10–13] and showed
that the circular homology between Con A and favin is not due to gene
rearrangements in the genome but rather to a posttranslational
modification [14]. To produce the mature form of Con A from the
precursor protein chain emerging from polysomes, first, a loop is
clipped off, generating new ends, and second, after removal of a
nine-residue peptide from the original carboxy-terminus, a peptide
bond is formed between the converging original N- and C-termini
[15, 16].
The second striking example of naturally occurring CP-like
polypeptides is provided by the saposin–swaposin homologs. Prosaposin
is the precursor of four saposin domains arranged as tandem repeats
connected by linker sequences [17]. Saposin domains are highly
conserved and share a common four-α-helix structure
(N−α1−α2−α3−α4−C). A circularly permuted saposin-like variant, termed
“swaposin” [18], was identified as an insert in plant aspartic
proteinases [19]. Sequence alignments confirmed the permuted nature of
this domain insert, with an arrangement of
(N′−α3−α4−linker−α1*−α2*−C′), in which helices α3 and α4 as well as
helices α1* and α2* originate from different saposin repeats [18].
Since the CP of swaposin is manifested at the DNA level, and because
of its chimeric composition, Ponting and Russell [18] suggested that
swaposins are derived from an ancestral prosaposin-like gene composed
of the carboxy-terminal half of one saposin and the amino-terminal
half of a different saposin repeat.
A third example is the bacterial β-glucanases (1,3-1,4-β-D-glucan
4-glucanohydrolases), enzymes that hydrolyze glycosidic bonds in
mixed-linked polysaccharides containing 1,3-β- and 1,4-β-glycosidic
linkages and share more than 30% amino acid sequence identity
[1]. Interestingly, only the β-glucanase from Fibrobacter succinogenes,
a member of the family 16 glycoside hydrolases (GH16) in the
Carbohydrate-Active Enzymes (CAZy) database [20, 21], exists as a
circularly permuted variant with respect to other members of this
enzyme family. A polypeptide sequence of 60 amino acid residues from
the carboxy-terminal part of F. succinogenes β-glucanase aligns with
the amino-terminal regions of other members of GH16. Using
recombinant cloning techniques, a circularly permuted Bacillus
macerans β-glucanase (cpMAC-57) that resembled the sequence
organization of F. succinogenes β-glucanase was expressed [22].
Indeed, cpMAC was enzymatically active and folded into a stable
enzyme with a three-dimensional structure almost identical to that
of the wild-type B. macerans β-glucanase [22, 23].
Apart from the cases that are established posttranslationally [8,
14], the discovery of many more naturally occurring circularly
permuted proteins indicates that the formation of CPs is mostly
based on genetic events [24]. Two main mechanistic models
are currently being discussed to explain the evolutionary
origin of circularly permuted proteins in nature, namely (i)
gene duplication followed by deletion and (ii) gene fission with
subsequent fusion [24–27]. In the first model, a precursor gene
initially undergoes duplication, and in a second step, major parts
of the fused tandem repeat are deleted, yielding intermediate
or true permutants. Regarding CP as an evolutionary strategy
for adapting proven proteins to altered environmental conditions,
one would expect that after gene duplication the original protein
is still able to perform its primary function. Deletions at
one or both ends of the repeat then occur stepwise to adapt
the intermediate permutant to its new function [28]. In the
second model, gene fragments coding for independent polypeptide
building blocks or domains that originate from fission events are
joined together in reversed order. Frequently, in such cases both
“halves” may still function as separate proteins [29, 30].
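The two models can be made concrete with a few lines of string manipulation. The following is only a minimal sketch at the level of the amino acid sequence: the function names and the toy sequence are hypothetical, and the stepwise, partly functional intermediates discussed above are not modeled. It merely shows that both routes can arrive at the same permuted sequence.

```python
def permute_by_duplication_deletion(seq: str, cut: int) -> str:
    """Model (i): tandem duplication followed by deletion of both flanks.

    `cut` is the number of residues removed from the start of the first
    copy; the matching remainder is removed from the end of the second copy.
    """
    if not 0 < cut < len(seq):
        raise ValueError("cut must fall inside the sequence")
    tandem = seq + seq  # in-frame tandem repeat of the precursor
    # Keep exactly one sequence length, starting inside the first copy.
    return tandem[cut:cut + len(seq)]


def permute_by_fission_fusion(seq: str, cut: int) -> str:
    """Model (ii): fission into two fragments, fusion in reversed order."""
    if not 0 < cut < len(seq):
        raise ValueError("cut must fall inside the sequence")
    n_fragment, c_fragment = seq[:cut], seq[cut:]
    return c_fragment + n_fragment


toy = "ABCDEFGHIJ"  # hypothetical ten-residue "protein"
print(permute_by_duplication_deletion(toy, 4))
print(permute_by_fission_fusion(toy, 4))
```

Both calls return the same string, which illustrates why the final sequence alone cannot tell the two evolutionary routes apart.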
A basic prerequisite for the formation of a circular permutant is
that the native amino- and carboxy-termini can be joined by a rather
short peptide linker. Evaluation of the distances between the native
termini revealed that in <10% of the total protein structures solved
so far, the old ends are separated by 10 Å or less, which is considered
a short distance that can readily be spanned by two or three residues
[31]. The distribution of termini distances, as well as the minimal
distance distributions for the terminal 10 amino acid residues, has
been reviewed by Yu and Lutz [27]. In conclusion, the scientific field
has only just started to understand the realm of circularly permuted
proteins and its implications for the evolution of proteins. It is
conceivable
that in the near future more examples will be discovered and that
the body of knowledge acquired will certainly inspire scientists and
protein engineers.
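As a rough illustration of the linker prerequisite discussed above, the minimum number of linker residues needed to bridge a given termini distance can be estimated geometrically. This is a back-of-the-envelope sketch, not a method taken from the cited work: it assumes a spacing of about 3.8 Å between consecutive Cα atoms and a fully extended linker, so it gives a lower bound only. The function name is hypothetical.

```python
import math

# Assumption: approximate distance between consecutive C-alpha atoms (in Å).
CA_CA_SPACING = 3.8


def min_linker_residues(termini_distance: float,
                        spacing: float = CA_CA_SPACING) -> int:
    """Lower bound on the linker length needed to span `termini_distance`.

    A linker of n residues adds n + 1 virtual C-alpha to C-alpha bonds
    between the old termini, so it can span at most (n + 1) * spacing
    when fully extended.
    """
    if termini_distance < 0:
        raise ValueError("distance must be non-negative")
    return max(0, math.ceil(termini_distance / spacing) - 1)


# A gap of 10 Å needs at least two residues in this simple model.
print(min_linker_residues(10.0))
```

Real linkers are rarely fully extended, so in practice one or more additional residues are usually required, which is in line with the "two or three residues" quoted above for a 10 Å gap.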

19.3 Circular Permutation of Bacillus circulans Xylanase

B. circulans xylanase (Bcx) is a member of family 11 glycoside
hydrolases (GH11) in CAZy [20, 21] and degrades xylan, a
β-1,4-linked polymer of xylose, by acting as a single-domain
endoxylanase. Bcx was first cloned and expressed by recombinant
means in bacteria in the late 1980s [32–34] and has since emerged as
a model β-glycosidase in enzymology, being extensively characterized
both structurally [35–41] and functionally [42–49]. The catalytic
center of the ∼20 kDa enzyme was identified and first mapped
by a site-directed mutagenesis approach [46, 50]. Later work
employing the mechanism-based inactivator 2,4-dinitrophenyl
2-deoxy-2-fluoro-β-xylobioside (2F-DNPX2, Fig. 19.2b) confirmed a
classical Koshland retaining mechanism [42], with two glutamate
residues (Glu) at positions 78 and 172 acting as the catalytic
nucleophile and the general acid/base catalyst, respectively [51].
Bcx cleaves xylan via a two-step, double-displacement mechanism
(Fig. 19.2a) that proceeds through a covalent glycosyl–enzyme
intermediate [49]. In the first step (glycosylation), the carboxylate
of Glu78 performs a nucleophilic attack at the anomeric center, while
Glu172 acts as a general acid and donates a proton from its side
chain to the leaving group, which establishes the glycosyl–enzyme
intermediate. In the second step (deglycosylation), Glu172, now
acting as the base catalyst, deprotonates a water molecule, which in
turn hydrolyzes the intermediate. Both steps proceed via
oxocarbenium ion-like transition states.
Figure 19.2 Reaction mechanism of Bcx. (a) The two-step,
double-displacement mechanism is shown. Values for the pKa of the
active site amino acid residues are indicated for each step. (b)
Chemical structures of the mechanism-based inactivator
2,4-dinitrophenyl 2-deoxy-2-fluoro-β-xylobioside (2F-DNPX2) and the
substrate o-nitrophenyl β-xylobioside (ONPX2) are illustrated.

During catalysis, the dual role of Glu172 is made possible by
pKa cycling of the acid/base side-chain group, as determined by
selective isotopic labeling and nuclear magnetic resonance (NMR)
spectroscopy [37]. Wild-type Bcx and variants in which the
catalytic Glu residues were substituted by glutamine (Gln) were
labeled with [δ-13C]glutamic acid. This allowed McIntosh et al. [37]
to record one-dimensional 13C-NMR spectra and, thus, determine the
pKa values by nonlinear least-squares fitting of the chemical shifts
as a function of pH for titrations involving one or two ionizable
groups [52]. In the absence of a substrate, the apparent
pKa values for Glu78 and Glu172 in wild-type Bcx are 4.6 and 6.7,
respectively. In an earlier report the pKa value for Glu172,
determined via Fourier transform infrared (FTIR) spectroscopy, was
estimated to be 6.8 [36]. These pKa values are in close agreement
with data obtained from pH-dependent kinetic evaluation of Bcx. For
the reaction of free Bcx with the substrate o-nitrophenyl
β-xylobioside (ONPX2, Fig. 19.2b), values for the second-order rate
constant kcat/Km were plotted as a function of pH and yielded a
bell-shaped dependence with apparent pKa values of 4.6 and 6.8 [43]
and a catalytic pH optimum of 5.7 [37]. A recent study confirmed
that the pH optimum of Bcx is determined by the pKa values of the
active site residues rather than by the global charge of the enzyme,
as demonstrated through random succinylation of exposed charged
amino acid residues [53].
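The relation between the two apparent pKa values and the pH optimum can be sketched with the standard two-ionization model for a bell-shaped profile, in which only the singly protonated form of the enzyme is active. This is a minimal sketch; the function names are illustrative and the model is the textbook one, not necessarily the exact fitting procedure used in the cited studies.

```python
def relative_activity(ph: float, pka_low: float, pka_high: float) -> float:
    """Fraction of maximal kcat/Km for a diprotic bell-shaped profile.

    The active species has the low-pKa group deprotonated and the
    high-pKa group protonated.
    """
    return 1.0 / (1.0 + 10 ** (pka_low - ph) + 10 ** (ph - pka_high))


def ph_optimum(pka_low: float, pka_high: float) -> float:
    """The maximum of the bell-shaped curve lies midway between the pKa values."""
    return (pka_low + pka_high) / 2.0


# Apparent kinetic pKa values quoted above for wild-type Bcx.
print(f"pH optimum: {ph_optimum(4.6, 6.8):.2f}")
for ph in (4.0, 5.0, 5.7, 6.0, 7.0):
    print(f"pH {ph:.1f}: relative kcat/Km = {relative_activity(ph, 4.6, 6.8):.2f}")
```

With the kinetic pKa values of 4.6 and 6.8, the midpoint is 5.7, which matches the pH optimum reported for the wild-type enzyme.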
Taken together, these studies demonstrate that near the pH
optimum, where Bcx is maximally active, Glu78 is predominantly
ionized and can thus perform the nucleophilic attack at the anomeric
center, while Glu172 remains in the catalytically competent
protonated state to act as a general acid. On the other hand,
formation of a covalently modified Glu78, as present in the
glycosyl–enzyme intermediate, eliminates the electrostatic coupling
of the two catalytic residues and, thus, reduces the pKa value of
Glu172 by approximately 2.5 units to 4.2 [37]. This drop is in line
with its ability to function as a base catalyst in the
deglycosylation step. McIntosh et al. determined the decrease in pKa
after formation of a relatively stable covalent bond between Bcx’s
nucleophile and the anomeric center of the slow substrate 2F-DNPX2,
which turns over with a half-life of t1/2 = 350 minutes [37, 51].
Remarkably, not only does the described glycosylation of the
nucleophile reduce the pKa value of the acid/base catalyst from 6.7
to 4.2, but the mutation of Glu78 to Gln also achieves a similar
decrease [37]. This dramatic shift can primarily be explained by
reduced electrostatic repulsion between the catalytic residues, which
are separated by ∼6 Å (distance between Cδ atoms) [54], as the
nucleophile alternates between its negatively charged nucleophilic
and neutral glycosylated states. Indeed, a more detailed analysis of
NMR-monitored pH titration data as a function of ionic strength,
together with theoretical electrostatic calculations, corroborates
the biphasic titration curves for Glu78 and Glu172 according to a
model of two coupled ionizations and their electrostatic interaction
[37, 55].
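To put the pKa shift of Glu172 on an energy scale, a difference in pKa can be converted into a free energy difference via ΔG = ln(10)·R·T·ΔpKa. The snippet below is a minimal sketch of this conversion. The temperature of 25 °C is an assumption made here for illustration and is not taken from the cited studies, and the function name is hypothetical.

```python
import math

R_KJ = 8.314462618e-3   # gas constant in kJ mol^-1 K^-1
KJ_PER_KCAL = 4.184


def pka_shift_to_free_energy(delta_pka: float,
                             temperature_k: float = 298.15) -> float:
    """Free energy (kJ/mol) equivalent to a pKa shift of `delta_pka` units."""
    return math.log(10) * R_KJ * temperature_k * delta_pka


dg = pka_shift_to_free_energy(2.5)  # shift of Glu172 on glycosylation of Glu78
print(f"{dg:.1f} kJ/mol = {dg / KJ_PER_KCAL:.1f} kcal/mol")
```

Under these assumptions, the shift of about 2.5 pKa units corresponds to roughly 14 kJ/mol (about 3.4 kcal/mol), which gives a feel for the strength of the electrostatic coupling between the two carboxylates.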
In addition to the described coupled ionization equilibria of the
two catalytic residues, conformational changes in the protein and
interactions with other neighboring amino acid residues have been
discussed. A striking example is the residue at position 35
in Bcx, which in the folded enzyme is hydrogen bonded to Glu172.
In the so-called alkaline xylanases, such as Bcx, an asparagine (Asn)
moiety is found at this site, whereas an aspartic acid (Asp) is
present in those with a more acidic pH optimum [45]. Mutation
of Asn35 to Asp (N35D) in Bcx resulted not only in an increase
of activity by ∼20% but also in a pronounced decrease of the pH
optimum to 4.6 compared to 5.7 of the wild-type enzyme [54].
Kinetic analysis of kcat /Km versus pH yielded a typical bell-shaped
activity profile with apparent pKa values of 3.5 and 5.8 for Glu78 and
Glu172, respectively. Interestingly, NMR-monitored titration data on
the N35D variant show triphasic curves for Glu78 and Glu172 with
apparent pKa values of 5.7 and 8.4, respectively. Thus, on the basis
of the latter values, the pH optimum of the N35D variant would be
estimated to be 7.0. The apparent pKa value for Asp35 in this case
is 3.7, and it appears that the kinetically measured pH-dependent
activity of the N35D variant is governed by Glu78 and Asp35 rather
than by Glu78 and Glu172, as observed in wild-type Bcx [54]. This
phenomenon was explained by a kinetic mechanism termed
reverse protonation [56, 57], in which substrate hydrolysis requires
that the group with the higher pKa (5.7 for Glu78) is ionized,
while the side chain with the lower pKa (3.7 for Asp35) remains
protonated. At the pH optimum, only ∼1% of the Bcx N35D
enzyme is in this favorable ionization state; this small fraction,
however, is a highly active catalyst, since the transition state for
glycosyl transfer is considerably stabilized by the hydrogen bond
established between Asp35 and Glu172 in the glycosyl–enzyme
intermediate [54].
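The figure of roughly 1% can be reproduced with a simple Henderson–Hasselbalch estimate. The sketch below treats the two groups as independent titrating sites, which is a simplifying assumption because the residues in Bcx are electrostatically coupled. The function names are illustrative only.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of a group present in its ionized (deprotonated) form."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a group present in its protonated form."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


def reverse_protonated_fraction(ph: float,
                                pka_nucleophile: float = 5.7,  # Glu78 in N35D
                                pka_acid: float = 3.7) -> float:  # Asp35
    """Population with the higher-pKa group ionized and the lower-pKa
    group protonated, assuming independent sites."""
    return (fraction_deprotonated(ph, pka_nucleophile)
            * fraction_protonated(ph, pka_acid))


# Population of the catalytically competent state at the N35D pH optimum.
print(f"{100 * reverse_protonated_fraction(4.6):.1f} %")
```

At pH 4.6 this gives a population of just under 1%, consistent with the estimate quoted above.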
To gain a more detailed understanding of how motions affect the
catalytic center of Bcx, a CP approach was tested [58]. Bcx folds
into a β-jellyroll structure, which positions its native amino- and
carboxy-termini in very close proximity (Fig. 19.3). The positively
charged α-amine of alanine at position 1 and the negatively charged
α-carboxyl of tryptophan 185 at the C-terminus are separated by
∼2.7 Å and form a salt bridge. In addition to directly joining the
two termini, peptide linkers consisting of one or two glycine
residues were also tested. X-ray crystallographic studies of a CP
with a single-glycine linker revealed only small structural
perturbations in this region. However, around the new N-terminus at
alanine 123 within a thumb-like loop over the active site,
conformational differences between monomeric copies of the permutant
within the same crystal lattice were observed, showing that this
part of the protein is relatively flexible. There was hardly any
difference in the catalytic performance of the three linker variants
tested, yet the two-glycine linker was found to be the most stable
version, as determined by thermal denaturation curves; this
two-residue linker was therefore selected as a template for the
generation of a circular Bcx permutant library [58]. For CP, the
circularized coding sequence of Bcx was randomly linearized through
limited DNase I treatment, and the CP fragments thus generated were
cloned into an expression vector containing a leader peptide,
ensuring transport of the translated CPs to the periplasmic space.
Catalytically active CP variants were screened with a Congo Red
overlay plate assay; that is, bacterial colonies secreting active CP
variants hydrolyzed the substrate xylan contained in the overlaid
soft agar and thus, after staining, formed visible halos around
those colonies.
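At the protein level, the outcome of the library construction described above can be sketched as follows. This is a deliberately simplified illustration with hypothetical function names and a toy sequence: it joins the old termini with a linker and reopens the chain at a chosen or random position, but it ignores the DNA-level details of DNase I linearization (reading frame, and the deletions or duplications that can occur at the cut site).

```python
import random
from typing import List, Optional


def circular_permutant(seq: str, new_start: int, linker: str = "GG") -> str:
    """Return the permutant whose new N-terminus is residue `new_start`
    (1-based, wild-type numbering); the old termini are joined by `linker`."""
    if not 1 <= new_start <= len(seq):
        raise ValueError("new_start must be a valid residue number")
    i = new_start - 1
    return seq[i:] + linker + seq[:i]


def random_library(seq: str, size: int, linker: str = "GG",
                   rng: Optional[random.Random] = None) -> List[str]:
    """Simulate `size` random linearizations of the circularized sequence."""
    rng = rng or random.Random()
    return [circular_permutant(seq, rng.randint(1, len(seq)), linker)
            for _ in range(size)]


toy = "MKTAYIAKQR"  # hypothetical ten-residue sequence, not Bcx
print(circular_permutant(toy, 4))   # new N-terminus at residue 4
print(random_library(toy, 3, rng=random.Random(0)))
```

In this sketch a permutant is identified by the wild-type residue number of its new N-terminus, which is the information that sequencing of active clones provides.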
Figure 19.3 Circular permutation analysis of Bacillus circulans
xylanase (Bcx). The amino acid sequence of Bcx is shown in
one-letter code. New N-termini of screened permutants are indicated
in blue with subjacent wild-type numbering. Secondary structure
elements (yellow and auburn bars) and calculated putative permutant
sites (CPred, red bars) encircle the sequence. The two-glycine
linker (brown) is marked between the native N-terminus (alanine [A]
1) and C-terminus (tryptophan [W] 185). In the center the
three-dimensional structure of Bcx is shown (Protein Data Bank [PDB]
[59] accession number: 1BCX [46]). In the model catalytically active
permutants are illustrated in blue and predicted CP sites (CPred) in
red. Active site amino acid residues are labeled with green diamonds
and with tags (E78 and E172). Positions of the five permutants (G34,
N35, R73, Y94, and Y174) and the native N- and C-termini are marked.

This assay revealed that about 2% of the 3000 library members
screened showed hydrolytic activity against xylan. Subsequent
sequencing identified 35 active variants with unique new
amino-terminal positions, which is quite striking considering
that Bcx is composed of only 185 amino acid residues
in total (Fig. 19.3). Remarkably, the new ends were
found not only in exposed loops but also within β-strands and
in close vicinity to the active center. Less than half of the new
start sites were found in loop regions, namely cpA1 (Lβ13-β1),
cpS2 (Lβ13-β1), cpT3 (Lβ13-β1), cpG13 (Lβ1-β2), cpG23 (Lβ2-β3),
cpG34 (Lβ3-β4), cpG92 (Lβ7-β8), cpD119 (Lβ9-β10), cpK135
(Lβ10-β11), cpG139 (Lβ10-β11), cpS162 (Lβ11-β12), cpW185
(Lβ13-β1), and cpLinkG2 (Lβ13-β1). Each name gives the first amino
acid residue of the permutant and the position of the new
N-terminus with respect to wild-type sequence numbering; the
corresponding loop is indicated in parentheses. The rest of the
permutants had the new N-terminal ends either at the beginning or at
the end of a β-strand or involved cleavage of a strand, such as
cpW6 (β1), cpT10 (β1), cpA18 (β2), cpS27 (β3), cpG56 (β5), cpP60
(β5), cpR73 (β6), cpY94 (β8), cpG96 (β8), cpV98 (β8), cpQ127
(β10), cpY166 (β12), and cpN181 (β13). Because the random CP
approach relies on nucleolytic enzyme treatment, some residues were
frequently deleted at the C-terminus. A complete list of deletions
is given by Reitinger et al. [58]. For instance, in cpV98 almost the
entire β-strand 8 is deleted. Since most of the Bcx permutants are
properly folded and active, it is suggested that hydrogen bonds
within the two β-sheets are sufficient to compensate for the
interactions of the clipped C-terminus. No permutants were found in
the α-helix or in β-strand 9. In the catalytically most active
permutants, the new N-termini were located either in loop regions or
at the beginning or end of a β-strand.
Of great interest are circularly permuted variants close to the
catalytic center, such as cpE78, which places the catalytic
nucleophile at the N-terminus. However, these CPs were suggested to
be deficient in maintaining the proper three-dimensional structure;
they were thus thermally unstable and showed only limited activity
against the tested substrates [58]. This was also true for cpQ127,
whose new N-terminal residue, Q127, hydrogen bonds to E78. Although
cpQ127 was missing the entire hairpin thumb-like loop region
(between Y108 and Q127), it was found to be catalytically active.
Unfortunately, detailed kinetic or structural measurements could not
be performed due to insufficient protein expression. For cpN35,
cpY94, and cpY174, in which the new termini are in close proximity
to the active site, yields upon expression in the bacterium
Escherichia coli were relatively high (at least 50% relative to
wild-type Bcx), and each permutant folded properly, as indicated by
circular dichroism (CD) spectra that are practically identical to
that of wild-type Bcx [58]. Interestingly, thermal denaturation
studies suggested that relocation of the N-terminus to any of the
positions N35, Y94, or Y174 impinges on enzyme stability. The
midpoint temperature (Tm) for the irreversible unfolding of
wild-type Bcx is 56 °C and was lowered by 3.7 °C for cpY94, by
5.6 °C for cpN35, and by 10.1 °C for cpY174 [58]. In contrast, the
hydrolytic activity of the three permutants was found to be
comparable to or marginally higher than that of wild-type Bcx. In
the case of cpN35, kcat/Km was shown to increase from 21 to
76 s−1·mM−1 when compared to the native form of Bcx [58]. Kinetic
parameters were obtained with 2,5-dinitrophenyl β-xylobioside and a
substrate depletion approach. In summary, it could be suggested that
the catalytic performance of Bcx can be improved by partly loosening
the enzyme’s firm conformation at the cost of structural stability.
NMR spectroscopic analysis of the polypeptide backbone of wild-type
Bcx, cpN35, and cpY94 supports the notion that the permutants indeed
show altered dynamic properties, albeit to a limited extent. In
1H-15N heteronuclear single-quantum coherence (HSQC) spectra, a
discrete signal is observed for each backbone or side-chain 1H-15N
amide or indole group, so that the spectrum can be regarded as the
protein’s individual fingerprint. Comparing the 1HN and 15N chemical
shift differences between wild-type Bcx and cpN35 revealed only
localized structural alterations in proximity to the new and old
termini. In line with this observation, signals for the acid/base
catalyst E172 as well as G173, Y174, and Q175 could not be observed
in the HSQC spectrum of cpN35. These amino acid residues are
structurally contiguous to the new N-terminus at position 35, and
their missing signals are thus indicative of localized
perturbations. The authors suggested conformational exchange
broadening due to motions in the millisecond-to-microsecond time
range as the reason for the missing signals in the HSQC spectra
[58]. In addition, 15N T1 and T2 relaxation measurements, as well as
heteronuclear 1H-15N nuclear Overhauser effect (NOE) experiments,
confirmed cpN35 to be properly folded and monomeric, with a global
correlation time in the nanosecond range similar to that of
wild-type Bcx [35]. Only small variations were reported, in
particular close to the C-terminus of the permutant, that is, at
positions T33 and G34. In contrast, the new N-terminus was observed
to be rather rigid.
Employing computational tools such as the web interface CPred
[60] to predict putative new N-termini is a useful starting point
for CP studies. In brief, CPred extracts three-dimensional
structural data (solvent accessibility, hydrogen bonds and packing
density, physicochemical properties of amino acid residues, etc.)
from the submitted protein structure and applies four different
machine-learning methods. The probability scores of the individual
algorithms are then averaged to yield an integrated score for viable
CP site prediction [60]. Predicted regions for Bcx are illustrated
in Fig. 19.3. Frequently, calculated sites and experimentally
confirmed backbone fissures yielding catalytically active permutants
match closely; however, discrepancies also need to be noted, for
example, in the α-helix. In this secondary structure element,
putative CP sites are predicted by the CPred algorithm, although in
the random CP study no permutant with the N-terminus relocated to
the α-helix was detected. The authors of the study suggest that, in
addition to this being a nonfavorable region, the number of CP
candidates analyzed could simply have been too small [58]. The large
number of active permutants generated on the basis of wild-type Bcx
with a two-glycine linker is an indication of the enzyme’s
structural stability. As expected and also predicted by
bioinformatics programs, new ends are frequently relocated to
exterior loops of Bcx. However, the relatively high number of
permutation sites within β-strands is presumably related to Bcx’s
β-jellyroll structure and the resultant tight hydrogen-bonding
network, which stabilizes even less favorable cleavage sites while
providing room for altered catalytic performance. Unfortunately, not
all active permutants could be studied in detail due to low protein
yields or loss of stability.
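The score integration step described above for CPred, in which per-residue probabilities from several predictors are averaged, can be illustrated with a short sketch. This is not CPred's actual code or interface: the function names, the toy probabilities, and the cutoff of 0.5 are all assumptions chosen for illustration.

```python
from statistics import mean
from typing import List, Sequence


def integrated_scores(per_method: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-residue probabilities of several predictors."""
    if len({len(scores) for scores in per_method}) != 1:
        raise ValueError("all methods must score the same number of residues")
    return [mean(column) for column in zip(*per_method)]


def predicted_sites(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """1-based residue numbers whose integrated score reaches the cutoff.

    The cutoff of 0.5 is a hypothetical value, not one taken from CPred.
    """
    return [i + 1 for i, score in enumerate(scores) if score >= threshold]


# Made-up probabilities for five residues from four hypothetical methods.
toy_scores = [
    [0.2, 0.8, 0.7, 0.1, 0.9],
    [0.4, 0.6, 0.5, 0.3, 0.7],
    [0.2, 0.8, 0.7, 0.1, 0.9],
    [0.4, 0.6, 0.5, 0.3, 0.7],
]
print(predicted_sites(integrated_scores(toy_scores)))
```

Averaging dampens the effect of a single predictor that disagrees with the others, which is one plausible motivation for combining several methods into one integrated score.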
In summary, Bcx’s compact jellyroll structure makes this small
enzyme a suitable model protein for investigating CP and its
effects on folding and catalytic performance, in particular when
comparing the CP data to the large body of existing structural
and functional data. In this regard, one of the most
remarkable findings of the study on Bcx is the tolerance to CP at
many positions, beyond just exposed loops.

19.4 Circular Permutation on Candida antarctica Lipase B

Lipases (E.C. 3.1.1.3, triacylglycerol hydrolases) are enzymes that
naturally catalyze the hydrolysis of triglycerides (fats) into glycerol
and fatty acids. However, in anhydrous organic media, they are
also active on a broad range of substrates and can catalyze
other reactions, such as esterification, transesterification, and
amidation. The versatility and generally high selectivity of lipases
in both aqueous and organic solvents have made them some of the most
useful biocatalysts in both industry and academia. Applications of
lipases as asymmetric biocatalysts include the kinetic resolution of
racemic alcohols and amides, the desymmetrization of complex drug
intermediates, and biodiesel production [62–64]. In the lipase
family, Candida antarctica lipase B (CaLB) is among the most popular
members due to its high efficiency and stereoselectivity.
CaLB originates from the yeast C. antarctica. It is composed of
317 amino acids and has a molecular mass of 33 kDa. The crystal
structure of CaLB has been reported by Uppenberg et al. [65–67].
It is a monomer and belongs to the α/β hydrolase fold family,
characterized by a central, eight-stranded parallel β-sheet flanked on
both sides by α-helices (Fig. 19.4). The active site of CaLB contains
a typical Ser-His-Asp catalytic triad, an oxyanion hole that stabilizes
the intermediate, and the substrate-binding pocket composed of one
acyl-binding site and one alcohol-binding site. Unlike many α/β
hydrolases, which have a separate cap domain, the cap of CaLB
is formed by three short helices in the middle of the sequence
(α7/9, residues 135–155) and two long helices at the C-terminus
(α16/17, residues 267–288). Helix α7/9 is also known as the lid,
and helix 17 is part of the alcohol-binding site. The cap region plays
a critical role in catalysis by defining the substrate-binding pocket
and regulating the access to the active site.
CaLB has been successfully expressed in the fungus Aspergillus oryzae
and in the yeasts Pichia pastoris and Saccharomyces cerevisiae
[68]. The purified enzyme, especially in its immobilized form, is a
robust biocatalyst with high stability and solvent tolerance [69].
Because of its considerable commercial and research value, CaLB has
been a popular target of protein engineering to tailor its catalytic
properties to different applications. Using rational design
and directed evolution approaches, a range of mutant proteins with
improved activity, stability, or stereoselectivity were generated [70,
71].
Figure 19.4 Three-dimensional structure of Candida antarctica lipase B
(CaLB). The three-dimensional structure of CaLB is shown in cartoon
representation on the basis of the PDB [59] entry 1TCA [65]. The catalytic
triad is shown in stick mode and labeled in green (S105, D187, and H244).
New N-termini of discussed circular permutants are illustrated in blue
and labeled with tags. Relevant α-helices are labeled in red. The native
N-terminus and C-terminus are marked.

On the basis of the hypothesis that termini relocation could
benefit enzyme catalysis by altering the accessibility of the
active site pocket and increasing backbone flexibility, Lutz’s
group applied random CP to CaLB [72]. The 17 Å distance between
the native termini was bridged by a flexible six-amino-acid
linker (–GGTSGG–). The generated CP library was expressed in
P. pastoris and screened for hydrolytic activity toward tributyrin
using a colony-based assay. This functional screening identified
63 unique variants that retained hydrolytic activity, corresponding
to almost 20% of the theoretically possible permuted enzymes.
Similar to the observations in Bcx, many of the exposed surface
loops were tolerant to permutation. Over half of the active variants
had their new termini located in loop regions. Interestingly,
however, newly introduced termini were also located in secondary
structure elements, with the majority concentrated in the cap region
that includes the lid helix 7/9 and the C-terminal helix 16/17 [72].
The tolerance of α16/17 to CP is particularly notable since this
long helical segment sits on top of the active site pocket and makes
up the alcohol-binding site. Additionally, a cluster of functional
variants was found around the native N-terminus and C-terminus,
including β-strand 9, which was not surprising because the
terminal regions are usually flexible and these permutants should
be very similar to the wild-type enzyme. No active permutants were
found in the β-sheets and the α-helices that make up the core of the
α/β hydrolase fold.
The impact of CP on enzymatic function was evaluated with a
variety of substrates. Initial kinetic analysis using p-nitrophenyl
butyrate (p-NB) and 6,8-difluoro-4-methylumbelliferyl (DiFMU)
octanoate as reference substrates revealed that relocation of the
termini into the cap helix 16/17 substantially benefited the
catalytic performance. Six tested permutants from this region,
cpP268, cpL277, cpL278, cpA283, cpA284, and cpP289, all showed
higher catalytic efficiency than the wild-type enzyme, ranging from
2- to 14-fold for p-NB and from 6- to 175-fold for DiFMU octanoate
[72, 73]. The activity enhancement is mostly due to an increase in
the turnover rate kcat. In contrast to the cap permutants,
permutants cpG44 and cpQ193, with new termini close to the active
site pocket, suffered a significant drop in activity. A loss of more
than tenfold in catalytic efficiency was observed, suggesting that
backbone cleavage within the protein core has detrimental effects on
enzyme performance. The other group of permutants, with new termini
located in the lid region (cpL144, cpA148, and cpS150), did not show
dramatic activity changes. To further explore the influence of CP on
enzyme function, a more detailed kinetic study was carried out with
cpA283, the permutant with the highest rate enhancement. Purified
cpA283 was immobilized for reactions in organic solvents. It was
first tested in esterification and transesterification with several
secondary alcohols and carboxylates, which demonstrated that CP did
not affect the enantiomeric preference of the enzyme. The permutants
with hydrolytic rate enhancement retained or even slightly increased
their enantioselectivity compared to that of the wild-type enzyme.
Next, the evaluation of immobilized cpA283 in transesterification
reactions was extended to a series of pure triglycerides, as well
as various cooking oils [74]. In comparison with wild-type
CaLB, the permutant showed 2.6- to 9-fold higher catalytic efficiency
for triglycerides with varied fatty acid chains. These increases
were exclusively due to the faster turnover rate, as the apparent
binding constants KM mostly remained unchanged. Furthermore,
the potential application of permuted CaLB in biodiesel production
was demonstrated by the conversion of three different vegetable oils.
In the transesterification of butanol and oils, as well as the
interesterification of ethyl acetate with oils, cpA283 consistently
outperformed wild-type CaLB. Finally, immobilized cpA283
also showed a two- to fourfold improvement in the synthesis of a
variety of acyl glycerols [75]. Taken together, these activity
studies demonstrate the benefit of CP for CaLB catalysis. The
observed enhancements of the cpCaLBs were believed to result from
changes in local backbone flexibility and active site accessibility.
The structural influence of CP on CaLB was first examined by
intrinsic tryptophan fluorescence, which indicated little change in
the overall tertiary structure of several permutants [73].
However, CD spectroscopy revealed significant stability losses. The
Tm value of the helix 16/17 permutants dropped by 12 °C–14 °C, while
the Tm value of the lid permutants dropped by 4 °C–7 °C compared
with the wild-type enzyme. This stability loss was attributed to the
glycine-rich linker that connected the native termini of CaLB. It
was believed that, together with the original termini, this linker
introduced an extended, structurally poorly defined region in the
protein. In line with this hypothesis, a second engineering
process, called incremental truncation, was carried out on the
linker and eventually yielded new permutants with improved
stability and preserved high activity [76]. The gained stability was
attributed to the removal of flexible residues in the linker
region that destabilized the overall protein structure. In addition,
linker truncation changed the quaternary structure from
monomer to dimer. X-ray crystallography of one truncated variant
(cp283Δ7) revealed a domain-swapped homodimer and provided
structural insight into the consequences of CP (PDB code: 3ICW).
The backbone cleavage at the C-terminal helix 16/17 opened up
the active site pocket and converted the narrow access tunnel to
the catalytic triad into a broad crevice that contributed to
accelerated substrate entry and product release. In agreement with
literature reports on circular permutants, termini relocation in
CaLB has its greatest impact on the local region near the original
and new termini, whereas few changes in the protein core structure were
observed: the core fold of cp283Δ7 overlays very well with the
wild-type structure. The domain swapping in cp283Δ7 involves
the N-terminal segments of the two subunits. The N-terminus of one
subunit points away from the rest of the protein and orients itself onto
the other subunit in such a way that its N-terminus precisely connects
with the other subunit [76]. Such terminal reorientation minimizes
the exposed surface area and was considered the driving force
behind the increased thermostability.
In conclusion, CP of CaLB is a compelling example of how simple
relocation of the backbone termini to specific regions can relax
the polypeptide conformation and thus considerably improve the catalytic
performance against certain substrates.
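
At the sequence level, the termini relocation discussed above amounts to a simple rearrangement. The following Python sketch is only an illustration under stated assumptions: the function name, the toy sequence, and the GG linker are hypothetical and do not correspond to any construct described in this chapter.

```python
def circular_permutant(sequence: str, new_n_terminus: int, linker: str = "") -> str:
    """Return the sequence of a circular permutant (illustrative helper).

    new_n_terminus is the 1-based position in the wild-type sequence
    that becomes residue 1 of the permutant.
    """
    if not 1 <= new_n_terminus <= len(sequence):
        raise ValueError("new_n_terminus must lie within the sequence")
    cut = new_n_terminus - 1
    # New N-terminus up to the old C-terminus, then the linker that bridges
    # the old termini, then the old N-terminal segment.
    return sequence[cut:] + linker + sequence[:cut]


# Toy 10-residue sequence (not a real enzyme) with a hypothetical GG linker.
toy = "MKTAYIAKQR"
print(circular_permutant(toy, 5, linker="GG"))  # prints YIAKQRGGMKTA
```

Choosing position 1 as the new N-terminus with an empty linker returns the wild-type sequence, which is a convenient sanity check.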

19.5 Conclusion

CP studies on Bcx and CaLB demonstrate that numerous active
circular permutants can be generated on the basis of the
wild-type templates. In both enzymes, loop regions have been identified
as permissive for the relocation of new termini. In addition,
the original amino- and carboxy-termini, as well as the peptide
linker, were shown to be hot spots for backbone fissures yielding
catalytically competent variants. In contrast, buried secondary
structure elements, which are essential for the accurate folding
of the protein and thus directly for the stability and proper functioning
of the enzyme, seem to be rather resistant to CP. In Bcx, it was
demonstrated that two intra-β-strand CP sites in the active center,
that is, Asn at position 35 (cpN35), which is hydrogen bonded
to the general acid/base catalyst, and the catalytic nucleophile
glutamate at position 78 (cpE78), can serve as new N-termini.
A fourfold higher catalytic activity was reported for the former,
whereas the latter could not be analyzed in detail due to diminished
protein stability and solubility. Internal α-helical CP sites have
been reported for CaLB, yielding variants with over 100-fold
improved turnover rates. Increased conformational flexibility in
both discussed enzymes contributed to the improved catalytic
action. Moreover, a dense network of hydrogen bonds assured
proper enzyme folding while affording regional relaxation in
proximity to the newly established backbone ends. Presumably, an
improved turnover rate in enzymes can be achieved at the cost
of stability or of localized conformational perturbation.
The gained structural flexibility is thus reflected in enhanced active
site accessibility and/or spatial alterations in rate-determining
substrate–enzyme interactions.
However, not only the location of the new backbone fissure
matters; the nature of the peptide linker joining the old
ends is also important. In both enzymes, the natural N- and C-termini
were spatially close and could be bridged by a two-residue or six-
residue linker, respectively. Short distances (<10 Å) between
the old termini work best and permit native protein folding
with only little structural disturbance. Small, hydrophilic amino
acid residues have preferentially been employed to join the ends.
In general, the chance of perturbation increases with the hydrophobicity
and size of the residues used to build the linker, as well as with the
distance that needs to be spanned.
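
The distance criterion above can be checked directly from a structure. The Python sketch below is a minimal illustration: measuring between the Cα atoms of the terminal residues is an assumption made here (the text does not specify which atoms are used), and the function names and coordinates are hypothetical.

```python
import math


def terminal_distance(n_term_ca, c_term_ca):
    """Euclidean distance (in Å) between two Cα coordinates given as (x, y, z)."""
    return math.dist(n_term_ca, c_term_ca)


def short_linker_plausible(n_term_ca, c_term_ca, cutoff=10.0):
    """True if the termini lie closer than the cutoff (10 Å, as in the text)."""
    return terminal_distance(n_term_ca, c_term_ca) < cutoff


# Hypothetical coordinates, chosen only to illustrate the calculation.
n_ca = (0.0, 0.0, 0.0)
c_ca = (3.0, 4.0, 0.0)
print(terminal_distance(n_ca, c_ca))       # prints 5.0
print(short_linker_plausible(n_ca, c_ca))  # prints True
```

In practice the coordinates would be read from a deposited structure, and a distance below the cutoff only indicates that a short linker is geometrically feasible, not that the permutant will fold.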
Beyond Bcx and CaLB, the Lutz group has recently
extended CP to the flavin-dependent oxidoreductase old yellow
enzyme (OYE) [77]. Instead of the random CP approach, the group
used a highly efficient cell-free technique that combines whole-gene
synthesis with in vitro transcription and translation for library
preparation. This strategy dramatically reduced the library size
and maximized diversity. Functional analysis with three reference
substrates identified over 70 active variants, several of which showed
activity improved by an order of magnitude. As with CaLB, the new
termini of functional variants were concentrated in loop and lid regions
near the active site.
In summary, CP represents a powerful tool for protein scientists
to improve the catalytic performance of enzymes. In combination
with other protein engineering approaches, CP allows catalytic
action to be extended to altered environmental conditions such as
ionic strength, temperature, and pH, and even allows screening for
variants with novel enzymatic properties.

Acknowledgments

The authors would like to thank Dr. Lawrence P. McIntosh
(University of British Columbia, Vancouver, BC, Canada) and
Dr. Stefan Lutz (Emory University, Atlanta, GA, USA) for fruitful
discussions and for carefully reading the chapter. SR is supported by a
Marie Curie International Reintegration Grant (European Commission)
and the Herzfelder Family Foundation (Vienna, Austria).

References

1. Heinemann, U., and Hahn, M. (1995). Circular permutations of protein
sequence: not so rare?, Trends Biochem. Sci., 20, pp. 349–350.
2. Jung, J., and Lee, B. (2001). Circularly permuted proteins in the protein
structure database, Protein Sci., 10, pp. 1881–1886.
3. Lindqvist, Y., and Schneider, G. (1997). Circular permutations of natural
protein sequences: structural evidence, Curr. Opin. Struct. Biol., 7, pp.
422–427.
4. Andreeva, A., Prlic, A., Hubbard, T. J., and Murzin, A. G. (2007). SISYPHUS:
structural alignments for proteins with non-trivial relationships, Nucleic
Acids Res., 35, pp. D253–D259.
5. Lo, W. C., and Lyu, P. C. (2008). CPSARST: an efficient circular
permutation search tool applied to the detection of novel protein
structural relationships, Genome Biol., 9, p. R11.
6. Uliel, S., Fliess, A., and Unger, R. (2001). Naturally occurring circular
permutations in proteins, Protein Eng., 14, pp. 533–542.
7. Lo, W. C., Lee, C. C., Lee, C. Y., and Lyu, P. C. (2009). CPDB: a database of
circular permutation in proteins, Nucleic Acids Res., 37, pp. D328–D332.
8. Cunningham, B. A., Hemperly, J. J., Hopp, T. P., and Edelman, G. M. (1979).
Favin versus concanavalin A: circularly permuted amino acid sequences,
Proc. Natl. Acad. Sci. U S A, 76, pp. 3218–3222.
9. Hemperly, J. J., Hopp, T. P., Becker, J. W., and Cunningham, B. A. (1979).
The chemical characterization of favin, a lectin isolated from Vicia faba,
J. Biol. Chem., 254, pp. 6803–6810.
10. Edelman, G. M., Cunningham, B. A., Reeke, G. N., Jr., Becker, J. W., Waxdal,
M. J., and Wang, J. L. (1972). The covalent and three-dimensional
structure of concanavalin A, Proc. Natl. Acad. Sci. U S A, 69, pp. 2580–
2584.

11. Einspahr, H., Parks, E. H., Suguna, K., Subramanian, E., and Suddath, F.
L. (1986). The crystal structure of pea lectin at 3.0-Å resolution, J. Biol.
Chem., 261, pp. 16518–16527.
12. Reeke, G. N., Jr., Becker, J. W., Cunningham, B. A., Wang, J. L., Yahara, I.,
and Edelman, G. M. (1975). Structure and function of concanavalin A,
Adv. Exp. Med. Biol., 55, pp. 13–33.
13. Reeke, G. N., Jr., Becker, J. W., and Edelman, G. M. (1975). The covalent and
three-dimensional structure of concanavalin A. IV. Atomic coordinates,
hydrogen bonding, and quaternary structure, J. Biol. Chem., 250, pp.
1525–1547.
14. Carrington, D. M., Auffret, A., and Hanke, D. E. (1985). Polypeptide
ligation occurs during post-translational modification of concanavalin
A, Nature, 313, pp. 64–67.
15. Bowles, D. J., Marcus, S. E., Pappin, D. J., Findlay, J. B., Eliopoulos, E.,
Maycox, P. R., and Burgess, J. (1986). Posttranslational processing of
concanavalin A precursors in jackbean cotyledons, J. Cell Biol., 102, pp.
1284–1297.
16. Bowles, D. J., and Pappin, D. J. (1988). Traffic and assembly of
concanavalin A, Trends Biochem. Sci., 13, pp. 60–64.
17. Hazkani-Covo, E., Altman, N., Horowitz, M., and Graur, D. (2002). The
evolutionary history of prosaposin: two successive tandem-duplication
events gave rise to the four saposin domains in vertebrates, J. Mol. Evol.,
54, pp. 30–34.
18. Ponting, C. P., and Russell, R. B. (1995). Swaposins: circular permutations
within genes encoding saposin homologues, Trends Biochem. Sci.,
20, pp. 179–180.
19. Guruprasad, K., Tormakangas, K., Kervinen, J., and Blundell, T. L. (1994).
Comparative modelling of barley-grain aspartic proteinase: a structural
rationale for observed hydrolytic specificity, FEBS Lett., 352, pp. 131–
136.
20. Henrissat, B. (1991). A classification of glycosyl hydrolases based on
amino acid sequence similarities, Biochem. J., 280(Pt 2), pp. 309–
316.
21. Henrissat, B., and Davies, G. (1997). Structural and sequence-based
classification of glycoside hydrolases, Curr. Opin. Struct. Biol., 7, pp. 637–
644.
22. Hahn, M., Piotukh, K., Borriss, R., and Heinemann, U. (1994). Native-like
in vivo folding of a circularly permuted jellyroll protein shown by crystal
structure analysis, Proc. Natl. Acad. Sci. U S A, 91, pp. 10417–10421.
23. Keitel, T., Simon, O., Borriss, R., and Heinemann, U. (1993). Molecular
and active-site structure of a Bacillus 1,3-1,4-beta-glucanase, Proc. Natl.
Acad. Sci. U S A, 90, pp. 5287–5291.
24. Eisenbeis, S., and Höcker, B. (2010). Evolutionary mechanism as a
template for protein engineering, J. Pept. Sci., 16, pp. 538–544.
25. Aharoni, A., Gaidukov, L., Khersonsky, O., McQ Gould, S., Roodveldt, C., and
Tawfik, D. S. (2005). The 'evolvability' of promiscuous protein functions,
Nat. Genet., 37, pp. 73–76.
26. Bliven, S., and Prlic, A. (2012). Circular permutation in proteins, PLOS
Comput. Biol., 8, p. e1002445.
27. Yu, Y., and Lutz, S. (2011). Circular permutation: a different way to
engineer enzyme structure and function, Trends Biotechnol., 29, pp. 18–
25.
28. Jeltsch, A. (1999). Circular permutations in the molecular evolution of
DNA methyltransferases, J. Mol. Evol., 49, pp. 161–164.
29. Lee, J., and Blaber, M. (2011). Experimental support for the evolution of
symmetric protein architecture from a simple peptide motif, Proc. Natl.
Acad. Sci. U S A, 108, pp. 126–130.
30. Weiner, J., 3rd, and Bornberg-Bauer, E. (2006). Evolution of circular
permutations in multidomain proteins, Mol. Biol. Evol., 23, pp. 734–
743.
31. Wang, C. K., Kaas, Q., Chiche, L., and Craik, D. J. (2008). CyBase: a
database of cyclic protein sequences and structures, with applications
in protein discovery and engineering, Nucleic Acids Res., 36, pp. D206–
D210.
32. Yang, R. C., MacKenzie, C. R., Bilous, D., and Narang, S. A. (1989).
Hyperexpression of a Bacillus circulans xylanase gene in Escherichia
coli and characterization of the gene product, Appl. Environ. Microbiol.,
55, pp. 1192–1195.
33. Yang, R. C., MacKenzie, C. R., Bilous, D., and Narang, S. A. (1989).
Identification of two distinct Bacillus circulans xylanases by molecular
cloning of the genes and expression in Escherichia coli, Appl. Environ.
Microbiol., 55, pp. 568–572.
34. Yang, R. C., MacKenzie, C. R., and Narang, S. A. (1988). Nucleotide
sequence of a Bacillus circulans xylanase gene, Nucleic Acids Res., 16,
p. 7187.
35. Connelly, G. P., Withers, S. G., and McIntosh, L. P. (2000). Analysis of the
dynamic properties of Bacillus circulans xylanase upon formation of a
covalent glycosyl-enzyme intermediate, Protein Sci., 9, pp. 512–524.

36. Davoodi, J., Wakarchuk, W. W., Campbell, R. L., Carey, P. R., and
Surewicz, W. K. (1995). Abnormally high pKa of an active-site glutamic
acid residue in Bacillus circulans xylanase. The role of electrostatic
interactions, Eur. J. Biochem., 232, pp. 839–843.
37. McIntosh, L. P., Hand, G., Johnson, P. E., Joshi, M. D., Korner, M., Plesniak,
L. A., Ziser, L., Wakarchuk, W. W., and Withers, S. G. (1996). The pKa
of the general acid/base carboxyl group of a glycosidase cycles during
catalysis: a 13C-NMR study of Bacillus circulans xylanase, Biochemistry,
35, pp. 9958–9966.
38. Plesniak, L. A., Connelly, G. P., Wakarchuk, W. W., and McIntosh, L. P.
(1996). Characterization of a buried neutral histidine residue in Bacillus
circulans xylanase: NMR assignments, pH titration, and hydrogen
exchange, Protein Sci., 5, pp. 2319–2328.
39. Plesniak, L. A., Wakarchuk, W. W., and McIntosh, L. P. (1996). Secondary
structure and NMR assignments of Bacillus circulans xylanase, Protein
Sci., 5, pp. 1118–1135.
40. Törrönen, A., Harkki, A., and Rouvinen, J. (1994). Three-dimensional
structure of endo-1,4-beta-xylanase II from Trichoderma reesei: two
conformational states in the active site, EMBO J., 13, pp. 2493–2501.
41. Törrönen, A., and Rouvinen, J. (1995). Structural comparison of two
major endo-1,4-xylanases from Trichoderma reesei, Biochemistry, 34,
pp. 847–856.
42. Koshland, D. E. (1953). Stereochemistry and the mechanism of
enzymatic reactions, Biol. Rev., 28, pp. 416–436.
43. Lawson, S. L., Wakarchuk, W. W., and Withers, S. G. (1996). Effects of
both shortening and lengthening the active site nucleophile of Bacillus
circulans xylanase on catalytic activity, Biochemistry, 35, pp. 10110–
10118.
44. Lawson, S. L., Wakarchuk, W. W., and Withers, S. G. (1997). Positioning
the acid/base catalyst in a glycosidase: studies with Bacillus circulans
xylanase, Biochemistry, 36, pp. 2257–2265.
45. Törrönen, A., and Rouvinen, J. (1997). Structural and functional properties
of low molecular weight endo-1,4-beta-xylanases, J. Biotechnol., 57,
pp. 137–149.
46. Wakarchuk, W. W., Campbell, R. L., Sung, W. L., Davoodi, J., and Yaguchi,
M. (1994). Mutational and crystallographic analyses of the active site
residues of the Bacillus circulans xylanase, Protein Sci., 3, pp. 467–475.
47. Wakarchuk, W. W., Sung, W. L., Campbell, R. L., Cunningham, A., Watson,
D. C., and Yaguchi, M. (1994). Thermostabilization of the Bacillus
circulans xylanase by the introduction of disulfide bonds, Protein Eng.,
7, pp. 1379–1386.
48. White, A., and Rose, D. R. (1997). Mechanism of catalysis by retaining
beta-glycosyl hydrolases, Curr. Opin. Struct. Biol., 7, pp. 645–651.
49. White, A., Tull, D., Johns, K., Withers, S. G., and Rose, D. R. (1996).
Crystallographic observation of a covalent catalytic intermediate in a
beta-glycosidase, Nat. Struct. Biol., 3, pp. 149–154.
50. Gebler, J., Gilkes, N. R., Claeyssens, M., Wilson, D. B., Beguin, P.,
Wakarchuk, W. W., Kilburn, D. G., Miller, R. C., Jr., Warren, R. A., and
Withers, S. G. (1992). Stereoselective hydrolysis catalyzed by related
beta-1,4-glucanases and beta-1,4-xylanases, J. Biol. Chem., 267, pp.
12559–12561.
51. Miao, S., Ziser, L., Aebersold, R., and Withers, S. G. (1994). Identification
of glutamic acid 78 as the active site nucleophile in Bacillus subtilis
xylanase using electrospray tandem mass spectrometry, Biochemistry,
33, pp. 7027–7032.
52. Shrager, R. I., Cohen, J. S., Heller, S. R., Sachs, D. H., and Schechter, A. N.
(1972). Mathematical models for interacting groups in nuclear magnetic
resonance titration curves, Biochemistry, 11, pp. 541–547.
53. Ludwiczek, M. L., D’Angelo, I., Yalloway, G. N., Brockerman, J. A., Okon,
M., Nielsen, J. E., Strynadka, N. C., Withers, S. G., and McIntosh, L. P.
(2013). Strategies for modulating the pH-dependent activity of a family
11 glycoside hydrolase, Biochemistry, 52, pp. 3138–3156.
54. Joshi, M. D., Sidhu, G., Pot, I., Brayer, G. D., Withers, S. G., and McIntosh,
L. P. (2000). Hydrogen bonding and catalysis: a novel explanation for
how a single amino acid substitution can change the pH optimum of a
glycosidase, J. Mol. Biol., 299, pp. 255–279.
55. McIntosh, L. P., Naito, D., Baturin, S. J., Okon, M., Joshi, M. D., and Nielsen,
J. E. (2011). Dissecting electrostatic interactions in Bacillus circulans
xylanase through NMR-monitored pH titrations, J. Biomol. NMR, 51, pp.
5–19.
56. Mock, W. L., and Aksamawati, M. (1994). Binding to thermolysin of
phenolate-containing inhibitors necessitates a revised mechanism of
catalysis, Biochem. J., 302(Pt 1), pp. 57–68.
57. Mock, W. L., and Stanford, D. J. (1996). Arazoformyl dipeptide substrates
for thermolysin. Confirmation of a reverse protonation catalytic
mechanism, Biochemistry, 35, pp. 7369–7377.
58. Reitinger, S., Yu, Y., Wicki, J., Ludwiczek, M., D’Angelo, I., Baturin, S.,
Okon, M., Strynadka, N. C., Lutz, S., Withers, S. G., and McIntosh, L. P.
(2010). Circular permutation of Bacillus circulans xylanase: a kinetic
and structural study, Biochemistry, 49, pp. 2464–2474.
59. Bernstein, F. C., Koetzle, T. F., Williams, G. J., Meyer, E. F., Jr., Brice, M. D.,
Rodgers, J. R., Kennard, O., Shimanouchi, T., and Tasumi, M. (1977). The
Protein Data Bank: a computer-based archival file for macromolecular
structures, J. Mol. Biol., 112, pp. 535–542.
60. Lo, W. C., Wang, L. F., Liu, Y. Y., Dai, T., Hwang, J. K., and Lyu, P. C.
(2012). CPred: a web server for predicting viable circular permutations
in proteins, Nucleic Acids Res., 40, pp. W232–W237.
61. Lo, W. C., Dai, T., Liu, Y. Y., Wang, L. F., Hwang, J. K., and Lyu, P. C. (2012).
Deciphering the preference and predicting the viability of circular
permutations in proteins, PLOS ONE, 7, p. e31791.
62. Bornscheuer, U. T., Bessler, C., Srinivas, R., and Krishna, S. H. (2002).
Optimizing lipases and related enzymes for efficient application, Trends
Biotechnol., 20, pp. 433–437.
63. Jaeger, K. E., and Eggert, T. (2002). Lipases for biotechnology, Curr. Opin.
Biotechnol., 13, pp. 390–397.
64. Tan, T., Lu, J., Nie, K., Deng, L., and Wang, F. (2010). Biodiesel production
with immobilized lipase: a review, Biotechnol. Adv., 28, pp. 628–634.
65. Uppenberg, J., Hansen, M. T., Patkar, S., and Jones, T. A. (1994). The
sequence, crystal structure determination and refinement of two crystal
forms of lipase B from Candida antarctica, Structure, 2, pp. 293–308.
66. Uppenberg, J., Ohrner, N., Norin, M., Hult, K., Kleywegt, G. J., Patkar, S.,
Waagen, V., Anthonsen, T., and Jones, T. A. (1995). Crystallographic and
molecular-modeling studies of lipase B from Candida antarctica reveal
a stereospecificity pocket for secondary alcohols, Biochemistry, 34, pp.
16838–16851.
67. Uppenberg, J., Patkar, S., Bergfors, T., and Jones, T. A. (1994). Crystallization
and preliminary X-ray studies of lipase B from Candida antarctica,
J. Mol. Biol., 235, pp. 790–792.
68. Rotticci-Mulder, J. C., Gustavsson, M., Holmquist, M., Hult, K., and
Martinelle, M. (2001). Expression in Pichia pastoris of Candida
antarctica lipase B and lipase B fused to a cellulose-binding domain,
Protein Expr. Purif., 21, pp. 386–392.
69. Idris, A., and Bukhari, A. (2012). Immobilized Candida antarctica lipase
B: hydration, stripping off and application in ring opening polyester
synthesis, Biotechnol. Adv., 30, pp. 550–563.
70. Lutz, S. (2004). Engineering lipase B from Candida antarctica,
Tetrahedron: Asymmetry, 15, pp. 2743–2748.
71. Wu, Q., Soni, P., and Reetz, M. T. (2013). Laboratory evolution of
enantiocomplementary Candida antarctica lipase B mutants with broad
substrate scope, J. Am. Chem. Soc., 135, pp. 1872–1881.
72. Qian, Z., and Lutz, S. (2005). Improving the catalytic activity of Candida
antarctica lipase B by circular permutation, J. Am. Chem. Soc., 127, pp.
13466–13467.
73. Qian, Z., Fields, C. J., and Lutz, S. (2007). Investigating the structural
and functional consequences of circular permutation on lipase B from
Candida antarctica, ChemBioChem, 8, pp. 1989–1996.
74. Yu, Y., and Lutz, S. (2010). Improved triglyceride transesterification by
circular permuted Candida antarctica lipase B, Biotechnol. Bioeng., 105,
pp. 44–50.
75. Laszlo, J. A., Yu, Y., Lutz, S., and Compton, D. L. (2011). Glycerol acyl-
transfer kinetics of a circular permutated Candida antarctica lipase B,
J. Mol. Catal. B, Enzym., 72, pp. 175–180.
76. Qian, Z., Horton, J. R., Cheng, X., and Lutz, S. (2009). Structural redesign
of lipase B from Candida antarctica by circular permutation and
incremental truncation, J. Mol. Biol., 393, pp. 191–201.
77. Daugherty, A. B., Govindarajan, S., and Lutz, S. (2013). Improved
biocatalysts from a synthetic circular permutation library of the flavin-
dependent oxidoreductase old yellow enzyme, J. Am. Chem. Soc., 135, pp.
14425–14432.

March 23, 2016 13:9 PSP Book - 9in x 6in 20-Allan-Svendsen-c20

Chapter 20

Ancestral Reconstruction of Enzymes

Satoshi Akanuma and Akihiko Yamagishi


Department of Applied Life Sciences, Tokyo University of Pharmacy and Life Sciences,
1432-1 Horinouchi, Hachioji, Tokyo 192-0392, Japan
akanuma@toyaku.ac.jp

Phylogenetic prediction of ancient protein sequences, followed by
experimental reconstruction of the ancient proteins, is currently
a common technique for studying the molecular evolution of proteins
and life. We applied the ancestral sequence reconstruction
technique to resurrect ancient proteins that might have been possessed
by the last universal common ancestor, the Commonote. All of
the reconstructed proteins are extremely thermally stable. This
result is consistent with the idea that the common ancestor was
a (hyper)thermophile that flourished at a very high temperature.
In addition, given that the ancient organisms were thermophilic,
the reconstruction technique provides a new method for designing
extremely thermally stable proteins.

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

20.1 Introduction

To date, a huge amount of genome information has been accumulated
from genome projects on various organisms. This genome
information, together with the development of phylogenetic analysis
methods, allows us to infer ancient protein sequences [1, 2]. We can
now reconstruct the ancient genes that encode the ancient proteins
[3]. Producing and characterizing the ancient proteins provides a
powerful means of improving our knowledge of ancient living
systems and the environments of early life.
In this chapter, we first give an overview of ancestral sequence
reconstruction. We then introduce our recent study, which provides experimental
evidence for the thermophilicity of ancient life using the ancestral
sequence reconstruction technique. Finally, we discuss the application
of ancestral sequence reconstruction to the design of thermally stable
proteins.

20.2 Reconstruction of an Ancestral Protein Sequence

20.2.1 Overview
Information about an ancient protein is inherited by its descendants,
that is, extant homologous proteins. Therefore, ancestral sequences
of a particular protein can be inferred by analyzing a set of
homologous amino acid sequences from extant organisms [1–3].
Figure 20.1 illustrates a procedure for reconstructing an ancestral
protein sequence. First, homologous sequences of the target
protein are collected from public databases and
aligned. The resulting multiple alignment is used to build a phylogenetic
tree. Next, using the sequences and the topology of the tree,
an ancestral protein sequence is inferred computationally. Then, the
gene encoding the inferred amino acid sequence is synthesized by
genetic engineering methods, including polymerase chain reaction
(PCR) amplification. Finally, the gene is expressed in Escherichia coli, and
the ancestral protein is purified and characterized.

Figure 20.1 The strategy for ancestral sequence reconstruction. The
flowchart of reconstruction and characterization of an ancestral protein. See
the text for more details of each step.

20.2.2 Methods for Ancestral Sequence Reconstruction

Currently, the maximum likelihood (ML) method [4] and the
Bayesian method [5] are the main methods used for ancestral sequence
reconstruction. The ML method calculates the likelihood of each
type of amino acid at each residue position in an ancestral sequence
at a node of the tree using a statistical model of evolution. The
ancestral sequence inferred by the ML method is the set of residues
that show the highest likelihood of existing at the respective positions.
Thus, the ML method relies only on the most likely estimates of the
tree and substitution model. In contrast, uncertainties in the tree
topology, branch lengths, and substitution model are taken into
account in the computation of an ancestral sequence using the
Bayesian method. Procedures and theories of ancestral sequence
reconstruction are described in greater detail in the reviews by
Thornton [3] and Gaucher et al. [6].
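
The per-site calculation behind the ML method can be illustrated with a deliberately simplified Python sketch. It rests on assumptions that are not taken from the cited methods: the observed sequences are attached directly to the ancestral node (a star tree), all amino acids interconvert at equal rates (the Poisson model), and the prior over ancestral residues is uniform. Real analyses use empirical substitution matrices and the full tree. All names and numbers are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def poisson_prob(a, b, t):
    """Probability of observing residue b after branch length t, starting from a,
    under the equal-rates (Poisson) model for 20 amino acids."""
    n = len(AMINO_ACIDS)
    decay = math.exp(-n * t / (n - 1))
    if a == b:
        return 1.0 / n + (n - 1) / n * decay
    return 1.0 / n - decay / n


def ancestral_posteriors(leaves):
    """Posterior probability of each ancestral residue at one alignment site.

    leaves: list of (observed_residue, branch_length) pairs, each assumed to
    descend directly from the ancestral node.
    """
    n = len(AMINO_ACIDS)
    likelihood = {}
    for anc in AMINO_ACIDS:
        value = 1.0 / n  # uniform prior over ancestral residues (assumption)
        for residue, t in leaves:
            value *= poisson_prob(anc, residue, t)
        likelihood[anc] = value
    total = sum(likelihood.values())
    return {aa: value / total for aa, value in likelihood.items()}


# One hypothetical site: two descendants carry Ala, one carries Gly.
site = [("A", 0.2), ("A", 0.3), ("G", 0.5)]
posteriors = ancestral_posteriors(site)
best = max(posteriors, key=posteriors.get)
print(best, round(posteriors[best], 3))
```

The ML ancestral sequence is then the string of highest-scoring residues, one per site. Positions where the best posterior is low are the ambiguous ones whose treatment differs between the ML and Bayesian approaches.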
Williams et al. [7] assessed the accuracy of ancestral sequence
reconstruction using the ML and the Bayesian methods. They
computed an artificial trajectory of protein evolution, which was
then used to evaluate whether the ML and Bayesian methods accurately
reconstructed ancestral sequences. They found that the ML method
can effectively predict accurate ancestral amino acids, in contrast
to the Bayesian method, which sometimes predicts less probable
ancestral amino acids. Even so, the ML method tends to incorporate
nonancestral amino acids that are found frequently in extant
sequences. Therefore, they asserted that use of the ML method for
ancestral sequence reconstruction may result in an overestimation
of the thermal stability of the reconstructed protein because the
frequently occurring amino acids at a given position are often
stabilizing [8, 9]. They also mentioned that the Bayesian method is
a more reliable guide for characterizing the thermodynamic properties
of reconstructed proteins.
Recently, Thornton and coworkers [10] also compared the
accuracy of ancestral sequence reconstructions using the ML and
Bayesian methods. They suggested that the ML method accurately
reconstructs ancestral sequences even in the face of uncertainty
associated with the phylogeny and that the Bayesian method does not
improve the accuracy of reconstructed sequences.
Importantly, it should be kept in mind that the accuracy of
ancestral sequence reconstruction is sensitive to the use of incorrect
evolutionary models and that none of the reconstruction methods
predicts perfectly accurate ancestral amino acid sequences.

20.2.3 Early Works


The ancestral sequence reconstruction technique has been applied
to several eukaryotic proteins. Benner and coworkers [11]
reconstructed the last common ancestor of alcohol dehydrogenase
1 (Adh1) and its homolog Adh2 to understand the evolution
of ethanol production/consumption in yeast. Adh1 is involved
in the reduction of acetaldehyde to produce ethanol, and Adh2
is used for the consumption of ethanol. Kinetic analysis of the
reconstructed protein suggested that the common ancestor of the
alcohol dehydrogenases was optimized to produce, rather than
consume, ethanol. Thornton and coworkers published excellent
papers in which they reported ancestral sequence reconstruction
experiments to understand the evolutionary trajectory of changes
in the substrate specificity of hormone receptors [12–14]. Thornton’s
group also applied ancestral sequence reconstruction to dissect
the evolution of the increased complexity of vacuolar H+-ATPases
[15].

The reconstruction technique has provided a new way of improving
our knowledge of the environmental temperatures experienced
by ancient organisms. Gaucher et al. [16] reconstructed ancestors
of bacterial elongation factor Tu and characterized their unfolding
temperatures. They estimated the environmental temperature
of the common ancestor of bacteria according to the
generally accepted idea that the unfolding temperature of a protein
correlates well with the optimum environmental temperature of
its host. Because the thermal stability of the reconstructed protein
was similar to that of the elongation factor Tu from an extant
thermophile, they concluded that the common ancestral organism
of extant bacteria was thermophilic. In a subsequent paper [17],
they expanded the timescale of the reconstructed targets to the elongation
factor Tu hosted by bacteria that lived 3500 to 500 million
years ago. Their analysis suggested that ancient bacteria adapted
progressively to lower-temperature environments, possibly in response
to a temperature change in the ancient oceans. Thus, the reconstruction
method is currently a common method for improving our knowledge
of the physical characteristics of ancient proteins as well as of the ancient
global environment in which early life evolved.

20.3 The Commonote

20.3.1 The Last Universal Common Ancestor, the Commonote
All organisms that currently exist on Earth are classified into
three domains of life: Archaea, Bacteria, and Eukarya [18, 19]. A
phylogenetic tree of the extant organisms suggests that all of
them are the descendants of a single ancestor, the
last universal common ancestor. Although the common ancestor
has been denoted as LUCA, LCA, or cenancestor, we herein refer
to it as the “Commonote” [20]. The Commonote has been
a target of research because its characteristics would help us
understand the environment in which our universal
ancestor lived.

20.3.2 Theoretical Studies on the Environmental Temperature of the Commonote
The growth temperature of the ancient organism has been
a particularly interesting topic. A number of theoretical studies have focused
on the environmental temperature of the Commonote. In the most frequently
referenced phylogenetic trees, which were built from small-subunit rRNA
sequences, hyperthermophilic Archaea and Bacteria are located on
the deepest and shortest branches [18, 21]. Therefore, the common
ancestors of Archaea and Bacteria have been proposed to be
hyperthermophilic [22, 23]. Given the apparent hyperthermophilic
ancestry of both the archaeal and bacterial lineages, the Commonote is
also parsimoniously inferred to have been a hyperthermophile.
However, some early theoretical studies suggested that the
Commonote was not thermophilic. Galtier et al. [24] conducted a
model analysis of rRNA sequences and estimated the G + C content
of an ancestral rRNA sequence. The ancestral G + C content provides
a means to estimate the environmental temperature of
the ancestral organism because a greater G + C content in rRNA
often reflects a higher optimum environmental temperature. The
G + C content of the common ancestor calculated by Galtier et al. was
not compatible with organisms living at high temperatures. Therefore,
they concluded that the Commonote was not a thermophile but more
likely a mesophile.
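
The quantity underlying this rRNA-based thermometer is straightforward to compute for any given sequence. The Python sketch below shows only that calculation; it is not the evolutionary model used by Galtier et al., and the function name and example sequence are hypothetical.

```python
def gc_content(rna: str) -> float:
    """Fraction of G and C among the unambiguous bases of an rRNA sequence."""
    seq = rna.upper().replace("T", "U")  # accept DNA-style input as well
    counted = [base for base in seq if base in "ACGU"]  # skip gaps and N
    if not counted:
        raise ValueError("no unambiguous bases in sequence")
    return sum(base in "GC" for base in counted) / len(counted)


# Hypothetical 10-nucleotide fragment, for illustration only.
print(gc_content("GGCGAUCCGA"))  # prints 0.7
```

Turning such a value into a temperature estimate would additionally require a calibration against extant organisms with known optimal growth temperatures, which is not attempted here.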
Boussau et al. [25] also performed computational analyses of
both the G + C content in rRNA and the amino acid composition in
protein sequences. They used both the rRNA-based and the protein-
based thermometers to assess the evolution of thermophily over a
phylogenetic tree. Their results suggested that the last common
ancestors of both Archaea and Bacteria were thermophilic, but that the
Commonote was nonthermophilic. However, it is well known that
the thermal stabilities of proteins are sometimes very sensitive
to a few amino acid substitutions. Therefore, due to the lack
of empirical testing, the conclusion obtained from their analysis
remained inferential.
Moreover, the use of a different method led to a contradictory
conclusion. Di Giulio [26] analyzed the same dataset as Galtier et al.
using a different phylogenetic algorithm and concluded that the
Commonote was most likely a hyperthermophile. In subsequent studies,
he obtained further support for the hyperthermophilic ancestry of
life [27, 28]. Brooks et al. [29] also computed the amino acid
compositions of a set of proteins that might have existed in the
Commonote. The ancestral amino acid composition calculated using an
expectation-maximization method was more similar to the composition
of the same set in extant thermophiles than to that in extant mesophiles.
Thus, conclusive evidence characterizing the environmental
temperature of the Commonote has yet to be reported. Moreover,
no definitive experimental test of this issue had been reported
prior to our work described below. Therefore, we began
to address this issue experimentally by reconstructing ancestral
protein sequences and characterizing their properties, which should
reflect the ancestor’s characteristics and its environment.

20.3.3 Reconstruction of an Ancestral Nucleoside Diphosphate Kinase
What was the growth temperature of the Commonote? Our approach
to answering this question involved phylogenetic tree building to
infer the amino acid sequences of archaeal and bacterial ancestral
nucleoside diphosphate kinases (NDKs), expressing the genes
encoding the inferred sequences, and characterizing the unfolding
temperatures of the gene products [30]. Most extant organisms
contain at least one gene that encodes an NDK, and therefore,
the Commonote is also thought to have possessed an NDK. More
importantly, the unfolding temperature of an NDK correlates well
with the environmental temperature of its host organism (Fig. 20.2
[30]). Therefore, the stability of a resurrected NDK would be directly
related to the natural environmental temperature of the ancient
organism that hosted the ancestral NDK. Thus, NDK is an ideal
target protein for resurrection.
We first prepared a multiple amino acid sequence alignment
using 204 NDK sequences from extant species. The resulting
alignment was used to build two phylogenetic trees. The first tree
was built without constraints. In a commonly used rRNA tree,
Archaea and Bacteria each form a single domain. However, in
the unconstrained NDK tree, some archaeal sequences appeared
among the bacterial sequences. We therefore built the second tree
with the constraint that archaeal and bacterial sequences were
separated into the respective monophyletic groups. From both trees,
we inferred the sequences at the deepest archaeal and bacterial
nodes. The ML method was used for tree building and predicting
ancestral sequences. The ancestral sequences predicted from the
unconstrained tree are named Arc3 (archaeal) and Bac3 (bacterial),
and those predicted from the constrained tree are named similarly
as Arc4 and Bac4.

Figure 20.2 Unfolding midpoint temperatures of extant microbial NDKs as
a function of the optimum environmental temperatures of their hosts. The
lower limit of the estimate for the environmental temperature (75 °C) of
the Commonote was found using the calibration curve and the unfolding
temperature (94 °C) of Bac4mut7. (a) Dictyostelium discoideum, (b) E. coli,
(c) Bacillus subtilis, (d) Methanothermobacter thermautotrophicus, (e) T.
thermophilus, (f) Archaeoglobus fulgidus, (g) Methanocaldococcus jannaschii,
(h) S. tokodaii, (i) Aeropyrum pernix, and (j) Pyrococcus horikoshii.

The topologies of the phylogenetic trees often depend on the
genes or proteins analyzed [31]. Indeed, even the topology of
the constrained NDK tree differs somewhat from that of the most
commonly referenced species tree built using small-subunit rRNA
sequences [19]. We therefore constructed another phylogenetic tree
from the small-subunit rRNA sequences of the species that
were used to build the NDK trees. Using the resulting tree topology
and replacing each rRNA sequence with the NDK sequence of the
corresponding species, we inferred additional ancestral NDK sequences,
Arc5 and Bac5.
The genes encoding the inferred amino acid sequences were
synthesized by a PCR-mediated gene synthesis method. The genes
were then expressed in E. coli, and the ancestral proteins were
purified. The temperature-induced unfolding experiments showed
that the reconstructed ancestral NDKs all have unfolding midpoint
temperatures above 100 °C and that the thermal stabilities of the
ancestral NDKs were robust against the topologies of the trees used to
infer the ancestral sequences (Table 20.1). The environmental
temperatures of the common ancestors of Bacteria and Archaea are
estimated to be ∼80 °C to 93 °C and ∼81 °C to 97 °C, respectively,
using the unfolding temperatures of the ancestral NDKs and the
calibration curve shown in Fig. 20.2 [30].
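The use of a calibration curve to convert an unfolding temperature into an environmental temperature can be illustrated with a straight-line fit. The sketch below is only an illustration under stated assumptions: the data points are synthetic placeholders (not the values plotted in Fig. 20.2), a linear relationship is assumed, and the names fit_line and estimate_env_temperature are hypothetical. It gives a plain point estimate and does not reproduce how the ranges or lower limits quoted in this chapter were derived in Ref. [30].

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_env_temperature(tm, slope, intercept):
    """Invert the calibration line Tm = slope * T_env + intercept."""
    return (tm - intercept) / slope


# SYNTHETIC placeholder values, NOT the data of Fig. 20.2: optimum
# environmental temperatures of hosts and NDK unfolding temperatures (deg C).
env_temps = [20.0, 40.0, 60.0, 80.0, 100.0]
unfolding_temps = [46.0, 62.0, 78.0, 94.0, 110.0]

slope, intercept = fit_line(env_temps, unfolding_temps)
# Point estimate of the environmental temperature for a protein with Tm = 102.
print(estimate_env_temperature(102.0, slope, intercept))
```

Fitting the unfolding temperature as a function of the environmental temperature and then inverting the line is one possible design choice; regressing in the opposite direction would generally give a slightly different estimate when the data are scattered.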

Table 20.1 Unfolding midpoint temperatures (Tm) of reconstructed NDKs
and the phylogenetic trees used to reconstruct the ancestral sequences

Ancestral NDK   Node for the ancestral sequence   Tm (°C)   Tree used to infer the ancestral sequence
Arc3            Last archaeal ancestor            112       Unconstrained tree
Arc4            Last archaeal ancestor            109       Constrained tree
Arc5            Last archaeal ancestor            108       rRNA tree
Bac3            Last bacterial ancestor           109       Unconstrained tree
Bac4            Last bacterial ancestor           102       Constrained tree
Bac5            Last bacterial ancestor           107       rRNA tree
Bac4mut4-N      Commonote                         97        –
Bac4mut7        Commonote                         94        –

Source: Ref. [30].



20.3.4 Estimation of the Environmental Temperature of the Commonote
To determine the position of the root on a phylogenetic tree, a
composite tree should be constructed from two or more paralogous
proteins that might have been duplicated from their common ancestral
protein before the age of the Commonote [31]. However,
there is no good paralogue of NDK. Therefore, the phylogenetic
trees used to infer the ancestral NDK sequences are not rooted and
we cannot infer the precise sequence of the Commonote’s NDK.
Nevertheless, we expect that the node for the Commonote would
be located between the roots of Bacteria and Archaea. In addition,
when the sequences of the six reconstructed NDKs (Arc3/4/5
and Bac3/4/5) were compared, identical amino acids were found at 115 of
139 positions. These amino acids are, therefore, likely to exist at the
same positions in the NDK sequence of the Commonote. For the 24
remaining positions, the NDK sequence of the Commonote would
likely have residues present in at least one of the reconstructed NDK
sequences because the Commonote’s sequence is parsimoniously
expected to have had residues contained in either the last common
bacterial or the archaeal ancestor’s sequence. According to this
idea, 5.4 × 10^8 combinations are possible for the residues at the
24 positions of the Commonote’s NDK. We therefore sought
the sequence with the lowest stability among all the possible
NDK sequences of the Commonote.
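The counting argument above (invariant positions versus positions where the reconstructed sequences disagree) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the toy sequences are hypothetical stand-ins for the six reconstructed NDKs, the function names are invented for this example, and the number of combinations is taken simply as the product of the number of distinct residues observed at each position.

```python
from math import prod


def position_residue_sets(aligned_seqs):
    """For each alignment column, collect the set of residues observed."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    return [set(column) for column in zip(*aligned_seqs)]


def count_candidates(aligned_seqs):
    """Return (number of invariant positions, number of residue combinations)."""
    sets = position_residue_sets(aligned_seqs)
    invariant = sum(1 for residues in sets if len(residues) == 1)
    # Each position may take any residue seen there in at least one sequence.
    combinations = prod(len(residues) for residues in sets)
    return invariant, combinations


# Toy stand-ins for reconstructed sequences (hypothetical, 8 residues each).
toy = ["MKVLAGDE", "MKVLSGDE", "MRVLAGDQ"]
print(count_candidates(toy))
```

As a point of arithmetic, 29 independent two-way choices would give 2^29 ≈ 5.4 × 10^8 combinations; whether the figure quoted above was obtained in this way is an assumption and is not stated in the text.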
Bac4 shows the lowest unfolding temperature among the
reconstructed ancestral NDKs. We therefore individually replaced
the residues at the 24 positions of Bac4 with residues found at the
same positions in the other ancestral NDKs and thus created 29 NDK
mutants. Further, we simultaneously introduced four amino acid
substitutions that reduced the unfolding temperature of Bac4 and
thus generated Bac4mut4-N. Because an amino acid substitution
that improves the stability when individually introduced into Bac4
may show a destabilizing effect in combination with other mutations,
we also tested the effect on the stability of Bac4mut4-N of
combinations of amino acid substitutions that are located within
5 Å of one another. In addition, we mutated Bac4mut4-N by
introducing all destabilizing combinations of mutations. The unfolding
temperature of the resulting Bac4mut7 probably represents the
lowest estimate of the unfolding temperature for the Commonote’s
NDK. According to the unfolding temperature of Bac4mut7 (94 °C;
Table 20.1) and the calibration curve shown in Fig. 20.2, the
lowest estimate for the Commonote’s environmental temperature
is 75 °C. Thus, our ancestral sequence reconstruction followed by
the characterization of the thermal stabilities of the reconstructed
proteins strongly supported the idea that the Commonote was likely
a (hyper)thermophilic organism that flourished at a temperature
above 75 °C [30].
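Selecting substitution sites that lie within 5 Å of one another amounts to a pairwise distance filter over structural coordinates. The following sketch shows one way this could be done; the coordinates and site numbers are hypothetical, the function name pairs_within is invented here, and the choice of a single representative point per site is an assumption, since the text does not state between which atoms the distances were measured.

```python
from itertools import combinations
from math import dist


def pairs_within(coords, cutoff=5.0):
    """Return pairs of site labels whose coordinates lie within `cutoff` angstroms."""
    close = []
    for (site_a, xyz_a), (site_b, xyz_b) in combinations(sorted(coords.items()), 2):
        if dist(xyz_a, xyz_b) <= cutoff:
            close.append((site_a, site_b))
    return close


# Hypothetical coordinates (angstroms) for four substitution sites.
sites = {
    12: (0.0, 0.0, 0.0),
    15: (3.0, 3.0, 0.0),   # about 4.2 angstroms from site 12
    40: (20.0, 0.0, 0.0),
    43: (22.0, 1.0, 2.0),  # 3.0 angstroms from site 40
}
print(pairs_within(sites))
```

Only the pairs returned by such a filter would then need to be tested in combination, which keeps the number of mutants far below the number of all possible pairs.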

20.4 Application to Designing Thermally Stable Proteins

20.4.1 Design of Thermally Stable Proteins


Enzymes show extreme catalytic power and strict substrate and
reaction specificities, which are attractive features for
biotechnological applications. However, a crucial limitation
often comes from their instability. Therefore, the establishment
of a generic means of improving enzyme stability has long
been desired. Nevertheless, designing thermally stable proteins is
still a challenging task. A conventional method based on rational
design requires a high-resolution three-dimensional structure
for the design of mutations to enhance the thermal stability of
an enzyme [32–34]. Many different types of factors have been
proposed to affect the stability of an enzyme [35–45], but we
do not completely understand the sequence–structure–stability
relationship. In addition, the extent to which each factor contributes
to the stability is usually small and depends on the structural
context. Therefore, the same rational design strategy does not
always improve stabilities of different types of enzymes.
Directed evolution is an alternative method to produce mutant
enzymes with improved thermal stability [46]. For this technique,
detailed knowledge of the sequence–structure–stability relation-
ship, as well as the structure of the targeted enzyme, is not required
[47]. A number of proteins and enzymes have been stabilized
using this technique [48–57]. However, a prerequisite for directed
evolution experiments is the existence of an appropriate screening
or selection system, which is often unavailable. Moreover, selecting
stable enzymes using directed evolution techniques often requires
much time and labor.
The reconstruction technique described above also provides
a new method for designing thermally stable proteins. Given
the (hyper)thermophilic ancestry, one can expect that ancestral
residues contribute to the thermal stability of a protein to a
much greater extent than do nonancestral residues. Therefore, a
mutant enzyme in which one or a small number of nonancestral,
original residues are replaced with inferred ancestral amino acids
may tend to show enhanced thermal stability
compared to the wild-type enzyme. The method using the ancestral
sequence reconstruction also does not require any knowledge of the
structure of an enzyme under study and relies only on the amino
acid sequences of homologous enzymes, which are, in most cases,
available in the growing public databases. The recent expansion of
genome projects of Bacteria and Archaea will provide the resource
necessary for the construction of a phylogenetic tree that is essential
for inferring ancestral sequences. Thus, reconstruction of ancient
sequences may serve as a generic method for creating thermally
stable proteins.
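Once an ancestral sequence has been inferred, candidate stabilizing mutations are simply the positions at which the wild-type enzyme differs from it. The sketch below lists such substitutions from two rows of an alignment; the sequences are hypothetical fragments, the function name ancestral_substitutions is invented for this example, and the numbering convention (wild-type residues only, gaps skipped) is an assumption.

```python
def ancestral_substitutions(wild_type, ancestral):
    """List substitutions (e.g. 'A5S') that replace wild-type residues with ancestral ones.

    Both inputs are rows of the same alignment; '-' marks a gap. Numbering
    follows the wild-type sequence, counting only its non-gap residues.
    """
    if len(wild_type) != len(ancestral):
        raise ValueError("sequences must come from the same alignment")
    substitutions = []
    wt_position = 0
    for wt_res, anc_res in zip(wild_type, ancestral):
        if wt_res == "-":
            continue  # no wild-type residue to replace in this column
        wt_position += 1
        if anc_res != "-" and anc_res != wt_res:
            substitutions.append(f"{wt_res}{wt_position}{anc_res}")
    return substitutions


# Hypothetical aligned fragments, for illustration only.
wild_type = "MKV-LAGDE"
ancestral = "MRVQLSGDE"
print(ancestral_substitutions(wild_type, ancestral))
```

Each listed substitution, or a small combination of them, would correspond to one mutant enzyme to be produced and tested; insertions and deletions are ignored in this sketch.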

20.4.2 Case Studies to Create Thermally Stable Enzymes by Introducing Ancestral Residues as Amino Acid Substitutions
Table 20.2 lists studies conducted in the authors’ laboratory
to validate ancestral sequence reconstruction as a means of creating
thermally stable enzymes. The validity of introducing the inferred
ancestral amino acids into a wild-type enzyme for improving
the thermal stability of an enzyme was first evaluated using
a leucine biosynthetic enzyme, 3-isopropylmalate dehydrogenase
(IPMDH). Miyazaki et al. [58] computed an ancestral amino acid
sequence of IPMDH, which might have been hosted by the Commonote.
They introduced inferred ancestral residues individually into the
intrinsically thermally stable IPMDH from the hyperthermophilic
archaeon, Sulfolobus tokodaii. The thermal stabilities of the resulting
mutant IPMDHs were then assessed by measuring the remaining
activity after heat treatment and the change in ellipticity at 222 nm
upon thermal unfolding. As a result, five of the seven mutants were
more thermally stable than was the wild-type IPMDH.

Table 20.2 Ancestral sequence introduction or reconstruction to produce
thermally stable enzymes conducted in the authors’ laboratory

Reference             Target enzyme             Study design and observations
Miyazaki et al. [58]  S. tokodaii IPMDH         Five of the seven ancestral mutants showed thermal
                                                stability higher than that of the wild-type enzyme.
Watanabe et al. [59]  T. thermophilus IPMDH     Six of the twelve ancestral mutants showed thermal
                                                stability higher than that of the wild-type enzyme.
Iwabata et al. [60]   C. noboribetus ICDH       Four of the five mutants showed thermal stability
                                                higher than that of the wild-type ICDH.
Shimizu et al. [61]   T. thermophilus GlyRS     Six of the eight mutants were more thermally stable
                                                than the wild-type enzyme. Seven mutants showed
                                                higher activity than the wild-type at 70 °C.
Akanuma et al. [62]   B subunit of DNA gyrase   The full-length ancestral B subunit of DNA gyrase is
                                                slightly less thermally stable than the B subunit of
                                                T. thermophilus DNA gyrase.
Akanuma et al. [62]   DNA gyrase ATPase domain  The ancestral ATPase domain is as thermally stable as
                                                the T. thermophilus ATPase domain. The ancestral
                                                ATPase domain hydrolyzed ATP 4.7-fold more rapidly
                                                than the thermophilic ATPase domain when the protein
                                                concentration was 30 μM.

A homologous enzyme from the extremely thermophilic bacterium,
Thermus thermophilus, was also used to test the
validity of introducing ancestral amino acids. Watanabe et al. [59]
produced and characterized 12 mutants, in each of which one or
a small number of the original residue(s) were replaced by the
ancestral amino acid(s). They then compared the thermal stabilities
of the mutants with that of the wild-type IPMDH from T. thermophilus and
found that 6 of the 12 ancestral mutants showed enhanced thermal
stability. A similar trend was also observed in the study using
isocitrate dehydrogenase (ICDH) from the extremely thermophilic
archaeon, Caldococcus noboribetus [60]. In that study, five ancestral
mutants, each containing an ancestral amino acid residue, were
produced. Of them, four mutants were more thermally stable than
was wild-type ICDH.
A similar method was used to improve the thermal stability of
an enzyme involved in the translation system of T. thermophilus.
Shimizu et al. [61] inferred ancestral amino acid sequences of
α2-type glycyl-tRNA synthetase (GlyRS). They then introduced the
inferred ancestral amino acids, individually or in combination, into GlyRS
from T. thermophilus and evaluated the thermal stabilities of the
resulting eight mutants. They found that six mutants were
more stable than was wild-type GlyRS. Thus, these experimental
studies clearly showed that the thermal stabilities of enzymes from
thermophiles and even from hyperthermophiles could be further
improved by introducing ancestral amino acid residues found in
inferred ancestral amino acid sequences.

20.4.3 Reconstruction of Thermally Stable, Ancestral DNA Gyrase Using a Small Set of Homologous Amino Acid Sequences
Generally, use of a large dataset requires much time and labor.
Therefore, we tested if an ancestral sequence reconstructed by
analyzing a small set of homologous amino acid sequences could
fold into a thermally stable and catalytically active enzyme [62]. We
targeted the B subunit of DNA gyrase, which is involved in DNA
replication. We inferred the amino acid sequence corresponding
to the common ancestor of bacterial DNA gyrases by comparing
the sequences from 16 extant bacteria. We reconstructed the genes
encoding the whole inferred sequence and its ATPase domain,
which were then expressed in E. coli. The structural properties of
the full-length reconstructed B subunit were similar to those of
the corresponding thermophilic T. thermophilus enzyme although
the reconstructed B subunit was slightly less stable than was the
B subunit from T. thermophilus DNA gyrase. The reconstructed
ATPase domain was as thermally stable as the corresponding
domain from T. thermophilus. Moreover, the reconstructed ATPase
domain showed significant catalytic activity. Thus, a small number of
homologous sequences are sufficient to create a thermally stable
and functional enzyme.
As described above, the accuracy of a predicted ancestral
sequence depends on sequence sampling as well as on the use of
a correct evolutionary model. However, the 16 sequences used to
infer the ancestral sequence of the B subunit of DNA gyrase seemed
not to accurately represent the sequence space encompassed by
DNA gyrases. Therefore, the reconstructed B subunit of DNA gyrase
is unlikely to reflect the true ancestral sequence. If so, why did ancestral
sequence reconstruction create a thermally stable enzyme? A
possible interpretation is that the high thermal stability of the
reconstructed gyrase is related to an inherent feature of
ancestral sequence reconstruction, specifically the presence
of frequently occurring amino acids at many positions. It has
been proposed that the most frequently occurring amino acid at
any position among homologous sequences contributes
to protein stability to a greater extent than any rarely
occurring amino acid at the same position [8, 9]. To clarify the reason
why the reconstructed ancestral enzyme was thermally stable, we
also created the consensus ATPase domain of DNA gyrase that
contained the most frequent amino acid at each position in the 16
sequences that had been used to reconstruct the ancestral sequence.
The consensus ATPase domain was less thermally stable than was
the ancestral ATPase domain. Therefore, the high thermal stability
of the ancestral enzyme can likely be ascribed in part to the antiquity
of some of the inferred ancestral residues. Thus, ancestral sequence
reconstruction, even using a small set of homologous amino acid
sequences, provides a convenient and generic means to design
thermally stable enzymes.
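The consensus sequence used as a control above takes the most frequent amino acid at each alignment position. A minimal sketch of that rule follows; the alignment is hypothetical, the function name consensus_sequence is invented for this example, and the handling of gaps and ties (gaps ignored, ties broken alphabetically) is an assumption that the text does not specify.

```python
from collections import Counter


def consensus_sequence(aligned_seqs, gap="-"):
    """Return the most frequent non-gap residue at each alignment column."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*aligned_seqs):
        counts = Counter(res for res in column if res != gap)
        if not counts:
            consensus.append(gap)  # column contains only gaps
            continue
        # Ties are broken alphabetically so the result is deterministic.
        best = min(counts, key=lambda res: (-counts[res], res))
        consensus.append(best)
    return "".join(consensus)


# Hypothetical alignment, for illustration only.
alignment = ["MKVLAG", "MRVLSG", "MKV-AG", "MKILAG"]
print(consensus_sequence(alignment))
```

Unlike ancestral reconstruction, this rule uses no phylogenetic tree or evolutionary model, so it weights every sequence equally regardless of how closely related the sequences are.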

20.5 Conclusion

Elucidation of early life and the environment in which early life
evolved is important for our comprehensive understanding of
the history of life on earth. In this chapter, we first described
the principle and procedure of ancestral sequence reconstruction,
which have been used for investigating ancient proteins and life.
Then, we introduced our recent study to experimentally address
the question, What was the growth temperature of the Commonote?
The approach used to answer this question involved phylogenetic
tree building to infer the amino acid sequences of archaeal and
bacterial ancestral NDKs, expressing the genes encoding the inferred
sequences, and characterizing the unfolding temperatures of the
gene products. The midpoint unfolding temperatures of the bacterial
and archaeal ancestors were quite similar to those of modern
(hyper)thermophilic proteins. Because the amino acids conserved
among the ancestral sequences probably include those of the
Commonote and because the nonconserved amino acids did not
substantially affect the thermal stability, the Commonote probably
possessed a thermostable NDK, that is, it was a (hyper)thermophile.
Given the thermophilicity of ancestral life, the ancestral reconstruction
technique will provide a reliable means of creating thermally
stable enzymes that relies only on the sequences of homologous
enzymes.

Acknowledgments

The work described here was supported in part by Grants-in-Aid
from the Japan Society for the Promotion of Science (JSPS).

References

1. Messier, W., and Stewart, C. B. (1997). Episodic adaptive evolution of
primate lysozymes, Nature, 385, pp. 151–154.
2. Bielawski, J. P., and Yang, Z. (2003). Maximum likelihood methods for
detecting adaptive evolution after gene duplication, J. Struct. Funct.
Genomics, 3, pp. 201–212.
3. Thornton, J. W. (2004). Resurrecting ancient genes: experimental
analysis of extinct molecules, Nat. Rev. Genet., 5, pp. 366–375.
4. Yang, Z., Kumar, S., and Nei, M. (1995). A new method of inference of
ancestral nucleotide and amino acid sequences, Genetics, 141, pp. 1641–
1650.
5. Yang, Z., and Rannala, B. (1997). Bayesian phylogenetic inference using
DNA sequences: a Markov chain Monte Carlo method, Mol. Biol. Evol., 14,
pp. 717–724.
6. Gaucher, E. A., Kratzer, J. T., and Randall, R. N. (2010). Deep phylogeny:
how a tree can help characterize early life on Earth, Cold Spring Harb.
Perspect. Biol., 2, p. a002238.
7. Williams, P. D., Pollock, D. D., Blackburne, B. P., and Goldstein, R. A.
(2006). Assessing the accuracy of ancestral protein reconstruction
methods, PLOS Comput. Biol., 2, p. e69.
8. Steipe, B., Schiller, B., Pluckthun, A., and Steinbacher, S. (1994). Sequence
statistics reliably predict stabilizing mutations in a protein domain, J.
Mol. Biol., 240, pp. 188–192.
9. Steipe, B. (2004). Consensus-based engineering of protein stability:
from intrabodies to thermostable enzymes, Methods Enzymol., 388, pp.
176–186.
10. Hanson-Smith, V., Kolaczkowski, B., and Thornton, J. W. (2010). Robust-
ness of ancestral sequence reconstruction to phylogenetic uncertainty,
Mol. Biol. Evol., 27, pp. 1988–1999.
11. Thomson, J. M., Gaucher, E. A., Burgan, M. F., De Kee, D. W., Li, T., Aris, J. P.,
and Benner, S. A. (2005). Resurrecting ancestral alcohol dehydrogenases
from yeast, Nat. Genet., 37, pp. 630–635.
12. Bridgham, J. T., Carroll, S. M., and Thornton, J. W. (2006). Evolution of
hormone-receptor complexity by molecular exploitation, Science, 312,
pp. 97–101.
13. Ortlund, E. A., Bridgham, J. T., Redinbo, M. R., and Thornton, J. W. (2007).
Crystal structure of an ancient protein: evolution by conformational
epistasis, Science, 317, pp. 1544–1548.
14. Bridgham, J. T., Ortlund, E. A., and Thornton, J. W. (2009). An epistatic
ratchet constrains the direction of glucocorticoid receptor evolution,
Nature, 461, pp. 515–519.
15. Finnigan, G. C., Hanson-Smith, V., Stevens, T. H., and Thornton, J. W.
(2012). Evolution of increased complexity in a molecular machine,
Nature, 481, pp. 360–364.
16. Gaucher, E. A., Thomson, J. M., Burgan, M. F., and Benner, S. A. (2003).
Inferring the palaeoenvironment of ancient bacteria on the basis of
resurrected proteins, Nature, 425, pp. 285–288.
17. Gaucher, E. A., Govindarajan, S., and Ganesh, O. K. (2008). Palaeotem-
perature trend for Precambrian life inferred from resurrected proteins,
Nature, 451, pp. 704–707.
18. Woese, C. R. (1987). Bacterial evolution, Microbiol. Rev., 51, pp. 221–271.
19. Woese, C. R., Winker, S., and Gutell, R. R. (1990). Architecture of
ribosomal RNA: constraints on the sequence of “tetra-loops”, Proc. Natl.
Acad. Sci. U S A, 87, pp. 8467–8471.
20. Yamagishi, A., Kon, T., Takahashi, G., and Oshima, T. (1998). From the
common ancestor of all living organisms to protoeukaryotic cell. In
Thermophiles: The Keys to Molecular Evolution and the Origin of Life?
Wiegel, J., and Adams, M. W. W., eds. (Taylor & Francis, UK), pp. 287–295.
21. Achenbach-Richter, L., Gupta, R., Zillig, W., and Woese, C. R. (1988).
Rooting the archaebacterial tree: the pivotal role of Thermococcus celer
in archaebacterial evolution, Syst. Appl. Microbiol., 10, pp. 231–240.
22. Pace, N. R. (1991). Origin of life: facing up to the physical setting, Cell,
65, pp. 531–533.
23. Stetter, K. O. (1996). Hyperthermophilic procaryotes, FEMS Microbiol.
Rev., 18, pp. 149–158.
24. Galtier, N., Tourasse, N., and Gouy, M. (1999). A nonhyperthermophilic
common ancestor to extant life forms, Science, 283, pp. 220–221.
25. Boussau, B., Blanquart, S., Necsulea, A., Lartillot, N., and Gouy, M. (2008).
Parallel adaptations to high temperatures in the Archaean eon, Nature,
456, pp. 942–945.
26. Di Giulio, M. (2000). The universal ancestor lived in a thermophilic or
hyperthermophilic environment, J. Theor. Biol., 203, pp. 203–213.
27. Di Giulio, M. (2003). The universal ancestor and the ancestor of bacteria
were hyperthermophiles, J. Mol. Evol., 57, pp. 721–730.
28. Di Giulio, M. (2003). The universal ancestor was a thermophile or a
hyperthermophile: tests and further evidence, J. Theor. Biol., 221, pp.
425–436.
29. Brooks, D. J., Fresco, J. R., and Singh, M. (2004). A novel method for
estimating ancestral amino acid composition and its application to
proteins of the Last Universal Ancestor, Bioinformatics, 20, pp. 2251–
2257.
30. Akanuma, S., Nakajima, Y., Yokobori, S., Kimura, M., Nemoto, N., Mase,
T., Miyazono, K., Tanokura, M., and Yamagishi, A. (2013). Experimental
evidence for the thermophilicity of ancestral life, Proc. Natl. Acad. Sci. U
S A, 110, pp. 11067–11072.
31. Akanuma, S., Yokobori, S., and Yamagishi, A. (2013). Comparative
genomics of thermophilic bacteria and archaea. In Thermophilic
Microbes in Environmental and Industrial Biotechnology, Satyanarayana,
T., Littlechild, J., and Kawarabayasi, Y., eds. (Springer Science, Berlin), pp.
331–349.
32. Ivens, A., Mayans, O., Szadkowski, H., Jurgens, C., Wilmanns, M., and
Kirschner, K. (2002). Stabilization of a (βα)8-barrel protein by an
engineered disulfide bridge, Eur. J. Biochem., 269, pp. 1145–1153.
33. Kirino, H., Aoki, M., Aoshima, M., Hayashi, Y., Ohba, M., Yamagishi, A.,
Wakagi, T., and Oshima, T. (1994). Hydrophobic interaction at the sub-
unit interface contributes to the thermostability of 3-isopropylmalate
dehydrogenase from an extreme thermophile, Thermus thermophilus,
Eur. J. Biochem., 220, pp. 275–281.
34. Ulmer, K. M. (1983). Protein engineering, Science, 219, pp. 666–671.
35. Haney, P. J., Badger, J. H., Buldak, G. L., Reich, C. I., Woese, C. R., and Olsen,
G. J. (1999). Thermal adaptation analyzed by comparison of protein
sequences from mesophilic and extremely thermophilic Methanococcus
species, Proc. Natl. Acad. Sci. U S A, 96, pp. 3578–3583.
36. Facchiano, A. M., Colonna, G., and Ragone, R. (1998). Helix stabilizing
factors and stabilization of thermophilic proteins: an X-ray based study,
Protein Eng., 11, pp. 753–760.
37. Petukhov, M., Kil, Y., Kuramitsu, S., and Lanzov, V. (1997). Insights into
thermal resistance of proteins from the intrinsic stability of their
alpha-helices, Proteins, 29, pp. 309–320.
38. Dong, H., Mukaiyama, A., Tadokoro, T., Koga, Y., Takano, K., and
Kanaya, S. (2008). Hydrophobic effect on the stability and folding of a
hyperthermophilic protein, J. Mol. Biol., 378, pp. 264–272.
39. Clark, A. T., McCrary, B. S., Edmondson, S. P., and Shriver, J. W.
(2004). Thermodynamics of core hydrophobicity and packing in the
hyperthermophile proteins Sac7d and Sso7d, Biochemistry, 43, pp.
2840–2853.
40. Christodoulou, E., Rypniewski, W. R., and Vorgias, C. R. (2003). High-
resolution X-ray structure of the DNA-binding protein HU from the
hyper-thermophilic Thermotoga maritima and the determinants of its
thermostability, Extremophiles, 7, pp. 111–122.
41. Tanaka, T., Sawano, M., Ogasahara, K., Sakaguchi, Y., Bagautdinov,
B., Katoh, E., Kuroishi, C., Shinkai, A., Yokoyama, S., and Yutani, K.
(2006). Hyper-thermostability of CutA1 protein, with a denaturation
temperature of nearly 150 °C, FEBS Lett., 580, pp. 4224–4230.
42. Cheung, Y. Y., Lam, S. Y., Chu, W. K., Allen, M. D., Bycroft, M., and
Wong, K. B. (2005). Crystal structure of a hyperthermophilic archaeal
acylphosphatase from Pyrococcus horikoshii: structural insights into
enzymatic catalysis, thermostability, and dimerization, Biochemistry,
44, pp. 4601–4611.
43. Arnott, M. A., Michael, R. A., Thompson, C. R., Hough, D. W., and Danson,
M. J. (2000). Thermostability and thermoactivity of citrate synthases
from the thermophilic and hyperthermophilic archaea, Thermoplasma
acidophilum and Pyrococcus furiosus, J. Mol. Biol., 304, pp. 657–668.
44. Dalhus, B., Saarinen, M., Sauer, U. H., Eklund, P., Johansson, K., Karlsson,
A., Ramaswamy, S., Bjork, A., Synstad, B., Naterstad, K., Sirevag, R., and
Eklund, H. (2002). Structural basis for thermophilic protein stability:
structures of thermophilic and mesophilic malate dehydrogenases, J.
Mol. Biol., 318, pp. 707–721.
45. Matsumura, M., Becktel, W. J., Levitt, M., and Matthews, B. W. (1989).
Stabilization of phage T4 lysozyme by engineered disulfide bonds, Proc.
Natl. Acad. Sci. U S A, 86, pp. 6562–6566.
46. Arnold, F. H., and Volkov, A. A. (1999). Directed evolution of biocatalysts,
Curr. Opin. Chem. Biol., 3, pp. 54–59.
47. Arnold, F. H., Wintrode, P. L., Miyazaki, K., and Gershenson, A. (2001).
How enzymes adapt: lessons from directed evolution, Trends Biochem.
Sci., 26, pp. 100–106.
48. Tamakoshi, M., Nakano, Y., Kakizawa, S., Yamagishi, A., and Oshima,
T. (2001). Selection of stabilized 3-isopropylmalate dehydrogenase of
Saccharomyces cerevisiae using the host-vector system of an extreme
thermophile, Thermus thermophilus, Extremophiles, 5, pp. 17–22.
49. Akanuma, S., Yamagishi, A., Tanaka, N., and Oshima, T. (1998). Serial
increase in the thermal stability of 3-isopropylmalate dehydrogenase
from Bacillus subtilis by experimental evolution, Protein Sci., 7, pp. 698–
705.
50. Tamakoshi, M., Yamagishi, A., and Oshima, T. (1995). Screening of
stable proteins in an extreme thermophile, Thermus thermophilus, Mol.
Microbiol., 16, pp. 1031–1036.
51. Kotsuka, T., Akanuma, S., Tomuro, M., Yamagishi, A., and Oshima, T.
(1996). Further stabilization of 3-isopropylmalate dehydrogenase of an
extreme thermophile, Thermus thermophilus, by a suppressor mutation
method, J. Bacteriol., 178, pp. 723–727.
52. Sugimoto, N., Takakura, Y., Shiraki, K., Honda, S., Takaya, N., Hoshino, T.,
and Nakamura, A. (2013). Directed evolution for thermostabilization of
a hygromycin B phosphotransferase from Streptomyces hygroscopicus,
Biosci. Biotechnol. Biochem., 77, pp. 2234–2241.


53. Schwab, T., and Sterner, R. (2011). Stabilization of a metabolic enzyme
by library selection in Thermus thermophilus, ChemBioChem., 12, pp.
1581–1588.
54. Morawski, B., Quan, S., and Arnold, F. H. (2001). Functional expression
and stabilization of horseradish peroxidase by directed evolution in
Saccharomyces cerevisiae, Biotechnol. Bioeng., 76, pp. 99–107.
55. Cherry, J. R., Lamsa, M. H., Schneider, P., Vind, J., Svendsen, A., Jones, A.,
and Pedersen, A. H. (1999). Directed evolution of a fungal peroxidase,
Nat. Biotechnol., 17, pp. 379–384.
56. Hoseki, J., Yano, T., Koyama, Y., Kuramitsu, S., and Kagamiyama, H.
(1999). Directed evolution of thermostable kanamycin-resistance gene:
a convenient selection marker for Thermus thermophilus, J. Biochem.,
126, pp. 951–956.
57. Liao, H., McKenzie, T., and Hageman, R. (1986). Isolation of a
thermostable enzyme variant by cloning and selection in a thermophile,
Proc. Natl. Acad. Sci. U S A, 83, pp. 576–580.
58. Miyazaki, J., Nakaya, S., Suzuki, T., Tamakoshi, M., Oshima, T., and
Yamagishi, A. (2001). Ancestral residues stabilizing 3-isopropylmalate
dehydrogenase of an extreme thermophile: experimental evidence
supporting the thermophilic common ancestor hypothesis, J. Biochem.,
129, pp. 777–782.
59. Watanabe, K., Ohkuri, T., Yokobori, S., and Yamagishi, A. (2006). De-
signing thermostable proteins: ancestral mutants of 3-isopropylmalate
dehydrogenase designed by using a phylogenetic tree, J. Mol. Biol., 355,
pp. 664–674.
60. Iwabata, H., Watanabe, K., Ohkuri, T., Yokobori, S., and Yamagishi, A.
(2005). Thermostability of ancestral mutants of Caldococcus nobori-
betus isocitrate dehydrogenase, FEMS Microbiol. Lett., 243, pp. 393–398.
61. Shimizu, H., Yokobori, S., Ohkuri, T., Yokogawa, T., Nishikawa, K., and
Yamagishi, A. (2007). Extremely thermophilic translation system in
the common ancestor Commonote: ancestral mutants of Glycyl-tRNA
synthetase from the extreme thermophile Thermus thermophilus, J. Mol.
Biol., 369, pp. 1060–1069.
62. Akanuma, S., Iwami, S., Yokoi, T., Nakamura, N., Watanabe, H., Yokobori,
S., and Yamagishi, A. (2011). Phylogeny-based design of a B-subunit of
DNA gyrase and its ATPase domain using a small set of homologous
amino acid sequences, J. Mol. Biol., 412, pp. 212–225.
March 21, 2016 13:50 PSP Book - 9in x 6in 21-Allan-Svendsen-c21

PART IV

ENZYME SCREENING AND ANALYSIS


Chapter 21

High-Throughput Screening or Selection Methods for Evolutionary Enzyme
Engineering

Shuobo Shi,a,1 Hongfang Zhang,a,1 Ee Lui Ang, and Huimin Zhaoa,b
a Metabolic Engineering Research Laboratory, Science and Engineering Institutes,
Agency for Science, Technology, and Research, Singapore
b Department of Chemical and Biomolecular Engineering,
University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
zhao5@illinois.edu

Enzymes have become attractive candidates for the production of
industrial chemicals and pharmaceutical intermediates compared
to conventional chemical catalysts. However, the properties of
naturally existing enzymes are not always satisfactory for specific
functions. Tailoring and evolving enzyme characteristics by
evolutionary enzyme engineering techniques is a powerful tool
to create novel biocatalysts, but isolating variants with desired
properties from a large pool of randomly generated mutants still
remains a bottleneck. Consequently, new selection and screening
methodologies have been constantly developed to accomplish the
goals of evolutionary enzyme engineering. This review will focus
on the recent development of novel high-throughput and
ultrahigh-throughput selection and screening strategies to isolate
enzyme variants from a large library of mutants.

1 These authors contributed equally to this work.

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

21.1 Introduction

Enzymes are useful catalysts for the environmentally friendly synthesis
of valuable industrial chemicals and important pharmaceutical
intermediates [1–4]. Compared to conventional chemical catalysts,
enzymes can be employed effectively under energy-efficient
conditions, such as relatively mild temperature and pH values with
water as a solvent. In addition, enzymes are attractive alternative
catalysts because of their extraordinary capability to accept a wide
range of complex molecules as substrates and their remarkable
catalytic specificity and selectivity, with unparalleled regio- and
stereochemical control [5, 6]. Thus, enzymes can eliminate the tedious
blocking and deblocking steps that are common in enantio- and
regioselective organic synthesis, and they usually generate few
by-products. However, naturally occurring enzymes are generally far
from being optimized for practical applications. As a consequence,
protein engineering technologies have been developed to improve
enzyme characteristics or to create new enzyme variants with better
performance for particular catalytic reactions [7–11].
Protein engineering can be accomplished through three experimental
routes: rational design, directed evolution, and a hybrid
approach combining rational design and directed evolution [12, 13].
Rational design refers to structure-based design using site-directed
mutagenesis or computational design [14, 15]. This approach does
not require a high-throughput screening or selection method;
however, it is greatly hindered in practice by limited knowledge of
protein structure, mechanism, and substrate-binding mode. In contrast,
directed evolution is based upon an iterative Darwinian evolution
process and consists of two major steps. In the first step, large,
diverse mutant libraries are created by randomly mutating the genes
encoding the targeted enzymes using a variety of in vivo and in vitro
methods [16–18]. Generating an unbiased mutation spectrum is the
most important criterion for the construction of random mutagenesis
libraries [19]. In the second step, the properties of the enzyme
variants in the libraries are evaluated by an effective and efficient
selection or screening method to isolate mutants with enhanced
performance, and the key positions crucial for the improved
characteristics are identified. The library size to be explored is,
to a certain extent, influenced by how well the mutagenesis is
rationalized [20]. The isolated variants with improved properties
serve as templates for the next round of directed evolution
(Fig. 21.1). Directed evolution techniques have been successfully
applied to optimize the properties of an impressive number of
enzymes [21–23]. The combined rational design and directed
evolution approach is a semirational approach that shifts to small,
focused libraries created by site saturation mutagenesis, which
increases the chance of influencing catalytic properties [24]. Because
the library size is dramatically reduced, a high-throughput screening
or selection method is advantageous but not essential; however, the
combined approach also requires sufficient knowledge of a protein’s
structure, function, and mechanism [24, 25].

Figure 21.1 Flowchart of a directed evolution experiment.
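The iterative cycle of directed evolution described above (diversify, evaluate, keep the best variant as the next template) can be illustrated with a toy simulation. The sketch below is only a minimal illustration under stated assumptions: the "assay" is a hypothetical fitness function that counts matches to an arbitrary target sequence, and the function names, mutation rate, and library size are invented for this example rather than taken from any cited study.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def fitness(seq, target):
    """Toy assay (assumption): fraction of positions matching a hypothetical optimum."""
    return sum(a == b for a, b in zip(seq, target)) / len(target)

def mutate(seq, rate, rng):
    """Random mutagenesis: each position is redrawn with probability `rate`."""
    return "".join(rng.choice(AMINO_ACIDS) if rng.random() < rate else aa
                   for aa in seq)

def directed_evolution(parent, target, rounds=10, library_size=500,
                       rate=0.05, seed=0):
    """Repeat: build a library from the parent, evaluate it, keep the best variant."""
    rng = random.Random(seed)
    history = [fitness(parent, target)]
    for _ in range(rounds):
        library = [mutate(parent, rate, rng) for _ in range(library_size)]
        library.append(parent)  # the template competes too, so fitness never drops
        parent = max(library, key=lambda s: fitness(s, target))
        history.append(fitness(parent, target))
    return parent, history

if __name__ == "__main__":
    best, trace = directed_evolution("A" * 20, "MKTAYIAKQRQISFVKSHFS")
    print(best, [round(f, 2) for f in trace])
```

Because the parent is carried into every library, the recorded fitness is non-decreasing from round to round; real campaigns have noisy assays and no such guarantee.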
The bottleneck of a successful directed evolution endeavor is
the availability of a selection or screening procedure capable of
rapidly identifying one or more positive candidates from a large
pool of mutants on the basis of specific criteria. Theoretically,
analyzing more variants can increase the probability of finding a
desired mutant in the library to some degree, because the typical
library size remains many orders of magnitude larger than the
number of enzyme variants that can be evaluated. The key criteria
for selection and screening techniques are as follows: first, the
evaluation assay should be as closely associated with the desired
property of the targeted enzymes as possible; second, the assay
should be sensitive over the desired dynamic range to facilitate the
recovery of variants with only small improvements; and last but not
least, the evaluation method should have a high throughput [26,
27]. When applying directed evolution to evolve enzyme properties,
one must decide whether to apply a selection method
(10⁶ to 10¹² mutants analyzed in parallel) or a screening method
(10³ to 10⁶ mutants analyzed individually) [28]. In general, if
the enhanced properties require simultaneous mutations at multiple
positions, larger libraries offer better chances of isolating
mutants with dramatically altered phenotypes. In this case,
high-throughput or even ultrahigh-throughput (>10¹⁰ variants
investigated) selection and screening methods become more
attractive.
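A short calculation makes the mismatch between library size and assay capacity concrete. The sketch below counts the distinct variants that carry exactly m amino acid substitutions in a protein of a given length (choose m positions, with 19 alternative residues at each) and compares that number with a given throughput. The protein length of 300 residues and the function names are assumptions chosen for illustration only.

```python
from math import comb

def variants_with_m_substitutions(length, m):
    """Distinct sequences with exactly m substitutions: C(length, m) * 19**m."""
    return comb(length, m) * 19 ** m

def fraction_covered(length, m, throughput):
    """Upper bound on the fraction of that sequence space one experiment can sample."""
    return min(1.0, throughput / variants_with_m_substitutions(length, m))

if __name__ == "__main__":
    length = 300  # hypothetical protein length (assumption)
    for m in (1, 2, 3):
        n = variants_with_m_substitutions(length, m)
        print(f"m={m}: {n:.3e} variants; "
              f"screen of 1e6 covers {fraction_covered(length, m, 1e6):.2e}, "
              f"selection of 1e12 covers {fraction_covered(length, m, 1e12):.2e}")
```

Under these assumptions, all single mutants fit comfortably within a screening campaign, whereas the triple mutants (about 3 × 10¹⁰ variants) exceed typical screening capacity but remain within reach of a selection.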
As summarized in Table 21.1 and Figs. 21.2 and 21.3, this
chapter highlights the progress in the development of selection and
screening methodologies over the past few years for evolutionary
enzyme engineering. Readers who are interested in library creation
methods are referred to two recent publications [18, 78].

21.2 Selection

In the selection process, selection pressure is simultaneously
imposed on the entire library of mutants, but only improved variants
are identified. The selection strategies usually need to be specifically
tailored to each directed evolution effort and cannot be transferred
to another enzyme, and the desired enzymatic property is generally
associated with cell survival and growth. Selection methods tend
to be more efficient than screening with respect to isolating
desired mutants. Nevertheless, selection procedures are usually
difficult to develop and control and readily result in false positives
[79]. Growth-based assays, such as genetic complementation and
chemical complementation, whereby an enhancement of the desired
characteristic provides growth advantages to host cells and enables
them to outgrow their competitors, are popular for high-throughput
selection, as illustrated in Fig. 21.2a,b. Moreover, recent
advances in ultrahigh-throughput selection techniques,
for instance, the phage-assisted continuous evolution (PACE) for
continuous directed evolution of T7 RNA polymerase (RNAP) (Fig.
21.4) [41, 42] and the in vivo evolution of alkane hydroxylase AlkB
and CYP153A6 for butane oxidation [39], have greatly shortened the
selection time and accelerated the evolution process.

Table 21.1 Summary of recent examples of protein evolution discussed in this chapter

| Strategy | Library creation method | Library size | Targeted enzyme | Evolved properties | Reference |
|---|---|---|---|---|---|
| Selection | | | | | |
| Solid-medium-based selection | A specific mixture of PCR primers | ∼10⁴ | Lipase A | Improved enantioselectivity | [29] |
| | Error-prone PCR | ∼10⁷ | 2-Keto-3-deoxy-6-phosphogluconate (KDPG) aldolases | Expansion of the substrate profile of KDPG to a nonfunctionalized electrophilic substrate, 2-keto-4-hydroxyoctonoate (KHO) | [30] |
| | Error-prone PCR | ∼10⁵ | β-Subunit of glutaryl acylase from Pseudomonas SY-77 | Increased activity and affinity for the adipyl substrate | [31] |
| | Region-directed mutagenesis | ∼10⁴ | α-Subunit of glutaryl acylase from Pseudomonas SY-77 | Increased activity and affinity for the adipyl substrate | [32] |
| | Error-prone PCR | 3300 | β-Glucosidase from Paenibacillus polymyxa | Enhanced thermostability, the most thermostable mutant A17S with an 11-fold increase in the half-life of thermoinactivation at 50 °C | [33] |
| | Error-prone PCR | ∼10⁶ | A cellodextrin transporter 2 | Increased cellobiose uptake activity | [34] |
| | Error-prone PCR | ∼60,000 | Human estrogen receptor α (hERα) | Five hERα variants with up to 7600-fold improvement in the binding affinity for testosterone | [35] |
| | Saturation mutagenesis and random mutagenesis | | hERα | Sixty-seven-fold increased activity to the synthetic ligand CV3320 and tenfold decreased activity to 17β-estradiol | [36] |
| Liquid-culture-based selection | DNA shuffling | ∼10⁸ | Cellulase | A cellulase variant with a sixfold increase in kcat/KM over the parent enzyme | [37] |
| | Error-prone PCR | ∼10⁶ | Aldolase | Two mutants with an altered substrate specificity isolated | [38] |
| | Mutagenesis of plasmids | ∼10⁸ | Alkane hydroxylase and P450 enzyme | Mutants with higher rates of 1-butanol production from butane and maintenance of their preference for terminal hydroxylation | [39] |
| | Site-directed mutagenesis using degenerate primers | ∼10⁸ | Esterase | Mutants with changed enantiopreference | [40] |
| | Enabled continuous mutagenesis with the use of mutagenesis plasmid | ∼10¹² | T7 RNA polymerase | Alteration of the promoter and initiation specificity | [41] |
| | Enabled continuous mutagenesis with the use of mutagenesis plasmid | | T7 RNA polymerase | Altered substrate specificities | [42] |
| | Error-prone PCR and site-directed and site saturation mutagenesis | ∼10⁵ | Cellobiose phosphorylase | Increased activity of lactose phosphorylase (LP) | [43] |
| | Error-prone PCR and DNA shuffling | ∼10⁶ | Citramalate synthase | Threefold increase of the kcat and KM | [44] |
| | Error-prone PCR | ∼10⁵ | Xylose reductase | Increased Vmax | [45] |
| Phage-display-based selection | Site saturation mutagenesis | ∼10⁸ | Sortase | Altered substrate selectivity | [46] |
| | Site saturation mutagenesis | ∼10⁴ | Lipase | Inverted enantioselectivity | [47] |
| | Limited diversity for selected chains | ∼10¹⁰ | Caspase 1 | Improved affinities of Fabs | [48] |
| Yeast-display-based selection | Parsimonious mutagenesis | ∼10⁶ | Fc antigen binding | Improved thermal stability | [49] |
| | Error-prone PCR | | IgG1-Fc | Improved thermal stability | [50] |
| Ribosome-display-based selection | Randomizing oligonucleotide of peptides | ∼10¹³ | Randomized amino acid residues at the N terminus of FABP | Improved binding to streptavidin | [51] |
| | Trinucleotide cassette mutagenesis | ∼10⁹ | Single-chain antibody fragments (scFvs) | Improved binding to insulin | [52] |
| | A random NNK codon region | ∼10¹² | Random decapeptides | Affinities to specific monoclonal antibody | [53] |
| | Random (NNY)n codon | ∼10⁹ | Random peptides | Discovery of a metal-binding motif | [54] |
| mRNA-display-based selection | A random codon region | ∼10¹² | Random peptides | Discovery of CaM-binding peptides | [55] |
| | A random NNS codon region | ∼10¹² | Random cyclic peptides | Discovery of Gαi1-binding peptides | [56] |
| Screening | | | | | |
| MS-based screening | Iterative saturation mutagenesis | ∼10⁴ | P450pyr hydroxylase | Increased enantioselectivity | [57] |
| | Saturation mutagenesis at three residues | >14,000 | Nonribosomal peptide synthetase AdmK | Altered substrate specificity | [58] |
| Agar-plate-based screening | Error-prone PCR | 2 × 10⁴ | Nonribosomal peptide synthetase | Improved activity over enterobactin | [59] |
| | Error-prone PCR | ∼10⁵ | Luciferase from Luciola mingrelica | Improved thermostability | [60] |
| | Error-prone PCR | | (S)-Aminotransferase from Arthrobacter citreus | Improved activity and thermostability | [61] |
| | DNA shuffling | ∼10⁴ | Lipase | Improved thermostability and activity | [62] |
| Microtiter-plate-based screening | Error-prone PCR | ∼4 × 10⁴ | Phosphite dehydrogenase | Enhanced thermostability | [63] |
| | Error-prone PCR | ∼4 × 10⁴ | Endoglucanase and β-glucosidase | Increased specific activity | [64] |
| | Saturation mutagenesis | ∼3 × 10⁵ | P450cam mono-oxygenase | Increased substrate specificity | [65] |
| FACS-based screening | Error-prone PCR | >10⁶ | Glycosyltransferases | Increased catalytic activity | [66] |
| | Error-prone PCR | 3.6 × 10⁷ events per day | P450 mono-oxygenase | Increased activity | [67] |
| | Error-prone PCR and DNA shuffling | ∼2 × 10⁶ | Drosophila melanogaster 2′-deoxynucleoside kinase | Changed substrate specificity | [68] |
| | Error-prone PCR | >10⁸ | Esterase from Pseudomonas aeruginosa (EstA) | Improved enantioselectivity | [69] |
| | Saturation mutagenesis | ∼5 × 10⁷ | Protease OmpT | Improved specificity | [70] |
| | Error-prone PCR | >10⁸ | E. coli lipoic acid ligase | Improved pAz ligation activity | [71] |
| | Saturation mutagenesis | 5 × 10⁶ | DhbE, part of a three-module nonribosomal peptide synthetases | Dramatically changed substrate specificity of up to 200-fold | [72] |
| | Random and targeted mutagenesis | >10⁶ | Serum paraoxonase (PON1) | Improved catalytic activity by ∼10⁵-fold | [73] |
| Microfluidics | | 2000 events per second | β-Galactosidase (lacZ) | lacZ genes were enriched 502-fold from a 1:100 molar ratio of lacZ:lacZmut genes | [74] |
| | Error-prone PCR | ∼8 × 10⁵ | Arylsulfatase from P. aeruginosa (PAS) | Improved activity and expression level | [75] |
| | Error-prone PCR | ∼10¹⁰ | M-MuLV reverse transcriptase | Elevated thermostability | [76] |
| | | 500 events per second | Human tissue plasminogen activator (tPA) | More than 1300-fold enrichment of the active wild-type enzyme | [77] |

Figure 21.2 Overview of selection methodologies. (a) Cell growth selection
on a solid medium, (b) cell growth selection in a liquid medium, (c) direct
selection of a phage-displayed protein library, (d) direct selection of a
yeast-displayed protein library for improved thermal stability, (e) direct
selection of a ribosome-displayed protein library, and (f) direct selection
of an mRNA-displayed protein library.


Figure 21.3 Overview of screening methodologies. (a) Chromatography-
and mass-spectrometry-based screening, (b) solid-medium-based
screening, (c) microtiter-plate-based screening, (d)
fluorescence-activated-cell-sorting (FACS)-based screening, and (e)
microfluidics-based screening with in vitro compartmentalization (IVT).

21.2.1 Solid-Medium-Based Selection


Solid-medium-based selection is a simple strategy for
evolving highly efficient enzymes by plating cells expressing the
enzyme variant library onto solid selection media (Fig. 21.2a). It is
less labor-intensive and can be achieved by linking an expressed
enzyme variant with the survival of a microorganism. A recent
study showed an example where a growth advantage is conferred
on cells expressing enantioselective enzyme mutants [29]. Here
the desired enantiomer was esterified to aspartate and added to
selective minimal medium plates as the aspartate source to support
the growth of an aspartate-auxotrophic expression strain. Only
cells expressing LipA variants that could hydrolyze this aspartate
ester were able to grow. A mutant with inverted enantioselectivity
was identified. Another example is the expansion of the substrate
profile of 2-keto-3-deoxy-6-phosphogluconate (KDPG) aldolase to
recognize a nonfunctionalized electrophilic substrate,
2-keto-4-hydroxyoctonoate (KHO) [30]. A selection scheme was constructed
using a pyruvate kinase-deficient (pykA and pykF) strain
that is unable to grow in minimal media; addition of exogenous
pyruvate rescues the auxotrophic phenotype. Using random mutagenesis
and selection of cells that can grow on KHO, KDPG aldolase
mutants were identified with up to 25-fold improved efficiency for
the cleavage of KHO. Otten et al. successfully converted
the glutaryl acylase from Pseudomonas SY-77 into an adipyl acylase
that can be used for a one-step bioconversion of
adipyl-7-aminodesacetoxycephalosporanic acid (adipyl-7-ADCA) to 7-ADCA
[31]. The mutant library was selected in a leucine-deficient host
using adipyl-leucine as the sole leucine source, so that only mutants
able to hydrolyze adipyl-leucine survive. A mutant was isolated
with a 10-fold improved catalytic efficiency (kcat/KM) on
adipyl-7-ADCA. In a similar study on the α-subunit of the glutaryl acylase
from Pseudomonas SY-77, mutants with increased activity and
affinity for the adipyl substrate were selected [32]. Recently, Zhang
and coworkers applied directed evolution to improve the activity
and thermostability of a well-studied β-glucosidase from
Paenibacillus polymyxa [33]. They developed a novel high-throughput
combinatorial growth-based selection/screening approach to select
variants of interest. Several thermostable mutants were identified,
and the most thermostable mutant, A17S, had an 11-fold increase
in the half-life of thermoinactivation at 50 °C. Lian et al. evolved
cellodextrin transporter 2 (CDT2) with an increased cellobiose
uptake activity to enable improved fermentation of cellobiose in
Saccharomyces cerevisiae under anaerobic conditions [34]. The best
evolved CDT2 mutant conferred a 2.2-fold increase in the cellobiose
uptake activity, resulting in 4- and 4.4-fold improvements in the
cellobiose consumption rate and ethanol productivity, respectively.
A yeast two-hybrid system was developed to aid the directed
evolution of human estrogen receptor α (hERα) variants with
significantly enhanced androgen specificity and affinity [35]. A
two-tiered strategy consisting of an agar-plate-based selection
followed by a 96-well-plate-based screening was used. The plate
was supplemented with an appropriate testosterone concentration
such that, in each round of directed evolution, yeast cells bearing
the parental hERα ligand-binding domain (LBD) cannot form colonies,
whereas yeast cells bearing a variant with improved ligand-binding
affinity can grow. Five hERα variants with up to 7600-fold
improvement in the binding affinity for testosterone were
identified after two rounds of directed evolution. A similar strategy
was successfully used to select mutants with altered ligand-binding
specificity of hERα for a nonsteroidal synthetic ligand [36].

Figure 21.4 Overview of the PACE system. MP, mutagenesis plasmid; AP,
accessory plasmid; SP, selection phage. In PACE, host cells continuously flow
through a fixed-volume vessel (the lagoon), where the cells are infected
with an SP encoding the gene(s) of interest. Functional library members
are able to induce sufficient pIII production from the AP and will propagate
and persist in the lagoon, confining the accumulation of mutations. Modified
figure reprinted by permission from Macmillan Publishers Ltd: Nature (Ref.
[41]), copyright (2011).
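The two-tiered strategy described in this section (a plate selection that removes variants no better than the parent, followed by a low-throughput quantitative screen of the surviving colonies) can be sketched as two filters applied in sequence. This is an illustrative sketch only: the activity values, the growth threshold, and the function names are hypothetical, and real colony formation is not a sharp threshold.

```python
def plate_selection(library, growth_threshold):
    """Tier 1: only variants whose activity exceeds the threshold form colonies."""
    return [name for name, activity in library.items()
            if activity > growth_threshold]

def well_plate_screen(colonies, library, wells=96):
    """Tier 2: pick at most `wells` colonies and rank them by measured activity."""
    picked = colonies[:wells]
    return sorted(picked, key=lambda name: library[name], reverse=True)

if __name__ == "__main__":
    # Hypothetical activities in arbitrary units; "parent" sets the threshold.
    library = {"parent": 1.0, "v1": 0.5, "v2": 3.0, "v3": 1.5, "v4": 8.0}
    colonies = plate_selection(library, growth_threshold=library["parent"])
    print(well_plate_screen(colonies, library))
```

Setting the threshold at the parental activity mirrors the idea that the parent should fail to grow under the chosen selection pressure, so that the costly quantitative assay is spent only on candidates that already passed the selection.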

21.2.2 Liquid-Medium-Based Selection


Culturing variants in a liquid medium provides a specific evolutionary
pressure that permits the growth of only the desired mutants
(Fig. 21.2b). The process leads to an enrichment of the desired
mutants. After the enrichment, a smart library is obtained that is
smaller but contains a higher fraction of beneficial mutations, in
contrast to the initial library, which is composed of a high proportion
of nonsense or deleterious mutations. To improve the activity of
cellulases, a chemical complementation system was developed to link bond
cleavage with cell survival in yeast [37]. Cells can grow in the
selective media only if the cellulase variants can hydrolyze the
substrate methotrexate-cellotetraose-dexamethasone (Mtx-Cel-Dex). A
cellulase variant with a sixfold increase in kcat/KM over the best
parent enzyme for the hydrolysis of p-nitrophenyl cellobioside was
isolated. In addition to cellulases, the chemical complementation
method can be readily adapted to other enzymatic systems. Recently,
Sugiyama et al. reported the development of an in vivo selection
system for the directed evolution of L-rhamnulose aldolase from
L-rhamnulose-1-phosphate aldolase (RhaD) [38]. An Escherichia coli
strain deficient in the L-rhamnulose metabolic pathway was used
as a selection host. A desired RhaD mutant expressed in this
selection host can cleave unphosphorylated L-rhamnulose and allows
the host to grow in minimal media lacking L-rhamnulose.
Two mutants were isolated with substrate specificity altered from
dihydroxyacetone phosphate (DHAP) to dihydroxyacetone (DHA).
Selection by chemical complementation was also used to evolve
the medium-chain alkane hydroxylase AlkB from Pseudomonas
putida GPo1 and the P450 enzyme CYP153A6 from Mycobacterium sp.
strain HXN-1500 for the terminal hydroxylation of butane to
1-butanol [39]. A selection method based on enhanced cell growth
on butane as the sole carbon source was used to enrich beneficial
mutants. In another example, Fernández-Álvaro et al. combined in
vivo selection and cell sorting to isolate enantioselective esterases
[40]. (R)-3-Phenylbutyric acid was coupled to the growth-supporting
glycerol, and (S)-3-phenylbutyric acid was coupled to the toxic
2,3-dibromopropanol. E. coli cells with an esterase selective for
the (R)-enantiomer showed better viability than those with an
esterase selective for the (S)-enantiomer. The enriched library
was then sorted by fluorescence-activated cell sorting (FACS).
In another related study, directed evolution was used
to modify the substrate specificity of Cellulomonas uda cellobiose
phosphorylase (CP) from cellobiose to lactose [43]. Cells expressing
lactose phosphorylase (LP) with a higher activity can grow faster
and be enriched by in vivo selection in a liquid minimal medium
containing lactose as the sole carbon source. One clone, the LP1 mutant,
was found to have a threefold increase in LP-specific activity.
To improve the activity of a heterologous protein, citramalate
synthase (CimA), a directed evolution strategy based on the
requirement of L-isoleucine for growth was used [44]. The E. coli
host strain is auxotrophic for L-isoleucine and cannot
grow unless CimA is active. The kcat and KM values for the best
CimA variant (CimA3.7) were found to be improved threefold
over the wild type. A mutant xylose reductase (XR) was selected
from a randomly mutagenized library in sequential anaerobic batch
cultivation [45]. The library was expressed in S. cerevisiae with
xylose as the sole carbon source. Two conserved mutations in the
XYL1 gene, A814G and C824A, were identified and accounted for the
higher aerobic growth rates and ethanol yields displayed by the
strains.
A recent notable selection method is PACE [41] (Fig. 21.4). PACE
achieves continuous selection by linking the desired protein activity
to the production of the M13 bacteriophage protein pIII from a helper
(accessory) plasmid. Functional library members induce production
of pIII and will persist in the lagoon, confining the accumulation of
mutations. Continuous random mutagenesis is enabled by induction
of the mutagenesis plasmid, which contains inducible genes encoding
mutant DNA proofreading and repair proteins that elevate error rates
during DNA replication. As a proof of concept, the PACE system
was used to evolve a T7 RNAP to recognize a unique promoter
sequence and to initiate transcripts with adenosine triphosphate
(ATP) or cytidine triphosphate (CTP) instead of guanosine
triphosphate (GTP) in 200 rounds of evolution in eight days. Carlson
et al. expanded the PACE method to enable the evolution of
biomolecules with radically altered or highly specific new activities
by developing a general strategy to modulate selection stringency
and a negative selection that enables continuous counterselection
against undesired activities [42]. These advances were integrated
to continuously evolve T7 RNAP variants with ∼10,000-fold altered
promoter specificity in a three-day PACE experiment.
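The persistence of functional library members in the lagoon can be pictured with a minimal washout model, given here only as a sketch under simplifying assumptions: the phage population is treated as well mixed and as growing exponentially at a replication rate r while being diluted at a rate D, so that N(t) = N0·exp((r − D)·t). The rate values and function names are hypothetical; only the figures of 200 rounds and eight days come from the text.

```python
import math

def phage_titer(n0, replication_rate, dilution_rate, hours):
    """Titer after `hours` if phage replicate at r (1/h) and are diluted at D (1/h)."""
    return n0 * math.exp((replication_rate - dilution_rate) * hours)

def hours_per_round(days, rounds):
    """Average duration of one round of evolution."""
    return days * 24 / rounds

if __name__ == "__main__":
    # 200 rounds in eight days, as reported for the proof-of-concept experiment.
    print(f"{hours_per_round(8, 200):.2f} h per round")
    # Hypothetical rates: a weak variant (r < D) washes out, a strong one persists.
    print(f"{phage_titer(1e8, 1.0, 2.0, 10):.3g}")
    print(f"{phage_titer(1e8, 3.0, 2.0, 10):.3g}")
```

In this picture the dilution rate acts as the selection threshold: a variant is retained only if its activity supports replication faster than the lagoon is flushed.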

21.2.3 Display-Based Selection


Cell surface display allows enzyme variants to be displayed on
the outside of cells by fusing them to anchoring motifs.
It enables the exploration of interactions between proteins, peptides,
and small-molecule ligands, which can be applied as a criterion in
the selection of desired protein variants. Various display systems
have been used, including phage display, yeast surface display,
ribosome display, and mRNA display.
In the phage display technique, foreign (poly)peptides are
displayed on the surface of a phage particle and selection is
achieved by binding the target (Fig. 21.2c). For example, Piotukh
et al. introduced a phage-display-based assay to broaden substrate
selectivity of Staphylococcus aureus sortase A from a highly
conserved LPxTG sorting motif to FPxTG or APxTG [46]. This method
establishes a monovalent display of G5-sortase as a pIII fusion
protein on the envelope of the M13 phage. The displayed G5-
sortase then captures a substrate peptide by intramolecular ligation.
A library of mutant sortases was constructed by randomizing the
substrate recognition loop. The isolated sortase A possessed a
remarkably broad selectivity and accepted aromatic amino acids
as well as amino acids with small side chains. A similar selection
system was developed to isolate variants of lipase A (LipA) with
inverted enantioselectivity by phage display [47]. In another study,
a general strategy was presented for identifying conformation-
specific antibodies by using small-molecule ligands to lock target
proteins into the conformation of interest [48].
Compared to other display methods, the yeast display method
has the advantage of a eukaryotic expression system to enable dis-
play of complex proteins. In combination with magnetic separation
or flow cytometry, yeast display is a highly effective method for
protein engineering such as improving the stability of proteins (Fig.
21.2d). For example, using this method, a more stable Fc antigen
binding (Fcab) was developed for binding Her2/neu [49]. The Fcab
variant library was displayed on yeast surface, followed by heat
incubation and selection for retained Her2/neu binding. In a similar
study, the thermal stability of the human IgG1-Fc was increased after
four rounds of selection [50].


Ribosome display is a powerful in vitro tool for selecting and


evolving proteins (Fig. 21.2e). The library size is not limited by
the efficiency of cell transformation, and random mutations could
be easily introduced [80]. Ribosome display was applied to select
streptavidin-binding peptides from a library of large size of 1013
[51]. A His-Pro-Gln (HPQ) consensus motif was found to be essential
for streptavidin binding in members of the selected peptides.
With the help of a substitution analysis, the 15-mer peptides
were shortened to a 9-mer variant with a dissociation constant
of 17 nM, which is a 1000-fold increase in affinity compared to
the already known peptides of this size. Ribosome display can be
used for in vitro selection and evolution of single-chain antibody
fragments (scFvs) [52]. Three ribosome display experiments were
carried out to isolate variants that specifically bind insulin. The
dissociation constants of the isolated scFvs were as low as 82 pM.
Ribosome (polysome) display has also enabled the selection of
decapeptides with high affinities on an immobilized monoclonal
antibody specific for the peptide dynorphin B [53]. In another
ribosome display system, a metal-binding motif was selected
from an artificial peptide library (APL) [54]. The display complex
consisted of APL-associating proteins, ribosome, and mRNA. The
complex was stabilized by automatic association of a protein with an
RNA motif and was used to select peptide aptamers for producing
a metal-chelating resin. The display system has found several new
proteins and peptides that bind Co(II)-immobilized resin and Co(II)
complex, respectively.
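
To put the dissociation constants quoted above in perspective, the following Python sketch computes the fraction of target occupied under a simple 1:1 binding model. It is an illustration only: the model assumes the ligand is in large excess over the target, the function name is hypothetical, and the 17 µM value for earlier peptides is merely what a 1000-fold weaker affinity than 17 nM implies, not a figure reported in the cited study.

```python
def fraction_bound(ligand_nM: float, kd_nM: float) -> float:
    """Fraction of target occupied for 1:1 binding with ligand in excess."""
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_nM / (ligand_nM + kd_nM)


# Kd of the shortened 9-mer streptavidin binder quoted in the text.
KD_SELECTED_NM = 17.0
# Implied Kd of earlier peptides if they bind 1000-fold more weakly (assumption).
KD_EARLIER_NM = KD_SELECTED_NM * 1000

for conc_nM in (1.0, 17.0, 170.0):
    selected = fraction_bound(conc_nM, KD_SELECTED_NM)
    earlier = fraction_bound(conc_nM, KD_EARLIER_NM)
    print(f"{conc_nM:7.1f} nM peptide: selected {selected:.3f}, earlier {earlier:.5f}")
```

At a peptide concentration equal to the dissociation constant, half of the target is occupied, which is why a lower dissociation constant corresponds to tighter binding.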
Recently, mRNA display has also been developed as an in vitro,
cell-free display technology for peptide or protein selection.
In this method, the polypeptide chain is covalently linked to the
3′ end of its own mRNA by means of an attached puromycin that
pauses translation (Fig. 21.2f). For example, it allowed a
target-binding selection against calmodulin (CaM) [55]. More than 2000
peptide binders were identified, including several that were highly
homologous to the CaM-binding motifs found in natural proteins.
Millward et al. described cyclic mRNA display and employed it to
design macrocyclic peptide ligands for Gαi1 [56]. Gαi1 serves as
a signaling protein to connect the cell surface G-protein-coupled
receptors (GPCRs) to downstream effector pathways. The resulting
library was panned against Gαi1, and the tightly binding sequences
were amplified and enriched by polymerase chain reaction (PCR). It
is striking to find that the conserved hexamer core motif can be seen
in all of the selected variants, and the resulting cyclic peptides, such
as the cyclic Gαi binding peptide (cycGiBP), exhibit an enhanced
resistance to protease degradation and bind Gαi1 with a strong
affinity (Ki ≈ 2.1 nM).

21.3 Screening

In the screening process, every single variant is investigated
for the desired enzymatic reaction, and the property of each
variant becomes known. Compared to selection, screening usually
has a relatively lower throughput. Every round of the screening
process can evaluate around 10^3 to 10^5 mutants by utilizing any
biochemical or biophysical detection method, even including high-
performance liquid chromatography (HPLC) and mass spectrometry
(MS) [81, 82]. Though screening methodologies are somewhat time
consuming and labor intensive, they also have incomparable merits
and frequently produce useful variants. First, multiple parameters
can be simultaneously examined in the screening process, such as
the initial and final activities or activity and specificity. Second, the
targeted property of each single variant in the library can be tested
very precisely, leading to the identification of dramatically enhanced
mutants and even variants with marginal improvement or no
improvement. This information will serve as a valuable reference for
further rational design of supermutants by incorporating additional
mutations or eliminating deleterious mutations. Third, the
screening process can directly employ the real substrates and
reaction conditions in the developed assay, so that you get what
you screen for [10]. On the other hand, screening does not
always have low throughput: some high-throughput or even
ultrahigh-throughput screening methods have been developed, such
as FACS [83] and microfluidics [84], which combine the merits of
high throughput and screening accuracy and can evaluate up to 10^10
variants per screening round.


21.3.1 Chromatography- and Mass-Spectrometry-Based Screening

The most straightforward and reliable means of investigating a
library is to perform the assay directly on each individual variant.
Screening techniques based on chromatography and MS, as shown
in Fig. 21.3a, follow such principles and produce very accurate
results. Though the throughput of these screening methods is fairly
low compared to either selection or other screening methods,
they still attract many researchers because of their accuracy
and reliability. For example, mannitol dehydrogenase (MDH) can
selectively transform ribitol to L-ribose, which is the precursor of
L-nucleoside antiviral drugs, but the poor performance of naturally
occurring MDHs hinders L-ribose production.
An HPLC-based high-throughput method that could
confidently determine enantiomeric excess (ee) greater than 99.0%
was developed to analyze the conversion reaction from ribitol
selectively to L-ribose to facilitate the directed evolution efforts on
MDHs [82]. Enzymatic reaction solution was directly injected after
quenching, and ribose and ribitol were separated in 2.3 min. The
total run time of each sample was ∼4 min, providing a screening
capability of 4 × 96-well plates/day/instrument. In addition, an MS-
based screening method was developed to determine hydroxylation
enantioselectivity by using enantio-enriched deuterated substrates.
The gas chromatography (GC)/MS analysis gave ratios of products
with M and M + 1 that could be extrapolated to calculate the
enantioselectivity. The GC/MS analysis only took 1.5 min per
sample, resulting in a theoretical throughput of 960 samples per
day [81]. A similar LC/MS-based method was used to investigate
enantioselective biohydroxylation of N-benzylpyrrolidine, achieving
an analysis time of only 1.0 min per sample. The ee value was
determined to be comparable to that by HPLC with chiral columns,
even when the sample concentration was as low as 0.005 mM.
This method was used to evolve a mono-oxygenase, P450pyr
hydroxylase from Sphingomonas sp. HXN-200, for increased ee value
from 53% to 98% for the hydroxylation of N-benzyl pyrrolidine
to (S)-N-benzyl 3-hydroxypyrrolidine [57]. In a related study, a
multiplexed LC-MS/MS assay was used for directed evolution of the
nonribosomal peptide synthetase (NRPS) AdmK to generate new
andrimid derivatives in vivo [58]. Four clones were identified that
exhibit altered substrate specificity. Compared to a bioassay, MS
detection of andrimid has a ∼23,000-fold greater sensitivity.
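
The per-sample run times quoted above translate directly into daily capacity, and the ee values follow from the standard definition of enantiomeric excess. The short Python sketch below works through both. It assumes an instrument running around the clock with no downtime, the function names are illustrative rather than taken from the cited studies, and the enantiomer percentages are back-calculated examples, not reported data.

```python
MINUTES_PER_DAY = 24 * 60


def samples_per_day(minutes_per_sample: float) -> float:
    """Theoretical daily throughput of one instrument with no downtime."""
    return MINUTES_PER_DAY / minutes_per_sample


def enantiomeric_excess(major: float, minor: float) -> float:
    """Standard ee definition in percent, from amounts of the two enantiomers."""
    return 100.0 * (major - minor) / (major + minor)


# HPLC assay: about 4 min total run time per sample.
hplc_samples = samples_per_day(4.0)
print("HPLC samples/day:", hplc_samples)
print("HPLC 96-well plates/day:", hplc_samples / 96)

# GC/MS assay: 1.5 min per sample; LC/MS assay: 1.0 min per sample.
print("GC/MS samples/day:", samples_per_day(1.5))
print("LC/MS samples/day:", samples_per_day(1.0))

# Illustrative enantiomer ratios corresponding to 53% and 98% ee.
print("ee at 76.5:23.5 =", enantiomeric_excess(76.5, 23.5))
print("ee at 99:1      =", enantiomeric_excess(99.0, 1.0))
```

Under these assumptions the 4 min HPLC run gives 3.75 plates per day, consistent with the quoted capacity of about four 96-well plates per day per instrument, and the 1.5 min GC/MS run reproduces the quoted 960 samples per day.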

21.3.2 Solid-Medium-Based Screening


Solid-medium-based screening is a simple and attractive format
for high-throughput screening and only involves the incubation of
colonies with the enzyme substrates. This screening scheme links
the enzyme-catalyzed substrate conversion to a phenotype that is
easy to recognize, such as colony size or color, to identify clones
expressing an enzyme with a desired characteristic [85]. Solid-
medium-based screening, illustrated in Fig. 21.3b, can evaluate
about 10^5–10^6 colonies per screening round and can be adjusted
to screen a wide range of properties, such as substrate specificity,
thermostability, and activity. However, the prerequisite of this
screening system is to associate the desired properties with a visual
signal with a suitable dynamic range and sensitivity.
Nonribosomal peptides (NRPs) are produced by NRPS enzymes
that function as molecular assembly lines. Domain swapping between
two different NRPSs can generate novel but nonfunctional NRPSs, but
the impaired function of chimeric NRPSs can be restored by rounds
of mutagenesis and colony-size-based in vivo screening for NRP
production [59]. For example, E. coli EntF (enterobactin synthetase)
domain A was replaced by SyrE-A1, a ∼50 kDa A domain from
the Pseudomonas syringae syringomycin NRPS. The random library
of EntF-SyrE-A1 variants was screened on M9 screening plates on
the basis of colony size, resulting in one mutant, 1B-06, with an
approximately eightfold improvement in enterobactin
production. Using a similar strategy, an admK-CytC1 mutant was
isolated that can produce ∼10-fold more andrimid (a broad-spectrum
antibiotic) in E. coli by screening ∼6,000 colonies. The thermostability
of luciferase from Luciola mingrelica was enhanced 65-fold
through four rounds of directed evolution. The high-throughput
screening was conducted on agar plates by measuring the initial in
vivo bioluminescence and the residual bioluminescence after heat
shock at 50°C–55°C [60]. Aminotransferases can selectively catalyze
the transfer of an amine group from a donor amine molecule to
a carbonyl acceptor to generate a chiral amine. The activity and
thermostability of (S)-aminotransferase from Arthrobacter citreus
were improved by four rounds of error-prone PCR (epPCR) and
a color-based screening assay on agar plates [61]. The isolation
of the mutants with enhanced properties was also based on the
appearance of color, resulting in a mutant with a threefold reduction
in biocatalyst loading. Akbulut et al. generated and selected a
Bacillus pumilus lipase variant with improved specific activity and
thermostability with an agar-plate-based high-throughput screening
[62]. Transformants from the DNA shuffling library were replicated
on tributyrin agar plates, and transformants with active lipase were
discovered by the formation of clear halos due to the hydrolysis
of tributyrin. Surprisingly, although the desired property to be
identified was activity, the mutant also displayed a remarkable
increase in thermostability, in accordance with a previous report on
Candida antarctica lipase B [86].

21.3.3 Microtiter-Plate-Based Screening


Microtiter-plate-based screenings are among the most popular
screening systems used to identify enzyme mutants with desired
properties. The mutant libraries are first created by various
mutation-generating techniques, and then individual colonies are
grown in standard microtiter plates, commonly 96-well plates, as
shown in Fig. 21.3c. The enzyme variants are usually assessed in
a second plate, with the original plate as a backup, by detecting
different biological signals such as a color change or absorbance
difference [79].
Phosphite dehydrogenase (PTDH) can regenerate the costly
nicotinamide cofactors reduced nicotinamide adenine dinucleotide
(NADH) and reduced nicotinamide adenine dinucleotide phosphate
(NADPH), but its low stability limited its potential application in in-
dustry. The thermostability of PTDH from Pseudomonas stutzeri was
improved by directed evolution together with a high-throughput
colorimetric screening assay [63]. A random mutagenesis library
was generated by epPCR, and individual colonies were inoculated
into a 96-well plate to screen for thermostable mutants. By
recording the absorbance change, only mutants showing a higher
ratio of residual to initial activity than the wild type and a
similar initial activity to the wild type were selected for further
characterization. After 3 rounds of directed evolution, 12 beneficial
mutations were revealed and combined to create a supermutant
with >7000-fold improved thermostability. Recently, a novel high-throughput
microtiter-plate-based screening method was proposed for the
directed coevolution of an endoglucanase and a β-glucosidase in
E. coli [64]. The hypothesis behind the experimental design was
that two enzymes, endoglucanase and β-glucosidase, could
be coexpressed to produce a cocktail that directly hydrolyzes
cellulose to glucose, which was detected by a coupled reaction
of glucose oxidase and peroxidase to produce a colored compound.
The library creation and screening processes were carried out
in three rounds, and mutants with up to tenfold higher activity
compared to wild type were isolated. The substrate specificity of
P450cam mono-oxygenase toward diphenylmethane was changed
by a semirational design coupled with a microtiter-plate-based high-
throughput screening [65]. A focused library (∼300,000 variants)
was constructed by mutating nine residues in the active site of
P450cam mono-oxygenase. The mutant activity toward diphenylmethane
was monitored by measuring the depletion of NADH as the
change in absorbance at 340 nm. Five mutants were finally found to
transform diphenylmethane with a specific activity of up to 75% of
the wild-type activity on D-camphor and a coupling rate of 7%–18%.
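
The hit criterion described above for the PTDH thermostability screen (a higher ratio of residual to initial activity than the wild type, together with a similar initial activity) can be sketched in Python as follows. This is a minimal illustration, not the published protocol: the class and function names are hypothetical, the 20% tolerance that defines a "similar" initial activity is an assumption, and the readings are invented example values.

```python
from dataclasses import dataclass


@dataclass
class WellReading:
    name: str
    initial: float   # activity (e.g., absorbance change) before heat treatment
    residual: float  # activity after heat treatment


def is_thermostable_hit(variant: WellReading, wild_type: WellReading,
                        initial_tolerance: float = 0.2) -> bool:
    """Apply the two-part criterion: better residual/initial ratio than the
    wild type AND an initial activity within a tolerance of the wild type."""
    if variant.initial <= 0 or wild_type.initial <= 0:
        return False  # inactive wells cannot be scored
    wt_ratio = wild_type.residual / wild_type.initial
    ratio = variant.residual / variant.initial
    # "Similar" initial activity is an assumed +/- tolerance around wild type.
    similar_initial = (abs(variant.initial - wild_type.initial)
                       <= initial_tolerance * wild_type.initial)
    return ratio > wt_ratio and similar_initial


# Invented example readings for illustration only.
wt = WellReading("WT", initial=1.00, residual=0.20)
plate = [
    WellReading("A1", initial=0.95, residual=0.60),  # stable and active
    WellReading("A2", initial=0.30, residual=0.25),  # stable but poorly active
    WellReading("A3", initial=1.00, residual=0.10),  # less stable than WT
]
hits = [well.name for well in plate if is_thermostable_hit(well, wt)]
print("hits:", hits)
```

Requiring both conditions avoids picking variants whose high residual-to-initial ratio only reflects a low starting activity.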
Besides the abovementioned traditional schemes, Clark and
coworkers recently developed a one-pot high-throughput in vitro
glycoside hydrolase (HIGH) cell-free expression system to evolve
and identify glycoside hydrolases (GHs), together with a col-
orimetric pH indicator to represent the activity [87]. GHs can
depolymerize the carbohydrate content of biomass to fermentable
monosaccharides, but the low activity and stability hinder their
industrial applications. As a proof of principle, the HIGH method
was verified over four different types of GH activities, including β-
glucosidases, cellulases (endoglucanase), amylases, and xylanases,
with both soluble and insoluble substrates. In addition, the HIGH
screening method was also used to identify different GHs from a
metagenomic library constructed from the cow rumen microbiome.


21.3.4 Yeast Two-/Three-Hybrid System


The yeast two-hybrid system is well known for its broad application
in the study of protein–protein interactions and protein–DNA
interactions [88, 89]. The principle of the two-hybrid system is
to separate a transcription factor into the DNA-binding domain
and the activation domain. When the two domains are in close
proximity, an artificial transcription factor is reconstituted to
initiate the transcription of downstream reporter genes, and the
levels of transcription of the reporter gene can represent the
interaction strength between the two domains. The yeast three-
hybrid system is a variation of the two-hybrid technique for the
investigation of RNA–protein interactions [90]. Recently, new high-
throughput screening methods based on the yeast two-/three-
hybrid system were developed for protein engineering. Cornish and
coworkers developed a reaction-independent yeast three-hybrid-
system-based high-throughput assay for measuring the enzyme
cleavage activity [91]. The yeast Mtx-Dex three-hybrid system
was used to link enzyme catalysis to transcription of the lacZ
reporter gene in vivo, and the feasibility of the design was verified
through cephalosporin hydrolysis by the Enterobacter cloacae P99
cephalosporinase (β-lactam hydrolase, EC 3.5.2.6). First, two fusion
proteins LexA-DHFR and B42-GR were constructed to create an
artificial transcriptional activator LexA-B42, which, in combination
with Mtx-Dex, led to the transcription of downstream lacZ reporter
gene. Second, the linker between Mtx-Dex was replaced with the
substrate for cephalosporinase cleavage reaction. Consequently,
cephalosporinase-catalyzed Mtx–substrate–Dex molecule cleavage
would be detected by monitoring the decrease in transcription of the
downstream lacZ gene. A yeast strain bearing a lacZ reporter plasmid
and coexpressing the LexA-DHFR and B42-GR fusion proteins was plated
on X-Gal plates to screen for cephalosporinase activity on the
basis of the level of β-galactosidase expression.

21.3.5 FACS-Based Screening


FACS-based screening (Fig. 21.3d) has been extensively applied in
the process of evolutionary enzyme engineering recently because
of its high sensitivity and ultrahigh throughput (as many as 10^8
variants per day) [83]. FACS is a specialized form of flow cytometry that
can rapidly separate a heterogeneous mixture of cells in suspension
on the basis of cell size and fluorescent labeling and isolate the
variants with desired properties. FACS, which can screen libraries of
up to 10^10 mutants, is particularly valuable when coupled with
technologies such as cell-trapped product detection, cell surface
display, and in vitro compartmentalization for the enrichment and
isolation of rare mutants with beneficial properties from a large
library of mutants, as delineated next.
One scheme of FACS-based screening traps the fluorescence-
labeled product in the cells and then uses FACS to sort and
enrich mutants of interest. In this regard, the fluorescence in-
tensity can represent the enzyme activity to a certain degree.
Glycosyltransferases (GTs) are responsible for assembling complex
carbohydrates, such as polysaccharides and proteoglycans, which
are important in a number of cell functions. Withers and coworkers
developed a FACS-based high-throughput screening methodology
to facilitate the directed evolution of GTs with improved substrate
specificity in intact E. coli cells [66]. They used CstII from
Campylobacter jejuni, which can transfer sialic acid (Neu5Ac) from
CMP-Neu5Ac to the carbohydrate groups of various glycoproteins
and glycolipids. E. coli JM107 NanA− cells were transformed with
two plasmids to coexpress CMP-Neu5Ac synthetase and randomly
mutated CstII, incubated with donor Neu5Ac and fluorescence-
labeled acceptor lactose-bodipy. The substrate was carefully de-
signed so that it could be transported into and out of the cells, but the
fluorescent product was only trapped inside the cells. A variant with
up to 400-fold higher catalytic activity was isolated using FACS. P450
mono-oxygenase can catalyze the O-dealkylation of
7-benzoxy-3-carboxycoumarin ethyl ester (BCCE) and produce a fluorescent
coumarin derivative, 7-hydroxy-3-coumarin, that is retained within
the cells [67]. One round of directed evolution of P450 mono-
oxygenase BM3 identified some mutants with up to sevenfold
increased activity. Using a similar strategy for trapping fluorescence-
labeled product inside the cells, Lutz and coworkers obtained a
Drosophila melanogaster 2′-deoxynucleoside kinase variant with an
overall 10,000-fold enhancement in substrate specificity [68].


Another scheme of FACS-based screening is to display heterologous
enzyme variants on the surface of the cells by fusing
them to an anchor motif so that the displayed enzymes are freely
accessible to the added substrates without any physical barriers.
If the fluorescent product can also be captured on the cell surface
without diffusion, FACS-based assay can accelerate the screening
process to find mutants of interest. The enantioselectivity of
esterase EstA from Pseudomonas aeruginosa (PAS) was increased
by random mutagenesis and a single-cell high-throughput screening
system [69]. The variants of EstA were displayed on the surface
of E. coli. Two substrates (R)- and (S)-2-methyldecanoic acid
(2-MDA) carrying indicator groups 2,4-dinitrophenyl and biotin
were synthesized, respectively. After enzymatic hydrolysis of the
substrates and product attachment to the E. coli surface, each of
the two enantiomers was labeled with either a green or a red
fluorescent dye, and then the cell populations were distinguished by
FACS. Likewise, Georgiou and coworkers also reported an ultrahigh-
throughput FACS-based assay for screening the catalytic activity of
protease OmpT that is naturally located on the outer membrane of
E. coli [70]. By screening around 5 × 10^7 cells, one mutant
OmpT-AR1 was discovered with increased specificity for cleavage at
Ala-Arg site. Besides E. coli cells, yeast surface display, along with
FACS screening, has also been extensively used. The E. coli lipoic
acid ligase (LplA) mutant catalyzes covalent probe ligation to a 13-
amino-acid LplA acceptor peptide (LAP) in the process of probe in-
corporation mediated by enzymes (PRIME) for intracellular protein
labeling in living cells [71]. To extend the PRIME labeling method
to the cell surface, the ligation activity of LplA toward picolyl azide
(pAz) was improved by a yeast-display-facilitated directed evolution
together with FACS-based screening. LplA libraries were displayed
on the surface of S. cerevisiae and incubated with 100–200 μM
pAz. Mutants with improved pAz ligation activity were isolated by
screening more than 10^8 clones. NRPSs are large multifunctional
enzymes that synthesize medicinally important NRPs, with the
adenylation (A) domains especially crucial for the NRPs production
[72]. The substrate specificity of DhbE, part of a three-module NRPS
assembly line in Bacillus subtilis, was expanded by engineering its
A-domain. Saturation mutagenesis was performed over four amino
acid residues in the DhbE A-domain, resulting in a library of 5 × 10^6
clones. The library was displayed on the yeast surface and screened
against two biotin-conjugated substrates by FACS. Finally, A-domain
mutants of DhbE were isolated with substrate
specificity changed by up to 200-fold.
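
To relate the DhbE library of 5 × 10^6 clones to the theoretical diversity of four saturated positions, the following Python sketch compares the two. It rests on assumptions not stated in the source: that each position was randomized with an NNK degenerate codon (32 codons), that all variants are equally likely, and that sampling follows a simple Poisson model. The function name is illustrative.

```python
import math


def expected_coverage(n_clones: float, n_variants: int) -> float:
    """Expected fraction of equally likely variants sampled at least once."""
    return 1.0 - math.exp(-n_clones / n_variants)


POSITIONS = 4
protein_variants = 20 ** POSITIONS  # all amino acid combinations
nnk_variants = 32 ** POSITIONS      # DNA-level diversity, assuming NNK codons
clones = 5e6                        # library size quoted in the text

print("protein-level variants:", protein_variants)
print("NNK codon combinations:", nnk_variants)
print(f"oversampling of NNK diversity: {clones / nnk_variants:.2f}x")
print(f"expected NNK coverage: {expected_coverage(clones, nnk_variants):.4f}")
```

Under these assumptions, a library of this size oversamples the codon-level diversity several-fold, so nearly all four-residue combinations would be expected to be present.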
FACS-based screening can also work in conjunction with in vitro
compartmentalization (IVC) to link the genotype and phenotype in
single cells within a water-in-oil emulsion or a water-in-oil-in-water
double emulsion. Biochemical processes such as transcription,
translation, and enzyme catalysis can still be performed in the
water droplet compartments, and the efficiency is comparable to
that in bulk volume [83]. The volume of the aqueous droplets
in IVC is typically in the picoliter and femtoliter scale, enhancing
the screening capability significantly compared to solid-medium-
based screening or microtiter-plate-based screening. Swartz and
coworkers demonstrated the feasibility of using IVC and FACS
together to detect the activity of [FeFe] hydrogenase [92]. In the
emulsion droplets, the [FeFe] hydrogenase cpl genes were attached
to the surfaces of microbeads, and then the encoded proteins
were expressed and activated via a cell-free protein synthesis
reaction and also attached to the surfaces of microbeads. The
Cpl beads and negative control beads were mixed at a 1:20 ratio
followed by emulsification with assay components and fluorogenic
substrate C12-resorufin. FACS could notably enrich the Cpl beads.
This approach was also used to improve the catalytic activity of serum
paraoxonase (PON1) toward the coumarin analog of Sp-cyclosarin
for the detoxification of organophosphate nerve agents [73].

21.3.6 Microfluidics-Based Screening


Droplet-based microfluidic devices offer remarkable merits
for performing various single-cell analyses and sorting with
ultrahigh throughput and high sensitivity [93]. As shown in
Fig. 21.3e, compartmentalization in droplets greatly enhances the
assay sensitivity by increasing the rare substrate concentration
and decreasing the diffusion time. It also significantly reduces the
assay cost because the sample volume is reduced dramatically to
picoliter scale compared to microtiter-plate-based screening, thus
raising the throughput to ∼10^8 samples per day. Droplet-based
microfluidic systems have been established as valuable tools for
various applications, such as multistep biological and biochemical
assays, DNA sequencing, and drug screening [94]. For example,
the activity of cellulases on the real-world insoluble substrates
Avicel and switchgrass measured in microfluidics was in excellent
agreement with that determined by the conventional screening
methods [95].
Griffiths and coworkers established a completely in vitro
ultrahigh-throughput platform using droplet-based microfluidics
and demonstrated its practicability as a screening system for protein
engineering and directed evolution by using lacZ gene [74]. The
principle behind the experimental design is to couple cell-free
transcription and translation systems and fluorogenic reagents with
droplet-based microfluidics. Single genes are compartmentalized in
aqueous droplets for amplification to reach 30,000 copies; each
droplet will be fused to a second droplet for transcription and
translation with a cell-free system containing fluorogenic reagents.
The lacZ genes encoding β-galactosidase were enriched around 502-
fold. Hollfelder and coworkers applied directed evolution together
with droplet-based microfluidics assay to improve the activity
of an arylsulfatase from PAS toward a nonactive phosphonate
substrate [75]. The random mutagenesis library of PAS was screened
for increased promiscuous hydrolytic activity on the nonnative
substrate, phosphonate 3, in monodispersed droplets. A mutant
with sixfold activity and sixfold expression improvements was
enriched, indicating droplet-based microfluidic screening could be
harnessed to pick up rare events with small activity improvement.
Another good example is to combine ribosome display and droplet-
based microfluidics screening for the directed evolution of M-
MuLV reverse transcriptase for enhanced thermostability [76]. The
mutant M-MuLV reverse transcriptase library (∼10^10) was created
by epPCR and subjected to in vitro transcription and translation. The
enzyme–mRNA–ribosome ternary complex was compartmentalized
by forming a water-in-oil emulsion. Mutants were screened by
microfluidics device at increased temperature. Finally, mutants
with increased thermostability were isolated that could synthesize
1.1 kb cDNA from the RNA template at 53°C–58°C, while wild-type
enzymes became inactive above 48°C. Merten and coworkers
established a high-throughput screening method for the engineering
of mammalian proteins by coupling eukaryotic virus display and
microfluidics-based encapsulation [77]. They chose human tissue
plasminogen activator (tPA) as an example to demonstrate the
feasibility of their system. tPA was displayed on the surface of the
murine leukemia virus (MLV) and subsequently compartmentalized
into droplets of a water-in-oil emulsion via a microfluidics device.
The activity of tPA in the presence of different concentrations of
its endogenous inhibitor PAI-1 can be measured by monitoring the
fluorescence changes in each droplet. This on-chip fluorescence-
activated droplet sorting was able to process 500 samples per
second, and a more than 1300-fold enrichment of the active wild-
type tPA enzyme was demonstrated. This method paved the way
for screening of structurally complex proteins requiring disulfide
bridging, glycosylation, and/or membrane anchorage.
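
The sorting rate of 500 samples per second quoted above can be converted into a daily capacity with simple arithmetic, as in the Python sketch below. It assumes uninterrupted operation for 24 hours. The 10^10-member library size is borrowed from the separate reverse transcriptase study purely for illustration, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def droplets_per_day(rate_per_second: float) -> float:
    """Droplets processed in 24 h of uninterrupted sorting."""
    return rate_per_second * SECONDS_PER_DAY


def days_to_screen(library_size: float, rate_per_second: float) -> float:
    """Days needed to pass every library member through the sorter once."""
    return library_size / droplets_per_day(rate_per_second)


rate = 500  # droplets per second, as quoted for on-chip droplet sorting
print(f"droplets per day: {droplets_per_day(rate):.3g}")
print(f"days for a 1e10 library: {days_to_screen(1e10, rate):.1f}")
```

At this rate a single device handles on the order of 10^7 to 10^8 droplets per day, so a library of 10^10 members would take most of a year to sample once unless it is pre-enriched or reduced in size.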

21.4 Conclusions and Prospects

Directed evolution has become a common and powerful tool
for engineering proteins with desired properties and has been
successfully applied in diverse fields such as human health, energy,
biotechnology, and synthetic biology. However, directed evolution is
not without limitations. Because even a small protein of 50 amino
acids can have 20^50 possible variants (>10^65), it is impossible to
exhaustively probe the entire theoretical sequence space. Hence
the quality of the library is very important, as it allows for more
efficient use of screening or selection resources to find beneficial
mutations. The combination of bioinformatics tools and protein
crystallography tools enables the design of a smart library that
contains an increased fraction of beneficial mutations by introducing
genetic variation at selected residues. Smart libraries
have shown some successes [46, 47, 58, 65, 72], and this strategy is
expected to be significantly enhanced due to the rapid development
of computational algorithms and more accurate protein structures
[7, 96]. However, in most directed evolution experiments, the biggest
challenge is the development of a suitable screening or selection
method, as many phenotypes of interest are not growth based.
The detectable signal can be obtained from the direct phenotype
of a desirable variant or from a series of biosensor constructs through
the action of single- and multistep enzymatic pathways. FACS-
based screening and microfluidics-based screening represent two
powerful high-throughput screening platforms. FACS relies on the
binding or linking to a fluorescent signal [66–73, 83], whereas
microfluidic devices can be coupled to many analytical instruments
(e.g., MS), thus opening doors to many more applications than FACS
[74–76, 93–95].
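
The size of the sequence space quoted above can be checked with exact integer arithmetic. The Python sketch below computes 20^50 and compares it with the highest per-round screening capacity mentioned in this chapter (10^10 variants). The per-round figure is used here only as an illustrative upper bound.

```python
AMINO_ACIDS = 20
LENGTH = 50

# Exact number of possible sequences for a 50-residue protein.
sequence_space = AMINO_ACIDS ** LENGTH
print("20^50 =", sequence_space)
print("exceeds 10^65:", sequence_space > 10 ** 65)

# Rounds needed to sample every sequence once at 10^10 variants per round.
VARIANTS_PER_ROUND = 10 ** 10
rounds_needed = sequence_space // VARIANTS_PER_ROUND
print(f"rounds needed at 10^10 per round: about {rounds_needed:.2e}")
```

Even at the highest throughput discussed here, exhaustive sampling is out of reach by dozens of orders of magnitude, which is why library quality matters more than library size.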
Directed evolution has also been extended to pathway engineer-
ing for synthesis of value-added products [97–99]. Most successful
directed evolution efforts to date focus on engineering a single
enzymatic trait. However, the enzyme is modified independently
from other enzymes within the pathway, and it may not be optimal
for the pathway. Therefore, for pathway engineering, mutants are
not just chosen for a changed trait of a protein but instead are
targeted for a change of flux for a certain pathway. The endeavor
requires an exponentially larger library containing mutations of
multiple proteins and is more difficult than that of engineering a
single protein. Because such a large library is required, a likely trend
for the near future will be the creation of smart libraries that reduce
the library to an affordable size that can be screened by FACS or
microfluidic devices.
In conclusion, directed evolution has had huge successes in
protein engineering. With the help of smart libraries and versatile
techniques for high-throughput selection/screening, directed evolu-
tion will be increasingly used for engineering of proteins, pathways,
and even genomes in the coming years.

Acknowledgments

We thank the Agency for Science, Technology, and Research,
Singapore, for supporting various research projects in the Metabolic
Engineering Research Laboratory (MERL) through the Visiting
Investigator Program to HZ.

References

1. Zhang, M. M., Su, X., Ang, E. L., and Zhao, H. (2013). Recent advances
in biocatalyst development in the pharmaceutical industry, Pharm.
Bioprocess., 1, pp. 179–196.
2. Nestl, B. M., Nebel, B. A., and Hauer, B. (2011). Recent progress in
industrial biocatalysis, Curr. Opin. Chem. Biol., 15, pp. 187–193.
3. Wohlgemuth, R. (2010). Biocatalysis: key to sustainable industrial
chemistry, Curr. Opin. Biotechnol., 21, pp. 713–724.
4. de Carvalho, C. C. C. R. (2011). Enzymatic and whole cell catalysis:
finding new strategies for old processes, Biotechnol. Adv., 29, pp. 75–
83.
5. Schmid, A., Dordick, J., Hauer, B., Kiener, A., Wubbolts, M., and Witholt,
B. (2001). Industrial biocatalysis today and tomorrow, Nature, 409, pp.
258–268.
6. Zhang, Z. G., Parra, L. P., and Reetz, M. T. (2012). Protein engineering
of stereoselective Baeyer-Villiger monooxygenases, Chemistry, 18, pp.
10160–10172.
7. Park, H.-S., Nam, S.-H., Lee, J. K., Yoon, C. N., Mannervik, B., Benkovic, S.
J., and Kim, H.-S. (2006). Design and evolution of new catalytic activity
with an existing protein scaffold, Science, 311, pp. 535–538.
8. Bornscheuer, U. T., Huisman, G. W., Kazlauskas, R. J., Lutz, S., Moore, J.
C., and Robins, K. (2012). Engineering the third wave of biocatalysis,
Nature, 485, pp. 185–194.
9. Reetz, M. T. (2011). Laboratory evolution of stereoselective enzymes: a
prolific source of catalysts for asymmetric reactions, Angew. Chem., Int.
Ed., 50, pp. 138–174.
10. Romero, P. A., and Arnold, F. H. (2009). Exploring protein fitness
landscapes by directed evolution, Nat. Rev. Mol. Cell Biol., 10, pp. 866–
876.
11. Yang, H. Q., Li, J. H., Shin, H. D., Du, G. C., Liu, L., and Chen, J. (2014).
Molecular engineering of industrial enzymes: recent advances and
future prospects, Appl. Microbiol. Biotechnol., 98, pp. 23–29.
12. Bommarius, A. S., Blum, J. K., and Abrahamson, M. J. (2011). Status
of protein engineering for biocatalysts: how to design an industrially
useful biocatalyst, Curr. Opin. Chem. Biol., 15, pp. 194–200.
13. Lutz, S. (2010). Beyond directed evolution: semi-rational protein
engineering and design, Curr. Opin. Biotechnol., 21, pp. 734–743.
14. Duan, X. G., Chen, J., and Wu, J. (2013). Improving the thermostability
and catalytic efficiency of Bacillus deramificans pullulanase by site-
directed mutagenesis, Appl. Environ. Microbiol., 79, pp. 4072–4077.
15. Korkegian, A., Black, M. E., Baker, D., and Stoddard, B. L. (2005).
Computational Thermostabilization of an Enzyme, Science, 308, pp.
857–860.
16. Blagodatski, A., and Katanaev, V. L. (2011). Technologies of directed
protein evolution in vivo, Cell. Mol. Life Sci., 68, pp. 1207–1214.
17. Labrou, N. E. (2010). Random mutagenesis methods for in vitro directed
enzyme evolution, Curr. Protein Pept. Sci., 11, pp. 91–100.
18. McLachlan, M. J., Sullivan, R. P., and Zhao, H. (2009). Directed
enzyme evolution and high-throughput screening. In Biocatalysis for the
Pharmaceutical Industry (John Wiley & Sons, Ltd., Chichester, UK), pp.
45–64.
19. Wong, T. S., Roccatano, D., Zacharias, M., and Schwaneberg, U. (2006).
A statistical analysis of random mutagenesis methods used for directed
protein evolution, J. Mol. Biol., 355, pp. 858–871.
20. Reetz, M. T., Kahakeaw, D., and Lohmer, R. (2008). Addressing the
numbers problem in directed evolution, ChemBioChem, 9, pp. 1797–
1804.
21. Cobb, R. E., Chao, R., and Zhao, H. (2013). Directed evolution: past,
present, and future, AIChE J., 59, pp. 1432–1440.
22. Brustad, E. M., and Arnold, F. H. (2011). Optimizing non-natural protein
function with directed evolution, Curr. Opin. Chem. Biol., 15, pp. 201–
210.
23. Turner, N. J. (2009). Directed evolution drives the next generation of
biocatalysts, Nat. Chem. Biol., 5, pp. 567–573.
24. Andrews, F. H., and McLeish, M. J. (2013). Using site-saturation
mutagenesis to explore mechanism and substrate specificity in thiamin
diphosphate-dependent enzymes, FEBS J., 280, pp. 6395–6411.
25. Chica, R. A., Doucet, N., and Pelletier, J. N. (2005). Semi-rational
approaches to engineering enzyme activity: combining the benefits of
directed evolution and rational design, Curr. Opin. Biotechnol., 16, pp.
378–384.
26. Aharoni, A., Griffiths, A. D., and Tawfik, D. S. (2005). High-throughput
screens and selections of enzyme-encoding genes, Curr. Opin. Chem.
Biol., 9, pp. 210–216.
27. Boersma, Y. L., Droge, M. J., and Quax, W. J. (2007). Selection strategies
for improved biocatalysts, FEBS J., 274, pp. 2181–2195.
28. Goldsmith, M., and Tawfik, D. S. (2012). Directed enzyme evolution:
beyond the low-hanging fruit, Curr. Opin. Struct. Biol., 22, pp. 406–412.
29. Boersma, Y. L., Dröge, M. J., van der Sloot, A. M., Pijning, T., Cool, R.
H., Dijkstra, B. W., and Quax, W. J. (2008). A novel genetic selection
system for improved enantioselectivity of Bacillus subtilis lipase A,
ChemBioChem, 9, pp. 1110–1115.
30. Cheriyan, M., Walters, M. J., Kang, B. D., Anzaldi, L. L., Toone, E. J.,
and Fierke, C. A. (2011). Directed evolution of a pyruvate aldolase to
recognize a long chain acyl substrate, Bioorg. Med. Chem., 19, pp. 6447–
6453.
31. Otten, L. G., Sio, C. F., Vrielink, J., Cool, R. H., and Quax, W. J. (2002).
Altering the substrate specificity of cephalosporin acylase by directed
evolution of the β-subunit, J. Biol. Chem., 277, pp. 42121–42127.
32. Sio, C. F., Riemens, A. M., van der Laan, J.-M., Verhaert, R. M. D., and
Quax, W. J. (2002). Directed evolution of a glutaryl acylase into an adipyl
acylase, Eur. J. Biochem., 269, pp. 4495–4504.
33. Liu, W., Hong, J., Bevan, D. R., and Zhang, Y. H. P. (2009). Fast
identification of thermostable beta-glucosidase mutants on cellobiose
by a novel combinatorial selection/screening approach, Biotechnol.
Bioeng., 103, pp. 1087–1094.
34. Lian, J., Li, Y., Rad, M. H., and Zhao, H. (2014). Directed evolution
of a cellodextrin transporter for improved biofuel production under
anaerobic conditions in Saccharomyces cerevisiae, Biotechnol. Bioeng.,
111, pp. 1521–1531.
35. Chen, Z., Katzenellenbogen, B. S., Katzenellenbogen, J. A., and Zhao, H.
(2004). Directed evolution of human estrogen receptor variants with
significantly enhanced androgen specificity and affinity, J. Biol. Chem.,
279, pp. 33855–33864.
36. Islam, K. M. D., Dilcher, M., Thurow, C., Vock, C., Krimmelbein, I. K., Tietze,
L. F., Gonzalez, V., Zhao, H. M., and Gatz, C. (2009). Directed evolution
of estrogen receptor proteins with altered ligand-binding specificities,
Protein Eng. Des. Sel., 22, pp. 45–52.
37. Peralta-Yahya, P., Carter, B. T., Lin, H., Tao, H., and Cornish, V. W.
(2008). High-throughput selection for cellulase catalysts using chemical
complementation, J. Am. Chem. Soc., 130, pp. 17446–17452.
38. Sugiyama, M., Hong, Z., Greenberg, W. A., and Wong, C.-H. (2007). In
vivo selection for the directed evolution of L-rhamnulose aldolase from
L-rhamnulose-1-phosphate aldolase (RhaD), Bioorg. Med. Chem., 15, pp.
5905–5911.
39. Koch, D. J., Chen, M. M., van Beilen, J. B., and Arnold, F. H. (2009). In vivo
evolution of butane oxidation by terminal alkane hydroxylases AlkB and
CYP153A6, Appl. Environ. Microbiol., 75, pp. 337–344.
40. Fernández-Álvaro, E., Snajdrova, R., Jochens, H., Davids, T., Böttcher, D.,
and Bornscheuer, U. T. (2011). A combination of in vivo selection and
cell sorting for the identification of enantioselective biocatalysts, Angew.
Chem., Int. Ed., 50, pp. 8584–8587.
41. Esvelt, K. M., Carlson, J. C., and Liu, D. R. (2011). A system for the
continuous directed evolution of biomolecules, Nature, 472, pp. U499–
U550.
42. Carlson, J. C., Badran, A. H., Guggiana-Nilo, D. A., and Liu, D. R.
(2014). Negative selection and stringency modulation in phage-assisted
continuous evolution, Nat. Chem. Biol., 10, pp. 216–222.
43. De Groeve, M. R., De Baere, M., Hoflack, L., Desmet, T., Vandamme, E.
J., and Soetaert, W. (2009). Creating lactose phosphorylase enzymes by
directed evolution of cellobiose phosphorylase, Protein Eng. Des. Sel., 22,
pp. 393–399.
44. Atsumi, S., and Liao, J. C. (2008). Directed evolution of Methanococcus
jannaschii citramalate synthase for biosynthesis of 1-Propanol and 1-
Butanol by Escherichia coli, Appl. Environ. Microbiol., 74, pp. 7802–7808.
45. Runquist, D., Hahn-Hagerdal, B., and Bettiga, M. (2010). Increased
ethanol productivity in xylose-utilizing Saccharomyces cerevisiae via a
randomly mutagenized xylose reductase, Appl. Environ. Microbiol., 76,
pp. 7796–7802.
46. Piotukh, K., Geltinger, B., Heinrich, N., Gerth, F., Beyermann, M., Freund,
C., and Schwarzer, D. (2011). Directed evolution of sortase A mutants
with altered substrate selectivity profiles, J. Am. Chem. Soc., 133, pp.
17536–17539.
47. Dröge, M. J., Boersma, Y. L., Van Pouderoyen, G., Vrenken, T. E.,
Rüggeberg, C. J., Reetz, M. T., Dijkstra, B. W., and Quax, W. J. (2006).
Directed evolution of Bacillus subtilis lipase A by use of enantiomeric
phosphonate inhibitors: crystal structures and phage display selection,
ChemBioChem, 7, pp. 149–157.
48. Gao, J., Sidhu, S. S., and Wells, J. A. (2009). Two-state selection of
conformation-specific antibodies, Proc. Natl. Acad. Sci. U S A, 106, pp.
3071–3076.
49. Traxlmayr, M. W., Lobner, E., Antes, B., Kainer, M., Wiederkum, S.,
Hasenhindl, C., Stadlmayr, G., Rüker, F., Woisetschläger, M., Moulder, K.,
and Obinger, C. (2013). Directed evolution of Her2/neu-binding IgG1-Fc
for improved stability and resistance to aggregation by using yeast
surface display, Protein Eng. Des. Sel., 26, pp. 255–265.
50. Traxlmayr, M. W., Faissner, M., Stadlmayr, G., Hasenhindl, C., Antes, B.,
Rüker, F., and Obinger, C. (2012). Directed evolution of stabilized IgG1-
Fc scaffolds by application of strong heat shock to libraries displayed on
yeast, Biochim. Biophys. Acta, 1824, pp. 542–549.
51. Lamla, T., and Erdmann, V. A. (2003). Searching sequence space for high-
affinity binding peptides using ribosome display, J. Mol. Biol., 329, pp.
381–388.
52. Hanes, J., Schaffitzel, C., Knappik, A., and Pluckthun, A. (2000). Picomolar
affinity antibodies from a fully synthetic naive library selected and
evolved by ribosome display, Nat. Biotechnol., 18, pp. 1287–1292.
53. Mattheakis, L. C., Bhatt, R. R., and Dower, W. J. (1994). An in vitro
polysome display system for identifying ligands from very large peptide
libraries, Proc. Natl. Acad. Sci. U S A, 91, pp. 9022–9026.
54. Wada, A., Sawata, S. Y., and Ito, Y. (2008). Ribosome display selection
of a metal-binding motif from an artificial peptide library, Biotechnol.
Bioeng., 101, pp. 1102–1107.
55. Huang, B. C., and Liu, R. (2007). Comparison of mRNA-display-
based selections using synthetic peptide and natural protein libraries,
Biochemistry, 46, pp. 10102–10112.
56. Millward, S. W., Fiacco, S., Austin, R. J., and Roberts, R. W. (2007). Design
of cyclic peptides that bind protein surfaces with antibody-like affinity,
ACS Chem. Biol., 2, pp. 625–634.
57. Pham, S. Q., Pompidor, G., Liu, J., Li, X. D., and Li, Z. (2012). Evolving
P450pyr hydroxylase for highly enantioselective hydroxylation at non-
activated carbon atom, Chem. Commun., 48, pp. 4618–4620.
58. Evans, B. S., Chen, Y., Metcalf, W. W., Zhao, H., and Kelleher, N. L.
(2011). Directed evolution of the nonribosomal peptide synthetase
AdmK generates new andrimid derivatives in vivo, Chem. Biol., 18, pp.
601–607.
59. Fischbach, M. A., Lai, J. R., Roche, E. D., Walsh, C. T., and Liu, D. R.
(2007). Directed evolution can rapidly improve the activity of chimeric
assembly-line enzymes, Proc. Natl. Acad. Sci. U S A, 104, pp. 11951–
11956.
60. Koksharov, M. I., and Ugarova, N. N. (2011). Thermostabilization of
firefly luciferase by in vivo directed evolution, Protein Eng. Des. Sel., 24,
pp. 835–844.

61. Martin, A. R., DiSanto, R., Plotnikov, I., Kamat, S., Shonnard, D., and
Pannuri, S. (2007). Improved activity and thermostability of (S)-
aminotransferase by error-prone polymerase chain reaction for the
production of a chiral amine, Biochem. Eng. J., 37, pp. 246–255.
62. Akbulut, N., Tuzlakoğlu Öztürk, M., Pijning, T., İşsever Öztürk, S., and
Gümüşel, F. (2013). Improved activity and thermostability of Bacillus
pumilus lipase by directed evolution, J. Biotechnol., 164, pp. 123–
129.
63. Johannes, T. W., Woodyer, R. D., and Zhao, H. (2005). Directed evolution
of a thermostable phosphite dehydrogenase for NAD(P)H regeneration,
Appl. Environ. Microbiol., 71, pp. 5728–5734.
64. Liu, M., Gu, J. L., Xie, W. P., and Yu, H. W. (2013). Directed co-evolution of
an endoglucanase and a beta-glucosidase in Escherichia coli by a novel
high-throughput screening method, Chem. Commun., 49, pp. 7219–
7221.
65. Hoffmann, G., Bonsch, K., Greiner-Stoffele, T., and Ballschmiter, M.
(2011). Changing the substrate specificity of P450cam towards
diphenylmethane by semi-rational enzyme engineering, Protein Eng.
Des. Sel., 24, pp. 439–446.
66. Aharoni, A., Thieme, K., Chiu, C. P., Buchini, S., Lairson, L. L., Chen,
H., Strynadka, N. C., Wakarchuk, W. W., and Withers, S. G. (2006).
High-throughput screening methodology for the directed evolution of
glycosyltransferases, Nat. Methods, 3, pp. 609–614.
67. Ruff, A. J., Dennig, A., Wirtz, G., Blanusa, M., and Schwaneberg, U.
(2012). Flow cytometer-based high-throughput screening system for
accelerated directed evolution of P450 monooxygenases, ACS Catal., 2,
pp. 2724–2728.
68. Liu, L. F., Li, Y. F., Liotta, D., and Lutz, S. (2009). Directed evolution of
an orthogonal nucleoside analog kinase via fluorescence-activated cell
sorting, Nucleic Acids Res., 37, pp. 4472–4481.
69. Becker, S., Hobenreich, H., Vogel, A., Knorr, J., Wilhelm, S., Rosenau,
F., Jaeger, K. E., Reetz, M. T., and Kolmar, H. (2008). Single-cell high-
throughput screening to identify enantioselective hydrolytic enzymes,
Angew. Chem., Int. Ed., 47, pp. 5085–5088.
70. Yoo, T. H., Pogson, M., Iverson, B. L., and Georgiou, G. (2012). Directed
evolution of highly selective proteases by using a novel FACS-Based
screen that capitalizes on the p53 regulator MDM2, ChemBioChem, 13,
pp. 649–653.
71. White, K. A., and Zegelbone, P. M. (2013). Directed evolution of a probe
ligase with activity in the secretory pathway and application to imaging
intercellular protein-protein interactions, Biochemistry, 52, pp. 3728–
3739.
72. Zhang, K., Nelson, K. M., Bhuripanyo, K., Grimes, K. D., Zhao, B., Aldrich,
C. C., and Yin, J. (2013). Engineering the substrate specificity of the DhbE
adenylation domain by yeast cell surface display, Chem. Biol., 20, pp. 92–
101.
73. Gupta, R. D., Goldsmith, M., Ashani, Y., Simo, Y., Mullokandov, G., Bar,
H., Ben-David, M., Leader, H., Margalit, R., Silman, I., Sussman, J. L., and
Tawfik, D. S. (2011). Directed evolution of hydrolases for prevention of
G-type nerve agent intoxication, Nat. Chem. Biol., 7, pp. 120–125.
74. Fallah-Araghi, A., Baret, J. C., Ryckelynck, M., and Griffiths, A. D. (2012).
A completely in vitro ultrahigh-throughput droplet-based microfluidic
screening system for protein engineering and directed evolution, Lab
Chip, 12, pp. 882–891.
75. Kintses, B., Hein, C., Mohamed, M. F., Fischlechner, M., Courtois, F., Leine,
C., and Hollfelder, F. (2012). Picoliter cell lysate assays in microfluidic
droplet compartments for directed enzyme evolution, Chem. Biol., 19,
pp. 1001–1009.
76. Skirgaila, R., Pudzaitis, V., Paliksa, S., Vaitkevicius, M., and Janulaitis, A.
(2013). Compartmentalization of destabilized enzyme-mRNA-ribosome
complexes generated by ribosome display: a novel tool for the directed
evolution of enzymes, Protein Eng. Des. Sel., 26, pp. 453–461.
77. Granieri, L., Baret, J. C., Griffiths, A. D., and Merten, C. A. (2010). High-
throughput screening of enzymes by retroviral display using droplet-
based microfluidics, Chem. Biol., 17, pp. 229–235.
78. Nikhil, U. N., and Huimin, Z. (2009). Improving protein functions by
directed evolution. In The Metabolic Pathway Engineering Handbook
(CRC Press), pp. 2-1–2-37.
79. van Rossum, T., Kengen, S. W. M., and van der Oost, J. (2013). Reporter-
based screening and selection of enzymes, FEBS J., 280, pp. 2979–2996.
80. Zahnd, C., Amstutz, P., and Pluckthun, A. (2007). Ribosome display:
selecting and evolving proteins in vitro that specifically bind to a target,
Nat. Methods, 4, pp. 269–279.
81. Chen, Y. Z., Tang, W. L., Mou, J., and Li, Z. (2010). High-throughput
method for determining the enantioselectivity of enzyme-catalyzed
hydroxylations based on mass spectrometry, Angew. Chem., Int. Ed., 49,
pp. 5278–5283.

82. Sun, B. G., Miller, G., Lee, W. Y., Ho, K., Crowe, M. A., and Partridge, L.
(2013). Analytical method development for directed enzyme evolution
research: a high throughput high-performance liquid chromatography
method for analysis of ribose and ribitol and a capillary electrophoresis
method for the separation of ribose enantiomers, J. Chromatogr. A,
1271, pp. 163–169.
83. Yang, G., and Withers, S. G. (2009). Ultrahigh-throughput FACS-based
screening for directed enzyme evolution, ChemBioChem, 10, pp. 2704–
2715.
84. Agresti, J. J., Antipov, E., Abate, A. R., Ahn, K., Rowat, A. C., Baret,
J.-C., Marquez, M., Klibanov, A. M., Griffiths, A. D., and Weitz, D. A.
(2010). Ultrahigh-throughput screening in drop-based microfluidics
for directed evolution, Proc. Natl. Acad. Sci. U S A, 107, pp. 4004–
4009.
85. Turner, N. J. (2006). Agar plate-based assays. In Enzyme Assays (Wiley-
VCH Verlag GmbH & Co. KGaA, Weinheim), pp. 137–161.
86. Suen, W. C., Zhang, N., Xiao, L., Madison, V., and Zaks, A. (2004). Improved
activity and thermostability of Candida antarctica lipase B by DNA
family shuffling, Protein Eng. Des. Sel., 17, pp. 133–140.
87. Kim, T.-W., Chokhawala, H. A., Hess, M., Dana, C. M., Baer, Z., Sczyrba,
A., Rubin, E. M., Blanch, H. W., and Clark, D. S. (2011). High-throughput
in vitro glycoside hydrolase (HIGH) screening for enzyme discovery,
Angew. Chem., Int. Ed., 50, pp. 11215–11218.
88. Fields, S., and Song, O. (1989). A novel genetic system to detect protein-
protein interactions, Nature, 340, pp. 245–246.
89. Brueckner, A., Polge, C., Lentze, N., Auerbach, D., and Schlattner, U.
(2009). Yeast two-hybrid, a powerful tool for systems biology, Int. J. Mol.
Sci., 10, pp. 2763–2788.
90. Martin, F. (2012). Fifteen years of the yeast three-hybrid system:
RNA-protein interactions under investigation, Methods, 58, pp. 367–
375.
91. Baker, K., Bleczinski, C., Lin, H., Salazar-Jimenez, G., Sengupta, D., Krane,
S., and Cornish, V. W. (2002). Chemical complementation: a reaction-
independent genetic assay for enzyme catalysis, Proc. Natl. Acad. Sci.
U S A, 99, pp. 16537–16542.
92. Stapleton, J. A., and Swartz, J. R. (2010). Development of an in vitro
compartmentalization screen for high-throughput directed evolution of
FeFe hydrogenases, PLOS ONE, 5, p. e15275.
93. Guo, M. T., Rotem, A., Heyman, J. A., and Weitz, D. A. (2012). Droplet
microfluidics for high-throughput biological assays, Lab Chip, 12, pp.
2146–2155.
94. Mazutis, L., Gilbert, J., Ung, W. L., Weitz, D. A., Griffiths, A. D., and
Heyman, J. A. (2013). Single-cell analysis and sorting using droplet-
based microfluidics, Nat. Protoc., 8, pp. 870–891.
95. Chang, C., Sustarich, J., Bharadwaj, R., Chandrasekaran, A., Adams, P.
D., and Singh, A. K. (2013). Droplet-based microfluidic platform for
heterogeneous enzymatic assays, Lab Chip, 13, pp. 1817–1822.
96. Lutz, S. (2010). Beyond directed evolution: semi-rational protein
engineering and design, Curr. Opin. Biotechnol., 21, pp. 734–743.
97. Eriksen, D. T., Hsieh, P. C. H., Lynn, P., and Zhao, H. (2013). Directed
evolution of a cellobiose utilization pathway in Saccharomyces cerevisiae
by simultaneously engineering multiple proteins, Microb. Cell Fact., 12,
p. 61.
98. Yuan, Y., and Zhao, H. (2013). Directed evolution of a highly efficient
cellobiose utilizing pathway in an industrial Saccharomyces cerevisiae
strain, Biotechnol. Bioeng., 110, pp. 2874–2881.
99. Dietrich, J. A., Shis, D. L., Alikhani, A., and Keasling, J. D. (2012). Tran-
scription factor-based screens and synthetic selections for microbial
small-molecule biosynthesis, ACS Synth. Biol., 2, pp. 47–58.

March 23, 2016 13:10 PSP Book - 9in x 6in 22-Allan-Svendsen-c22

Chapter 22

Nanoscale Enzyme Screening Technologies

Helen Webb-Thomasen and Andreas H. Kunding


DTU Nanotech, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
ahku@nanotech.dtu.dk

22.1 Introduction

High-throughput screening of enzymes is a powerful tool for
optimization of existing enzymes as well as for the development of
enzymes with novel functions. Essentially, high-throughput screen-
ing can be broken down into three steps: (i) preparation of a diverse
collection of enzyme mutants, (ii) evaluation of individual enzyme
performance, and (iii) extraction of those enzymes exhibiting the
desired traits.
Using the toolbox of modern molecular biology, highly diversified
genetic libraries encoding a wealth of enzyme variants can be
produced within a single test tube. While library diversities in
the range of 10⁷–10¹⁰ different variants can easily be achieved
experimentally [1], it still remains a daunting task to evaluate
the performance of each enzyme variant in the library. Currently,
experimentalists still lack tools to screen the performance
of 10¹⁰ enzyme variants, but recent innovations in microfabrication
and nanotechnology are pushing the numbers ever higher.

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

The main concept of enzyme performance screening is the ability
to compartmentalize individual enzyme species from each other
such that individual activities can be monitored and evaluated to
obtain the best-suited enzyme for a specific task. Compartmental-
ization can assume many forms, perhaps the simplest one being
the Eppendorf tube, but as the number of enzyme variants in the
library increases, so does the need to reduce the size of individual
enzyme compartments. In the next section, we will discuss a number
of strategies for spatial isolation of enzymes by making use of the
nanotechnology toolbox.
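The scaling argument above can be made concrete with a small calculation. The following Python sketch estimates the volume available to each variant when a fixed sample volume is divided evenly among all library members, together with the diameter of a sphere of that volume. The total sample volume of 1 mL and the function names are illustrative assumptions, not values taken from the literature cited here.

```python
import math


def volume_per_compartment_l(total_volume_l: float, n_variants: float) -> float:
    """Volume (liters) available to each variant if the sample is split evenly."""
    return total_volume_l / n_variants


def equivalent_diameter_m(volume_l: float) -> float:
    """Diameter (meters) of a sphere that holds the given volume (liters)."""
    volume_m3 = volume_l * 1e-3  # 1 L = 1e-3 m^3
    return (6.0 * volume_m3 / math.pi) ** (1.0 / 3.0)


if __name__ == "__main__":
    total_volume_l = 1e-3  # assumption: 1 mL of total sample volume
    for n_variants in (1e3, 1e7, 1e10):
        v = volume_per_compartment_l(total_volume_l, n_variants)
        d_um = equivalent_diameter_m(v) * 1e6
        print(f"{n_variants:.0e} variants: {v:.1e} L each, "
              f"sphere diameter ~{d_um:.1f} um")
```

Under these assumptions, a library of 10¹⁰ variants leaves on the order of 100 fL per compartment, which corresponds to a sphere only a few micrometers across. Smaller sample volumes push the compartments further toward the nanoscale.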

22.2 Approaches to Nanocompartmentalization of Enzymes

Preparation of micro- to nanoscale compartments is challenging,
and the available approaches can be divided into two categories:
top down and bottom up. The top-down approach is the route taken by,
for example, the semiconductor microchip industry and relies on micro-
or nanoscale tailoring of material structure by highly precise removal
and addition of matter [2]. The top-down approach is time
consuming and requires sophisticated microfabrication equipment
and cleanroom processing but yields highly reproducible and
homogeneous compartments and devices. The bottom-up approach, on
the other hand, relies on triggered material self-assembly to
form structures of a certain geometry [3]. The trigger is mainly
chemical (e.g., pH, salts, solvents), but it can also be physical (e.g.,
pressure, temperature), whereas the materials utilized are typically
derived from biology (e.g., lipids, peptides) or are amphiphiles in
general. The bottom-up approach is more straightforward than
the top-down approach and does not require highly sophisticated
equipment but typically yields more heterogeneous compartments.
Some of the most successful applications of the bottom-up approach
include the confinement of enzymes inside liposomes, virus-like
particles, polymersomes, and water-in-oil emulsion droplets.

22.2.1 Liposomes
Liposomes are composed of lipid molecules possessing a hydrophilic
headgroup and two hydrophobic hydrocarbon tails. When exposed
to an aqueous solution, the lipid molecules arrange themselves in
order to minimize the contact between the hydrophobic tails and
water, thus resulting in the formation of a lipid bilayer sheet [4, 5]. To
further minimize surface energy, the lipid bilayer sheet will curl up
to finally form a closed spherical compartment, thereby eliminating
high-energy edges from the bilayer sheet. If enzyme molecules are
added to the solution used to hydrate the lipid mixture, then the
spontaneous compartmentalization of the lipid bilayer will lead to
the random confinement of enzymes into single liposomes [6].
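To illustrate what random confinement implies for occupancy, the following Python sketch estimates the mean number of enzyme molecules trapped in a liposome and the probability of finding exactly k molecules. It assumes that encapsulation follows Poisson statistics, that the lumen is a sphere whose diameter equals the liposome diameter (bilayer thickness is ignored), and that the enzyme concentration in the lumen equals that of the hydration solution. These are simplifying assumptions of ours, and the 1 µM concentration and 100 nm diameter are illustrative values.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def lumen_volume_l(diameter_nm: float) -> float:
    """Volume (liters) of a sphere of the given diameter (nm).

    Bilayer thickness is ignored, so this slightly overestimates the lumen.
    """
    radius_m = diameter_nm * 1e-9 / 2.0
    return (4.0 / 3.0) * math.pi * radius_m ** 3 * 1e3  # m^3 -> L


def mean_occupancy(conc_molar: float, diameter_nm: float) -> float:
    """Expected number of enzyme molecules per liposome."""
    return conc_molar * AVOGADRO * lumen_volume_l(diameter_nm)


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k molecules under the Poisson assumption."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


if __name__ == "__main__":
    mean = mean_occupancy(conc_molar=1e-6, diameter_nm=100.0)  # assumed values
    print(f"mean occupancy: {mean:.2f} molecules per liposome")
    for k in range(4):
        print(f"P({k} molecules) = {poisson_pmf(k, mean):.3f}")
```

With these illustrative values the mean occupancy is below one molecule per liposome, so most compartments would be empty and only a minority would hold a single enzyme. This is one reason why hydration concentration and liposome size have to be chosen together.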
Liposomes come in a great variety of sizes, with diameters ranging
from as small as 20 nm up to several microns [7, 8]. Even though
great efforts have been made to control the liposome compartment
size precisely, sample polydispersities of up to 50% are still
common [9]. The lipid bilayer constitutes a biomimetic
interface, which serves to increase the stability of the enzyme
residing within the aqueous lumen [10]. A great number of different
enzymes have been successfully encapsulated inside liposomes,
such as alkaline phosphatase [11], α-amylase [12], asparaginase
[13], chymotrypsin [14], β-galactosidase [15], rulactine [16], and
δ-aminolevulinate dehydratase [17], just to mention a few.
However, to transform liposomes into a viable platform for
screening of enzyme function, two challenges have to be overcome:
(i) addressability of individual enzyme mutants and (ii) reagent
exchange at the nanoscale.

22.2.1.1 Addressability
One way to achieve addressability of individual liposomes is to
immobilize them on a planar solid substrate [18] (see Fig. 22.1A).
Owing to the nanoscopic dimensions of the liposomes, a great number
of single compartments can be accommodated on a relatively
small area (up to 4 million/mm²), thus allowing the performance
of individual liposome-encapsulated enzymes to be monitored in
parallel by microscopy imaging [19] (see Fig. 22.1E). However, a
drawback of this approach is that extraction of the liposome of
interest from the remaining liposomes is a daunting task, which so
far has not been realized. Although one might envision micropipette
aspiration of liposome content from the solid substrate, this has
not been demonstrated so far, because it requires a high degree of
precision to extract a single nanoscopic liposome from a
densely packed surface. On the other hand, fluorescence-activated
cell sorting (FACS) has been successfully utilized to both monitor
and select single liposomes on the basis of the fluorescence signal
of the encapsulated protein [20] (see Fig. 22.1D). Even though FACS
provides a convenient way of addressing and selecting the contents
of a large number of nanoreactors, the directed evolution of enzymes
in liposomes still remains to be demonstrated by this approach.

Figure 22.1 Nanoscale compartmentalization using liposomes. (A) Sketch
of surface-immobilized liposomes encapsulating enzyme molecules. The
liposomes are attached to a solid substrate by molecular tethers displayed
on the liposome membrane. (B) Sketch of reagent mixing between
individual liposomes. Fusogenic liposomes containing substrate molecules
are added to surface-immobilized liposomes encapsulating enzymes. The
substrate is delivered to the enzyme upon membrane fusion between the
two liposome populations. (C) Sketch of localized reagent mixing inside a
single liposome. A large unilamellar liposome encapsulates smaller
liposomes containing enzyme and substrate molecules, respectively. Once the
temperature is increased to the lipid bilayer phase transition temperature
of the smaller liposomes, the enzyme and the substrate are released into
the lumen of the large liposome. (D) Sketch of encapsulation of a plasmid
library in liposomes. Single plasmids are transcribed and translated into
proteins localized to the lumen of individual liposomes. By passing the
liposome population through a fluorescence-activated cell-sorting device,
liposomes encapsulating relevant proteins can be selected. (E) Fluorescence
micrograph of a high-density array of liposomes immobilized on a solid
substrate. Reprinted with permission from Ref. [19]. Copyright © 2003
WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim. (F) Content mixing
of single liposomes using micromanipulators. A liposome with a green
fluorescent dye is mixed with another liposome containing a red fluorescent
dye. Upon mixing, the fused liposome complex turns orange. Reprinted with
permission from Ref. [27]. Copyright © 2006, American Chemical Society.
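The packing density quoted in this section (up to 4 million liposomes per mm²) can be translated into a typical spacing between neighboring liposomes and into the number of compartments within one microscope field of view. The Python sketch below does this under the simplifying assumption that the liposomes sit on a regular square grid. The 100 µm × 100 µm field of view is an illustrative value of ours, not one reported in the cited work.

```python
import math


def mean_spacing_um(density_per_mm2: float) -> float:
    """Center-to-center spacing (um) if liposomes sat on a square grid."""
    area_per_liposome_um2 = 1e6 / density_per_mm2  # 1 mm^2 = 1e6 um^2
    return math.sqrt(area_per_liposome_um2)


def liposomes_in_field(density_per_mm2: float,
                       field_width_um: float,
                       field_height_um: float) -> float:
    """Number of liposomes expected within a rectangular field of view."""
    field_area_mm2 = field_width_um * field_height_um * 1e-6
    return density_per_mm2 * field_area_mm2


if __name__ == "__main__":
    density = 4e6  # liposomes per mm^2, the upper value quoted in the text
    print(f"spacing: {mean_spacing_um(density):.2f} um")
    # Assumed field of view of 100 um x 100 um
    print(f"per field: {liposomes_in_field(density, 100.0, 100.0):.0f}")
```

At the maximum density the grid spacing comes out at half a micrometer, comparable to the resolution of conventional optical microscopy. This is consistent with the difficulty of physically extracting a single liposome from such a surface.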

22.2.1.2 Reagent exchange


Although the membrane of liposomes does not readily permit
exchange of contents between the liposome lumen and the exterior
environment, a few approaches have been developed in order to
deliver molecules across the bilayer. The simplest way to achieve
permeability is by adjusting the temperature to the phase transition
temperature of the lipid bilayer [21]. At the phase transition
temperature, the membrane becomes permeable, and fluid contact
with the bulk may be established by diffusion through nanoscopic
pores in the bilayer [22, 23]. The size of the pores determines
whether an initially encapsulated enzyme will remain inside or
escape into the bulk, but recent experiments demonstrate how the
pore size can be adjusted as a function of temperature and lipid
chemistry [24].
Yet another approach based on temperature employs two sets
of liposomes: a large liposome is applied as a reaction chamber
in which both the enzyme and other, smaller liposomes are
encapsulated [25] (see Fig. 22.1C). The smaller liposomes were
prepared to exhibit a phase transition temperature lower than that
of the larger liposome, thus allowing for specific cargo release from
the small liposomes once the temperature was raised. In this way, it
was shown how the substrate could be delivered in a controlled and
reproducible fashion to enzymes residing in the lumen of the large
liposome.
More sophisticated reagent exchange has been made possible by
in situ micromanipulation of large (R > 5 μm) surface-immobilized
liposomes [26]. In this case, sets of liposomes containing enzyme
and substrate, respectively, were immobilized on a planar surface
[27]. Next, a micropipette was used to pull out a thin (R = 30–100 nm)
lipid nanotube from the immobilized liposome. By moving
the micropipette to the position of another immobilized liposome,
the two could be connected through the lipid nanotube, thereby
establishing fluid contact (see Fig. 22.1F). Furthermore, micropipettes
have been used to pick up individual liposomes, bring them
into close contact, and trigger their fusion by application of an
alternating electric field [28]. Even though both of the aforementioned
micropipette-based techniques provide an elegant and highly
controlled way of mixing femto- to attoliter volumes, it is not likely
that they can be implemented in the high-throughput screening
of enzymes, mainly due to the time-consuming nature of the
micromanipulation process, which would require years to mix the
contents of, for example, one million liposomes.
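The time scale of serial micromanipulation follows from simple arithmetic. The Python sketch below estimates the time needed to mix one million liposome pairs one at a time. The duration of a single manipulation is not stated in the text, so the per-operation times used here are purely illustrative assumptions, and the estimate presumes uninterrupted operation.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def serial_mixing_time_years(n_operations: float,
                             seconds_per_operation: float) -> float:
    """Total time (years) to perform n operations one after another."""
    return n_operations * seconds_per_operation / SECONDS_PER_YEAR


if __name__ == "__main__":
    n_liposomes = 1e6  # library size used as the example in the text
    # Assumed durations of one micromanipulation step, in seconds
    for seconds in (10.0, 60.0, 300.0):
        years = serial_mixing_time_years(n_liposomes, seconds)
        print(f"{seconds:>5.0f} s per operation -> {years:.1f} years")
```

Even at an assumed one minute per operation, continuous work would take close to two years, which illustrates why serial approaches do not scale to library screening.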
On the other hand, recent studies have demonstrated fast and
parallel, although random, mixing of millions of zeptoliter liposome
volumes. Here, a great number of liposomes were immobilized on
a solid substrate to provide addressability and parallel screening
using fluorescence imaging [29–31]. Next, another set of complementary
liposomes was added, which subsequently fused to their
surface-immobilized counterparts, thus delivering their cargo (see
Fig. 22.1B). Mixing could be triggered by, for example, incorporation
of complementary DNA sequences in the liposome membrane [32],
complementary attachment proteins [29, 31, 33], or oppositely
charged lipids [30].
In conclusion, liposomes have been utilized extensively in
combination with various enzyme variants but mainly (i) to provide
a more compatible environment for enzymatic catalysis or (ii) to
study the function of single or few enzymes. Thus far, only a
few research groups have managed to set up directed evolution of
proteins using flow cytometric screening and sorting of liposomes.

22.2.2 Polymersomes and Virus-Like Particles


Polymersomes are nanoscale spherical containers inspired by
liposomes but built from artificial polymer units exhibiting the same
self-organizing property as phospholipids [34]. It is thus possible
to add more chemical functionality to polymersomes than is possible
with liposomes, although at the cost of reduced biocompatibility.
For example, chemical switches can be built into the polymersome
membrane in order to induce permeability, which has been exploited
in the context of confined enzymatic reactions [35, 36]. It has been
shown that enzymatic cascade reactions could be controlled by
preparing distinct populations of polymersomes carrying different
enzymes in their lumen. The polymersome membranes had been
designed to be responsive to pH such that by increasing the
pH of the solution, the membranes would become permeable
[35]. Consequently, enzymatic cascades could be designed and
activated simply by increasing the pH, thus allowing substrate
molecules to enter the polymersome lumen and reach the enzyme residing
within. Furthermore, by constructing nested systems comprising
enzymes-in-polymersomes-in-polymersomes (i.e., small enzyme-
containing polymersomes are encapsulated in a large substrate-
containing polymersome) coupled enzymatic reactions could be
carried out inside single polymersomes, thus downscaling reaction
compartments drastically [36].
A challenge faced by liposomes and polymersomes is how
to prepare populations exhibiting homogeneous sizes. The self-
assembly process of amphiphiles is a stochastic process driven
by minimization of surface energy, which leads to polydisperse
particle populations. On the other hand, virus particles and virus
capsids exhibit tightly controlled size distributions but are also
constructed by spontaneous (and stochastic) self-assembly. This
is possible due to the relatively more complex protein subunits
(as compared to phospholipids) comprising the viral capsid, which
assemble into a crystal-like spherical protein shell of highly defined
geometry. This feature has been exploited to achieve homogeneous
compartmentalization and organization of enzymes on the surface
of, for example, the cowpea mosaic virus capsid [37]. Furthermore,
the virus capsid of the cowpea chlorotic mottle virus can be
rendered semipermeable to small molecules by changes in pH,
which was recently exploited to confine single enzymes in the
capsid lumen and continuously feed them the substrate [38]. Due
to their small (20–100 nm) and homogeneous size, virus capsids
exhibit a great potential in terms of downscaling and parallelizing
biochemical reactions; however, the main drawback is that sorting
and selection cannot be conducted in any straightforward manner,
since especially small virus capsids are at the limit of detection using
flow cytometric approaches [39].

22.2.3 Water-in-Oil Emulsion Droplets


Micron-sized aqueous droplets dispersed in an oil phase have been
applied extensively for the directed evolution of proteins/enzymes
and a range of other applications [40–42]. Like liposomes, water-
in-oil emulsion droplets provide an aqueous lumen shielded from
the external environment and are prepared by spontaneous self-
assembly. Emulsion droplets form spontaneously when an aqueous
phase is slowly added to an agitated oil phase containing small
amounts of surfactant molecules. The surfactant molecules serve
as droplet stabilizers by exposing their hydrophobic parts to the
oil phase and their hydrophilic parts to the aqueous phase. This
prevents to a great extent the fusion of droplets and hence a colloidal
suspension is obtained (see Fig. 22.2A,B). Emulsions prepared
in this way exhibit a great degree of polydispersity with sizes
ranging from a few hundred nanometers to several tens of microns
[43]. However, more narrow size distributions can be obtained by
processing the colloidal suspension with, for example, ultrasound or
mechanical agitation [44] (see Fig. 22.2C). More recently, production
of ultra-mono-disperse emulsion droplets has been realized by
applying microfluidic channels, as will be discussed in more detail
in Section 22.3.1.
The term “in vitro compartmentalization” has been coined for
this kind of colloidal suspension and is almost exclusively applied
to water-in-oil emulsion droplets, although many other approaches
that do not apply emulsions are also capable of providing compart-
mentalization on the same length scale. High-throughput screening
of enzyme libraries is accomplished by compartmentalization of
dilute DNA libraries into emulsion droplets [44, 45]. The dilution
of the DNA library has to be adjusted according to the average
compartment volume such that approximately a single DNA mole-
cule becomes encapsulated in a single emulsion droplet. If the DNA
library is co-encapsulated with cell-free transcription/translation
mixture and enzyme activity assay reagents, then enzyme variants
will become expressed from the DNA template and produce a
fluorescence signal proportional to their activity (see Fig. 22.2D).
However, to ensure the integrity of the library, only a single DNA
molecule is to become encapsulated in an emulsion droplet, and
hence the yield of the cell-free expression can be quite low. Enzyme
copy numbers below 100 are usually achieved [45], which puts more
pressure on the signal-to-noise ratio of the enzyme activity assay. To
circumvent this limitation, it would be possible to apply bead-based
emulsion polymerase chain reaction (PCR), in which an additional
solid phase is introduced in the setup, as has been demonstrated in
genotyping and sequencing experiments [46, 47]. The solid phase
comprises micron-sized colloid particles functionalized with DNA
capture elements, which are encapsulated into individual emulsion
droplets. The advantage of this approach is that a two-step reaction
can be carried out. In the first step, the colloid particle, a single
DNA molecule and PCR reagents are encapsulated within emulsion
droplets. Subsequently, the population of emulsion droplets can
be thermocycled to achieve monoclonal amplification and
subsequent attachment of the resulting amplicons to the colloid
particle. In the next step, the emulsion is broken and the DNA-
functionalized beads are washed and recovered. Next, a new
emulsion may be formed in which the beads are co-encapsulated
with the cell-free expression mixture and enzyme activity reagents.
Because the DNA has been amplified by several orders of magnitude,
the expressed enzymes also become more abundant, which lowers
the performance requirements of the activity assay. This approach
would exhibit the further advantage that because a bead suspension
is generally more stable than an emulsion, the recovered beads may
be stored for an extended period of time. Hence, not only the actual
enzyme activity screen can be conducted at a more convenient time,
but also aliquots of the bead-DNA library may be stored for quality
control or further investigations.
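The dilution requirement described above can be illustrated with a short calculation. The sketch below assumes that DNA molecules are distributed over droplets according to Poisson statistics, which is a common model for random encapsulation but is not stated explicitly in the text. The droplet volume and the mean occupancies used in the demonstration are hypothetical values chosen only for illustration.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of finding exactly k DNA molecules in one droplet."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def occupancy_summary(mean: float) -> dict:
    """Fractions of empty, singly and multiply occupied droplets (mean > 0)."""
    if mean <= 0:
        raise ValueError("mean occupancy must be positive")
    empty = poisson_pmf(0, mean)
    single = poisson_pmf(1, mean)
    multiple = 1.0 - empty - single
    return {
        "empty": empty,
        "single": single,
        "multiple": multiple,
        # share of occupied droplets that hold more than one template
        "multiple_among_occupied": multiple / (1.0 - empty),
    }


def dna_concentration_molar(mean: float, droplet_volume_l: float) -> float:
    """Bulk DNA concentration (mol/L) giving `mean` molecules per droplet."""
    return mean / (AVOGADRO * droplet_volume_l)


if __name__ == "__main__":
    droplet_volume_l = 5e-15  # hypothetical 5 fL droplet
    for mean in (0.1, 0.3, 1.0):  # hypothetical mean occupancies
        s = occupancy_summary(mean)
        conc_pm = dna_concentration_molar(mean, droplet_volume_l) * 1e12
        print(f"mean={mean:.1f}  DNA={conc_pm:8.1f} pM  "
              f"empty={s['empty']:.3f}  single={s['single']:.3f}  "
              f"multiple={s['multiple']:.3f}")
```

Under this model, lowering the mean occupancy reduces the share of droplets containing more than one template, at the price of a larger share of empty droplets, which is one way of reading the trade-off between library integrity and yield mentioned above.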

Figure 22.2 Enzyme library screening using emulsion droplets. (A) Sketch
of water droplets dispersed in an oil phase. The water-in-oil emulsion
droplets are stabilized by surfactant molecules added to the oil phase.
(B) Sketch of reagent delivery to water-in-oil emulsion droplets using
nanodroplets. (C) Sketch of enzyme expression from encapsulated single
cells. First, single cells are incubated in water-in-oil emulsion droplets
to express genetically encoded enzymes. Next, the water-in-oil emulsion
droplets are transformed into water-in-oil-in-water emulsion droplets to
allow for flow cytometric measurements of enzyme activity. (D) Sketch
of fluorescence-activated cell-sorting screening of water-in-oil-in-water
emulsion droplets hosting an enzyme library. Single droplets are passed
through a fluorescence detector. The bulk liquid is then partitioned
into smaller droplets, which are deflected electrostatically depending on
the measured fluorescence response. (E) Bright-field micrograph of a
population of water-in-oil emulsion droplets dispersed on a solid substrate
(top). The emulsion droplets exhibit a wide range of sizes, and only
the largest ones give rise to a fluorescence signal from an encapsulated
fluorescent probe (bottom). Scale bar is 10 μm. Reprinted by permission
from Macmillan Publishers Ltd: Nat. Methods (Ref. [44]), copyright (2006).

22.2.3.1 Addressability
The method of choice for screening and selection of promising
enzyme variants is FACS, a variant of flow cytometry capable of
sorting emulsions on the basis of their fluorescence intensity. Using
FACS, great numbers of emulsion droplets can be passed by a
fluorescence detector in single file, while the sorting is achieved by
applying an electrostatic charge on the droplets and passing them
through an electrostatic deflection system [41] (see Fig. 22.2E).
In this way, several million emulsion droplets can be inspected
and sorted in a few hours, which has led to the discovery of a
number of novel and highly efficient enzyme variants, including
β-galactosidase [48], thiolactonase [49], and phosphotriesterase [50].
Even though the high throughput of FACS enables access to
screening of highly diverse enzyme libraries, it comes at the cost of
assay sensitivity and flexibility. For example, to screen one million
droplets in one hour, each droplet will only be allowed less than
2 ms detection time. Consequently, only high signal-to-noise-ratio
assays can be utilized to screen the performance of a large library.
Furthermore, because most FACS equipment only accepts particles
suspended in aqueous solution, the water-in-oil emulsion droplets
need to be reformulated into water-in-oil-in-water droplets. This
is typically achieved by extruding the initial water-in-oil emulsion
through narrow membrane pores into an aqueous phase [44]. This
procedure allows the initial aqueous compartments to retain a thin
oil/surfactant layer, which on one side faces the enzyme solution and
on the other side the bulk aqueous phase.
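The time budget behind the FACS example can be sketched as follows. Screening one million droplets in one hour leaves an average slot of 3.6 ms per droplet, and the detection window is the part of that slot during which the droplet is actually in front of the detector. The `occupied_fraction` parameter is a hypothetical quantity introduced here for illustration, not a value taken from the text.

```python
def interval_per_droplet_ms(n_droplets: int, screening_time_s: float) -> float:
    """Average time slot available per droplet, in milliseconds."""
    return screening_time_s / n_droplets * 1e3


def detection_window_ms(n_droplets: int, screening_time_s: float,
                        occupied_fraction: float) -> float:
    """Detection time per droplet if droplets occupy only a fraction of the
    stream passing the detector (the rest is spacing between droplets)."""
    if not 0.0 < occupied_fraction <= 1.0:
        raise ValueError("occupied_fraction must be in (0, 1]")
    return interval_per_droplet_ms(n_droplets, screening_time_s) * occupied_fraction


def screening_time_h(n_droplets: int, rate_hz: float) -> float:
    """Hours needed to pass n_droplets through the detector at rate_hz."""
    return n_droplets / rate_hz / 3600.0


if __name__ == "__main__":
    slot = interval_per_droplet_ms(1_000_000, 3600.0)
    # occupied_fraction = 0.5 is an assumed value for illustration only
    window = detection_window_ms(1_000_000, 3600.0, occupied_fraction=0.5)
    print(f"slot per droplet: {slot:.2f} ms, detection window: {window:.2f} ms")
```

With an assumed occupied fraction of one half, the detection window comes out just under 2 ms, consistent with the order of magnitude quoted above.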

22.2.3.2 Reagent exchange


Another challenge faced by the emulsion droplet technology is the
difficulty of adding further contents to existing droplets.
In the case where microbeads are utilized as a solid phase to capture
DNA or proteins, this is less of an issue because emulsions can be
broken and reformed, while still retaining the integrity of the library.
In addition, two alternative strategies have been devised to enable
content delivery to emulsion droplets. The first approach relies on
encapsulating the reagents destined for delivery, for example, an
enzyme substrate, into nanosized emulsion droplets [51]. These
nanodroplets are formed by increasing the surfactant content and
by subjecting the emulsion to vigorous mechanical agitation (see
Fig. 22.2F). The increased surfactant concentration stabilizes the
higher curvature of the nanodroplets but still enables them
to fuse and deliver their content to the larger emulsion droplets
containing the enzyme library.
The other approach is based on chemical destabilization of
the oil/water/surfactant interface using organic solvents such
as dimethylsulfoxide, dimethylformamide, or ethanol/methanol
mixtures. The reagents to be delivered are dissolved at a high
concentration in the organic solvents, and next a small volume of
the mixture is added to the emulsion. Upon mixing, the reagents will
partition into the existing emulsion droplets without disrupting the
integrity of the individual compartments.
In conclusion, directed evolution of proteins and enzymes inside
water-in-oil(-in-water) emulsion droplets has been highly success-
ful in terms of discovering biomolecules with novel functions.
However, the community still faces challenges with respect to (i)
homogeneous droplet formation, (ii) reproducible and homoge-
neous reagent delivery, and (iii) increase of the throughput without
compromising assay quality. Bead-assisted library generation and
screening holds the greatest potential because it enables a greater
degree of control over the reaction conditions. Nevertheless,
significant improvements of emulsion droplet technology have been
achieved recently by utilizing top-down (instead of bottom-up)
droplet production by use of microfabricated microfluidic flow
cytometric devices, as will be discussed in further detail in the next
section.

22.3 Microfabricated Chip Devices for Enzyme Compartmentalization and Screening

The fusion of dry solid-state microfabrication technology with
wet (bio)chemistry has opened up new opportunities for sample
(i) volume downscaling, (ii) throughput, and (iii) automation.
By utilizing microfabrication technology, highly reproducible and
uniform processes can be achieved, which is not possible with
spontaneous and random self-assembly processes, such as the ones
described in the previous sections. One of the most successful
approaches to high-throughput enzyme screening and selection
relies on miniaturized flow cytometers combined with emulsion
droplet generators [52]. However, other approaches based on
compartmentalized microarrays also show great promise due to
their relatively high throughput and low reagent consumption but
most importantly the ability to conduct screening in parallel, as
opposed to sequential screening with flow cytometers. In the next
sections, we will discuss a few of the aforementioned top-down
techniques and compare them with each other as well as with their
bottom-up counterparts.

22.3.1 Microfluidic-Generated Emulsion Droplets


Inspired by the water-in-oil emulsion droplet compartmentalization
technology described earlier, the microfluidic community adapted
this approach to enable highly controlled and homogeneous droplet
generation in combination with sophisticated liquid micromanip-
ulation and screening equipment. Unlike bulk emulsion droplets,
microfluidic-generated emulsion droplets are formed one at a time
by combining liquid streams of water and oil. A popular method for
microfluidic droplet generation is termed “flow focusing” [53], in
which two streams of an oil phase are combined perpendicular to
one stream of an aqueous phase. Because the flow of three streams
is focused into a single stream, the aqueous phase will be pinched
off by the oil phase, thus forming individual emulsion droplets (see
Fig. 22.3A). By fine-tuning the channel geometry and the relative
flow rates of each of the two phases, a great range of droplet sizes
can be reproducibly generated [54]. The droplets generated by such
a device typically exhibit less than 3% polydispersity and may be
generated at rates up to 10,000 per second (10 kHz). Furthermore,
several aqueous streams may be combined into a single one prior
to droplet generation, so that reagent mixing only takes
place immediately before compartmentalization.
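The relation between droplet size, flow rate, and generation frequency can be estimated with a simple volume balance. The sketch below assumes spherical droplets and assumes that all of the aqueous flow ends up in droplets; both are simplifications introduced here, and the 50 μm diameter is a hypothetical example value.

```python
import math


def droplet_volume_pl(diameter_um: float) -> float:
    """Volume of a spherical droplet in picoliters (1 um^3 = 1 fL)."""
    volume_fl = math.pi / 6.0 * diameter_um ** 3
    return volume_fl / 1000.0


def generation_rate_hz(aqueous_flow_ul_per_h: float, diameter_um: float) -> float:
    """Droplets per second if the whole aqueous stream is broken into droplets."""
    flow_pl_per_s = aqueous_flow_ul_per_h * 1e6 / 3600.0
    return flow_pl_per_s / droplet_volume_pl(diameter_um)


def required_flow_ul_per_h(rate_hz: float, diameter_um: float) -> float:
    """Aqueous flow rate needed to sustain a given generation frequency."""
    return rate_hz * droplet_volume_pl(diameter_um) * 3600.0 / 1e6


if __name__ == "__main__":
    d_um = 50.0  # hypothetical droplet diameter
    print(f"droplet volume: {droplet_volume_pl(d_um):.1f} pL")
    print(f"flow for 10 kHz: {required_flow_ul_per_h(1e4, d_um):.0f} uL/h")
```

Under these assumptions, a 50 μm droplet holds about 65 pL, so sustaining 10 kHz generation would require an aqueous flow of roughly 2.4 mL per hour.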

Figure 22.3 Microfluidic microdroplet devices. (A) Sketch of an emulsion
droplet generator based on flow focusing of two oil streams and one water
stream. The pressure profile at the junction forces the water phase to break
up into well-defined droplets. (B) Sketch of reagent delivery to emulsion
droplets confined to a microfluidic channel. The pico-injection module
consists of a channel with reagents, which can be specifically added to
droplets by increasing the pressure temporarily on the injection channel.
The pico-injector may be connected to a detector such that injection can
be triggered for specific droplets. (C) Sketch of on-chip droplet sorting.
Droplets are passed through a fluorescence detector in single file, which may
trigger an electrical pulse from an electrode downstream. The electric field
deflects the droplets into a specific channel for subsequent recovery. (D)
Sketch of a droplet detection chamber. A narrow channel is expanded into
a large chamber to store a large number of droplets for subsequent imaging
analysis. (E) Annotated bright-field micrograph of a microfluidic sorting
device using electrodes to deflect single droplets. Reproduced from Ref.
[60] with permission of PNAS. (F) Combined bright-field and fluorescence
micrograph acquired from a droplet detection chamber. Droplets can be
classified according to their fluorescence signal in a high-throughput format
by large-scale imaging. Scale bar is 100 μm. Reproduced from Ref. [71],
Copyright (2011), with permission of the Royal Society of Chemistry.

However, in many cases precise temporal and spatial control
of reagent mixing is desirable, for example, in order to trigger a
certain chemical reaction. This requires that the mixing takes place
at the level of individual droplets, that is, subsequent to the droplet
formation. Various methods to induce droplet mixing have been
demonstrated, and they may be classified as either active or passive
mixing. Passive mixing is predominantly achieved by shaping the
channel geometry so as to mediate fusion between adjacent
droplets [55]. For example, pairs of droplets can be allowed to enter an
expansion channel and subsequently be forced through a narrow
channel such that spontaneous droplet fusion proceeds. This is
due not only to the induction of high droplet curvatures at the
interface but also to the narrow channel reducing the amount
of the stabilizing oil/surfactant phase present at the droplet contact
zone. Even though successive fusion events between more than
two droplets can be realized, this is still not readily achievable
experimentally due to the more complex channel architectures that need
to be fabricated.
Active mixing, on the other hand, puts fewer requirements on
channel architecture but requires a higher degree of integration
with external devices. Active mixing has been demonstrated by
localized destabilization of the interface between two adjacent
droplets. Destabilization can be achieved by transmitting strong
electric pulses [56] or heat pulses [57] to droplet pairs in close
proximity. In another approach, termed “pico-injection,” another
channel containing mixing reagents is positioned perpendicular to
the main channel conveying the droplets [58] (see Fig. 22.3B).
To induce mixing between the droplet contents and the mixing
reagents, a pressure pulse can be applied to the injection channel. In
this way, a meniscus of mixing reagents will protrude into the main
channel and thus be combined with the next droplet passing by the
injector module. An injection channel is more easily implemented
than the aforementioned active mixing methods, because it only
relies on accurate flow control and does not require the integration
of an additional electric/heating controller.
The advantage of active over passive mixing is that reagents
can be specifically delivered to a subset of the droplet population,
whereas passive mixing devices deliver reagents to all droplets
passing through the mixing module. However, to achieve functional
active mixing a great degree of droplet synchronization is required,
which calls for incorporation of additional hardware elements to
conduct real-time signal processing and signal triggering.
Examples of enzyme screening in emulsion droplets are not
as numerous for microfluidic chip devices as they are for their
bulk counterparts. Successful attempts at on-chip directed protein
evolution have been made mainly on the basis of encapsulation of
live cells in emulsion droplets [59–61]. Individual cells harbor a
specific genetically encoded enzyme variant, which upon expression
becomes secreted into the droplet lumen. By co-encapsulating
a fluorogenic enzyme substrate, the droplets could be sorted
depending on their relative activity. In one of the examples of this
approach, Agresti et al. employed yeast cells to express variants of
horseradish peroxidase (HRP) and to display them on the outer cell
surface [60]. From an initial library of 10⁷, it was possible to select
HRP variants exhibiting 5–12 times higher activity than the parent
enzyme. The device employed by Agresti et al. consisted of two
subdevices, one for droplet generation and one for droplet sorting.
First, the yeast cells would become encapsulated, next they were
collected off-chip and allowed to incubate, and, finally, they would
be injected onto the sorting chip in order to extract and identify
the best-performing enzyme variants. Because two subdevices were
employed, a long incubation time could be achieved, thus allowing
cells to express higher amounts of enzyme, which, in turn, leads
to a higher signal-to-noise ratio of the activity assay. Having only
one device would not have permitted extended incubation, because
the on-chip residence time is determined by the total channel
length and the flow rates applied. Because rapid droplet generation
requires relatively high flow rates, typically droplets will only
reside 5–10 min on the chip before they exit the chip again. Apart
from cell-assisted enzyme expression and screening, cell-free
expression of, for example, β-galactosidase has also been achieved in
microfluidic droplet chips [62]. However, because the optimum
conditions for transcription/translation can differ substantially
from those of the enzyme activity assay, it would be necessary to
fuse the initial droplets containing the expressed enzymes to larger
droplets encapsulating enzyme-screening reagents. In this way, the
in vitro expression mixture would be diluted and the reaction
conditions would be adjusted to the optimum for the activity
screen.
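The dependence of on-chip residence time on channel length and flow rate can be made concrete with a plug-flow estimate, in which the residence time equals the channel volume divided by the total volumetric flow rate. This is a simplification introduced here, and the rectangular channel dimensions and flow rate below are hypothetical values, not taken from the devices discussed.

```python
def channel_volume_ul(width_um: float, height_um: float, length_mm: float) -> float:
    """Volume of a rectangular channel in microliters.

    1 um * 1 um * 1 mm = 1e-15 m^3 = 1e-6 uL.
    """
    return width_um * height_um * length_mm * 1e-6


def residence_time_s(width_um: float, height_um: float, length_mm: float,
                     total_flow_ul_per_h: float) -> float:
    """Plug-flow residence time: channel volume divided by total flow rate."""
    volume = channel_volume_ul(width_um, height_um, length_mm)
    return volume / total_flow_ul_per_h * 3600.0


def required_length_mm(width_um: float, height_um: float,
                       total_flow_ul_per_h: float, residence_s: float) -> float:
    """Channel length needed to reach a target residence time."""
    volume_needed_ul = total_flow_ul_per_h * residence_s / 3600.0
    return volume_needed_ul / (width_um * height_um * 1e-6)


if __name__ == "__main__":
    # Hypothetical channel: 100 um wide, 50 um high, 1 m long, 1000 uL/h total flow
    t = residence_time_s(100.0, 50.0, 1000.0, 1000.0)
    length = required_length_mm(100.0, 50.0, 1000.0, residence_s=600.0)
    print(f"residence time: {t:.0f} s")
    print(f"length for 10 min: {length / 1000.0:.1f} m")
```

For these assumed values, a one-meter channel holds the droplets for only 18 s, and a 10 min residence would require about 33 m of channel, which illustrates why extended incubation is easier to carry out off-chip.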
Droplet inspection and sorting are to a great extent inspired by
conventional FACS technology, thus leading to the analogous term
“fluorescence-activated droplet sorting” (FADS) [59]. However, due
to the great degree of customization of microfluidic platforms, new
approaches to sorting have been implemented, allowing for
more versatile selection of promising droplets. For example, apart
from electric deflection of droplets, deflection using surface
acoustic waves [63] or localized heating [64] has also been utilized. In
other cases, when magnetic particles have been encapsulated in the
droplet lumen, magnetic fields have been able to deflect droplets
as they pass by a sorting module [65]. Nevertheless, fluorescence-
activated droplet dielectrophoresis [59] has so far been the
best-performing sorting technology, with sorting rates equaling
those of flow cytometry (300–2000 droplets per second) (see
Fig. 22.3C,D).
On-chip droplet inspection and characterization are mainly conducted
by use of fluorescence measurements, as is the case with
FACS technology. However, alternative detection methods, including
electrochemistry [66], electrophoresis [67], mass spectrometry [68],
and Raman scattering [69], have been utilized for real-time monitoring
of single-droplet content. All of the aforementioned
techniques constitute label-free detection methods and thus offer
more in-depth analysis than what is possible with fluorescence-based
screening. The main dilemma of droplet technology is the
inverse relationship between sample throughput and assay time
duration. For high-throughput devices, the individual droplet is
allowed only a submillisecond residence time in front of the detector,
which is too short a time frame (considering the small droplet
volume) for more in-depth measurements on the basis of, for
example, electrochemistry or Raman spectroscopy. To enable longer
detection times, hybrid devices comprising a droplet-generation
module and a droplet detection chamber have been constructed [70,
71]. The droplet detection chamber provides a large open volume,
in which a great number of droplets can reside for extended
time periods (minutes to hours) (see Fig. 22.3E,F). Consequently,
droplet inspection can be carried out using imaging-based detectors,
which allows longer detection times and better spectral resolution.
The main drawback of droplet detection chambers is that droplets
exhibiting a given trait can be difficult to extract from the chip again,
because the initial positions of the droplets become scrambled once
they enter and exit the detection chamber.

22.3.2 Microfabricated Arrays


Microfabricated arrays are similar to the aforementioned droplet
detection chambers because they allow a great number of samples
to be organized and inspected simultaneously. In certain cases,
microfabricated arrays have been combined with droplet generators
to enable droplet capture and triggered release [72]. This was
accomplished by preparing soft polymer arrays displaying U-shaped
features matching the diameters of the injected droplets. Hence,
droplets injected onto the U-shaped protrusions would become
captured, while droplets approaching occupied features would be
deflected. To release the droplets again, the flow would be reversed,
thus pushing them out of the features.
For a microarray to be functional in terms of enzyme screening,
compartmentalization of the enzyme is a necessity. Consequently,
the first generation of DNA [73] and peptide microarrays [74] was
not immediately applicable to enzyme screening because it did
not display fluidically isolated reaction compartments. The next
generation of microarrays is still in its infancy, but it holds great
promise for high-throughput screening purposes because it allows
for high sample throughput, assay flexibility, and in some cases also
the ability to exchange reagents in a straightforward manner. In the
following sections, we will mention a few of the next-generation
microarrays and comment on their potential for enzyme screening.

22.3.2.1 Optical fiber microarrays


Optical fiber bundles have found biotechnological applications,
because when treated properly they will form dense femtoliter
volume arrays [47, 75–77]. This is due to the chemical differences
between the optical fiber core and the cladding material. Upon
immersion of the fiber bundle in hydrochloric acid, the core will
etch at a faster rate than the cladding, thus leaving behind shallow
cavities. The compartment volume is in the low femtoliter range,
which is roughly 100 times smaller than the volume exhibited by
microfluidic emulsion droplets.
Among the first applications of etched optical fiber microarrays
was single enzyme screening, as demonstrated by Gorris and Walt
[78] (see Fig. 22.4A). Here a solution of HRP enzymes and a
fluorogenic substrate was enclosed in the fiber cavities and sealed
by a soft gasket to prevent evaporation and intercavity mixing. In
this way, single HRP molecules could be trapped inside individual
cavities, and reaction kinetics could be monitored by fluorescence
microscopy. The small scale of the cavities served a twofold purpose,
(i) to ensure stochastic encapsulation of single enzymes and (ii)
to provide signal amplification by fast up-concentration of the
fluorescing product molecules. For example, 1000 fluorophores
residing in a volume of 50 fL yield a concentration of approximately
30 nM, which is readily detectable with standard fluorescence
imaging detectors. Following this approach, it was possible to screen
reaction rates for several thousand single enzymes at a time, thus
reaching unprecedented measurement sensitivity.
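The up-concentration argument can be checked with a short calculation that converts a molecule count in a given volume to a molar concentration. The function name and the second, single-molecule example are illustrative additions; only the figures of 1000 fluorophores and 50 fL come from the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def concentration_nm(n_molecules: float, volume_fl: float) -> float:
    """Concentration in nanomolar of n_molecules confined to volume_fl femtoliters."""
    volume_l = volume_fl * 1e-15
    molar = n_molecules / (AVOGADRO * volume_l)
    return molar * 1e9


if __name__ == "__main__":
    # Example from the text: 1000 fluorophores in a 50 fL cavity
    print(f"{concentration_nm(1000, 50.0):.1f} nM")
    # Illustrative addition: a single molecule in the same cavity
    print(f"{concentration_nm(1, 50.0) * 1000.0:.1f} pM")
```

The first value evaluates to about 33 nM, in line with the approximately 30 nM quoted above.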
Instead of etching cavities, the entire core of the optical fiber
can be removed, thus establishing microcapillary arrays. Using
this approach large bacterial populations hosting diverse enzyme
libraries could be compartmentalized and screened in a high-
throughput format using fluorescence imaging [79]. In this way,
several million compartments could be screened in a few days,
and the most promising enzyme variants could be extracted by
micropipette aspiration from the array.

22.3.2.2 Elastomeric microarrays


Microarrays produced in soft materials, such as poly(dimethylsiloxane)
(PDMS), are preferable to etched optical fiber
bundles, because they can be produced from a master mold, thus
drastically reducing their cost and preparation time. PDMS is an
elastomer extensively used in microfabrication due to its unique
ability to replicate even nanoscopic features on the mold. In a
number of recent reports, low-cost PDMS chips have been designed to
exhibit the same features as etched optical fiber bundles, that is, an
array of micron-sized compartments [80, 81]. The arrays were then
used to trap single enzyme-linked antibodies attached to magnetic
microbeads, thus carrying out a highly sensitive enzyme-linked
immunosorbent assay (ELISA) (see Fig. 22.4B). To provide functional
compartmentalization, the enzyme-loaded array was infused
with an oil phase, which displaced the bulk aqueous solution, while
leaving femtoliter aqueous compartmentalized reaction volumes
inside the microcavities.

Figure 22.4 Applications of next-generation microarrays for enzyme
compartmentalization and screening. (A) Sketch of an etched optical fiber
array hosting single enzymes inside its cavities. Due to the microscopic
compartment volume even single enzymes may be detected. (B) Sketch
of an elastomeric microarray consisting of a large number of microcavities
containing magnetic beads. The magnetic beads are in turn used for
immunogenic capture and detection of single antibody–enzyme complexes.
(C) Sketch of an open surface-tension-based array. Hydrophilic spots on
an otherwise hydrophobic substrate allow micron-sized aqueous droplets
to become captured underneath an oil phase. Single bacterial cells can be
grown inside droplets and later aspirated by a piezo-driven micropipette
unit. (D) Sketch of a surface tension array integrated in a flow system.
When a liquid pulse is actuated across the array surface, droplets form
at the receding waterfront due to the hydrophobic/hydrophilic patterning.
(E) Bright-field micrographs of bacterial cells proliferating inside individual
droplets of an open surface tension array as in (C). Micrographs are
reproduced from Ref. [83]. Copyright © 2013 Iino, Matsumoto, Nishino,
Yamaguchi and Noji. (F) Fluorescence micrograph of GFP expressed from
droplet-encapsulated plasmid templates. Following synthesis, the histidine-
tagged GFP was immobilized on the substrate by specific capture on surface-
bound nitrilotriacetic acid elements. Next, cell-free expression reagents could
be removed by flushing the array.

In another variation of the elastomeric microarray, compartmentalized
cell-free expression of proteins was attempted [81]. To do
so, DNA coding for green fluorescent protein (GFP) was attached
to the surface of microbeads using solid-phase PCR. By varying
the reaction conditions, beads displaying between 1 and 10 copies
of the DNA template could be prepared and were subsequently
loaded into the microcavities of the PDMS microarray. Cell-free
expression reagents were compartmentalized along with the
DNA-functionalized beads, and the evolution of fluorescence intensity
could be followed over time as GFP molecules were synthesized
from the DNA template.
The ability to organize and synthesize protein libraries from
DNA is the first step in in vitro enzyme screening, as emulsion
droplet technology has demonstrated, but it needs to be combined
with a viable method to extract the desired enzyme(s) or cell(s).
Both optical fiber bundle arrays and PDMS microarrays have yet
to demonstrate extraction of material from the array. The main
obstacle here is that the arrays require sealing by a gasket, a layer
of oil, or both in order to reduce evaporation of the samples.
Sealing off the array makes it less straightforward to readdress
individual compartments, and thus more flexible solutions have to
be developed.

22.3.2.3 Surface tension microarrays


Instead of achieving compartmentalization by etching or molding
geometrical features into a substrate, it is possible to use surface
chemical differences to partition a substrate into distinct regions.
A general strategy is to render certain parts of a hydrophobic
substrate hydrophilic such that aqueous droplets can be held
in place by surface tension alone. In one example of this [82],
nanoliter droplets were arrayed on a hydrophilic-
patterned perfluorinated substrate. The array contained 3000
distinct compartments, which were formed by depositing different
solutions onto the array using a microspotting instrument. The
platform was used to screen the performance of protease enzymes
and to optimize the reaction conditions for small-molecule protease
inhibitors.
In another recent example, micropatterning of the hydrophobic
photosensitive polymer CYTOP was used to construct arrays able
to capture femtoliter volume aqueous droplets [83]. Individual
arrays hosted up to 300,000 distinct compartments and were
constructed in an open format such that single-array elements could
be addressed with a piezo-driven micropipette (see Fig. 22.4C,E).
The device was used to screen bacterial colonies at the single-
cell level in order to find rare mutants. In particular, bacterial
survival in the presence of antibiotic agents could be screened
and quantified by extraction and subsequent analysis of cells from
individual droplet compartments.
Taking this approach a step further, we have recently devel-
oped a droplet microarray on the basis of surface-tension-driven
organization of individual aqueous compartments [84]. The array
consists of a hydrophilic pattern on a hydrophobic substrate and
is incorporated into an open flow system such that reagents can be
delivered in a straightforward manner simply by liquid infusion and
withdrawal (see Fig. 22.4D). The hydrophilic pattern is composed
of silica, which enables surface functionalization and, thus, provides
a solid phase for selective capture of molecules from the droplet
lumen. In this way, we could achieve cell-free protein expression
and purification in a three-step process: (i) specific functionalization
of the solid phase with nickel-nitrilotriacetic acid (Ni-NTA) tags,
(ii) infusion of cell-free expression reagents mixed with plasmids
encoding histidine-tagged enhanced GFP, and (iii) capture of GFP
expressed inside individual droplets followed by removal of cell-
free expression reagents from the array by flushing (see Fig. 22.4F).
Droplets have volumes of 50–100 fL, and in order to prevent
evaporation, we add a layer of oil to the array during the cell-free
expression reaction. Once the proteins have been synthesized, the
oil layer can be readily removed, and excess reagents can be washed
away by flushing the array with buffer.
The advantage of integrating a surface tension microarray into
a microfluidic flow channel is that several consecutive
reactions can be carried out in a straightforward manner.
Furthermore, functionalization of the solid (hydrophilic) phase
opens up a number of new possibilities because reaction products
can be captured in an addressable format and, for example, used as
inputs in subsequent reactions.

22.4 Conclusion and Current Challenges

Directed evolution of enzymes on the basis of high-throughput
screening of millions of samples is a challenging task consisting
of three processes: (i) library preparation, (ii) library screening,
and (iii) extraction of lead candidates. Recent innovations in micro-
and nanotechnology have improved on all three processes, but
so far only emulsion droplet-based approaches have managed to
deliver on all three simultaneously. In this chapter, we have
described bottom-up and top-down approaches to achieving enzyme
library compartmentalization and screening. Although bottom-up
approaches such as liposomes, polymersomes, and bulk emulsion
droplets are able to achieve unparalleled sample downscaling with
volumes in the zepto-to-femtoliter range, their main drawback is
the inherent sample polydispersity. Compartment polydispersity
presents difficulties in terms of both library preparation and
screening. First, because compartments exhibit a range of sizes, the
library integrity might be compromised during encapsulation of genetic
material (DNA, RNA, cells). For example, large compartments may
harbor more than one library member, whereas small compartments
might not receive any genes at all. Second, during the enzyme
performance screening, compartment polydispersity requires
the measured enzyme assay signal to be normalized to the size
of the compartment. For example, small compartments might host
highly active enzymes but only a small amount of substrate due
to the small volume, whereas large compartments might include
poorly performing enzymes but high amounts of substrate, which
can be processed, albeit at a lower rate. Even though polydispersity
can be reduced by, for example, sonication, mechanical
agitation, or filter extrusion, it cannot be completely overcome
and thus will continue to impede accurate library preparation and
quantification.
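The two consequences of polydispersity described above can be illustrated numerically. The following minimal Python sketch rests on an assumption that is not stated in the chapter: library members are taken to be distributed at random, so that the number of genes per compartment follows a Poisson distribution with a mean proportional to the compartment volume. The function names and the example gene concentration are hypothetical and serve only as an illustration.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of finding exactly k genes when `mean` are expected."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def occupancy(volume_fl: float, genes_per_fl: float) -> dict:
    """Fractions of empty, singly and multiply loaded compartments.

    Assumption (illustrative only): random, Poisson-distributed loading
    with an expected gene count of volume * concentration.
    """
    mean = volume_fl * genes_per_fl
    p_empty = poisson_pmf(0, mean)
    p_single = poisson_pmf(1, mean)
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": 1.0 - p_empty - p_single,
    }


def normalized_signal(signal: float, volume_fl: float) -> float:
    """Assay signal per unit compartment volume."""
    return signal / volume_fl


if __name__ == "__main__":
    # Hypothetical library concentration of 0.2 genes per femtoliter.
    for volume in (0.5, 1.0, 5.0):
        print(volume, occupancy(volume, 0.2))
```

Under this assumption, larger compartments shift probability from the "empty" and "single" classes toward "multiple", which is the loss of library integrity described in the text, and dividing the raw signal by the compartment volume is the simplest form of the size normalization mentioned for screening.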
Of all the bottom-up approaches described herein, bulk emulsion
droplet platforms have exhibited the best and most consistent
performance. This is mainly due to their integration with flow
cytometry-based screening and sorting, which has allowed millions
of droplets to be screened in a matter of hours. Liposomes have
so far not managed to deliver consistent and reproducible enzyme
library screening, which is likely due to their smaller volumes (zepto-
to attoliter) relative to emulsion droplets (atto- to femtoliter), which
makes flow cytometric detection and sorting more challenging.
On the other hand, liposomes are compatible with surface im-
mobilization and imaging, which cannot be readily achieved with
emulsion droplets. Screening using surface-immobilized samples
and imaging detectors is more sensitive than fluorescence-based
flow cytometry and can screen more samples faster, but to date no
good strategies for selecting and sorting of surface-bound samples
have been developed, hence impeding the utility of liposomes as
vehicles for directed evolution of enzymes.
Top-down fabrication of emulsion droplet-generating and
droplet-sorting devices represents the pinnacle of high-throughput
screening. By generating emulsion droplets one at a time, highly
uniform size distributions can be obtained, thus alleviating the
issues with compartment polydispersity faced by bulk emulsion
platforms. Microfluidic microdroplet devices (μMDs) deliver con-
sistent and reproducible performance in combination with a high
degree of customization, which allows them to be utilized for a
great number of challenging screening tasks. However, achieving this
customization requires skilled scientists with expertise in
microfabrication and microfluidics to build a functional
platform. Consequently, measurements are not trivial to initiate,
and μMDs cannot compare in user-friendliness to, for example, FACS
equipment. In addition, a challenge shared by liposomes,
bulk emulsion droplets, and microfluidic microdroplets is the
task of reagent exchange and delivery. Although reagents can
be successfully delivered by, for example, fusogenic liposomes,
nanodroplets, or pico-injectors, it is a much more challenging task to
completely exchange the compartment lumen with another reagent.
Emulsion droplets are in principle able to achieve this if colloidal
capture agents are included in the emulsion droplets (e.g., a surface-
functionalized bead or a cell). In this way, molecules of interest,
such as the enzyme, may be immobilized on the colloid agent.
Next, the emulsion can be disrupted and the colloid agent carrying
the molecule of interest can be washed/recovered and finally
re-encapsulated inside new emulsion droplets. The disadvantages of
this approach are that (i) rare samples may be lost during sample
handling and (ii) the record of any prior performance of a given
emulsion droplet is lost after the recovery step. For example, if two
different enzyme assays were carried out in two different screening/selection
rounds, it would not be possible to identify the enzyme that
exhibited the best average performance, simply because the droplet
identity is lost between the first and the second screening.
On the other hand, integrated surface tension microarrays allow
straightforward reagent delivery and exchange due to their open
format and thus allow a greater range of assays to be conducted. The
arrays exhibit high sample uniformity, with polydispersity comparable
to (or lower than) that of microfluidic microdroplets, and by adding
molecular capture elements to the array surface, the sample identity
can be preserved between consecutive assays. In addition, arrays
are compatible with high-throughput imaging-based screening, but
their main drawback is related to extraction/selection of lead
candidates. The use of a piezo-driven micropipette to aspirate the
content of an array element is promising but sets an upper limit
on the sample density of approximately 0.5 million samples per cm².
If the sample density increases beyond this, the aspiration accuracy
decreases, thus compromising the extraction efficiency and integrity. Using
standard photolithographic microfabrication, the highest achievable
array density is about 30 million samples per cm², and consequently
ultrahigh-throughput screening can in principle be realized through
further innovations and improvements in lead candidate extraction.
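To give a feeling for what these densities mean geometrically, the short Python sketch below converts a sample density into a center-to-center spacing between array elements. It assumes a regular square lattice, which is an illustrative simplification and not a layout stated in the text; the function name is hypothetical.

```python
import math

UM2_PER_CM2 = 1.0e8  # 1 cm = 1e4 µm, so 1 cm² = 1e8 µm²


def pitch_um(samples_per_cm2: float) -> float:
    """Center-to-center spacing (µm) of a square lattice with the given density.

    Assumption (illustrative only): array elements sit on a regular
    square grid, so each element occupies an area of pitch².
    """
    return math.sqrt(UM2_PER_CM2 / samples_per_cm2)


if __name__ == "__main__":
    # Densities quoted in the text for micropipette extraction
    # and for standard photolithographic fabrication.
    for density in (0.5e6, 30e6):
        print(f"{density:.1e} samples/cm² -> pitch of {pitch_um(density):.1f} µm")
```

Under the square-lattice assumption, 0.5 million samples per cm² corresponds to a spacing of roughly 14 µm, whereas 30 million samples per cm² corresponds to a spacing of just under 2 µm, which illustrates how much finer the positioning of an extraction tool would have to become.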

22.5 Future Improvements

Even though the techniques for compartmentalization and screening
of enzymes described herein have already yielded impressive
results, further improvements of directed molecular evolution
using high-throughput enzyme screening can be envisioned. One
possible scenario is the integration of various compartmentalization
and screening approaches into one technology based on
μMDs. It is expected that these devices will experience a gradual
increase in screening throughput by the implementation of (i)
faster droplet generators, (ii) more sensitive detectors, and (iii)
faster sorting switches. Furthermore, due to the modularity of
μMDs, it would be possible to merge them with next-generation
microarrays. Consequently, by subjecting subsets of microfluidically
sorted lead candidate droplets to microarray screening, a higher
level of analytical detail could be achieved by enabling more complex
multistep reactions to take place on the array.
Furthermore, the successful integration of microcolloid particles as
a solid phase in bulk emulsion droplets could be adapted for μMDs,
thus allowing relevant biomolecules to be captured for subsequent
analysis or simply enabling more efficient reagent exchange.
Alternatively, liposomes could be integrated in the aqueous lumen of
microfluidic microemulsion droplets. This would add a high degree
of biocompatibility in an otherwise highly purified system and could,
for example, be applied to supplement the screening reaction with
specific membrane-bound proteins. Another option would be to use
emulsion-encapsulated liposomes to provide an additional layer of
reagent control by encapsulating specific reagents in the liposome
lumen. By passing the emulsion-encapsulated liposomes through specific
heating zones on the microfluidic device, reagents could be released owing
to the enhanced permeability of the liposome membrane at the lipid
bilayer phase transition temperature.
In conclusion, there exist a great number of possible combinations
of bulk emulsion, liposome, and μMD technologies,
and unless high-throughput screening is replaced as the method of
choice for directed enzyme evolution, we believe that
the next decade will see a great number of them realized.

Acknowledgments

We thank the Lundbeck Foundation (Grant R69-A8216) for support
of studies on protein and enzyme compartmentalization.

References

1. Tee, K. L., and Wong, T. S. (2013). Polishing the craft of genetic diversity
creation in directed evolution, Biotechnol. Adv., 31, pp. 1707–1721.
2. Madou, M. J. (2011). Fundamentals of Microfabrication and Nanotechnol-
ogy (CRC Press).
3. Israelachvili, J. N. (1991). Intermolecular and Surface Forces (Academic
Press, London).
4. Szoka, F., and Papahadjopoulos, D. (1980). Comparative properties and
methods of preparation of lipid vesicles (liposomes), Annu. Rev. Biophys.
Bioeng., 9, pp. 467–508.
5. Lasic, D. D. (1988). The mechanism of vesicle formation, Biochem. J., 256,
p. 1.
6. Walde, P., and Ichikawa, S. (2001). Enzymes inside lipid vesicles:
preparation, reactivity and applications, Biomol. Eng., 18, pp. 143–177.
7. Winterhalter, M., and Lasic, D. D. (1993). Liposome stability and for-
mation: experimental parameters and theories on the size distribution,
Chem. Phys. Lipids, 64, pp. 35–43.
8. Mozafari, M. R. (2005). Liposomes: an overview of manufacturing
techniques, Cell. Mol. Biol. Lett., 10, pp. 711–719.
9. Kunding, A. H., Mortensen, M. W., Christensen, S. M., and Stamou, D.
(2008). A fluorescence-based technique to construct size distributions
from single-object measurements: application to the extrusion of lipid
vesicles, Biophys. J., 95, pp. 1176–1188.
10. Yoshimoto, M. (2011). Stabilization of enzymes through encapsulation
in liposomes. In Enzyme Stabilization and Immobilization, Minteer, S. D.,
ed. (Humana Press), 679, pp. 9–18.
11. Shew, R. L., and Deamer, D. W. (1985). A novel method for encapsulation
of macromolecules in liposomes, Biochim. Biophys. Acta, 816, pp. 1–8.
12. Adrian, G., and Huang, L. (1979). Entrapment of proteins in phos-
phatidylcholine vesicles, Biochemistry, 18, pp. 5610–5614.
13. Cruz, M. E. M., Gaspar, M. M., Lopes, F., Jorge, J. S., and Perez-Soler, R.
(1993). Liposomal l-asparaginase: in vitro evaluation, Int. J. Pharm., 96,
pp. 67–77.
14. Dufour, P., Vuillemard, J. C., Laloy, E., and Simard, R. E. (1996).
Characterization of enzyme immobilization in liposomes prepared from
proliposomes, J. Microencapsul., 13, pp. 185–194.
15. Matsuzaki, M., McCafferty, F., and Karel, M. (2007). The effect of
cholesterol content of phospholipid vesicles on the encapsulation and
acid resistance of β-galactosidase from E. coli, Int. J. Food Sci. Technol.,
24, pp. 451–460.
16. Piard, J.-C. et al. (1986). Acceleration of cheese ripening with liposome-
entrapped proteinase, Biotechnol. Lett., 8, pp. 241–246.
17. Espinola, L. G., Wider, E. A., Stella, A. M., and Del C. Batlle, A. M.
(1983). Enzyme replacement therapy in porphyrias: II. Entrapment of
δ-aminolaevulinate dehydratase in liposomes, Int. J. Biochem., 15, pp.
439–445.
18. Christensen, S. M., and Stamou, D. (2007). Surface-based lipid vesicle
reactor systems: fabrication and applications, Soft Matter, 3, p. 828.
19. Stamou, D., Duschl, C., Delamarche, E., and Vogel, H. (2003). Self-
assembled microarrays of attoliter molecular vessels, Angew. Chem., Int.
Ed. Engl., 42, pp. 5580–5583.
20. Sunami, T. et al. (2006). Femtoliter compartment in liposomes for in
vitro selection of proteins, Anal. Biochem., 357, pp. 128–136.
21. Corvera, E., Mouritsen, O. G., Singer, M. A., and Zuckermann, M. J.
(1992). The permeability and the effect of acyl-chain length for
phospholipid bilayers containing cholesterol: theory and experiment,
Biochim. Biophys. Acta, 1107, pp. 261–270.
22. Antonov, V. F., Petrov, V. V., Molnar, A. A., Predvoditelev, D. A., and Ivanov,
A. S. (1980). The appearance of single-ion channels in unmodified lipid
bilayer membranes at the phase transition temperature, Nature, 283,
pp. 585–586.
23. Papahadjopoulos, D., Jacobson, K., Nir, S., and Isac, I. (1973). Phase
transitions in phospholipid vesicles fluorescence polarization and
permeability measurements concerning the effect of temperature and
cholesterol, Biochim. Biophys. Acta, 311, pp. 330–348.
24. Blicher, A., Wodzinska, K., Fidorra, M., Winterhalter, M., and Heimburg, T.
(2009). The temperature dependence of lipid membrane permeability,
its quantized nature, and the influence of anesthetics, Biophys. J., 96, pp.
4581–4591.
25. Bolinger, P., Stamou, D., and Vogel, H. (2008). An Integrated Self-
Assembled Nanofluidic System for Controlled Biological Chemistries,
Angew. Chem., Int. Ed., 47, pp. 5544–5549.
26. Karlsson, M. et al. (2004). Biomimetic nanoscale reactors and networks,
Annu. Rev. Phys. Chem., 55, pp. 613–649.
27. Sott, K. et al. (2006). Controlling enzymatic reactions by geometry in a
biomimetic nanoscale network, Nano Lett., 6, pp. 209–214.
28. Chiu, D. T. et al. (1999). Chemical transformations in individual
ultrasmall biomimetic containers, Science, 283, pp. 1892–1895.
29. Christensen, S. M., Mortensen, M. W., and Stamou, D. G. (2011). Single
vesicle assaying of SNARE-synaptotagmin-driven fusion reveals fast and
slow modes of both docking and fusion and intrasample heterogeneity,
Biophys. J., 100, pp. 957–967.
30. Christensen, S. M., Bolinger, P.-Y., Hatzakis, N. S., Mortensen, M. W., and
Stamou, D. (2011). Mixing subattolitre volumes in a quantitative and
highly parallel manner with soft matter nanofluidics, Nat. Nanotechnol.,
7, pp. 51–55.
31. Yoon, T. Y., Okumus, B., Zhang, F., Shin, Y. K., and Ha, T. (2006). Multiple
intermediates in SNARE-induced membrane fusion, Proc. Natl. Acad. Sci.,
103, pp. 19731–19736.
32. Stengel, G., Zahn, R., and Höök, F. (2007). DNA-induced programmable
fusion of phospholipid vesicles, J. Am. Chem. Soc., 129, pp. 9584–9585.
33. Kunding, A. H. et al. (2011). Intermembrane docking reactions are
regulated by membrane curvature, Biophys. J., 101, pp. 2693–2703.
34. Discher, B. M. (1999). Polymersomes: tough vesicles made from diblock
copolymers, Science, 284, pp. 1143–1146.
35. Gräfe, D., Gaitzsch, J., Appelhans, D., and Voit, B. (2014). Cross-linked
polymersomes as nanoreactors for controlled and stabilized single and
cascade enzymatic reactions, Nanoscale, 6, p. 10752.
36. Peters, R. J. R. W. et al. (2014). Cascade Reactions in Multicompartmen-
talized Polymersomes, Angew. Chem., Int. Ed., 53, pp. 146–150.
37. Aljabali, A. A. A., Barclay, J. E., Steinmetz, N. F., Lomonossoff, G. P., and
Evans, D. J. (2012). Controlled immobilisation of active enzymes on the
cowpea mosaic virus capsid, Nanoscale, 4, p. 5640.
38. Comellas-Aragonès, M. et al. (2007). A virus-based single-enzyme
nanoreactor, Nat. Nanotechnol., 2, pp. 635–639.
39. Brussaard, C. P. D., Marie, D., and Bratbak, G. (2000). Flow cytometric
detection of viruses, J. Virol. Methods, 85, pp. 175–182.
40. Kelly, B. T., Baret, J.-C., Taly, V., and Griffiths, A. D. (2007). Miniaturizing
chemistry and biology in microdroplets, Chem. Commun., pp. 1773–
1788.
41. Griffiths, A. D., and Tawfik, D. S. (2006). Miniaturising the laboratory in
emulsion droplets, Trends Biotechnol., 24, pp. 395–402.
42. Tawfik, D. S., and Griffiths, A. D. (1998). Man-made cell-like compart-
ments for molecular evolution, Nat. Biotechnol., 16, pp. 652–656.
43. Mazutis, L. et al. (2013). Single-cell analysis and sorting using droplet-
based microfluidics, Nat. Protoc., 8, pp. 870–891.
44. Miller, O. J. et al. (2006). Directed evolution by in vitro compartmental-
ization, Nat. Methods, 3, pp. 561–570.
45. Griffiths, A. D., and Tawfik, D. S. (2003). Directed evolution of an
extremely fast phosphotriesterase by in vitro compartmentalization,
EMBO J., 22, pp. 24–35.
46. Dressman, D., Yan, H., Traverso, G., Kinzler, K. W., and Vogelstein, B.
(2003). Transforming single DNA molecules into fluorescent magnetic
particles for detection and enumeration of genetic variations, Proc. Natl.
Acad. Sci., 100, pp. 8817–8822.
47. Margulies, M. et al. (2005). Genome sequencing in microfabricated high-
density picolitre reactors, Nature, 437, pp. 376–380.
48. Mastrobattista, E. et al. (2005). High-throughput screening of enzyme
libraries: in vitro evolution of a beta-galactosidase by fluorescence-
activated sorting of double emulsions, Chem. Biol., 12, pp. 1291–1300.
49. Aharoni, A., Amitai, G., Bernath, K., Magdassi, S., and Tawfik, D. S.
(2005). High-throughput screening of enzyme libraries: thiolactonases
evolved by fluorescence-activated sorting of single cells in emulsion
compartments, Chem. Biol., 12, pp. 1281–1289.
50. Sepp, A., Tawfik, D. S., and Griffiths, A. D. (2002). Microbead display by in
vitro compartmentalisation: selection for binding using flow cytometry,
FEBS Lett., 532, pp. 455–458.
51. Bernath, K., Magdassi, S., and Tawfik, D. S. (2005). Directed evolution of
protein inhibitors of DNA-nucleases by in vitro compartmentalization
(IVC) and nano-droplet delivery, J. Mol. Biol., 345, pp. 1015–1026.
52. Theberge, A. B. et al. (2010). Microdroplets in microfluidics: an evolving
platform for discoveries in chemistry and biology, Angew. Chem., Int. Ed.,
49, pp. 5846–5868.
53. Anna, S. L., Bontoux, N., and Stone, H. A. (2003). Formation of
dispersions using ‘flow focusing’ in microchannels, Appl. Phys. Lett., 82,
p. 364.
54. Garstecki, P., Stone, H., and Whitesides, G. (2005). Mechanism for
flow-rate controlled breakup in confined geometries: a route to
monodisperse emulsions, Phys. Rev. Lett., 94, p. 164501.
55. Tan, Y.-C., Fisher, J. S., Lee, A. I., Cristini, V., and Lee, A. P. (2004). Design
of microfluidic channel geometries for the control of droplet volume,
chemical concentration, and sorting, Lab Chip, 4, p. 292.
56. Zagnoni, M., and Cooper, J. M. (2009). On-chip electrocoalescence of
microdroplets as a function of voltage, frequency and droplet size, Lab
Chip, 9, p. 2652.
57. Baroud, C. N., Robert de Saint Vincent, M., and Delville, J.-P. (2007). An
optical toolbox for total control of droplet microfluidics, Lab Chip, 7, p.
1029.
58. Abate, A. R., Hung, T., Mary, P., Agresti, J. J., and Weitz, D. A. (2010). High-
throughput injection with microfluidics using picoinjectors, Proc. Natl.
Acad. Sci., 107, pp. 19163–19166.
59. Baret, J.-C. et al. (2009). Fluorescence-activated droplet sorting (FADS):
efficient microfluidic cell sorting based on enzymatic activity, Lab Chip,
9, p. 1850.
60. Agresti, J. J. et al. (2010). Ultrahigh-throughput screening in drop-based
microfluidics for directed evolution, Proc. Natl. Acad. Sci., 107, pp. 4004–
4009.
61. Huebner, A. et al. (2008). Development of Quantitative Cell-Based
Enzyme Assays in Microdroplets, Anal. Chem., 80, pp. 3890–3896.
62. Holtze, C. et al. (2008). Biocompatible surfactants for water-in-
fluorocarbon emulsions, Lab Chip, 8, p. 1632.
63. Franke, T., Abate, A. R., Weitz, D. A., and Wixforth, A. (2009). Surface
acoustic wave (SAW) directed droplet flow in microfluidics for PDMS
devices, Lab Chip, 9, p. 2625.
64. Baroud, C., Delville, J.-P., Gallaire, F., and Wunenburger, R. (2007).
Thermocapillary valve for droplet production and sorting, Phys. Rev. E,
75, p. 046302.
65. Chen, A. et al. (2013). On-chip magnetic separation and encapsulation
of cells in droplets, Lab Chip, 13, p. 1172.
66. Liu, S. et al. (2008). The electrochemical detection of droplets in
microfluidic devices, Lab Chip, 8, p. 1937.
67. Roman, G. T., Wang, M., Shultz, K. N., Jennings, C., and Kennedy, R.
T. (2008). Sampling and electrophoretic analysis of segmented flow
streams using virtual walls in a microfluidic device, Anal. Chem., 80, pp.
8231–8238.
68. Fidalgo, L. M. et al. (2009). Coupling microdroplet microreactors with
mass spectrometry: reading the contents of single droplets online,
Angew. Chem., Int. Ed., 48, pp. 3665–3668.
69. Sarrazin, F., Salmon, J.-B., Talaga, D., and Servant, L. (2008). Chemical
reaction imaging within microfluidic devices using confocal Raman
spectroscopy: the case of water and deuterium oxide as a model system,
Anal. Chem., 80, pp. 1689–1695.
70. Hatch, A. C. et al. (2011). 1-Million droplet array with wide-field
fluorescence imaging for digital PCR, Lab Chip, 11, p. 3838.
71. Pekin, D. et al. (2011). Quantitative and sensitive detection of rare
mutations using droplet-based microfluidics, Lab Chip, 11, p. 2156.
72. Huebner, A. et al. (2009). Static microdroplet arrays: a microfluidic
device for droplet trapping, incubation and release for enzymatic and
cell-based assays, Lab Chip, 9, pp. 692–698.
73. Shoemaker, D. D., and Linsley, P. S. (2002). Recent developments in DNA
microarrays, Curr. Opin. Microbiol., 5, pp. 334–337.
74. Katz, C. et al. (2011). Studying protein–protein interactions using
peptide arrays, Chem. Soc. Rev., 40, p. 2131.
75. Gorris, H. H., and Walt, D. R. (2010). Analytical Chemistry on the
Femtoliter Scale, Angew. Chem., Int. Ed., 49, pp. 3880–3895.
76. Zhang, H., Nie, S., Etson, C. M., Wang, R. M., and Walt, D. R. (2012). Oil-
sealed femtoliter fiber-optic arrays for single molecule analysis, Lab
Chip, 12, p. 2229.
77. Rissin, D. M. et al. (2010). Single-molecule enzyme-linked immunosor-
bent assay detects serum proteins at subfemtomolar concentrations,
Nat. Biotechnol., 28, pp. 595–599.
78. Gorris, H. H., and Walt, D. R. (2009). Mechanistic aspects of horseradish
peroxidase elucidated through single-molecule studies, J. Am. Chem. Soc.,
131, pp. 6277–6282.
79. Lafferty, M., and Dycaico, M. J. (2004). GigaMatrix™: an ultra high-
throughput tool for accessing biodiversity, J. Assoc. Lab. Autom., 9, pp.
200–208.
80. Kim, S. H. et al. (2012). Large-scale femtoliter droplet array for digital
counting of single biomolecules, Lab Chip, 12, p. 4986.
81. Kinpara, T. (2004). A Picoliter Chamber Array for Cell-Free Protein
Synthesis, J. Biochem. (Tokyo), 136, pp. 149–154.
82. Mugherli, L. et al. (2009). In situ assembly and screening of enzyme
inhibitors with surface-tension microarrays, Angew. Chem., Int. Ed., 48,
pp. 7639–7644.
83. Iino, R., Matsumoto, Y., Nishino, K., Yamaguchi, A., and Noji, H. (2013).
Design of a large-scale femtoliter droplet array for single-cell analysis of
drug-tolerant and drug-resistant bacteria, Front. Microbiol., 4, p. 300.
84. Kunding, A. H., and Christensen, S. F. (2014). A method of charging a test-
carrier and a test-carrier. Patent No. WO2014001459A1.

March 23, 2016 13:15 PSP Book - 9in x 6in 23-Allan-Svendsen-c23

Chapter 23

Computational Enzyme Engineering:
Activity Screening Using Quantum Chemistry

Martin R. Hediger
DSM Nutritional Products Site Sisseln, Hauptstrasse 4, P.O. Box, Switzerland
martin.hediger@dsm.com

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

In this chapter, the application of quantum chemical methods
for screening of enzyme activity is discussed. A linear
interpolation approach is used, in which the enzyme structure is
divided into discrete structural frames along the reaction coordinate
of the rate-determining step and the wave function-derived
semiempirical energy of each frame is evaluated to provide an
approximate reaction profile. With this approach, the amide hydrolysis
activity of Candida antarctica lipase B (CalB) and the glycosidic bond
hydrolysis activity of Bacillus circulans xylanase (Bcx) and derived
mutants are screened. The
application of this approach to a programmatically generated
library of mutant structures allows ranking and identification of
mutants with likely increased activity toward an artificial substrate.
Experimental verification of the CalB results shows that 15 of 22
mutants are correctly characterized in terms of
both increased and decreased activity. The screening of Bcx mutants
allows identification of structural patterns among the most active
mutants, providing a new hypothesis for rationalizing the
catalysis-enhancing determinants of the Bcx active site. At the time of
writing, the presented applications are the largest quantum chemical screens
of enzyme catalysts and present a valuable tool for computational
enzyme engineering.
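The linear interpolation idea summarized above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions and not the author's implementation: the endpoint geometries are interpolated in Cartesian coordinates, all function names are hypothetical, and `toy_energy` is a made-up stand-in for the semiempirical quantum chemical energy evaluation, which in practice would be delegated to an external program.

```python
from typing import Callable, List, Tuple

Coords = List[List[float]]  # one [x, y, z] entry per atom


def interpolate_frames(reactant: Coords, product: Coords, n_frames: int) -> List[Coords]:
    """Linearly interpolate atomic positions between two endpoint structures."""
    if n_frames < 2:
        raise ValueError("at least two frames (the endpoints) are required")
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # 0 at the reactant, 1 at the product
        frames.append([
            [r + t * (p - r) for r, p in zip(atom_r, atom_p)]
            for atom_r, atom_p in zip(reactant, product)
        ])
    return frames


def barrier_estimate(frames: List[Coords],
                     energy: Callable[[Coords], float]) -> Tuple[float, List[float]]:
    """Approximate barrier: highest frame energy relative to the first frame."""
    energies = [energy(frame) for frame in frames]
    return max(energies) - energies[0], energies


def toy_energy(frame: Coords) -> float:
    """Hypothetical double-well potential in the x coordinate of atom 0.

    Placeholder only; a real screen would call a quantum chemistry code here.
    """
    x = frame[0][0]
    return (x ** 2 - 1.0) ** 2


if __name__ == "__main__":
    reactant = [[-1.0, 0.0, 0.0]]
    product = [[1.0, 0.0, 0.0]]
    frames = interpolate_frames(reactant, product, 11)
    barrier, profile = barrier_estimate(frames, toy_energy)
    print("approximate barrier:", barrier)
```

In a screening setting, the same two steps would be repeated for every mutant structure, and the mutants would then be ranked by their estimated barriers, with lower barriers suggesting higher activity.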

23.1 Motivation

Let us imagine two companies, a and b. Both companies use similar
technical equipment to carry out the biotechnological process
$X \xrightarrow{\text{Biocatalyst}} Y$ using two different versions
of a biocatalyst. Company a uses an enzyme with a rate constant
$k_{\mathrm{cat},a} = 1000\ \mathrm{s}^{-1}$, while company b uses an
enzyme with $k_{\mathrm{cat},b} = 2000\ \mathrm{s}^{-1}$. Letting all
other things be equal, the process of company b will therefore
require only half the time to produce 1 mole of product compared to
the time required for company a. Company b can therefore save energy
because the reaction volume does not need to be heated for as
long as company a needs to in order to produce the same amount
of product. The need for efficient catalysts arises from such an
outline and the commercial implications of these considerations are
immediate.^a
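
The arithmetic behind this comparison can be sketched in a few lines
of Python. The sketch assumes substrate saturation, so that the rate
is simply the rate constant times the amount of enzyme; the enzyme
loading `enzyme_mol` and the function name are hypothetical and are
chosen only for illustration.

```python
def time_for_product(kcat_per_s, enzyme_mol, product_mol=1.0):
    """Seconds needed to form product_mol of product at substrate saturation."""
    # At saturation every enzyme molecule turns over kcat times per second,
    # so the production rate in mol/s is kcat * (amount of enzyme in mol).
    rate_mol_per_s = kcat_per_s * enzyme_mol
    return product_mol / rate_mol_per_s


enzyme_mol = 1e-6  # hypothetical enzyme loading, identical for both companies
t_a = time_for_product(1000.0, enzyme_mol)  # company a
t_b = time_for_product(2000.0, enzyme_mol)  # company b
print(f"company a: {t_a:.0f} s, company b: {t_b:.0f} s, ratio: {t_a / t_b:.1f}")
```

Whatever enzyme loading is chosen, it cancels in the ratio, which is
why doubling the rate constant halves the processing time.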
Increasing the performance of enzymes, however, is still far from
trivial and is the subject of a growing body of research. What is
clear, though, is that the development of such catalysts is costly
in terms of manpower, material, and energy if carried out in the
laboratory. A number of companies have in fact formed around this
demand: Novozymes (DK), Genzyme (USA), and DSM (NL), to name but a
few [4, 22, 27, 35].
Laboratory costs can, however, be reduced substantially if initial
development is carried out in silico. Proof that computational
results can be as reliable as experimental results was provided
not long ago [8]. The foundation for the successful application
of computational enzyme engineering has thus been laid, and
its success was acknowledged by the award of the 2013 Nobel Prize
in Chemistry to pioneers of the field.

a In this work, we use the terms enzyme and bio-/catalyst interchangeably.


23.2 Introduction

In this review, we aim to provide an introduction to the topic of
modeling of enzyme catalysis for people from both within and
outside the field, with emphasis on aspects of activity screening
and some emphasis on commercial applicability. We outline the
latest achievements, methods in use, and required developments to
further establish computational enzyme engineering and screening
also in commercial practice.
While the process engineering team in company a from before
is working on improving reaction conditions, consider the enzyme
engineer required to engineer the currently used catalyst to match
(and possibly outperform) the activity of the catalyst used by
company b. Following the hypothesis that the function of an enzyme
depends on its structure, in order to change the function, some
kind of change to the structure has to be introduced, and this in
general requires the introduction of one or more mutations of amino
acids in the primary structure of the enzyme. However, it is almost
impossible to predict how a mutated variant of an enzyme will
behave. Will it be possible to express it, will it even fold, and if so,
will it be more or less active relative to the wild type (WT)? Are
its pH properties still the same, and is it of similar temperature
stability? Even the most reasonably suggested mutations (except
for elimination of the catalytic residues) will hardly allow one to
predict qualitatively with confidence if the mutated enzyme will be
more or less active. Therefore, the only reasonable strategy appears
to be to carry out systematic screening studies.^a As was
pointed out initially, such studies are expensive in the laboratory,
and an alternative is therefore to carry them out on the computer.
The aim of such initial computational screening is not necessarily
to provide high-accuracy kinetic data but much rather to provide
a reliable, qualitative activity estimate of a preferably large mutant
library. It is important to note that while the initial computational
screening might be of limited accuracy, its major value is the
provision of a systematically developed library of mutants and

a By screening we mean the construction of a large library of mutants that is inspected
and compared for a specific property like activity.



qualitative results indicating which of these are likely to be most
active. On the basis of the screening, promising candidates can then
be pipelined into a more demanding computational method for
further characterization.
Also outside a commercial environment, screening a library of
candidates can be of use. As we show in the second application,
it is possible to identify structural patterns in the more active
mutants and such observations can lead to new hypotheses about
structure/function relationships.
The chapter is organized as follows. First, a number of technical
aspects are introduced in the methods section, and then we will
discuss the application of the approach to the modeling of two
enzymes, Candida antarctica lipase B (CalB) and Bacillus circulans
xylanase (Bcx).

23.3 Methods

23.3.1 Calculation Engines


This article focuses on the application of quantum chemistry
to the screening of enzyme mutants. Modern computational chemistry
methods are explained in textbooks on the subject [9, 19, 46];
rather than reviewing them incompletely here, we highlight only a
few selected facts relevant to what follows.
Most fundamentally, computational descriptions of a molecular
system are obtained using either (parametric) force fields or ab
initio electronic structure methods.^a Force field methods are
useful in studying the structural behavior of enzymes over a
significant time period (approaching the millisecond scale) and can
provide details about the structural rearrangement of an enzyme.
These simulations are usually referred to as molecular dynamics
simulations.
Chemical reactivity, however, is accompanied by changes in
the electronic structure of the system, which are described by

a Molecular mechanics, empirical mechanics, and classical mechanics are other
expressions for force field methods.


quantum chemical methods. Quantum chemical methods comprise
so-called semi-empirical, density functional, and ab initio methods
(semi-empirical methods are different from the empirical mechanics
methods of the previous paragraph).
Quantum chemical methods attempt to solve the Schrödinger
equation for a molecule; the solution of this equation yields the
wave function of the system. However, depending on the specifics of
the quantum chemical method, the solutions, although likely similar,
differ. Since every quantum chemical method provides
its own description of the wave function (or electron density in
density functional methods) and all properties, such as energy or
charge distribution, are derived from the wave function (or the
electron density), it is essential to note that every method has
its own chemistry, depending on the wave function (or density)
it produces. As a consequence, hydrogen-bonding patterns of a
structure can be different when optimized with two different
quantum chemical methods. Therefore, it is good practice to verify
the predictions of one quantum chemical method against
higher-level methods or literature references. With regard to the
modeling of enzymatic reactions, if a method can reproduce the
generally accepted mechanism, then there is a good chance that
at least qualitatively reliable conclusions can be drawn using that
method.
In the applications described next, the semi-empirical PM6
method [40, 41] was mainly used together with the MOZYME
technology [39].^a Initial test calculations on small active site models
of two enzymes allowed us to reproduce the generally accepted
mechanisms, and in a comparative study, PM6 is found to provide
results approaching the accuracy of density functional methods [34].
The PM6 method is actively being developed further to take into
account properties such as dispersion or solvation [32, 33]. The
MOZYME technology allows one to localize the molecular orbitals
obtained during the self-consistent field calculation, which results
in computational demands scaling linearly with system size. In

a Semi-empirical essentially means that parts of the calculation are parametrized from
the start, which results in reduced computational demands.



essence, these two technologies in combination therefore allow one
to efficiently model enzymatic reactivity in a screening assay.

23.3.2 Molecular Modeling


A variety of molecular modeling approaches are available in the
literature, all suitable for different applications. We briefly outline
two of them and then introduce in more detail the approach used in
the screening studies of the application section.

Cluster approach: Among the most popular approaches is the
so-called cluster approach, where a larger or smaller part of the
enzyme active site is extracted into a separate model and
all calculations are then carried out on this smaller model (usually
consisting of 10–15 residues [24]). Structural integrity of the model
is preserved by applying constraints at the truncation points,
and among the most prominent applications of this approach is
the kinetic discrimination between competing mechanistic
alternatives [18, 30, 37]. This technique can, in principle, be used
in enzyme activity–screening studies; however, care has to be taken
not to introduce mutations that can be accommodated only because
of the reduced model size, that is,
because there appears to be space for a side chain where in reality
there would not be any. An advantage of cluster models is that,
because of the reduced model size, higher-level methods can be
applied.

Full enzyme models: Using a full model of the enzyme structure
has advantages but also raises a number of issues. Including the
full enzyme structure in the calculation naturally increases the
computational demands significantly. Thus when running calcula-
tions on a full enzyme structure, careful consideration of the trade-
off between accuracy and efficiency becomes very important. One
way of addressing this issue is by using so-called hybrid methods,
which combine a quantum chemical– and a classical mechanics–
based calculation. The advantage of these methods is that they are
capable of producing highly accurate results; the setup, especially
the definition of the boundary between the classical and quantum
regions, of such calculations is, however, still not as straightforward


as is required for large-scale screening approaches. An alternative is
to describe the complete system using the same, albeit lower, level
of theory (i.e., PM6), which significantly reduces the effort needed
to set up the calculation. This is the approach of the applications
described below.
As was outlined before, semi-empirical methods in combination
with linear scaling technologies, which are readily available in a
number of programs (GAMESS, MOPAC, and others), are sufficiently
efficient and allow for some degree of programmatic customization,
which is also of great relevance for later data analysis.

Calculation of the reaction barrier: Irrespective of the method
used, calculating the activation energy requires estimating the
energy difference between the transition state and the most stable
state of the system before the transition state on the reaction
coordinate. This is a nontrivial problem and no generally applicable
methods exist to do this; solving this problem will therefore
always require manual input from the modeler at some point.
Furthermore, characterization and identification of a transition
state are computationally very demanding operations. A reasonable
approach is therefore to approximate the transition state by linearly
interpolating the structure along the reaction coordinate of the
rate-determining step of the total reaction. If carried out carefully, this
approach can provide very good approximations of the transition
state.
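
The definition of the barrier given above can be written down as a
short Python sketch. It assumes that a list of energies, one per
frame along the reaction coordinate, is already available; the
function name and the energy values are hypothetical and serve only
to illustrate the bookkeeping, not to reproduce any result of this
chapter.

```python
def barrier_from_profile(energies):
    """Return (barrier, ts_index, reference_index) for a 1D energy profile.

    The approximate transition state is the highest-energy frame; the
    reference is the lowest-energy frame at or before it on the coordinate.
    """
    if len(energies) < 2:
        raise ValueError("need at least two frames")
    ts_index = max(range(len(energies)), key=lambda i: energies[i])
    reference_index = min(range(ts_index + 1), key=lambda i: energies[i])
    barrier = energies[ts_index] - energies[reference_index]
    return barrier, ts_index, reference_index


# Hypothetical energies (kcal/mol) relative to the first frame.
profile = [0.0, -1.0, -0.5, 1.5, 4.0, 5.5, 3.0, 1.0, -2.0, -3.0, -3.5]
barrier, ts_index, reference_index = barrier_from_profile(profile)
print(barrier, ts_index, reference_index)
```

Note that the reference state is searched only among frames before
the maximum, so a product state that is lower in energy than the
reactant does not affect the estimate.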

Linear interpolation: In both applications outlined below, the
basic procedure to obtain a reaction barrier consists of preparing
structures for the enzyme–substrate complex (ES) and the first
intermediate (CalB: TI^a, Bcx: GE^b) and calculating the reaction
barrier between the $\mathrm{ES}_{\mathrm{CalB}}$–TI and
$\mathrm{ES}_{\mathrm{Bcx}}$–GE pairs, respectively. For
these two pairs, the reaction barrier is calculated by preparing a set
of 10 intermediate structures that approximate the structure of the
system along the reaction coordinate (Fig. 23.1A).^c By evaluating

a Tetrahedral intermediate.
b Glycosyl enzyme.
c In the literature, the terms interpolation, linear transit scan, reaction coordinate
calculation, and adiabatic mapping are frequently used to indicate essentially the
same kind of procedure.

[Figure 23.1B plot: activation energy (kcal/mol) versus reaction coordinate (0–10) for CalB WT.]
Figure 23.1 (A) Illustration of linear interpolation. It is visible how the
substrate and the nucleophilic oxygen of S105 approach each other. Also
visible is the proton transfer to H224 in the late interpolation frames and
how the oxyanion hole (T42) follows the increasingly negative carbonyl
oxygen of the substrate. (B) Evaluation of the system energy for each
interpolation frame results in an approximate reaction barrier (calculation
done with PM6).

the energies for each interpolation frame for the two reactions
$\mathrm{ES}_{\mathrm{CalB}} \rightarrow \mathrm{TI}$ or
$\mathrm{ES}_{\mathrm{Bcx}} \rightarrow \mathrm{GE}$, an approximate
potential energy surface of the reaction is obtained (Fig. 23.1B).
The quality of the results of this approach depends strongly on how
carefully each of these structures is modeled. Since the geometry of
every interpolation frame is optimized individually, it is possible that
adjacent interpolation frames optimize into significantly different

local minima; this is a frequent reason for meaningless results. It is
therefore advisable that the two end points of the interpolation not
be too different in structure.
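
As an illustration of the interpolation step, the following Python
sketch generates intermediate geometries between two end-point
structures. It assumes that both structures list the same atoms in
the same order and that the interpolation is done in Cartesian
coordinates, which is an assumption of this sketch; the coordinates
and the function name are hypothetical. In the procedure described
above, each generated frame would subsequently be geometry-optimized
individually, which is not shown here.

```python
def interpolate_frames(start, end, n_intermediate=10):
    """Linearly interpolate Cartesian coordinates between two structures.

    start, end: lists of (x, y, z) tuples with identical atom ordering.
    Returns n_intermediate structures strictly between the two end points.
    """
    if len(start) != len(end):
        raise ValueError("structures must contain the same atoms in the same order")
    frames = []
    for k in range(1, n_intermediate + 1):
        t = k / (n_intermediate + 1)  # fraction of the way from start to end
        frame = [
            tuple(a + t * (b - a) for a, b in zip(atom_start, atom_end))
            for atom_start, atom_end in zip(start, end)
        ]
        frames.append(frame)
    return frames


# Hypothetical two-atom example: the second atom approaches the first.
es_like = [(0.0, 0.0, 0.0), (11.0, 0.0, 0.0)]
ti_like = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
frames = interpolate_frames(es_like, ti_like, n_intermediate=10)
print(len(frames), frames[0][1], frames[-1][1])
```

The closer the two end points are in structure, the smaller the
displacement per frame, which is consistent with the advice that the
end points should not differ too much.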
This approach provides an estimate of $k_{\mathrm{cat}}$; the working
hypothesis assumes that the binding of the substrate by different
mutants is similar, an observation that is also made in experiments
[26].

Requirements: The most basic requirement is that a structure
representing a bound intermediate along the reaction pathway be
available. For modeling, it is advantageous to have a structure with
a bound intermediate that can then be modified into the substrate
while preserving its natural binding pose. This model is then used
as a template for both stationary points adjacent to the transition
state of the rate-determining step in the total reaction. Using a single
model as the template for both the intermediate of the reaction
and the preceding stationary point is additionally advantageous
because the structural differences between these structures will be
small and are less likely to lead to inconclusive reaction barriers
in the interpolation. A second requirement is naturally to have
a reasonable understanding of the reaction mechanism. In case
the mechanism is not available from the literature, computational
experiments on small active site models can provide a working
hypothesis.

Preparation of mutant structures: The structures of the mutants
can be prepared with any kind of molecular modeling software;
however, it is advantageous if the software can be controlled
programmatically. When preparing molecular structures of the
mutants, it is frequently observed that specific pairs of mutations
cannot be accommodated in the same model because the two new
side chains would clash. In the simplest version of the approach,
these variants are simply discarded from the screen, assuming that
such mutants would not be expressed in vivo. Furthermore, side
chains can be oriented in various ways, and a default orientation
can be assigned using a local optimization procedure (such as the
one that is built into PYMOL; see later). Ionization states of side
chains greatly

complicate the screening, and at the current development state, only
default ionization states are considered.
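
The programmatic construction of a mutant library, including the
discarding of clashing combinations, might be sketched as follows.
This is a minimal illustration in plain Python and not the tooling
used in the studies; the function name, the candidate residues per
position, and in particular the clashing pair are hypothetical. In
practice the clash information would come from the modeling software.

```python
from itertools import combinations, product


def build_library(candidates, max_order, clashing_pairs=frozenset()):
    """Enumerate combination mutants with up to max_order mutations.

    candidates: dict mapping a position label (e.g. "W104") to new residues.
    clashing_pairs: set of frozensets naming two mutations that cannot coexist.
    """
    library = []
    positions = sorted(candidates)
    for order in range(1, max_order + 1):
        for chosen in combinations(positions, order):
            for residues in product(*(candidates[p] for p in chosen)):
                mutant = tuple(p + r for p, r in zip(chosen, residues))
                pairs = (frozenset(pair) for pair in combinations(mutant, 2))
                if any(pair in clashing_pairs for pair in pairs):
                    continue  # discard variants whose new side chains would clash
                library.append("-".join(mutant))
    return library


candidates = {"G39": ["A"], "W104": ["F", "Q", "Y"], "I189": ["A", "G", "Y"]}
clashes = {frozenset({"W104Y", "I189Y"})}  # hypothetical clash, illustration only
library = build_library(candidates, max_order=2, clashing_pairs=clashes)
print(len(library), library[:5])
```

Because the number of combinations grows quickly with the mutation
order, filtering out clashing variants early keeps the number of
structures that have to be prepared and calculated manageable.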

23.3.3 Software
Most of the calculations in the applications below were
carried out with the MOPAC software [38]. This software is
designed to be convenient to use. It offers various semi-empirical
calculation engines and linear scaling techniques and is optimized
toward working with Protein Data Bank (PDB)-formatted data
structures.
To prepare the molecular models, the protein visualization
software PYMOL [36] offers a huge range of functions and can be
controlled programmatically, which is critical for the preparation
of a large, systematic set of mutants. It has to be noted that
molecular modeling and computational screening still involve
a lot of customization work; final data analysis in particular usually
requires the development of software for extracting relevant data
from the output files. This might also contribute to preventing
quantum chemical methods from becoming more popular
in industrial environments. Docking methods, for example, are
implemented in software that is designed for user-friendliness
and have established themselves in industry [12]. Another project
is GTKDynamo, which combines the modeling features of PYMOL directly
with quantum chemical calculation software; it looks very
advanced but has so far gained traction only with a small
community [2].
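
As an example of the kind of custom data extraction mentioned above,
the following sketch pulls the final energy out of the text of an
output file. It assumes that the output contains a line of the form
"FINAL HEAT OF FORMATION = <value> KCAL/MOL", as MOPAC output files
typically do; the sample text is made up, and the function name is
hypothetical.

```python
import re

# Pattern for the heat-of-formation line; the exact layout is an assumption.
HEAT_RE = re.compile(r"FINAL HEAT OF FORMATION\s*=\s*(-?\d+(?:\.\d+)?)\s*KCAL/MOL")


def final_heat_of_formation(output_text):
    """Return the last reported heat of formation (kcal/mol), or None if absent."""
    matches = HEAT_RE.findall(output_text)
    return float(matches[-1]) if matches else None


# Made-up excerpt of an output file, for illustration only.
sample = """
          FINAL HEAT OF FORMATION =      -1234.56789 KCAL/MOL =   -5165.43 KJ/MOL
"""
print(final_heat_of_formation(sample))
```

Applying such a function to the output of every interpolation frame
of every mutant yields the energy profiles from which the reaction
barriers are estimated.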

23.4 Applications

23.4.1 Overview
From a technical point of view, a computational screening assay
is characterized by its computational efficiency, that is, how long
it takes for the calculations to complete, its user-friendliness, its
accuracy, and the modeling prerequisites. In the following section,
two applications of the presented screening method are introduced.


It is shown how the method performs relative to these
criteria.

23.4.2 Engineering Candida antarctica Lipase B


Relevance and summary: Using CalB as a test case, we develop an
efficient modeling protocol, verify the calculations with experimental
results, and show how quantum chemical methods can be used
for enzyme activity screening [14, 15]. The reaction studied is the
hydrolysis of N-benzyl-2-chloroacetamide.
CalB has been thoroughly characterized over the years such that
its structure, physical properties, and expression systems are well
documented [43, 44]. The enzyme is a well-established catalyst
with applications in organic synthesis, formulation technology [11],
and kinetic resolution of racemic mixtures [7, 13, 28]. Furthermore,
since the enzyme is known for its reactive promiscuity [5, 42],
it provides an ideal development platform for applications for
which existing biocatalysts are only of limited efficiency. One such
application is the hydrolysis of lipophilic compounds containing
amide bonds. While in principle amide bonds can be cleaved by
proteases or peptidases, these enzymes are found to be less active
toward hydrophobic substrates such as amide-containing lipids
[29]. Therefore, engineering CalB activity toward hydrophobic
substrates containing a CO–NH bond would greatly expand its
applicability.

Structure and mechanism: Briefly, the active site in CalB consists of
a catalytic triad (S105, D187, and H224) and an associated oxyanion
hole. The mechanism is understood to follow the generally accepted
mechanism of serine protease active sites [17] (Fig. 23.2). The
rate-determining step is believed to be the nucleophilic attack by Oγ of
S105 on the carbonyl carbon C20 of the substrate.
It is noted that the CalB model used does not consist of the complete
structure but rather of a comparatively large cluster model of
around 800 atoms (including hydrogens), whereas cluster
models ordinarily consist of up to 120 atoms. CalB has around 320
amino acids in the backbone, which would make the calculations of

[Figure 23.2 scheme: the ES, TS, and TI structures, showing the substrate carbonyl carbon (C20), Oγ–Hγ of Ser105, and Nε2 of His224.]

Figure 23.2 Reaction scheme for the formation of the tetrahedral intermediate
in CalB. R1: -CH2-Cl; R2: -CH2-C6H5 [14].

the full structure take an exceedingly long time to complete; this will
be discussed further in the next section.

Calculations, experimental verification, and calibration: After
establishing a working proof-of-concept of the screening approach
using a limited set of three mutants [15], the method was scaled up
to screen a set of 386 mutants of the CalB active site [14]. This set
consisted of single- to eightfold mutants, and of these, the activity of
22 mutants was measured experimentally.
The accuracy of the method was calibrated in terms of qualitative
predictivity, that is, how reliably the method can predict whether a
mutant will have higher or lower activity than the WT. For this calibration,
the following points are to be noted. In the experiments, the activity
relative to the WT, in principle, ranges from zero to infinity. In
the calculations, on the other hand, the activity of a mutant is
determined by the difference between the mutant reaction barrier
and the WT barrier.^a However, even for a very active mutant, the
reaction barrier is not expected to be zero, and so in this approach,
since lower barriers are related to higher activities, the amount by
which a mutant can be more active than the WT is in principle
limited. Conversely, since the reaction barrier of a mutant can in
principle be arbitrarily high, the amount by which a mutant could
be less active than the WT might be infinite. Establishing a direct
quantitative relationship between differences in reaction barrier
heights and relative experimental activities is therefore not very
a If a mutant has a lower reaction barrier than the WT, it is understood to have higher
activity.


[Figure 23.3 chart: experimental and predicted activities of the mutants, plotted above (activity > WT) or below (activity < WT) the zero line.]

Figure 23.3 Agreement between predicted and experimental qualitative
activity relative to the wild type [14].

meaningful. Instead, it is more meaningful to only qualitatively
compare activities, that is, if a mutant has higher or lower activity
than the WT (both experimentally and calculated), and use these
results for determining the predictivity.
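
The asymmetry described above can be made concrete with a small
Python sketch. It assumes, purely for illustration, a transition
state theory picture in which the rate ratio depends exponentially on
the barrier difference and all prefactors are equal; this conversion
is not part of the screening method presented here, and the WT
barrier value is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def relative_rate(barrier_mutant, barrier_wt, temperature=298.15):
    """k_mutant / k_WT from barrier heights (kcal/mol), assuming equal prefactors."""
    return math.exp(-(barrier_mutant - barrier_wt) / (R_KCAL * temperature))


barrier_wt = 10.0  # hypothetical WT barrier, illustration only
# A barrier cannot drop below zero, so the possible gain is bounded ...
largest_gain = relative_rate(0.0, barrier_wt)
# ... whereas the loss keeps growing as the mutant barrier increases.
losses = [relative_rate(barrier_wt + extra, barrier_wt) for extra in (5.0, 10.0, 20.0)]
print(largest_gain, losses)
```

Under these assumptions the improvement factor has an upper bound set
by the WT barrier, while the degradation factor approaches zero
without limit, which is one way of seeing why only a qualitative
comparison is attempted.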
To cluster the results into increased/decreased activities, it
is therefore necessary to decide above/below which activity a
mutant is considered part of either the higher-activity or
lower-activity cluster. Considering the relatively narrow spread
of experimental activities,^a mutants with ≥1.2 times the WT
activity are considered improving, while mutants with <0.8 times
the WT activity are considered degrading (mutants in between
are considered as neither improving nor degrading) [14]. The
computational results are clustered in a similar way. A cutoff value
for the calculated reaction barriers is introduced below which the
mutant is considered improving and above which the mutant is
considered degrading. It is found that at a specific cutoff value (12.5
kcal/mol), the method qualitatively predicts the activity of 15 out of
22 mutants correctly (Fig. 23.3). Notably, mutants with both
higher and lower activity are correctly predicted. Importantly,
the purpose of clustering the mutants in this way is not to maximize
the apparent agreement with the experimental data in the first place,
but rather to provide a lower limit for the predictivity under
a The experimental improvement factors range from 0 to roughly 11 times WT activity.



the given set of approximations. It is important to keep in mind
that calibration data are unlikely to be available in a
computational enzyme activity assay, since ideally it is performed
before wet-lab experiments are started. Such an approach
is therefore further justified, since at the outset of a screening study,
the focus is mostly on narrowing down possible candidates from a
large library, that is, the focus is on obtaining qualitative conclusions
rather than quantitative results [1]. Following such a screening
analysis, of the mutants predicted to have higher activity, a set
consisting of as many mutants as can be further characterized is
selected and suggested for further study, either computationally or
experimentally.
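
The clustering and the resulting agreement count can be sketched in
Python as follows. The thresholds (1.2, 0.8, and 12.5 kcal/mol) are
those quoted above; the mutant names and values are made up for
illustration and are not the experimental data of the study. How a
barrier exactly at the cutoff and the in-between experimental cluster
enter the count is not stated in the text, so both choices below are
assumptions of this sketch.

```python
def classify_experiment(relative_activity, upper=1.2, lower=0.8):
    """Cluster a measured activity (relative to WT = 1.0)."""
    if relative_activity >= upper:
        return "improving"
    if relative_activity < lower:
        return "degrading"
    return "neutral"


def classify_barrier(barrier, cutoff=12.5):
    """Cluster a calculated barrier (kcal/mol); a tie counts as degrading (assumption)."""
    return "improving" if barrier < cutoff else "degrading"


def count_agreement(mutants):
    """Count mutants whose experimental and computed clusters coincide."""
    return sum(
        classify_experiment(activity) == classify_barrier(barrier)
        for _, activity, barrier in mutants
    )


# Made-up (name, relative activity, barrier) triples, for illustration only.
toy_mutants = [
    ("M1", 3.0, 9.8),
    ("M2", 0.1, 15.2),
    ("M3", 1.5, 13.1),
    ("M4", 1.0, 12.0),
]
agreement = count_agreement(toy_mutants)
print(f"{agreement} of {len(toy_mutants)} toy mutants agree")
```

In this sketch a mutant in the experimental in-between cluster never
counts as an agreement, which makes the count a conservative one.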

Screening results and discussion of mutants: Finding mutants
with reaction barriers significantly lower than the WT is difficult.
In the set of 386 screened mutants, we identified only 3 mutants
with a barrier lower than the WT (a double, a triple, and a fourfold
mutant), while a large proportion of the mutants have barriers in
the range from 12 to 14 kcal/mol. We find that a large number of
mutants need to be screened to identify the best few candidates
for further study. This conclusion is likely even more true in studies
where the enzyme intrinsically has only very little activity
toward a specific substrate (i.e., it effectively does not work).
Further analysis of the screening results was focused on two
residues, W104 and I189. These are selected because mutations of
these positions can provide more space for the substrate, which is
hypothesized to lead to increased activity (Fig. 23.4). As seen, W104
and I189 are in direct proximity to the substrate. From the
single-mutant data, it is found that the mutants W104Q and W104Y have
barriers just around the cutoff value of 12.5 kcal/mol. However,
in combination with other mutations, as in
G39A-W104F-A141Q-I189A (8.3 kcal/mol) or G39A-T103G-W104Y-A141N (9.8
kcal/mol), these mutations are among the most active ones found. It
is noted that it would be very difficult to arrive at these combinations
by purely rational considerations of the active site.
Analysis of mutations of I189 (Fig. 23.5) provides additional
interesting insight into the nonlinear effects of various mutations.
Five different mutations of this residue are screened, but to illustrate


Figure 23.4 CalB active site; some residues omitted for clarity. SUB labels
the substrate.
[Figure 23.5 plot (PM6//MOZYME): mutation order (1–8) versus calculated reaction barrier (kcal/mol); data series I189A, I189G, I189H, I189N, I189Y, and OTHER.]

Figure 23.5 Screening of position 189. Each data point corresponds to a
mutant with a calculated reaction barrier on the x axis and the number
of further mutations indicated by the value of the y axis. The data point
symbols indicate either of the point mutations I189A/G/H/N or Y. Data
points/mutants labeled as OTHER do not contain a mutation of I189. For
example, the far-left purple square on row 3 indicates a mutant containing
I189Y and two other mutations that are not explicitly indicated in the
graph.

the point, the focus is on I189G. As a single mutant, I189G is
found to be completely unproductive (18.9 kcal/mol; far-right red
triangle in the row “Mutation Order = 1” of Fig. 23.5). In combination
with another mutation (row 2; the other mutation is not explicitly
labeled in the diagram), the barrier is reduced in one case to
around 17.5 kcal/mol, and G39A-A141Q-I189G-L278A, which contains the
I189G mutation, is among the top three mutants found (6.3 kcal/mol, row
4). By comparison, the mutant G39A-A141Q-L278A (not shown) has a
moderately low barrier (10.9 kcal/mol). These results emphasize
that the activity-modulating effect of specific point mutations
can be strongly nonlinear and nonadditive, which should be kept
in mind when considering which mutations to include in further
studies. Put differently, these findings illustrate that it is difficult
to predict the activity of combination mutants on the basis of
single-mutant data alone. Furthermore, since single-mutant data are
apparently not sufficient to predict combination mutant activity,
screening a wide library of mutants should form an essential initial
part of any enzyme engineering project. It thus becomes clear that
computational activity screening can make a major contribution
to reducing the cost and material use of the total enzyme engineering
effort.
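
The nonadditivity discussed above can be quantified with a simple
additive reference model, sketched below in Python. The three mutant
barriers are the values quoted in the text; the WT barrier `e_wt` is
a placeholder that is not taken from this chapter, and treating the
triple mutant G39A-A141Q-L278A as a single component is a
simplification made only for this illustration.

```python
def additive_prediction(e_wt, component_barriers):
    """Barrier expected if each component shifted the WT barrier independently."""
    return e_wt + sum(e - e_wt for e in component_barriers)


def nonadditivity(e_wt, component_barriers, e_combined):
    """Deviation of the combined barrier from the additive prediction (kcal/mol)."""
    return e_combined - additive_prediction(e_wt, component_barriers)


e_wt = 7.0          # hypothetical WT barrier, NOT a value from this chapter
e_i189g = 18.9      # single mutant I189G
e_triple = 10.9     # G39A-A141Q-L278A
e_quadruple = 6.3   # G39A-A141Q-I189G-L278A

deviation = nonadditivity(e_wt, [e_triple, e_i189g], e_quadruple)
print(f"deviation from additivity: {deviation:.1f} kcal/mol")
```

With these three barriers, the deviation equals `e_wt - 23.5`, so the
additive model could only hold for a WT barrier of 23.5 kcal/mol,
which would be higher than the barrier of the I189G single mutant
itself.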

Conclusions and outlook: In the study presented in the next
section, the method is improved with respect to a number of
points. First, because of the excessive
computational effort, not the complete enzyme structure was used
in the calculations; only amino acids within 8 Å of the substrate
were included in the model. The backbone sequence is
therefore interrupted, which introduces ambiguity about which amino acids
to include in the model and how to orient the hydrogen atoms used to
complete the valences at the interruption sites, and possibly allows
for unrealistic structural rearrangements during the geometry
optimizations of the structure. Second, the mutated positions
were selected, in part, on the basis of prior knowledge of the
enzyme instead of systematically screening all active site positions;
however, a working proof-of-concept is provided by the data. Lastly,
it remained to be shown that the presented approach could also

[Figure 23.6: chemical structure of the substrate.]

Figure 23.6 ortho-nitrophenol-β-xylobioside substrate used in a Bcx study,
consisting of two xylose units and ortho-nitrophenol (ONP).

be adapted to mechanistically different reactions of other enzymes.
These points are addressed in the study in the following section.

23.4.3 Engineering Bacillus circulans Xylanase

Relevance and summary: To address the points summarized at
the end of the previous section, the activity of Bcx toward an
artificial substrate (ortho-nitrophenol-β-xylobioside) (Fig. 23.6) is
engineered. Understanding mechanisms of glycoside hydrolases
can also be of major relevance to related fields such as biofuel
production [10, 31, 45, 47].
Structure and mechanism: Due to its use in paper production
[3, 6], Bcx is also well characterized, a number of PDB structures
are available, and, because of its interesting mechanistic features,
it is also the subject of ongoing basic experimental research [26]. In
this study, the mechanism was taken from the literature, as reported
previously [20, 21].
In a first step, the WT reference reaction barrier is established.
This is a crucial step because all subsequent conclusions about
mutants depend on how well the WT reaction barrier is estimated.
As was done for CalB, only the rate-determining step is modeled
(Fig. 23.7). Notably, in the formation of the covalently linked
glycoside–enzyme complex, the substrate isomerizes from a chair to
a boat conformation. As outlined in the methods section, the reaction
barrier is mapped out by fixing the reaction coordinate ($x_1$ in
Fig. 23.7) at discrete, decrementing values, while optimizing the rest
of the protein structure. Since the reaction consists of two concerted

[Figure 23.7 scheme: glycosylation step in the Bcx active site between Glu78 and Glu172, with the constrained reaction coordinate x1 and carbon 1 of the substrate labeled.]

Figure 23.7 Rate-determining step in conventional glycosylation. $x_1$:
constrained reaction coordinate; R: xylose; R′: ONP. C1 indicates the
nucleophilic carbon of the first xylose unit [16].

events (nucleophilic attack by E78 and proton transfer from E172


to ONP), in principle a two-dimensional potential energy surface
could be mapped out to determine the reaction barrier. However,
since proton transfer is associated with a rearrangement of the
electronic structure, this reaction coordinate is left unconstrained
and is handled solely by the quantum chemical method, that is,
the proton is free to transfer from or remain on E172. Forcing
the quantum chemical model to do something it would not, if left
unconstrained, most likely results in an increased reaction barrier,
and therefore it is advisable to keep the number of constraints as
low as possible.
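The scan described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the start and end values of x1, the number of frames, and all energies are hypothetical placeholders, and the barrier is assumed to be the highest energy along the scan minus the energy of the first frame (taken here as the ES complex). In a real workflow each frame would be a constrained geometry optimization carried out by the quantum chemical program.

```python
def scan_values(start, stop, n_frames):
    """Return n_frames evenly spaced values of the constrained coordinate,
    running from start down to stop (both in angstrom)."""
    if n_frames < 2:
        raise ValueError("need at least two frames")
    step = (start - stop) / (n_frames - 1)
    return [start - i * step for i in range(n_frames)]


def barrier_from_profile(energies):
    """Barrier estimate: highest energy along the scan minus the energy of
    the first frame (assumed here to be the ES complex)."""
    return max(energies) - energies[0]


# Hypothetical start and end distances for x1; the source does not list them.
x1_values = scan_values(3.0, 1.5, 11)

# Hypothetical energies (kcal/mol), one per frame, relative to frame 0.
profile = [0.0, 0.8, 2.1, 4.6, 8.3, 12.9, 16.7, 18.5, 15.2, 9.4, 3.0]

barrier = barrier_from_profile(profile)
print(f"{len(x1_values)} frames, estimated barrier {barrier:.1f} kcal/mol")
```

Because each frame is an independent constrained optimization, the frames can in principle be run in parallel, which is what makes the per-barrier wall time discussed below achievable.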
A PDB structure comprising a covalently linked inhibitor is
available (1BVV) and is used as a starting point for the modeling
procedure because it gives a very clear indication of how the
substrate is located in the active site. As shown in Fig. 2 in [16], two
possible modeling pathways were worked out. In the first one, the ES
complex is prepared from the GE structure before the GE structure
is optimized (see footnote b). Alternatively, the GE structure is
optimized first and then used as a template for the ES structure.
Within this alternative approach, the preparation of the ES structure
then just requires minor modifications of substrate conformation
before the structure is optimized.
Because in this second approach the ES structure very closely
resembles the optimized GE structure, it is possible to restrict
the optimization of the ES structure to the active site, while
the remote part of the enzyme remains fixed at the optimized
geometry of the GE structure. This results in very large gains in
computational efficiency, which depend on the layer of residues
being optimized around the active site (Fig. 23.8A; see footnote a).
Fixing, that is, not re-optimizing, an already optimized part of the
enzyme, however, also results in a slightly increased energy of the ES
structure relative to the fully optimized GE structure, and so the
reaction barriers are decreased (Fig. 23.8B). This increase (of ES
energy relative to GE) results from multiple minor structural strains
that form at the interface between the fixed and optimized layers of
the structure.

Calculations and experimental verification: On the basis of this
second approach, that is, using the optimized GE structure as the
template for the ES structure, and without applying any constraints
(apart from the reaction coordinate x1), the reaction barrier is
found to be 18.5 kcal/mol, which is in close agreement with
the experimentally reported value of 17.0 kcal/mol obtained from
transition-state theory [21]. This supports the view that the outlined
modeling procedure and the quantum chemical method provide a
meaningful model of the enzyme kinetics. As pointed out before,
applying constraints on parts of the structure that are located away
from the active site can decrease the reaction barrier because the
ES structure is increased in energy relative to the GE reference
point (which is set to zero). However, it is reasonable to assume
that this relative increase is the same for all studied structures
and so cancels out once mutants are compared relative to each
other. Overall, application of constraints to remote parts of the
enzyme is necessary to reach the desired calculation efficiency of
one reaction barrier per day when running each interpolation step
in parallel. Application of constraints has been widely adopted by
the community [18, 23, 25, 37]. The time required for optimizing the
structures with or without constraints is found to differ by a factor
of 6 (Fig. 4B in [16]).
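To give a feeling for what a 1.5 kcal/mol difference between the calculated and experimental barrier means, the following sketch applies the Eyring equation of transition-state theory, k = (kB·T/h)·exp(−ΔG‡/RT). It is an illustration only: the temperature of 298.15 K and a transmission coefficient of 1 are assumptions, since the source does not state the conditions used to derive the experimental value.

```python
import math

R_KCAL = 1.98720425864083e-3   # gas constant, kcal/(mol K)
K_B = 1.380649e-23             # Boltzmann constant, J/K
H = 6.62607015e-34             # Planck constant, J s


def rate_from_barrier(dg_kcal, temperature=298.15):
    """Eyring rate constant (1/s) for a free-energy barrier in kcal/mol."""
    prefactor = K_B * temperature / H
    return prefactor * math.exp(-dg_kcal / (R_KCAL * temperature))


def barrier_from_rate(k_per_s, temperature=298.15):
    """Inverse Eyring relation: barrier (kcal/mol) from a rate constant."""
    prefactor = K_B * temperature / H
    return -R_KCAL * temperature * math.log(k_per_s / prefactor)


k_exp = rate_from_barrier(17.0)    # barrier reported from experiment
k_calc = rate_from_barrier(18.5)   # barrier from the calculation
print(f"rate ratio for a 1.5 kcal/mol difference: {k_exp / k_calc:.1f}")
```

Under these assumptions, a barrier difference of 1.5 kcal/mol corresponds to roughly one order of magnitude in rate, which is why relative comparisons between mutants are more robust than absolute barrier values.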

a The structural constraints are parameters to the calculation program and are
prepared programmatically using a PYMOL script. The way the parameters are
submitted to the calculation may vary depending on the software used for
calculations.
Figure 23.8 (A) Illustration of optimization layers of Bcx with a bound
ONP substrate. Color-coded residue layers have at least one atom within the
indicated distance to any atom of the substrate: green 8 Å, red 10 Å, purple
12 Å, and cyan 14 Å. Brown residues remain at their optimized GE geometry,
and dark-blue ones are residues that are screened for active mutations. The
substrate is visible in the center of the dark-blue residues. (B) Calculated
reaction barriers for different layers of optimized residues (based on
[16]); the plot shows energy [kcal/mol] against interpolation frame for the
series UNCON., OPT 8 Å, OPT 10 Å, and OPT 12 Å. MOZ.: Energy evaluated
using PM6 and the MOZYME linear scaling method.
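The layer definition in Fig. 23.8A (a residue belongs to a layer if at least one of its atoms lies within the given distance of any substrate atom) can be sketched in plain Python. This is not the PYMOL script mentioned in footnote a; the residue names and coordinates below are hypothetical toy data, and a real script would read atom positions from the structure file.

```python
import math


def assign_layers(residue_atoms, substrate_atoms, cutoffs=(8.0, 10.0, 12.0, 14.0)):
    """Map each residue to the smallest cutoff (in angstrom) within which at
    least one of its atoms lies from any substrate atom. Residues beyond the
    largest cutoff map to None, meaning they stay fixed."""
    layers = {}
    for res_id, atoms in residue_atoms.items():
        d_min = min(math.dist(a, s) for a in atoms for s in substrate_atoms)
        layers[res_id] = next((c for c in sorted(cutoffs) if d_min <= c), None)
    return layers


# Toy coordinates (angstrom); purely illustrative.
substrate = [(0.0, 0.0, 0.0)]
residues = {
    "RES_A": [(5.0, 0.0, 0.0), (20.0, 0.0, 0.0)],  # one atom close enough
    "RES_B": [(0.0, 9.0, 0.0)],
    "RES_C": [(0.0, 0.0, 13.0)],
    "RES_D": [(30.0, 0.0, 0.0)],                   # remains fixed
}

layers = assign_layers(residues, substrate)
for res_id, layer in layers.items():
    print(res_id, "fixed" if layer is None else f"optimized within {layer} Å")
```

Choosing a smaller cutoff means fewer atoms are optimized, which lowers the cost per frame at the price of more strain at the boundary between the fixed and optimized regions.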


Screening results and discussion of mutants: As for the modeling
of the WT reaction, it is found that different ways of modeling the
set of mutants result in significantly different levels of efficiency,
in terms of both the required manual effort and the computational
time. We refer to the original publication for details [16] but point
out that when, as for the WT, the ES structures of the mutants are
based on templates of optimized mutant GE structures, large gains in
computational efficiency are again possible (see Fig. 7 in [16]).
Using PYMOL [36], models of all possible single mutants within
a given distance of the substrate are prepared (excluding the
catalytically active E78 and E172; see footnote a). It is noted that
in the current implementation of the presented approach, all mutated
side chains are in their default ionization state at physiological pH.
Furthermore, in a number of cases it is not possible to introduce a
large side chain in a spatially restricted environment; such cases are
discarded. Lastly, instead of using a side chain rotamer library, each
mutated side chain is locally optimized (using an empirical built-in
PYMOL function) in the environment of the enzyme before being submitted
to the quantum calculation. This approach allowed a set of 317 single
mutants to be screened, where on average 29 h are required to
complete the calculations of a full reaction barrier using one CPU
for each interpolation point (see footnote b). From this set of single
mutants, for every position i, j, ... in the active site, the mutation
resulting in the mutant with the lowest barrier (X_i, Y_j, ...) is
identified. Then, from these best single mutations, all possible double
mutants (X_i, Y_j) are formed. Analysis of the reaction barriers of the
obtained candidates reveals that the set of double mutants on average
has lower activation barriers than the set of single mutants.
Furthermore, the lowest barriers are found among the double mutants
(Fig. 23.9).
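The combination step described above (pick the lowest-barrier mutation at each position, then pair the winners) can be sketched as follows. The barrier values are hypothetical placeholders and not data from [16]; the sketch only enumerates the double-mutant candidates, whose barriers would still have to be calculated individually.

```python
from itertools import combinations


def best_per_position(single_barriers):
    """single_barriers maps (position, new_residue) -> barrier in kcal/mol.
    Return {position: (new_residue, barrier)} keeping the lowest barrier."""
    best = {}
    for (pos, aa), barrier in single_barriers.items():
        if pos not in best or barrier < best[pos][1]:
            best[pos] = (aa, barrier)
    return best


def double_mutant_candidates(best):
    """All pairs of best single mutations at two different positions."""
    return [
        ((p1, best[p1][0]), (p2, best[p2][0]))
        for p1, p2 in combinations(sorted(best), 2)
    ]


# Hypothetical single-mutant barriers (kcal/mol), for illustration only.
singles = {
    (127, "W"): 12.0,
    (127, "A"): 15.0,
    (9, "D"): 13.5,
    (9, "E"): 13.0,
    (35, "E"): 14.0,
}

best = best_per_position(singles)
candidates = double_mutant_candidates(best)
for (p1, a1), (p2, a2) in candidates:
    print(f"{p1}{a1} + {p2}{a2}")
```

Restricting the pairs to the best mutation per position keeps the number of double mutants quadratic in the number of positions rather than in the number of single mutants.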
a The PYMOL script used for the preparation of the mutants is available at
https://github.com/mzhKU/Enzyme-Screening/blob/master/vsc-bcx.py.
b All obtained barriers are provided at
http://www.scribd.com/doc/133445214/Supp-Mat-Paper-4.

Figure 23.9 Barrier distribution of single and double Bcx mutants [16]. The
plot is a histogram of reaction barriers (x-axis: barrier [kcal/mol]; y-axis:
mutant count; series: single and double mutants), that is, it shows the
number of mutants found at a given reaction barrier.

Analysis of the set of mutants indicates that position 127 has a
strong influence on the activity (9 of the 20 best single mutants
carry a mutation at that position). Inspection of the structural
arrangement of the Q127W, W9D/E, and N35E mutations points to
a number of possible explanations of the observed lower barriers.
One observation is highlighted in Fig. 23.10. As discussed before, the
mechanism is postulated to involve a catalytically active, negatively
charged E78 residue as the nucleophile. If Q127 stabilizes the
negative charge by hydrogen bonding, the nucleophilic character
of E78 is reduced. Replacing the potential hydrogen bond donor
Q127 with a substitute of similar size that does not form this
hydrogen bond and preserves overall structural integrity, such as in
Q127W, could result in increased nucleophilicity of E78 (Fig. 23.10).
Furthermore, as illustrated in Fig. 23.7, the transition state involves
a partial positive charge on C1 of the substrate. Negative charges
on side chains (such as in W9D/E) could stabilize this partial
positive charge and thereby increase catalytic activity, that is,
by stabilization of the transition state. Similarly, when ionized, N35E
could stabilize C1, or, when in its neutral protonation state, it
could act as a hydrogen bond donor to the catalytically
active E172. This rationalization of N35 mutations would be in
agreement with a recently proposed reverse protonation mechanism
[21], which is still the subject of ongoing research.
Figure 23.10 Rationalization of reaction barriers. (A) Overlay of wild type
(black carbon spheres) and Q127W side chain (green). (B) Stabilizing
Coulombic interactions between the negatively charged W9D/E or N35E (red)
and the partially positively charged C1 of the substrate, which is the site
of nucleophilic attack. Overlay of Q127W mutant (green spheres) and
wild-type enzyme–substrate complex. The Q127W mutation preserves structural
integrity but prevents hydrogen bond formation to the nucleophilic E78. The
substrate is in black spheres in the upper part of the figure. Distances in
Å [16].

Conclusions and outlook: In overcoming the points raised at the
end of the previous section, we note that in this application the
method is extended such that the reaction barriers of all active
site single mutants are screened. Furthermore, the complete Bcx
structure is used as a model rather than just a large cluster
model as was done for CalB. Contrary to the results from the
CalB study, we identified numerous mutants with barriers that are
considerably lower than the WT barrier. This fact might, however,
also be attributed to significantly promoted catalysis by a better
fit of the substrate in the active site compared to the substrate
fit in the CalB study. A further important finding is that by
applying constraints to the outer, nonactive site part of the enzyme,
the computational demands for structural optimizations can be
reduced to around two days; this makes the method sufficiently
fast for industrial applications. It is noted, however, that applying
similarly layered constraints in the CalB study would most likely
have permitted the use of the complete structure in the optimizations
without excessive computational time requirements.

23.5 Conclusions

It is concluded that using linear scaling methods and a number
of customized software packages, it is possible to carry out large-
scale quantum chemistry–based enzyme activity screening studies.
An important finding is that single mutant data are not sufficient
to predict combination mutant activities, because the activity-
modulating effects of the point mutations behave nonlinearly when
in combination with other mutations. In detail this means that
two mutations that appear activating as single mutants are not
necessarily activating when combined in a double mutant. Also,
a deactivating single mutation can become an activating mutation
when in combination with other mutations (I189G in the CalB
study). On average, double mutants, as seen in the Bcx study,
have lower reaction barriers than single mutants, and a systematic
screening of the active site is required to identify the mutants with
the highest apparent activity. Lastly, using such a screening approach
allows us not only to identify mutants for further characterization
by higher-level methods or experimental verification but also to
formulate a hypothesis on how the various mutations actually
contribute to increased catalysis.
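The nonlinear behavior noted above can be quantified with a simple additivity check: if two mutations acted independently, the double-mutant barrier would equal the WT barrier plus the two single-mutant shifts. The sketch below is only an illustration of that bookkeeping; apart from the WT barrier of 18.5 kcal/mol quoted earlier, all numbers are hypothetical.

```python
def additive_prediction(wt, single_a, single_b):
    """Double-mutant barrier expected if the two single-mutant shifts
    relative to WT simply added up (all values in kcal/mol)."""
    return wt + (single_a - wt) + (single_b - wt)


def nonadditivity(wt, single_a, single_b, double):
    """Deviation of the calculated double-mutant barrier from additivity.
    Positive: the combination is worse than expected; negative: better."""
    return double - additive_prediction(wt, single_a, single_b)


wt_barrier = 18.5                      # WT barrier from the Bcx study
barrier_a, barrier_b = 16.0, 17.0      # hypothetical single mutants
barrier_ab = 17.5                      # hypothetical double mutant

expected = additive_prediction(wt_barrier, barrier_a, barrier_b)
deviation = nonadditivity(wt_barrier, barrier_a, barrier_b, barrier_ab)
print(f"additive expectation {expected:.1f}, deviation {deviation:+.1f} kcal/mol")
```

In this made-up case both single mutations lower the barrier, yet the double mutant is 3.0 kcal/mol higher than the additive expectation, which is the kind of outcome that makes explicit calculation of combination mutants necessary.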
As an outlook on the computational enzyme engineering field,
we note that some improvements and developments are anticipated.
Among these is the consideration of later reaction steps such as
isomerizations or product release steps. Also, it would be of great
value to the field if the software evolved to a state where it can be
routinely used with much less manual and customization effort to set
up, carry out, and analyze the calculations.

References

1. Agresti, J. J., Antipov, E., Abate, A. R., Ahn, K., Rowat, A. C., Baret,
J.-C., Marquez, M., Klibanov, A. M., Griffiths, A. D., and Weitz, D. A.
(2010). Ultrahigh-throughput screening in drop-based microfluidics for
directed evolution, Proc. Natl. Acad. Sci., 107, 9, pp. 4004–4009.
2. Bachega, J. F. R., Timmers, L. F. S., Assirati, L., Bachega, L. R., Field, M. J.,
and Wymore, T. (2013). Gtkdynamo: a pymol plug-in for qc/mm hybrid
potential simulations, J. Comput. Chem., 34, 25, pp. 2190–2196.
3. Bajpai, P. (1999). Application of enzymes in the pulp and paper industry,
Biotechnol. Prog., 15, 2, pp. 147–157.
4. Beilen, J., and Li, Z. (2002). Enzyme technology: an overview, Curr. Opin.
Biotechnol., 13, 4, pp. 338–344.
5. Bornscheuer, U. T., and Kazlauskas, R. J. (2004). Catalytic promiscuity
in biocatalysis: using old enzymes to form new bonds and follow new
pathways, Angew. Chem., Int. Ed., 43, 45, pp. 6032–6040.
6. Buchert, J., Tenkanen, M., Kantelinen, A., and Viikari, L. (1994).
Application of xylanases in the pulp and paper industry, Bioresour.
Technol., 50, 1, pp. 65–72.
7. Chaput, L., Sanejouand, Y.-H., Balloumi, A., Tran, V., and Graber, M.
(2012). Contribution of both catalytic constant and michaelis constant
to calb enantioselectivity: use of fep calculations for prediction studies,
J. Mol. Catal. B: Enzymatic, 76, pp. 29–36.
8. Claeyssens, F., Harvey, J., Manby, F., Mata, R., Mulholland, A., Ranaghan,
K., Schütz, M., Thiel, S., Thiel, W., and Werner, H. (2006). High-accuracy
computation of reaction barriers in enzymes, Angew. Chem., 118, 41, pp.
7010–7013.
9. Cramer, C. J. (2013). Essentials of computational chemistry: theories and
models (John Wiley & Sons).
10. Gao, D., Chundawat, S. P., Sethi, A., Balan, V., Gnanakaran, S., and Dale,
B. E. (2013). Increased enzyme binding to substrate is not necessary for
more efficient cellulose hydrolysis, Proc. National Academy of Sciences,
110, 27, pp. 10922–10927.
11. Gayot, S., Santarelli, X., and Coulon, D. (2003). Modification of flavonoid
using lipase in non-conventional media: effect of the water content, J.
biotechnology, 101, 1, pp. 29–36.
12. Goodsell, D., Morris, G., and Olson, A. (1996). Automated docking of
flexible ligands: applications of autodock, J. Molecular Recognition, 9, 1,
pp. 1–5.
13. Gotor-Fernández, V., Busto, E., and Gotor, V. (2006). Candida antarctica
lipase b: an ideal biocatalyst for the preparation of nitrogenated organic
compounds, Advanced synthesis & catalysis, 348, 7-8, pp. 797–812.

14. Hediger, M. R., De Vico, L., Rannes, J. B., Jäckel, C., Besenmatter, W.,
Svendsen, A., and Jensen, J. H. (2013). In silico screening of 393 mutants
facilitates enzyme engineering of amidase activity in calb, PeerJ, 1, p.
e145.
15. Hediger, M. R., De Vico, L., Svendsen, A., Besenmatter, W., and Jensen,
J. H. (2012). A computational methodology to screen activities of
enzyme variants, PLoS ONE, 7, 12, p. e49849,
doi:10.1371/journal.pone.0049849,
URL http://dx.doi.org/10.1371%2Fjournal.pone.0049849.
16. Hediger, M. R., Steinmann, C., De Vico, L., and Jensen, J. H. (2013). A
computational method for the systematic screening of reaction barriers
in enzymes: searching for bacillus circulans xylanase mutants with
greater activity towards a synthetic substrate, PeerJ, 1, p. e111.
17. Hedstrom, L. et al. (2002). Serine protease mechanism and specificity,
Chemical reviews, 102, 12, pp. 4501–4524.
18. Himo, F. (2006). Quantum chemical modeling of enzyme active sites
and reaction mechanisms, Theoretical Chemistry Accounts: Theory,
Computation, and Modeling (Theoretica Chimica Acta), 116, 1, pp. 232–
240.
19. Jensen, F. (2007). Introduction to computational chemistry (John Wiley
& Sons).
20. Joshi, M., Sidhu, G., Nielsen, J., Brayer, G., Withers, S., and McIntosh,
L. (2001). Dissecting the electrostatic interactions and pH-dependent
activity of a family 11 glycosidase, Biochemistry, 40, 34, pp. 10115–
10139.
21. Joshi, M., Sidhu, G., Pot, I., Brayer, G., Withers, S., and McIntosh, L. (2000).
Hydrogen bonding and catalysis: a novel explanation for how a single
amino acid substitution can change the pH optimum of a glycosidase, J.
molecular biology, 299, 1, pp. 255–279.
22. Kirk, O., Borchert, T. V., and Fuglsang, C. C. (2002). Industrial enzyme
applications, Curr. Opin. Biotechnol., 13, 4, pp. 345–351.
23. Liao, R.-Z., and Thiel, W. (2012). Comparison of qm-only and qm/mm
models for the mechanism of tungsten-dependent acetylene hydratase,
J. Chem. Theory Comput., 8, 10, pp. 3793–3803.
24. Lind, M. E., and Himo, F. (2013). Quantum chemistry as a tool in
asymmetric biocatalysis: Limonene epoxide hydrolase test case, Angew.
Chem., 125, 17, pp. 4661–4665.
25. Lonsdale, R., Houghton, K. T., Zurek, J., Bathelt, C. M., Foloppe, N.,
de Groot, M. J., Harvey, J. N., and Mulholland, A. J. (2013). Quantum
mechanics/molecular mechanics modeling of regioselectivity of drug
metabolism in cytochrome p450 2c9, J. Am. Chem. Soc., 135, 21, pp.
8001–8015.
26. Ludwiczek, M. L., D'Angelo, I., Yalloway, G. N., Brockerman, J. A., Okon,
M., Nielsen, J. E., Strynadka, N. C., Withers, S. G., and McIntosh, L. P.
(2013). Strategies for modulating the pH-dependent activity of a family
11 glycoside hydrolase, Biochemistry, 52, 18, pp. 3138–3156.
27. Meyer, H.-P., Eichhorn, E., Hanlon, S., Lütz, S., Schürmann, M., Wohlge-
muth, R., and Coppolecchia, R. (2013). The use of enzymes in organic
synthesis and the life sciences: perspectives from the swiss industrial
biocatalysis consortium (sibc), Catal. Sci. Technol., 3, 1, pp. 29–40.
28. Naik, S., Basu, A., Saikia, R., Madan, B., Paul, P., Chaterjee, R., Brask, J., and
Svendsen, A. (2010). Lipases for use in industrial biocatalysis: specificity
of selected structural groups of lipases, J. Mol. Catal. B: Enzymatic, 65, 1–
4, pp. 18–23.
29. Nakagawa, Y., Hasegawa, A., Hiratake, J., and Sakata, K. (2007).
Engineering of Pseudomonas aeruginosa lipase by directed evolution
for enhanced amidase activity: mechanistic implication for amide
hydrolysis by serine hydrolases, Protein Eng. Design Sel., 20, 7, pp. 339–
346.
30. Noodleman, L., Lovell, T., Han, W., Li, J., and Himo, F. (2004). Quantum
chemical studies of intermediates and reaction pathways in selected
enzymes and catalytic synthetic systems, Chem. Rev., 104, 2, pp. 459–
508.
31. Ragauskas, A. J., Williams, C. K., Davison, B. H., et al. (2006). The path
forward for biofuels and biomaterials, science, 311, 5760, pp. 484–489.
32. Řezáč, J., Fanfrlík, J., Salahub, D., and Hobza, P. (2009). Semiempirical
quantum chemical pm6 method augmented by dispersion and h-bonding
correction terms reliably describes various types of noncovalent
complexes, J. Chem. Theory Comput., 5, 7, pp. 1749–1760.
33. Řezáč, J., and Hobza, P. (2011). A halogen-bonding correction for the
semiempirical pm6 method, Chem. Phys. Lett., 506, 4, pp. 286–289.
34. Schenker, S., Schneider, C., Tsogoeva, S., and Clark, T. (2011). Assessment
of popular dft and semiempirical molecular orbital techniques for
calculating relative transition state energies and kinetic product
distributions in enantioselective organocatalytic reactions, J. Chem.
Theory Comput., 7, 11, pp. 3586–3595.
35. Schmid, A., Hollmann, F., Park, J. B., and Bühler, B. (2002). The use of
enzymes in the chemical industry in Europe, Curr. Opin. Biotechnol., 13,
4, pp. 359–366.
36. Schrödinger, LLC (2010).
37. Siegbahn, P. E., and Himo, F. (2009). Recent developments of the
quantum chemical cluster approach for modeling enzyme reactions, J.
Biol. Inorg. Chem., 14, 5, pp. 643–651.
38. Stewart, J. (1990). Mopac: a semiempirical molecular orbital program, J.
Comp.-Aided Mol. Design, 4, 1, pp. 1–103.
39. Stewart, J. (1996). Application of localized molecular orbitals to the
solution of semiempirical self-consistent field equations, Int. J. Quantum
Chem., 58, 2, pp. 133–146.
40. Stewart, J. (2007). Optimization of parameters for semiempirical
methods V: modification of NDDO approximations and application to 70
elements, J. Mol. Modeling, 13, 12, pp. 1173–1213.
41. Stewart, J. (2009). Application of the PM6 method to modeling proteins,
J. Mol. Modeling, 15, 7, pp. 765–805.
42. Svedendahl, M., Carlqvist, P., Branneby, C., Allnér, O., Frise, A., Hult,
K., Berglund, P., and Brinck, T. (2008). Direct epoxidation in candida
antarctica lipase b studied by experiment and theory, ChemBioChem,
9, 15, pp. 2443–2451, doi:10.1002/cbic.200800318,
URL http://dx.doi.org/10.1002/cbic.200800318.
43. Uppenberg, J., Hansen, M. T., Patkar, S., and Jones, T. A. (1994). The
sequence, crystal structure determination and refinement of two crystal
forms of lipase b from candida antarctica, Structure, 2, 4, pp. 293–308.
44. Uppenberg, J., Oehrner, N., Norin, M., Hult, K., Kleywegt, G., Patkar, S.,
Waagen, V., Anthonsen, T., and Jones, T. (1995). Crystallographic and
molecular-modeling studies of lipase B from Candida antarctica reveal
a stereospecificity pocket for secondary alcohols, Biochemistry, 34, 51,
pp. 16838–16851.
45. Yeoman, C. J., Han, Y., Dodd, D., Schroeder, C. M., Mackie, R. I., and
Cann, I. K. (2010). Thermostable enzymes as biocatalysts in the biofuel
industry, Adv. Appl. Microbiol., 70, pp. 1–55.
46. Young, D. (2004). Computational Chemistry: A Practical Guide for
Applying Techniques to Real World Problems (John Wiley and Sons).
47. Zhang, Y.-H. P. (2011). What is vital (and not vital) to advance
economically-competitive biofuels production, Process Biochem., 46, 11,
pp. 2091–2110.

February 2, 2016 16:53 PSP Book - 9in x 6in 24-Allan-Svendsen-c24

Chapter 24

In Silico Screening of Enzyme Variants by Molecular Dynamics Simulation

Hein J. Wijma
Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen,
Nijenborgh 4, 9747 AG Groningen, The Netherlands
h.j.wijma@rug.nl

24.1 Potential Applications of MD Simulations for Improving Enzymes

Enzymatic catalysis can demonstrate impressively high catalytic
rates, enantioselectivities, and regioselectivities. These
achievements are due to natural evolution, which optimized the 3D
structures of enzymes to efficiently carry out their catalytic tasks,
as required for the organism’s survival [1]. In this way, a huge
diversity of enzymes has evolved in nature to efficiently produce
an enormous range of primary and secondary metabolites. Some
of these enzymes could be recruited to catalyze reactions in
various medical, household, agricultural, and industrial applications.
Compared to classical chemical methods, enzymes produce fewer side
products and work under milder conditions. This leads to higher
product yields, lower costs, and environmental benefits if an enzyme
can carry out the required catalysis. However, additional adaptations
to the enzyme are often required to enable an application [2, 3].
Stability, the catalytic rate for unnatural substrates, and regio- or
enantioselectivity of natural enzymes are often inadequate for the
intended novel application and then need to be improved [2].

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com
There are various protein engineering methods to improve the
properties of enzymes. The most common approaches are directed
evolution [4], bioinformatics-inspired engineering [5], rational en-
gineering [6, 7], and computational design [8–10]. The applications
of the resulting engineered enzymes vary from the production of
bulk chemicals, such as biofuels, to the synthesis of enantiopure
pharmaceuticals [2, 11] or medical enzyme applications [12].
Directed evolution can be a powerful method to improve enzyme
properties but requires high-throughput experimental screening
and as a result often requires large investments of money and
time. Also the other approaches may require a long time span
to obtain improved variants. There is a need for techniques to
engineer the properties of enzymes more efficiently, and with higher
predictability, than with existing methodologies [2].
Molecular dynamics (MD) screening has the potential to save
considerable amounts of time and resources, while improving
enzymes, but as of yet other approaches are still much more
commonly applied. Until a few years ago, it was difficult to do
more than a handful of MD simulations due to the long calculation
times. This limitation has now been relieved by new generations
of computer hardware. Efficient in silico screening also requires
knowledge of the 3D structure, such as a high-resolution X-
ray structure of the enzyme, and a sufficient understanding of
what mechanistically limits the targeted catalytic parameter in the
original enzyme. Despite such challenges, MD simulation can already
be used to successfully screen for improved enzyme properties
(Table 24.1). Examples now include >1000-fold increases in
kcat , conversions of mesostable to thermostable enzymes, and
introduction of previously nonexistent enantioselectivities.
The focus of this chapter is on how MD simulations can be used to
predict which enzyme variants will have improved catalytic activity
or stability (Table 24.1). Subsequently we will address when to
use MD or another computational technique and how to predict
improved catalytic activity, substrate binding affinity, and enzyme
stability. Additionally we will analyze how to improve the accuracy
of MD-based predictions. Finally, some concluding remarks will be
made.

Table 24.1 Examples of enzyme improvement obtained by screening potential mutants with MD simulation

| Goal | Achievements | Selection protocol and additional information |
|---|---|---|
| Faster cocaine hydrolysis by human butyrylcholinesterase | kcat/KM improved 4400-fold to 6.7 × 10⁷ M⁻¹·s⁻¹ and kcat improved 3700-fold to 243 s⁻¹. | The enzyme–transition state complex was simulated. Variants with mutations predicted to stabilize the transition state through H bonds were experimentally characterized. Five rounds of in silico screening followed by experimental verification introduced six mutations [12–17]. |
| Improved regioselectivity of limonene oxidation by P450-BM3 | Production of targeted perillyl alcohol increased to 97%. | The enzyme–substrate complex of mutant variants was simulated. The presence of near-attack conformations (NACs) was monitored to select positions for mutagenesis. This resulted in the selection of three mutations after two rounds [18]. |
| Improved enantioselectivity of a limonene epoxide hydrolase | Production of targeted (1R,2R)- and (1S,2S)-cyclopentanediol increased to 92% and 95%, respectively. | The enzyme–substrate complex of computationally designed variants was simulated. The presence of NACs was monitored to select five- and sevenfold mutants with opposite enantioselectivities in one round [19]. |
| Increased amidase activity of Candida antarctica lipase B | kcat/KM improved sevenfold to 0.58 M⁻¹·s⁻¹. | A specific H bond was monitored during simulation of the enzyme–substrate complex. Mutants with increased occurrence of this H bond were experimentally characterized. Variant I189Y gave the highest kcat/KM [20]. |
| Introduction of Kemp elimination activity | Catalytic activity for Kemp elimination introduced de novo, with a kcat/KM of 430 M⁻¹·s⁻¹. | Both MD simulations with substrate bound (to monitor H bonds in the active site) and without substrate (to monitor preorganization) were used to select computationally designed variants, with 12–13 mutations, for experimental characterization [21]. |
| Shift specificity of a dehydrogenase from NADH to NADPH | kcat/KM for NADPH was increased from nondetectable to 1.5 × 10⁴ M⁻¹·s⁻¹, which was identical to the original kcat/KM for NADH. | MD simulations were performed with the NADPH substrate bound. Mutations that provided H bonds to the newly introduced phosphate groups were selected for experimental characterization, leading to the best mutant Q20R/D43Q [22]. |
| Improved stability of a cocaine esterase | Half-life was improved 50-fold. | Point mutants of the cocaine esterase were simulated at 400 K to monitor their unfolding rate. The best variant was L169K [23]. |
| Improved stability of a limonene epoxide hydrolase | Apparent melting temperature (T_M^app) increased by 35 °C, and half-life improved more than 250-fold. | MD simulations were used to predict the effect of the mutations on protein flexibility. After initial experimental screening of 64 variants, the best 10 mutations were combined [24]. |
| Improved stability of haloalkane dehalogenase LinB | Apparent melting temperature (T_M^app) improved by 23 °C, and half-life improved more than 200-fold. | With the same protocol as in the entry above, MD simulations were used to predict the effect of the mutations on protein flexibility. After initial experimental screening of variants, 12 mutations were combined [25]. |

For these examples, MD simulation was used to evaluate, in silico, variants before their
experimental characterization.
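Several of the protocols in Table 24.1 rank variants by how often a particular hydrogen bond (or a near-attack conformation) occurs during the simulation. A minimal sketch of such an occupancy calculation is given below. The geometric criteria (donor–acceptor distance of at most 3.5 Å and donor–H–acceptor angle of at least 120°) are common choices but are assumptions here, and the per-frame values are hypothetical; in practice they would be extracted from the MD trajectory with an analysis tool.

```python
def hbond_occupancy(distances, angles, max_distance=3.5, min_angle=120.0):
    """Fraction of frames in which the hydrogen bond is considered formed.
    distances: donor-acceptor distance per frame (angstrom).
    angles: donor-H-acceptor angle per frame (degrees)."""
    if len(distances) != len(angles):
        raise ValueError("distances and angles must have the same length")
    if not distances:
        raise ValueError("at least one frame is required")
    formed = sum(
        1 for d, a in zip(distances, angles)
        if d <= max_distance and a >= min_angle
    )
    return formed / len(distances)


# Hypothetical per-frame values for two variants (five frames each).
wild_type = hbond_occupancy(
    [2.9, 3.1, 4.2, 3.4, 3.0], [160.0, 150.0, 170.0, 110.0, 135.0]
)
variant = hbond_occupancy(
    [2.8, 3.0, 3.2, 3.3, 2.9], [155.0, 160.0, 145.0, 130.0, 150.0]
)

ranking = sorted({"wild type": wild_type, "variant": variant}.items(),
                 key=lambda item: item[1], reverse=True)
for name, occupancy in ranking:
    print(f"{name}: H bond present in {occupancy:.0%} of frames")
```

Real trajectories contain thousands of frames, and occupancies from several independent simulations would normally be compared before a variant is selected for experimental characterization.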


24.2 Molecular Dynamics vs. Other in silico Methods

When selecting a computational method to model the properties
of an enzyme, there are two main criteria (Scheme 24.1). The first
is energetic accuracy: how accurately the computational method
can predict the energy of each modeled conformation. The second,
equally important consideration is sampling ability: how many
of the possible conformations and variants the method can sample
with the available computer facilities. Sampling is highly relevant
because enzymes can have a very large number of possible
conformations, potential modes of binding their substrates, and
potential mutations.

Scheme 24.1 Possibilities of general molecular modeling (GMM), molecular
dynamics (MD), and quantum mechanics (QM) for in silico screening.
In this chapter, GMM is used to refer to a collection of technically similar
techniques to model protein structures. Examples of GMM are molecular
docking, 3D structure prediction, and computational protein design. These
techniques have the following in common: They save computation time by
using the molecular mechanics approximation, by modeling only the lowest-
energy positions of protein and ligand atoms, and by using implicit solvent
models, in which the (bulk) solvent is modeled as a continuum instead of
with explicit solvent molecules. In MD, the trajectory of atomic motion over
time is simulated. In quantum mechanics, the molecular interactions are
modeled, while explicitly considering their wave functions.

On one extreme, quantum mechanics (QM) methods can ac-


curately predict the energy of each conformation. However, due
to their large calculation costs only a limited number of different
conformations or enzyme variants can be addressed (Scheme 24.1).
At the other end of the scale are general molecular modeling (GMM)
techniques such as docking, 3D structure prediction of mutant
proteins, computational protein design, etc. GMM techniques allow
screening far larger numbers of different enzyme variants and
different conformations than QM and MD with the same amount of
computational resources.
With regard to energetic accuracy and sampling capacity, MD lies
between GMM and QM methods (Scheme 24.1). Like GMM methods,
MD uses the simplifying assumption that the atoms behave as a
classical mechanical system. This so-called molecular mechanics
approximation neglects the inherently QM character of molecular
interactions but speeds up the calculations. GMM methods use
more simplifying assumptions than MD to further speed up the
calculations. While in MD the water phase is normally modeled
as water molecules that surround the protein in a sphere, GMM
methods typically treat the water phase as a continuum without
explicit water molecules. GMM approaches also often use a fixed
protein backbone and computationally cheap but highly simplified
approximations to calculate energies, such as estimating the entropy
by counting the number of rotatable bonds [26]. Unlike GMM
and QM, MD approaches model the motions of atoms in time. This
way, MD simulations can provide a population of thermodynamically
realistic conformations of the enzyme, better than possible with
either GMM or QM approaches.
A lack of conformational sampling makes QM prone to generating
false-positive predictions of a high catalytic reaction rate. The reason
is that the input structures for QM originate from GMM techniques
such as docking or computational design. These techniques can
erroneously predict a near-ideal orientation of the substrate for
catalysis [27, 28]. In such cases, QM methods subsequently predicted
a low energy barrier because in the input structure the substrate
was bound in a reactive conformation. With the same input
structures, MD simulations with their superior sampling showed
that these reactive conformations were unrealistic; the structures
spontaneously relaxed to alternative conformations in which the
substrates had moved away from the catalytic groups [27, 28].
QM is superior at predicting catalytic reactivity only if
the substrate is realistically oriented. Because MD
is better than GMM or QM at predicting the relative orientation
of a substrate in an active site, it is becoming increasingly
common to use structures from an MD trajectory as input for
QM calculations [29]. With MD screening, the computational costs
per variant are also much lower than for highly accurate QM
calculations, which makes it feasible to first eliminate variants with
unsuitable conformations by MD before QM evaluation.
For in silico screening of enzyme variants, a structure with the
substrate (or intermediate) bound is normally first generated
by GMM methods. Also with rational design and bioinformatics,
mostly GMM techniques are employed to predict a 3D structure of
mutants (Scheme 24.2). These structures can be used for predicting
functionality, for example, by calculating the interaction energy
between intermediate and enzyme. Thus, GMM techniques can give
a preliminary prediction, while using few computational resources
[12, 30]. After using GMM to pick initially promising variants, the
more energetically accurate methods of MD or QM can be used
to reevaluate these variants. The typical goal for MD simulations
is to more accurately model the structure and dynamics of the
enzyme and its interactions with the substrate or intermediates
(Scheme 24.2).

Scheme 24.2 Workflow for improving enzymes using MD simulations.
Bioinformatics, rational design, and computational design can provide
mutations or sets of mutations to be introduced in an existing enzyme. The
predicted 3D structures are the starting points for MD simulations.

24.3 Improving Catalytic Activity by MD Screening

The basic method of screening for improved catalytic activity is
simulating the bound substrate, the transition state (TS), or a high-
energy intermediate (Fig. 24.1). In the case of substrate simulation,
one monitors whether catalytically productive conformations are
present. With transition-state and high-energy intermediate simu-
lations, the interaction energies with the enzyme are evaluated to
rank the different variants.

24.3.1 Transition-State Simulation


One approach to predict the relative catalytic proficiency of enzyme
variants through MD is by analyzing the interactions between
enzyme and TS (Fig. 24.1) [15]. If these interactions become
more favorable, this predicts that the mutation lowers the energy
barrier to reach the TS, which will make catalysis faster. Before
TS simulation was applied to enzymes, this technique was already
widely used for modeling the properties of small-molecule catalysts
[31]. A TS is inherently unstable because it is not located at a local
energy minimum (Fig. 24.1). Therefore, distance restraints between
the atoms whose bonds are broken or being formed have to be
implemented to enforce the otherwise unstable structure of the TS
during MD simulation [15].
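A distance restraint of this kind is often implemented as a harmonic penalty on the distance between two atoms. The following minimal Python sketch illustrates the idea. The harmonic functional form, the function name harmonic_restraint, and all numerical values are assumptions made for illustration; they are not taken from the cited TS simulations.

```python
import numpy as np


def harmonic_restraint(pos_i, pos_j, target, k):
    """Harmonic distance restraint E = 0.5 * k * (d - target)**2.

    pos_i, pos_j : Cartesian coordinates of the two restrained atoms
    target       : desired distance (e.g. a partially formed bond in the TS)
    k            : force constant (hypothetical value, energy / length**2)

    Returns the restraint energy and the forces on atoms i and j.
    """
    delta = np.asarray(pos_j, dtype=float) - np.asarray(pos_i, dtype=float)
    d = np.linalg.norm(delta)
    energy = 0.5 * k * (d - target) ** 2
    # dE/dd = k * (d - target); the force on j points along -dE/dd
    unit = delta / d
    force_j = -k * (d - target) * unit
    force_i = -force_j  # equal and opposite force on the partner atom
    return energy, force_i, force_j


# Atoms 3.0 length units apart, restrained towards 2.0 (illustrative numbers)
energy, f_i, f_j = harmonic_restraint([0, 0, 0], [3, 0, 0], target=2.0, k=10.0)
print(energy, f_i, f_j)
```

In this sketch the restraint pulls the two atoms together when they are further apart than the target distance and pushes them apart when they are closer, which holds the otherwise unstable TS geometry in place during the simulation.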

Figure 24.1 Different applications of MD simulations to predict catalytic
rate. Boxes show what the MD simulation reports in the case of superior
or inferior enzyme variants during substrate simulation, transition-state
simulation, and high-energy intermediate simulation. (A) Example of a
superior enzyme variant; (B) an inferior enzyme variant.

TS simulation was used to convert human butyrylcholinesterase
into an efficient cocaine hydrolase [12–17, 32]. The resulting
engineered enzyme, with six mutations (A199S/F227A/P285A/
S287G/A328W/Y332G) and a 4400-fold improved kcat/KM (Table
24.1), was fast enough to treat a cocaine overdose by rapidly clearing
cocaine from the bloodstream [12, 14]. The wild-type
butyrylcholinesterase is already responsible for the majority of human cocaine
metabolism but does this too slowly to prevent addiction and toxic
effects [15]. Kinetic and computational characterization of the wild-type
human butyrylcholinesterase indicated that the first reaction
step, in which the active site serine nucleophilically attacks the
substrate to form an unstable tetrahedral intermediate, was slowest
and thereby limited the turnover rate of the enzyme [15]. Therefore,
this first step was modeled to select improved variants instead
of one of the five faster steps from the enzyme’s catalytic cycle
(three other chemical steps plus substrate binding and product
dissociation).
To improve the catalytic activity for cocaine, small numbers
of mutants were experimentally screened after computational
preselection [12–17]. During the MD screening, the interaction
energies between the TS and the enzyme were calculated on the
basis of only the H bond distances between the TS structure and
the enzyme. These distances were recorded during thousands of
snapshots from a 2 ns MD simulation [15]. The interaction energies
were calculated using an empirical equation [12] that modeled the
H-bonding energies. For one of the final mutants, the change in
interaction energy calculated in this way for the introduced point mutations
(33 kJ/mol) was approximately equal to the effect on the TS energy
barrier (25 kJ/mol) as calculated with a high-level QM technique
(B3LYP/6-31G*) [13]. The other methodologies used to improve
cocaine turnover of the enzyme were rational engineering with
support of MD simulation, which resulted in a ninefold improvement
[17], and GMM followed up by QM/MM techniques, which gave a 2.5-fold
improvement in kcat/KM [12] over the previously best mutant
[14]. The three studies in which TS simulations were used contributed
in total to a 156-fold improvement in kcat/KM of the enzyme
[14, 15, 33].

24.3.2 High-Energy Intermediate Simulation


If there is an unstable intermediate in the reaction pathway (Fig.
24.1), then one can simulate this intermediate instead of the
TS itself. The underlying assumption is that if the intermediate
is high in energy, then its structure will resemble the TS, and
therefore interactions that stabilize this intermediate will also
stabilize the TS. The intermediate is in a local energy minimum,
and therefore distance restraints are not necessary during MD,
unlike in TS simulation. High-energy intermediate simulation is
often used to model the specificity of lipases and esterases via the
high-energy tetrahedral intermediate that they form during catalysis
[30, 34].
With this approach, good agreement between experimentally ob-
served and predicted enantioselectivity of the industrially relevant
Candida antarctica lipase B (CaLB) could be obtained [34]. These MD
simulations were only 300 ps long, and only residues in the range of
10 Å from the active site serine were allowed to move during the MD
simulation, which saved computation time. Good results could also
be obtained with GMM instead of MD. On the basis of docking and
energy minimization, it was possible to obtain a Candida antarctica
lipase variant with a fivefold faster rate for a bulky ester substrate
[30].

24.3.3 Substrate Simulation with Near-Attack Conformations


The most commonly used method for modeling catalytic properties
is to simulate the enzyme–substrate complex while monitoring how
the substrate is bound. A fast enzyme will typically bind its substrate
in a conformation that is similar to the TS (Fig. 24.1). Such preor-
ganization decreases the amount of thermal energy that is needed
to distort the conformation of the bound substrate to the higher-
energy conformation of the TS. Thus, the energy barrier to reach
the TS is likely to be lower with preorganization. MD simulations
can predict whether designed enzyme variants will indeed bind the
substrate selectively in such a preorganized conformation. This has
been done either by monitoring the occurrence of H bonds, between
the substrate and the active site, that stabilize the TS or by the near-
attack conformation (NAC) approach.
NACs are conformations of the enzyme–substrate complex that
geometrically resemble the TS. NACs are often defined as having the
reacting atoms of the enzyme and the substrate in close proximity
to each other (within a few angstroms). In a more formal way,
NACs can be defined as having distances between the reacting
atoms of less than van der Waals overlap and angles between
the reacting atoms within 15°–20° of those in the TS [35–38]. The
structure of the TS is modeled once with QM methods after which
the resulting geometric NAC definitions can be used to screen for
improved variants at a much lower CPU cost. Small differences
in NAC definition, for example, 0.5 Å in distances or 10° in angles
between the reacting atoms, have a negligible effect on the observed
ratios of NACs when comparing different variants or states [38].
The NAC approach allows ranking for the relative catalytic rate and
for modeling enantioselectivity or regioselectivity. For a variety of
enzymes good agreement with experimental results could thus be
obtained [35–37, 39–42].
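The geometric NAC criteria described above lend themselves to a simple per-snapshot test. The sketch below counts the fraction of MD snapshots that satisfy a distance cutoff and an angle window. The function name nac_fraction, the choice of three atoms (a nucleophile, the attacked atom, and a leaving group), and the cutoff values are hypothetical and only serve as an illustration; in practice the definitions would follow from the QM-modeled TS.

```python
import numpy as np


def nac_fraction(nuc, center, leaving, max_dist, ts_angle_deg, angle_tol_deg):
    """Fraction of snapshots that satisfy a geometric NAC definition.

    nuc, center, leaving : arrays of shape (n_frames, 3) with the coordinates
                           of the attacking atom, the attacked atom, and the
                           leaving group (hypothetical atom selection)
    max_dist             : maximum attacking-atom to attacked-atom distance
    ts_angle_deg         : nuc-center-leaving angle in the modeled TS
    angle_tol_deg        : allowed deviation from the TS angle
    """
    nuc, center, leaving = (np.asarray(a, dtype=float) for a in (nuc, center, leaving))
    v1 = nuc - center
    v2 = leaving - center
    dist = np.linalg.norm(v1, axis=1)
    cos_angle = np.sum(v1 * v2, axis=1) / (dist * np.linalg.norm(v2, axis=1))
    angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    is_nac = (dist <= max_dist) & (np.abs(angle - ts_angle_deg) <= angle_tol_deg)
    return float(is_nac.mean())


# Four illustrative snapshots with the attacked atom at the origin
center = np.zeros((4, 3))
leaving = np.tile([1.8, 0.0, 0.0], (4, 1))
nuc = np.array([[-3.0, 0.0, 0.0],   # close and in line
                [-4.0, 0.0, 0.0],   # in line but too far away
                [0.0, 3.0, 0.0],    # close but wrong angle
                [-3.0, 0.5, 0.0]])  # close, slightly off line
print(nac_fraction(nuc, center, leaving, max_dist=3.4,
                   ts_angle_deg=180.0, angle_tol_deg=20.0))
```

Variants, enantiomers, or binding modes can then be ranked by comparing the NAC fractions obtained from their respective trajectories.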
Scoring of NACs was applied to improve P450, an enzyme
in which the regioselectivity of substrate oxidation is largely
determined by which atoms of the substrate can come in close
proximity to the highly reactive oxygen atom that is bound
to the heme iron atom [18, 43, 44]. The criterion for a NAC
during the MD simulations was that the targeted carbon atom
had to be within 4 Å of the reactive heme oxygen. The efforts
resulted in a P450 variant that with 97% selectivity oxidized (4R)-
limonene to perillyl alcohol, a potential inhibitor of various tumor
types. The wild-type P450 did oxidize the substrate as well, but
it did not produce perillyl alcohol in detectable amounts. The
engineering approach was to select suitable positions to carry
out site saturation mutagenesis after studying the behavior of the
substrate in the active site during MD simulations. In a stepwise
approach, in which each time MD simulations were carried out
for the already experimentally characterized variant, three point
mutations (A264V/A328V/L437F) were introduced. The approach
required screening of only 29 variants after A328V had earlier been
obtained by directed evolution.
Furthermore, for a limonene epoxide hydrolase a NAC approach
made it possible to obtain highly enantiodivergent variants [19].
A total of 3015 computationally designed variants were screened
by NACs for improved or inverted enantioselectivity. An advantage
of computational design is that relatively many mutations can be
introduced in a single step [8, 9, 21, 45]. The most promising 37
variants, with up to 8 mutations, were selected for experimental
characterization. From those variants, a fivefold mutant (M32L/
L74I/I80V/L103F/F139L) produced (1R,2R)-cyclopentanediol
with 92% selectivity from the prochiral precursor cyclopentenoxide.
A variant with seven mutations (M32L/L35W/L74F/M78F/I80A/
I116V/F139L) produced (1S,2S)-cyclopentanediol with 95% selec-
tivity (this equals 90% enantiomeric excess). The in silico screening
took only five CPU hours per variant because conformational
sampling was enhanced by using many very short MD simulations
instead of a single long one [42] (vide infra).

24.3.4 Substrate Simulation with Monitoring of H Bonds


There are also several examples where improved variants could
be selected in silico by monitoring of H bonds between enzyme
and substrate during MD simulation. It was possible to increase
the amidase activity of a lipase sevenfold by selecting a point
mutation that introduced an H bond to the substrate [20]. Earlier QM
work had predicted that this H bond was required for stabilizing
the TS in amidases [46]. It was also possible to convert an
NADH-dependent dehydrogenase into a variant accepting NADPH
as reductant [22]. This was done by determining for each of the
10,000 snapshots obtained from a 10 ns MD simulation whether H
bonds were formed between the protein and the newly introduced
phosphate group. The mutant variants that had the highest number
of H bonds to the introduced phosphate group were characterized
experimentally.
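Counting H bonds per snapshot can be sketched as follows. The geometric criteria used here (donor–acceptor distance of at most 3.5 Å and a donor–H–acceptor angle of at least 120°) are common choices but are assumptions in this example, as are the function name hbonds_per_frame and the array layout; the cited studies may have used other definitions.

```python
import numpy as np


def hbonds_per_frame(donors, hydrogens, acceptors, max_da=3.5, min_angle=120.0):
    """Count H bonds in every snapshot for a fixed list of candidate pairs.

    donors, hydrogens, acceptors : arrays of shape (n_frames, n_pairs, 3)
    max_da    : maximum donor-acceptor distance (assumed cutoff, in angstrom)
    min_angle : minimum donor-H-acceptor angle (assumed cutoff, in degrees)

    Returns an integer array of shape (n_frames,).
    """
    d = np.asarray(donors, dtype=float)
    h = np.asarray(hydrogens, dtype=float)
    a = np.asarray(acceptors, dtype=float)
    da_dist = np.linalg.norm(a - d, axis=-1)
    hd = d - h  # vector from hydrogen to donor
    ha = a - h  # vector from hydrogen to acceptor
    cos_angle = np.sum(hd * ha, axis=-1) / (
        np.linalg.norm(hd, axis=-1) * np.linalg.norm(ha, axis=-1))
    angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return np.sum((da_dist <= max_da) & (angle >= min_angle), axis=-1)


# Two snapshots, one candidate pair: the H bond is present only in the first
donors = np.zeros((2, 1, 3))
hydrogens = np.array([[[1.0, 0.0, 0.0]], [[1.0, 0.0, 0.0]]])
acceptors = np.array([[[2.9, 0.0, 0.0]], [[5.0, 0.0, 0.0]]])
counts = hbonds_per_frame(donors, hydrogens, acceptors)
print(counts, counts.mean())
```

The average count over all snapshots gives a single number per variant, which can be used to rank mutants by how persistently they form the targeted H bonds.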
In a computational design project to generate enzymes that
could catalyze the Kemp elimination, variants were screened
by MD to predict whether the introduced catalytic groups and
substrate were suitably aligned for the reaction to occur [21]. This
was done by recording H bond distances and angles during the
MD simulation. The trajectories were also inspected to exclude
variants that lacked preorganization. The targeted substrate was 5-
nitrobenzisoxazole, which is so reactive that its degradation through
the Kemp elimination is catalyzed even by accidental catalysts such
as imidazole and bovine serum albumin [47]. It is noteworthy that
even for such an activated substrate a high degree of preorganization
of catalytic groups and substrate is required to obtain a highly
active enzyme [21]. Variants with such beneficial properties could
be selected by MD simulation [21].

24.4 Predicting and Improving Binding Affinity

If a substrate is poorly converted, this could be because the
enzyme binds it too weakly. Thus, methods to predict binding
affinity can have potential applications in improving substrate
range of enzymes. The MD methods that can predict binding
affinity with the highest accuracy are those that are rigorously
based on statistical mechanics theory. Examples of such methods
are free-energy perturbation, thermodynamic integration, and the
Bennett acceptance ratio method [48]. At the same time, these
methods are so computationally expensive that even for a single
protein variant it is often impossible to calculate the binding
affinity of a ligand in a reasonable amount of time. Instead, these
methods are typically employed to predict the difference in binding
affinity between two structurally similar ligands [49], which can
be calculated much faster than the binding affinity of an entire
ligand. There are also less CPU-expensive MD methods, where
theoretical justifications have largely been replaced by faster-to-
calculate approximations. Obtaining accurate predictions with these
methods remains challenging though [50–52].
For example, the commonly used molecular mechanics Poisson–Boltzmann
surface area (MM-PBSA) and linear interaction energy
(LIE) methods gave worse than random results when used to predict
binding affinity for trypsin ligands [50]. The same methods did give
significant correlations with experimental data for streptavidin and
HIV-1 reverse transcriptase ligands [53, 54]. For these proteins, the
reported average differences between experimental and predicted
binding energies were 7.1 and 2.4 kJ/mol, respectively. These
numbers would suggest that if a dissociation constant (KD) of 1
mM is predicted for these proteins, the 90% confidence interval
ranges are approximately 200 μM to 5 mM (in the case of a
standard deviation of 2.4 kJ/mol) or 9 μM to 110 mM (with a
standard deviation of 7.1 kJ/mol). These ranges can be calculated
from ΔGbinding = −RT ln KD, where R is the gas constant and T
is the absolute temperature, and standard statistics, in which the
90% confidence interval is a factor 1.64 wider than the standard
deviation.

GMM methods can predict ligand-binding affinity much faster
than MD, in seconds instead of hours or days, but their errors are
larger. For example, for the binding affinity prediction by AutoDock
Vina, the reported standard error is 12 kJ/mol [55]. For a predicted
KD of 1 mM, this corresponds to a 90% confidence range that
stretches from 0.4 μM to 2.7 M. This is a huge range; mutations that
improve binding affinity will likely have much less effect than the
error margin of this GMM method.
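The confidence ranges quoted above can be reproduced with a few lines of Python. Because an error in the binding free energy enters KD through an exponential, an uncertainty of ±1.64 standard deviations in energy becomes a multiplicative factor exp(1.64σ/RT) on KD. The sketch below assumes a temperature of 298.15 K, which is not stated in the text, so the results agree only approximately with the rounded values given above; the function name kd_confidence_range is likewise chosen for illustration.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def kd_confidence_range(kd_pred, sigma_kj_mol, temperature=298.15, z=1.64):
    """Return the (lower, upper) bounds of the confidence range on KD.

    kd_pred      : predicted dissociation constant (any concentration unit)
    sigma_kj_mol : standard deviation of the predicted binding energy
    temperature  : absolute temperature in K (assumed, not given in the text)
    z            : width of the interval in standard deviations (1.64 for 90%)
    """
    factor = math.exp(z * sigma_kj_mol / (R_KJ * temperature))
    return kd_pred / factor, kd_pred * factor


for sigma in (2.4, 7.1, 12.0):
    low, high = kd_confidence_range(1e-3, sigma)  # predicted KD of 1 mM, in M
    print(f"sigma = {sigma:4.1f} kJ/mol: {low:.2e} M to {high:.2e} M")
```

The exponential dependence explains why an error that looks modest on the energy scale translates into a KD range spanning several orders of magnitude.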
Instead of selecting enzyme variants by explicitly predicting their
KD , one can carry out in silico screening by monitoring interactions
that should improve binding affinities [56]. A successful example
of this approach might be the in silico selection of variants that
accepted NADPH by monitoring of H bonds to its phosphate group
in mutant dehydrogenases (Table 24.1) [22].
Finally, it may be that there is no need for improving binding
affinity. Because titration of the substrate itself is not feasible with
most enzymes, the KM is often used as a proxy for the KD . A high KM
is often interpreted as due to low affinity for the substrate. However,
the KM can be considerably smaller or larger than the KD for the
substrate because the KM depends both on the KD and on the speed
of the reaction steps that follow substrate binding [6]. A high KM is
thus not necessarily due to poor binding affinity.
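The difference between KM and KD can be made explicit with textbook steady-state kinetics. The rate constants below (k1 and k−1 for substrate binding and dissociation, k2 and k3 for subsequent steps) are symbols introduced here for illustration; they do not appear elsewhere in this chapter, and the second expression additionally assumes that substrate binding is in rapid equilibrium.

```latex
% Minimal mechanism: E + S <=> ES -> E + P
K_D = \frac{k_{-1}}{k_{1}}, \qquad
K_M = \frac{k_{-1} + k_{2}}{k_{1}} = K_D + \frac{k_{2}}{k_{1}} \;\ge\; K_D

% Mechanism with a covalent intermediate EI: E + S <=> ES -> EI -> E + P
% (rapid-equilibrium binding assumed)
K_M = K_D \, \frac{k_{3}}{k_{2} + k_{3}} \;\le\; K_D
```

In the first case a fast chemical step raises the KM above the KD, whereas in the second case a slow breakdown of the intermediate (small k3) lowers the KM below the KD.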

24.5 MD Screening to Improve Enzyme Stability

It is often desirable to make enzymes more stable. For practical
purposes, increasing enzyme stability can have the same effect as
increasing its turnover rate. If the enzyme survives longer under
the process conditions, less of the (relatively expensive) enzyme
has to be added. Also, some applications are only possible if the
enzyme can survive high temperatures, which are often used to
prevent microbial contamination in food processing. Additionally,
thermostable enzyme variants typically survive better in
(co-)solvents [25] and ionic liquids [57]. A further advantage of
thermostable enzymes is that their catalytic properties can more
easily be improved by mutagenesis than those of mesostable
enzymes. This is because the same mutations that increase the
catalytic rate often decrease stability [58, 59], until the enzyme can
no longer be expressed in soluble form. A solution for such stability
problems can be to engineer a more stable enzyme variant prior to
engineering catalytic activity [19, 24].
An important rule in thermostability engineering is that larger
proteins, like typical enzymes, tend to inactivate irreversibly after
partial unfolding of a structurally weaker region [60]. The initial
regional unfolding leads to irreversible inactivation by triggering
precipitation or by allowing irreversible chemical denaturation (Fig.
24.2b) [61, 62]. This situation differs from small proteins, which
typically unfold reversibly in a two-state equilibrium (Fig. 24.2a).
Another difference is that for such small proteins it is already
possible to engineer extremely thermostable variants [65, 66]. For
most enzymes, the stability increases that are reached through
protein engineering are much smaller, typically in the range of +2 °C
to +15 °C [67]. Enzymes are difficult to stabilize because some of
the stabilizing mutations will decrease catalytic activity and because
mutations further away from the weak spot will have minor effects
on thermostability [59]. The targeting of mutations at a critical
region can, therefore, save much effort [60, 68]. The unfolding-prone
regions are often relatively flexible as revealed by high B-factors in
an X-ray structure or by high fluctuations in atomic positions during
an MD simulation. While there is no thermodynamic necessity for
a more stable protein structure to be more rigid, it is commonly
observed that higher flexibility corresponds to structural weakness
[69, 70].

Figure 24.2 Reversible versus irreversible inactivation. (A) A reversible
two-state equilibrium between folded and unfolded proteins. This type
of inactivation is mostly observed for smaller enzymes. Examples include
RNAse (11 kDa) [63], lysozyme (19 kDa) [59], and mutants of Bacillus
subtilis lipase (19 kDa) [64]. (B) A partial unfolding of the protein triggers
irreversible inactivation. Examples of enzymes that irreversibly inactivate
are cocaine esterase (64 kDa) [23], haloalkane dehalogenase LinB (34 kDa)
[25], limonene epoxide hydrolase (33 kDa) [24], and wild-type Bacillus
subtilis lipase (19 kDa) [64].

High-temperature MD has been used both for locating the critical
region and for predicting the relative rate of unfolding. A higher tem-
perature during MD speeds up the rate of unfolding, which allows lo-
cating of the critical region by direct observation of its denaturation
[71, 72]. It was possible to select a highly stabilized cocaine esterase
variant (Table 24.1) by monitoring the rate of unfolding of several
variants during 3.5 ns MD simulations at 400 K [23].
Screening of mutants by MD at ambient temperature, to predict
changes in flexibility, has also been used successfully to select
thermostabilizing mutations [24, 25]. Using MD simulations, it was
possible to eliminate half the computationally designed variants
that would otherwise have to be screened experimentally. Control
experiments confirmed that variants that were predicted to be more
flexible were indeed less stable [24]. Relatively short MD simulations
were sufficient (five sets of 100 ps MD simulations per mutant were
used) [24, 25]. The overall protocol, which included a combination
of improved variants, gave unusually good stabilizations for two
different enzymes (Table 24.1); the increases in apparent melting
temperature (TM,app) were +23 °C and +35 °C.
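Flexibility in an MD trajectory is commonly quantified as the root-mean-square fluctuation (RMSF) of each atom around its average position. The sketch below computes the RMSF and flags positions at which a variant fluctuates more than a reference. It assumes that the snapshots have already been superimposed on a common reference structure; the function names and the margin parameter are hypothetical and do not represent the protocol of the cited studies.

```python
import numpy as np


def rmsf(trajectory):
    """Root-mean-square fluctuation per atom.

    trajectory : array of shape (n_frames, n_atoms, 3); the frames are assumed
                 to be superimposed on a common reference beforehand
    """
    traj = np.asarray(trajectory, dtype=float)
    mean_pos = traj.mean(axis=0)
    sq_dev = ((traj - mean_pos) ** 2).sum(axis=2)  # squared displacement per frame
    return np.sqrt(sq_dev.mean(axis=0))


def more_flexible_positions(rmsf_variant, rmsf_reference, margin=0.0):
    """Indices at which the variant fluctuates more than the reference."""
    diff = np.asarray(rmsf_variant) - np.asarray(rmsf_reference)
    return np.nonzero(diff > margin)[0]


# Two atoms, two snapshots: atom 0 is static, atom 1 moves by 2 length units
reference = np.zeros((2, 2, 3))
variant = np.array([[[0, 0, 0], [0, 0, 0]],
                    [[0, 0, 0], [2, 0, 0]]], dtype=float)
print(rmsf(variant), more_flexible_positions(rmsf(variant), rmsf(reference)))
```

A variant that shows increased fluctuations compared with its parent, particularly near the unfolding-prone region, would be a candidate for elimination before experimental screening.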

A recently developed method to find the weakest spot in a protein
is by using constraint network analysis [73, 74]. In such analysis,
the MD simulation is used to provide structural sampling of the
possible conformations of the protein. These input structures from
MD simulation are then used to find the critical unfolding regions
by removing in every snapshot all interactions (H bonds, van der
Waals contacts) one by one, starting at the weakest. The results
are averaged over all the snapshots. The results show that initially
a large network of interactions is present, which corresponds to
the folded protein. This network persists until the removal of
particular interactions eliminates the last contacts between distant
parts of the network, abruptly splitting it into smaller fragments.
This corresponds to unfolding. These last contacts are observed
to be located at experimentally discovered weak spots of enzymes
that were earlier mutated for thermostability engineering [73, 75].
Additionally, it was possible to use this method to get correlation to
experimental melting temperatures [74].
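The stepwise removal of interactions can be illustrated with a strongly simplified Python sketch for a single snapshot. Residues are treated as nodes and interactions as weighted edges; the edges are removed from weakest to strongest while the size of the largest connected cluster is tracked. Plain graph connectivity is used here as a stand-in for the network analysis of the cited method, and the function names, the edge format, and the example strengths are all assumptions made for illustration.

```python
def largest_cluster(n_nodes, edges):
    """Size of the largest connected cluster in an undirected graph."""
    neighbors = {i: set() for i in range(n_nodes)}
    for a, b, _strength in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, best = set(), 0
    for start in range(n_nodes):
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:  # depth-first traversal of one cluster
            node = stack.pop()
            size += 1
            for nb in neighbors[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best


def critical_interaction(n_nodes, edges):
    """Remove interactions weakest first; return the one whose removal
    causes the largest drop in the size of the largest cluster.

    edges : list of (i, j, strength); a larger strength is a stronger contact
    """
    remaining = sorted(edges, key=lambda e: e[2])  # weakest first
    size_before = largest_cluster(n_nodes, remaining)
    worst_edge, worst_drop = None, 0
    while remaining:
        removed = remaining.pop(0)
        size_after = largest_cluster(n_nodes, remaining)
        if size_before - size_after > worst_drop:
            worst_edge, worst_drop = removed, size_before - size_after
        size_before = size_after
    return worst_edge, worst_drop


# Two tightly connected triads (0-2 and 3-5) held together by one contact (2, 3)
contacts = [(0, 1, 5.0), (1, 2, 5.0), (0, 2, 1.0),
            (3, 4, 5.0), (4, 5, 5.0), (3, 5, 1.5), (2, 3, 2.0)]
print(critical_interaction(6, contacts))
```

In an actual analysis this procedure would be repeated for every MD snapshot and the results averaged, as described above.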

24.6 Improving Correlation between MD and Experiment

The two main factors that determine how well MD simulations agree
with the experimentally observed enzyme behavior are the accuracy
of the force field and the completeness of conformational sampling
of the enzyme. The sources of errors in MD simulations have been
reviewed exhaustively [48]. Despite the many potential problems, it
is often possible to get good enough computational predictions to
obtain improved variants (Table 24.1).

24.6.1 Force Field Inaccuracies


Inaccuracies in the force field are a relatively well-known concern
for MD simulations. More recent force fields are better at repro-
ducing experimental thermodynamic data and 3D structures [76–
78] than older force fields. The formulation of the force fields did
not change; only the parameters were optimized (i.e., the force
constants for all possible atom combinations and the partial charges
on the atoms). Older force fields were so inaccurate that they
caused proteins to slowly unfold during longer MD simulations.
In contrast, with newer force fields it is sometimes possible to
correctly fold proteins while starting with an unfolded structure
[79]. However, also with the more recent force fields a long (>100
ns) MD simulation can result in partial unfolding of the protein
[77].
A known source of error is that in most force fields the partial
charges on atoms, which determine to which extent they repel or
attract other atoms through electrostatics, are constant [48]. In
reality, these charges change if the molecule that they belong to
is polarized by the vicinity of partial charges on nearby atoms. An
MD simulation is therefore prone to give inaccurate results with
regard to 3D structure and energetics if polarization is important in
the interactions between substrate and active site. Force fields that
explicitly model the polarization are in development but as yet lack
the thorough parameter fine-tuning reached in other force fields and
are therefore not yet as accurate [48, 80].
While the parameters for standard protein residues have been
thoroughly optimized [76–78], for enzyme substrates such well-
tested parameterizations are rarely available. The assignment of
parameters, both for force constants and for partial charges on
each atom, can be done automatically, via QM calculations [81, 82],
or manually. In the latter case, the parameters are modified until
the simulation results are in agreement with experimental data. A
common control experiment for the quality of the parameters is
to verify that the experimental density of the substrate (if known)
is reproduced when doing an MD simulation with a cell filled
completely with substrate molecules [83].
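The density check amounts to a unit conversion from the number of molecules and the volume of the simulation cell. A minimal sketch is given below; the function name density_g_per_cm3 and the choice of nm³ as volume unit are assumptions, and for a simulation at constant pressure the volume would typically be averaged over the trajectory.

```python
AVOGADRO = 6.02214076e23  # molecules per mol


def density_g_per_cm3(n_molecules, molar_mass_g_mol, box_volume_nm3):
    """Density of a simulation cell filled with one type of molecule.

    n_molecules      : number of molecules in the cell
    molar_mass_g_mol : molar mass of the molecule in g/mol
    box_volume_nm3   : (average) volume of the simulation cell in nm^3
    """
    mass_g = n_molecules * molar_mass_g_mol / AVOGADRO
    volume_cm3 = box_volume_nm3 * 1e-21  # 1 nm^3 = 1e-21 cm^3
    return mass_g / volume_cm3


# Illustrative check with water: 1000 molecules in a 30 nm^3 cell
print(density_g_per_cm3(1000, 18.015, 30.0))
```

A clear deviation of the simulated density from the experimental value indicates that the nonbonded parameters of the substrate need further adjustment.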

24.6.2 Sampling Concerns


Often the main concern is the lack of completeness of conforma-
tional sampling by the MD simulation. It was found that even at
the end of an extremely long (1 ms) MD simulation a protein
continued to sample new conformations, without showing signs of
convergence [84]. The reason for the inefficient sampling during
an MD simulation is that the conformational landscape of proteins
resembles that of a mountain range. At the start of an MD simulation,
the protein will land in one of many local energy minima in
the landscape. Once it has ended up in a particular local energy
minimum, the MD trajectory only rarely goes far enough
uphill in energy to enter another local minimum [42, 85–88].
An efficient method to improve conformational sampling is to
do multiple MD simulations that start with different initial atom
velocities or slightly different starting structures. The differences
in starting conditions cause the simulations to sample multiple
local minima [42, 85–88]. A recent study on dehalogenases [42]
found that more conformational space was sampled through 20 very
short MD simulations (of only 10 ps) than through one long MD
simulation (of 22 ns). Even such short MD simulations sampled
different binding modes of the substrate. A NAC analysis on these
trajectories showed that the averaged results converged at 20
trajectories and could then accurately reproduce the experimentally
determined enantioselectivities of four dehalogenases for 45 differ-
ent substrates. The short MD simulations allow for high-throughput
screening of enzyme variants, as only 2 CPU hours per enzyme–
substrate combination were required. The same protocol was later
successfully used to in silico select highly enantioselective epoxide
hydrolases (Table 24.1) [19].
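The two ingredients of such a multiple-trajectory protocol can be sketched in Python: drawing independent initial velocities for every replica and checking whether the averaged result has converged as more trajectories are added. The sketch assumes that velocities are drawn from a Maxwell–Boltzmann distribution with a replica-specific random seed, which is how MD packages commonly initialize velocities; the function names and the example values are hypothetical.

```python
import numpy as np

K_B = 1.380649e-23        # Boltzmann constant in J/K
AMU = 1.66053906660e-27   # atomic mass unit in kg


def initial_velocities(masses_amu, temperature, seed):
    """Draw Maxwell-Boltzmann velocities (m/s) for one replica.

    Each Cartesian component is normally distributed with a standard
    deviation of sqrt(kB * T / m); a different seed gives a different replica.
    """
    rng = np.random.default_rng(seed)
    masses_kg = np.asarray(masses_amu, dtype=float) * AMU
    sigma = np.sqrt(K_B * temperature / masses_kg)
    return rng.normal(size=(masses_kg.size, 3)) * sigma[:, None]


def running_mean(values):
    """Average over the first 1, 2, ..., n replicas, to judge convergence."""
    values = np.asarray(values, dtype=float)
    return np.cumsum(values) / np.arange(1, values.size + 1)


# Independent starting velocities for three replicas of a small carbon system
masses = [12.011] * 5
replicas = [initial_velocities(masses, temperature=300.0, seed=s) for s in range(3)]

# Hypothetical NAC fractions from successive replicas and their running average
print(running_mean([0.10, 0.30, 0.20, 0.20]))
```

When the running average no longer changes appreciably upon adding further replicas, the number of trajectories can be considered sufficient for the property that is monitored.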
There are many other protocols to increase conformational
sampling [48]. For example, in metadynamics the substrate is
actively pushed away from conformations that have already been
sampled sufficiently [89]. It is as yet unclear how the various
enhanced sampling protocols compare to each other with regard
to the risks of generating unrealistic conformations and how fast
conformational sampling converges.

24.6.3 Other Concerns


A further concern can be whether the starting structure of the
MD simulations is in agreement with the enzyme in solution. The
protonation states of enzymes and substrates should reflect those
under experimental conditions [90]. Also other features in the X-
ray structure may have to be manually corrected as described
in detail by Lonsdale et al. [91]. Furthermore, a model obtained
through homology modeling rather than by experiment may differ too
much from the true structure for MD simulations to be meaningful.
If started from an unrealistic structure, the MD simulations are
expected to explore conformations that are not representative of the
protein in solution until the correct fold is found.
It is essential that the right catalytic step in the enzyme
mechanism is targeted during in silico screening; otherwise, a step
that does not limit the overall rate may be monitored. For example, an approach in
which one attempts to find faster enzyme variants by screening for
more NACs would be inapt if product release is rate limiting for the
enzyme.
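
A small numerical illustration of this point, as a sketch under stated assumptions: for two sequential irreversible steps at saturating substrate (chemistry followed by product release), the turnover number is k_cat = k_chem k_release / (k_chem + k_release). The rate constants below are hypothetical and chosen only to show the effect.

```python
def k_cat(k_chem, k_release):
    """Turnover number for two sequential irreversible steps at saturating substrate."""
    return k_chem * k_release / (k_chem + k_release)


# Hypothetical rate constants (s^-1): product release is 100x slower than chemistry.
wild_type = k_cat(k_chem=1000.0, k_release=10.0)
variant = k_cat(k_chem=10000.0, k_release=10.0)  # chemical step made 10x faster

print(f"wild type: {wild_type:.2f} s^-1, variant: {variant:.2f} s^-1")
print(f"relative gain: {variant / wild_type:.3f}")
```

With these numbers, a tenfold faster chemical step changes the turnover number by less than 1%, which is why screening for more NACs would not help in such a case.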

24.7 Outlook and Further Possibilities

Most examples of MD simulation to obtain improved enzyme
variants still require a subsequent experimental screening of a panel
of remaining variants. This illustrates the limitations in accuracy
that can as yet be obtained. Improvements in prediction accuracy
could come from better force fields, which continue to be developed,
and from techniques to better sample conformational space.
Computational screening is also becoming increasingly attractive
owing to the exponential increase in commonly available computational
resources. A practical advantage of in silico screening is that it
can eliminate most of the possible variants prior to experimental
characterization. This markedly decreases the amount of labor and
materials required to obtain improved enzyme variants.
A methodological advantage of in silico screening is that it can be
used to experimentally evaluate hypotheses on enzyme function. For
example, explanations for the low amidase activity of CaLB, the low
cocaine esterase activity of human butyrylcholinesterase, and the
low catalytic activity of a Kemp eliminase could be experimentally
confirmed by selecting appropriate variants, which indeed had
improved catalytic activity (Table 24.1). This contrasts with the
more common practice of using computational methods to provide
explanations for the high catalytic activity of existing enzymes,
which often results in conflicting theories [36, 39, 92, 93] that cannot
be evaluated easily. Using the theoretical explanation to
subsequently predict improved variants provides an opportunity to
reject or validate the hypothesis experimentally.

Acknowledgments

This work was supported by the European Union's 7th framework
project Kyrobio (KBBE-2011-5, 289646) and by the Dutch Ministry
of Economic Affairs through the BE-Basic organization.

References

1. Bar-Even, A., Noor, E., Savir, Y., Liebermeister, W., Davidi, D., Tawfik, D. S.,
and Milo, R. (2011). The moderately efficient enzyme: evolutionary and
physicochemical trends shaping enzyme parameters, Biochemistry, 50,
pp. 4402–4410.
2. Bornscheuer, U. T., Huisman, G. W., Kazlauskas, R. J., Lutz, S., Moore, J.
C., and Robins, K. (2012). Engineering the third wave of biocatalysis,
Nature, 485, pp. 185–194.
3. Jochens, H., Hesseler, M., Stiba, K., Padhi, S. K., Kazlauskas, R. J., and
Bornscheuer, U. T. (2011). Protein engineering of alpha/beta-hydrolase
fold enzymes, ChemBioChem, 12, pp. 1508–1517.
4. Goldsmith, M., and Tawfik, D. S. (2012). Directed enzyme evolution:
beyond the low-hanging fruit, Curr. Opin. Struct. Biol., 22, pp. 406–412.
5. Chaparro-Riggers, J. F., Polizzi, K. M., and Bommarius, A. S. (2007). Better
library design: data-driven protein engineering, Biotechnol. J., 2, pp.
180–191.
6. Fersht, A. (1997). Structure and Mechanism in Protein Science: A Guide to
Enzyme Catalysis and Protein Folding (W.H. Freeman and company, New
York).
7. Kong, X. D., Yuan, S., Li, L., Chen, S., Xu, J. H., and Zhou, J. (2014).
Engineering of an epoxide hydrolase for efficient bioresolution of bulky
pharmaco substrates, Proc. Natl. Acad. Sci. U S A, 111, pp. 15717–15722.
8. Wijma, H. J., and Janssen, D. B. (2013). Computational design gains
momentum in enzyme catalysis engineering, FEBS J., 280, pp. 2948–
2960.
9. Kiss, G., Celebi-Oelcuem, N., Moretti, R., Baker, D., and Houk, K. N. (2013).
Computational enzyme design, Angew. Chem., Int. Ed., 52, pp. 5700–5725.
10. Richter, F., Leaver-Fay, A., Khare, S. D., Bjelic, S., and Baker, D. (2011). De
novo enzyme design using Rosetta3, PLOS ONE, 6, p. e19230.
11. Jimenez-Oses, G., Osuna, S., Gao, X., Sawaya, M. R., Gilson, L., Collier, S.
J., Huisman, G. W., Yeates, T. O., Tang, Y., and Houk, K. N. (2014). The
role of distant mutations and allosteric regulation on LovD active site
dynamics, Nat. Chem. Biol., 10, pp. 431–436.
12. Zheng, F., Xue, L., Hou, S., Liu, J., Zhan, M., Yang, W., and Zhan, C.
G. (2014). A highly efficient cocaine-detoxifying enzyme obtained by
computational design, Nat. Commun., 5, p. 3457.

13. Zheng, F., Yang, W., Ko, M. C., Liu, J., Cho, H., Gao, D., Tong, M., Tai, H. H.,
Woods, J. H., and Zhan, C. G. (2008). Most efficient cocaine hydrolase
designed by virtual screening of transition states, J. Am. Chem. Soc., 130,
pp. 12148–12155.
14. Xue, L., Ko, M. C., Tong, M., Yang, W., Hou, S., Fang, L., Liu, J.,
Zheng, F., Woods, J. H., Tai, H. H., and Zhan, C. G. (2011). Design,
preparation, and characterization of high-activity mutants of human
butyrylcholinesterase specific for detoxification of cocaine, Mol. Phar-
macol., 79, pp. 290–297.
15. Pan, Y., Gao, D., Yang, W., Cho, H., Yang, G., Tai, H. H., and Zhan, C. G. (2005).
Computational redesign of human butyrylcholinesterase for anticocaine
medication, Proc. Natl. Acad. Sci. U S A, 102, pp. 16656–16661.
16. Gao, D., Cho, H., Yang, W., Pan, Y., Yang, G., Tai, H., and Zhan, C. (2006).
Computational design of a human butyrylcholinesterase mutant for
accelerating cocaine hydrolysis based on the transition-state simulation,
Angew. Chem., Int. Ed., 45, pp. 653–657.
17. Sun, H., Pang, Y., Lockridge, O., and Brimijoin, S. (2002). Re-engineering
butyrylcholinesterase as a cocaine hydrolase, Mol. Pharmacol., 62, pp.
220–224.
18. Seifert, A., Antonovici, M., Hauer, B., and Pleiss, J. (2011). An efficient
route to selective bio-oxidation catalysts: an iterative approach com-
prising modeling, diversification, and screening, based on CYP102A1,
ChemBioChem, 12, pp. 1346–1351.
19. Wijma, H. J., Floor, R. J., Bjelic, S., Marrink, S. J., Baker, D., and Janssen,
D. B. (2015). Enantioselective enzymes by computational design and in
silico screening, Angew. Chem., Int. Ed., 127, pp. 3797–3801.
20. Syren, P. O., Hendil-Forssell, P., Aumailley, L., Besenmatter, W., Gou-
nine, F., Svendsen, A., Martinelle, M., and Hult, K. (2012). Esterases
with an introduced amidase-like hydrogen bond in the transition
state have increased amidase specificity, ChemBioChem, 13, pp. 645–
648.
21. Privett, H. K., Kiss, G., Lee, T. M., Blomberg, R., Chica, R. A., Thomas, L.
M., Hilvert, D., Houk, K. N., and Mayo, S. L. (2012). Iterative approach to
computational enzyme design, Proc. Natl. Acad. Sci. U S A, 109, pp. 3790–
3795.
22. Cui, D., Zhang, L., Yao, Z., Liu, X., Lin, J., Yuan, Y. A., and Wei, D.
(2013). Computational design of short-chain dehydrogenase Gox2181
for altered coenzyme specificity, J. Biotechnol., 167, pp. 386–392.
23. Huang, X., Gao, D., and Zhan, C. G. (2011). Computational design of
a thermostable mutant of cocaine esterase via molecular dynamics
simulations, Org. Biomol. Chem., 9, pp. 4138–4143.
24. Wijma, H. J., Floor, R. J., Jekel, P. A., Baker, D., Marrink, S. J., and Janssen,
D. B. (2014). Computationally designed libraries for rapid enzyme
stabilization, Protein Eng. Des. Sel., 27, pp. 49–58.
25. Floor, R. J., Wijma, H. J., Colpa, D. I., Ramos-Silva, A., Jekel, P. A.,
Szymanski, W., Feringa, B. L., Marrink, S. J., and Janssen, D. B. (2014).
Computational library design for increasing haloalkane dehalogenase
stability, ChemBioChem, 15, pp. 1659–1671.
26. Huey, R., Morris, G. M., Olson, A. J., and Goodsell, D. S. (2007). A
semiempirical free energy force field with charge-based desolvation, J.
Comput. Chem., 28, pp. 1145–1152.
27. Kiss, G., Rothlisberger, D., Baker, D., and Houk, K. N. (2010). Evaluation and
ranking of enzyme designs, Protein Sci., 19, pp. 1760–1773.
28. Mladenovic, M., Ansorg, K., Fink, R. F., Thiel, W., Schirmeister, T., and
Engels, B. (2008). Atomistic insights into the inhibition of cysteine
proteases: first QM/MM calculations clarifying the stereoselectivity of
epoxide-based inhibitors, J. Phys. Chem. B, 112, pp. 11798–11808.
29. Roiban, G. D., Agudo, R., Ilie, A., Lonsdale, R., and Reetz, M. T. (2014).
CH-activating oxidative hydroxylation of 1-tetralones and related
compounds with high regio- and stereoselectivity, Chem. Commun.
(Camb), 50, pp. 14310–14313.
30. Juhl, P. B., Doderer, K., Hollmann, F., Thum, O., and Pleiss, J. (2010).
Engineering of Candida antarctica lipase B for hydrolysis of bulky
carboxylic acid esters, J. Biotechnol., 150, pp. 474–480.
31. Eksterowicz, J. E., and Houk, K. N. (1993). Transition-state modeling
with empirical force-fields, Chem. Rev., 93, pp. 2439–2461.
32. Pancook, J., Pecht, G., Ader, M., Mosko, M., Lockridge, O., and Watkins,
J. (2003). Application of directed evolution technology to optimize the
cocaine hydrolase activity of human butyrylcholinesterase, FASEB J., 17,
pp. A565–A565.
33. Hamza, A., Cho, H., Tai, H. H., and Zhan, C. G. (2005). Molecular dynamics
simulation of cocaine binding with human butyrylcholinesterase and its
mutants, J. Phys. Chem. B, 109, pp. 4776–4782.
34. Braiuca, P., Lorena, K., Ferrario, V., Ebert, C., and Gardossi, L. (2009).
A three-dimensional quantitative structure-activity relationship (3D-
QSAR) model for predicting the enantioselectivity of Candida antarctica
lipase B, Adv. Synth. Catalys., 351, pp. 1293–1302.

35. Bruice, T. C. (2006). Computational approaches: reaction trajectories,
structures, and atomic motions. Enzyme reactions and proficiency,
Chem. Rev., 106, pp. 3119–3139.
36. Bruice, T. C. (2002). A view at the millennium: the efficiency of enzymatic
catalysis, Acc. Chem. Res., 35, pp. 139–148.
37. Bruice, T. C., and Benkovic, S. J. (2000). Chemical basis for enzyme
catalysis, Biochemistry, 39, pp. 6267–6274.
38. Hur, S., Kahn, K., and Bruice, T. C. (2003). Comparison of formation of
reactive conformers for the SN2 displacements by CH3CO2- in water
and by Asp124-CO2- in a haloalkane dehalogenase, Proc. Natl. Acad. Sci.
U S A, 100, pp. 2215–2219.
39. Warshel, A., Sharma, P. K., Kato, M., Xiang, Y., Liu, H., and Olsson, M. H.
M. (2006). Electrostatic basis for enzyme catalysis, Chem. Rev., 106, pp.
3210–3235.
40. Zheng, H., and Reetz, M. T. (2010). Manipulating the stereoselectivity
of limonene epoxide hydrolase by directed evolution based on iterative
saturation mutagenesis, J. Am. Chem. Soc., 132, pp. 15744–15751.
41. Westerbeek, A., Szymanski, W., Wijma, H. J., Marrink, S. J., Feringa, B.
L., and Janssen, D. B. (2011). Kinetic resolution of alpha-bromoamides:
experimental and theoretical investigation of highly enantioselective
reactions catalyzed by haloalkane dehalogenases, Adv. Synth. Catalys.,
353, pp. 931–944.
42. Wijma, H. J., Marrink, S. J., and Janssen, D. B. (2014). Computationally
efficient and accurate enantioselectivity modeling by clusters of
molecular dynamics simulations, J. Chem. Inf. Model., 54, pp. 2079–
2092.
43. Keizers, P. H., de Graaf, C., de Kanter, F. J., Oostenbrink, C., Feenstra, K.
A., Commandeur, J. N., and Vermeulen, N. P. (2005). Metabolic regio- and
stereoselectivity of cytochrome P450 2D6 towards 3,4-methylenedioxy-
N-alkylamphetamines: in silico predictions and experimental validation,
J. Med. Chem., 48, pp. 6117–6127.
44. Stjernschantz, E., van Vugt-Lussenburg, B. M., Bonifacio, A., de Beer,
S. B., van der Zwan, G., Gooijer, C., Commandeur, J. N., Vermeulen, N.
P., and Oostenbrink, C. (2008). Structural rationalization of novel drug
metabolizing mutants of cytochrome P450 BM3, Proteins, 71, pp. 336–
352.
45. Bjelic, S., Nivon, L. G., Celebi-Oelcuem, N., Kiss, G., Rosewall, C. F., Lovick,
H. M., Ingalls, E. L., Gallaher, J. L., Seetharaman, J., Lew, S., Montelione,
G. T., Hunt, J. F., Michael, F. E., Houk, K. N., and Baker, D. (2013).
Computational design of enone-binding proteins with catalytic activity
for the Morita-Baylis-Hillman reaction, ACS Chem. Biol., 8, pp. 749–757.
46. Syren, P. O., and Hult, K. (2011). Amidases have a hydrogen bond that
facilitates nitrogen inversion, but esterases have not, ChemCatChem, 3,
pp. 853–860.
47. Hollfelder, F., Kirby, A. J., and Tawfik, D. S. (1996). Off-the-shelf proteins
that rival tailor-made antibodies as catalysts, Nature, 383, pp. 60–62.
48. Van Gunsteren, W. F., Bakowies, D., Baron, R., Chandrasekhar, I., Christen,
M., Daura, X., Gee, P., Geerke, D. P., Glattli, A., Hunenberger, P. H.,
Kastenholz, M. A., Oostenbrink, C., Schenk, M., Trzesniak, D., van der
Vegt, N. F., and Yu, H. B. (2006). Biomolecular modeling: goals, problems,
perspectives, Angew. Chem., Int. Ed., 45, pp. 4064–4092.
49. Genheden, S., and Ryde, U. (2012). Improving the efficiency of protein-
ligand binding free-energy calculations by system truncation, J. Chem.
Theory Comput., 8, pp. 1449–1458.
50. Mikulskis, P., Genheden, S., Rydberg, P., Sandberg, L., Olsen, L., and Ryde,
U. (2012). Binding affinities in the SAMPL3 trypsin and host-guest blind
tests estimated with the MM/PBSA and LIE methods, J. Comput. Aided
Mol. Des., 26, pp. 527–541.
51. Singh, N., and Warshel, A. (2010). Absolute binding free energy
calculations: on the accuracy of computational scoring of protein-ligand
interactions, Proteins, 78, pp. 1705–1723.
52. Feldmeier, K., and Hocker, B. (2013). Computational protein design of
ligand binding and catalysis, Curr. Opin. Chem. Biol., 17, pp. 929–933.
53. Nervall, M., Hanspers, P., Carlsson, J., Boukharta, L., and Aqvist, J. (2008).
Predicting binding modes from free energy calculations, J. Med. Chem.,
51, pp. 2657–2667.
54. Kuhn, B., and Kollman, P. A. (2000). Binding of a diverse set of
ligands to avidin and streptavidin: an accurate quantitative prediction
of their relative affinities by a combination of molecular mechanics and
continuum solvent models, J. Med. Chem., 43, pp. 3786–3791.
55. Trott, O., and Olson, A. J. (2010). AutoDock Vina: improving the
speed and accuracy of docking with a new scoring function, efficient
optimization, and multithreading, J. Comput. Chem., 31, pp. 455–461.
56. Bissantz, C., Kuhn, B., and Stahl, M. (2010). A medicinal chemist’s guide
to molecular interactions, J. Med. Chem., 53, pp. 5061–5084.
57. Ferdjani, S., Ionita, M., Roy, B., Dion, M., Djeghaba, Z., Rabiller, C., and
Tellier, C. (2011). Correlation between thermostability and stability of
glycosidases in ionic liquid, Biotechnol. Lett., 33, pp. 1215–1219.

58. Tokuriki, N., and Tawfik, D. S. (2009). Stability effects of mutations and
protein evolvability, Curr. Opin. Struct. Biol., 19, pp. 596–604.
59. Shoichet, B. K., Baase, W. A., Kuroki, R., and Matthews, B. W. (1995). A
relationship between protein stability and protein function, Proc. Natl.
Acad. Sci. U S A, 92, pp. 452–456.
60. Eijsink, V., Bjork, A., Gaseidnes, S., Sirevag, R., Synstad, B., Van den Burg,
B., and Vriend, G. (2004). Rational engineering of enzyme stability, J.
Biotechnol., 113, pp. 105–120.
61. Bommarius, A. S., and Paye, M. F. (2013). Stabilizing biocatalysts, Chem.
Soc. Rev., 42, pp. 6534–6565.
62. Polizzi, K. M., Bommarius, A. S., Broering, J. M., and Chaparro-Riggers, J.
F. (2007). Stability of biocatalysts, Curr. Opin. Chem. Biol., 11, pp. 220–
225.
63. Trevino, S. R., Gokulan, K., Newsom, S., Thurlkill, R. L., Shaw, K. L.,
Mitkevich, V. A., Makarov, A. A., Sacchettini, J. C., Scholtz, J. M., and Pace, C.
N. (2005). Asp79 makes a large, unfavorable contribution to the stability
of RNase Sa, J. Mol. Biol., 354, pp. 967–978.
64. Augustyniak, W., Brzezinska, A. A., Pijning, T., Wienk, H., Boelens, R.,
Dijkstra, B. W., and Reetz, M. T. (2012). Biophysical characterization of
mutants of Bacillus subtilis lipase evolved for thermostability: factors
contributing to increased activity retention, Protein Sci., 21, pp. 487–
497.
65. Malakauskas, S., and Mayo, S. (1998). Design, structure and stability
of a hyperthermophilic protein variant, Nat. Struct. Biol., 5, pp. 470–
475.
66. Huang, P. S., Oberdorfer, G., Xu, C., Pei, X. Y., Nannenga, B. L., Rogers, J. M.,
DiMaio, F., Gonen, T., Luisi, B., and Baker, D. (2014). High thermodynamic
stability of parametrically designed helical bundles, Science, 346, pp.
481–485.
67. Wijma, H. J., Floor, R. J., and Janssen, D. B. (2013). Structure-
and sequence-analysis inspired engineering of proteins for enhanced
thermostability, Curr. Opin. Struct. Biol., 23, pp. 588–594.
68. Eijsink, V., Gaseidnes, S., Borchert, T., and Van den Burg, B. (2005).
Directed evolution of enzyme stability, Biomol. Eng., 22, pp. 21–30.
69. Vihinen, M. (1987). Relationship of protein flexibility to thermostability,
Protein Eng., 1, pp. 477–480.
70. Kamerzell, T. J., and Middaugh, C. R. (2008). The complex inter-
relationships between protein flexibility and stability, J. Pharm. Sci., 97,
pp. 3494–3517.
71. Zhang, S., Wang, Y., Song, X., Hong, J., Zhang, Y., and Yao, L. (2014).
Improving Trichoderma reesei Cel7B thermostability by targeting the
weak spots, J. Chem. Inf. Model., 54, pp. 2826–2833.
72. Park, H. J., Park, K., Kim, Y. H., and Yoo, Y. J. (2014). Computational
approach for designing thermostable Candida antarctica lipase B by
molecular dynamics simulation, J. Biotechnol., 192, pp. 66–70.
73. Rathi, P. C., Radestock, S., and Gohlke, H. (2012). Thermostabilizing
mutations preferentially occur at structural weak spots with a high
mutation ratio, J. Biotechnol., 159, pp. 135–144.
74. Radestock, S., and Gohlke, H. (2011). Protein rigidity and thermophilic
adaptation, Protein Struct. Funct. Bioinform., 79, pp. 1089–1108.
75. Pfleger, C., Rathi, P. C., Klein, D. L., Radestock, S., and Gohlke, H. (2013).
Constraint Network Analysis (CNA): a Python software package for
efficiently linking biomacromolecular structure, flexibility, (thermo-
)stability, and function, J. Chem. Inf. Model., 53, pp. 1007–1015.
76. Beauchamp, K. A., Lin, Y. S., Das, R., and Pande, V. S. (2012). Are protein
force fields getting better? A systematic benchmark on 524 diverse NMR
measurements, J. Chem. Theory Comput., 8, pp. 1409–1414.
77. Lange, O. F., van der Spoel, D., and de Groot, B. L. (2010). Scrutinizing
molecular mechanics force fields on the submicrosecond timescale with
NMR data, Biophys. J., 99, pp. 647–655.
78. Lindorff-Larsen, K., Maragakis, P., Piana, S., Eastwood, M. P., Dror, R. O.,
and Shaw, D. E. (2012). Systematic validation of protein force fields
against experimental data, PLOS ONE, 7, p. e32131.
79. Lindorff-Larsen, K., Piana, S., Palmo, K., Maragakis, P., Klepeis, J. L., Dror,
R. O., and Shaw, D. E. (2010). Improved side-chain torsion potentials for
the Amber ff99SB protein force field, Proteins, 78, pp. 1950–1958.
80. Ponder, J. W., Wu, C., Ren, P., Pande, V. S., Chodera, J. D., Schnieders, M. J.,
Haque, I., Mobley, D. L., Lambrecht, D. S., DiStasio, R. A., Jr, Head-Gordon,
M., Clark, G. N., Johnson, M. E., and Head-Gordon, T. (2010). Current
status of the AMOEBA polarizable force field, J. Phys. Chem. B, 114, pp.
2549–2564.
81. Jakalian, A., Jack, D. B., and Bayly, C. I. (2002). Fast, efficient generation of
high-quality atomic charges. AM1-BCC model: II. Parameterization and
validation, J. Comput. Chem., 23, pp. 1623–1641.
82. Wang, J., Wolf, R. M., Caldwell, J. W., Kollman, P. A., and Case, D. A. (2004).
Development and testing of a general amber force field, J. Comput. Chem.,
25, pp. 1157–1174.

83. Lousa, D., Baptista, A. M., and Soares, C. M. (2013). A molecular
perspective on nonaqueous biocatalysis: contributions from simulation
studies, Phys. Chem. Chem. Phys., 15, pp. 13723–13736.
84. Genheden, S., and Ryde, U. (2012). Will molecular dynamics simulations
of proteins ever reach equilibrium?, Phys. Chem. Chem. Phys., 14, pp.
8662–8677.
85. Genheden, S., and Ryde, U. (2011). A comparison of different initializa-
tion protocols to obtain statistically independent molecular dynamics
simulations, J. Comput. Chem., 32, pp. 187–195.
86. Genheden, S., and Ryde, U. (2010). How to obtain statistically converged
MM/GBSA results, J. Comput. Chem., 31, pp. 837–846.
87. Caves, L. S. D., Evanseck, J. D., and Karplus, M. (1998). Locally accessible
conformations of proteins: multiple molecular dynamics simulations of
crambin, Protein Sci., 7, pp. 649–666.
88. Genheden, S., Diehl, C., Akke, M., and Ryde, U. (2010). Starting-condition
dependence of order parameters derived from Molecular Dynamics
simulations, J. Chem. Theory Comput., 6, pp. 2176–2190.
89. Sirin, S., Kumar, R., Martinez, C., Karmilowicz, M. J., Ghosh, P., Abramov,
Y. A., Martin, V., and Sherman, W. (2014). A Computational approach
to enzyme design: predicting omega-aminotransferase catalytic activity
using docking and MM-GBSA scoring, J. Chem. Inf. Model., 54, pp. 2334–
2346.
90. Krieger, E., Nielsen, J. E., Spronk, C. A., and Vriend, G. (2006). Fast
empirical pKa prediction by Ewald summation, J. Mol. Graph. Model., 25,
pp. 481–486.
91. Lonsdale, R., Harvey, J. N., and Mulholland, A. J. (2012). A practical guide
to modelling enzyme-catalysed reactions, Chem. Soc. Rev., 41, pp. 3025–
3038.
92. Schwartz, S. D., and Schramm, V. L. (2009). Enzymatic transition states
and dynamic motion in barrier crossing, Nat. Chem. Biol., 5, pp. 551–558.
93. Glowacki, D. R., Harvey, J. N., and Mulholland, A. J. (2012). Taking
Ockham’s razor to enzyme dynamics and catalysis, Nat. Chem., 4, pp.
169–176.

March 23, 2016 13:17 PSP Book - 9in x 6in 25-Allan-Svendsen-c25

Chapter 25

Kinetic Stability of Variant Enzymes

Jose M. Sanchez-Ruiz
Facultad de Ciencias, Departamento de Quimica Fisica, Universidad de Granada,
Granada 18071, Spain
sanchezr@ugr.es

Understanding Enzymes: Function, Design, Engineering, and Analysis
Edited by Allan Svendsen
Copyright © 2016 Pan Stanford Publishing Pte. Ltd.
ISBN 978-981-4669-32-0 (Hardcover), 978-981-4669-33-7 (eBook)
www.panstanford.com

25.1 Kinetics vs. Thermodynamics in Protein Stability

Stability is a crucial feature to consider when developing technological
and biomedical applications of proteins [1–3]. Moreover,
high stability is a likely contributor to evolvability [4], and highly
stable proteins therefore provide a convenient starting point for the
design and engineering of new functions [5]. Understandably, much
effort has been devoted over many years to dissecting the factors
that determine protein stability and to developing methods to predict
stability-enhancing protein alterations (typically, although not
exclusively, mutations). What is sometimes not fully appreciated,
however (or, at least, often goes unsaid), is that
protein stability is not a simple property that can be quantified by
a single number. Rather, the concept of protein stability
embraces a variety of metrics that describe the persistence of the
functional protein state under different environmental conditions
and timescales. For instance, in the academic literature the term
“protein stability” has sometimes been equated to the unfolding free
energy at a given temperature (typically the standard temperature
of 25 °C or the physiological temperature of 37 °C). Certainly, a
detailed description of the thermodynamic stability of a protein is
afforded by the profile of unfolding free energy versus temperature
[6–8]. This so-called protein stability curve reveals the existence
of both heat- and cold-denaturation processes (the stability curve
crosses zero at two temperatures) and a free-energy optimum
at a temperature (the entropy inversion temperature) between
the heat- and cold-denaturation temperatures. Highly useful and
informative though it is, the thermodynamic description of protein
stability pertains to a (usually two-state) equilibrium unfolding
process (typically achieved for small model proteins under well-defined,
simple laboratory conditions), and, as elaborated below, its
applicability to real-life biotechnological settings is not guaranteed.
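
The shape of such a stability curve can be sketched numerically. The example below assumes a two-state protein with a temperature-independent unfolding heat capacity change, for which the unfolding free energy takes the form ΔG(T) = ΔH_m (1 − T/T_m) + ΔC_p [(T − T_m) − T ln(T/T_m)]. The parameter values are hypothetical and are not taken from the chapter; the helper names are likewise illustrative.

```python
import math

# Hypothetical parameters for a small two-state protein (assumptions).
T_M = 333.15   # heat-denaturation temperature, K
DH_M = 400.0   # unfolding enthalpy at T_M, kJ/mol
DCP = 8.0      # unfolding heat capacity change, kJ/(mol K), taken as constant


def delta_g(t):
    """Unfolding free energy (kJ/mol) at temperature t (K), constant-DCP form."""
    return DH_M * (1.0 - t / T_M) + DCP * ((t - T_M) - t * math.log(t / T_M))


def max_stability_temperature():
    """Temperature at which the unfolding entropy is zero and delta_g is maximal."""
    return T_M * math.exp(-DH_M / (T_M * DCP))


def bisect_root(f, lo, hi, tol=1e-6):
    """Root of f between lo and hi, assuming f(lo) and f(hi) differ in sign."""
    f_lo_positive = f(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


t_s = max_stability_temperature()
t_cold = bisect_root(delta_g, 200.0, t_s)  # cold-denaturation temperature
print(f"cold denaturation: {t_cold:.1f} K, maximum stability: {t_s:.1f} K, "
      f"heat denaturation: {T_M:.1f} K")
```

The curve crosses zero at the cold- and heat-denaturation temperatures and peaks in between, at the temperature where the unfolding entropy vanishes.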
Consider a technological application that requires an enzyme
to be functional at a comparatively high temperature. One might
perhaps naively expect that the operational temperature range
for the application is determined, among other factors, by the
thermodynamic stability curve of the protein. That is, one might
expect that the operational range reaches up to about the equi-
librium heat-denaturation temperature (the high temperature at
which the unfolding equilibrium constant becomes unity). A little
reflection, however, will convince us that this is not necessarily
the case. Even if the native structure is itself safe, the unfolded
and partially unfolded states of many proteins (in particular, the
complex protein systems of biotechnological interest) are prone
to undergo irreversible alteration processes [8–10], such as aggre-
gation, proteolysis (if there are proteases in the milieu), cofactor
loss, undesirable interactions with other molecular components in
the system, etc. Consider now a temperature clearly below the
denaturation equilibrium temperature of the protein (specified by
the stability curve). At that temperature the unfolding equilibrium
constant is much smaller than unity and only a small amount of
unfolded state is present in equilibrium with the native functional
state. Let’s assume for the sake of illustration that the unfolding
equilibrium constant is 0.01 at the temperature under consideration
or, equivalently, that at equilibrium we have a 1/100 unfolded-to-native-state
ratio. In principle, therefore, most of the protein
exists as native functional state at that temperature. This situation,
however, will not last forever if the unfolded state is prone, for
instance, to aggregation. In that case, aggregation of the small
amount of unfolded state initially present will perturb the unfolded–
native ratio away from its equilibrium value and will trigger a
shift in the native–unfolded equilibrium so that the equilibrium
1/100 ratio is recovered. However, the newly formed amount of
unfolded state will also aggregate, leading to a new shift in the
unfolding equilibrium. It is easy to see [8, 11–13] that, as this
process continues over time, the native, functional state will be
depleted through the aggregation of the unfolded state and that
eventually all the protein will end up as nonfunctional aggregates.
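
The depletion argument above can be made quantitative with a minimal sketch. It assumes the simplest scheme, N ⇌ U → F, with the native–unfolded equilibrium (constant K) established much faster than the irreversible step (rate constant k). Under that assumption the protein that has not yet aggregated decays with an apparent first-order rate constant k_app = k K / (1 + K). The value K = 0.01 is the one used in the text; the aggregation rate constant is a hypothetical number.

```python
import math


def apparent_rate(k_agg, k_unf_eq):
    """Apparent first-order rate of native-state loss for N <=> U -> F (fast N <=> U)."""
    return k_agg * k_unf_eq / (1.0 + k_unf_eq)


def fraction_remaining(t, k_agg, k_unf_eq):
    """Fraction of protein not yet aggregated (N plus U) after time t."""
    return math.exp(-apparent_rate(k_agg, k_unf_eq) * t)


K_UNF = 0.01   # unfolding equilibrium constant used in the text
K_AGG = 1.0    # hypothetical aggregation rate constant of the unfolded state, h^-1

half_life = math.log(2.0) / apparent_rate(K_AGG, K_UNF)
print(f"half-life of functional protein: {half_life:.1f} h")
for hours in (0, 24, 168, 720):
    print(hours, round(fraction_remaining(hours, K_AGG, K_UNF), 3))
```

Even though only about 1% of the protein is unfolded at any instant, the functional protein is lost on a timescale set by k_app, which may be short compared with the duration of the application.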
Clearly, the protein stability curve (the plot of unfolding free
energy versus temperature) does not necessarily provide a reliable
indication of the operational temperature range for an enzyme-
based technological application that involves a definite timescale.
Actually, the protein biotechnologist in charge of developing the
application is unlikely to use the thermodynamic stability curve
(which, in any case, may not be straightforward to determine for the
complex proteins typically used in biotechnological applications)
as a guide. Rather, he or she could, for instance, perform thermal
inactivation experiments (leading to profiles of activity versus time)
at several temperatures to determine the feasible temperature range
for the application and the most suitable protein variants.
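
As an illustration of how such activity-versus-time profiles might be analyzed, the sketch below fits a first-order decay to each profile by linear least squares on ln(activity) and reports the inactivation rate constant and half-life. First-order inactivation is an assumption that real data may not obey, and the residual activities are made-up numbers.

```python
import math


def first_order_fit(times, activities):
    """Rate constant k from a least-squares line through ln(activity) versus time."""
    logs = [math.log(a) for a in activities]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return -sxy / sxx


def half_life(k):
    """Half-life of a first-order process with rate constant k."""
    return math.log(2.0) / k


# Hypothetical residual activities (fraction of initial) at two temperatures.
profiles = {
    50.0: ([0, 10, 20, 40, 60], [1.00, 0.95, 0.90, 0.82, 0.74]),
    60.0: ([0, 10, 20, 40, 60], [1.00, 0.70, 0.50, 0.25, 0.12]),
}
for temp_c, (minutes, activity) in profiles.items():
    k = first_order_fit(minutes, activity)
    print(f"{temp_c} C: k = {k:.4f} min^-1, half-life = {half_life(k):.1f} min")
```

Comparing half-lives across temperatures and variants in this way gives a direct, if simplified, measure of kinetic stability on the timescale of interest.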
Another particularly clear example is provided by the devel-
opment of stable formulations for therapeutic proteins [14]. Here,
the issue is not primarily one of thermodynamic stability, of
course. There can be little doubt that thermodynamic stability (a
positive value of the unfolding free energy) is satisfied by the
formulations proposed to achieve an adequate shelf life for a protein
under storage conditions (typically low temperature). However,
despite thermodynamic stability, several degradation processes
(aggregation, chemical alterations such as deamidation, polypeptide
cleavage, oxidation, surface denaturation, etc.) may occur upon
years of storage at low temperature. Formulations are designed
and developed to prevent or limit these irreversible degradation
processes as much as possible [14].
The above discussions illustrate a rather general principle,
namely that, if irreversible alterations may occur, thermodynamic
stability (as specified by a positive value for the unfolding free
energy) does not guarantee that the protein will remain in the native,
functional state over a given timescale [8, 12, 13, 15]. Most often,
these alterations will occur from the unfolded or partially unfolded
states (the so-called Lumry–Eyring scenario [9, 10, 13]), although it
cannot, of course, be ruled out that the folded native state may also
undergo irreversible alteration processes, in particular over very
long timescales [14]. It also emerges from the above discussions
that many different metrics of protein stability naturally arise,
including the unfolding free-energy value at a given temperature, the
equilibrium denaturation temperature (the temperature at which
the unfolding equilibrium constant becomes unity), the rate (or
half-life) of irreversible denaturation at a given temperature, the
temperature at which irreversible denaturation occurs significantly
on the timescale of the suggested application, the serum half-life
for a therapeutic protein, the amount of denatured protein after
a certain time (years?) under storage conditions, the amount of
denatured protein in accelerated stability studies commonly used
to explore suitable storage conditions, etc. Some of these metrics
refer to a native–unfolded equilibrium and describe thermodynamic
stability. Others are related to the rate of irreversible denaturation
or the timescale over which irreversible denaturation takes place to
a significant extent and reflect kinetic stability [15].

25.2 Mutation Effects on Kinetic Stability: A Description Based on the Transition State for Irreversible Denaturation

The general principles behind the rational enhancement of protein
thermodynamic stability are well understood (the devil is, of course,
in the details). One just needs to find mutations that increase the
free-energy difference between the native and the unfolded states,
and since these states are quite different from a structural point

www.ebook3000.com
March 23, 2016 13:17 PSP Book - 9in x 6in 25-Allan-Svendsen-c25

Mutation Effects on Kinetic Stability 839

of view, several obvious possibilities come immediately to mind:


Introduction of disulfide bridges should enhance thermodynamic
stability [16, 17] through their effect on the conformational entropy
of the unfolded state (provided, of course, that the engineered
disulfide bridges do not significantly strain or distort the native
structure); charge deletion, charge introduction, and charge reversal
mutations can be chosen in such a way that the charge–charge
interactions on the native proteins’ surface are overall stabilizing
[18–20]; and hydrophobic packing in the native structure can be
optimized for stability through suitable mutations [21]. These and
other rational approaches to thermodynamic stability enhancement
have been explored with a level of success that appears to be
reasonable, at least to the extent that success in protein engineering
can be ascertained from the published literature (since negative
results are not commonly published). Also, several algorithms to
predict mutation effects on protein stability have been developed
and shown to reproduce at least the general experimental trends
[22].
Our acceptable capability to rationally design protein thermo-
dynamic stability stems from the availability of suitable structural
descriptions of the two main states involved. The native structure
is usually known from X-ray crystallography or nuclear magnetic
resonance (NMR) studies, and from the point of view of stability
prediction and design, the unfolded state may often be viewed
as an ensemble of more or less disordered chains with a high
degree of exposure to the solvent [23], although, certainly, residual
structure in the unfolded state may have a significant impact
on thermodynamic stability, as several experimental studies have
shown [24–27]. Likewise, the rational design of protein kinetic
stability should likely be based upon appropriate descriptions of
the protein states relevant for the rate of irreversible denaturation.
Many conformational processes involving proteins, in particular
those related to fast folding, are best described in terms of
the movement of a point representing conformation on the so-
called energy landscape, that is, the hypersurface of internal free
energy versus conformational degrees of freedom [28–30]. However,
irreversible denaturation processes are typically comparatively slow
and the usual physical chemist's description of kinetics in terms of
transition-state theory should plausibly apply. Therefore, the simple
mental construction we will use as a working hypothesis involves
a plot of free energy versus a single reaction coordinate and an
expression of the rate constant for irreversible denaturation in
terms of a free-energy barrier:
$$k = k_0 \exp\left(-\frac{\Delta G^{\ddagger}}{RT}\right) \tag{25.1}$$
where k0 is the front factor, R is the gas constant, T is the absolute
temperature, and ΔG‡ is the activation free energy, that is, the free
energy of the kinetically relevant transition state (the hypothetical
protein state at the top of the rate-determining barrier) with
reference to that of the native state.
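A short numerical sketch may help to fix the scale of Eq. 25.1. The chapter does not specify the front factor k0, so the code below assumes the Eyring value k_B·T/h purely for illustration. The function names are hypothetical, and the barrier heights in the demo loop are arbitrary example values rather than measured data.

```python
import math

R = 8.314462618e-3   # gas constant, kJ/(mol*K)
K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J*s


def rate_constant(dg_act_kj, temp_k, k0=None):
    """Eq. 25.1: k = k0 * exp(-dG_act / (R*T)), in 1/s."""
    if k0 is None:
        # Assumption: Eyring front factor k_B*T/h (the text leaves k0 open).
        k0 = K_B * temp_k / H
    return k0 * math.exp(-dg_act_kj / (R * temp_k))


def half_life(k):
    """Half-life of a first-order process with rate constant k."""
    return math.log(2) / k


def fold_change(ddg_act_kj, temp_k):
    """Factor by which k drops when the barrier rises by ddg_act_kj.

    The front factor cancels, so this ratio does not depend on k0.
    """
    return math.exp(ddg_act_kj / (R * temp_k))


if __name__ == "__main__":
    temp = 310.15  # 37 degrees Celsius, in kelvin
    for barrier in (100.0, 105.0, 110.0):  # arbitrary example barriers, kJ/mol
        k = rate_constant(barrier, temp)
        print(f"barrier {barrier:6.1f} kJ/mol -> half-life {half_life(k):.3g} s")
```

Under these assumptions, raising the barrier by 5 kJ/mol at 37°C lengthens the half-life roughly sevenfold, whatever value is taken for k0. Only the absolute half-lives depend on the assumed front factor.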
Clearly, in the admittedly very simple approach proposed above,
the two states relevant for kinetic stabilization are the native state
and the transition state. The latter, however, belongs to the top of
a free-energy barrier and its structural description is elusive, since
its population will always remain extremely low. Note that, for a
transition state to be significantly populated, the free-energy barrier
would have to be on the order of the thermal energy (RT, about
a few kilojoules per mole), a situation that may occur in cases of
ultrafast protein folding [30–33] but that is highly unlikely to apply
to the kinetics of irreversible protein denaturation. An experimental
approach to the structural modeling of the transition state relevant
for kinetic stabilization must then be based upon the effects
of mutations on the rate of irreversible denaturation, in a manner
akin to the traditional φ-value approach to the structure of the
transition state for protein folding [34, 35]. For instance, if, for the
sake of illustration, we assume that the transition-state structure
can be simply described as partially unfolded, then mutations on
the protein region that becomes unstructured in the transition
state will affect the denaturation rate, while mutations on the
protein region that remains native-like in the transition state will
not. We see, therefore, that there is synergistic interplay between
the structural description of the transition state and the effect of
mutations on kinetic stability. Thus, the experimental study of a
certain number of such mutation effects may lead to a proposed
model for the transition state, which, in turn, could allow us to focus
additional mutations to the enhancement of kinetic stability and to
the refinement of the structural model for the transition state.
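The mutational reasoning above can be sketched as a simple classification. The ratio used below (mutational change in the denaturation barrier divided by the mutational change in unfolding free energy) is a φ-like quantity defined here for illustration. The class name, the cut-off values, and the minimum-effect filter are hypothetical choices, not part of any published protocol.

```python
from dataclasses import dataclass


@dataclass
class Mutation:
    name: str
    ddg_unfold: float   # mutational change in unfolding free energy, kJ/mol
    ddg_barrier: float  # mutational change in activation free energy, kJ/mol


def phi_like(m, min_effect=2.0):
    """Ratio ddG_barrier / ddG_unfold, or None if the equilibrium effect
    is too small for the ratio to be meaningful (hypothetical cut-off)."""
    if abs(m.ddg_unfold) < min_effect:
        return None
    return m.ddg_barrier / m.ddg_unfold


def classify(m, low=0.3, high=0.7):
    """Rough structural reading of the ratio (thresholds are assumptions)."""
    ratio = phi_like(m)
    if ratio is None:
        return "undetermined"
    if ratio >= high:
        return "site unfolded in transition state"
    if ratio <= low:
        return "site native-like in transition state"
    return "site partially disrupted in transition state"


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    for mut in (Mutation("A", -4.0, -3.8), Mutation("B", -4.0, -0.2)):
        print(mut.name, classify(mut))
```

A ratio near one means the mutation changes the barrier as much as it changes the unfolding free energy, as expected if the mutated site is already unstructured in the transition state. A ratio near zero points to a site that is still native-like at the top of the barrier.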
This chapter is sharply focused on the review of several published
studies into the kinetic stability of protein variants that provide a
first glimpse of the structural features of the kinetically relevant
transition states for irreversible protein denaturation and that may
eventually contribute to the development of rational approaches to
kinetic stability enhancement. For a more general exploration of
protein kinetic stability and its implications, see the review paper
recently published by the author [15].

25.3 Kinetic Stability Linked to the Breakup of
Interactions in the Transition State: Pro-dependent
Proteases

In the preceding sections we have discussed protein kinetic stability
mainly in terms of Lumry–Eyring scenarios in which the native
protein is assumed to denature in a given timescale, despite being
thermodynamically stable with respect to the unfolded and partially
unfolded states, because these states can undergo irreversible alteration
processes. There is, however, a well-studied class of proteins in
which stability of the functional form is based solely on kinetics. Pro-
dependent proteases, exemplified by the paradigmatic example
of the α-lytic protease, are synthesized as proenzymes containing a
pro-region that is required for folding and thermodynamic stability.
However, cleavage of the pro-region leads to an active α-lytic
protease that has been shown to be thermodynamically unstable
with respect to the unfolded state [36]. The stability of the cleaved,
active form relies exclusively on a very high free-energy barrier
separating both states in such a way that the half-life of the native
active form may even be on the order of a thousand years.
One obvious question is what type of transition state (the
structure at the top of the barrier) is related to such a huge free-
energy barrier. The native state of pro-dependent proteases displays
a double β-barrel structure with two defined domains connected by
a β-hairpin (the domain bridge). Remarkably, there is an excellent
correlation between unfolding rates (or barrier heights) for several
pro-dependent proteases and the amino acid length of the domain
bridge (or the surface area buried by the domain bridge in the
native structure) [37]. This correlation and other pieces of evidence
[38] suggest a transition state in which the domain bridge is
partially unfolded, with the concomitant breakup of interactions
between the N-terminal and C-terminal domains contributing to
the high free-energy barrier. It is relevant (see Section 25.7)
that high-temperature molecular dynamics simulations support the
main features of this transition-state model, including the loss of
interactions along the domain bridge and the break of contacts
between the two domains [37, 39]. Certainly, the suggested cartoon-
like model of the transition state cannot include all the important
atomic-level details, and in this regard, it is of interest that the role of
the distortion of the side chain of Phe228 (observed in an ultrahigh-
resolution structure of the native α-lytic protease) has been recently
explored [40]. Nevertheless, even such a simple transition-state
model may provide adequate guidelines for the design of enzyme
variants with an enhanced kinetic stability. For instance, variants of
the α-lytic protease with decreased rate of denaturation have been
obtained by replacing the domain bridge of the α-lytic protease with
that of a thermophilic homolog [37]. Likewise, an increase in the
acid resistance of the α-lytic protease has been achieved through
replacement of an interdomain salt bridge with that corresponding
to an acid-resistant homolog [38].

25.4 Kinetic Stability Linked to Substantially Unfolded
Transition States: Thioredoxin and Phytase Enzymes

A most convenient scenario from the point of view of achieving and
enhancing kinetic stability would involve a substantially unfolded
transition state for the rate-determining step in such a way that
all interactions that contribute to thermodynamic stability (i.e., to
the positive value of the unfolding free energy) will also enhance
the free-energy barrier for denaturation. Recent studies on enzyme
variants support that thioredoxins and phytases may conform to this
picture.


Figure 25.1 Effects of 27 conservative mutations on the thermodynamic
stability of Escherichia coli thioredoxin. (Left) Color code for the different
mutations and locations on the thioredoxin structure of the mutated
residues. (Right) Mutation effects on the unfolding free-energy change for
thioredoxin (as determined from differential scanning calorimetry data).
Reprinted from Ref. [43], Copyright (2006), with permission from Elsevier.

Evidence for kinetic stabilization of thioredoxin enzymes was
originally derived from an evolutionary analysis on the effect of
27 conservative mutations on stability [41–43]. The 27 mutations
selected involved the exchange of very similar amino acid residues
(I⇔V and E⇔D) and had comparatively small effects on thermody-
namic stability (Fig. 25.1).
Accordingly, these mutations could naively be expected to be neutral
or quasi-neutral from the point of view of protein evolution.
Surprisingly, however, their effect on thioredoxin thermodynamic
stability (admittedly small in most instances) showed a good
correlation with a metric of the amino acid frequencies in
sequence alignments (Fig. 25.2), indicating that during evolution the
mutations became accepted with probabilities related to their effect
on stability. The simplest model that can explain such results invokes
a minimum threshold for stability in such a way that mutations that
bring stability below the threshold are rejected during evolution. It
follows from this model that destabilizing mutations will be rarely
accepted (only when previous mutations have sufficiently increased
the stability so that threshold violation does not occur when
the destabilizing mutation occurs). This simple model can thus
explain the observed correlation between stability and sequence
statistics but, as a simple calculation shows [41–43], only if the
threshold is assumed to be only a few kilojoules per mole below the
thermodynamic stability of the wild-type (WT) thioredoxin.

Figure 25.2 Correlation between the effects of conservative mutations (Fig.
25.1) on thioredoxin stability and metrics of the frequency of occurrence
in sequence alignments of the amino acids involved. Mutational effects on
both the unfolding free energy (ΔΔG values) and the equilibrium unfolding
temperature (ΔTm) are given. The numbers alongside the data points refer
to the position mutated, and the numeric intervals shown in the different
plots refer to the range of sequence identity used in the calculations
of residue frequency. (Left) Mutations involving aspartic and glutamic
acid residues. Reprinted from Ref. [41], Copyright (2004), with permission from
Elsevier. (Right) Mutations involving valine and isoleucine residues (for
comparison, mutations involving carboxylic acid residues are also included
as nonnumbered open circles). Reprinted from Ref. [42], Copyright (2005),
with permission from Elsevier.

This calculated stability threshold would seem to make little
sense in terms of thermodynamic stability, however, because it
would translate into a decrease of a few degrees in equilibrium
denaturation temperature, which is already close to 90°C for
WT thioredoxin (i.e., much higher than physiological temperature).
On the other hand, a very close threshold does make sense in terms
of kinetic stability because (i) in a harsh environment (likely the
environment surrounding the protein in vivo) in which irreversible
alterations of the unfolded state may readily take place, the rate
of irreversible denaturation may be determined by the free-energy
barrier for unfolding [12], a scenario that can be experimentally
verified in vitro by using proteases to simulate a harsh environment
[44], and (ii) the half-life time associated with the crossing of such a
barrier (proportional to 1/k in Eq. 25.1) changes in an exponential manner with
the barrier height and even a decrease of a few kilojoules per
mole in height may cause a large decrease in half-life that can
affect organism fitness and be subject to purifying natural selection
(Fig. 25.3). This interpretation of the stability/sequence-statistics
correlation in terms of kinetic stability was actually supported by
exhaustive unfolding kinetics studies on all the 27 variants that
demonstrated a good correlation between the mutation effects on
the free-energy barrier and the frequency of occurrence of residues
in sequence alignments [43] (Fig. 25.3).
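The threshold argument made earlier in this section can be illustrated with a small Monte Carlo sketch. It is not the calculation of Refs. [41–43]. It assumes a single stability value measured relative to the threshold, mutational effects drawn from a normal distribution that is mostly destabilizing, and acceptance of any mutation that keeps stability at or above the threshold. All parameter values and names are hypothetical.

```python
import random


def simulate(n_steps=10000, start_margin=3.0, mean=-1.0, sd=2.0, seed=0):
    """Return acceptance fractions for stabilizing and destabilizing mutations.

    Stability is tracked in kJ/mol above the threshold (threshold = 0).
    A proposed mutation is accepted only if it does not push stability
    below the threshold.
    """
    rng = random.Random(seed)
    stability = start_margin
    counts = {"stabilizing": [0, 0], "destabilizing": [0, 0]}  # [proposed, accepted]
    for _ in range(n_steps):
        ddg = rng.gauss(mean, sd)  # mutational effect on stability, kJ/mol
        kind = "stabilizing" if ddg >= 0 else "destabilizing"
        counts[kind][0] += 1
        if stability + ddg >= 0.0:
            stability += ddg
            counts[kind][1] += 1
    return {k: acc / prop for k, (prop, acc) in counts.items() if prop}


if __name__ == "__main__":
    for kind, fraction in simulate().items():
        print(f"{kind}: accepted fraction = {fraction:.2f}")
```

In this toy model stabilizing mutations are always accepted, while destabilizing ones pass only when earlier substitutions have built up enough margin. This is the qualitative behavior invoked in the text, and it depends on the margin between the wild-type stability and the threshold being small.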
The overall picture that emerges from these studies on thiore-
doxin variants certainly not only involves natural selection for
kinetic stability but also implies a substantially unfolded transition
state. This picture does provide a molecular-level explanation for
kinetic stability in this system, as many stabilizing interactions in the
native state are broken up in the transition state, thus contributing to
a high free-energy barrier. Also, a substantially unfolded transition
state is consistent with the correlation between mutation effects
on thermodynamic and kinetic stabilities found in this system
(Fig. 25.3), since many mutations are expected to occur in regions
of the structure that become unfolded in the transition state.
Finally, a substantially unfolded transition state greatly facilitates
the rational design of kinetic stability given that modeled effects on
thermodynamic stability (unfolding free-energy difference between
the native and unfolded states) will be transferred to the free-energy
barrier that determines the rate of denaturation. As an example,
charge reversal mutations were introduced in consensus-stabilized
thioredoxin scaffolds to yield variants with both thermodynamic and
kinetic stabilities highly tunable by salt [45].

Figure 25.3 Comparing mutational effects on thioredoxin thermodynamic
and kinetic stabilities. (Left) Dependence of the equilibrium unfolding
temperature and the unfolding rate constant on the corresponding
free-energy changes (unfolding: ΔG; activation barrier: ΔG(N→‡)). The
right axis on the lower plot indicates the rate constant levels corresponding
to representative half-lives. Note that a free-energy change (kJ/mol) alters
the equilibrium unfolding temperature by just a few degrees, while it can
have a substantial effect on the half-life time for irreversible denaturation.
(Right) Correlation between mutational effects on the unfolding free energy
and the activation free-energy barrier (from the analysis of equilibrium
and kinetic data for urea-induced unfolding) for the 27 mutations of Fig.
25.1. The excellent correlation with slope close to unity supports that the
mutation sites are in regions that are substantially unfolded in the transition
state. Reprinted from Ref. [43], Copyright (2006), with permission from
Elsevier.

Recent work has addressed the stabilization of phytase from
Citrobacter braakii (a biotechnologically important enzyme) on
the basis of disulfide bridge engineering [46]. High-temperature
molecular dynamics simulations were used to determine regions
showing large structural displacements at 500 K when compared
with the starting structure. Design of the disulfide bridges was
focused on these regions on the basis of the following assump-
tions: (i) the targeted regions could plausibly be unfolded or
disrupted in the transition state for irreversible denaturation,
and therefore, disulfide bridges introduced in those regions may
enhance the corresponding free-energy barrier, and (ii) the targeted
regions could plausibly be flexible in the native structure, and
therefore, introduction of disulfide bridges in those regions is

www.ebook3000.com
March 23, 2016 13:17 PSP Book - 9in x 6in 25-Allan-Svendsen-c25

Kinetic Stability Linked to Substantially Unfolded Transition States 847

Figure 25.4 Differential scanning calorimetry of the thermal denaturation


of variants of the phytase from Citrobacter braakii. (a) Calorimetric
profiles for the WT enzyme and variants with one (S1, S2, S3), two
(D1, D2), and three (T) engineered disulfide bridges. These profiles are
partially reversible and show some scan rate dependence; therefore,
they only provide a first, rough assessment of the mutational effects
on thermodynamic stability. (b) Plot of denaturation temperature (as
determined from the calorimetric profiles) versus the number of engineered
disulfide bridges. (c) Reversibility of the thermal denaturation is higher
for the phytase variants with two and three engineered disulfide bridges.
Reprinted from Ref. [46] (published under a Creative Commons Attribution
license).

not likely to distort or strain the native structure. Variants with


one, two, and three engineered disulfide bridges were prepared
and characterized exhaustively on the basis of differential scan-
ning calorimetry (Fig. 25.4) and inactivation kinetics at several
temperatures (Fig. 25.5). The variants showed both enhanced
thermodynamic stability and enhanced kinetic stability, supporting
that the engineered disulfide bridges were located in regions of the
structure that become unfolded in the kinetically relevant transition
state.
Possibly, the most striking result of this study was the strongly
non-Arrhenius temperature dependence observed for the inactiva-
tion rate of all variants studied (Fig. 25.5). In fact, for all the variants
the timescale of the irreversible denaturation process reached a
minimum at an intermediate temperature within the denaturation
transition range [46]. This striking behavior was shown to be a
signature of the key kinetic role in irreversible aggregation played by
a highly unfolded intermediate state/ensemble. Such an intermediate
state would of course be a good model for the kinetically relevant
transition state of the irreversible denaturation process.
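One way to see how a kinetically competent intermediate can produce non-Arrhenius behavior is the toy model below. It is not the analysis of Ref. [46]. It assumes a three-state equilibrium N ⇌ I ⇌ U described by van't Hoff constants, an irreversible step that proceeds only from I, and a temperature-independent rate constant for that step. All transition temperatures, enthalpies, and function names are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant, kJ/(mol*K)


def equilibrium_constant(temp_k, t_mid_k, dh_kj):
    """van't Hoff constant with K = 1 at t_mid_k (heat capacity change neglected)."""
    return math.exp(-dh_kj / R * (1.0 / temp_k - 1.0 / t_mid_k))


def intermediate_fraction(temp_k, t1=335.0, dh1=400.0, t2=345.0, dh2=300.0):
    """Equilibrium fraction of I for N <-> I (t1, dh1) and I <-> U (t2, dh2)."""
    k1 = equilibrium_constant(temp_k, t1, dh1)
    k2 = equilibrium_constant(temp_k, t2, dh2)
    return k1 / (1.0 + k1 + k1 * k2)


def apparent_rate(temp_k, k_irrev=1.0):
    """Apparent inactivation rate if only I feeds the irreversible step."""
    return k_irrev * intermediate_fraction(temp_k)


def fastest_temperature(t_low=320.0, t_high=370.0, step=0.5):
    """Grid temperature at which the apparent rate is largest."""
    n = int(round((t_high - t_low) / step))
    temps = [t_low + i * step for i in range(n + 1)]
    return max(temps, key=apparent_rate)


if __name__ == "__main__":
    t_fast = fastest_temperature()
    print(f"apparent rate peaks near {t_fast:.1f} K")
```

In this sketch the apparent rate is largest, and the timescale therefore shortest, at a temperature between the two transition midpoints. Below it the intermediate is barely populated, and above it the intermediate gives way to a state assumed here not to feed the irreversible step. A realistic treatment would also let the rate constant of the irreversible step vary with temperature.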

Figure 25.5 Kinetics of thermal inactivation of a variant of phytase from
Citrobacter braakii with two engineered disulfide bridges (a) compared
to the WT protein (b). A/A0 is the fraction of the initial enzyme activity.
The variant shows a considerable kinetic stabilization with respect to
the WT enzyme. Note that in both cases, thermal inactivation is fastest
at an intermediate temperature (a strongly non-Arrhenius temperature
dependence), a result that suggests a key kinetic role of a substantially
unfolded intermediate state (likely a good model for the kinetically relevant
transition state). Reprinted from Ref. [46] (published under a Creative
Commons Attribution license).

25.5 Role of Solvation Barriers in Kinetic Stability: Lipases
and Triose Phosphate Isomerases

Previous sections would seem to convey a simple structural picture
of the transition states for irreversible denaturation processes
as partially unfolded or partially disrupted native structures.
In fact, a transition state belongs to the top of a free-energy
barrier and may display rather unusual features when compared
with the states (native, unfolded) at free-energy basins. Indeed,
experimental and computational analyses support that transition
states for protein folding/unfolding processes may display a clearly
non-native energetic imbalance associated with solvation/desolvation
of amino acid moieties [47–51]. For instance, protein unfolding
implies the interaction with water (i.e., solvation) of the amino acid
residues buried in the native structure. However, the molecule of
water has a finite size and buried interacting amino acids must
separate a certain distance before water molecules can slip in. Such
water-unsatisfied separation involves a large increase in energy
associated to disrupted internal interactions (van der Waals packing
interactions between buried hydrophobic residues, for instance).
This situation is often described in the literature as a solvation
barrier, and it would be described as a desolvation barrier if the
process is seen in the folding direction (i.e., from the unfolded
state to the transition state). What is meant by these terms is
essentially that there may be a large contribution to the free-
energy barrier for folding–unfolding due to the asynchrony between
water penetration/release and disruption/setting up of internal
interactions (Fig. 25.6). While solvation/desolvation barriers were
originally proposed to play a role in folding/unfolding processes,
their contribution to protein kinetic stability has been established
mainly through the studies on lipases and triose phosphate
isomerases summarized below.

Figure 25.6 Cartoon depiction of solvation/desolvation barrier effects.
Blue color is used for the protein surface that is accessible to the solvent.
Red color represents broken or disrupted internal interactions that are not
satisfied by water molecules. A folding–unfolding scenario is shown in this
figure, but the type of transition state depicted may also play a role in kinetic
stability. Reprinted from Ref. [51], Copyright (2006), with permission from
Elsevier.

Stabilization of lipase from Thermomyces lanuginosus (an important
industrial enzyme) has been addressed on the basis of
directed evolution focused on flexible regions detected in high-
temperature molecular dynamics simulations [52]. The variants
thus obtained showed substantially enhanced kinetic stability (Fig.
25.7), a result consistent with the general proposal that heated
molecular dynamics simulations may plausibly locate regions of
the structure that become disrupted in the kinetically relevant
transition state.

Figure 25.7 (Left) Mutational effects on the free-energy barrier for the
irreversible denaturation of the lipase from Thermomyces lanuginosus (ΔΔG‡
values). Free-energy changes have enthalpic and entropic components, and
here the ΔΔG‡ values are plotted versus their enthalpic components, that
is, versus the mutational changes in activation enthalpy (ΔΔH‡ values).
(Right) Activation urea m values (m‡) for the several lipase variants studied
plotted versus the mutational changes in (a) activation enthalpy and (b) the
denaturation temperature. The m‡ values shown are substantially smaller
than the m values for complete unfolding of lipase (about 17 kJ/[mol·M]),
indicating low exposure to the solvent in the transition state. Furthermore,
the m‡ values do not correlate with the corresponding mutational changes
in activation enthalpy, supporting a transition state akin to that depicted in
Fig. 25.6. Reprinted from Ref. [52], Copyright (2006), with permission from
Elsevier.

However, detailed analyses of differential scanning
calorimetry thermograms for the irreversible denaturation of the
lipase variants led to a more intricate picture. Activation enthalpy
values can be derived from the fittings of a two-state irreversible
denaturation model to the calorimetric profiles. Furthermore,
calorimetric experiments at different urea concentrations allow
the estimation of the denaturant-concentration dependence of the
activation free energy, that is, the so-called activation urea m
value, which is generally taken as a measure of the exposure to
the solvent of the transition state as compared with the native
state. Surprisingly, for all variants the two values showed a clear
mismatch, with the activation enthalpy being very large (on the
order of hundreds of kilojoules per mole) and the activation m values
being comparatively small. A high activation enthalpy suggests that
a substantial number of interactions are broken or disrupted in the
transition state, while a low urea m value indicates the transition
state has a low exposure to solvent. It follows that disrupted (or
partially disrupted) interactions in the transition state are mainly
internal and still not satisfied by water molecules; we find, therefore,
the structural pattern expected to be associated with a solvation
barrier contribution.
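The activation urea m value discussed above is, in practice, the slope of the activation free energy against denaturant concentration. The sketch below assumes the linear form ΔG‡ = ΔG‡(water) − m‡·[urea] and uses an ordinary least-squares fit. The function names and the data in the demo are hypothetical. The only number taken from this chapter is the equilibrium m value of about 17 kJ/(mol·M) quoted for lipase in Fig. 25.7.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def activation_m_value(urea_molar, dg_act_kj):
    """Return (barrier in water, m_act), assuming
    dG_act = dG_act_water - m_act * [urea]."""
    intercept, slope = fit_line(urea_molar, dg_act_kj)
    return intercept, -slope


def relative_exposure(m_act, m_unfold):
    """Ratio m_act / m_unfold, read as the fraction of the solvent exposure
    of full unfolding that is already reached in the transition state."""
    return m_act / m_unfold


if __name__ == "__main__":
    urea = [0.0, 1.0, 2.0, 3.0]           # M, made-up concentrations
    barrier = [105.0, 103.0, 101.0, 99.0]  # kJ/mol, made-up barriers
    dg_water, m_act = activation_m_value(urea, barrier)
    print(f"m_act = {m_act:.2f} kJ/(mol*M), "
          f"exposure = {relative_exposure(m_act, 17.0):.2f}")
```

A small exposure ratio combined with a large activation enthalpy is the pattern the text attributes to a solvation barrier: many interactions are disrupted, yet little new surface is in contact with solvent.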
The activation enthalpy versus the urea m value disparity
observed with lipase also extends to the corresponding muta-
tion effects on these quantities (Fig. 25.7), revealing that the
solvation barrier contribution is highly sensitive to mutation [52,
53]. Furthermore, studies on the kinetic stability of triose phosphate
isomerases [53, 54] reproduced the activation-enthalpy/urea m
pattern associated with a solvation barrier contribution and also
showed that such contributions differ significantly among triose
phosphate isomerases from different organisms. It emerges, overall,
that solvation/desolvation effects may provide an efficient mech-
anism for the modulation of free-energy barriers. The rational
exploitation of this mechanism for kinetic stability enhancement
is not straightforward at this stage, however, since, among other
problems, solvation barrier contributions seem to be characterized
by pervasive enthalpy/entropy compensation phenomena [53, 54]
that often are difficult to rationalize and predict.
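The compensation problem can be stated compactly. A mutational change in the barrier splits into enthalpic and entropic parts, and compensation means that the two nearly cancel. The numbers in the second line are hypothetical and serve only to illustrate the scale of the effect.

```latex
\begin{align}
\Delta\Delta G^{\ddagger} &= \Delta\Delta H^{\ddagger} - T\,\Delta\Delta S^{\ddagger} \\
\Delta\Delta H^{\ddagger} = 40~\text{kJ/mol},\quad
T\,\Delta\Delta S^{\ddagger} = 37~\text{kJ/mol}
\;&\Rightarrow\; \Delta\Delta G^{\ddagger} = 3~\text{kJ/mol}
\end{align}
```

In such a case a large change in activation enthalpy yields only a modest change in the barrier, which is why enthalpic effects alone are a poor guide to the resulting kinetic stability.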

25.6 Concluding Remarks

We have purposely used in this chapter an approach to protein
kinetic stability based exclusively on phenomenological descriptions
of the transition states for the kinetically relevant conformational
processes. Admittedly, this approach has obvious limitations. For
instance, kinetic stability in a highly proteolytic environment may re-
quire not only a high free-energy barrier for unfolding/denaturation
[44] but also the suppression of the local structure fluctuations
that render the native state susceptible to proteolysis [36]. It does
not appear that the molecular mechanisms responsible for such
suppression can be described in terms of a single transition state
associated to extensive conformational changes. Also, the long-
term kinetic stability at low temperature that is relevant for the
shelf life of protein pharmaceuticals may be determined in part
by chemical alterations of residues in the native structure (such
as deamidation or cyclic imide formation) [14], processes that,
again, may not be linked to large conformational alterations. From
a more general viewpoint, it must be recognized that mechanisms of
irreversible protein denaturation may in some cases be considerably
complex [46, 55–57]. For instance, protein aggregation may involve
steps of nucleation, growth, fragmentation, coalescence, etc. Clearly,
assuming a single rate-determining step and a single kinetically
relevant free-energy barrier may not be realistic in some cases.
The above caveats notwithstanding, it stems from the examples
discussed in this chapter that a description on the basis of a single
transition state may be adequate as a first approximation in many
cases, and furthermore, it may be useful in the following sense:
(i) it may provide clues to the molecular mechanisms behind the
natural selection of the kinetic stabilization found in many protein
systems; (ii) it may offer suitable guidelines for protein engineering
as, for instance, directed evolution procedures aimed at enhancing
protein kinetic stability are more likely to be successful if focused
on the regions of the protein structure that become unfolded or
disrupted in the kinetically relevant transition state; and (iii) it may
open up the possibility of using known computational procedures
for transition-state prediction in the rational design of kinetic
stability, and in this context, it is noteworthy that several studies
(discussed in this chapter) support that high-temperature molecular
dynamics simulations can be employed to assess the regions of
the structure that are likely unfolded/disrupted in the kinetically
relevant transition state.

Acknowledgments

Work in the author’s lab is supported by grants BIO2012-34937
and CSD2000-00088 (Spanish Ministry of Economy and Competitiveness)
and Feder Funds.

References

1. Carpenter, J. F., and Manning, M. C. (2002). Rational Design of Stable
Protein Formulations: Theory and Practice (Kluwer Academic/Plenum,
New York).
2. Kirk, O., Borchert, T. V., and Fuglsang, C. C. (2002). Industrial enzyme
applications, Curr. Opin. Biotechnol., 13, pp. 345–351.
3. Lazar, G. A., Marshall, S. A., Plecs, J. J., Mayo, S. L., and Desjarlais, J.
R. (2003). Designing proteins for therapeutic applications, Curr. Opin.
Struct. Biol., 13, pp. 513–518.
4. Bloom, J. D., Labthavikul, S. T., Otey, C. R., and Arnold, F. H. (2006). Protein
stability promotes evolvability, Proc. Natl. Acad. Sci. U S A, 103, pp. 5869–
5874.
5. Khersonsky, O., Kiss, G., Röthlisberger, D., Dym, O., Albeck, S., Houk,
K. N., Baker, D., and Tawfik, D. S. (2012). Bridging the gaps in
design methodologies by evolutionary optimization of the stability and
proficiency of designed Kemp eliminase KE59, Proc. Natl. Acad. Sci. U S A,
109, pp. 10358–10363.
6. Becktel, W. J., and Schellman, J. A. (1987). Protein stability curves,
Biopolymers, 26, pp. 1859–1877.
7. Privalov, P. L. (1990). Cold denaturation of proteins, Crit. Rev. Biochem.
Mol. Biol., 25, pp. 281–305.
8. Sanchez-Ruiz, J. M. (1995). Chapter 6, Differential scanning calorimetry
of proteins. In Subcellular Biochemistry, Biswas, B. B., and Roy, S., eds.
(Plenum Press, New York), 24, pp. 133–176.
9. Klibanov, A. M., and Ahern, T. J. (1987). Thermal stability of proteins. In
Protein Engineering, Oxender, D. L., and Fox, C. F., eds. (Alan R. Liss, New
York), pp. 213–218.
10. Lumry, R., and Eyring, H. (1954). Conformational changes of proteins, J.
Phys. Chem., 58, pp. 110–120.
11. Freire, E., van Osdol, W. W., Mayorga, O. L., and Sanchez-Ruiz, J. M.
(1990). Calorimetrically-determined dynamics of complex unfolding
transitions in proteins, Annu. Rev. Biophys. Biophys. Chem., 19, pp. 159–
188.
12. Plaza del Pino, I. M., Ibarra-Molero, B., and Sanchez-Ruiz, J. M. (2000).
Lower kinetic limit to protein thermal stability: a proposal regarding
protein stability in vivo and its relation with misfolding diseases,
Proteins, 40, pp. 58–70.
13. Sanchez-Ruiz, J. M. (1992). Theoretical analysis of Lumry-Eyring models
in differential scanning calorimetry, Biophys. J., 61, pp. 921–935.
14. Chang, B. S., and Hershenson, S. (2002). Practical approaches to
protein formulation development. In Rational Design of Stable Protein
Formulations: Theory and Practice, Carpenter, J. F., and Manning, M. C.,
eds. (Kluwer Academic/Plenum, New York), pp. 1–25.
15. Sanchez-Ruiz, J. M. (2010). Protein kinetic stability, Biophys. Chem., 148,
pp. 1–15.
16. Pace, C. N., Grimsley, G. R., Thomson, J. A., and Barnett, B. J. (1988).
Conformational stability and activity of ribonuclease T1 with zero,
one, and two intact disulfide bonds, J. Biol. Chem., 263, pp. 11820–
11825.
17. Zhang, T., Bertelsen, E., and Alber, T. (1994). Entropic effects of disulfide
bonds on protein stability, Nat. Struct. Biol., 1, pp. 434–438.
18. Gribenko, A. V., Patel, M. M., Liu, J., McCallum, S. A., Wang, C.,
and Makhatadze, G. I. (2009). Rational stabilization of enzymes by
computational redesign of surface charge interactions, Proc. Natl. Acad.
Sci. U S A, 106, pp. 2601–2606.
19. Ibarra-Molero, B., and Sanchez-Ruiz, J. M. (2002). Genetic algorithm to
design stabilizing surface-charge distributions in proteins, J. Phys. Chem.
B, 106, pp. 6609–6613.
20. Sanchez-Ruiz, J. M., and Makhatadze, G. I. (2001). To charge or not to
charge?, Trends Biotechnol., 19, pp. 132–135.
21. Malakauskas, S. M., and Mayo, S. L. (1998). Design, structure and
stability of a hyperthermophilic protein variant, Nat. Struct. Biol., 5, pp.
470–475.
22. Potapov, V., Cohen, M., and Schreiber, G. (2009). Assessing computa-
tional methods for predicting protein stability upon mutation: good on
average but not in the details, Protein Eng. Des. Sel., 22, pp. 553–560.
23. Creamer, T. P., Srinivasan, R., and Rose, G. D. (1995). Modeling unfolded
states of peptides and proteins, Biochemistry, 34, pp. 16245–16250.
24. Fu, H., Grimsley, G., Scholtz, J. M., and Pace, C. N. (2010). Increasing
protein stability: importance of ΔCp and the denatured state,
Protein Sci., 19, pp. 1044–1052.
25. Guzman-Casado, M., Parody-Morreale, A., Robic, S., Marqusee, S., and
Sanchez-Ruiz, J. M. (2003). Energetic evidence for the formation of a
pH-dependent hydrophobic cluster in the denatured state of Thermus
thermophilus ribonuclease H, J. Mol. Biol., 329, pp. 731–743.
26. Robic, S., Guzman-Casado, M., Sanchez-Ruiz, J. M., and Marqusee, S.
(2003). Role of residual structure in the unfolded state of a thermophilic
protein, Proc. Natl. Acad. Sci. U S A, 100, pp. 11345–11349.
27. Xiao, S., Patsalo, V., Shan, B., Bi, Y., Green, D. F., and Raleigh, D. P. (2013).
Rational modification of protein stability by targeting surface sites leads
to complicated results, Proc. Natl. Acad. Sci. U S A, 110, pp. 11337–
11342.
28. Bryngelson, J. D., Onuchic, J. N., Socci, N. D., and Wolynes, P. G. (1995).
Funnels, pathways, and the energy landscape of protein folding: a
synthesis, Proteins, 21, pp. 167–195.
29. Dill, K. A., and Chan, H. S. (1997). From Levinthal to pathways to funnels,
Nat. Struct. Biol., 4, pp. 10–19.
30. Sanchez-Ruiz, J. M. (2011). Probing free-energy surfaces with differen-
tial scanning calorimetry, Annu. Rev. Phys. Chem., 62, pp. 231–255.
31. Godoy-Ruiz, R., Henry, E. R., Kubelka, J., Hofrichter, J., Muñoz, V., Sanchez-
Ruiz, J. M., and Eaton, W. A. (2008). Estimating free-energy barrier
heights for an ultrafast folding protein from calorimetric and kinetic
data, J. Phys. Chem. B, 112, pp. 5938–5949.
32. Naganathan, A. N., Sanchez-Ruiz, J. M., and Muñoz, V. (2005). Direct
measurement of barrier heights in protein folding, J. Am. Chem. Soc.,
127, pp. 17970–17971.
33. Naganathan, A. N., Li, P., Perez-Jimenez, R., Sanchez-Ruiz, J. M., and
Muñoz, V. (2010). Navigating the downhill protein folding regime via
structural homologues, J. Am. Chem. Soc., 132, pp. 11183–11190.
34. Fersht, A. R., Matouschek, A., and Serrano, L. (1992). The folding of
an enzyme: I. Theory of protein engineering analysis of stability and
pathway of protein folding, J. Mol. Biol., 224, pp. 771–782.
35. Naganathan, A. N., and Muñoz, V. (2010). Insights into protein folding
mechanisms from large scale analysis of mutational effects, Proc. Natl.
Acad. Sci. U S A, 107, pp. 8611–8616.
36. Jaswal, S. S., Sohl, J. L., and Agard, D. A. (2002). Energetic landscape of the
α-lytic protease optimizes longevity through kinetic stability, Nature,
415, pp. 343–346.
37. Kelch, B. A., and Agard, D. A. (2007). Mesophile versus thermophile:
insights into the structural mechanism of kinetic stability, J. Mol. Biol.,
370, pp. 784–795.
38. Kelch, B. A., Eagen, K. P., Erciyas, F. P., Humphris, E. L., Thomason, A.
R., Mitsuiki, S., and Agard, D. A. (2007). Structural and mechanistic
exploration of acid resistance: kinetic stability facilitates evolution of
extremophilic behavior, J. Mol. Biol., 368, pp. 870–883.
39. Salimi, N. L., Ho, B., and Agard, D. A. (2010). Unfolding simulations reveal
the mechanism of extreme unfolding cooperativity in the kinetically
stable α-lytic protease, PLOS Comput. Biol., 6, p. e1000689.
40. Kelch, B. A., Salimi, N. L., and Agard, D. A. (2012). Functional modulation
of a protein folding landscape via side-chain distortion, Proc. Natl. Acad.
Sci. U S A, 109, pp. 9414–9419.
41. Godoy-Ruiz, R., Perez-Jimenez, R., Ibarra-Molero, B., and Sanchez-Ruiz,
J. M. (2004). Relation between protein stability, evolution and structure,
as probed by carboxylic acid mutations, J. Mol. Biol., 336, pp. 313–
318.
42. Godoy-Ruiz, R., Perez-Jimenez, R., Ibarra-Molero, B., and Sanchez-Ruiz,
J. M. (2005). A stability pattern of protein hydrophobic mutations that
reflects evolutionary structural information, Biophys. J., 89, pp. 3320–
3331.
43. Godoy-Ruiz, R., Ariza, F., Rodriguez-Larrea, D., Perez-Jimenez, R., Ibarra-
Molero, B., and Sanchez-Ruiz, J. M. (2006). Natural selection for kinetic
stability is a likely origin of correlations between mutational effects
on protein energetics and frequencies of amino acid occurrences in
sequence alignments, J. Mol. Biol., 362, pp. 966–978.
44. Tur-Arlandis, G., Rodriguez-Larrea, D., Ibarra-Molero, B., and Sanchez-
Ruiz, J. M. (2010). Proteolytic scanning calorimetry: a novel method-
ology that probes the fundamental features of protein kinetic stability,
Biophys. J., 98, pp. L12–L14.
45. Pey, A. L., Rodriguez-Larrea, D., Bomke, S., Dammers, S., Godoy-Ruiz, R.,
Garcia-Mira, M. M., and Sanchez-Ruiz, J. M. (2008). Engineering proteins
with tunable thermodynamic and kinetic stabilities, Proteins, 71, pp.
165–174.
46. Sanchez-Romero, I., Ariza, A., Wilson, K. S., Skjøt, M., Vind, J., De Maria, L.,
Skov, L. K., and Sanchez-Ruiz, J. M. (2013). Mechanism of protein kinetic
stabilization by engineered disulfide crosslinks, PLOS ONE, 8, p. e70013.
47. Chan, H. S., Zhang, Z., Wallin, S., and Liu, Z. (2011). Cooperativity, local-
nonlocal coupling, and nonnative interactions: principles of protein
folding from coarse-grained models, Annu. Rev. Phys. Chem., 62, pp. 301–
326.
48. Cheung, M. S., Garcia, A. E., and Onuchic, J. N. (2002). Protein
folding mediated by solvation: water expulsion and formation of the
hydrophobic core occur after structural collapse, Proc. Natl. Acad. Sci.
U S A, 99, pp. 685–690.
49. Ferguson, A., Liu, Z., and Chan, H. S. (2009). Desolvation barrier effects
are a likely contributor to the remarkable diversity in the folding rates
of small proteins, J. Mol. Biol., 389, pp. 619–636.
50. Rank, J. A., and Baker, D. (1997). A desolvation barrier to hydrophobic
cluster formation may contribute to the rate-limiting step in protein
folding, Protein Sci., 6, pp. 347–354.
51. Rodriguez-Larrea, D., Ibarra-Molero, B., and Sanchez-Ruiz, J. M. (2006).
Energetic and structural consequences of desolvation/solvation barri-
ers to protein folding/unfolding assessed from experimental unfolding
rates, Biophys. J., 91, pp. L48–L50.
52. Rodriguez-Larrea, D., Minning, S., Borchert, T. V., and Sanchez-Ruiz, J. M.
(2006). Role of solvation barriers in protein kinetic stability, J. Mol. Biol.,
360, pp. 715–724.
53. Costas, M., Rodriguez-Larrea, D., De Maria, L., Borchert, T. V., Gomez-
Puyou, A., and Sanchez-Ruiz, J. M. (2009). Between-species variation
in the kinetic stability of TIM proteins linked to solvation-barrier free
energies, J. Mol. Biol., 385, pp. 924–927.
54. Aguirre, Y., Cabrera, N., Aguirre, B., Perez-Montfort, R., Hernandez-
Santoyo, A., Reyes-Vivas, H., Enriquez-Flores, S., Tuena de Gómez-Puyou,
M., Gómez-Puyou, A., Sanchez-Ruiz, J. M., and Costas, M. (2014). Different
contributions of conserved amino acids to the global properties of
triosephosphate isomerases, Proteins, 82, pp. 323–335.
55. Cohen, S. I., Vendruscolo, M., Dobson, C. M., and Knowles, T. P. (2012).
From macroscopic measurements to microscopic mechanisms of protein
aggregation, J. Mol. Biol., 421, pp. 160–171.
56. Roberts, C. J., Das, T. K., and Sahin, E. (2011). Predicting solution
aggregation rates for therapeutic proteins: approaches and challenges,
Int. J. Pharm., 418, pp. 318–333.
57. Weiss, W. F., Young, T. M., and Roberts, C. J. (2009). Principles,
approaches and challenges for predicting protein aggregation rates and
shelf life, J. Pharm. Sci., 98, pp. 1246–1277.
