Understanding The Catalytic Mechanisms of Selected Glycoside Hydr

University of Wollongong
Research Online
University of Wollongong Thesis Collection University of Wollongong Thesis Collections

2017+
2018
Understanding the Catalytic Mechanisms of Selected Glycoside Hydrolases

Kela Xiao
Follow this and additional works at: https://ro.uow.edu.au/theses1
Copyright Warning
You may print or download ONE copy of this document for the purpose of your own research or study. The University
does not authorise you to copy, communicate or otherwise make available electronically to any other person any
copyright material contained on this site.
You are reminded of the following: This work is copyright. Apart from any use permitted under the Copyright Act
1968, no part of this work may be reproduced by any process, nor may any other exclusive right be exercised,
without the permission of the author. Copyright owners are entitled to take legal action against persons who infringe
their copyright. A reproduction of material that is protected by copyright may be a copyright infringement. A court
may impose penalties and award damages in relation to offences and infringements relating to copyright material.
Higher penalties may apply, and higher damages may be awarded, for offences and infringements involving the
conversion of material into digital or electronic form.
Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily
represent the views of the University of Wollongong.
Recommended Citation
Xiao, Kela, Understanding the Catalytic Mechanisms of Selected Glycoside Hydrolases, Doctor of
Philosophy thesis, School of Chemistry, University of Wollongong, 2018. https://ro.uow.edu.au/theses1/
258
Research Online is the open access institutional repository for the University of Wollongong. For further information
contact the UOW Library: research-pubs@uow.edu.au
Understanding the Catalytic Mechanisms of Selected
Glycoside Hydrolases
Kela Xiao
This thesis is presented as part of the requirements for the conferral of the degree:
Doctor of Philosophy
Supervisor:
Dr Haibo Yu
Co-supervisor:
A/Prof Danielle Skropeta
The University of Wollongong

School of Chemistry
March 2018
This work
c copyright by Kela Xiao, 2018. All Rights Reserved.
No part of this work may be reproduced, stored in a retrieval system, transmitted, in any form or by any
means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of the
author or the University of Wollongong.
This research has been conducted with the support of the Australian Government Research Training
Program Scholarship.
Declaration
I, Kela Xiao, declare that this thesis is submitted in partial fulfilment of the requirements
for the conferral of the degree Doctor of Philosophy, from the University of Wollongong,
is wholly my own work unless otherwise referenced or acknowledged. This document
has not been submitted for qualifications at any other academic institution.
Kela Xiao
June 14, 2018

Abstract
Carbohydrate-active enzymes (CAZymes) are responsible for the synthesis, modification

and degradation of glycosidic bonds in a large number of carbohydrate based structures.
Among CAZymes, glycoside hydrolases (GH) and glycoside transferases (GT) catalyze
the reversible breaking and formation of the glycosidic bonds, respectively. GHs have
been shown to play vital roles in many biological processes. Consequently, they have
drawn great attention as potential drug targets. This thesis will focus on a few critical
issues in GHs. The Bacillus circulans xylanase (BcX) and Streptococcus pneumoniae (S.
pneumoniae) sialidase are two of the important GH families, for which a great deal of
information has been reported. At the same time there are critical mechanistic questions
not well understood. BcX, a retaining GH11, contains a nucleophile Glu78 residue and a
general acid/base Glu172 residue, which facilitates the glycosylation and deglycosylation
step, respectively. A so-called “pKa cycling”, which is common in retaining GHs, was es-
tablished for Glu172 of BcX. Various methods including quantum mechanics/molecular
mechanics (QM/MM) free energy calculation, multiconformation continuum electrostat-
ics (MCCE) and the PROPKA method were applied to predict the pKa s or pKa shifts of
Glu172 and other key residues, aiming to rationalise the effects of structural and electro-
static perturbations caused by mutations. S. pneumoniae sialidase, a GH33, is a common
resident of the human nasopharynx that can cause a number of pathological conditions, in-
cluding otitis media, meningitis and septicaemia. It encodes up to three distinct sialidases:
NanA, a classical hydrolytic sialidase; NanB, an intramolecular tran-sialidase; and NanC,
a sugar dehydratase. These three S. pneumoniae sialidases share a common first-step re-
action pathway, while the second step shows an intriguing difference, producing Neu5Ac,
2,7-anhydro-Neu5Ac and Neu5Ac2en, respectively. The mechanisms, structural and en-
ergetic properties and the potential reaction pathways for NanA, NanB and NanC have
been systematically studied using reaction barrier and energy calculations with combined
QM/MM simulations. Complementary to previous studies, this thesis provided additional
mechanistic insights into the determinants for their distinct reaction pathways.
iv
Publications and Conference
Presentations
Publications
Montgomery AP*, Xiao K*, Wang X, Skropeta D, Yu H. “Computational glycobiology:
mechanistic studies of carbohydrate active enzymes and implication for inhibitor”, Ad-
vances in Protein Chemistry and Structural Biology, 2017, 109, 25-76. *Co-first author.
(Chapter 1)
Xiao K, Yu H. “Rationalising pKa shifts in Bacillus circulans xylanase with computa-

tional studies”. Physical Chemistry Chemical Physics, 2016,18, 30305-30312. (Chapter
2)
Xiao K, Yu H. “Exploring the catalytic mechanisms of the S. pneumoniae sialidases

NanA”. to be submitted, 2018. (Chapter 3)
Xiao K, Yu H. “Comparative studies of catalytic pathways for S. pneumoniae NanB and

NanC“. to be submitted, 2018. (Chapter 4)
Conference Presentations
Xiao K, Yu H. (Poster) “pKa shift calculations of the general acid/base in glycoside hy-
drolase family 11”, MM2015 - the annual conference of the Association of Molecular
Modellers of Australasia, Sydney, 2015.
Xiao K, Yu H. (Poster) “pKa shift calculations of the general acid/base in glycoside hy-
drolase family 11”, ASB2015 - the annual conference of the Australian Society for Bio-
physics, Armidale, 2015.
Xiao K, Yu H. (Talk) “Rationalising pKa shifts in Bacillus circulans xylanase with com-
putational studies”, New Zealand Institute of Chemistry Conference, Queenstown NZ,
2016.
v
vi
Xiao K, Montgomery AP, Wang X, Skropeta D, and Yu H. (Invited Talk) “Mechanistic

studies of carbohydrate-active enzymes and implication for inhibitor design”, MM2017 -
the annual conference of the Association of Molecular Modellers of Australasia, Margaret
River, 2017.
Acknowledgments
I would like to acknowledge my supervisors Dr Haibo Yu and A/Prof Danielle Skropeta

for their help, time, guidance, encouragement and full support in supervising my PhD
project. I would also like to thank the Yu Research group for their kind help and sug-
gestions that have been valuable throughout the research. I am grateful to Andrew Mont-
gomery, Nehad Elsalamouny and Chris Dobie for proofreading the thesis. I wish to thank
the University of Wollongong and Australian International Postgraduate Award for their
financial assistance for my PhD. I also wish to thank National Computational Infras-
tructure and UOW High Performance Computing Cluster for their technical support. I
am truly grateful to everyone who has directly or indirectly supported and helped me
throughout my PhD study in the School of Chemistry at the University of Wollongong.
vii
Abbreviations
Organisms
C. perfingens Clostridium perfringens

C. septicum Clostridium septicum
M. decora Macrobdella decora
S. pneumoniae Streptococcus pneumoniae
S. typhimurium Salmonella typhimurium
T. cruzi Trypanosoma cruzi
T. rangeli Trypanosoma rangeli
V. cholerae Vibrio cholerae
Sialidases
HN Haemaglutinin-neuraminidase
IT-sialidase Intromolecular trans-sialidase
NA Influenza virus sialidase
NanA S. pneumoniae sialidase A
NanB S. pneumoniae sialidase B
NanC S. pneumoniae sialidase C
NEU1 Human lysosomal sialidase
NEU2 Human cytosomal sialidase
NEU3 Human plasma membrane-associated sialidase
NEU4 Human mitochondrial sialidase
TcTS T. cruzi trans-sialidase
TrSA T. rangeli sialidase
viii
ix
Technical & Miscellaneous
AIDS Acquired immune deficiency syndrome

BcX Bacillus circulans xylanase
CAZy Carbohydrate-Active Enzymes
CBM Carbohydrate binding module
CDD Conserved Domain Database
CE Continuum electrostatics
DNA Deoxyribonucleic acid
DIV Divided frontier charge model
DFT Density functional theory
DFTB Density-functional tight-binding
DFTB2 Self-consistent charge density-functional tight-binding
DFTB3 Third order of extension of self-consistent charge density functional tight-binding
GH Glycoside hydrolase
GT Glycosyltransferases
GSBP General solvent boundary potential
HA Haemaglutinin
1 H NMR Proton nuclear magnetic resonance
IC Intermediate complex
KS Kohn-Sham
LCAO Linear combination of atomic orbitals
MD Molecular dynamics
MM Molecular mechanics
MCCE Multiconformation continuum electrostatics
MEP Minimum energy pathway
MC Michaelis complex
NMR Nuclear magnetic resonance
PB Poisson-Boltzmann
PMF Potential of mean force
PES Potential energy surface
PC Product complex
PDB Protein data bank
RMSD Root mean square deviation
RC Reactant complex
TI Thermodynamic integration
TS Transition state
WHAM Weighted histogram analysis method
WT Wild-type
x
Chemicals
2,7-anhydro-Neu5Ac 2,7-anhydro-α-N-acetylneuraminic acid

DFX 1,2-deoxy-2-fluoro-xylopyranose
Gal Galactose
GalNAc N-acetylgalactosamine
Neu Neuraminic acid
Neu5Ac N-acetylneuraminic acid, NANA
Neu5Ac2en 2-deoxy-2,3-didehydro-5-N-acetylneuraminic acid, DANA
3SL α-2,3-sialyllactose
6SL α-2,6-sialyllactose
XYP β -d-xylopyranose
Keywords
molecular dynamics (MD), combined quantumn mechanics and molecular mechanics

(QM/MM), enzymatic reactions, glycoside hydrolases, transition state (TS), pKa , mini-
mum energy pathway (MEP), potential of mean force (PMF), SCC-DFTB, DFTB3
xi
Contents
Abstract iv
Publications and Conference Presentations v
Acknowledgements vii
Abbreviations viii
Keywords xi
List of Figures xv
List of Tables xxiv
1 Introduction 1
1.1 The carbohydrate-active enzymes . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Puckering of carbohydrate structures . . . . . . . . . . . . . . . . 3
1.1.2 Cremer-Pople puckering coordinates . . . . . . . . . . . . . . . . 4
1.2 Glycoside hydrolases . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.1 The conformational catalytic itinerary in GHs . . . . . . . . . . . 6
1.2.2 pK a shifts in GHs . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Bacillus circulans xylanase . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 The sialidase superfamily . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4.1 Three S. pneumoniae sialidases NanA, NanB and NanC . . . . . 14
1.5 Computational methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.5.1 Molecular dynamics simulations . . . . . . . . . . . . . . . . . . 18
1.5.2 Combined quantum mechanics/molecular mechanics methods . . 19
1.5.3 pK a and pK a shift calculations in enzymes . . . . . . . . . . . . . 22
1.5.4 Reaction barriers and energies calculations based on QM/MM
simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.6 Overview of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
xii
CONTENTS xiii
2 pKa shifts in BcX 28

2.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3 Computational details . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3.1 Simulation setup . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.4 Results and discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.4.1 Equilibrated structures from QM/MM simulations . . . . . . . . 35
2.4.2 pKa calculations with QM/MM TI, PROPKA and MCCE . . . . . 35
2.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3 The catalytic mechanism of NanA 48

3.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.3.1 Simulation setup . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.3.2 MEP calculations . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.3.3 PMF simulations . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.4 Results and discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.4.1 Equilibrated structures from QM/MM simulations . . . . . . . . 53
3.4.2 Potential energy profiles for NanA along three different reaction
pathways . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.4.3 Description of the mechanisms for NanA from the reactant state
to the product state . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.4.4 Validation of SCC-DFTB/MM for reaction barriers and reaction
energies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.4.5 Free energy profiles for NanA along three different paths . . . . . 67
3.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4 The catalytic pathways of NanB and NanC 78

4.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.3.1 Simulation set up . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.3.2 MEP calculations . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.4 Results and Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.4.1 Equilibrated structures from QM/MM simulations of RC . . . . . 83
4.4.2 Potential energy profiles for NanB and NanC along three different
reaction pathways . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.4.3 Description of the mechanisms for NanB and NanC from the re-
actant state to the product state . . . . . . . . . . . . . . . . . . . 88
CONTENTS xiv
4.4.4 Single point calculations for the preferred routs of NanB and NanC 100
4.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5 Summary and future work 105

5.1 Concluding remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.2 Future work and outlook . . . . . . . . . . . . . . . . . . . . . . . . . . 106
Bibliography 108
Appendix A 139
A The catalytic mechanism of NanA 139

A.1 DFTB3/MM simulations . . . . . . . . . . . . . . . . . . . . . . . . . . 139
A.2 Additional MEP calculations for the second step of NanA along the reac-
tion path of PathC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
A.3 The extension MEP calculations for PathB and PathC . . . . . . . . . . . 140
Appendix B 144
B The catalytic pathways of NanB and NanC 144

B.1 Additional MEP calculations for the second steps of NanB and NanC
along the reaction path of PathC . . . . . . . . . . . . . . . . . . . . . . 144
B.2 The extension MEP calculations for NanB and NanC . . . . . . . . . . . 144
B.3 Comparison of the mechanisms for NanA, NanB and NanC from the in-
termediate to the product state . . . . . . . . . . . . . . . . . . . . . . . 145
List of Figures
1.1 Diversity of CAZymes and auxiliary enzymes, according to the CAZy

database. For the detailed catalytic mechanisms, please refer to www.
cazypedia.org.10 Reproduced from André et al. Current Opinion in
Chemical Biology, 17, 17-24 with permissionfrom Elsevier.11 . . . . . . . 2
1.2 α (1) and β (2) conformations of D-glucopyranose. . . . . . . . . . . . . 3
1.3 Stoddart diagrams of pyranose ring in the north and south poles. Figure
adapted from Montgomery et al.18 . . . . . . . . . . . . . . . . . . . . . 4
1.4 Cremer-Pople puckering coordinates of a pyranose ring. Figure adapted
from Montgomery et al.18 . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.5 Mechanisms of inverting (A) and retaining (B) glycoside hydrolases. Fig-
ure adapted from Montgomery et al.18 . . . . . . . . . . . . . . . . . . . 7
1.6 The double-displacement mechanism in BcX. Figure adapted from Xiao
and Yu.76 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.7 Mechanisms showing the transfer (a) and hydrolysis (b) reactions. TcTS
exhibits trans-sialidase activity, transferring sialic acid to the R-containing
mucins leading to the Michaelis Complex. TrSA preferentially shows the
hydrolytic activity, hydrolysing the Covalent Intermediate to the Michaelis
Complex by a water molecule. . . . . . . . . . . . . . . . . . . . . . . . 12
1.8 The topology of S. pneumoniae NanA (a), NanB (b) and NanC (c) active
sites. The catalytic cavity in NanA is flat and open while the ones in
NanB and NanC are relatively narrow due to the two Tryptophans (Trp674
and Trp716 in NanB and NanC, respectively.). Reproduced from http:
//hdl.handle.net/10023/778 by permission of Xu.170 . . . . . . . . 16
1.9 The proposed mechanisms for NanA, NanB and NanC sialidase enzymes. 17
1.10 A division of the system into QM and MM parts. The relatively small
purple is treated with a QM method and the rest of the system (includ-
ing the blue region and water molecules (labeled in red)) is treated with
classical mechanics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.1 The double-displacement mechanism in BcX and the measured pKa s for
Glu172 and Glu78. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
xv
LIST OF FIGURES xvi
2.2 Changing the global charge of BcX by succinylation in which the succinic
anhydride react with amines of protein does not significantly perturb its
pH-dependent activity. . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.3 DFTB3/MM with the General Solvent Boundary Potential (GSBP) model
for WT-2FXb. A spherical region of 20 Å radius centered on the C atom of
Glu172 was described as the inner region (shown in red) and the remain-
ing part was outer region represented with the continuum electrostatics;
Reaction region: a spherical model of 18 Å centered on the Cα atom
of Glu172, where Newtonian’s equations of motion were solved (labeled
in green); Buffer region: the region between reaction and inner region,
where Langevin dynamics was solved (labeled in blue); QM region: the
link atom (drawn in pink) was placed between Cβ and Cγ for Glu172 to
divide the bond between QM and MM region. . . . . . . . . . . . . . . . 34
2.4 The final snapshots from the explicit solvent DFTB3/MM simulations.
The physiologically important protonation states for the key residues (residue
35, 78 and 172) are shown. The key residues and substrate are shown ex-
plicitly. The water molecules within approximately 4 Å of OE2 in Glu172
are shown in red. (a) WT; (b) WT/2FXb; (c) N35H; (d) N35H/2FXb; (e)
N35D; (f) N35D/2FXb. . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.5 Thermodynamic integrations of the free energy derivatives from λ =0.0 to
λ =1.0 of Glu172 in BcX (a WT, b WT-2FXb, c N35H, d N35H-2FXb, e
N35D and f N35D-2FXb) from the DFTB3/MM simulations. The error
bars for each are presented. . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.6 Coordination number for water molecules around the carboxylic oxygen
of Glu172 in BcX (a WT, b WT-2FXb, c N35H, d N35H-2FXb, e N35D
and f N35D-2FXb) from the DFTB3/MM simulations. λ =0.0 refers to
the protonated state and λ =1.0 refers to the deprotonated state. . . . . . . 46
LIST OF FIGURES xvii
3.1 QM/MM simulation setup with the GSBP model for NanA. A spherical
region of 20 Å radius centered on Glu647 Oε2 was describedas the inner
region (shown in red) and the remaining part was outer region represented
with the continuum electrostatics; Reaction region: a sphericalmodel of
18 Å centered on Glu647 Oε2, where Newton’s equations of motion were
solved (labeled in green); Buffer region: the region betweenreaction and
inner region, where Langevin dynamics was solved (labeled in blue); QM
region: the link atom (drawn in light orange) was placed between Cβ
and Cγ , Cβ and Cγ , and Cβ and Cα for Glu647, Tyr752 and Asp372
respectively to divide the bond between QM and MM region. The QM
atoms are include Cγ , Hγ1, Hγ2, Cδ , Oδ 1, Oδ 2 of Glu647, Cγ , Cδ 1,
Hδ 1, Cδ 2, Hδ 2, Cε 1, Hε1, Cε 2, Hε2, Cζ , OHη, Hη of Tyr752, Cβ ,
Hβ 1, Hβ 2, Cγ , Oγ1, Oγ2, Hγ2 of Asp372 and the whole substrate. . . . . 50
3.2 The proposed conformation of the substrate in the RC. . . . . . . . . . . 52
3.3 The three predefined pathways for NanA. For the first step, d1 refers to
the distance between C2 position of the sugar ring and the glycosidic O
of methyl group, d2 is the distance between the glycosidic O of methyl
group and Asp372 Hδ 2, d3 describes the distance between Tyr752 Oη
and C2 position of the sugar ring, and d4 is the one between Tyr752 Hη
and Glu647 Oε2. For the second steps, d5 is the distance between Tyr752
Oη and C2 position of the substrate, d6 corresponds to the one between
Tyr752 Oη and Hη, d7 is the one between C2 position of the sugar ring
and the catalytic water oxygen, d8 is the one between Asp372 Oε2 and
catalytic water hydrogen, d9 defines the one between C2 position and O7
of the sugar ring, d10 refers to the one between C3 and H32 in substrate,
and d11 accounts for that between H32 of the sugar ring and catalytic
water oxygen. The reaction coordinates rxn1,rxn6 are also illustrated in
the figure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.4 The final snapshots from the explicit solvent SCC-DFTB/MM simulations
at 10ns. The key residues and substrate are shown explicitly. The water
molecules within approximately 5 Å of Oε2 of Glu647 are shown in red. . 55
3.5 Potential energy contour plot for the first reaction step of NanA. Energies
are in kcal/mol. The white dashed line illustrates the minimum potential
energy path. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.6 Potential energy contour plot for the 2nd step along PathA. Energies are in
kcal/mol. The white dashed line illustrates the minimum potential energy
path. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
LIST OF FIGURES xviii
3.7 Potential energy contour plot for the 2nd step along PathB. Energies are in
path. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.8 Potential energy contour plot for the 2nd step along PathC. Energies are in
path. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.9 Potential energy profiles for PathA (black), PathB (purple) and PathC
(blue). This potential energy profile was based on SCC-DFTB/MM sim-
ulations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.10 The More O’Ferral-Jencks plot for the IC formation. The axes are the
Pauling bond orders from the anomeric C to the nucleophile (n(C2sial -
Onuc ) ) and the leaving group (n(C2sial -Olg )) . The RC, TS1, and IC con-
figuration are shown in the bottom left, top left, and top right, respectively.
See Fig. 3.4 for more information. The water molecules are shown in red. 63
3.11 The More O’Ferral-Jencks plot for the PC formation in PathA. The axes
are the Pauling bond orders from the anomeric C to the water molecule
(n(C2sial -Owat ) ) and residue Tyr752 (n(C2sial -OHtyr752 )). The IC, TS2,
and PC configuration are shown in the bottom left, top left, and top right,
respectively. See Fig. 3.4 for more information. The catalytic water
molecule is shown in Licorice and the nearby water molecules are pre-
sented in red with CPK. . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.12 The More O’Ferral-Jencks plot for the PC formation in PathB. The axes
are the Pauling bond orders from the anomeric C to the O-7 atom of the
substrate (n(C2sial -O7sial ) ) and residue Tyr752 (n(C2sial -OHtyr752 )). The
IC, TS2, and PC configuration are shown in the bottom left, top left, and
top right, respectively. See Fig. 3.4 for more information. The catalytic
water molecule is shown in Licorice and the nearby water molecules are
presented in red with CPK. . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.13 The More O’Ferral-Jencks plot for the PC formation in PathC. The axes
are the Pauling bond orders from the anomeric C to the H-3 atom of the
substrate (n(C2sial -H32sial )) and residue Tyr752 (n(C2sial -OHtyr752 )). The
3.14 Potential energy contour plot for PathA. This potential energy was ob-
tained from single point calculation based on M06-2X/6-31+G(d,p). En-
ergies are in kcal/mol. The white dashed line illustrates the minimum
potential energy path. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
LIST OF FIGURES xix
3.15 Potential energy contour plot for PathB. This potential energy was ob-
3.16 Potential energy contour plot for PathC. This potential energy was ob-
(blue). This potential energy was obtained from single point calculation
based on M06-2X/6-31+G(d,p) with the SCC-DFTB/MM optimized ge-
ometries. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.18 Free energy contour plot for PathA. Energies are in kcal/mol. The white
dashed line illustrates the minimum free energy path. . . . . . . . . . . . 72
3.19 Free energy contour plot for PathB. Energies are in kcal/mol. The white
3.20 Free energy contour plot for PathC. Energies are in kcal/mol. The white
3.21 Free energy profiles for PathA(black), PathB(purple) and PathC(blue).
This free potential energy profile was based on SCC-DFTB/MM simu-
lations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.22 Water distribution around C2, HO7 and H32 in substrate from the SCC-
DFTB/MM simulations for NanA, NanB and NanC (a) Radial distribution
function of water oxygens around C2 in substrate in intermediate state of
NanA, NanB and NanC; (b) Radial distribution function of water oxy-
gens around HO7 in substrate in intermediate state of NanA, NanB and
NanC; (c) Radial distribution function of water oxygens around H32 in
substrate in intermediate state of NanA, NanB and NanC; (d) the coor-
dination number for water around around C2 in substrate in intermediate
state of NanA, NanB and NanC; (e) the coordination number for water
around around HO7 in substrate in intermediate state of NanA, NanB and
NanC; (f) the coordination number for water around around H32 in sub-
strate in intermediate state of NanA, NanB and NanC. . . . . . . . . . . . 76
LIST OF FIGURES xx
4.1 The three predefined pathways for NanB and NanC. For the first step,
d1 refers to the distance between C2 position of the sugar ring and the
glycosidic O of methyl group, d2 is the distance between the glycosidic
O of methyl group and the aspartic acid Hδ 2, d3 describes the distance
between the tyrosine Oη and C2 position of the sugar ring, and d4 is the
one between the tyrosine Hη and the glutamic acid Oε2. For the second
steps, d5 is the distance between the tyrosine Oη and C2 position of the
substrate, d6 corresponds to the one between the tyrosine Oη and Hη, d7
is the one between C2 position of the sugar ring and the catalytic water
oxygen, d8 is the one between Asp372 Oε2 and catalytic water hydrogen,
d9 defines the one between C2 position and O7 of the sugar ring, d10
refers to the one between C3 and H32 in substrate, and d11 accounts
for that between H32 of the sugar ring and Asp372 Oε2. The reaction
coordinates rxn1, rxn6 are also illustrated in the figure. Tyr, Glu, Asp:
Tyr653, Glu541, Asp270 in NanB; Tyr695, Glu584, Asp315 in NanC. . . 79
4.2 The final snapshots at 10ns from the explicit solvent SCC-DFTB/MM
simulations. The key residues and substrate are shown explicitly. Thealiphatic
hydrogen atoms are omitted for clarity. The water molecules within ap-
proximately 5 Å of Oε2 of Glu541 and Glu695 are shown in red, respec-
tively. (a) NanB; (b) NanC. . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.3 Potential energy contour plot for the first reaction step of NanB. Energies
energy path. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.4 Potential energy contour plot for the first reaction step of NanC. Energies
energy path. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.5 Potential energy contour plot for the 2nd step along PathA of NanB. En-
4.6 Potential energy contour plot for the 2nd step along PathB of NanB. En-
4.7 Potential energy contour plot for the 2nd step along PathC of NanB. En-
4.8 Potential energy contour plot for the 2nd step along PathA of NanC. En-
LIST OF FIGURES xxi
4.9 Potential energy contour plot for the 2nd step along PathB of NanC. En-
4.10 Potential energy contour plot for the 2nd step along PathC of NanC. En-
(blue) of NanB. This potential energy profile was based on SCC-DFTB/MM
simulations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
(blue) of NanC. This potential energy profile was based on SCC-DFTB/MM
simulations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.13 The More O’Ferral-Jencks plot for the IC formation of NanB. The axes
are the Pauling bond orders from the anomeric C to the nucleophile (n(C2sial -
See Fig. 4.2 for more information. The water molecules are shown in red
with CPK. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.14 The More O’Ferral-Jencks plot for the IC formation of NanC. The axes
are the Pauling bond orders from the anomeric C to the nucleophile (n(C2sial -
See Fig. 4.2 for more information. The water molecules are shown in red
with CPK. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.15 The More O’Ferral-Jencks plot for the PC formation in PathA of NanB.
The axes are the Pauling bond orders from the anomeric C to the water
molecule (n(C2sial -Owat ) ) and residue Tyr653(n(C2sial -OHtyr653 )). The
4.16 The More O’Ferral-Jencks plot for the PC formation in PathB of NanB.
The axes are the Pauling bond orders from the anomeric C to the O-7 atom
of the substrate (n(C2sial -O7sial ) ) and residue Tyr653 (n(C2sial -OHtyr653 )).
The IC, TS2, and PC configuration are shown in the bottom left, top left,
and top right, respectively. See Fig. 4.2 for more information. The pro-
posed catalytic water molecule is shown in Licorice and the nearby water
molecules are presented in red with CPK. However, Asp270 is protonated
by HO7 of the substrate directly. See details in Section 2.4.3 of this Chapter. 97
LIST OF FIGURES xxii
4.17 The More O’Ferral-Jencks plot for the PC formation in PathC of NanB.
The axes are the Pauling bond orders from the anomeric C to the H-
3 atom of the substrate (n(C2sial -H32sial )) and residue Tyr653 (n(C2sial -
OHtyr653 )). The IC, TS2, and PC configuration are shown in the bottom
left, top left, and top right, respectively. See Fig. 4.2 for more informa-
tion. The catalytic water molecule is shown in Licorice and the nearby
water molecules are presented in red with CPK. . . . . . . . . . . . . . . 98
4.18 The More O’Ferral-Jencks plot for the PC formation in PathA of NanC.
The axes are the Pauling bond orders from the anomeric C to the water
molecule (n(C2sial -Owat ) ) and residue Tyr695 (n(C2sial -OHtyr695 )). The
4.19 The More O’Ferral-Jencks plot for the PC formation in PathB of NanC.
The axes are the Pauling bond orders from the anomeric C to the O-7 atom
of the substrate (n(C2sial -O7sial ) ) and residue Tyr695 (n(C2sial -OHtyr695 )).
The IC, TS2, and PC configuration are shown in the bottom left, top left,
and top right, respectively. See Fig. 4.2 for more information. The pro-
posed catalytic water molecule is shown in Licorice and the nearby water
molecules are presented in red with CPK. However, Asp315 is protonated
by HO7 of the substrate (label in the figure) directly. See details in Section
2.4.3 of this Chapter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.20 The More O’Ferral-Jencks plot for the PC formation in PathC of NanC.
The axes are the Pauling bond orders from the anomeric C to the H-
3 atom of the substrate (n(C2sial -H32sial )) and residue Tyr695 (n(C2sial -
OHtyr695 )). The IC, TS2, and PC configuration are shown in the bottom
left, top left, and top right, respectively. See Fig. 4.2 for more informa-
tion. The catalytic water molecule is shown in Licorice and the nearby
water molecules are presented in red with CPK. . . . . . . . . . . . . . . 99
4.21 Potential energy contour plot for PathB of NanB. This potential energy
was obtained from single point calculation with M06-2X/6-31+G(d,p)
based on the SCC-DFTB/MM MEP. Energies are in kcal/mol. The white
dashed line illustrates the minimum potential energy path. . . . . . . . . 102
4.22 Potential energy contour plot for PathC of NanC. This potential energy
was obtained from single point calculation with M06-2X/6-31+G(d,p)
based on the SCC-DFTB/MM MEP. Energies are in kcal/mol. The white
dashed line illustrates the minimum potential energy path. . . . . . . . . 103
LIST OF FIGURES xxiii
A.1 The alternative mechanism for the second step of NanA along PathC. . . . 141
A.2 Potential energy contour plot for the second step of NanA based on the
mechanism proposed for NanC. Energies are in kcal/mol. . . . . . . . . . 142
A.3 1D potential energy plot of the subsequent reaction for PathB in NanA.
The last point (labeled in red) represents a free minimization without re-
straints on the reaction coordinates. . . . . . . . . . . . . . . . . . . . . . 142
A.4 1D potential energy plot of the subsequent reaction for PathC in NanA.
B.1 The alternative mechanism for the second step of NanB along PathC . . . 147
B.2 Potential energy contour plot for the second step of NanB based on the
mechanism which has a catalytic water involved. Energies are in kcal/mol. 148
B.3 Potential energy contour plot for the second step of NanC based on the
mechanism which has a catalytic water involved. Energies are in kcal/mol. 148
B.4 1D potential energy plot of the subsequent reaction for PathA in NanB.
B.5 1D potential energy plot of the subsequent reaction for PathB in NanB.
B.6 1D potential energy plot of the subsequent reaction for PathC in NanB.
B.7 1D potential energy plot of the subsequent reaction for PathA in NanC.
B.8 1D potential energy plot of the subsequent reaction for PathB in NanC.
B.9 1D potential energy plot of the subsequent reaction for PathC in NanC.
B.10 Potential energy profiles for PathA of NanA (black), PathB of NanB (pur-
ple) and PathC of NanC (blue). This potential energy was obtained from
single point calculation based on M06-2X/6-31+G(d,p). . . . . . . . . . . 155
List of Tables
1.1 Six types of enzyme catalysts.58,59 . . . . . . . . . . . . . . . . . . . . . 9
2.1 Summary of the starting structures, the equilibrium molecular dynamics

simulations (Eq.) and TI simulations. . . . . . . . . . . . . . . . . . . . . 33
2.2 ∆G(E-Glu(H/D)) of Glu172 in BcX based DFTB3/MM TI explicit solvent
simulations (in kcal/mol). . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.3 The pKa values for Glu172 based on DFTB3/MM TI simulations, MCCE
and PROPKA methods. For MCCE, only results from DEFAULT are
presented. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.4 The pKa values for other residues based on MCCE and PROPKA meth-
ods. For MCCE, only results from DEFAULT are presented. . . . . . . . 39
2.5 pKa values of Glu172 from DFTB3/MM, MCCE (QUICK and DEFAULT)
and PROPKA. Numbers in parentheses represent the number of values
used to calculate RMSD, 5 referring to all the cases in the table and 3
referring to the cases of WT, WT-2FXb and N35H-2FXb. . . . . . . . . . 40
2.6 pKa values of other residues from MCCE (QUICK and DEFAULT) and
PROPKA. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.7 pKa shifts based on perturbative analyses of the deprotonation Glu172
with DFTB3/MM simulations a . . . . . . . . . . . . . . . . . . . . . . . 42
2.8 Perturbative analyses of the deprotonation free energy of Glu172 with
DFTB3/MM simulations (in kcal/mol)a . . . . . . . . . . . . . . . . . . . 42
2.9 Perturbative analyses of the deprotonation free energy of Glu172 with
DFTB3/MM simulations (in kcal/mol)a . . . . . . . . . . . . . . . . . . . 43
2.10 Perturbation analyses for pKa s of Glu172 and Glu78 based on PROPKA
method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.11 The Coulombic interaction energy between Glu172 and surrounding residues
estimated by PROPKA (in kcal/mol) . . . . . . . . . . . . . . . . . . . . 44
2.12 The minimal distance between heavy atom pairs in Glu172 or Glu78 and
surrounding residues (in Å) . . . . . . . . . . . . . . . . . . . . . . . . . 44
xxiv
LIST OF TABLES xxv
3.1 Summary of the starting structures for the Reactant complex (RC) and
Intermediate complex (IC), the equilibrium molecular dynamics simula-
tions (Eq.). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.2 Reaction coordinate distances (Å) for RC, TS1 and IC in the first step of
NanA. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.3 Reaction coordinate distances (Å) for RC, TS1 and IC for the second step
along PathA, PathB and PathC, respectively. . . . . . . . . . . . . . . . . 65
3.4 Potential energies based on perturbation analyses of relevant residues with
SCC-DFTB/MM simulations. Energies are in kcal/mol. . . . . . . . . . . 77
4.1 Summary of the starting structures for the Reactant complex (RC) and
Intermediate complex (IC) of NanB and NanC, the equilibrium molecular
dynamics simulations (Eq.). . . . . . . . . . . . . . . . . . . . . . . . . 82
NanB. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
NanC. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.4 Reaction coordinate distances (Å) for IC, TS2 and PC for the second step
of NanB along PathA, PathB and PathC, respectively. . . . . . . . . . . . 101
4.5 Reaction coordinate distances (Å) for IC, TS2 and PC for the second step
of NanC along PathA, PathB and PathC, respectively. . . . . . . . . . . . 101
B.1 Reaction coordinate distances (Å) at RC, TS1, IC, TS2 and PC in PathA
of NanA, PathB of NanB and PathC of NanC. . . . . . . . . . . . . . . . 146
Chapter 1
Introduction
1.1 The carbohydrate-active enzymes

Carbohydrate structures can vary greatly, mainly as a result of the different possible com-
binations and orientations of the individual monosaccharides that make up the overall
structure. The possible structures vary to an even greater extent when taking into con-
sideration the possible stereochemical and non-carbohydrate substituent differences that
occur between different monosaccharide subunits.1 These complex carbohydrates are ex-
pressed in the living world, where they regulate functional properties and mediate bio-
logical processes and cellular interactions within or between organisms.1–5 The synthesis
and degradation of complex carbohydrates is controlled by a group of enzymes, named
Carbohydrate-Active Enzymes (CAZymes).1,5 CAZymes are a large group of structurally
related enzymes that catalyse the formation, modification, and degradation of glycosidic
bonds in carbohydrates.1 CAZymes play a key role in a variety of biological processes
closely related to human health and disease (e.g. cancer, diabetes, Alzheimer’s diseases
and acquired immune deficiency syndrome (AIDS)), and have been utilised as important
drug targets in combating pathogenesis.6,7 Additionally, CAZymes hold the potential as
efficient biocatalysts for biofuel production.8 Hence, this enzyme group is of great interest
to medical biotechnology and chemical engineering.9 Since 1991, the available informa-
tion including the enzyme’s structure, sequence and function are described in the CAZy
database (http://www.cazy.org/).1 As of 2017, the sequence-based CAZy database
has classified more than 300 families, which encompasses almost 400,000 enzymes.9
(Fig. 1.1)
1
CHAPTER 1. INTRODUCTION
Figure 1.1: Diversity of CAZymes and auxiliary enzymes, according to the CAZy database. For the detailed catalytic mechanisms, please refer to
www.cazypedia.org.10 Reproduced from André et al. Current Opinion in Chemical Biology, 17, 17-24 with permissionfrom Elsevier.11
2
CHAPTER 1. INTRODUCTION 3
OH OH
HO O O
HO OH
HO HO
OH OH
OH
(1) α-D-Glucopyranose (2) β-D-Glucopyranose

Figure 1.2: α (1) and β (2) conformations of D-glucopyranose.
The annotation of CAZymes is never a trivial task. Several state-of-the-art annotation

methods have been employed, such as Protein families database of alignments (PFAM)
hidden Markov model (HMM) construction12 and Basic Local Alignment Search Tool
(BLAST).13 In particular, the dbCAN (http://csbl.bmb.uga.edu/dbCAN/annotate.
php) method based on HMM provides much richer information and more comprehen-
sive CAZyme annotation than the Conserved Domains Database (CDD) model (https:
//www.ncbi.nlm.nih.gov/cdd) and PFAM protein domain database (http://pfam.
janelia.org/).14 However, the CAZy database is still insufficient for description of en-
zymatic activity due to its sequence-based classification.1 CAZypedia (www.cazypedia.
org),10 together with the structural information in Protein Data Bank (PDB) (http:
//www.rcsb.org), provides a more comprehensive overview of the biocatalysis prop-
erties of CAZymes.
1.1.1 Puckering of carbohydrate structures

The complexity of carbohydrate structures is evident from their diverse ring sizes, derivati-
sation and ring substituents and their stereochemistry. The changes in the orientations of
the ring substituents can strongly influence the properties of carbohydrate structures and
thus make them highly complex. As an example, Fig. 1.2 shows the structures of D-
glucopyranose that has two different stereochemistries, α (1) and β (2). Additionally,
variations in the conformational of the sugar ring are important in stabilising the struc-
tures of carbohydrates, with the orientation of exocyclic groups closely related to the
conformers of the sugar ring. As there is a myriad of carbohydrate-based structures in
nature, here we focus on discussing the conformations of pyranose (six-membered ring)
sugars.
The pyranose ring nomenclature described by Stoddart15 first and adopted by IUPAC16
is summarised in Fig. 1.3 which includes two diagrams, introducing all the accessible
conformers, together with their connectivity. The 38 canonical conformations shown in
Fig. 1.3 represent distinct ways in which the pyranose ring can be classified into chair,
half-chair, envelope, boat and skew-boat (represented by the letters C, H, E, B and S,
respectively).17
3,OB 3,OB
OS 3S OS 3S
2 1 2 1
OE 3E
B2,5 OH OH B1,4 B2,5 3H 3H B1,4

5 1 2 4
E5 E1 E2 E4
1S 4H 4C 2H 5S 1S 1H 1C 5H 5S
5 5 1 1 1 5 2 4 4 1
4E 2E 1E 5E
1,4B 4H 2H 2,5B 1,4B 1H 5H 2,5B

3 3 O O
E3 EO
1S 2S 1S 2S
3 O 3 O
B3,O B3,O
Figure 1.3: Stoddart diagrams of pyranose ring in the north and south poles. Figure
adapted from Montgomery et al.18
1.1.2 Cremer-Pople puckering coordinates

Cremer and Pople mapped the conformations of pyranose onto a spherical representation
using three polar coordinates: a radius Q, two phase angles φ and θ (Fig. 1.4).19 These
three coordinates are defined by the following equations:
r
6
1 2π
Q sin θ cos φ =
3 ∑ z j cos[ 6 2 ( j − 1)]
j=1
r
6
1 2π
Q sin θ cos θ =
3 ∑ z j cos[ 6 2 ( j − 1)]
j=1
r
6
1
Q sin θ cos θ =
3 ∑ (−1) j−1 z j (1.1)
j=1
All the possible pyranose ring conformations would fall within the ellipsoid of puck-
ering radius Q (Fig. 1.4). Stoddart’s representation15 corresponds to the projection of
Cremer-Pople’s sphere-like representation and can be computed according to their Carte-
sian coordinates.20 Cremer and Pople’s puckering coordinates offer a convenient approach
to study the conformational free energy of the pyranose ring either in vacuo,21 in aqueous
solutions22 or along reaction pathways of CAZymes.23 On the other hand, the complexity
of carbohydrate puckering also presents a challenge for molecular modelling. Deficien-
cies have been identified in various adopted models including classical force fields and
semi-empirical model22,24,25 and improvements have been introduced for better descrip-
tions of puckering.26,27
O
4C
1
O θ
5S
B1,4 1 2,5B
3S 2S
1 O
3,OB
ϕ
B3,O
OS 1S
2 3
1,4B
B2,5 1S
5
O
1C
4
O
Figure 1.4: Cremer-Pople puckering coordinates of a pyranose ring. Figure adapted

from Montgomery et al.18
1.2 Glycoside hydrolases

Two of the most notable families of CAZymes are glycoside hydrolases (GHs) and gly-
cosyltransferases (GTs) (Fig. 1.1), which are responsible for the cleavage and synthesis
of glycosidic bonds, respectively.28,29 Structure and sequence-based classifications cate-
gorise GHs into more than 135 families.1 Alternatively, based on their reaction mecha-
nisms, GHs can be separated into two distinct classes proposed by Koshland in 1953:29,30
inverting and retaining GHs that catalyse hydrolysis with inversion or retention of the
anomeric configuration, respectively. Inverting GHs (Fig. 1.5A) operate via a direct re-
placement of the leaving group by a water molecule. This mechanism requires a general
acid and a general base residue. In retaining GHs (Fig. 1.5B), one catalytic residue func-
tions both as a general acid by protonating the aglycone during the glycosylation step and
as a general base by deprotonating the attacking water during the deglycosylation step.
The other catalytic residue acts as a nucleophile during the glycosylation step to form
a glycosyl-enzyme intermediate and acts as a leaving group during the deglycosylation
step. Each reaction step for these two classes has been shown to involve an oxocarbe-
nium ion-like transition-state (TS) (Fig. 1.5).31–35 The anomeric carbon (C1) in the TS is
sp2 -hybridised and positively charged due to the cleavage of C1-O1 bond, along with the
shrinkage in C1-O5 distance and the formation of a double bond.36 The glycopyranosyl
ring exhibits a distorted conformation away from a relaxed chair conformation,20,37 which
is attributed to the planarity of C5, O5, C1 and C2 atoms imposed by the sp2-hybridised
TS. This distorted conformation is also observed in the Michaelis complex structures of
GHs.23,29,38–43 It is worth noting that though most GHs follow the classical Koshland
mechanisms, a number of completely distinct mechanisms for enzymatic cleavage of gly-
cosides have been discovered.44
1.2.1 The conformational catalytic itinerary in GHs

Conformational changes along the reaction coordinates are referred to as the catalytic
itinerary.38,45 Knowing the conformational catalytic itinerary for a specific GH can be
very helpful to design potent inhibitors, such as TS analogues, whose shapes and electro-
static properties mimic the TS of a substrate.46 TS structures can be interpolated in ac-
cordance with the Cremer-Pople puckering coordinates if the distorted conformations for
both the Michaelis complex and the covalent intermediate of the retaining GH (or product
state of inverting GH) are known. X-ray crystallography is the most used tool to infer
the conformational catalytic itinerary harnessed by GHs. For example, an X-ray study
trapped the three-dimensional structures of the Michaelis complex and covalent interme-
diate of a retaining GH26 β -mannanase, showing a distorted β -1 S5 and α-O S2 conforma-
tion, respectively.47 This data suggested that the TS may adopt a B2,5 boat conformation,
which is flanked by Michaelis 1 S5 and intermediate O S2 conformations on the Cremer-
General acid
(A)
δ-
O O O O H O O
H O
δ+ H δ+
C4 C5
O5
O -
R
O O R Oδ O
HO O1 HO R HO
C3 C2 C1
δ+ δ-
O O OH
H H δ- H
H
O O O O δ+ O OH
General base Oxocarbenium ion-like TS
General acid
(B)
δ-
O O O O
H
δ+ H δ+
glycosylation
O O - H
HO O R HO Oδ
δ + R O General base
R
δ- O O
O O O O
H
Nucleophile O O H
Oxocarbenium ion-like TS HO
O O
δ-
O O O O
H
δ+ H δ+
O deglycosylation O -
HO O R HO Oδ
H
δ+
δ-
O O O O
Oxocarbenium ion-like TS
Figure 1.5: Mechanisms of inverting (A) and retaining (B) glycoside hydrolases. Figure
adapted from Montgomery et al.18
Pople itinerary, predicting that a 1 S5 → B2,5 → O S2 itinerary occurs for β -mannanase.

This prediction was supported in later work on a large number of putative mannosidase
inhibitors bound to the β -mannosidase BtMan2A (GH2)48 with kinetic analyses.42 Offen
and co-workers reported a distorted 1 S5 conformation for the Michaelis complex for the
β -mannosidase Man2A (GH2) arguing strongly in support of the same itinerary of 1 S5 →
B2,5 → O S2 for GH2.49
Although X-ray crystallography is a very powerful tool to investigate GH itineraries,
it has its own limitations. The mutated enzymes and the non-natural substrates used in
experiments may affect the nearby active sites and result in distortion of the sugar ring
in subsite -1. In addition, the TS has a short life and is unstable, as a result it cannot be
observed by this type of direct measurements. However, computational simulations can
overcome these limitations, providing equilibrium data for TS conformation and com-
plete conformational itinerary gaps between experimentally captured stable states.50 Re-
cently, a joint experimental and computational study was carried out to probe the tran-
sition of a GH76 α-1,6-mannanase, a key enzyme in bacterial and fungal mannoprotein
metabolism.51 Experimentally, a Michaelis complex state in O S2 conformation and an
inhibitor complex in B2,5 conformation suggested a boat conformation for the TS. The
complete interpretation was achieved through Car-Parrinello MD metadynamics simula-
tions of the Michaelis complex and the inhibitor complex. The free energy landscape
of the mannose natural donor and inhibitor isofagomine (IFG) in complex with GH76
clearly demonstrated that GH76 dramatically reshapes the energetic accessible conforma-
tional space. The 4 C1 conformer is no longer the global minima, being 6 kcal/mol higher
compared to the N S5 /B2,5 conformer of IFG and > 15 kcal/mol higher compared to the
O S /B
2 2,5 conformer of the mannoside. Taken together, computational studies mapped out
the full itinerary for α-1,6-mannanase and suggested that it operates through a canonical
1S → B O
5 2,5 → S2 itinerary.
Consistent with X-ray crystallographic studies, all the computational studies have noted
that the substrates in the Michaelis complex of different GHs adopt different distorted
structures, concurring with the concept that different GHs follow unique catalytic con-
formational itineraries.36,52–54 All these examples illustrate that calculation of the free
energy landscape could be a powerful tool to capture the conformations of both the sugar
donor and TS.55
In general, one of the common intrinsic features coming out of these computational
studies of GHs is that both the nucleophile and the leaving group are conformationally
restrained so that the anomeric substitution performed by the anomeric migration towards
the nucleophile needs to break the glycosidic bond in subsite -1 through the conforma-
tional changes. It has been proposed that ring distortion provides electronic assistance
and structural changes to pre-activate the enzymatic reaction and thus facilitate the TS
formation.36,50
1.2.2 pK a shifts in GHs

As illustrated in Fig. 1.5, one common feature that arises is the dynamic change of the
protonation states of the catalytic residues, which is related with their apparent pKa val-
ues. For instance, the His Nε in serine proteases has a pKa upshift of 3.5 and Lys116
in acetoacetate decarboxylase has a pKa downshift of 4.56 Studies of pKa shifts involved
in CAZymes catalysed reactions will provide a complete interpretation of their catalytic
mechanisms, and of how their catalytic roles are controlled.
In biological macromolecules, only a finite number of chemical groups (including pro-
teins and RNAs) are able to function as catalysts in numerous biological reactions (Table
1.1). The multitudinous enzyme-catalysed reactions are due to these functional groups’
capability to act either as general acid/base catalysts, nucleophiles, or electrophiles.57 The
key to their versatile roles or functions lies in their protonation states and their apparent
pKa s, which depends on the intrinsic pKa value of the given group and their specific en-
vironment it resides within. The weak electrostatic interactions between ionisable groups
on the biomolecular surface only cause small pKa perturbations and have been frequently
reported in experimental and theoretical studies. However, the pKa s of catalytic groups
that exist in the active sites of enzymes can be perturbed significantly.56 To fully exploit
such data, it is necessary to have a good understanding of how the enzymes control the
pKa values of their key catalytic residues.
Table 1.1: Six types of enzyme catalysts.58,59
Types of enzyme catalysts Examples

Oxidoreductases Alcohol dehydrogenase; Monoamine oxidase;
Amino acid oxidoreductases
Transferases Alanine aminotransferase; Glycosyltransferases;
Acyltransferases
Hydrolases Phosphatase; Glycoside hydrolase;
Proteases
Lyases Decarboxylases; Aldehyde lyases;
Ferrochelatase
Isomerases Intramolecular oxidoreductases;
Intramolecular transferases; Intramolecular lyases
Ligases Aminoacyl-transfer RNA; Acetate-CoA;
Tyrosine-tRNA
1.3 Bacillus circulans xylanase

One of the systems where the pKa s for the key residues have been systematically char-
acterised is Bacillus circulans (BcX), which belongs to CAZy family GH-11. Xylans,
generally referred to heteropolysaccharides that are based on a backbone structure of β -

1,4-linked xylopyranosyl residues, are the important polymeric component of plant cell
walls.60 Xylanase has longstanding importance as a class of enzymes that hydrolyze xy-
lans. The results for the enzymatic processing of xylan degradation and depolymeriza-
tion have been employed in food preparation, pulp, paper and biofuels industries.61,62
BcX is a retaining GH (Fig. 1.5 B) whose mechanism involves two active site carboxyl
groups, Glu78 and Glu172, serving as a nucleophile and a general acid/base, respec-
tively (Fig. 1.6).63,64 The covalent intermediate state formed by the nucleophilic attack
of Glu78 and hydrolyzed via the oxocarbenium ion transition state can be trapped by
the inactivator 2,4-dinitrophenyl-2-deoxy-2-fluoro-β -xylobioside (DNP-2FXb).61,65 The
2FXb-BcX intermediate was proven to be a stable enzyme-intermediate complex.65 The
three-dimensional structure of BcX has been determined.63,64,66–70 It was also shown that
BcX is one of the classical model systems for understanding the structure and mechanism
of retaining GHs, and has been analysed extensively by experimental techniques, includ-
ing Nuclear Magnetic Resonance (NMR) spectroscopy and X-ray crystallography.71–75
Nevertheless, few theoretical investigations have been performed on BcX. As the main
determinants for protein pKa (i.e. desolvation effects, helix dipole interactions, charge-
dipole interactions and charge-charge interactions) of BcX are not sufficiently clear, key
mechanistic questions remains. What molecular features of active site are needed to es-
tablish the pKa values of Glu172, Glu78 and nearby residues? Generally, a significant
challenge for computational enzymology is to predict pKa s for catalytical residues in the
heterogeneous electrostatic systems based on empirical or continuum electrostatics mod-
els. Fortunately, microscopic simulations with explicit solvent models can overcome this
difficulty so that we can provide insights into the protonation states for ionisable groups.
We calculated the pKa shifts for BcX with a combination of different modeling techniques
to complement previous computational and experimental studies and provided explana-
tions for the effects of the structural perturbations caused by mutants of BcX (See details
in Chapter 2).
1.4 The sialidase superfamily

In the early 1940s, the enzyme whose substrate is the receptor for influenza virus was
first suggested to exist on the surface of the red blood cells.77 Alfred Gottschalk then first
discovered that sialidases, also known as neuraminidases (EC.3.2.1.18), from influenza
virus and Vibrio cholerae (V. cholerae) receptor-destroying enzymes could remove the
virus receptor.78 It was further established that sialidases catalyze the cleavage of sialic
acid at the termini of complex carbohydrates on glycolipids or glycoproteins and exhibit
crucial roles in many biological processes such as immune surveillance, apoptosis and
signalling.79–81 These enzymes were later classified in the GH33 family based on the se-
Glu172 Glu172 Glu172
O -O O O
OH HO OH OH HO
R R R
O O H O
HO O O O O O
HO HO
R O
HO HO H HO H
-O
-O H 2O HOR O O
O O
C C C
Glu78 Glu78 Glu78
Glycosylation Deglycosylation
Figure 1.6: The double-displacement mechanism in BcX. Figure adapted from Xiao and
Yu.76
quence similarities1 and have been identified on the animal tissue cell surfaces.82 They
are expressed in a number of microorganisms like protozoa, fungi, bacteriophage, bac-
teria and many deuterostomes.83 Among the many sialidases, the neuraminidase (NA),
as the major antigen of the influenza A virus envelope, has drawn much attention in the
past decades. The NAs remove sialic acids from the virus envelope to prevent the forma-
tion of aggregates of virus particles, thereby promoting the budding process and release
of progeny virions from the cell surface.84 They may also play a role early in infection
of epithelial cells.85 Another key virulence biochemical factor of influenza A virus is
haemagglutinin (HA), which binds to the sialic acid residues through its receptor-binding
site on target cells during the initial infection.86 The paramyxovirus sialidases, defined
as haemagglutinin-neuraminidases (HNs),87 which both binds to the receptor and cleaves
sialic acids.79 The viral sialidases have been confirmed as the key enzymes that have
drawn great therapeutics interest.88–91 Apart from viruses, bacteria also produce siali-
dases, such as Clostridium perfringens (C. perfringens), Clostridium septicum (C. sep-
ticum), Salmonella typhimurium (S. typhimurium), V. cholerae and Streptococcus pneu-
moniae (S. pneumoniae).92 These types of pathogenic bacteria have been suggested to
act as potential virulence biochemical factors involved in the carbohydrate recognition of
sialic acid residues exposed on the host cell glycoconjugates.93,94 Generally, sialidases
can remove sialic acids from host cell glycoconjugates by revealing the putative receptors
for bacteria colonization and invasion process.95 Also, the sialic acid residues released by
sialidases can serve as a energy source by utilizing the carbon skeleton.95
Based on the specific intracellular localizations, mammalian sialidases, another group
of sialidases, have been cloned and classified into four subtypes, NEU1, NEU2, NEU3
and NEU4 that are encoded by four different genes in vertebrates.96 Apart from the dif-
ferent intracellular localizations, these sialidases also have different tissue expressions
and substrate preferences.96,97 They are shown to be localized in lysosomes, plasma
Trans-sialidase Activity
(a) Tyr
Tyr
Glu
Glu
O O-
O O H O O
OH H
HO O +R HO OH
O O O
AcHN O- -R AcHN O-
HO HO HO OH
H RO
O
R H R = Gal, GalNAc etc. H
HO O-
O O
Asp
Asp
Sialidase Activity
(b) Tyr
Tyr
Glu
Glu
O O-
O
O O H O H
HO OH
HO OH O +H2O O
O
O AcHN
AcHN O- -H2O O-
HO OH
HO HO HO
H
O
H H H
HO O-
O O
Asp
Asp
Covalent Intermediate Michaelis Complex
Figure 1.7: Mechanisms showing the transfer (a) and hydrolysis (b) reactions. TcTS ex-
hibits trans-sialidase activity, transferring sialic acid to the R-containing mucins leading
to the Michaelis Complex. TrSA preferentially shows the hydrolytic activity, hydrolysing
the Covalent Intermediate to the Michaelis Complex by a water molecule.
membranes-associated sialidases and intracellular-associated sialidases, respectively.96,98,99

In addition, it has been reported that NEU3 is also observed on cell surface and endoso-
mal compartments,100,101 where it is able to interact with specific proteins involved in
intracellular trafficking, cell stress response and protein folding.102 These mammalian
sialidases play crucial roles in numerous physiological and pathological events including
apoptosis, differentiation and various human cancers.99,103,104 The lysosomal sialidase
NEU1 is a crucial vehicle for the degradations of complex N-glycans that are invloved
in various diseases including sialidosis,105 Alzheimer’s disease (AD)-like amyloidogenic
process in mice106 and cellular signalling107–113 by regulating the sialylation level of gly-
cans. The cytosolic NEU2 contributes to the myoblast differentiation during myotube
growth.97,114–118 It has been demonstrated that the plasma membrane-associated sialidase
NEU3 is key in cell adhesion, transmembrane signalling, neuronal differentiation, onco-
genic transformation, endocytosis and apoptosis.101,103,119–127 The mitochondrial siali-
dase NEU4 was supposed to be involved in apoptogenic events, degrading the ganglioside
GD3 in neuroblastoma cells.128,129 Also, NEU4 may function as apoptosis effectors by
regulating Ca2+ flux through ganglioside hydrolysis.103,130
Distinguishing from aforementioned hydrolytic sialidases, some paticular sialidases
that posses unique properties were observed, including Tyrpanosoma cruzi (T. cruzi) and
Trypanosoma rangeli (T. rangeli). T. cruzi, a protozoan parasite, is the etiologic agent
of American trypanosomiasis, also know as Chagas’ disease which was first described in
early 1900s.131 The sialidase of T.cruzi (TcTS) plays a key role in the transfer of sialic acid
residues from the host cell to the parasite’s surface glycoproteins.132–135 Thus, the proto-
zoan parasite of the genus Trypanosoma is protected by TcTS from the host’s immune sys-
tem and then mediates the initial stage of the invasion of the host cells.136,137 In general,
TcTS exhibits low hydrolytic activity and high trans-sialidase activity (Fig. 1.7) and thus
is able to produce sialylated human milk oligosaccharides and galacto-oligosaccarides.138
The sialidase of T. rangeli (TrSA) lacks trans-sialidase activity (Fig. 1.7) although it
has high sequence identity (∼ 70%) with TcTS.139,140 Interestingly, the mutants of TrSA
have been discovered to reduce hydrolytic activity or improve trans-sialidase activity,
providing an appealing alternative for enzymatic glycan sialylation.141–143 In addition,
an unusual sialidase from the North American leech, Macrobdella decora, sialidase L,
which belongs to the intramolecular (IT) trans-sialidase subfamily, was shown to hy-
drolyze only the α-2,3-linkages of N-acetylneuraminic acid (NeuAc) to galactose (Gal) in
sialoglycoconjugates, releasing 2,7-anhydro-NeuAc rather than NeuAc.144–146 The exo-
α-sialidase (EC 3.2.1.18) attacks the terminal, non-reducing sialic acid linkages from
glycoconjugates, and usually due to the steric hinderance, the internal sialic acid residues
are resistant to the activity of exo-α-sialidase, however, Arthobacter ureafaciens en-
zyme was the only sialidase shown to desialylate glycoconjugates via attacking the in-
ternal sialyl residues.146–148 Distinct from the usual exo-α-sialidase, another type of sial-
idase, endo-α-sialidase (E.C 3.2.1.129) hrdrolyzes sialic acid linkages internal to macro-
molecules,146 such as bacteriophage E endosialidase which cleaves the α2,8-linkage in
polysialic acid.149,150
Generally, we described the sialidases above according to their functional properties.
They can be also catagorized in many other ways, one of which is based on the reaction
mechanism that separates them into two distinct classes: inverting and retaining GHs
(See details in Chapter 1.2.2). Furthermore, based on the sequence homology, retaining
GHs can be divided into different families: GH33, trans-sialidase, IT trans-sialidase,
hydrolytic sialidase, bacterial, viral and eukaryotic; GH34, influenza virus NAs; GH58,
bacteriophage endosialidase; and GH83, paramyxovirus.151,152
1.4.1 Three S. pneumoniae sialidases NanA, NanB and NanC

S. pneumoniae (pneumococcus) is a Gram-positive, α-haemolytic, facultative anaero-
bic organism that resides in the nasopharynx. S. pneumoniae is largely responsible for
upper/lower-respiratory infections, meningitis, and septicaemia, and operates a number of
fairly major diseases with a considerably high mortality and incidence.153 It is the most
common cause of acute otitis media in children,154 especially following upper respira-
tory tract infections. Most S. pneumoniae are encapsulated, and these capsular polysac-
charides are shown to be pathogenic to humans. Their antigenicity and biochemical
structures were investigated, and currently more than 94 distinct S. pneumoniae cap-
sules serotypes have been described.155–158 Commonly, the nasopharyngeal carriage of
S. pneumoniae is asymptomatic but serves as the major reservoir for transmission of res-
piratory pathogens.159,160 The reported rate of asymptomatic carriage is very high, around
80% in children under 5 years old.161 Many factors that contribute to its own clearance
are likely to affect the duration and frequency of transmission of S. pneumoniae, such
as environmental factors (e.g., smoking, crowding and ethnicity), inflammatory factors
(e.g. tumour necrosis factor and interleukin-1factor), bacteria factors (e.g. proinflam-
matory bacterial and CD4 T-cell-mediated immunity factors) and age factors.160,162–166
Infections preceded by the bacterial colonization can result in various disease states, in-
cluding invasive diseases such as osteomyelitis, endocarditis, bacteraemia, pneumonia
and septicemia and non-invasive diseases such as otitis media and community-acquired
pneumonia.167 The occurrence rate of invasive pneumococcal disease is highest in adults
aged over 65 and in children 2 years of age and younger. Particularly, it has been re-
ported that individuals with certain medical condition such as hematologic maligancy and
human immunodeficiency virus (HIV) infection have higher risks.160 Around 400,000
invasive pneumococcal disease-associated hospitalizations occur in the United States of
America each year.168 Fortunately, with the introduction of polysaccharide pneumococ-
cal vaccine (e.g. PPSV23, PCV7 and PCV13), the incidence of pneumococcal disease
has been reduced, and to some extent, the pneumococcal carriage rates have also been
reduced, protecting individuals against pneumococcal infection.169 However, further re-
search on potential new vaccine candidates is still necessary due to the increase in the
rates of antibiotic-resistant pneumococci.
The comprehensive analysis of the S. pneumoniae genome sequence suggested that
numerous geomes hybridizing with deoxyribonucleic acid (DNA) can contribute to the
colonization of host tissues.171 Sialidases are one of the key virulence biochemical fac-
tors which may facilitate such pneumococcal colonization and invasion process, because
they are able to specifically remove sialic acid residues from host cell and proteins by ex-
posing the putative receptors for pneumococci.172 S. pneumoniae encodes up to three dis-
tinct sialidases, that are NanA, NanB and NanC.173 The crystal structures for the three S.
pneumoniae sialidases have been solved (PDB code: 2VVZ,174 2VW1175 and 4YW3,176
respectively.). The active sites for NanA shows a more flat and open catalytic cleft, while
both NanB and NanC present a relatively narrower and more hydrophobic binding pocket
(Fig. 1.8).173 Clinical research of pneumococcal isolates of S. pneumoniae point out that
these three sialidases are present in 100%, 96% and 51% of strains respectively.177 NanA
is the largest one in size (115kDa), comparing to the other two sialidases NanB (78kDa)
and NanC (82kDa), due to the presence of a C-terminal ”LPTEG” anchoring motif. NanA
has a sequence share of up to 24%, while NanC shares over 50% sequence identity with
NanB. Previous studies using MF1 mouse models have demonstrated that both NanA and
NanB play a crucial role in respiratory tract infection, sepsis and blood infection, and are
of immense potential in the area of drug design.153 What is more, considerable evidence
indicated that NanA has a significant overall impact on middle ear invasion, nasopharynx
and the incidence of otitis media effusion,178 and also contribute to S. pneumoniae-human
brain microvascular endothelial cells and blood-brain barrier.179 Recent 1 H NMR stud-
ies174–176,180 proposed the catalytic mechanisms for NanA, NanB and NanC. As depicted
in Fig. 1.9, the conserved catalytic residues are shared by these three sialidases. The
findings173,176 reported that these three sialidases might work together up to the ultimate
step where NanA and NanB produce Neu5Ac and 2,7-anhydro-Neu5Ac following the
functions of sialidase and intramolecular (IT) trans-sialidase, whilst NanC could carry on
a ping-pong mechanism that produce or remove Neu5Ac2en. It is intriguing that three
different sialidases have very similar active sites but operate via three distinct reaction
pathways. Considering the strengths of computational enzymology in revealing the mech-
anistic details of chemical reactions, We propose to characterise the reaction energetics of
these three systems and compare their reaction pathways and probe the determinants for
their respective catalysis (See details in Chapter 3 and 4).
Figure 1.8: The topology of S. pneumoniae NanA (a), NanB (b) and NanC (c) active
sites. The catalytic cavity in NanA is flat and open while the ones in NanB and NanC are
relatively narrow due to the two Tryptophans (Trp674 and Trp716 in NanB and NanC,
respectively.). Reproduced from http://hdl.handle.net/10023/778 by permission
of Xu.170
Tyr
Glu
O O-
O
H +
HO OH H2N
O Arg
O H2N
AcHN
O-
+
HO OH
R = Gal, GalNAc etc.
OR
O O
Tyr752
Asp
Glu647
O O H O
+
HO OH O H2N
Arg Tyr653
O H2N
AcHN O- +
HO HO Glu541
H
O
H
H O O H O
O O- +
HO OH O H2N
Arg Tyr695
O H2N
Asp372 AcHN O- +
O HO Glu584
H
H O
NanA H O O H O
H +
O O- HO OH O H2N
Arg
OH COOH O H2N
HO AcHN O- +
Asp270
O OH HO HO
AcHN H
HO HO
H NanB
O O-
NHAc
O Asp315
COOH
HO O
HO NanC
HO OH
HO
O COOH
AcHN
OH HO
Figure 1.9: The proposed mechanisms for NanA, NanB and NanC sialidase enzymes.
1.5 Computational methods
1.5.1 Molecular dynamics simulations

The molecular dynamics (MD) simulation method has been widely used to study the
behavior of molecular systems, including proteins, lipids, DNA, carbohydrates, zeolites
and crystals.181,182 It can provide mechanistic details and dynamic properties of complex
systems, and information about enzyme-catalysed reactions (e.g. structures of transition
states and intermediate states) that experiments cannot.183 MD simulations consists of
step by step, numerical solutions of Newton’s classical equations of motion. For a simple
atomic system, it can be written as
mi~r¨i = ~fi
~fi = − ∂U (1.2)
∂~ri
where ~fi is the forces acting on the atom that are derived from the potential energy
U(~rN ), where~rN = (~r1 ,~r2 ,...~rN ) define the complete set of 3N atomic coordinates.
As far as the choice of classical molecular modeling method is concerned, a variety of
biomolecular force fields are available, among which AMBER,184–187 CHARMM188–191
GROMOS192–195 and OPLS196,197 are the most popular ones in use. In our work we
primarily use the CHARMM force field whose energy expression takes the following
form:
U(~rN ) = ∑ kd (d − d0 )2 + ∑ kUB (S − S0 )2
bonds UB
2
+ ∑ kθ (θ − θ0 ) + ∑ kφ [1 + cos(nφ + δ )]
angles dihedrals
σAB 12 σAB 6 1 qA qB
+ ∑ {εAB [( ) − 2( ) ]+ } (1.3)
nonbonded pairs AB rAB rAB 4πε0 rAB
in which the the bonded terms, the Lennard-Jones-type van der Waals terms and Coulomb
interaction terms are described. The symbols d, S, θ , φ designate bond lengths, Urey-
Bradley 1,3-distance, bond angles, and torsions, respectively; d0 , S0 and θ0 are the cor-
responding equilibrium values; n and δ are the torsional multiplicity and phase angle,
respectively. The bonded force constants are kd , kUB , kθ , and kφ ; rAB is the nonbonded
distance between atoms A and B, εAB and σAB are the Lennard-Jones parameters; qA and
qB are atomic partial charges; and ε0 is the vacuum permittivity (dielectric constant).198
1.5.2 Combined quantum mechanics/molecular mechanics methods

Molecular modeling and simulation methods are playing a growing role in understanding
the mechanisms of biocatalytic reactions that are the foundation of biochemical processes.
These methods can also make contributions to discovering new drugs and designing new
catalysts.199,200 Different types of methods for modeling chemical reactions are applied
in daily practice.199,201–203 As highly efficient force-field-based methods for simulations
of biomolecules, molecular mechanics (MM) methods are capable of modeling the sys-
tems of up to millions of atoms. In MM energy functions, bond stretching, angle bending,
torsion angles, van der Waals and electrostatic interactions are described simply which
allows this method to simulate larger systems for longer timescale. On the downside,
MM methods are not able to model chemical reactions simultaneously and have difficulty
in modeling reactive intermediates and transition states. Fortunately, QM methods can
accurately describe bond making, bond breaking, charge transfer and electronic excita-
tion of small molecules. Thus a combined QM/MM approach, which was first introduced
by Warshel and Levitt in 1976, has become a promising way to model reactions not only
in biomolecular systems, but also in inorganic/organometallic and solid-state.204 With an
increasing number of publications based on QM/MM methods demonstrating excellent
results,200,201,205–209 we have reasons to believe that QM/MM techniques are highly rec-
ommended to treat complex chemical systems. An outline of the division of the QM/MM
system is shown in Fig. 1.10. The simulated system is partitioned into a quantum mechan-
ical region and molecular mechanical region, respectively. Usually the QM region is of
most interest in the system as it involves the active sites, and classical mechanics is used
to treat the rest of the enzyme, and the environment (including solvent and counterions).
To properly model the structure and dynamics of biomolecular systems, electrostatic
interactions have to be treated in a reliable and efficient way. This can be achieved via
the Ewald summation method together with periodic boundary conditions. Although this
naturally avoids the artefacts from truncating the long-range electrostatics interactions,
it results in a high computational cost which severely limits the size of the systems and
the affordable sampling time, in addition to concerns regarding the artificial periodicity
imposed.210,211 Alternatively coupled with spherical boundary conditions, two different
robust and efficient schemes have been introduced to treat long-range electrostatics inter-
actions, namely the Generalized Solvent Boundary Potential (GSBP) method212,213 and
the Solvent Macromolecule Boundary Potential (SMBP) method.214,215 The GSBP-based
QM/MM protocol enables relatively long conformational sampling with semi-empirical
methods, while the QM/MM protocol with ab initio or DFT as the QM model provides
an affordable approach for multiple reaction pathway calculations.
QM/MM Interaction
MM Region QM Region
ĤQM/MM
ĤMM ĤQM
ĤTOTAL=ĤQM+ĤMM+ĤQM/MM
Figure 1.10: A division of the system into QM and MM parts. The relatively small
purple is treated with a QM method and the rest of the system (including the blue region
and water molecules (labeled in red)) is treated with classical mechanics.
1.5.2.1 The QM model
Various combinations of QM and MM methods can be adopted for the QM/MM. The
reasonable choice of QM method is required for successful combined QM/MM simu-
lations. Generally, QM methods aim to solve the Schrödinger equation for molecular
systems, and can be divided into three main catagories based on different approxima-
tions, ab initio, density functional theory (DFT) and semi-empirical methods. Both ab
initio and DFT methods are limited by the size of the system they can deal with and
the number of conformations one can afford to simulate. In practice, ab initio and DFT
based QM/MM simulations are limited to energy minimizations of a small number of
structures. Semi-empirical methods become more popular for QM/MM MD simulations,
owing to their capability to deal with much lager systems than ab initio and DFT with
less computational requirements. The semi-empirical methods can be deduced from the
Hartree-Fock method while they are determined by multiple empirical parameters that
are more than DFT methods but significantly less than MM methods. Compared to the
empirical parameters used in the classical force field, the ones in semi-empirical meth-
ods are more sufficiently transferable and have been parametrised for nearly all quantum
effects in few simulations, however, reasonable efforts are still required to derive these
parameters.216 The similar limitations have been founded in empirical tight-binding (TB)
scheme, in which the eigenstates of the system in accordance with the atomistic basis set
representation can also be tuned for a more efficient and sophisticated treatment based
on the DFT within the Kohn-Sham (KS) approximation.217 With this idea in mind, the
density-functional tight-binding (DFTB) method has been developed over the years and
applied successfully to condensed-matter systems and biological molecules.218–222 Up to
now, according to the different orders of the expansions for the KS total energy in the lin-
ear combination of atomic orbitals (LCAO) framework, DFTB has three different models,
the first-order expansion DFTB1 (also simply called DFTB),223,224 the second-order ex-
pansion DFTB2 (self-consistent charge density functional tight-binding, SCC-DFTB)225
and the third-order term DFTB3.226–228 DFTB1 is used especially for describing the small
molecule in which the electron density can be well approximated a superposition of atom
charge density.229,230 However, this method is greatly dependent on the transferability of
parameters and the initial densities. To improve geometries, energies, forces and transfer-
ability, SCC-DFTB method has been developed by introducing the density fluctuations at
the level of the Mulliken population analysis.225 Previous studies based on SCC-DFTB
method have provided reliable geometries and reaction energies.231–234 The SCC-DFTB
total energy is given by
E SCC−DFT B = E rep + E HO + E γ
1 rep 0 1
= ∑ Vab + ∑ ∑ ∑ ni cµi cνi Hµν + ∑ ∆qa ∆qb γab (1.4)
2 ab iab µ∈a ν∈b 2 ab
where cµi , the molecular orbital coefficients, issuing in KS equations

∑ cνi Hµν − εi Sµν = 0
ν
with ν ∈ b and ∀a, µ ∈ a, (1.5)
The first term in the first line of Eq. 1.4 is the repulsive potential. The second one
is the energy contribution which only depends on the reference density that leads to this
equation can be solved self-consistently and thereby contribute to huge computational
savings. The third term of Eq. 1.4 is the second-order expansion energy components.
ni of Eq. 1.4 and and εi of Eq. 1.5 are the occupation number and eigenvalue for the
ith molecular orbital, respectively. Hµν0 are the Hamilton matrix elements in the atomic
orbital representation. ∆qa is equal to qa -q0 , representing the charge fluctuations at the
atom a. qa refers to Mulliken population stemming from the sum over the orbitals, and q0
is the valence electron number of atom a. The function γab describes the atomic chemical
hardness properly. Hµν and Sµν of Eq. 1.5 are Hamiltonian and overlap matrix elements,
respectively.
Despite the advantages SCC-DFTB approach has shown, it still has some shortcomings,
such as, overestimate heats of formation,235,236 and failure to give vibrational frequencies
in some cases.231,236–240 In recent years, several researchers have adapted to the third-
order terms of SCC-DFTB, that is DFTB3 model.226–228,241,242 The DFTB3 total energy
is given by
1 rep 0 1
E DFT B3 = ∑ Vab + ∑ ∑ ∑ ni cµi cνi Hµν h
+ ∑ ∆qa ∆qb γab
2 ab iab µ∈a ν∈b 2 ab
1
+ ∑ (∆qa )2 ∆qb Γab (1.6)
3 ab
where the function γabh is the modification of γ , improving the hydrogen bonding
ab
significantly. The fourth term of Eq. 1.6 is the third-order expansion of DFT total energy
around a given reference density and is given by
1
EΓ = (∆qa )2 Γab (1.7)
3∑ab

0 1 1
Hµν = Hµν + Sµν ∑ ∆qc (γac + γbc ) + (∆qa Γac + ∆qb Γbc )
c 2 3

∆qc
+ (Γca + Γcb ) (1.8)
6
DFTB3 has been shown to substantially improve the description of charged systems,
especially regarding hydrogen-binding energies and proton affinities.227 These improve-
ments make DFTB3 particularly applicable to biomolecular systems. The GSBP DFTB/
MM scheme has been shown to be suited for the description of chemical reactions in
macromolecular systems.243–245 The application of DFTB/MM (DFTB3/MM and SCC-
DFTB/MM) will be covered in more details in Chapter 2 and 3.
1.5.3 pK a and pK a shift calculations in enzymes

According to the Lewis theory, an acid-base reaction is any reaction in which a Lewis acid
accepts a pair of electrons and a Lewis base donates a pair of electrons. The more general
Brønsted-Lowry theory describes the acid-base reaction in terms of proton (H + ) trans-
fer between acidic species and basic species. Using the Brønsted-Lowry definition, the
balanced equation of acid dissociation in solution (soln) is associated with the following
equation:
∆G
AH(soln)
A− +
(soln) + H(soln) (1.9)
where the acid AH dissociates completely to release its conjugates base A− and a proton
H + in soln. ∆G is the dissociation free energies. The definition of acidity constant Ka can
be written symbolically as
[A− ][H + ]
Ka =
[AH]

∆G
= exp − (1.10)
kB T
Then pKa is defined in the form of a negative logarithm of Ka
pKa = − log10 Ka
∆G
=
kB T ln 10
1
≈ ∆G (1.11)
2.303kB T
where kB is the Boltzmann constant and T is the temperature.

There are different models proposed in the literatures to calculate pKa or pKa shifts
in biological systems. To gain a comprehensive picture, a combination of different tech-
niques have been utilised in this thesis. The three different methods are briefly described
below.
1.5.3.1 Thermodynamic integration (TI) based on explicit solvent QM/MM simu-

lations
Based on combined QM/MM simulations,246,247 pKa shifts for a specific residue of inter-
est in a protein environment can be calculate via a thermodynamic cycle according to the
following Eq.
∆pKa = pKa (E-A) − pKa (model-A)

1
= ∆∆G (1.12)
2.303kB T
with
∆∆G = ∆G(E-A(H/D)) − ∆G(model-A(H/D)) (1.13)
where ∆G(E-A(H/D) stands for the free energy gap between protonated and deprotonated
state for a specific residue in protein, whilst ∆G(model-A(H/D) stands for the free energy
gap between protonated and deprotonated state for a specific model compound in aqueous
solution. Both terms can be obtained through a combined QM/MM TI simulations with a
so-called dual topology-single-coordinate (DTSC) scheme.246 After running simulations
for each λ frame (λ serving as a reaction coordinate enforcing the deprotonation process
going from AH to AD), the free energy difference can be obtained by integrating the free
energy derivatives from the protonated state (λ =0) to the deprotonated state (λ =1). pKa
can be calculated by summing up ∆pKa and the expermental value for pKa (model-A).
The accuracy of such QM/MM based pKa calculation depends on the accuracy of QM/
MM model and sufficient conformational sampling. The former can gauged through com-
parison of energetic property to high level methods. The latter requires careful monitoring
of the simulation convergence. Often enhanced sampling methods are applied. Due to the
high computational cost, no systematic benchmarking on pKa shifts with DFTB3 calcula-
tions have been reported in the literature.
1.5.3.2 The multiconformation continuum electrostatics (MCCE)
MCCE (multi-conformation continuum electrostatics)248 is a biophysics simulation pro-

gram combining continuum electrostatics (CE) and molecular mechanics (MM). Different
dielectric constants of water and protein are defined in Poisson-Boltzmann (PB) equations
of CE. The Boltzmann distribution of side chain ionisation states and conformers is de-
fined as a function of pH. With MCCE, an explicit heterogeneous dielectric response
which provides coupled ionisation and position changes accurately, is introduced using
the Monte Carlo sampling of coupling conformations and ionisation. This is a significant
improvement over the traditional pK a calculations based on a single static structure. A
high dielectric constant of ∼80 (the experimental data for water), a low protein dielectric
constant (ε prot, a suggested value of 4 for globular proteins) and the explicit side chain
rearrangements are used to describe the dielectric response of the system. The energy for
a given protein microstate (∆Gx ) is given by
M
∆Gx =

∑ δx (i) 2.3mi kB T pH − pKsol,i
i=1
h io
+ ∆∆Gdesolv,i + ∆GCE
bkbn,i + ∆G LJ
bkbn,i + ∆Gtorsion,i + ∆∆G SAS,i
M
+ ∑ δx ( j)[∆GCE LJ
i j + ∆Gi j ] (1.14)
j=i+1
where M represents the total number of conformers. δx (i)=1 represents that conformer
i exists in the given microstate whereas i does not exist in microstate if δx (i)=0. The
energy of ionised residue in protein at a given pH is described by the first term in Eq.
1.14 with mi being the charge and pKsol,i the reference pKa value of a specific residue i
in solution. The second term in Eq. 1.14 describes the self-energies of conformers which
includes the stabilization of the neutral form by the desolvation penalty (∆∆Gdesolv,i ),
the electrostatic (CE) and non-electrostatic interaction (LJ) with the backbone dipoles
(∆GCE LJ
bkbn,i and ∆Gbkbn,i ), the torsional energy (∆Gtorsion,i ) and the implicit LJ interaction
with the solvent (∆∆GSAS,i ), and last term is used to provide non-bonded interactions
between different residues.
It has been shown that this physics-based method to pKa predictions of ionizable groups
has an accuracy of 75% with an overall RMSD of 0.90 of the error being <1 pH unit.248 A
more extensive study using a large pKa database of 100 proteins shows that the predictive
capabilities of MCCE have an accuracy of 53% within 1 pKa unit for surface residues.249
As a popular empirical pKa prediction method, however, MCCE has its own limitations
due to the implicit solvent model it used that simplified the large effect on pKa from sol-
vation.250 Moreover, MCCE only takes into account the effects of side chain rotations,
resulting in a incomplete sampling of conformations accessible to the true dynamic char-
acteristics of protein structures.249,250
1.5.3.3 PROPKA method
PROPKA251 is an empirical method used to compute the pKa s for titratable groups in
proteins and substrates. The desolvation effects and intra-protein interactions, as well
as covalently and non-covalent intra-ligand interactions are taken into consideration in
PROPKA. pKa values are modelled as follows:
pKa = pKawater + ∆pKawater→protein (1.15)
where pKawater is the pKa for the titratable group in water, and ∆pKawater→protein represents
the contribution of the protein environment to the pKa value which is defined as:
∆pKawater→protein = ∆pKadesolv + ∆pKaRE + ∆pKaHB + ∆pKaQQ (1.16)
where ∆pKadesolv is the contribution due to the effects of desolvation, ∆pKaRE describes
the contribution from unfavourable electrostatic reorganisation energies, ∆pKaHB defines
the contribution due to the hydrogen bonding interactions, and ∆pKaQQ is the contribution
from the Coulombic interaction. Usually generic parameters are used to calculate the two
terms ∆pKadesolv and ∆pKaRE and specific parameters are used to treat the remaining two
terms.
This empirical based method to pKa shifts calculations of titratable groups provides a
reasonable match to experiment results, showing an accuracy of 85% to a level of <1
pKa unit.249 However, akin to MCCE, PROPKA neglects the true dynamic nature of the
protein due to the implicit solvent model it used and the assumption of the static protein
structure.250 Considering the strengths and limitations of the three above pKa prediction
methods, we examined the performance of them in BcX (See details in Chapter 2).
1.5.4 Reaction barriers and energies calculations based on QM/MM

simulations
To fully understand an enzymatic reaction, insights on the energetic and structural infor-
mation along the entire reaction pathways are required. Among the different simulation
techniques, two of the most popular methods to model the reaction pathway of an en-
zymatic reaction and to obtain the properties of TSs are the minimum energy pathway
(MEP) and potential of mean force (PMF) calculations.
MEP on the potential energy surface (PES) can be obtained through adiabatic mapping
by minimizing the energy of the system at intervals along predefined reaction coordi-
nates that connect the reactant state and the product state.252 Additional calculations are
required to verify the predicted TS structures are indeed saddle point structures. A TS
structure is represented by the structure highest in energy on this MEP. It is crucial that
MEPs are evaluated with multiple enzyme-substrate conformations as reaction barriers,
reaction energies and TS structures can vary significantly based on the starting struc-
ture.253–255 The calculated reaction barrier can be directly compared to an (apparent) ac-
tivation energy (∆G0act ) obtained from the experimental apparent first-order reaction rate
(k(T)) constant using TS theory (See Eq. 1.17; kBhT is a frequency factor (≈6 ps−1 at
300K), C0 is the standard state concentration, n is the order of the reaction, R is the gas
constant and T is the absolute temperature).256
kB T 0 1−n −∆G0act (T )
k(T ) = (C ) exp[ ] (1.17)
h RT
As another method used to calculate the free energy differences, PMF can be very
useful for probing chemical reactions. In contrast to MEP, PMF relies on MD at a finite
temperature, properly taking into account thermal fluctuations and flexibilities. It can be
determined as

∗ hρ (ξ )i
W (ξ ) = W (ξ ) − kB T ln (1.18)
hρ (ξ ∗ )i
where ξ is the reaction coordinate, ξ ∗ and are arbitrary constants. hρ(ξ )irepresents the
average distribution along the reaction coordinate ξ , is defined as follows
h i −U (~R)
d ~Rδ ξ 0 ~R − ξ e kB T
R
hρ (ξ )i = −U (~R)
(1.19)
d ~Re
R
kB T
where U(~R) is the function of coordinates ~R for the total system energy,257 and ξ 0 [~R] lies
on the degrees of freedom. In practice, it is difficult to obtain an accurate PMF along a
reaction coordinate from unbiased MD simulations due to the presence of low probability
region (i.e., high energy region is usually sampled less frequently than the low energy
region). Methods based on MD simulations such as umbrella sampling258 are particularly
useful to overcome this problem. We would like to point out that PMF can be also ob-
tained with several other methods, such as local elevation/metadynamics259–262 in which
a historical biasing potential is constructed as a sum of Gaussian hills centered along
the trajectory on a set of collective variables or reaction coordinates, string method263,264
which constrained simulated the equilibrium distribution of the Langevin equation based
system in the hyperplanes parametrized by a discretized string, and transition path sam-
pling265–267 which samples the path ensemble without requirements of predefinition of
reaction coordinates. In this thesis, we focus on the umbrella sampling technique, with
which PMF can be determined by performing various window simulations with a biased
potential centered at a given reaction coordinate to increase the sampling. The weighted
histogram analysis method (WHAM)268 can then be applied to obtain the unbiased PMF.
1.6 Overview of the thesis

The remaining Chapters in this thesis are organised as follows:
In Chapter 2, we present a comprehensive study of pKa shifts in BcX. We discuss
the strengths and limitations of QM/MM free energy calculations, MCCE and PROPKA
methods for calculating pKa shifts of the general acid/base Glu172 and other key residues
in BcX. Additionally, we rationalise the molecular basis for pKa shifts in BcX.
In Chapter 3 and 4, we describe the energy profiles for NanA, NanB and NanC along
three reaction pathways based on comparative MEP QM/MM simulations. According to
the MEP results, we then analyse the catalytic mechanisms for NanA, NanB and NanC
from the reactant state to the product state along the reaction pathways. Additional PMF
simulations were carried out NanA as well and the free energy profiles are very similar to
MEP.
In Chapter 5, we review the contributions of this thesis, and list possible directions of
future research.
Chapter 2
Rationalizing pK a shifts in Bacillus

circulans xylanase
2.1 Summary
Bacillus circulans xylanase (BcX), a family 11 glycoside hydrolase, catalyses the hy-
drolysis of xylose polymers with a net retention of stereochemistry. Glu172 in BcX is
believed to act as a general acid by protonating the aglycone during glycosylation, and
then as a general base to facilitate the deglycosylation step. The key to the dual role
of this general acid/base lies in its protonation states, which depend on its intrinsic pK a
value and the specific environment which it resides within. To fully understand the de-
tailed molecular features in BcX and to establish the dual role of Glu172, we present a
combined study based on both atomistic simulations and empirical models to calculate
pK a shifts for the general acid/base Glu172 in BcX at different functional states. Its pK a
values and those of nearby residues, obtained based on QM/MM free energy calcula-
tions, MCCE and PROPKA, show a good agreement with available experimental data.
Additionally, our study provides additional insights into the effects of structural and elec-
trostatic perturbations caused by mutations and chemical modifications, suggesting that
the local solvation environment and mutagenesis of the residues adjacent to Glu172 es-
tablish its dual role during hydrolysis. The strengths and limitations of various methods
for calculating pK a s and pK a shifts have also been discussed.
2.2 Introduction
BcX is a 20 kDa family GH11 retaining endo-β -(1,4)-xylanase1 that has been exten-
sively studied by using various experimental methods, including mutagenesis studies,
NMR spectroscopy and X-ray crystallography.64,71,74,75,269 Particularly, the pKa values
of the key residues in the active site at different functional states have been systematically
28
CHAPTER 2. PKA SHIFTS IN BCX 29
studied with NMR titration experiments.64,71 When wild-type BcX is optimally active at
pH=5.7, Glu78 is deprotonated to function as a nucleophile while Glu172 is protonated to
act as a general-acid catalyst during glycosylation, and then as a general-base to facilitate
the deglycosylation step (Fig. 2.1). The pKa values for Glu78 and Glu172 in the Michaelis
complex have been shown to be 4.6 and 6.7, respectively, via13 C-NMR pH titrations.64
Contrastingly in a trapped covalent intermediate, the pKa value of Glu172 drops to 4.2
from 6.7 with Glu78 covalently bound to an unnatural substrate 2’,4’-dinitrophenyl 2-
deoxy-2-fluoro-β -xylobioside65 (PDB code: 2FXb), which subsequently enables Glu172
to function as a general base during the deglycosylation reaction. It has been implicated
that such ”pKa cycling” of the carboxylic acids that can act as a general acid/base is prob-
ably common among retaining GHs.64,270
pKa = 6.7 Glu172 pKa = 4.2 Glu172 pKa = 6.7 Glu172
O -O O O
OH HO OH OH HO
R R R
O O H O
HO O O O O O
HO HO
R O
HO HO H HO H
-O H2O HOR -O
O O O O
C C C
Glu78 Glu78 Glu78

pKa = 4.6
Glycosylation Deglycosylation
Figure 2.1: The double-displacement mechanism in BcX and the measured pKa s for
Glu172 and Glu78.
Recent studies examined a variety of strategies that include altering the global charge
through random succinylation (Fig. 2.2) and introducing changeable charged or polar
residues at the active site to perturb pKa values for the key residues in BcX.71 pKa values
for the specific residues include Glu172 of WT, WT-2FXb, E172H, E172H-2FXb, N35H,
N35H-2FXb and N35E of BcX were measured. It has been found that mutations at the
close proximity of the active site have a much larger effect on the pKa values of the key
catalytic residues. However, it is intriguing that altering the global charge by up to -14e
through succinylation of surface Tyrosine and Lysine residues has a very minor effect on
their pKa s.71
On the computational side, a hybrid QM/MM method was successfully used to analyse
the transition states for the WT BcX and its Y69F mutant along the reaction pathways,
with aims to understand the catalytic role of Tyr69.271 It was found that Tyr69 plays a key
role in stablishing the boat conformation in the Michaelis complex. A similar approach
has been used to analyse substrate conformation and mechanistic behaviours of WT BcX
protein NH3 O
O
O
protein NH
O
Figure 2.2: Changing the global charge of BcX by succinylation in which the succinic
anhydride react with amines of protein does not significantly perturb its pH-dependent
activity.
and its Y69F mutant.272 The pH dependence of activity and stability of BcX have been
systematically investigated with the empirical PROPKA approach and it has been pro-
posed that by introducing ionizable residues in surface with pKa values significantly lower
than standard values will improve its thermostability.273 A novel computational approach
has been developed to directly estimate the reaction barriers in BcX and a total of 342 sin-
gle mutants and 111 double mutants with promising catalytic activity have been in silico
designed.274
As acidic xylanases are important in food and animal feed industries, efforts have been
devoted to generate BcX mutants with a pHopt shifted towards the acidic side.73 For in-
stance, the presence of an aspartic acid residue at position 35 (Asp35) has been proved to
lower the pH optima of BcX from 5.7 (WT) to 4.6 (N35D) based on NMR titration study
due to a strong hydrogen bond between Asp35 and Glu172.275 Kim et al. performed non-
conservative mutations of BcX and found that some of the non-catalytic residues play a
significant role in perturbing pKa s of the catalytic residues and shifting the pH optima
of BcX.276 A BcX mutant V37R has shown to have a pH optima downshift by -1.5 pKa
units due to the increased distance between the substrate oxygen and Glu172, which is
attributed to the formation of a strong salt bridge between Arg37 and Glu172.277 A signif-
icant challenge for computational enzymology is to reproduce pKa s for catalytic residues
in the highly heterogeneous electrostatic systems. Considering the available experimental
data of pKa s for the key residues at different functional states and well studied chemistry,
BcX is a very attractive system for benchmarking computational pKa shift calculations. In
this work, three different computational methods have been applied to calculate the pKa
shifts for BcX, including WT, WT-2FXb, N35H, N35H-2FXb, N35D and N35D-2FXb.
The goals of this study are two-fold. First, we would like to provide a complementary un-
derstanding of the molecular basis for pKa shifts of the key residues in BcX, particularly
examining and comparing the effects from the changes in the global charges and those
in the local electrostatic environment by site-directed mutagenesis, which has not been
examined with computational methods in the previous studies. Second, we will bench-
mark various models for calculating pKa s or pKa shifts in GHs. It has been shown that
the highly efficient empirical method PROPKA251 has some difficulties in reproducing
the pKa upshifts upon binding of a negatively charged substrate in the active site of GH33
and GH34.270 Complementary to the previous study, it would be interesting to examine
its performance in a case with the non-charged substrate in BcX (GH11).
2.3 Computational details
2.3.1 Simulation setup

The X-ray structures of WT-2FXb (PDB code: 1BVV,278 resolution 1.80 Å), N35H (PDB
code 3VZL,71 resolution 2.00 Å), N35H-2FXb (PDB code: 3VZO,71 resolution 1.73 Å),
N35D (PDB code: 1C5H,275 resolution: 1.55 Å), and N35D-2FXb (PDB code: 1C5I,275
resolution: 1.80 Å) were used as starting structures for the simulations. All QM/MM
simulations were carried out with CHARMM (version c38a2).279 CHARMM force field
c36188 was used to describe the protein and sugars. All these six proteins and an isolate
model Glu172 were solvated with TIP3P water models.280 The General Solvent Boundary
Potential (GSBP)213 setup was carried out in conjunction with explicit water molecules in
the inner sphere for the QM/MM simulation that includes: (i) a spherical-model of 20 Å
radius centred on the group of interest was treated as the inner region and the remaining
protein is treated as the outer region; (ii) a spherical region of radius 18 Å was defined
as reaction region where Newtonian dynamics are solved; (iii) the region between 18 Å
and 20 Å was treated as buffer region where Langevin dynamics are solved. A non-polar
cavity potential was adopted to confine the inner region water molecules to prevent them
leaving this region. The dielectric constants in the protein and solvent were assumed to be
1.0 and 80.0, respectively. The simulated temperature was set up to 298.15K. The bonds
involving hydrogen were constrained via the SHAKE algorithm.281 To treat the bonded
interactions at the QM/MM boundary, the bonds between the QM and MM regions were
cut by a link atom with the divided frontier charge model (DIV).282 The boundary was
placed between Cβ and Cγ in residue Glu172. Such a treatment has been shown to pro-
vide a balanced description of proton affinities for molecules of interest.270 Full QM/MM
minimizations were performed prior to explicit solvent MD simulations. The explicit
solvent DFTB3/MM simulations are summarized in Table 2.1. The snapshots from the
DFTB3/MM equilibrium simulations (Eq.) were taken as the initial structures for explicit
solvent TI simulations. The DFTB3/MM simulations with the GSBP model are shown in
Fig. 2.3, in which we take WT-BcX (PDB code: 1BVV) as an example to illustrate the
details of simulation setup.

CHAPTER 2. PKA SHIFTS IN BCX
Table 2.1: Summary of the starting structures, the equilibrium molecular dynamics simulations (Eq.) and TI simulations.
System PDB Resolution Substrate Number of Eq.(in ns) TI per λ (in ns)
ID (Å) QM atomsd QM/MM QM/MM
WT a 1BVV1 1.80 - 8 3 4
WT-2FXb 1BVV1 1.80 XYP , DFXc
b 8 3 4
N35H 3VZL2 2.00 - 8 3 4
N35H-2FXb 3VZO2 1.73 XYP , DFXc
b 8 3 4
N35HD 1C5H3 1.55 - 8 3 4
N35D-2FXb 1C5I3 1.80 XYPb , DFXc 8 3 >4
aThe substrate was deleted in the apo state of WT-BcX simulations; b XYP: β -d-xylopyranose; c DFX: 1,2-deoxy-2-fluoro-xylopyranose; d The DFTB3/MM partition
boundary (the link atom) for all the simulations was placed between Cβ and Cγ in residue Glu172. The DIV scheme was adopted as it has been shown to provide the best
representation of the system..270 The F atom at position 2 of the substrate 2FXb has been changed to OH.
33
QM region
Glu172
Link atom
Asn35
Substrate
Glu78
Reaction region
Buffer region
Inner region
Continuum electrostatic
Figure 2.3: DFTB3/MM with the General Solvent Boundary Potential (GSBP) model
for WT-2FXb. A spherical region of 20 Å radius centered on the C atom of Glu172 was
described as the inner region (shown in red) and the remaining part was outer region
represented with the continuum electrostatics; Reaction region: a spherical model of 18
Å centered on the Cα atom of Glu172, where Newtonian’s equations of motion were
solved (labeled in green); Buffer region: the region between reaction and inner region,
where Langevin dynamics was solved (labeled in blue); QM region: the link atom (drawn
in pink) was placed between Cβ and Cγ for Glu172 to divide the bond between QM and
MM region.
2.3.1.1 Comparison of three different methods
We utilised the three different methods to calculate pKa s and pKa shifts, namely TI based
on explicit solvent QM/MM simulations, MCCE248 and PROPKA..251 The three differ-
ent methods have their own strengths and limitations. MCCE goes beyond the traditional
continuum electrostatics (CE) based on a single static structure by taking into account
rotamer states.248 However, it has been debated in the literature what is the proper dielec-
tric constant for proteins.283,284 PROPKA is highly efficient and well parameterised for
both surface and buried residues.251 However, it has shown some limitations in dealing
with the interactions between the ionisable groups and charged substrates.270 QM/MM TI
based methods, though providing arguably the most faithful representation of the solva-
tion environment and thermal-fluctuations. However, these methods depend critically on
the accuracy of the QM method285 and sufficient conformation sampling.286,287 Thus, a
comparative study with the three different methods was carried out to calculate pKa shifts
of key catalytic residues in BcX.
2.4 Results and discussions
2.4.1 Equilibrated structures from QM/MM simulations

The equilibrated structures from DFTB3/MM simulations are shown in Fig. 2.4. For BcX
WT (Fig. 2.4 a & b), Glu172, close to Asn35, is shown to be protonated and deprotonated
in apo and intermediate state, respectively, and Glu78 is deprotonated in the apo state
but covalently linked with 2FXb in the intermediate state. For N35H-BcX and N35D-
BcX, His35 and Asp35 are positively and negatively charged, respectively, with Glu172
being deprotonated (Fig. 2.4 c) or protonated (Fig. 2.4 d). Glu78 is protonated in the
apo state for both cases (Fig. 2.4 c & e). For the key residues and substrates, the heavy
atom positional root-mean-square-deviations (RMSD) from the X-ray crystallographic
structures were approximately 1.0 Å over the simulation time.
2.4.2 pKa calculations with QM/MM TI, PROPKA and MCCE

pKa calculations with explicit solvent simulations of WT-BcX (WT, WT-2FXb) and mu-
tants (N35H, N35H-2FXb, N35D and N35D-2FXb) were carried out using the DFTB3/MM
method. The snapshots from equilibrium simulations were used as the starting struc-
tures for explicit solvent TI simulations. According to the thermodynamic cycle (Eq.
1.12), the pKa shifts are defined as ∆pKa =1/2.303kT ∆∆G, where∆∆G=∆G(E-A(H/D))-
∆G(model-A(H/D). ∆G(E-A(H/D)) and ∆G(model-A(H/D) represent the free energy dif-
ference between the protonated and the deprotonated states of an isolated Glu172 in pro-
tein and aqueous solution, respectively. These two terms were obtained by integrating the
free energy derivatives from λ =0.0 to λ =1.0 (11 evenly spaced windows are included).
∆G(model-A(H/D) was estimated to be 123.4 kcal/mol based on DFTB3/MM simula-
tions. The calculated numbers are listed in Table 2.2. The free energy derivatives for
Glu172 at different time intervals for these 6 cases are shown in Fig. 2.5 to monitor the
convergences. Then the pKa values for Glu172 were obtained based on the Eq. 1.12.
Here, QM/MM simulations are only performed on Glu172 due to the much larger com-
putational cost required for QM/MM TI compared to the other methods and that we are
particularly interested in its dual catalytic role. The continuum electrostatics protocol
MCCE (at both QUICK and DEFAULT levels, whose major difference is the number of
rotamers being 1 or 6 per rotatable bonds)248 and the empirical method PROPKA251 were
performed to calculate pKa s for Glu172 and other key catalytic residues in all six cases.
The results combined with the experimental data71,275,288 are shown in Table 2.3 and 2.4.
(a) (b)
Glu172
Glu172
Asn35
Asn35
Glu78 Glu78
Substrate
(c) (d)
Glu172 Glu172
His35
His35
Glu78
Glu78
Substrate
(e) (f)
Glu172
Glu172
Asp35 Asp35
Glu78 Glu78
Substrate
Figure 2.4: The final snapshots from the explicit solvent DFTB3/MM simulations. The
physiologically important protonation states for the key residues (residue 35, 78 and 172)
are shown. The key residues and substrate are shown explicitly. The water molecules
within approximately 4 Å of OE2 in Glu172 are shown in red. (a) WT; (b) WT/2FXb;
(c) N35H; (d) N35H/2FXb; (e) N35D; (f) N35D/2FXb.
Table 2.2: ∆G(E-Glu(H/D)) of Glu172 in BcX based DFTB3/MM TI explicit solvent

simulations (in kcal/mol).
Ref.a WT WT-2FXb N35H N35H-2FXb N35D N35D-2FXb

∆G(E-Glu(H/D)) 123.4 128.1 123.8 118.6 121.6 133.7 131.2
aRef. refers to the free energy difference between the protonated and the deprotonated state of an isolate
Glu172 in aqueous solution.
300
(a) 0-500ps 3000-3500ps (b)
500-1000ps 3500-4000ps
1000-1500ps
200 1500-2000ps
2000-2500ps
2500-3000ps
100
300
0
(c) (d)
200
∂Η/∂λ
100
0
300
(e) (f)
200
100
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
λ λ
Figure 2.5: Thermodynamic integrations of the free energy derivatives from λ =0.0 to
λ =1.0 of Glu172 in BcX (a WT, b WT-2FXb, c N35H, d N35H-2FXb, e N35D and f
N35D-2FXb) from the DFTB3/MM simulations. The error bars for each are presented.
2.4.2.1 Accuracy of three different methods
The RMSD with respect to the experimental data from DFTB3/MM TI, MCCE (QUICK
and DEFAULT) and PROPKA were obtained to compare the three different methods (Ta-
ble 2.5 and Table 2.6). The error analyses were performed separately for Glu172 and
other key residues. For Glu172, the RMSD calculated for DFTB3/MM TI is 1.8, which is
larger than the values of 1.1 with QUICK and DEFAULT MCCE, and 1.7 with PROPKA.
This is due to the larger errors for the apo state in N35H and N35D-BcX (-2.7 and 2.9,
respectively, Table 2.5). The underestimated and overestimated values could be rational-
ized by the overestimated QM/MM interactions due to the lack of explicit polarisation
in the MM potential. The favourable interactions between the positively charged His35
and deprotonated Glu172 in N35H was overestimated and thus the downshift from WT to
N35H was overestimated. Similarly, the unfavourable interactions between the negatively
charged Asp35 and deprotonated Glu172 was overestimated thus the upshift from WT to
N35D was overestimated. Similar observations have been noted in previous studies.289,290
The results for N35H and N35D are consistent with the previous studies in which the de-
creased pKa value of Glu172 was observed due to the attractive electrostatic interaction
between Arg37 and Glu172 in V37R-BcX277 and the elevated pKa value of Glu172 was
Table 2.3: The pKa values for Glu172 based on DFTB3/MM TI simulations, MCCE and
PROPKA methods. For MCCE, only results from DEFAULT are presented.
Protein Exp. DFTB3/MM MCCE PROPKA

WT 6.7a 7.3 4.8 7.5
(pHopt : 5.6)
WT-2FXb 4.2a 4.3 4.9 6.8
N35H 3.4a 0.7 2.6 4.2
(pHopt : 4.4)
N35H-2FXb 2.8a 2.8 3.0 4.4
N35D 8.4b 11.3 7.1 10.3
(pHopt : 4.6)
N35D-2FXb >9b,c 9.6 8.7 5.7
a Data from Ludwiczek, M. L.,;71 b Data from Joshi, M. D.,;275 c The exact experimental pKa is not
available due to Asp35 and Glu172 being strongly coupled in N35D-2FXb.71
found attributed to the substitution of Asn35 for Asp35 in N35D-BcX.71,275 For Glu172
in other cases, DFTB3/MM provided satisfactory predictions with a RMSD of 0.4 (Table
2.5). Overall, MCCE DEFAULT provided the best prediction for Glu172 with a RMSD
of 1.1. However, MCCE DEFAULT failed to capture the pKa downshifts for Glu172 upon
glycosylation in WT (WT->WT-2FXb) and N35H (N35H->N35H-2FXb). Encourag-
ingly, all three methods correctly assigned proper protonation states for Glu172 at the apo
state and the intermediate state at pHopt .
For other key residues (Table 2.4), MCCE and PROPKA provided a quantitatively con-
sistent picture except for Asp35 in N35D-2FXb. The RMSD for the MCCE DEFAULT
value of 0.6 is lower than that of 0.9 for QUICK MCCE and 1.2 for PROPKA, respec-
tively (Table 2.6). One of the noticable deviations for PROPKA is the overestimated pKa
for Asp35 in N35D-2FXb (together with the underestimated pKa value for Glu172 in
N35D-2FXb (Table 2.3)). This observation might indicate the challenging issue in this
case is to accurately describe the tight coupling between Asp35 and Glu172, as seen in
DFTB3/MM TI simulations.
The results obtained by MCCE DEFAULT are better than those by MCCE QUICK due
to a better representation of the flexibility of the system by taking into account the ro-
tamer state. This observation demonstrates the importance of conformational sampling
in pK a calculations.248,291 In the following discussion, we will focus on results obtained
by MCCE DEFAULT. Considering the relative cost of computation and their performance
for three different methods, the current study suggests that future work can adopt DFTB3/
MM TI explicit solvent simulations and MCCE DEFAULT for residues involving substan-
tial environmental changes (e.g. the catalytic residues upon substrate binding or glyco-
sylation). Particularly, MCCE DEFAULT is more applicable for the catalytic residues
which are in direct contact with charged residues. PROPKA can be used to deal with
Table 2.4: The pKa values for other residues based on MCCE and PROPKA methods.
For MCCE, only results from DEFAULT are presented.
Protein Exp. MCCE PROPKA

WT Asp4 3.0a 3.5 2.7
(pHopt : 5.6) Asp11 2.5a 2.4 3.0
Asp83 <2.0a 3.2 4
Asp101 <2.0a 2.9 2.9
Asp106 2.7a 2.9 3.9
Asp119 3.2a 3.3 3.3
Asp121 3.6a 3.5 4.0
Glu78 4.6a 4.5 6.0
His149 <2.3a 3.8 4.1
His156 ∼ 6.5a 7.3 6.5
N35H Glu78 5.5b 4.0 6.4
(pHopt : 4.4) His35 7.5b 8.4 7.3
N35H-2FXb His35 5.6b 6.0 7.2
N35D Glu78 5.7c 4.7 6.1
(pHopt : 4.6) Asp35 3.7c 4.0 6.5
N35D-2FXb Asp35 1.9-3.4c,d 5.3 11.4
a Data from Joshi, M. D.,;288 b Data from Ludwiczek, M. L.,;71 c Data from Joshi, M. D.,;275 d The exact
experimental pKa is not available due to Asp35 and Glu172 being strongly coupled in N35D-2FXb.71
other residues for its balance between efficiencies and accuracies.
2.4.2.2 Effect of glycosylation
As seen in Table 2.3, QM/MM TI and PROPKA consistently reproduced the pKa down-
shifts for Glu172 upon glycosylation in WT (WT->WT-2FXb) while MCCE suggested
no significant shift (from 4.8 to 4.9). The downshifts in the WT upon glycosylation could
be ascribed to the masking of the negative charges on Glu78 upon glycosylation thus sta-
blishing the deprotonated state. In contrast, all three methods indicated that Glu172 stays
deprotonated upon glycosylation in N35H (N35H->N35H-2FXb) and Glu172 maintain
protonated upon glycosylation in N35D (N35D->N35D-2FXb). Additionally, the exper-
imentally observed downshifts for N35H from the apo state to the glycosylated inter-
mediate state are much smaller compared to WT.71 These can be attributed to the much
stronger electrostatic interactions introduced by the closer charged residues being H35
(positively charged) than N35 (neutral) in the wide-type. However, experimentally a pKa
upshift for Glu172 was observed in N35D upon glycosylation.71 One would expected a
downshift upon glycosylation as this process masked the negative charge on Glu78. Ex-
perimentally, it has been argued that due to the tight coupling between Glu172 and Asp35,
it is difficult to assign the measured pKa values to specific residues and thus one has to be
cautious about the exact values.71 Both QM/MM TI and PROPKA predicted a pKa down-
Table 2.5: pKa values of Glu172 from DFTB3/MM, MCCE (QUICK and DEFAULT)
and PROPKA. Numbers in parentheses represent the number of values used to calculate
RMSD, 5 referring to all the cases in the table and 3 referring to the cases of WT, WT-
2FXb and N35H-2FXb.
Protein Exp. DFTB3/MM MCCE MCCE PROPKA

TI QUICK DEFAULT
calc err calc err calc err calc err
WT 6.7a 7.3 0.6 5.7 -1.0 4.8 -1.9 7.5 0.8
WT-2FXb 4.2a 4.3 0.1 4.8 0.6 4.9 0.7 6.8 2.6
N35H 3.4a 0.7 -2.7 2.7 -0.7 2.6 -0.8 4.2 0.8
N35H-2FXb 2.8a 2.8 0.0 4.9 2.1 3.0 0.2 4.4 1.6
N35D 8.4b 11.3 2.9 8.1 -0.3 7.1 -1.3 10.3 1.9
RMSD(5) 1.8 1. 1.1 1.7
RMSD(3) 0.4 0.8 1.2 1.8
a Data from Ludwiczek, M. L., et al.;71 b Data from Joshi, M.D., et al.275
shift from N35D to N35D-2FXb (-1.7 and -4.6, respectively) while MCCE predicted a
upshift (+1.6).
It is worth noting that Glu172 has a much lower pKa in N35H compared to WT and a
much higher pKa in N35D, while both mutants are shown to be active experimentally.71
The tight electrostatic coupling between residue 35 and Glu172 is proposed to maintain
its activity. The rational for the low pKa of Glu172 in N35H is the favourable electrostatic
interactions between Glu172 and the adjacent His35. What is more interesting is the pKa
of Glu172 is lower than that of Glu78 in N35H. A previous experimental study71 has
shown that Glu172 still serves as the general acid and Glu78 functions as a nucleophile
in N35H. This so-called reverse protonation mechanism292 led to a low population of
catalytically competent enzyme at the pHopt presented in N35H. The high pKa value of
Glu172 in N35D-2FXb results in it being unable to function as a general base. However,
the closely hydrogen-bonded pair between Asp35 and Glu172 is proposed to compensate
for the low population of catalytic competent Glu172, which enables that Glu172 and the
adjacent Asp35 function together as the general acid/base during the double-displacement
pathway.71 This is consistent with the observations that a stable hydrogen bond between
Asp35 and Glu172 was maintained throughout the simulations in both N35D and N35D-
2FXb.
2.4.2.3 Perturbative analyses of Glu172 pKa
As mentioned in the previous studies,270 substantial pKa shifts were observed for the key
catalytic residues in GHs and this was complemented by the analysis of the static X-ray
crystal structures. The electrostatic interactions between Glu78 and Glu172 are proposed
to affect the pKa values for Glu172.64 However, there are still unaddressed questions in-
Table 2.6: pKa values of other residues from MCCE (QUICK and DEFAULT) and
PROPKA.
Protein Residue Exp. MCCE QUICK MCCE DEFAULT PROPKA

calc err calc err calc err
WT Asp4 3.0a 3.9 0.9 3.5 0.5 2.7 -0.3
Asp11 2.5a 3.0 2.5 2.4 -0.1 3.0 0.5
Asp106 2.7a 4.0 1.3 2.9 0.2 3.9 1.2
Asp119 3.2a 3.1 -0.1 3.3 0.1 3.3 0.1
Asp121 3.6a 4.2 0.6 3.5 -0.1 4.0 0.4
Glu78 4.6b 3.8 -0.8 4.5 -0.1 6.0 1.4
N35H Glu78 5.5b 4.3 -1.2 4.0 -1.5 6.8 1.3
His35 7.5b 8.5 1.0 8.4 0.9 6.4 -0.9
N35H-2FXb His35 5.6b 4.7 -0.9 6.0 0.4 7.2 1.6
N35D Glu78 5.7c 5.2b -0.5 4.7 -1.0 6.1 0.4
Asp35 3.7c 5.0 1.3 4.0 0.3 6.5 2.8
RMSD 0.9 0.6 1.2
bData from Joshi, M.D., et al.;288 b Data from Ludwiczek, M. L., et al.;71 c Data from Joshi, M.D., et al.275
calc: calculated; err: error.
cluding the individual effects from 1) the role of glycosylation (Fig. 2.1); 2) the role
of solvation environment; and 3) the role of other nearby residues. In previous work on
GH33 and 34,270 the ligand, water and key protein residues have been proved to serve
coordinately to establish the pKa values for its dual role. In this study, we carried out ”in
silico” mutagenesis to probe the individual effects from interacting residues. The free en-
ergy difference between the protonated state and the deprotonated state was re-evaluated
from the structures of QM/MM trajectories to set up perturbative analyses. During the
perturbative calculations, the partial charges on groups of interests were set to zero and
the new energy gap was obtained by re-calculating the free energy derivatives based on the
original trajectories for the different λ (from 0.0 to 1.0) windows. Additionally, previous
experimental studies71 pointed out that changing the global charge of BcX by succinyla-
tion of Surface Tyrosine and Lysine does not significantly affect its pKa . To characterise
the effects of the global charge of BcX, we re-calculated the deprotonation free energy of
Glu172 after setting the charges of Tyrosine and Lysine on the surface in our models to
zero. It is worth noting that such perturbative analyses can only be treated qualitatively as
the relaxation of the modified environment is neglected during the perturbation. Despite
of these approximations, such a perturbative analysis is capable of providing meaningful
trends and information about how the surrounding environment affects the pKa value of
the general acid/base.
Based on the results from the free energy perturbative analyses (Table 2.8), we obtained
the pKa shifts due to the contributions from residue 35, the global charge, Surface Tyro-
sine and Surface Lysine (Table 2.7). We can clearly see that the pKa shifts result from
Table 2.7: pKa shifts based on perturbative analyses of the deprotonation Glu172 with
DFTB3/MM simulations a .
System ∆35b ∆Tyr&Lysc ∆Tyrd ∆Lyse

WT 7.3 3.5 -1.8 5.3
WT-2FXb 7.8 0.1 -2.2 5.5
N35H 38.7 4.6 -1.3 6.0
N35H-2FXb 20.3 2.9 -2.3 5.2
N35D -22.5 4.8 -1.0 5.7
N35D-2FXb -26.0 3.4 -1.9 5.3
aThe deprotonation free energy of Glu172 were computed by integrating the free energy derivatives that
recalculated from the DFTB3/MM trajectories for different λ windows, pKa shifts were then obtained
based on the recomputed free energydifference; b All the partial charges on the residue 35 were set to zero;
c All the partial charges on Surface Tryosine and Lysine were set to zero; d All the partial charges on
Surface Tryosine were set to zero.; e All the partial charges on Surface Lysine were set to zero.
Table 2.8: Perturbative analyses of the deprotonation free energy of Glu172 with
DFTB3/MM simulations (in kcal/mol)a .
System Ref.b ∆residue 35c ∆Tyr&Lysd ∆Tyre ∆Lys f

WT 129.6 140.0 134.6 127.0 137.1
WT-2FXb 124.4 135.5 124.6 121.2 132.2
N35H 118.6 173.9 125.1 116.8 127.2
N35H-2FXb 121.9 150.9 126.1 118.6 129.3
WT 136.0 103.8 142.8 134.6 144.2
WT 132.1 94.9 137.0 129.4 139.7
a The deprotonation free energy of Glu172 were computed by integrating the free energy derivatives that
recalculated from the DFTB3/MM trajectories for different λ windows; b Ref. refers to the perturbative
analyses on the originally simulated systems while keeping all the parameters untouched; c The partial
charges on residue 35 were set to zero; d All the partial charges on Surface Tyrosine and Lysine of our
models were set to zero; e All the partial charges on Surface Tyrosine of our models were set to zero; f All
the partial charges on Surface Lysine of our models were set to zero.
Table 2.9: Perturbative analyses of the deprotonation free energy of Glu172 with
DFTB3/MM simulations (in kcal/mol)a .
System λ Ref.b ∆residue 35c ∆Tyr&Lysd ∆Tyre ∆Lys f

WT 0.0 191.6 215.6 194.6 187.6 198.7
1.0 43.6 64.0 54.3 45.7 52.2
WT-2FXb 0.0 183.1 193.0 183.3 170.0 181.5
1.0 49.9 64.0 50.0 52.1 62.0
N35H 0.0 189.0 242.7 188.3 180.0 197.5
1.0 36.5 95.4 40.9 32.9 44.5
N35H-2FXb 0.0 191.7 241.0 195.7 188.2 195.1
1.0 48.3 82.1 63.9 56.6 55.6
N35D 0.0 187.6 138.0 216.2 208.3 216.6
1.0 27.9 -23.2 57.3 49.3 56.1
N35D-2FXb 0.0 206.8 82.8 229.0 222.4 227.6
1.0 24.4 -19.9 43.2 35.7 45.7
a
The deprotonation free energy of Glu172 were computed by integrating the free energy derivatives that
recalculated from the DFTB3/MM trajectories for different λ windows. b Ref. refers to the perturbative
analyses on the originally simulated systems while keeping all the parameters untouched; c The partial
charges on residue 35 were set to zero; d All the partial charges on Surface Tyrosine and Lysine of our
models were set to zero; e All the partial charges on Surface Tyrosine of our models were set to zero; f All
the partial charges on Surface Lysine of our models were set to zero.
Table 2.10: Perturbation analyses for pKa s of Glu172 and Glu78 based on PROPKA
method.
System Ref.a ∆residue 35b ∆Tyr&Lysc ∆Tyrd ∆Lyse ∆Tyr80 f

WT Glu172 7.5 8.0 7.4 7.5 7.5 7.7
Glu78 6.0 6.0 5.8 5.8 6.0 5.7
WT-2FXb Glu172 6.8 7.3 6.8 6.8 6.8 7.0
N35H Glu172 4.2 7.5 4.6 4.7 4.2 4.4
Glu78 6.4 5.8 6.0 6.0 6.4 6.1
N35H-2FXb Glu172 4.4 7.3 4.5 4.5 4.4 4.7
N35D Glu172 10.3 7.9 9.7 9.7 10.3 10.3
Glu78 6.1 6.1 6.1 6.1 6.1 5.7
N35D-2FXb Glu172 5.7 6.8 6.6 6.7 5.7 7.0
a Ref. refers to the pKa values for WT-BcX using PROPKA method based on the X-ray structure of BcX;
b The residue 35 were mutated to Alanine; c The Surface Tyrosine and Lysine in WT-BcX were mutated to
Alanine; d The Surface Tyrosine in WT-BcX were mutated to Alanine; e The Surface Lysine in WT-BcX
were mutated to Alanine; f The Tyrosine80 in WT-BcX was mutated to Alanine.
Table 2.11: The Coulombic interaction energy between Glu172 and surrounding
residues estimated by PROPKA (in kcal/mol)
System Glu78 Arg112 Tyr65 Tyr69 Tyr80 Tyr166 residue 35

WT 1.0 0.2 0.8 0.9 2.0 0.1 0.0
WT-2FXb 1.0 0.2 0.9 0.9 2.0 0.1 0.0
N35H 0.9 0.1 0.6 1.0 2.0 0.1 1.2
N35H-2FXb 1.1 0.2 1.0 1.0 2.0 0.9 1.8
N35D 1.0 0.2 0.7 1.0 2.0 0.1 1.8
N35D-2FXb 1.0 0.3 0.8 1.0 2.0 0.1 2.0
Table 2.12: The minimal distance between heavy atom pairs in Glu172 or Glu78 and
surrounding residues (in Å)
Glu172 Glu78
Tyr5 11.2 10.4
Tyr65 1.8 7.2
Tyr69 3.8 2.7
Tyr80 1.9 2.6
Tyr88 7.4 10.7
Tyr94 17.5 13.7
Tyr108 12.4 7.7
Tyr113 11.5 9.4
Tyr128 9.7 2.2
Tyr166 8.3 5.8
Tyr174 4.2 10.1
Lys40 9.6 13.7
Lys95 13.7 11.2
Lys99 20.9 12.8
Lys135 10.9 18.2
Lys154 20.4 12.9
Asn35 2.3 6.5
residue 35 in all systems are substantially large, about 7.3, 7.8, 38.7, 20.3, -22.5 and -26.0
pKa units, respectively. The corresponding electrostatic contributions are listed in Table
2.8. On the contrary, the net contributions from Surface Tyrosine and Lysine in BcX are
relatively smaller compared to those from residue 35 (Table 2.7 and Table 2.8). What can
be inferred are that residue35 in these six cases of BcX make a greater contribution to the
protonation process of Glu172, and the Surface Tyrosine and Lysine do not make a signif-
icant contribution. This means changing the global charge of BcX by succinylation does
not significantly affect pKa s of Glu172. By examining the energetic components, it can
be rationalized to that much of the electrostatics interactions between these residues and
Glu172 are largely screened due to the solvated environment (Table 2.12). Additionally,
the individual effects from Surface Tyrosine and Lysine on pKa values of Glu172 were
investigated by re-calculating the free energy derivatives with their partial charges scaled
to zero separately (Table 2.7). The magnitude from the Surface Tyrosine is smaller than
that from the Surface Lysine. This is expected as the electrostatics effects by nulling the
partial charges on Lys (net charge +1) are larger than those by nulling the partial charges
on Tyr (net charge zero). The reorganisation energy of the solvation environment and
nearby selected residues is associated with the variation of their contributions to the free
energy derivatives over the coupling parameters λ from 0.0 to 1.0. Here, the reorganisa-
tion energies from residue 35, which is in the close proximity of Glu172, together with
the Tyrosine and Lysine on the surface of our model upon “in silico” mutagenesis have
been calculated at the protonated and the deprotonated state. Experimentally it has been
shown five Lysine residues are solvent exposed and eleven out of fifteen Tyrosine residues
are accessible. These residues are the targets in the succinylation experiments to probe
the effects of the change of global charge. In all the six cases, the reorganisation energy
from residue 35is substantially larger than that from the global charge. Additionally, it has
been noticed that the contributions from Surface Lysine is larger than those from Tyrosine
(Table 2.8 and Table 2.9). These results demonstrated a qualitatively consistent trend for
the individual residues.
For comparison, PROPKA calculations were carried on by mutating residue 35 and
Surface Tyrosine and Lysine to Alanine to mimic the experimental studies and the pertur-
bative DFTB3/MM simulations. The pKa values of Glu172 and Glu78 based on PROPKA
perturbation calculations were obtained (Table 2.10). Coulombic interactions analyses
between Glu172 and the nearby residues were provided in Table 2.11. A marked change
for pKa values of Glu172 is found when residue 35 (Asn35 in WT, His35 in N35H and
Asp35 in N35D) was mutated to Alanine, but a relatively minor change is observed for
pKa values of Glu172 while the Surface Tyrosine & Lysine were mutated to Alanine.
From Table 2.10, a relatively strong interaction between Glu172 and residue 35 were ob-
served in the four mutated cases (N35H, N35H-2FXb, N35D and N35D-2FXb BcX). It
has been proved again that the interactions between residue 35 and Glu172 makes a sig-
nificant contribution to the pKa s for Glu172 in BcX. Additionally, the shift of the Glu172
pKa s with Surface Tyrosine mutated to Alanine is larger than that with Surface Lysine
mutated to Alanine, which is contrast to the findings in the free energy perturbation with
explicit solvent DFTB3/MM simulations, but in accordance with the analyses of Coulom-
bic interactions based on PROPKA which show some of Surface Tyrosine have significant
interactions with Glu172 and minor interactions with any Lysine residues were observed
(< 0.1 pKa unit). Tyr80 in BcX was mutated to Alanine to investigate its effects on the
pKa s of Glu172 as a larger Coulombic interaction between Tyr80 and Glu172 was ob-
served in the WT. However, its effect on the pKa of Glu172 was minor. We measured the
minimum heavy atom distances between these residues of interest and Glu172 or Glu78
(Table 2.12). We can see that all the Lysine residues are far away from Glu172, consistent
with the minor change on the pKa s of Glu172 while mutating Surface Lysine to Alanine.
residue 35, Tyr65, Tyr69 and Tyr80 have significant Coulombic interactions with Glu172.
This is consistent with the previous study that Tyr69 and Tyr80 play an important role in
catalysis.271,276,293 All these data show that pKa s of Glu172 are affected more by residue
35 than by the change in the global charge, which is consistent with the conclusion from
free energy perturbative analyses in the DFTB3/MM simulations (Table 2.7).
10
(a) (b)
8
λ=0.0
6 λ=1.0
4
2
100
Coordination Number
(c) (d)
8
6
4
2
100
(e) (f)
8
6
4
2
0
0 2 4 6 8 0 2 4 6 8 10
r(Å) r(Å)
Figure 2.6: Coordination number for water molecules around the carboxylic oxygen of
Glu172 in BcX (a WT, b WT-2FXb, c N35H, d N35H-2FXb, e N35D and f N35D-2FXb)
from the DFTB3/MM simulations. λ =0.0 refers to the protonated state and λ =1.0 refers
to the deprotonated state.
We rationalized the water contributions from the solvation structure of Glu172 by com-
puting the water radial distribution functions around the carboxylic oxygen in Glu172
(Fig. 2.6). The overall values for the apo state of BcX (WT and N35H) are relatively
higher than those for the intermediate state of BcX (WT-2FXb and N35H-2FXb), which
suggest that the substrate binding and the followed glycosylation process make Glu172
less exposed to water and contribute to their pKa shifts. No significant difference was ob-
served for N35D upon glycosylation. This is due to the tight interactions between Glu172
and Asp35, which make water less accessible, consistent with the sampled snapshots. Be-
sides, in all these six cases, the values for the deprotonation state are significantly larger
than the ones for the protonation state of Glu172, which reflects that deprotonated Glu172
(negatively charged) are largely exposed to solvents.
In summary, combined with the perturbative analysis, we can conclude that the electro-
static interactions between residue 35 and Glu172 in BcX, together with the glycosylation
step are the major determinants for the pKa shifts of Glu172 in BcX.
2.5 Conclusions
In this study, comparative computational studies were carried out to predict and ratio-
nalize pKa shifts for the catalytic Glu172 and selected key residues in Bacillus circulans
xylanase (BcX). Particularly, we studied the molecular basis for the so-called “pKa cy-
cling” in BcX (a retaining GH). Several routes for perturbing the pKa s of the key catalytic
residues thus alternating the pH-dependent activities of BcX were analysed and compared
including mutations of nearby residues, the solvation environment and the global charges.
This study consistently demonstrated that the local electrostatic environment including the
nearby polar/charged residues and the solvent interactions play more significant roles than
the global charge in BcX. This complemented the experimental studies with an energetic
analysis. The strengths and limitations of three different computational methods, namely
DFTB3/MM TI, MCCE and PROPKA, were evaluated by comparing to the experimental
data. For most of the cases, all the three methods provide satisfactory pKa predictions for
BcX. This differs from previous work in GH33/34 with a negatively charged substrate,
where signification deficiencies were noted for PROPKA.270 The challenging aspects in
dealing with electrostatics for pKa calculations have been highlighted. We anticipate our
work to serve as a benchmark for future computational studies of other GHs.
Chapter 3
Exploring the catalytic mechanism of

the S. pneumoniae sialidases NanA
3.1 Abstract
S. pneumoniae is a leading human pathogen, which is associated with many diseases
including otitis media, meningitis and septicaemia. NanA, as one of three S. pneumoniae
sialidases, has been shown to function as a promiscuous sialidase producing Neu5Ac
from α2,3- or α2,6-sialyllactose. The three forms of S. pneumoniae, NanA, NanB and
NanC, were suggested to share a common glycosylation step but to produce three different
products. By means of MEP and PMF calculations based on combined SCC-DFTB/MM
simulations, the hydrolysis mechanism and the most preferred route of NanA has been
determined.
3.2 Introduction
Sialic acids (N-acetylneuraminic acid, a nine-carbon and acidic α-keto sugar) have been
identified on the animals’ tissues cell surfaces.82 Of some pathological changes in which
the sialic acids involve, the adjustment of cell recognition is of outstanding significance
and importance. Also, the removal of terminal sialic acid residues from various glyco-
conjugates is catalyzed by sialidases and this catalytic reaction is a key biochemical event
associated with immune surveillance, signaling and apoptosis.79
S. pneumoniae has been proposed to be important in the development of severe otitis
media, acute meningitis and septicaemia.294,295 The occurrence rate of life-threatening
pneumococcal disease in human, especially in children under 5 years of age is higher
than HIV, malaria, and measles combined.296 S. pneumoniae encodes up to three distinct
NAs: NanA, NanB and NanC.173,177,297 NanA was detected in all S. pneumoniae strains,
NanB was the majority prevalent gene (96%), and NanC was the least one (51%).177,298
48
CHAPTER 3. THE CATALYTIC MECHANISM OF NANA 49
NanA supports pneumococcal colonization of the upper and lower airway299,300 and sep-
sis,153,301 and promotes biofilm formation.297,302 The predominant role of NanA in patho-
genesis of pneumococcal strains renders it a promising therapeutic target.79
The reaction mehchanisms for the three sialidases NanA, NanB and NanC in S. pneu-
moniae have been proposed by previous experimental study based on X-ray crystallogra-
phy, NMR, kinetic analysis and site-directed mutagensis (Fig. 1.9).173–176,180,303,304 As
depicted in Fig. 1.9, for the conserved catalytic residues shared by these three sialidases,
the carboxylate group of sialic acids interact with the tri-arginine clusters, and tyrosine
residue close to the glutamic acid acts as a nucleophile, together with that aspartic residue
acts as an acid/base. Their first step shows a common reaction pathway, while the second
step presents intriguing differences. The oxocarbonium ion C2 position is attacked by the
water molecule, the sialyl cation O7 atom and the carbohydrate accepter for NanA, NanB
and NanC, respectively. Subsequent products include Neu5Ac, 2,7-anhydro-Neu5Ac and
Neu5Ac2en are then produced. The findings173,176 reported that these three sialidases
might work together up to the ultimate step where NanA and NanB produce Neu5Ac and
2,7-anhydro-Neu5Ac following the functions of sialidase and intramolecular (IT) trans-
sialidase, whilst NanC could carry on a ping-pong mechanism that producing or removing
Neu5Ac2en. It is intriguing that three different sialidases have very similar active sites
but operate via three distinct reaction pathways. This chapter describes a combined QM/
MM modeling study for the S. pneumoniae sialidases NanA. We explored the hydrolysis
mechanism of NanA and computed the potential energy and free energy profiles of the re-
lated chemical reactions. Based on the MEP and PMF calculations for NanA along three
different reaction pathways, their respective reaction barriers in the second step has been
compared and thus the most preferred route for NanA has been determined. Addition-
ally, according to the combined QM/MM simulations of the intermediate state, a detailed
analysis on the active site structures and dynamics including water structures have been
carried out. These computational results, combined with the previous experimental in-
vestigations, are able to provide conclusive information on the catalytic mechanism of
NanA.
3.3 Methods
3.3.1 Simulation setup

The crystal structure of NanA (PDB code: 2VVZ,174 resolution 2.50 Å) was used as
the starting structure for all the simulations. The protonation states of all the titratable
residues, particularly the catalytic residues Glu647, Tyr752 and Asp372, were predicted
based on PROPKA305 which has been proved to be capable of providing satisfactory pKa
predictions and assigning proper protonation states in our previous work.76 The Glu647
QM region with the tri-arginine clusters
Arg347
Glu647 Tyr752
Arg663
Arg721
Asp372
Reaction region
Buffer region
Inner region
Continuum electrostatic
Figure 3.1: QM/MM simulation setup with the GSBP model for NanA. A spherical re-
gion of 20 Å radius centered on Glu647 Oε2 was describedas the inner region (shown
in red) and the remaining part was outer region represented with the continuum elec-
trostatics; Reaction region: a sphericalmodel of 18 Å centered on Glu647 Oε2, where
Newton’s equations of motion were solved (labeled in green); Buffer region: the re-
gion betweenreaction and inner region, where Langevin dynamics was solved (labeled
in blue); QM region: the link atom (drawn in light orange) was placed between Cβ and
Cγ , Cβ and Cγ , and Cβ and Cα for Glu647, Tyr752 and Asp372 respectively to divide
the bond between QM and MM region. The QM atoms are include Cγ , Hγ1, Hγ2, Cδ ,
Oδ 1, Oδ 2 of Glu647, Cγ , Cδ 1, Hδ 1, Cδ 2, Hδ 2, Cε 1, Hε1, Cε 2, Hε2, Cζ , OHη, Hη of
Tyr752, Cβ , Hβ 1, Hβ 2, Cγ , Oγ1, Oγ2, Hγ2 of Asp372 and the whole substrate.
and Asp372 in the Michaelis complex (reactant complex RC) of NanA was shown to be
deprotonated and protonated, respectively. This is consistent with their catalytical roles
as a general base and general acid in the first step from RC to the Intermediate complex
IC, respectively. All QM/MM simulations were carried out with CHARMM (version
c38a2).279 CHARMM force field c36306–308 was used to describe the protein and sugars.
The self-consistant charge density functional tight binding (SCC-DFTB)225 was imple-
mented to describe the quantum region. The system was solvated with the TIP3P water
model.309 The General Solvent Boundary Potential (GSBP)310 setup was carried out in
conjunction with explicit water molecules in the inner sphere for the QM/MM simulation
that includes: (i) a spherical region of 20 Å radius centred on Glu647 Oε2 is treated as the
inner region and the remaining protein is treated as the outer region; (ii) a spherical region
of radius 18 Å was defined as reaction region where Newtonian dynamics is solved; (iii)
the region between 18 Å and 20 Å was treated as buffer region where Langevin dynam-
ics is solved. A non-polar cavity potential was adopted to confine the inner region water
molecules to prevent them leaving this region. The dielectric constants in the protein
and solvent were assumed to be 1.0 and 80.0, respectively. The simulated temperature
was set up to 298.15K. The bonds involving hydrogen were constrained via the SHAKE
algorithm.311 To treat the bonded interactions at the QM/MM boundary, the bonds be-
tween the QM and MM regions were cut by a link atom with the divided frontier charge
model (DIV).282 Such a treatment has been shown to provide a balanced description of
proton affinities for molecules of interest.270 The link atom was placed between Cβ and
Cγ , Cβ and Cγ , and Cβ and Cα for Glu647, Tyr752 and Asp372, respectively. The whole
substrate was also included in the QM region. A full QM/MM minimization was per-
formed prior to MD simulations. Notably, the R group in the substrate was modified to
a methyl group to minimize the computational cost. Furthermore, the puckering of the
sugar was converted to a boat conformation (2 B5 ) based on the Cremer-Pople puckering
coordinates19 during the QM/MM minimization for consistency with the conformation
in the proposed catalytic mechanism. The puckering coordinates are given by Q, φ and
θ defined by Cremer and Pople (See details in 1.1.1.2).19 This was achieved by applying
harmonic restraints with a force constant 1.0, 50.0 and 50.0 kcal/mol, respectively, around
the predefined reference values. See Fig. 3.1, Fig. 3.2 and Table 3.1 for more information.
3.3.2 MEP calculations

We assumed that NanA could produce three promiscuous products Neu5Ac, 2,7-anhydro-
Neu5Ac and Neu5Ac2en. A comparative MEP simulation was performed along the three
different hypothetical reaction pathways presented in Fig. 3.3. The reaction ccordinates
used to drive the reactions are shown in Fig. 3.3. From RC to IC, three reaction pathways
share the same reaction coordinates. d1 is the distance between C2 position of the sugar
O
OH HO OH
HO
O O- Pukering O
O
AcHN AcHN
O CH3 O-
HO H HO OH
HO
O CH3
3H (!=330°, ɸ=130°) 2B ɸ=98°)

2 5 (!=300°,
Figure 3.2: The proposed conformation of the substrate in the RC.
Table 3.1: Summary of the starting structures for the Reactant complex (RC) and Inter-
mediate complex (IC), the equilibrium molecular dynamics simulations (Eq.).
System PDB ID Substrate Number of Eq.(in ns) Eq.(in ns)

QM atomsb MM QM/MM
NanA RC 2VVZ174 DANa 70 100 10
IC 2VVZ174 DANa 67 4
a DAN: 2-deoxy-2,3-dehydro-N-acetyleneuraminic acid; b The SCC-DFTB/MM partition boundary (the

link atom) for all the simulations was placed between Cβ and Cγ , Cβ and Cγ , and Cβ and Cα in residue
Glu647, Tyr752 and Asp372, respectively. The DIV scheme was adopted as it has been shown to provide
the best representation of the system.270
ring and the glycosidic O of methyl group, d2 describes the one between the glycosidic O
of methyl group and Asp372 Hδ 2, d3 refers to the distance between Tyr752 Oη and C2
position of the sugar ring, and d4 refers to the distance between Tyr752 Hη and Glu647
Oε2. During the second steps (from IC to the product complex PC), d5 refers to the dis-
tance between Tyr752 Oη and C2 position of the sugar ring, d6 corresponds to the one
between Tyr752 Oη and Hη, d7 is the one between C2 position of the sugar ring and the
catalytic water oxygen, d8 is the one between Asp372 Oε2 and catalytic water hydrogen,
d9 defines the one between C2 position and O7 of the sugar ring, d10 refers to the one
between C3 and H32 in substrate, and d11 accounts for that between H32 of the sugar
ring and catalytic water oxygen. The 2D energy profiles were then evaluated, with four
reaction coordinates. rxn1=d1-d2 and rxn2=d3+d4 were used driving the first step from
RC to IC. rxn3=d5-d6 and rxn4=d7+d8 were used to define the reaction pathway from IC
to PC Neu5Ac (PathA), rxn3=d5-d6 and rxn5=d9 were used to describe the pathway from
IC to PC 2,7-anhydro-Neu5Ac (PathB), and rxn3=d5-d6 and rxn6=d10-d11 were used to
describe the one from IC to PC Neu5Ac2en (PathC). rxn1 describes the departure of leav-
ing group, rxn2 represents the proton transfer from Tyr752 to Glu647 and the nucleophilic
attack on the C2 position of the sugar ring by the deprotonated Tyr752, rxn3 defines the
cleavage of the Tyr752-sialic acid bond and the protonation of Tyr752 by a proton of
Glu647, rxn4 refers to the subsequent nucleophilic attack at C2 position of the sugar ring
with the deprotonated catalytic water molecule by Asp372, rxn5 and rxn6 describe the
attack to the anomeric carbon of the oxocarbonium ion by O7 of the sialyl cation and the
carbohydrate acceptor, respectively. Unlike the catalytic mechanism for NanC (Fig. 1.9)
proposed by experimental studies, in which the C3 proton of the ligand is attacked by the
catalytic base Asp315 directly due to a unusually short of distance between the anomeric
carbon C3 and Asp315 and the surrounding hydrophobic environment around the ligand
(less water molecules around),173,176 the reaction in our case is proceeded with an acti-
vated water molecule nearby because the more accessible the catalytic cavity in NanA
allows more water molecules to reside (Fig. 3.3). This mechanism is more accordant with
the actual environment in NanA. The water distributions around C2, HO7 and H32 of the
substrate were rationalized during the simulations (Fig. 3.22). By way of contrast, we
also performed MEP calculation on the reaction path for the second step of NanA based
on the mechanism of NanC shown in Fig. A.2 (See details in Appendix).
3.3.3 PMF simulations

The PMF simulations were carried out with a harmonic force constant 50.0 kcal/mol/
Å2 for each reaction coordinate except 350 kcal/molÅ2 for rxn5. The total number of
windows was around 266, 980 and 931 for PathA, PathB and PathC, respectively. For each
window, 200 ps SCC-DFTB/MM simulations following the same GSBP protocol were
performed. The 2D WHAM268 method was used to unbias the systems and to evaluate
the free energy profiles along the predefined reaction coordinates.
3.4 Results and discussions
3.4.1 Equilibrated structures from QM/MM simulations

The equilibrated structures of RC from SCC-DFTB/MM simulations are shown in Fig.
3.4. Asp372 is shown to be protonated, and Glu647 is deprotonated in RC. The sugar
ring is a boat conformation (2 B5 ). For the key residues and substrates, the heavy atom
positional root-mean-square-deviations (RMSD) from the crystal structures were approx-
imately 1.0 Å over the simulation time. Also, classical MM simulations were carried out
to probe the key interactions in RC. Encouragingly, both MM and SCC-DFTB/MM simu-
lations provide a consistent structure and suitable positions for the catalytic reaction. The
final snapshots from these equilibrium simulations (Table 3.1) were taken as the initial
structures for MEP calculations.
Tyr752
Glu647
O O-
O
d4 +
H
HO OH d3 H 2N
O Arg
Reaction coordinates: O H 2N
rxn1 = d1-d2 AcHN +
rxn2 = d3+d4 O-
HO OH d1
RO
d2
H
O O
Tyr752
Asp372
Glu647
d6
O O H O
+
HO OH d5 O H 2N
Arg
O H 2N Tyr752
AcHN O- +
HO HO d7
H Glu647
O
rxn3 = d5-d6 H
rxn4 = d7+d8 H d6
HO O- d8 O O H O
+
HO OH d5 O H 2N Tyr752
Arg
Asp372 O H 2N
AcHN O- +
Glu647
O HO
H
PathA H O d9
d6
H O O H O
H +
COOH rxn3 = d5-d6 O- HO OH d5 O H 2N
OH O
HO rxn5 = d9 Arg
O H 2N
O OH AcHN O- +
AcHN Asp372
HO HO d10
HO HO H d11
H O
PathB rxn3 = d5-d6 H
rxn6 = d10-d11 H
O O-
NHAc
O Asp372
COOH
HO O
HO PathC
HO OH
HO
O COOH
AcHN
OH HO
Figure 3.3: The three predefined pathways for NanA. For the first step, d1 refers to the
distance between C2 position of the sugar ring and the glycosidic O of methyl group, d2
is the distance between the glycosidic O of methyl group and Asp372 Hδ 2, d3 describes
the distance between Tyr752 Oη and C2 position of the sugar ring, and d4 is the one
between Tyr752 Hη and Glu647 Oε2. For the second steps, d5 is the distance between
Tyr752 Oη and C2 position of the substrate, d6 corresponds to the one between Tyr752
Oη and Hη, d7 is the one between C2 position of the sugar ring and the catalytic water
oxygen, d8 is the one between Asp372 Oε2 and catalytic water hydrogen, d9 defines the
one between C2 position and O7 of the sugar ring, d10 refers to the one between C3 and
H32 in substrate, and d11 accounts for that between H32 of the sugar ring and catalytic
water oxygen. The reaction coordinates rxn1,rxn6 are also illustrated in the figure.
Arg663
Glu647
Tyr752
Arg721
WT
Arg347
Asp372
Figure 3.4: The final snapshots from the explicit solvent SCC-DFTB/MM simulations
at 10ns. The key residues and substrate are shown explicitly. The water molecules within
approximately 5 Å of Oε2 of Glu647 are shown in red.
3.4.2 Potential energy profiles for NanA along three different reac-
tion pathways
3.4.2.1 The first step in NanA
The MEP calculations were carried out along those predefined reaction coordinates, in
which several cycles scanning from the reactant state and product states back and forth
were performed to monitor the convergences. For the first step, rxn1 was harmonically
restrained from -0.12 to 2.44 Å while rxn2 was scanned from 4.86 to 2.30 Å using a force
constant of 2000.0 kcal/mol/Å2 . The scans were constructed in steps of 0.08 Å. A total of
257 energy points have been obtained in the 2D MEP calculations. A contour plot of the
energy profiles was shown in Fig. 3.5, describing the reaction pathway for the first step
catalyzed by NanA. The white dashed line highlighted the minimum energy path which
estimates the most likely pathway. Regarding the energetics, the reaction barrier for the
first step is 13.4 kcal/mol, and the path is downhill to the IC once the transition complex
of the first reaction step (TS1) is reached, which indicates a nucleophilic attack to the
anomeric carbon of the substrate and a proton transfer from Tyr752 to Glu647.
3.4.2.2 The second step in NanA
Subsequently 4 ns equilibrium SCC-DFTB/MM simulations were performed on the IC

structure obtained from the first step. Similarly, several cycles MEP scanning were per-
formed based on the equilibrated IC structure. rxn3 was scanned from -0.44 to 2.44
Å while rxn4 was scanned from 5.29 to 2.11 Å in the second step of PathA, rxn3 was
scanned from -0.28 to 2.12 Å while rxn5 was scanned from 3.91 to 1.03 Å in PathB, and
rxn3 was scanned from -0.3 to 2.10 Å while rxn6 was scanned from -1.73 to 2.11 Å in
PathC. Also, the scans were done with a 0.08 Å step with a harmonic force constant 2000
kcal/mol/Å2 . The total number of energy point for PathA, PathB and PathC is 317, 211
and 319, respectively. The energy profiles for these three distinct pathways are shown
in Fig. 3.6, Fig. 3.7 and Fig. 3.8. The minimum energy paths from IC to the transi-
tion complexes involved in the second steps (TS2) and then to PCs are highlighted in the
white dashed lines. The related reaction barrier for each path is around 27.1, 36.3 and
27.8 kcal/mol, respectively. The complete energy profiles for PathA, PathB and PathC are
presented in Fig. 3.9, from which we can clearly see that the reaction barrier for PathB
is the largest one among the three paths while the one for PathA is smallest that implies
PathA is the most possible rout for NanA. Interestingly, the reaction barriers difference
between PathA and PathC is quite small. The detailed analyses on this will be discussed
in Section 3.4.5.
Figure 3.5: Potential energy contour plot for the first reaction step of NanA. Energies
are in kcal/mol. The white dashed line illustrates the minimum potential energy path.
Figure 3.6: Potential energy contour plot for the 2nd step along PathA. Energies are in
kcal/mol. The white dashed line illustrates the minimum potential energy path.
Figure 3.7: Potential energy contour plot for the 2nd step along PathB. Energies are in
0
2.0
3
6
9
1.5
12
15
18
1.0 21
27
24
27
0.5 30
33
rxn6(Å )
36
0.0 39
27 42
45
-0.5
-1.0
-1.5
0.0 0.5 1.0 1.5 2.0
rxn3(Å )
Figure 3.8: Potential energy contour plot for the 2nd step along PathC. Energies are in
40
36.7
35
30
27.8
Relative Energy(kcal/mol)
26.6
25 27.1
TS2 24.4
20
18.5
PC
15
13.4
TS1
10
5 PathA
PathB
0.0 -0.2
0 PathC
RC IC
-5
Figure 3.9: Potential energy profiles for PathA (black), PathB (purple) and PathC (blue).
This potential energy profile was based on SCC-DFTB/MM simulations.
3.4.3 Description of the mechanisms for NanA from the reactant state
to the product state
The MEPs for the three different reaction pathways , PathA, PathB and PathC proceed
diagonally indicating an AN DN type mechanism involved. This represents a concerted
bimolecular ”SN 2” displacement,312 where the reaction coordinates rxn1 and rxn2, rxn3
and rxn4, rxn3 and rxn5, rxn3 and rxn6 are highly correlated. The AN DN reaction can be
classified into an associative or a dissociative mechanism according to the relationship be-
tween attacking and departing groups, for instance, the nucleophilic attack by Tyr752 and
the departure of the methyl group in the first step. To better understand the mechanisms,
a 2D More O’Ferral-Jencks style diagram313 was created. The Pauling bond order314 was
used to represent the sialic acid bond making and breaking, it was determined as follows:

(R0 − Rx )
nx = n0 exp (3.1)
0.6
where R0 is the reference bond (R0 equals to 1.4 Å and 1.09 Å for C-O and C-H, respec-
tively), whose bond order is defined as n0 (=1), and Rx denotes the distance between the
nucleophile (Tyr752 in the first step; water, O7 of sialic acid and carbohydrate acceptor
in PathA, PathB and PathC respectively) and the anomeric carbon or the leaving group.
The More O’Ferral-Jencks plots for the critical points, together with the corresponding
conformations are depicted in Fig. 3.10, Fig. 3.11, Fig. 3.12 and Fig. 3.13. The x and
y axes are the Pauling bond orders of the structures along the potential energy path from
the anomeric C to the nucleophile or the leaving group.
For the first step, from the RC to IC, the leaving methyl group departs substantially
when the TS1 is reached, indicating that the AN DN mechanism for the first step is a
dissociative one (Fig. 3.10).
In Fig. 3.11, the axes represent the Pauling bond orders for O (water)-sialyl bond
making and Tyr752 Oη-sialyl bond breaking. The PathA shows that the Tyr752 Oη-sialyl
bond is already cleaved while the O (water)-sialyl bond formation is about to happen,
indicating that PathA adopts a dissociate AN DN mechanism. Similarly, Fig. 3.12 shows
that in PathB the Tyr752 Oη-sialyl bond has been cleaved while the O7 (sialyl)-C2 (sialyl)
bond formation is beginning. In the meantime, the proton has been transferred from the
catalytic water to the Asp372 while the water molecule is protonated by HO7 of sialyl
cation. Fig. 3.13 reveals that in PathC the Tyr752 Oη-sialyl bond is broken before the the
TS2 is reached. At this point, the proton transfers between the catalytic water molecule
and the Asp372, and between the Gu647 and the Tyr752 are about to happen.
Considering all of these results, it can be concluded that the AN DN mechanisms in-
volved in the whole pathways are dissociative, in other words, which indicates the loss
of bonding in the TS has progressed further than bond making. This dissociative mecha-
nism has been observed in the free energy study of the catalytic mechanisms of TcTS and
TrSA.315,316 In addition, it is worth noting that the proton transfer between the Gu647 and
the Tyr752 has not completed at the PC states for both PathB and PathC (Fig. 3.12 and
Fig. 3.13). The subsequent reactions for PathB and PathC are explained in Appendix. We
will focus on the energy profiles shown above in the following context as we are more
interested in the formation of the different conformations of the substrate during the three
reaction pathways.
The values for the critical distances are summarized in Table 3.2 and Table 3.3. From
Table. 3.2 we can clearly see that the proton of Tyr752 is almost transferred to Glu647
(d4≈1.18 Å) and the Tyr752 hydroxyl oxygen is around 0.8 Å closer to the anomeric
carbon of sialic acid at the TS1. Obviously, Tyr752 is more negative charged in this con-
figuration, which enhances Tyr752 being a nucleophile. Unlike the traditional carboxylate
nucleophile, it has been argued that Tyr752 here has an easier electrostatic approach to
the anomeric carbon of sialic acid due to the more favorable electrostatic interaction with
the negatively charged substrate.270,315,317 In addition, the Tyr752 Oη-sialyl is already
cleaved (d1≈2.07 Å) and the Asp372 proton is about to be transferred to the methyl group
(d2≈1.22 Å). Significant conformational changes of the sugar ring have been observed
during the first reaction step. The sialic acid adopts a 2 B5 conformation in the RC and
changes to 4 H5 half-chair conformation during the oxocarbenium ion-like TS1 and finally
Figure 3.10: The More O’Ferral-Jencks plot for the IC formation. The axes are the
Pauling bond orders from the anomeric C to the nucleophile (n(C2sial -Onuc ) ) and the
leaving group (n(C2sial -Olg )) . The RC, TS1, and IC configuration are shown in the
bottom left, top left, and top right, respectively. See Fig. 3.4 for more information. The
water molecules are shown in red.
Figure 3.11: The More O’Ferral-Jencks plot for the PC formation in PathA. The axes
are the Pauling bond orders from the anomeric C to the water molecule (n(C2sial -Owat ) )
and residue Tyr752 (n(C2sial -OHtyr752 )). The IC, TS2, and PC configuration are shown
in the bottom left, top left, and top right, respectively. See Fig. 3.4 for more information.
The catalytic water molecule is shown in Licorice and the nearby water molecules are
presented in red with CPK.
Figure 3.12: The More O’Ferral-Jencks plot for the PC formation in PathB. The axes are
the Pauling bond orders from the anomeric C to the O-7 atom of the substrate (n(C2sial -
O7sial ) ) and residue Tyr752 (n(C2sial -OHtyr752 )). The IC, TS2, and PC configuration are
shown in the bottom left, top left, and top right, respectively. See Fig. 3.4 for more
information. The catalytic water molecule is shown in Licorice and the nearby water
molecules are presented in red with CPK.
Figure 3.13: The More O’Ferral-Jencks plot for the PC formation in PathC. The axes are
the Pauling bond orders from the anomeric C to the H-3 atom of the substrate (n(C2sial -
H32sial )) and residue Tyr752 (n(C2sial -OHtyr752 )). The IC, TS2, and PC configuration
are shown in the bottom left, top left, and top right, respectively. See Fig. 3.4 for more
reaches a 2C5 chair form at the IC. This is consistent with the catalytic pathway reported
in the previous studies for the retaining (trans)sialidases.23,173,318
In the conformation of IC, the distance between the anomeric carbon and the oxygen
atom of the catalytic water molecule is less than 4 Å (d7≈3.26 Å) and a hydrogen bond is
established between the catalytic hydrogen and Asp372 Oε2, providing an ideal position
for a nucleophilic attack by a water molecule and the proton transfer. At the TS2 in PathA,
the Tyr752-sialic acid bond has been cleaved (d5≈2.63 Å) while the proton transfer be-
tween Glu647 and Tyr752 is almost achieved (d6≈1.15 Å). This proton transfer is also
observed at the TS2 in PathC while it is likely to happen concurrently with the formation
of the O7 (sialyl)-C2 (sialyl) bond in PathB. The PathA TS2 conformation is shown to
be similar to the TS1 4 H5 half-chair conformation, while the TS2 in PathB adopts a 4 T3
conformation and the one in PathC presents a E3 conformation.
Table 3.2: Reaction coordinate distances (Å) for RC, TS1 and IC in the first step of
NanA.
RC TS1 IC
d1 1.50 2.07 3.47
d2 1.62 1.22 1.03
d3 3.91 2.40 1.54
d4 1.68 1.18 1.00
Table 3.3: Reaction coordinate distances (Å) for RC, TS1 and IC for the second step
along PathA, PathB and PathC, respectively.
IC TS2 PC
PathA d5 1.52 2.63 3.05
d6 1.95 1.15 1.09
d7 3.26 2.11 1.50
d8 1.79 1.66 1.01
PathB d5 1.53 2.37 2.80
d6 1.81 1.69 1.16
d9 3.83 1.99 1.51
PathC d5 1.52 2.24 2.54
d6 1.82 1.20 1.16
d10 1.12 1.37 2.13
d11 1.89 1.17 0.98
3.4.4 Validation of SCC-DFTB/MM for reaction barriers and reac-

tion energies
QM/MM simulations with the SCC-DFTB model have been proved to be one of the
most popular tools to solve chemical questions associated with enzymatic reactions in
condensed phases and large biomolecules.198,244,319 However, it is not a straightforward
method to evaluate the accuracy for barrier heights. Several benchmarking studies of
DFTB3 for barrier heights have been reported, suggesting DFTB3 energy calculations on
some specific enzymes can provide reliable data.228,320 Recently, Pereira et al. evaluated
the performance of SCC-DFTB and several DFT methods using a 22-atom retaining GH
model.321 They pointed out that semiempirical SCC-DFTB presents a significant error
(≈12.6 kcal/mol) in the determination of reaction barrier height when compared with the
ab initio electron correlation method CCSD(T) with CBS approximation, while M06-2X/
6-31+G(d,p) shows much more accuracy (≈0.6 kcal/mol).321 A very recent work showed
that the both SCC-DFTB and DFTB3 structures are in agreement with the ones from DFT
calculations.322 In addition, these two methods can yield satisfactory results although pro-
duce larger error in some certain cases.322 It is worth mentioning that all these benchmark-
ing studies adopted a cluster model in the gas phase.
In this work, we chose as our model the second step along three different pathways
of NanA. Single point calculations at a higher level of theory on the structures from the
optimized MEP results were performed. The QM regions were treated with M06-2X
method with basis set 6-31+G(d,p). The results are shown in Fig. 3.14, Fig. 3.15 and Fig.
3.16. For PathA which adopts the general hydrolysis mechanism of retaining glycosi-
dases, the potential energy difference between SCC-DFTB and M06-2X is quite small
(<3.0 kcal/mol). Interestingly, the previous benchmarking of SCC-DFTB and M06-2X
shows that the reaction barrier height of a 22-atom model which mimic the the first step
of the retaining glycosidases mechanism based on SCC-DFTB is around 34.7 kcal/mol,
which is much higher than the M06-2X/6-31+G(d,p) result (≈22.7 kcal/mol).321 Another
barrier height benchmark set for several other systems found that DFTB3 results have a
marked change with the increase of the system size.320 With these in mind, along with
the accuracy of M06-2X,321 one can infer that SCC-DFTB provides a good performance
on energy calculations for our NanA model. Nevertheless, for PathB and PathC which
conformations of substrates change substantially, M06-2X produce significant overesti-
mations of reaction barriers (around 20.0 kcal/mol in PathB and around 10.0 kcal/mol in
PathC), compared to the SCC-DFTB results. Significant changes in the structure, such as
proton transfer and hydrogen bonding rearrangements have been proved to be the sources
of large error for SCC-DFTB calculations.320 Taking all these into account, one can ar-
gue that SCC-DFTB does not seem to provide encouraging results of reaction barrier for
PathB and PathC where the structures were optimized at the SCC-DFTB theory, and it
is thought to be associated with large error in describing significant structures changes

during the reactions. Notably, for PathC, M06-2X predicted a different structure for both
TS2 and PC. Similar phenomena have been observed in the previous works for potential
energy surfaces generated by M06-2X.323,324 Based on the IC, TS2 and PC conformers
obtained from the SCC-DFTB/MM simulations, the potential energy profiles for the three
paths based on the single point calculations are shown in Fig. 3.17. We can conclude
that consistent with our SCC-DFTB simulations, M06-2X/MM predicted pathA is the
preferred reaction pathway for NanA.
Figure 3.14: Potential energy contour plot for PathA. This potential energy was obtained
from single point calculation based on M06-2X/6-31+G(d,p). Energies are in kcal/mol.
The white dashed line illustrates the minimum potential energy path.
3.4.5 Free energy profiles for NanA along three different paths
To better understand the chemical reaction for the second step of NanA, the PMF sim-
ulations on the second reaction step have been carried out using the same reaction co-
ordinates based on the structures obtained from the MEP calculations above. Fig. 3.18,
Fig. 3.19 and Fig. 3.20 show the free energy contour plots for the three different reaction
pathways from the IC to the PC and Fig. 3.21 summarize the reaction energetics from
Figure 3.15: Potential energy contour plot for PathB. This potential energy was obtained
Figure 3.16: Potential energy contour plot for PathC. This potential energy was obtained
60 58.4
50
40 39.9
35.8
30
26.0
24.0
TS2
20 18.0
PC
10
PathA
PathB
0.0
0 PathC
IC
Figure 3.17: Potential energy profiles for PathA (black), PathB (purple) and PathC
(blue). This potential energy was obtained from single point calculation based on M06-
2X/6-31+G(d,p) with the SCC-DFTB/MM optimized geometries.
IC to PCs based on PMF calculations. The minimum free energy paths highlighted with
white dashed lines show similar trend with MEP, indicating that the MEP calculations can
describe the catalytic mechanisms of NanA properly. Besides, the MEP and PMF reac-
tion barriers are close each other, the largest difference is less than 3.5 kcal/mol. Taking
these into consideration, we assume the error estimated between M06-2X/6-31+G(d,p)
and SCC-DFTB MEP energy calculations can be transferred to the corresponding PMF.
The reaction barriers from the corrected PMF profiles of the second step for PathA, PathB
and PathC are 26.0, 35.4 and 55.0 kcal/mol, respectively. We note that the barrier of 26.0
kcal/mol is a little bit higher for a typical enzymatic system, however similar barriers have
been reported for other retain GHs including S. marcescens ChiB,50 TcTS,316 1,3-1,4-β -
glucanase55 and Golgi α-mannosidase II (GMII).325
Figure 3.18: Free energy contour plot for PathA. Energies are in kcal/mol. The white
dashed line illustrates the minimum free energy path.
Previous work on TcTS and TrSA has pointed out that the enzyme active site serve
to establish the TS during both the formation and hydrolysis of the IC.315,316 Here, to
probe the effects of individual active site residues on the reaction barrier for the second
step of PathA, we carried out perturbative analyses on the IC, TS2 and PC, where the
partial charges on the specific residues were set to zero and the energies were recalculated
based on the optimized geometry of the wild type structures. Such a perturbative analysis
can provide qualitative information on the contributions of relative residues, albeit the
relaxation of the modified environment is neglected during the perturbation. The results
from the perturbative analyses are presented in Table 3.4. In all the cases, the resultant
values for the energy barriers are relatively similar to the one of PathA, indicating that the
stabilization effect of these key residues are quite small, which seems to contradict the
previous studies.315,316 It is worth noting that these residues are relatively far away from
the reactive center.
We then rationalized the water contributions from the SCC-DFTB/MM simulations of
IC for NanA, NanB and NanC by computing the water radial distribution functions around
the C2, HO7 and H32 position of sialic acid of NanA, NanB and NanC, respectively (Fig.
3.22, See Appendix for relevant details of NanB and NanC). For NanA, the overall values
9
3.5
12
15
18
21
3.0 24
27
30
33
rxn5(Å )
36
2.5 39
42
45
33 33
2.0
1.5
33
0.0 0.5 1.0 1.5 2.0
rxn3(Å )
Figure 3.19: Free energy contour plot for PathB. Energies are in kcal/mol. The white
Figure 3.20: Free energy contour plot for PathC. Energies are in kcal/mol. The white
40
35 33.5
30
27.6
27.3 22.4
25
TS2 22.1
20
21.7
PC
15
10
5 PathA
PathB
0.0
0 PathC
IC
-5
Figure 3.21: Free energy profiles for PathA(black), PathB(purple) and PathC(blue). This
free potential energy profile was based on SCC-DFTB/MM simulations.
for the coordination number for the water molecules around C2 position of the substrate
are higher than those for NanB and NanC, which means the reaction catalyzed by NanA
make the C2 of the sugar more exposed to water, and this is consistent with the hypothesis
that the water molecule can easily attack C2 position in NanA.173 In other words, the
influence of activated water molecule was supposed to have significant impacts on the
reaction mechanism of NanA, and Neu5Ac is the preferred product of NanA and PathA
is the ideal reaction pathway for NanA.
3.5 Conclusions
S. pneumoniae, as an important human pathogen, is large responsible for a range of dis-
ease states. Sialidases are one of the key virulence factors. The three S. pneumoniae sial-
idases, NanA, NanB and NanC adopt three different reaction mechanisms and produce
three distinct products. In this work, energy profiles based comparative MEP and PMF
SCC-DFTB/MM simulations for NanA along three reaction pathways (PathA, PathB and
PathC) were obtained. MEP and PMF calculations characterised the different interme-
diate states and product states for three catalytic mechanisms of NanA. Combined with
the results from DFT calculations, we show a lowest reaction barrier for the preferred
PathA in NanA, where the anomeric carbon C2 is supposed to be attacked by an activated
(a) (b) (c)

0.008
0.006
g(r)
0.004
0.002
20
0
(d) (e) (f)
15
NanA
Coor. No.
NanB
10 NanC
0
0 2 4 6 0 2 4 6 0 2 4 6
r(Å) r(Å) r(Å)
Figure 3.22: Water distribution around C2, HO7 and H32 in substrate from the SCC-
DFTB/MM simulations for NanA, NanB and NanC (a) Radial distribution function of
water oxygens around C2 in substrate in intermediate state of NanA, NanB and NanC;
(b) Radial distribution function of water oxygens around HO7 in substrate in intermediate
state of NanA, NanB and NanC; (c) Radial distribution function of water oxygens around
H32 in substrate in intermediate state of NanA, NanB and NanC; (d) the coordination
number for water around around C2 in substrate in intermediate state of NanA, NanB
and NanC; (e) the coordination number for water around around HO7 in substrate in
intermediate state of NanA, NanB and NanC; (f) the coordination number for water
around around H32 in substrate in intermediate state of NanA, NanB and NanC.
Table 3.4: Potential energies based on perturbation analyses of relevant residues with
SCC-DFTB/MM simulations. Energies are in kcal/mol.
IC TS2
different residues WT 0.0 27.0
Gln602 0.0 27.7
Tyr695 0.0 27.7
Asn645 0.0 27.8
Tyr777 0.0 28.2
Phe443 0.0 26.7
conserved catalytic residues Arg347 0.0 22.3
Arg663 0.0 34.1
Arg721 0.0 18.7
other identical residues Arg366 0.0 29.3
Asp417 0.0 29.8
Asp434 0.0 27.2
Tyr590 0.0 27.6
Ile416 0.0 27.0
Thr591 0.0 27.3
water molecule following the classical Koshland retaining meshanism. The information
of the characterized TS complex is expected to provide insights and guidances into in-
hibitor design. We hope this work can be of help for future computational studies of other
sialidases. For the other two S. pneumoniae sialidases, NanB and NanC, further compar-
ative MEP simulations were performed to probe their catalytic mechanisms. This will be
discussed in Chapter 4.
Chapter 4
Comparative studies of catalytic

pathways for S. pneumoniae NanB and
NanC
4.1 Abstract
S. pneumoniae takes large responsibility for severe otitis media, acute meningitis and sep-
ticaemia. It encodes up to three sialidases: NanA, NanB and NanC, that are promising
drug targets. NanB has been shown to act as an intramolecular trans-sialidase producing
2,7-anhydro-Neu5Ac, and NanC has been confirmed to first produce Neu5Ac2en which
can be slowly hydrolysed to Neu5Ac. The reaction mechanisms for these three pneumo-
coccal sialidases have been proposed by previous experimental studies. Their first steps
show a common reaction pathway, while the second steps present intriguing difference,
producing three different products. In Chapter 3, I have systematically studied the struc-
tures and energetic properties along all the potential pathways for NanA. In this Chap-
ter, by means of MEP calculations based on combined SCC-DFTB/MM, the hydrolysis
mechanisms and the most preferred routes of NanB and NanC have been determined.
4.2 Introduction
S. pneumoniae sialidase has been reviewed in previous chapters (Chapter 3). Here I briefly
review the key aspects relevant for this chapter.
S. pneumoniae is a major human pathogen for respiratory tract infections, meningi-
tis and septicaemia, and operate numerous cases of disease with a considerably high
mortality and incidence.153 S. pneumoniae expresses three distinct NAs: NanA, NanB
and NanC.173,177,297 A gene screening study showed that NanA, NanB and NanC genes
present in 100%, 96% and 51% of S. pneumoniae isolates respectively.177 NanA and
78
CHAPTER 4. THE CATALYTIC PATHWAYS OF NANB AND NANC 79
Tyr
Glu
O O-
O
d4 +
H
HO OH d3 H 2N
O Arg
Reaction coordinates: O H 2N
rxn1 = d1-d2 AcHN +
rxn2 = d3+d4 O-
HO OH d1
RO
d2
H
O O
Tyr
Asp
Glu
d6
O O H O
+
HO OH d5 O H 2N
Arg
O H 2N Tyr
AcHN O- +
HO HO d7
H Glu
O
rxn3 = d5-d6 H
rxn4 = d7+d8 H d6
HO O- d8 O O H O
+
HO OH d5 O H 2N Tyr
Arg
Asp O H 2N
AcHN O- +
Glu
O HO
H
PathA H O d9
d6
H O O H O
H +
COOH rxn3 = d5-d6 O- HO OH d5 O H 2N
OH O
HO rxn5 = d9 Arg
O H 2N
O OH AcHN O- +
AcHN Asp
HO HO d10
HO HO H
H
PathB rxn3 = d5-d6 d11
rxn6 = d10-d11
O O-
NHAc
O Asp
COOH
HO O
HO PathC
HO OH
HO
O COOH
AcHN
OH HO
Figure 4.1: The three predefined pathways for NanB and NanC. For the first step, d1
refers to the distance between C2 position of the sugar ring and the glycosidic O of
methyl group, d2 is the distance between the glycosidic O of methyl group and the as-
partic acid Hδ 2, d3 describes the distance between the tyrosine Oη and C2 position of
the sugar ring, and d4 is the one between the tyrosine Hη and the glutamic acid Oε2.
For the second steps, d5 is the distance between the tyrosine Oη and C2 position of the
substrate, d6 corresponds to the one between the tyrosine Oη and Hη, d7 is the one be-
tween C2 position of the sugar ring and the catalytic water oxygen, d8 is the one between
Asp372 Oε2 and catalytic water hydrogen, d9 defines the one between C2 position and
O7 of the sugar ring, d10 refers to the one between C3 and H32 in substrate, and d11 ac-
counts for that between H32 of the sugar ring and Asp372 Oε2. The reaction coordinates
rxn1, rxn6 are also illustrated in the figure. Tyr, Glu, Asp: Tyr653, Glu541, Asp270 in
NanB; Tyr695, Glu584, Asp315 in NanC.
NanB have been shown to play a crucial role in respiratory tract infection and sepsis.153
Moreover, considerable evidence indicated that NanA is essential to middle ear invasion,
nasopharynx and the incidence of otitis media effusion.178 Generally, S. pneumoniae sial-
idases have been shown to have great potential as drug targets or vaccine candidates.172
Recent studies have shown that the human upper respiratory epitheliums express α2,6-
sialylated glycans while many human bronchial epithelial cells possess α2,3- linked sialic
acid substrates.80,326 NanA would mainly responsible for the sialic acid removal,173 NanB
would be essential for the survival of the colonized bacteria,173,326 and NanC was sug-
gested to be a regulator of NanA.173,176 All these findings show that the three S. pneumo-
niae sialidases might play distinct roles in pneumococcal virulence.
Sequence identity between NanB and NanC is the greatest (over 50%), while it is very
limited between NanA and NanB (up to 24%). The crystal structures of NanA, NanB and
NanC have been solved and demonstrate conservation of their catalytic domain.174–176,180
These three sialidases have different substrate specificity. Previous studies175 have shown
NanA to be a typical hydrolytic sialidase that cleaves α2,3-, α2,6- and α2,8- linked sialic
acids, producing Neu5Ac. NanB acts as an intramolecular (IT)trans-sialidase that prefer-
entially cleaves α2,3- linked sialic acid substrates to release 2,7-anhydro-Neu5Ac.175,180
NanC is shown to be specific for α2,3- linked sialic acids producing the primary prod-
uct Neu5Ac2en that can be hydrolysed to Neu5Ac by attacking to C2 of the oxocarbo-
nium ion.176 A direct spectrophotometric study has shown that these three NA proteins
have slightly different pH optima (5.5-6.5 for NanA, 5.0-5.5 for NanB and 5.0-6.0 for
NanC).327
As depicted in Fig. 1.9, the residues involved in the key interactions with the substrate
(e.g., the nucleophilic tyrosine, the acid/base aspartic and the glutamic acid close to the
tyrosine, and the carboxylate group of sialic acids interacts with the tri-arginine cluster)
are conserved in NanA, NanB and NanC. The first steps to cleave sialosides show a com-
mon reaction pathway, while the final product formation steps are different to release
Neu5Ac, 2,7-anhydro-Neu5Ac and Neu5Ac2en, respectively, depending on attack at the
oxcarbonium ion C2 position by the catalytic water molecule (NanA), the substrate agly-
con O7 atom (NanB) or the carbohydrate accepter (NanC). In this chapter, we will explore
the hydrolysis mechanism of NanB and NanC with MEP simulations based on combined
SCC-DFTB/MM simulations. This work together with our previous study on NanA can
provide a more complete understanding of the mechanisms of actions of pneumococcal
NAs.
4.3 Methods
4.3.1 Simulation set up

The crystal structures of NanB (PDB code: 2VW1,175 resolution 2.39 Å) and NanC (PDB
code: 4YW3,176 resolution 2.05 Å) were used as the starting structures for all the sim-
ulations. The protonation states of all the titratable residues, particularly the catalytic
residues of NanB (Glu541, Tyr653 and Asp270) and NanC (Glu584, Tyr695 and Asp315),
were predicted based on the empirical method PROPKA305 which has been proved to be
capable of providing satisfactory pKa predictions and assigning proper protonation states
in our previous work.76 The Glu541 and Asp270 in the Michaelis complex (reactant com-
plex RC) of NanB were shown to be deprotonated and protonated, respectively. Similarly,
the Glu584 and Asp315 in RC of NanC were shown to be deprotonated and protonated,
respectively. This is consistent with their catalytic roles as a general base and general
acid in the first step from RC to the Intermediate complex (IC), respectively. All QM/
MM simulations were carried out with CHARMM (version c38a2).279 CHARMM force
field c36306–308 was used to describe the protein and sugars. The self-consistant charge
density functional tight binding (SCC-DFTB)225 was adopted to describe the quantum
region. The system was solvated with the TIP3P water model.309 The General Solvent
Boundary Potential (GSBP)310 setup was carried out in conjunction with explicit water
molecules in the inner sphere for the QM/MM simulation that includes: (i) a spherical
region of 20 Å radius centred on the glutamic acid (the general base in the first step,
Glu541 in NanB and Glu584 in NanC) Oε2 is treated as the inner region and the remain-
ing protein is treated as the outer region; (ii) a spherical region of radius 18 Å was defined
as reaction region where Newtonian dynamics are solved; (iii) the region between 18 Å
and 20 Å was treated as buffer region where Langevin dynamics are solved. A non-polar
cavity potential was adopted to confine the inner region water molecules to prevent them
leaving this region. The dielectric constants in the protein and solvent were assumed to be
1.0 and 80.0, respectively. The simulated temperature was set up to 298.15K. The bonds
involving hydrogen were constrained via the SHAKE algorithm.311 To treat the bonded
interactions at the QM/MM boundary, the bonds between the QM and MM regions were
cut by a link atom with the divided frontier charge model (DIV).282 Such a treatment
has been shown to provide a balanced description of proton affinities for molecules of
interest.270 The link atom was placed between Cβ and Cγ , Cβ and Cγ , and Cβ and Cα
for the glutamic acid, tyrosine and aspartic acid residue (Glu541, Tyr653 and Asp270 in
NanB; Glu584, Tyr695 and Asp315 in NanC), respectively. The whole substrate was also
included in the QM region. A full QM/MM minimization was performed prior to MD
simulations. Notably, the R group in the substrate was modified to a methyl group to min-
imize the computational cost. Furthermore, the puckering of the sugar was converted to a
boat conformation (2 B5 ) based on the Cremer-Pople puckering coordinates19 (See details
in 1.1.1.2) during the QM/MM minimization for consistency with the conformation in the
proposed catalytic mechanism. This was achieved by applying a harmonic restraint on
the reaction coordinates with the predefined reference values. Refer to Fig. 3.1 and Fig.
3.2 for more information. Table 4.1 summaries the relevant information for the starting
structures and equilibrium molecular dynamics simulations.
Table 4.1: Summary of the starting structures for the Reactant complex (RC) and Inter-
mediate complex (IC) of NanB and NanC, the equilibrium molecular dynamics simula-
tions (Eq.).
System PDB ID Substrate Number of Eq.(in ns) Eq.(in ns)

QM atomsb MM QM/MM
NanB RC 2VW1175 DANa 70 100 10
IC 2VW1175 DANa 67 10
NanC RC 4YW3176 DANa 70 100 10
IC 4YW3176 DANa 67 10
a DAN: 2-deoxy-2,3-dehydro-N-acetyleneuraminic acid; In the simulations, the inhibitors were converted

to their natural substrates; b The SCC-DFTB/MM partition boundary (the link atom) for all the simulations
was placed between Cβ and Cγ , Cβ and Cγ , and Cβ and Cα in residue Glu541, Tyr653 and Asp270 of
NanB, and in residue Glu584, Tyr695 and Asp315 of NanC, respectively. The DIV scheme was adopted as
it has been shown to provide the best representation of the system.270
4.3.2 MEP calculations

We assumed that both NanB and NanC could produce three promiscuous products Neu5Ac,
2,7-anhydro-Neu5Ac and Neu5Ac2en by following three distinct reaction pathways. A
comparative MEP simulation was performed along the different hypothetical reaction
pathways presented in Fig. 4.1. The reaction coordinates used to drive the reactions are
also shown in Fig. 4.1. From RC to IC, three reaction pathways share the same reaction
coordinates. d1 is the distance between the C2 position of the sugar ring and the glyco-
sidic O of the methyl group, d2 describes the one between the glycosidic O of the methyl
group and the aspartic acid residue (NanB, Asp270; NanC, Asp315) Hδ 2, d3 refers to the
distance between the tyrosine (NanB, Tyr653; NanC, Tyr695) Oη and C2 position of the
sugar ring, and d4 refers to the distance between the tyrosine Hη and the glutamic acid
Oε2 ((NanB, Tyr653 and Glu541; NanC, Tyr695 and Glu584). During the second steps
(from IC to the Product complex PC), d5 refers to the distance between the tyrosine Oη
and C2 position of the sugar ring, d6 corresponds to the one between the tyrosine Oη and
Hη (NanB, Tyr653; NanC, Tyr695), d7 is the one between C2 position of the sugar ring
and the catalytic water oxygen, d8 is the one between the aspartic acid residue (NanB,
Asp270; NanC, Asp315) Oε2 and catalytic water hydrogen, d9 defines the one between
C2 position and O7 of the sugar ring, d10 refers to the one between C3 and H32 in sub-
strate, and d11 accounts for that between H32 of the sugar ring and Oε2 of the aspartic
acid residue (NanB, Asp270; NanC, Asp315). The 2D energy profiles were then evalu-
ated, with four reaction coordinates. rxn1=d1-d2 and rxn2=d3+d4 were used driving the
first step from RC to IC. rxn3=d5-d6 and rxn4=d7+d8 were used to define the reaction
pathway from IC to PC Neu5Ac (PathA), rxn3=d5-d6 and rxn5=d9 were used to describe
the pathway from IC to PC 2,7-anhydro-Neu5Ac (PathB), and rxn3=d5-d6 and rxn6=d10-
d11 were used to describe the one from IC to PC Neu5Ac2en (PathC). rxn1 describes the
departure of the leaving group, rxn2 represents the proton transfer from the tyrosine to
the glutamic acid and the nucleophilic attack on the C2 position of the sugar ring by the
deprotonated tyrosine, rxn3 defines the cleavage of the tyrosine-sialic acid bond and the
protonation of tyrosine by a proton of the glutamic acid, rxn4 refers to the subsequent
nucleophilic attack at C2 position of the sugar ring with the deprotonated catalytic water
molecule by the aspartic acid residue, rxn5 and rxn6 describe the attack to the anomeric
carbon of the oxocarbonium ion by O7 of the sialyl cation and the carbohydrate accep-
tor, respectively (Glu541, Tyr653 and Asp270 in NanB; Glu584, Tyr695 and Asp315 in
NanC). Interestingly, similar to the catalytic mechanism for NanC (Fig. 1.9) proposed
by experimental studies, in which the C3 proton of the ligand is attacked by the catalytic
base aspartic acid residue directly due to a unusually short distance between the anomeric
carbon C3 and Asp315 and the surrounding hydrophobic pocket around the ligand (less
water molecules around),173,176 the reaction involved in the second steps of PathC for
NanB were undertaken without activated water molecules nearby (Fig. 4.1). We moni-
tored the water distributions around C2, HO7 and H32 of the substrate for NanB in the
IC (Fig. 3.22). We also performed MEP calculations on the reaction paths for the second
steps of NanB and NanC based on the mechanisms that have a catalytic water molecule
involved (Fig. B.2 and Fig. B.3, See details in Appendix), aiming to make a comparision
and verify the the mechanism of NanC proposed by the experimental results.176
4.4 Results and Discussions
4.4.1 Equilibrated structures from QM/MM simulations of RC

The equilibrated structures of RC from SCC-DFTB/MM simulations are shown in Fig.
4.2. The aspartic acid residues (NanB, Asp270; NanC, Asp315) are shown to be proto-
nated, and the glutamic acids (NanB, Glu541; NanC, Glu584) are deprotonated in RC.
The sugar rings are in a boat conformation (2 B5 ). For the key residues and substrates,
the heavy atom positional root-mean-square-deviations (RMSD) from the crystal struc-
tures were approximately 1.2 Å over the simulation time. Also, classical MM simulations
were carried out for RC to probe the key interactions. Encouragingly, both MM and
SCC-DFTB/MM simulations provide a consistent structure and suitable positions for the
(a)
Glu541
Arg557
Tyr653
Arg619
Arg245
WT Asp270
(b)
Arg600
Glu584
Tyr695
Arg662
Arg290
Asp315 WT
Figure 4.2: The final snapshots at 10ns from the explicit solvent SCC-DFTB/MM sim-
ulations. The key residues and substrate are shown explicitly. Thealiphatic hydrogen
atoms are omitted for clarity. The water molecules within approximately 5 Å of Oε2 of
Glu541 and Glu695 are shown in red, respectively. (a) NanB; (b) NanC.
catalytic reaction. The final snapshots from these equilibrium simulations (Table 4.1)
were taken as the initial structures for MEP calculations.
4.4.2 Potential energy profiles for NanB and NanC along three dif-
ferent reaction pathways
4.4.2.1 The first steps in NanB and NanC
The MEP calculations were carried out along those predefined reaction coordinates, in
which several cycles scanning from the reactant state and product states forward and back-
ward were performed to monitor the convergences. For the first step of NanB and NanC,
rxn1 was harmonically restrained from -0.13 to 2.51 Å and from -0.85 to 2.5 Å respec-
tively and rxn2 was scanned from 4.34 to 2.26 Å and from 5.29 to 2.33 Å, respectively.
These two reaction coordinates drive the nucleophilic attacks to the anomeric carbons of
the substrates and the proton transfers from tyrosine to glutamic acids (NanB, Tyr653 and
Glu541; NanC, Try695 and Glu584). The scans were constructed using a force constant
of 2000.0 kcal/mol/Å2 in steps of 0.08 Å. A total number of 209 and 347 energy points
have been obtained in the 2D MEP calculations for the first step in NanB and NanC, re-
spectively. The contour plots of the energy profiles was shown in Fig. 4.3 and Fig. 4.4,
describing the reaction pathways from RC to IC in NanB and NanC. The white dashed
lines highlighted the minimum energy paths that estimate the most likely pathways. Re-
garding the energetics, the reaction barriers for the first steps are 12.3 and 12.5 kcal/mol
for NanB and NanC, respectively. The similarity in the MEPs of NanA (See details in
Chapter 3), NanB and NanC also indirectly supports the common first step reaction path-
way.
4.4.2.2 The second steps in NanB and NanC
Subsequently 10 ns equilibrium SCC-DFTB/MM simulations were performed on the IC

structure obtained from the MEPs of the first steps of in NanB and NanC (Table 4.1).
Similarly, several cycles MEP scanning were performed based on the equilibrated IC
structures until the convergence occured. For the second steps along PathA of NanB and
NanC, rxn3 was scanned from -0.14 to 2.50 Å and from -0.03 to 2.45 Å while rxn4 was
restrained from 6.66 to 2.10 Å and from 6.87 to 2.07 Å, respectively. For the second steps
along PathB of the two cases, rxn3 was scanned from -0.12 to 2.36 Å and from -0.19 to
2.45 Å while rxn5 was sampled from 3.96 to 1.12 Å and from 4.12 to 1.10 Å separately.
rxn3 was harmonically restrained from -0.25 to 2.55 Å and from -0.19 to 2.45 Å while
rxn6 was scanned from -1.95 to 2.53 Å and from -1.85 to 2.47 Å for the second step along
PathC of NanB and NanC, respectively. The scans were done with a 0.08 Å step with a
harmonic force constant of 2000 kcal/mol/Å2 . The total number of energy points for the
6
4.0 9
12
15
18
12 21
24
3.5 27
30
33
rxn2(Å )
36
39
42
12
3.0 45
2.5
0.0 0.5 1.0 1.5 2.0 2.5
rxn1(Å )
Figure 4.3: Potential energy contour plot for the first reaction step of NanB. Energies
12 3
5.0 6
9
12
15
4.5 18
21
24
27
4.0 30
12
33
rxn2(Å )
36
39
3.5 42
45
3.0
2.5
-0.5 0.0 0.5 1.0 1.5 2.0 2.5
rxn1(Å )
Figure 4.4: Potential energy contour plot for the first reaction step of NanC. Energies
second step along PathA, PathB and PathC of NanB is 421, 251 and 442, respectively,
and for NanC, total 402, 302 and 385 energy points were obtained for the second steps
along the three distinct pathways.
The energy profiles for the second steps along PathA, PathB and PathC of NanB and
NanC are shown in Fig. 4.5, Fig. 4.6, Fig. 4.7, Fig. 4.8, Fig. 4.9 and Fig. 4.10,
respectively. The white dashed lines in the figures show the minimum energy paths from
IC to the transition complexes involved in the second steps (TS2) and then to PC. The
related reaction barriers for each path of NanB and NanC are separately summarized in
Fig. 4.11 and Fig. 4.12, from which we can clearly see that PathB is the smallest. This
implies PathB is the most possible route for NanB, and PathC is the preferred pathway
for NanC. These preferences are consistent with previous proposals.
0
9
6
12
15
18
21
24
5
27
30
33
rxn4(Å )
24 36
39
4 42
45
24
0.0 0.5 1.0 1.5 2.0 2.5
rxn3(Å )
Figure 4.5: Potential energy contour plot for the 2nd step along PathA of NanB. Energies
4.4.3 Description of the mechanisms for NanB and NanC from the
reactant state to the product state
Both NanB and NanC MEPs proceed involve a so-called AN DN type mechanism, which
can be divided into the associative or dissociative classification. It indicates that the reac-
9
3.5
12
15
18
21
3.0 24
27
30
33
rxn5(Å )
2.5 36
15
39
42
45
2.0 15
1.5
0.0 0.5 1.0 1.5 2.0
rxn3(Å )
Figure 4.6: Potential energy contour plot for the 2nd step along PathB of NanB. Energies
2.5 0
3
6
2.0
9
12
15
1.5
18
21
1.0 24
27
27 30
0.5 33
rxn6(Å )
36
39
0.0
27 42
45
-0.5
-1.0
-1.5
0.0 0.5 1.0 1.5 2.0 2.5
rxn3(Å )
Figure 4.7: Potential energy contour plot for the 2nd step along PathC of NanB. Energies
6 12
15
18
21
24
5 27
30
33
rxn4(Å )
36
39
18
4 42
45
18
0.0 0.5 1.0 1.5 2.0
rxn3(Å )
Figure 4.8: Potential energy contour plot for the 2nd step along PathA of NanC. Energies
0
4.0
3
12
3.5
15
18
24 21
24
3.0
27
30
33
rxn5(Å )
36
2.5
39
42
45
2.0
24
1.5
0.0 0.5 1.0 1.5 2.0
rxn3(Å )
Figure 4.9: Potential energy contour plot for the 2nd step along PathB of NanC. Energies
0
3
6
2.0
9
12
1.5 15
12
18
21
1.0 24
27
30
0.5 33
rxn6(Å )
36
39
0.0
12 42
45
-0.5
-1.0
-1.5
0.0 0.5 1.0 1.5 2.0
rxn3(Å )
Figure 4.10: Potential energy contour plot for the 2nd step along PathC of NanC. En-
ergies are in kcal/mol. The white dashed line illustrates the minimum potential energy
path.
30
28.3
25.4
25
22.3
20
16.6
15.2
15
TS2
12.3
TS1
10
5.3
5
PC
0.0 0.0 PathA

0
RC IC PathB
PathC
-5
Figure 4.11: Potential energy profiles for PathA (black), PathB (purple) and PathC (blue)
of NanB. This potential energy profile was based on SCC-DFTB/MM simulations.
30
25 24.7
20 19.6
16.6
15
13.4
12.5 12.6
11.3
TS1 TS2
10 PC
PathA
0.0 0.4
0 PathB
RC IC
PathC
-5
Figure 4.12: Potential energy profiles for PathA (black), PathB (purple) and PathC (blue)
of NanC. This potential energy profile was based on SCC-DFTB/MM simulations.
tion coordinates rxn1 and rxn2, rxn3 and rxn4, rxn3 and rxn5, rxn3 and rxn6 are highly
correlated. To describe the mechanisms, the 2D More O’Ferral-Jencks style diagrams313
based on the Pauling bond order314 were created (See Eq. 3.1 in Chapter 3 for more
information).
Fig. 4.13 and Fig. 4.14 present the More O’Ferral-Jencks plots for the critical points for
the formations of IC during the sialidase activity of NanB and NanC. The axes represent
the Pauling bond orders for the tyrosine (Tyr653 in NanB and Tyr695 in NanC) Oη-sialyl
bond making and O (leaving group)-sialyl bond breaking. One can clearly see that the
leaving methyl group departs substantially when the TS1 is reached, indicating that the
first steps of both systems follow a dissociative AN DN mechanism.
The More O’Ferral-Jencks plots for the PC formations of the two cases are shown
from Fig. 4.15 to Fig. 4.20, where axes represent the Pauling bond orders from C2 at
the axial position to the catalytic water or the tyrosine residues (Tyr653 in NanB and
Tyr695 in NanC). As depicted in the Figures for both NanB and NanC along PathA (Fig.
4.15 and Fig. 4.18 , the tyrosine Oη-sialyl bond is already cleaved while the O (water)-
sialyl bond formation is beginning, indicating that the hydrolysis reaction of the IC along
PathA involve a dissociate AN DN mechanism. Similarly, Fig. 4.16 and Fig. 4.19 show
that in PathB of the two systems the tyrosine Oη-sialyl bond has been cleaved before the
formation of the O7 (sialyl)-C2 (sialyl) bond is reached. However, unlike the reaction
of PathB of NanA where a proton has been transferred from the catalytic water to the
Asp372 while the water molecule is protonated by HO7 of the sialyl cation (Fig. 3.12),
no catalytic water molecules were observed to be involved in the reactions of NanB and
NanC, but interestingly, the HO7 of sialyl cation is transferring or has been transferred to
the aspartic acid residues (Asp270 in NanB and Asp315 in NanC). Fig. 4.17 and Fig. 4.20
reveal that in PathC the tyrosine Oη-sialyl bond is broken before the the TS2 is reached.
In the mean time, the proton transfers between the aspartic acid residues and the catalytic
water molecule has happened.
Considering all of these results, it can be concluded that all the pathways of both cases
are shown to follow a dissociative AN DN mechanism, which indicates the loss of bonding
in the TS has progressed further than bond making. This dissociative mechanism has been
observed in the free energy study of the catalytic mechanisms of TcTS and TrSA.315,316
In addition, it is worth noting that the proton transfers between the glutamic acids and
the tyrosines have not completed at the PC states along all the pathways (Fig. 4.15 to
Fig. 4.20, See details about the subsequent reactions in Appendix). We will focus on
the energy profiles shown above in the following context as we are more interested in
the formation of the different conformations of the substrate during the three reaction
pathways.
The values for the distances at the critical points are presented in Table 4.2, Table
4.3, Table 4.4 and Table 4.5. At the TS1, for both systems, the proton transfer between
Figure 4.13: The More O’Ferral-Jencks plot for the IC formation of NanB. The axes
are the Pauling bond orders from the anomeric C to the nucleophile (n(C2sial -Onuc ) ) and
the leaving group (n(C2sial -Olg )) . The RC, TS1, and IC configuration are shown in the
water molecules are shown in red with CPK.
Figure 4.14: The More O’Ferral-Jencks plot for the IC formation of NanC. The axes
are the Pauling bond orders from the anomeric C to the nucleophile (n(C2sial -Onuc ) ) and
the leaving group (n(C2sial -Olg )) . The RC, TS1, and IC configuration are shown in the
water molecules are shown in red with CPK.
Figure 4.15: The More O’Ferral-Jencks plot for the PC formation in PathA of NanB. The
axes are the Pauling bond orders from the anomeric C to the water molecule (n(C2sial -
Owat ) ) and residue Tyr653(n(C2sial -OHtyr653 )). The IC, TS2, and PC configuration are
HO7
Figure 4.16: The More O’Ferral-Jencks plot for the PC formation in PathB of NanB.
The axes are the Pauling bond orders from the anomeric C to the O-7 atom of the sub-
strate (n(C2sial -O7sial ) ) and residue Tyr653 (n(C2sial -OHtyr653 )). The IC, TS2, and PC
configuration are shown in the bottom left, top left, and top right, respectively. See Fig.
4.2 for more information. The proposed catalytic water molecule is shown in Licorice
and the nearby water molecules are presented in red with CPK. However, Asp270 is
protonated by HO7 of the substrate directly. See details in Section 2.4.3 of this Chapter.
Figure 4.17: The More O’Ferral-Jencks plot for the PC formation in PathC of NanB.
The axes are the Pauling bond orders from the anomeric C to the H-3 atom of the sub-
strate (n(C2sial -H32sial )) and residue Tyr653 (n(C2sial -OHtyr653 )). The IC, TS2, and PC
4.2 for more information. The catalytic water molecule is shown in Licorice and the
nearby water molecules are presented in red with CPK.
Figure 4.18: The More O’Ferral-Jencks plot for the PC formation in PathA of NanC. The
axes are the Pauling bond orders from the anomeric C to the water molecule (n(C2sial -
Owat ) ) and residue Tyr695 (n(C2sial -OHtyr695 )). The IC, TS2, and PC configuration are
HO7
HO7
Figure 4.19: The More O’Ferral-Jencks plot for the PC formation in PathB of NanC.
The axes are the Pauling bond orders from the anomeric C to the O-7 atom of the sub-
strate (n(C2sial -O7sial ) ) and residue Tyr695 (n(C2sial -OHtyr695 )). The IC, TS2, and PC
4.2 for more information. The proposed catalytic water molecule is shown in Licorice
and the nearby water molecules are presented in red with CPK. However, Asp315 is pro-
tonated by HO7 of the substrate (label in the figure) directly. See details in Section 2.4.3
of this Chapter.
Figure 4.20: The More O’Ferral-Jencks plot for the PC formation in PathC of NanC.
The axes are the Pauling bond orders from the anomeric C to the H-3 atom of the sub-
strate (n(C2sial -H32sial )) and residue Tyr695 (n(C2sial -OHtyr695 )). The IC, TS2, and PC
4.2 for more information. The catalytic water molecule is shown in Licorice and the
nearby water molecules are presented in red with CPK.
the tyrosine and the glutamic acid residue is almost completed (d4≈1.14 Å and 1.21 Å,
respectively), and the tyrosine hydroxyl oxygen is moving closer to the anomeric car-
bon of sialic acid. Similar to NanA, both Tyr653 and Tyr695 in the two cases are more
negatively charged, which enhances their nucleophilic potential (See Chapter 3 for more
information). The distance from the hydroxyl oxygen of the nucleophilic tyrosine to the
C2 position of the substrate in NanB is shorter than that in NanA and NanC by 0.16 and
0.40 Å, respectively, making Tyr653 of NanB a better nucleophile (See details in Ap-
pendix). Additionally, the tyrosine Oη-sialyl is already cleaved (d1≈2.24 Å and 2.10 Å,
respectively) and the aspartic acid residue proton has been transferred to the methyl group
(d2≈1.05 Å and 1.02 Å, respectively). The conformational changes of the sugar ring in
NanB and NanC show similar results with NanA, that the 2 B5 in RC change to the 4 H5 in
TS1 and finally reaches a 2C5 chair conformation at the IC (See Chapter 3 for more infor-
mation). This is consistent with the catalytic pathway reported in the previous studies for
the retaining (trans)sialidases.23,173,318
At the IC for NanB and NanC, the distance between C2 position of the sugar ring and
the catalytic water oxygen is 4.58 Å and 5.06 Å, respectively. These are greater than 4 Å
which has been shown to be an ideal position for nucleophilic attack by a water molecule
and the proton transfer in NanA (See Chapter 3 for more information). This is due to
the more hydrophobic pockets surrounding the catalytic clefts of NanB and NanC. At
the TS2 in PathA, the tyrosine-sialic acid bonds have been cleaved (d5≈2.83 Å and 2.47
Å, respectively) while the proton transfers between the glutamic acids and the tyrosines
are about to happen (d6≈1.21 Å and 1.22 Å, respectively). This proton transfer is also
observed at the TS2 in PathC while the O7 (sialyl)-C2 (sialyl) bond formation is beginning
in PathB.
NanB.
RC TS1 IC
d1 1.52 2.24 3.53
d2 1.65 1.05 1.02
d3 2.99 2.24 1.57
d4 1.27 1.14 1.01
4.4.4 Single point calculations for the preferred routs of NanB and
NanC
It has been shown that M06-2X/6-31+G(d,p) provides a good performance in the deter-
mination of energy barrier for several similar systems.323,324,328,329 This has also been
confirmed in our previous work on NanA (Chapter 3), predicting a correct reaction path-
NanC.
RC TS1 IC
d1 1.49 2.10 3.49
d2 2.34 1.02 0.98
d3 3.49 2.64 1.57
d4 1.24 1.21 1.00
Table 4.4: Reaction coordinate distances (Å) for IC, TS2 and PC for the second step of
NanB along PathA, PathB and PathC, respectively.
IC TS2 PC
PathA d5 1.52 2.83 3.06
d6 1.66 1.21 1.20
d7 4.58 1.92 1.48
d8 2.08 1.39 0.98
PathB d5 1.52 2.19 2.91
d6 1.64 1.59 1.19
d9 3.96 2.01 1.52
PathC d5 1.52 2.34 2.71
d6 1.77 1.31 1.19
d10 1.09 1.56 2.40
d11 3.04 1.10 0.99
Table 4.5: Reaction coordinate distances (Å) for IC, TS2 and PC for the second step of
NanC along PathA, PathB and PathC, respectively.
IC TS2 PC
PathA d5 1.58 2.47 2.97
d6 1.61 1.22 1.16
d7 5.06 2.00 1.52
d8 1.81 1.51 1.10
PathB d5 1.62 2.51 2.90
d6 1.81 1.34 1.20
d9 4.12 1.87 1.55
PathC d5 1.62 2.48 2.67
d6 1.81 1.23 1.18
d10 1.09 1.59 2.49
d11 2.92 1.05 0.98
way for NanA and providing more satisfactory energies than SCC-DFTB (See Chapter
3 for more information). Here, we performed single point calculations on the structures
from the optimized MEP results based on the SCC-DFTB/MM simulations. The QM re-
gions were treated with M06-2X with basis set 6-31+G(d,p). The potential energy profiles
for the second step for PathB of NanB and for PathC for NanC based on M06-2X/MM
calculations are shown in Fig. 4.21 and Fig. 4.22. The related reaction barrier is 25.9
and 17.8 kcal/mol, respectively. The potential energy difference between SCC-DFTB
and M06-2X/6-31+G(d,p) is around 10.7 for PathB of NanB and 5.2 for PathC of NanC.
The larger error for PathB of NanB is probably due to the larger changes observed in the
structures (e.g. additional ring formation) during the reactions. This is consistent with
the results for NanA, which shows larger errors in the case of PathB. This indicates that
SCC-DFTB has non-negligible errors in estimating reaction energetics and future work is
required to gauge the potential errors.
0
9
3.5
12
15
18
21
3.0 24
24 27
30
33
rxn5(Å )
2.5 36
39
42
45
2.0
24
1.5
0.0 0.5 1.0 1.5 2.0
rxn3(Å )
Figure 4.21: Potential energy contour plot for PathB of NanB. This potential energy was
obtained from single point calculation with M06-2X/6-31+G(d,p) based on the SCC-
DFTB/MM MEP. Energies are in kcal/mol. The white dashed line illustrates the mini-
mum potential energy path.
0
3
6
2.0
9
12
1.5 15
18
21
1.0 24
27
30
0.5 15 33
rxn6(Å )
36
39
0.0
42
45
15
-0.5
-1.0
-1.5
0.0 0.5 1.0 1.5 2.0
rxn3(Å )
Figure 4.22: Potential energy contour plot for PathC of NanC. This potential energy was
obtained from single point calculation with M06-2X/6-31+G(d,p) based on the SCC-
DFTB/MM MEP. Energies are in kcal/mol. The white dashed line illustrates the mini-
mum potential energy path.
4.5 Conclusions
S. pneumoniae is a major human pathogen, which takes large responsibility for a range of
disease states. It encodes up to three S. pneumoniae sialidases, NanA, NanB and NanC,
that adopt three different reaction mechanisms and produce three distinct products. In
this work, energy profiles based on comparative MEP SCC-DFTB/MM simulations for
NanB and NanC along three reaction pathways (PathA, PathB and PathC) were obtained.
Based on the MEP calculations, the mechanisms for NanB and NanC from the reactant
state to the product state along the reaction pathways were characterized. NanC operates
via a consistent mechanism proposed by experiments. It is worth pointing out that the
mechanism of NanB is different from the previously proposed one which had a catalytic
water in the second step. We proposed that no catalytic water molecules were involved
in the second step. This is supported by the hydrophobic environment at the active sites.
Combined with the results from DFT calculations, we show a lowest reaction barrier for
the prefered PathB in NanB and PathC in NanC, respectively. The detailed characteriza-
tion of the TS complex will provide insights into inhibitor design. We hope this work will
be helpful for understanding the catalytic mechanisms of other sialidases.
Chapter 5
Summary and future work
5.1 Concluding remarks

In this thesis, I focused on two important aspects of the reactivities of GHs: pKa shifts
of key residues and the similarity and difference among a class of important sialidases.
In Chapter 2, a combined computational study based on both atomistic simulations and
empirical models to predict pKa shifts for the general acid/base Glu172 and other key
residues in Bacillus circulans xylanase (BcX) were presented. The pKa values obtained
from quantum mechanics/molecular mechanics (QM/MM) thermodynamic integration
(TI), the multiconformation continuum electrostatics (MCCE) and PROPKA methods
show a good agreement with available experimental data. Particularly, QM/MM TI and
MCCE DEFAULT provide satisfactory predictions with relatively lower RMSD values,
suggesting that the computationally more expensive QM/MM TI and MCCE DEFAULT
methods can be used to deal with pKa shifts for a small number of catalytic residues
along the catalytic pathways, while the highly efficient PROPKA method can be adopted
for other residues for its balance between efficiency and accuracy. The mutations of the
residues nearby the catalytic Glu172, the solvation environment and the effects of the
change in its global charge were compared through perturbative analyses, showing that
pKa s of Glu172 are affected more by the local electrostatic environment including the
nearby residue 35 and the solvent interaction, than by the global charge.
In Chapters 3 and 4, a comparative study was carried out for three S. pneumoniae sial-
idases, NanA, NanB and NanC. Previous studies have indicated that they are promising
drug targets. At the same time, extensive biochemical and structural studies have shown
that they have distinct reaction pathways despite having very similar active sites. The
reaction barriers and energies for the three sialidases along the three reaction pathways
(PathA, PathB and PathC) based on the QM/MM simulations were obtained. We con-
firmed the lowest reaction barriers for their preferred pathways (PathA for NanA, PathB
for NanB and PathC for NanC). Also the mechanisms for NanA, NanB and NanC from
105
CHAPTER 5. SUMMARY AND FUTURE WORK 106
the reactant state to the product state were characterised. NanA is found to act as a clas-
sic hydrolytic sialidase following the Koshland retaining mechanism, where the anomeric
carbon C2 is attacked by an activated water molecule. NanB functions as an IT trans-
sialidase, in which the oxocarbonium ion C2 is attacked by the sialyl cation O7 atom.
The C2 position of the substrate of NanC is shown to be attacked by the carbohydrate
acceptor, producing Neu5Ac2en. These results are consistent with the mechanisms pro-
posed by experiments. Interestingly, we found that there is no catalytic water molecules
involved in the second step of NanB, which is different from the previously proposed
mechanism. This is supported by the analysis of the active site structures.
5.2 Future work and outlook

Current studies utilize the efficient approximate implementation of DFTB/MM within the
GSBP framework. The quality of such simulations critically depends on at least three
factors. Firstly, the accuracy of the adopted QM method. Our validation studies based on
M06-2X indicates that there are non-negligible errors in several cases. Possible solutions
will include development of special reaction parameters (SRP) as it has been done for
SCC-DFTB to study phosphate chemistry.264 Another alternative approach would be to
carry out free energy perturbation from the PMF based on DFTB/MM to high level of
theory.247 Secondly, the convergence in conformational sampling. As mostly localized
reactions were studied in the current work, it might be expected that convergence in sam-
pling is not a critical issue. For reaction energetics, current work is mainly based on either
MEP and PMF. In general cases, more advanced sampling techniques can be applied to
speed up the sampling. As mentioned in Chapter 1, there are several other powerful meth-
ods to obtain PMF, such as local elevation/metadynamics,259,260 the string method,263,264
and transition path sampling.265–267 In future work, it would be very interesting to exam-
ine their performance in our case to probe the potential bias introduced by the predefined
reaction coordinates. Finally, a balanced description between QM/MM interactions. In
the current work, non-polarizable force fields have been adopted. In order to accurately
capture the electrostatic interactions in enzymatic systems, polarizable force fields might
be required.330,331 These issues will be explored in future work.
CAZymes are fascinating biomolecular machines to study from both a fundamental
science and a medicinal application point of view. Current studies can be extended to
other classes of GHs, partically those with interesting catalysis or unsettled mechanistic
debates in the literature. For instance, usually GHs display substrate specificity, however,
promiscuity has been identified. GH4 6-phospho-α-glucosidase from Basillus subtilis,
as an extreme example, contains both α- and β -glycosidases.332 There are some other
GHs that act by unknown mechanisms; for instance, two GH61 members, that stimulate
plant cell wall degradation, were found to have no typical catalytic centers or potential
CHAPTER 5. SUMMARY AND FUTURE WORK 107
catalytic residues.333 Additionally, the catalytic mechanisms for many GH families have
not been reported, suggesting that some large-scale systematic studies of GHs would be of
great interest. Furthermore, the characterization of the intermediate and transition states
also provides important insights for drug discovery. Many GHs have been validated or
proposed as potential drug targets,6,334,335 for which the detailed study of their structure
and electronic properties could guide more effective inhibitor design.
Bibliography
(1) V. Lombard, H. Golaconda Ramulu, E. Drula, P. M. Coutinho and B. Henrissat,

“The carbohydrate-active enzymes database (CAZy) in 2013”, Nucleic Acids Re-
search, 2013, 42, D490–D495.
(2) L. Brown, J. M. Wolf, R. Prados-Rosales and A. Casadevall, “Through the wall:
Extracellular vesicles in Gram-positive bacteria, mycobacteria and fungi”, Nature
Reviews Microbiology, 2015, 13, 620–630.
(3) S. S. Pinho and C. A. Reis, “Glycosylation in cancer: Mechanisms and clinical
implications”, Nature Reviews Cancer, 2015, 15, 540–555.
(4) S. A. Springer and P. Gagneux, “Glycomics: Revealing the dynamic ecology and
evolution of sugar molecules”, Journal of Proteomics, 2016, 135, 90–100.
(5) K. S. Egorova and P. V. Toukach, “CSDB GT: a new curated database on glyco-
syltransferases”, Glycobiology, 2017, 27, 285–290.
(6) T. M. Gloster and D. J. Vocadlo, “Developing inhibitors of glycan processing
enzymes as tools for enabling glycobiology”, Nature Chemical Biology, 2012, 8,
683–694.
(7) S. J. Williams and G. J. Davies, “Protein–carbohydrate interactions: Learning
lessons from nature”, TRENDS in Biotechnology, 2001, 19, 356–362.
(8) D. B. Wilson, “Cellulases and biofuels”, Current Opinion in Biotechnology, 2009,
20, 295–299.
(9) P. K. Busk, B. Pilgaard, M. J. Lezyk, A. S. Meyer and L. Lange, “Homology to
peptide pattern for annotation of carbohydrate-active enzymes and prediction of
function”, BMC Bioinformatics, 2017, 18, 214.
(10) C. Consortium, “Ten years of CAZypedia: A living encyclopedia of carbohydrate-
active enzymes”, Glycobiology, 2017, 28, 3–8.
(11) I. André, G. Potocki-Véronèse, S. Barbe, C. Moulis and M. Remaud-Siméon,
“CAZyme discovery and design for sweet dreams”, Current Opinion in Chemical
Biology, Apr. 2014, 19, 17–24.
108
BIBLIOGRAPHY 109
(12) R. D. Finn, P. Coggill, R. Y. Eberhardt, S. R. Eddy, J. Mistry, A. L. Mitchell,

S. C. Potter, M. Punta, M. Qureshi, A. Sangrador-Vegas et al., “The Pfam protein
families database: Towards a more sustainable future”, Nucleic Acids Research,
2016, 44, D279–D285.
(13) S. F. Altschul, W. Gish, W. Miller, E. W. Myers and D. J. Lipman, “Basic local
alignment search tool”, Journal of Molecular Biology, 1990, 215, 403–410.
(14) Y. Yin, X. Mao, J. Yang, X. Chen, F. Mao and Y. Xu, “dbCAN: A web resource
for automated carbohydrate-active enzyme annotation”, Nucleic Acids Research,
2012, 40, W445–W451.
(15) J. F. Stoddart, Stereochemistry of carbohydrates, Wiley-Interscience, 1971.
(16) S.-M. Rings, “Conformational nomenclature for five and six-membered ring forms
of monosaccharides and their derivatives”, Eur J Biochem, 1980, 100, 295–298.
(17) L. Raich, A. Nin-Hill, A. Ardèvol and C. Rovira, “Chapter seven-enzymatic cleav-
age of glycosidic bonds: Strategies on how to set up and control a QM/MM meta-
dynamics simulation”, Methods in Enzymology, 2016, 577, 159–183.
(18) A. P. Montgomery, K. Xiao, X. Wang, D. Skropeta and H. Yu, in Advances in
Protein Chemistry and Structural Biology, Elsevier, 2017, vol. 109, pp. 25–76.
(19) D. Cremer and J. Pople, “General definition of ring puckering coordinates”, Jour-
nal of the American Chemical Society, 1975, 97, 1354–1358.
(20) X. Biarnés, A. biarnes, A. Planas, C. Rovira, A. Laio and M. Parrinello, “The con-
formational free energy landscape of β -D-glucopyranose. Implications for sub-
strate preactivation in β -glucoside hydrolases”, Journal of the American Chemi-
cal Society, 2007, 129, 10686–10693.
(21) H. B. Mayes, L. J. Broadbelt and G. T. Beckham, “How sugars pucker: Electronic
structure calculations map the kinetic landscape of five biologically paramount
monosaccharides and their implications for enzymatic catalysis”, Journal of the
American Chemical Society, 2014, 136, 1008–1022.
(22) C. B. Barnett and K. J. Naidoo, “Ring puckering: a metric for evaluating the accu-
racy of AM1, PM3, PM3CARB-1, and SCC-DFTB carbohydrate QM/MM simu-
lations”, The Journal of Physical Chemistry B, 2010, 114, 17142–17154.
(23) G. J. Davies, A. Planas and C. Rovira, “Conformational analyses of the reaction
coordinate of glycosidases”, Accounts of Chemical Research, 2011, 45, 308–316.
(24) B. M. Sattelle and A. Almond, “Less is more when simulating unsulfated gly-
cosaminoglycan 3D-structure: Comparison of GLYCAM06/TIP3P, PM3-CARB1/TIP3P,
and SCC-DFTB-D/TIP3P predictions with experiment”, Journal of Computa-
tional Chemistry, 2010, 31, 2932–2947.
BIBLIOGRAPHY 110
(25) N. S. Gandhi and R. L. Mancera, “Can current force fields reproduce ring puck-
ering in 2-O-sulfo-α-l-iduronic acid? A molecular dynamics simulation study”,
Carbohydrate Research, 2010, 345, 689–695.
(26) K. Govender, J. Gao and K. J. Naidoo, “AM1/d-CB1: A semiempirical model for
QM/MM simulations of chemical glycobiology systems”, Journal of Chemical
theory and Computation, 2014, 10, 4694–4707.
(27) L. Pol-Fachin, V. H. Rusu, H. Verli and R. D. Lins, “GROMOS 53A6GLYC, an
improved GROMOS force field for hexopyranose-based carbohydrates”, Journal
of Chemical Theory and Computation, 2012, 8, 4681–4690.
(28) L. Lairson, B. Henrissat, G. Davies and S. Withers, “Glycosyltransferases: Struc-
tures, functions, and mechanisms”, Annual Review of Biochemistry, 2008, 77,
521–555.
(29) C. S. Rye and S. G. Withers, “Glycosidase mechanisms”, Current Opinion in
Chemical Biology, 2000, 4, 573–580.
(30) D. E. Koshland, “Stereochemistry and the mechanism of enzymatic reactions”,
Biological Reviews, 1953, 28, 416–436.
(31) P. J. Berti and K. S. Tanaka, “Transition state analysis using multiple kinetic iso-
tope effects: mechanisms of enzymatic and non-enzymatic glycoside hydrolysis
and transfer”, Advances in Physical Organic Chemistry, 2002, 37, 239–314.
(32) J. K. Lee, A. D. Bain and P. J. Berti, “Probing the transition states of four gluco-
side hydrolyses with 13C kinetic isotope effects measured at natural abundance
by NMR spectroscopy”, Journal of the American Chemical Society, 2004, 126,
3769–3776.
(33) J. M. Stubbs and D. Marx, “Glycosidic bond formation in aqueous solution:
On the oxocarbenium intermediate”, Journal of the American Chemical Society,
2003, 125, 10960–10962.
(34) D. J. Vocadlo, G. J. Davies, R. Laine and S. G. Withers, “Catalysis by hen egg-
white lysozyme proceeds via a covalent intermediate”, Nature, 2001, 412, 835–
838.
(35) D. L. Zechel and S. G. Withers, “Glycosidase mechanisms: anatomy of a finely
tuned catalyst”, Accounts of Chemical Research, 2000, 33, 11–18.
(36) X. Biarnés, J. Nieto, A. Planas and C. Rovira, “Substrate distortion in the Michaelis
complex of Bacillus 1, 3–1, 4-β -glucanase insight from first principles molecular
dynamics simulations”, Journal of Biological Chemistry, 2006, 281, 1432–1441.
BIBLIOGRAPHY 111
(37) V. Spiwok, B. Králová and I. Tvaroška, “Modelling of β -D-glucopyranose ring

distortion in different force fields: a metadynamics study”, Carbohydrate Re-
search, 2010, 345, 530–537.
(38) G. Davies, V.-A. Ducros, A. Varrot and D. Zechel, “Mapping the conformational
itinerary of β -glycosidases by X-ray crystallography”, Biochemical Society Trans-
actions, 2003, 31, 523–527.
(39) S. Fushinobu, B. Mertz, A. D. Hill, M. Hidaka, M. Kitaoka and P. J. Reilly, “Com-
putational analyses of the conformational itinerary along the reaction pathway
of GH94 cellobiose phosphorylase”, Carbohydrate Research, 2008, 343, 1023–
1033.
(40) D. M. Guérin, M.-B. Lascombe, M. Costabel, H. Souchon, V. Lamzin, P. Béguin
and P. M. Alzari, “Atomic (0.94 Å) resolution structure of an inverting glycosidase
in complex with substrate”, Journal of Molecular Biology, 2002, 316, 1061–1069.
(41) E. Sabini, G. Sulzenbacher, M. Dauter, Z. Dauter, P. L. Jørgensen, M. Schülein,
C. Dupont, G. J. Davies and K. S. Wilson, “Catalysis and specificity in enzymatic
glycoside hydrolysis: A 2, 5B conformation for the glycosyl-enzyme intermedi-
ate revealed by the structure of the Bacillus agaradhaerens family 11 xylanase”,
Chemistry & Biology, 1999, 6, 483–492.
(42) L. E. Tailford, W. A. Offen, N. L. Smith, C. Dumon, C. Morland, J. Gratien,
M.-P. Heck, R. V. Stick, Y. Blériot, A. Vasella et al., “Structural and biochemical
evidence for a boat-like transition state in β -mannosidases”, Nature Chemical
Biology, 2008, 4, 306–312.
(43) A. Vasella, G. J. Davies and M. Böhm, “Glycosidase mechanisms”, Current Opin-
ion in Chemical Biology, 2002, 6, 619–629.
(44) S. A. Jongkees and S. G. Withers, “Unusual enzymatic glycoside cleavage mech-
anisms”, Accounts of Chemical Research, 2013, 47, 226–235.
(45) D. J. Vocadlo and G. J. Davies, “Mechanistic insights into glycosidase chemistry”,
Current Opinion in Chemical Biology, 2008, 12, 539–555.
(46) M. E. Caines, S. M. Hancock, C. A. Tarling, T. M. Wrodnigg, R. V. Stick, A. E.
Stütz, A. Vasella, S. G. Withers and N. C. Strynadka, “The structural basis of
Glycosidase Inhibition by five-membered iminocyclitols: The clan a Glycoside
Hydrolase endoglycoceramidase as a model system”, Angewandte Chemie Inter-
national Edition, 2007, 46, 4474–4476.
BIBLIOGRAPHY 112
(47) V. M.-A. Ducros, D. L. Zechel, G. N. Murshudov, H. J. Gilbert, L. Szabó, D.

Stoll, S. G. Withers and G. J. Davies, “Substrate distortion by a β -Mannanase:
Snapshots of the Michaelis and covalent-intermediate complexes suggest a B2, 5
conformation for the transition state”, Angewandte Chemie International Edition,
2002, 41, 2824–2827.
(48) L. E. Tailford, V. A. Money, N. L. Smith, C. Dumon, G. J. Davies and H. J.
Gilbert, “Mannose foraging by Bacteroides thetaiotaomicron structure and speci-
ficity of the β -mannosidase, BtMan2A”, Journal of Biological Chemistry, 2007,
282, 11291–11299.
(49) W. A. Offen, D. L. Zechel, S. G. Withers, H. J. Gilbert and G. J. Davies, “Structure
of the Michaelis complex of β -mannosidase, Man2A, provides insight into the
conformational itinerary of mannoside hydrolysis”, Chemical Communications,
2009, 2484–2486.
(50) A. Ardèvol and C. Rovira, “Reaction mechanisms in carbohydrate-active enzymes:
Glycoside hydrolases and glycosyltransferases. Insights from ab initio quantum
mechanics/molecular mechanics dynamic simulations”, Journal of the American
Chemical Society, 2015, 137, 7528–7547.
(51) A. J. Thompson, G. Speciale, J. Iglesias-Fernández, Z. Hakki, T. Belz, A. Cart-
mell, R. J. Spears, E. Chandler, M. J. Temple, J. Stepper et al., “Evidence for a
boat bonformation at the transition state of GH76 α-1, 6-mannanases —key en-
zymes in bacterial and fungal mannoprotein metabolism”, Angewandte Chemie
International Edition, 2015, 54, 5378–5382.
(52) J. Iglesias-Fernández, L. Raich, A. Ardèvol and C. Rovira, “The complete confor-
mational free energy landscape of β -xylose reveals a two-fold catalytic itinerary
for β -xylanases”, Chemical Science, 2015, 6, 1167–1177.
(53) B. C. Knott, M. Haddad Momeni, M. F. Crowley, L. F. Mackenzie, A. W. Götz,
M. Sandgren, S. G. Withers, J. Ståhlberg and G. T. Beckham, “The mechanism
of cellulose hydrolysis by a two-step, retaining cellobiohydrolase elucidated by
structural and transition path sampling studies”, Journal of the American Chemi-
cal Society, 2013, 136, 321–329.
(54) A. J. Thompson, J. Dabin, J. Iglesias-Fernández, A. Ardévol, Z. Dinev, S. J.
Williams, O. Bande, A. Siriwardena, C. Moreland, T.-C. Hu et al., “The reaction
coordinate of a bacterial GH47 α-mannosidase: A combined quantum mechan-
ical and structural approach”, Angewandte Chemie International Edition, 2012,
51, 10997–11001.
BIBLIOGRAPHY 113
(55) X. Biarnés, A. Ardèvol, J. Iglesias-Fernández, A. Planas and C. Rovira, “Catalytic

itinerary in 1, 3-1, 4-β -glucanase unraveled by QM/MM metadynamics. Charge
is not yet fully developed at the oxocarbenium ion-like transition state”, Journal
of the American Chemical Society, 2011, 133, 20301–20309.
(56) T. K. Harris and G. J. Turner, “Structural basis of perturbed pKa values of catalytic
groups in enzyme active sites”, IUBMB Life, 2002, 53, 85–98.
(57) W. P. Jencks, Catalysis in chemistry and enzymology, Courier Corporation, 1987.
(58) E. C. Webb et al., Enzyme nomenclature 1992. Recommendations of the Nomen-
clature Committee of the International Union of Biochemistry and Molecular Bi-
ology on the Nomenclature and Classification of Enzymes. Academic Press, 1992.
(59) J. M. Berg, J. L. Tymoczko and L. Stryer, Biochemistry, ; W. H, 2002.
(60) M. Coughlan and G. P. Hazlewood, “beta-1, 4-D-xylan-degrading enzyme sys-
tems: Biochemistry, molecular biology and applications”, Biotechnology and Ap-
plied Biochemistry, 1993, 17, 259–289.
(61) G. P. Connelly, S. G. Withers and L. P. McINTOSH, “Analysis of the dynamic
properties of Bacillus circulans xylanase upon formation of a covalent glycosyl-
enzyme intermediate”, Protein Science, 2000, 9, 512–524.
(62) M. P. Kötzler, L. P. McIntosh and S. G. Withers, “Refolding the unfoldable: A
systematic approach for renaturation of Bacillus circulans xylanase”, Protein Sci-
ence, 2017, 26, 1555–1563.
(63) L. A. Plesniak, L. P. Mcintosh and W. W. Wakarchuk, “Secondary structure and
NMR assignments of Bacillus circulans xylanase”, Protein Science, 1996, 5, 1118–
1135.
(64) L. P. McIntosh, G. Hand, P. E. Johnson, M. D. Joshi, M. Körner, L. A. Plesniak,
L. Ziser, W. W. Wakarchuk and S. G. Withers, “The pKa of the general acid/base
carboxyl group of a glycosidase cycles during catalysis: A 13C-NMR study of
Bacillus circulans xylanase”, Biochemistry, 1996, 35, 9958–9966.
(65) S. Miao, L. Ziser, R. Aebersold and S. G. Withers, “Identification of glutamic acid
78 as the active site nucleophile in Bacillus subtilis xylanase using electrospray
tandem mass spectrometry”, Biochemistry, 1994, 33, 7027–7032.
(66) R. Campbell, D. Rose, W. Wakarchuk, W. Suog and M. Yaguchi, in Proceedings
of the second TRICEL symposium on Trichoderma reesei cellulases and other
hydrolyases, 1993, pp. 63–77.
(67) A. Törrönen, A. Harkki and J. Rouvinen, “Three-dimensional structure of endo-1,
4-β -xylanase II from Trichoderma reesei: Two conformational states in the active
site.”, The EMBO journal, 1994, 13, 2493–2501.
BIBLIOGRAPHY 114
(68) A. Torronen and J. Rouvinen, “Structural comparison of two major endo-1, 4-

xylanases from Trichoderma reesei”, Biochemistry, 1995, 34, 847–856.
(69) U. Krengel and B. W. Dijkstra, “Three-dimensional structure of endo-1, 4-β -
xylanase I from Aspergillus niger: Molecular basis for its low pH optimum”,
Journal of Molecular Biology, 1996, 263, 70–78.
(70) G. P. Connelly and L. P. McIntosh, “Characterization of a buried neutral histidine
in Bacillus circulans xylanase: Internal dynamics and interaction with a bound
water molecule”, Biochemistry, 1998, 37, 1810–1818.
(71) M. L. Ludwiczek, I. D’Angelo, G. N. Yalloway, J. A. Brockerman, M. Okon,
J. E. Nielsen, N. C. Strynadka, S. G. Withers and L. P. McIntosh, “Strategies
for modulating the pH-dependent activity of a family 11 glycoside hydrolase”,
Biochemistry, 2013, 52, 3138–3156.
(72) A. Sapag, J. Wouters, C. Lambert, P. de Ioannes, J. Eyzaguirre and E. Depiereux,
“The endoxylanases from family 11: computer analysis of protein sequences re-
veals important structural and phylogenetic relationships”, Journal of Biotechnol-
ogy, 2002, 95, 109–131.
(73) T. Collins, C. Gerday and G. Feller, “Xylanases, xylanase families and extremophilic
xylanases”, FEMS Microbiology Reviews, 2005, 29, 3–23.
(74) A. Pollet, J. A. Delcour and C. M. Courtin, “Structural determinants of the sub-
strate specificities of xylanases from different glycoside hydrolase families”, Crit-
ical Reviews in Biotechnology, 2010, 30, 176–191.
(75) G. Paës, J.-G. Berrin and J. Beaugrand, “GH11 xylanases: Structure/function/properties
relationships and applications”, Biotechnology Advances, 2012, 30, 564–592.
(76) K. Xiao and H. Yu, “Rationalising pKa shifts in Bacillus circulans xylanase with
computational studies”, Physical Chemistry Chemical Physics, 2016, 18, 30305–
30312.
(77) G. K. Hirst, “The agglutination of red cells by allantoic fluid of chick embryos
infected with influenza virus”, Science, 1941, 94, 22–23.
(78) A. Gottschalk and P. E. Lind, “Product of interaction between influenza virus
enzyme and ovomucin”, Nature, 1949, 164, 232–233.
(79) G. Taylor, “Sialidases: Structures, biological significance and therapeutic poten-
tial”, Current Opinion in Structural Biology, 1996, 6, 830–837.
(80) K. Shinya, M. Ebina, S. Yamada, M. Ono, N. Kasai and Y. Kawaoka, “Avian flu:
Influenza virus receptors in the human airway”, Nature, 2006, 440, 435–436.
(81) M. J. Memoli, D. M. Morens and J. K. Taubenberger, “Pandemic and seasonal
influenza: Therapeutic challenges”, Drug Discovery Today, 2008, 13, 590–595.
BIBLIOGRAPHY 115
(82) M. J. Kiefel and M. von Itzstein, “Recent advances in the synthesis of sialic acid
derivatives and sialylmimetics as biological probes”, Chemical Reviews, 2002,
102, 471–490.
(83) A. P. Corfield and R. Schauer, in Sialic Acids, Springer, 1982, pp. 5–50.
(84) P. Palese, K. Tobita, M. Ueda and R. W. Compans, “Characterization of tempera-
ture sensitive influenza virus mutants defective in neuraminidase”, Virology, 1974,
61, 397–410.
(85) M. N. Matrosovich, T. Y. Matrosovich, T. Gray, N. A. Roberts and H.-D. Klenk,
“Neuraminidase is important for the initiation of influenza virus infection in hu-
man airway epithelium”, Journal of Virology, 2004, 78, 12665–12667.
(86) A. Gaymard, N. Le Briand, E. Frobert, B. Lina and V. Escuret, “Functional bal-
ance between neuraminidase and haemagglutinin in influenza viruses”, Clinical
Microbiology and Infection, 2016, 22, 975–983.
(87) T. Sergel, L. McGinnes and T. Morrison, “Role of a conserved sequence in the
maturation and function of the NDV HN glycoprotein”, Virus Research, 1993, 30,
281–294.
(88) I. V. Alymova, A. Portner, T. Takimoto, K. L. Boyd, Y. S. Babu and J. A. Mc-
Cullers, “The novel parainfluenza virus hemagglutinin-neuraminidase inhibitor
BCX 2798 prevents lethal synergism between a paramyxovirus and Streptococ-
cus pneumoniae”, Antimicrobial Agents and Chemotherapy, 2005, 49, 398–405.
(89) M. J. Kiefel and M. Von Itzstein, “1 Influenza Virus Sialidase: A Target for Drug
Discovery”, Progress in Medicinal Chemistry, 1999, 36, 1–28.
(90) M. von Itzstein, “Avian influenza virus, a very sticky situation”, Current Opinion
in Chemical Biology, 2008, 12, 102–108.
(91) M. Von Itzstein and R. Thomson, in Antiviral Strategies, Springer, 2009, pp. 111–
154.
(92) P. Roggentin, R. Schauer, L. L. Hoyer and E. R. Vimr, “The sialidase superfamily
and its spread by horizontal gene transfer”, Molecular Microbiology, 1993, 9,
915–921.
(93) E. R. Vimr, K. A. Kalivoda, E. L. Deszo and S. M. Steenbergen, “Diversity of
microbial sialic acid metabolism”, Microbiology and Molecular Biology Reviews,
2004, 68, 132–153.
(94) S. Kim, D.-B. Oh, H. A. Kang and O. Kwon, “Features and applications of bacte-
rial sialidases”, Applied Microbiology and Biotechnology, 2011, 91, 1–15.
(95) T. Corfield, “Bacterial sialidases —roles in pathogenicity and nutrition”, Glyco-
biology, 1992, 2, 509–521.
BIBLIOGRAPHY 116
(96) S. Magesh, T. Suzuki, T. Miyagi, H. Ishida and M. Kiso, “Homology modeling of

human sialidase enzymes NEU1, NEU3 and NEU4 based on the crystal structure
of NEU2: hints for the design of selective NEU3 inhibitors”, Journal of Molecular
Graphics and Modelling, 2006, 25, 196–207.
(97) V. Smutova, A. Albohy, X. Pan, E. Korchagina, T. Miyagi, N. Bovin, C. W. Cairo
and A. V. Pshezhetsky, “Structural basis for substrate specificity of mammalian
neuraminidases”, PLOS One, 2014, 9, e106320.
(98) E. Monti, E. Bonten, A. D’Azzo, R. Bresciani, B. Venerando, G. Borsani, R.
Schauer and G. Tettamanti, “Sialidases in vertebrates: A family of enzymes tai-
lored for several cell functions”, Advances in Carbohydrate Chemistry and Bio-
chemistry, 2010, 64, 403–479.
(99) T. Miyagi and K. Yamaguchi, “Mammalian sialidases: Physiological and patho-
logical roles in cellular functions”, Glycobiology, 2012, 22, 880–896.
(100) G. Zanchetti, P. Colombi, M. Manzoni, L. Anastasia, L. Caimi, G. Borsani, B.
Venerando, G. Tettamanti, A. Preti, E. Monti et al., “Sialidase NEU3 is a periph-
eral membrane protein localized on the cell surface and in endosomal structures”,
Biochemical Journal, 2007, 408, 211–219.
(101) M. Rodriguez-Walker, A. A. Vilcaes, E. Garbarino-Pico and J. L. Daniotti, “Role
of plasma-membrane-bound sialidase NEU3 in clathrin-mediated endocytosis”,
Biochemical Journal, 2015, 470, 131–144.
(102) F. Cirillo, A. Ghiroldi, C. Fania, M. Piccoli, E. Torretta, G. Tettamanti, C. Gelfi
and L. Anastasia, “NEU3 sialidase protein interactors in the plasma membrane
and in the endosomes”, Journal of Biological Chemistry, 2016, 291, 10615–10624.
(103) T. Miyagi, K. Takahashi, S. Moriya, K. Hata, K. Yamamoto, T. Wada, K. Yam-
aguchi and K. Shiozaki, in Biochemical Roles of Eukaryotic Cell Surface Macro-
molecules, Springer, 2012, pp. 257–267.
(104) A. Minami, T. Otsubo, D. Ieno, K. Ikeda, H. Kanazawa, K. Shimizu, K. Ohata, T.
Yokochi, Y. Horii, H. Fukumoto et al., “Visualization of sialidase activity in Mam-
malian tissues and cancer detection with a novel fluorescent sialidase substrate”,
PLOS One, 2014, 9, e81941.
(105) V. Seyrantepe, H. Poupetova, R. Froissart, M.-T. Zabot, I. Maire and A. V. Pshezhet-
sky, “Molecular pathology of NEU1 gene in sialidosis”, Human Mutation, 2003,
22, 343–352.
(106) I. Annunziata, A. Patterson, D. Helton, H. Hu, S. Moshiach, E. Gomero, R. Nixon
and A. d?Azzo, “Lysosomal NEU1 deficiency affects amyloid precursor protein
levels and amyloid-β secretion via deregulated lysosomal exocytosis”, Nature
Communications, 2013, 4, 2734–2745.
BIBLIOGRAPHY 117
(107) F. Liang, V. Seyrantepe, K. Landry, R. Ahmad, A. Ahmad, N. M. Stamatos and

A. V. Pshezhetsky, “Monocyte differentiation up-regulates the expression of the
lysosomal sialidase, Neu1, and triggers its targeting to the plasma membrane via
major histocompatibility complex class II-positive compartments”, Journal of Bi-
ological Chemistry, 2006, 281, 27526–27538.
(108) T. Uemura, K. Shiozaki, K. Yamaguchi, S. Miyazaki, S. Satomi, K. Kato, H.
Sakuraba and T. Miyagi, “Contribution of sialidase NEU1 to suppression of metas-
tasis of human colon cancer cells through desialylation of integrin β 4”, Oncogene,
2009, 28, 1218–1229.
(109) S. R. Amith, P. Jayanth, S. Franchuk, S. Siddiqui, V. Seyrantepe, K. Gee, S. Basta,
R. Beyaert, A. V. Pshezhetsky and M. R. Szewczuk, “Dependence of pathogen
molecule-induced toll-like receptor activation and cell function on Neu1 siali-
dase”, Glycoconjugate Journal, 2009, 26, 1197–1212.
(110) F. Alghamdi, M. Guo, S. Abdulkhalek, N. Crawford, S. R. Amith and M. R.
Szewczuk, “A novel insulin receptor-signaling platform and its link to insulin
resistance and type 2 diabetes”, Cellular Signalling, 2014, 26, 1355–1368.
(111) S. Katoh, S. Maeda, H. Fukuoka, T. Wada, S. Moriya, A. Mori, K. Yamaguchi,
S. Senda and T. Miyagi, “A crucial role of sialidase Neu1 in hyaluronan receptor
function of CD44 in T helper type 2-mediated airway inflammation of murine
acute asthmatic model”, Clinical & Experimental Immunology, 2010, 161, 233–
241.
(112) M. Sumida, M. Hane, U. Yabe, Y. Shimoda, O. M. Pearce, M. Kiso, T. Miyagi, M.
Sawada, A. Varki, K. Kitajima et al., “Rapid trimming of cell surface polysialic
acid (PolySia) by exovesicular sialidase triggers release of preexisting surface
neurotrophin”, Journal of Biological Chemistry, 2015, 290, 13202–13214.
(113) S. Ryuzono, R. Takase, K. Oishi, A. Ikeda, P. K. Chigwechokha, A. Funahashi,
M. Komatsu, T. Miyagi and K. Shiozaki, “Lysosomal localization of Japanese
medaka (Oryzias latipes) Neu1 sialidase and its highly conserved enzymatic pro-
files with human”, Gene, 2016, 575, 513–523.
(114) A. Fanzani, R. Giuliani, F. Colombo, S. Rossi, E. Stoppani, W. Martinet, A. Preti
and S. Marchesini, “The enzymatic activity of sialidase Neu2 is inversely regu-
lated during in vitro myoblast hypertrophy and atrophy”, Biochemical and Bio-
physical Research Communications, 2008, 370, 376–381.
(115) M. S. Sandbhor, N. Soya, A. Albohy, R. B. Zheng, J. Cartmell, D. R. Bundle,
J. S. Klassen and C. W. Cairo, “Substrate recognition of the membrane-associated
sialidase NEU3 requires a hydrophobic aglycone”, Biochemistry, 2011, 50, 6753–
6762.
BIBLIOGRAPHY 118
(116) A. Albohy, M. D. Li, R. B. Zheng, C. Zou and C. W. Cairo, “Insight into substrate
recognition and catalysis by the human neuraminidase 3 (NEU3) through molecu-
lar modeling and site-directed mutagenesis”, Glycobiology, 2010, 20, 1127–1138.
(117) A. Albohy, S. Mohan, R. B. Zheng, B. M. Pinto and C. W. Cairo, “Inhibitor se-
lectivity of a new class of oseltamivir analogs against viral neuraminidase over
human neuraminidase enzymes”, Bioorganic & Medicinal Chemistry, 2011, 19,
2817–2822.
(118) Y. Zhang, A. Albohy, Y. Zou, V. Smutova, A. V. Pshezhetsky and C. W. Cairo,
“Identification of selective inhibitors for human neuraminidase isoenzymes us-
ing C4, C7-modified 2-deoxy-2, 3-didehydro-N-acetylneuraminic acid (DANA)
analogues”, Journal of Medicinal Chemistry, 2013, 56, 2948–2958.
(119) Y. Wang, K. Yamaguchi, T. Wada, K. Hata, X. Zhao, T. Fujimoto and T. Miyagi,
“A close association of the ganglioside-specific sialidase Neu3 with caveolin in
membrane microdomains”, Journal of Biological Chemistry, 2002, 277, 26252–
26259.
(120) A. Sasaki, K. Hata, S. Suzuki, M. Sawada, T. Wada, K. Yamaguchi, M. Obi-
nata, H. Tateno, H. Suzuki and T. Miyagi, “Overexpression of plasma membrane-
associated sialidase attenuates insulin signaling in transgenic mice”, Journal of
Biological Chemistry, 2003, 278, 27896–27902.
(121) K. Kato, K. Shiga, K. Yamaguchi, K. Hata, T. Kobayashi, K. Miyazaki, S. Saijo
and T. Miyagi, “Plasma-membrane-associated sialidase (NEU3) differentially reg-
ulates integrin-mediated cell proliferation through laminin-and fibronectin-derived
signalling”, Biochemical Journal, 2006, 394, 647–656.
(122) L. Anastasia, N. Papini, F. Colazzo, G. Palazzolo, C. Tringali, L. Dileo, M. Pic-
coli, E. Conforti, C. Sitzia, E. Monti et al., “NEU3 sialidase strictly modulates
GM3 levels in skeletal myoblasts C2C12 thus favoring their differentiation and
protecting them from apoptosis”, Journal of Biological Chemistry, 2008, 283,
36265–36271.
(123) T. Wada, K. Hata, K. Yamaguchi, K. Shiozaki, K. Koseki, S. Moriya and T.
Miyagi, “A crucial role of plasma membrane-associated sialidase in the survival
of human cancer cells”, Oncogene, 2007, 26, 2483–2490.
(124) T. Miyagi, T. Wada and K. Yamaguchi, “Roles of plasma membrane-associated
sialidase NEU3 in human cancers”, Biochimica et Biophysica Acta (BBA)-General
Subjects, 2008, 1780, 532–537.
BIBLIOGRAPHY 119
(125) N. Papini, L. Anastasia, C. Tringali, L. Dileo, I. Carubelli, M. Sampaolesi, E.

Monti, G. Tettamanti and B. Venerando, “MmNEU3 sialidase over-expression in
C2C12 myoblasts delays differentiation and induces hypertrophic myotube for-
mation”, Journal of Cellular Biochemistry, 2012, 113, 2967–2978.
(126) K. Yamamoto, K. Takahashi, K. Shiozaki, K. Yamaguchi, S. Moriya, M. Hosono,
H. Shima and T. Miyagi, “Potentiation of epidermal growth factor-mediated onco-
genic transformation by sialidase NEU3 leading to Src activation”, PLOS One,
2015, 10, e0120578.
(127) F. Jia, M. A. Howlader and C. W. Cairo, “Integrin-mediated cell migration is
blocked by inhibitors of human neuraminidase”, Biochimica et Biophysica Acta
(BBA)-Molecular and Cell Biology of Lipids, 2016, 1861, 1170–1179.
(128) T. Hasegawa, N. Sugeno, A. Takeda, M. Matsuzaki-Kobayashi, A. Kikuchi, K.
Furukawa, T. Miyagi and Y. Itoyama, “Role of Neu4L sialidase and its substrate
ganglioside GD3 in neuronal apoptosis induced by catechol metabolites”, FEBS
Letters, 2007, 581, 406–412.
(129) M. Sorice, P. Matarrese, A. Tinari, A. M. Giammarioli, T. Garofalo, V. Man-
ganelli, L. Ciarlo, L. Gambardella, G. Maccari, M. Botta et al., “Raft component
GD3 associates with tubulin following CD95/Fas ligation”, The FASEB Journal,
2009, 23, 3298–3308.
(130) R. Sano, I. Annunziata, A. Patterson, S. Moshiach, E. Gomero, J. Opferman,
M. Forte and A. d’Azzo, “GM1-ganglioside accumulation at the mitochondria-
associated ER membranes links ER stress to Ca 2+-dependent mitochondrial apop-
tosis”, Molecular Cell, 2009, 36, 500–511.
(131) A. Rassi Jr, A. Rassi and J. A. Marin-Neto, “Chagas disease”, The Lancet, 2010,
375, 1388–1402.
(132) P. Scudder, J. Doom, M. Chuenkova, I. Manger and M. Pereira, “Enzymatic char-
acterization of β -D-galactoside alpha 2, 3-trans-sialidase from Trypanosoma cruzi.”,
Journal of Biological Chemistry, 1993, 268, 9886–9891.
(133) M. A. Ferrero-garcÍa, S. E. Trombetta, D. O. SÁnchez, A. Reglero, A. C. Frasch
and A. J. Parodi, “The action of Trypanosoma cruzi trans-sialidase on glycolipids
and glycoproteins”, The FEBS Journal, 1993, 213, 765–771.
(134) C. A. Buscaglia, V. A. Campo, A. C. Frasch and J. M. Di Noia, “Trypanosoma
cruzi surface mucins: Host-dependent coat diversity.”, Nature Reviews Microbi-
ology, 2006, 4, 229–236.
(135) M. Kashif, A. Moreno-Herrera, E. E. Lara-Ramirez, E. Ramırez-Moreno, V. Bocanegra-
Garcıa, M. Ashfaq and G. Rivera, “Recent developments in trans-sialidase in-
hibitors of Trypanosoma cruzi”, Journal of Drug Targeting, 2017, 25, 485–498.
BIBLIOGRAPHY 120
(136) S. Schenkman, D. Eichinger, M. Pereira and V. Nussenzweig, “Structural and

functional properties of Trypanosoma trans-sialidase”, Annual Reviews in Micro-
biology, 1994, 48, 499–523.
(137) T. Haselhorst, J. C. Wilson, A. Liakatos, M. J. Kiefel, J. C. Dyason and M. Von
Itzstein, “NMR spectroscopic and molecular modeling investigations of the trans-
sialidase from Trypanosoma cruzi”, Glycobiology, 2004, 14, 895–907.
(138) J. Holck, D. M. Larsen, M. Michalak, H. Li, L. Kjærulff, F. Kirpekar, C. H. Got-
fredsen, S. Forssten, A. C. Ouwehand, J. D. Mikkelsen et al., “Enzyme catal-
ysed production of sialylated human milk oligosaccharides and galactooligosac-
charides by Trypanosoma cruzi trans-sialidase”, New Biotechnology, 2014, 31,
156–165.
(139) L. C. Pontes-de-Carvalho, S. Tomlinson and V. Nussenzweig, “Trypanosoma rangeli
sialidase lacks trans-sialidase activity”, Molecular and Biochemical Parasitology,
1993, 62, 19–25.
(140) A. Buschiazzo, G. A. Tavares, O. Campetella, S. Spinelli, M. L. Cremona, G.
Parıs, M. F. Amaya, A. C. Frasch and P. M. Alzari, “Structural basis of sialyl-
transferase activity in Trypanosomal sialidases”, The EMBO Journal, 2000, 19,
16–24.
(141) G. Paris, L. Ratier, M. F. Amaya, T. Nguyen, P. M. Alzari and A. C. C. Frasch,
“A sialidase mutant displaying trans-sialidase activity”, Journal of Molecular Bi-
ology, 2005, 345, 923–934.
(142) M. Michalak, D. M. Larsen, C. Jers, J. R. Almeida, M. Willer, H. Li, F. Kirpekar,
L. Kjærulff, C. H. Gotfredsen, R. T. Nordvang et al., “Biocatalytic production
of 3’-sialyllactose by use of a modified sialidase with superior trans-sialidase
activity”, Process Biochemistry, 2014, 49, 265–270.
(143) C. Nyffenegger, R. T. Nordvang, C. Jers, A. S. Meyer and J. D. Mikkelsen, “De-
sign of Trypanosoma rangeli sialidase mutants with improved trans-sialidase ac-
tivity”, PLOS One, 2017, 12, e0171585.
(144) M.-Y. Chou, S.-C. Li, M. Kiso, A. Hasegawa and Y.-T. Li, “Purification and char-
acterization of sialidase L, a NeuAc alpha 2→3Gal-specific sialidase.”, Journal
of Biological Chemistry, 1994, 269, 18821–18826.
(145) M.-Y. Chou, S.-C. Li and Y.-T. Li, “Cloning and expression of sialidase L, a
NeuAcα2→3Gal-specific sialidase from the leech, Macrobdella decora”, Jour-
nal of Biological Chemistry, 1996, 271, 19219–19224.
BIBLIOGRAPHY 121
(146) K. E. Achyuthan and A. M. Achyuthan, “Comparative enzymology, biochemistry

and pathophysiology of human exo-α-sialidases (neuraminidases)”, Comparative
Biochemistry and Physiology Part B: Biochemistry and Molecular Biology, 2001,
129, 29–64.
(147) Y. Uchida, Y. Tsukada and T. Sugimori, “Enzymatic properties of neuraminidases
from Arthrobacter ureafaciens”, The Journal of Biochemistry, 1979, 86, 1573–
1585.
(148) M. Iwamori, Y. Ohta, Y. Uchida and Y. Tsukada, “Arthrobacter ureafaciens siali-
dase isoenzymes, L, M1 and M2, cleave fucosyl GM1”, Glycoconjugate Journal,
1997, 14, 67–73.
(149) G. Long, P. Taylor and J. Luzio, “Characterization of bacteriophage E endosiali-
dase specific for α2-8-linked polysialic acid”, Polysialic Acid, Birkhauser-Verlag,
Basel, 1993, 137–144.
(150) J. Petter and E. Vimr, “Complete nucleotide sequence of the bacteriophage K1F
tail gene encoding endo-N-acylneuraminidase (endo-N) and comparison to an
endo-N homolog in bacteriophage PK1E.”, Journal of Bacteriology, 1993, 175,
4354–4363.
(151) B. L. Cantarel, P. M. Coutinho, C. Rancurel, T. Bernard, V. Lombard and B. Hen-
rissat, “The Carbohydrate-Active EnZymes database (CAZy): An expert resource
for glycogenomics”, Nucleic Acids Research, 2008, 37, D233–D238.
(152) N. Terrapon, V. Lombard, E. Drula, P. M. Coutinho and B. Henrissat, in A Prac-
tical Guide to Using Glycomics Databases, Springer, 2017, pp. 117–131.
(153) S. Manco, F. Hernon, H. Yesilkaya, J. C. Paton, P. W. Andrew and A. Kadioglu,
“Pneumococcal neuraminidases A and B both have essential roles during infection
of the respiratory tract and sepsis”, Infection and Immunity, 2006, 74, 4014–4020.
(154) J. Luotonen, E. Herva, P. Karma, M. Timonen, M. Leinonen and P. H. Mäkelä,
“The bacteriology of acute otitis media in children with special reference to Strep-
tococcus pneumoniae as studied by bacteriological and antigen detection meth-
ods”, Scandinavian Journal of Infectious Diseases, 1981, 13, 177–183.
(155) I. H. Park, D. G. Pritchard, R. Cartee, A. Brandao, M. C. C. Brandileone and
M. H. Nahm, “Discovery of a new capsular serotype (6C) within serogroup 6 of
Streptococcus pneumoniae”, Journal of Clinical Microbiology, 2007, 45, 1225–
1233.
(156) J. J. Calix and M. H. Nahm, “A new pneumococcal serotype, 11E, has a variably
inactivated wcjE gene”, The Journal of Infectious Diseases, 2010, 202, 29–38.
BIBLIOGRAPHY 122
(157) J. J. Calix, R. J. Porambo, A. M. Brady, T. R. Larson, J. Yother, C. Abeygunwar-

dana and M. H. Nahm, “Biochemical, genetic, and serological characterization of
two capsule subtypes among Streptococcus pneumoniae serotype 20 strains dis-
covery of a new pneumococcal serotype”, Journal of Biological Chemistry, 2012,
287, 27885–27894.
(158) C. Ardanuy, G. Adela, E. Garcıa, A. Fenoll, L. Calatayud, E. Cercenado, E. Pérez-
Trallero, E. Bouza and J. Liñares, “Spread of Streptococcus pneumoniae serotype
8-ST63 multidrug-resistant recombinant clone, Spain”, Emerging Infectious Dis-
eases, 2014, 20, 1848–1856.
(159) P. Yagupsky, N. Porat, D. Fraser, F. Prajgrod, M. Merires, L. McGee, K. P. Klug-
man and R. Dagan, “Acquisition, carriage, and transmission of pneumococci with
decreased antibiotic susceptibility in young children attending a day care facility
in southern Israel”, Journal of Infectious Diseases, 1998, 177, 1003–1012.
(160) D. Bogaert, R. de Groot and P. Hermans, “Streptococcus pneumoniae colonisa-
tion: The key to pneumococcal disease”, The Lancet Infectious Diseases, 2004, 4,
144–154.
(161) G. Regev-Yochay, M. Raz, R. Dagan, N. Porat, B. Shainberg, E. Pinco, N. Keller
and E. Rubinstein, “Nasopharyngeal carriage of Streptococcus pneumoniae by
adults and children in community and family settings”, Clinical Infectious Dis-
eases, 2004, 38, 632–639.
(162) A. M. van Rossum, E. S. Lysenko and J. N. Weiser, “Host and bacterial factors
contributing to the clearance of colonization by Streptococcus pneumoniae in a
murine model”, Infection and Immunity, 2005, 73, 7718–7726.
(163) Y.-J. Lu, J. Gross, D. Bogaert, A. Finn, L. Bagrade, Q. Zhang, J. K. Kolls, A.
Srivastava, A. Lundgren, S. Forte et al., “Interleukin-17A mediates acquired im-
munity to pneumococcal colonization”, PLOS pathogens, 2008, 4, e1000159.
(164) A. Kadioglu, J. N. Weiser, J. C. Paton and P. W. Andrew, “The role of Streptococ-
cus pneumoniae virulence factors in host respiratory colonization and disease”,
Nature Reviews Microbiology, 2008, 6, 288–301.
(165) D. M. Weinberger, R. Dagan, N. Givon-Lavi, G. Regev-Yochay, R. Malley and
M. Lipsitch, “Epidemiologic evidence for serotype-specific acquired immunity to
pneumococcal carriage”, The Journal of Infectious Diseases, 2008, 197, 1511–
1518.
(166) J. Vuononvirta, L. Toivonen, K. Gröndahl-Yli-Hannuksela, A.-M. Barkoff, L.
Lindholm, J. Mertsola, V. Peltola and Q. He, “Nasopharyngeal bacterial coloniza-
tion and gene polymorphisms of mannose-binding lectin and toll-like receptors 2
and 4 in infants”, PLOS One, 2011, 6, e26198.
BIBLIOGRAPHY 123
(167) M. Kyaw, S. Clarke, I. Jones and H. Campbell, “Non-invasive pneumococcal dis-

ease and antimicrobial resistance: Vaccine implications”, Epidemiology & Infec-
tion, 2002, 128, 21–27.
(168) X. Deng, Ph.D. Thesis, University of Toronto (Canada), 2017.
(169) C. C. Daniels, P. D. Rogers and C. M. Shelton, “A review of pneumococcal vac-
cines: Current polysaccharide vaccine recommendations and future protein anti-
gens”, The Journal of Pediatric Pharmacology and Therapeutics, 2016, 21, 27–
35.
(170) G. Xu, Ph.D. Thesis, University of St Andrews, 2009.
(171) H. Tettelin, K. E. Nelson, I. T. Paulsen, J. A. Eisen, T. D. Read, S. Peterson, J.
Heidelberg, R. T. DeBoy, D. H. Haft, R. J. Dodson et al., “Complete genome
sequence of a virulent isolate of Streptococcus pneumoniae”, Science, 2001, 498–
506.
(172) M. J. Jedrzejas, “Pneumococcal virulence factors: structure and function”, Micro-
biology and Molecular Biology Reviews, 2001, 65, 187–207.
(173) G. Xu, M. J. Kiefel, J. C. Wilson, P. W. Andrew, M. R. Oggioni and G. L. Taylor,
“Three Streptococcus pneumoniae sialidases: Three different products”, Journal
of the American Chemical Society, 2011, 133, 1718–1721.
(174) G. Xu, X. Li, P. W. Andrew and G. L. Taylor, “Structure of the catalytic domain
of Streptococcus pneumoniae sialidase NanA”, Acta Crystallographica Section
F: Structural Biology and Crystallization Communications, 2008, 64, 772–775.
(175) G. Xu, J. A. Potter, R. J. Russell, M. R. Oggioni, P. W. Andrew and G. L. Tay-
lor, “Crystal structure of the NanB sialidase from Streptococcus pneumoniae”,
Journal of Molecular Biology, 2008, 384, 436–449.
(176) C. D. Owen, P. Lukacik, J. A. Potter, O. Sleator, G. L. Taylor and M. A. Walsh,
“Streptococcus pneumoniae NanC structural insights into the specificity and mech-
anism of a sialidase that produce a silidase inhibitor”, Journal of Biological Chem-
istry, 2015, 290, 27736–27748.
(177) M. M. Pettigrew, K. P. Fennie, M. P. York, J. Daniels and F. Ghaffar, “Variation in
the presence of neuraminidase genes among Streptococcus pneumoniae isolates
with identical sequence types”, Infection and Immunity, 2006, 74, 3360–3365.
(178) J. Long, H. Tong and T. DeMaria, “Immunization with native or recombinant
Streptococcus pneumoniae neuraminidase affords protection in the chinchilla oti-
tis media model”, Infection and Immunity, 2004, 72, 4309–4313.
BIBLIOGRAPHY 124
(179) S. Uchiyama, A. F. Carlin, A. Khosravi, S. Weiman, A. Banerjee, D. Quach, G.

Hightower, T. J. Mitchell, K. S. Doran and V. Nizet, “The surface-anchored NanA
protein promotes pneumococcal brain endothelial cell invasion”, Journal of Ex-
perimental Medicine, 2009, 206, 1845–1852.
(180) H. Gut, S. J. King and M. A. Walsh, “Structural and functional studies of Strepto-
coccus pneumoniae neuraminidase B: An intramolecular trans-sialidase”, FEBS
Letters, 2008, 582, 3348–3352.
(181) M. P. Allen and D. J. Tildesley, Computer simulation of liquids, Oxford University
Press, 2017.
(182) W. F. van Gunsteren, X. Daura, N. Hansen, A. Mark, C. Oostenbrink, S. Riniker
and L. Smith, “Validation of molecular simulation: An overview of issues”, Ange-
wandte Chemie International Edition, 2017, 884–902.
(183) R. Lonsdale, J. N. Harvey and A. J. Mulholland, “A practical guide to modelling
enzyme-catalysed reactions”, Chemical Society Reviews, 2012, 41, 3025–3038.
(184) W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, D. M. Ferguson,
D. C. Spellmeyer, T. Fox, J. W. Caldwell and P. A. Kollman, “A second generation
force field for the simulation of proteins, nucleic acids, and organic molecules”,
Journal of the American Chemical Society, 1995, 117, 5179–5197.
(185) P. Kollman, J. W. Caldwell, W. S. Ross, D. A. Pearlman, D. A. Case, S. DeBolt,
T. E. Cheatham, D. Ferguson and G. Seibel, in Encyclopedia of Computational
Chemistry, American Cancer Society, 2002.
(186) J. Wang, P. Cieplak and P. A. Kollman, “How well does a restrained electrostatic
potential (RESP) model perform in calculating conformational energies of or-
ganic and biological molecules?”, Journal of Computational Chemistry, 2000,
21, 1049–1074.
(187) Y. Duan, C. Wu, S. Chowdhury, M. C. Lee, G. Xiong, W. Zhang, R. Yang, P.
Cieplak, R. Luo, T. Lee et al., “A point-charge force field for molecular mechan-
ics simulations of proteins based on condensed-phase quantum mechanical calcu-
lations”, Journal of Computational Chemistry, 2003, 24, 1999–2012.
(188) A. D. MacKerell Jr, D. Bashford, M. Bellott, R. L. Dunbrack Jr, J. D. Evanseck,
M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha et al., “All-atom empirical poten-
tial for molecular modeling and dynamics studies of proteins”, The Journal of
Physical Chemistry B, 1998, 102, 3586–3616.
(189) A. D. MacKerell, B. Brooks, C. L. Brooks, L. Nilsson, B. Roux, Y. Won and M.
Karplus, in Encyclopedia of Computational Chemistry, American Cancer Society,
2002.
BIBLIOGRAPHY 125
(190) N. Foloppe and A. D. MacKerell Jr, “All-atom empirical force field for nucleic
acids: I. Parameter optimization based on small molecule and condensed phase
macromolecular target data”, Journal of Computational Chemistry, 2000, 21, 86–
104.
(191) A. D. Mackerell and N. K. Banavali, “All-atom empirical force field for nucleic
acids: II. Application to molecular dynamics simulations of DNA and RNA in
solution”, Journal of Computational Chemistry, 2000, 21, 105–120.
(192) W. R. Scott, P. H. Hünenberger, I. G. Tironi, A. E. Mark, S. R. Billeter, J. Fen-
nen, A. E. Torda, T. Huber, P. Krüger and W. F. van Gunsteren, “The GROMOS
biomolecular simulation program package”, The Journal of Physical Chemistry
A, 1999, 103, 3596–3607.
(193) L. D. Schuler, X. Daura and W. F. Van Gunsteren, “An improved GROMOS96
force field for aliphatic hydrocarbons in the condensed phase”, Journal of Com-
putational Chemistry, 2001, 22, 1205–1218.
(194) T. A. Soares, P. H. Hünenberger, M. A. Kastenholz, V. Kräutler, T. Lenz, R. D.
Lins, C. Oostenbrink and W. F. van Gunsteren, “An improved nucleic acid pa-
rameter set for the GROMOS force field”, Journal of Computational Chemistry,
2005, 26, 725–737.
(195) N. Schmid, A. P. Eichenberger, A. Choutko, S. Riniker, M. Winger, A. E. Mark
and W. F. van Gunsteren, “Definition and testing of the GROMOS force-field
versions 54A7 and 54B7”, European Biophysics Journal, 2011, 40, 843–856.
(196) W. L. Jorgensen and J. Tirado-Rives, “The OPLS [optimized potentials for liquid
simulations] potential functions for proteins, energy minimizations for crystals of
cyclic peptides and crambin”, Journal of the American Chemical Society, 1988,
110, 1657–1666.
(197) W. L. Jorgensen, D. S. Maxwell and J. Tirado-Rives, “Development and testing
of the OPLS all-atom force field on conformational energetics and properties of
organic liquids”, Journal of the American Chemical Society, 1996, 118, 11225–
11236.
(198) H. M. Senn and W. Thiel, “QM/MM methods for biomolecular systems”, Ange-
wandte Chemie International Edition, 2009, 48, 1198–1229.
(199) R. Lonsdale, K. E. Ranaghan and A. J. Mulholland, “Computational enzymol-
ogy”, Chemical Communications, 2010, 46, 2354–2372.
(200) F. Claeyssens, J. Harvey, F. Manby, R. Mata, A. Mulholland and K. Ranaghan,
“M. Sch ütz, S. Thiel, W. Thiel and HJ Werner”, Angewandte Chemie Interna-
tional Edition, 2006, 45, 6856–6859.
BIBLIOGRAPHY 126
(201) H. M. Senn and W. Thiel, “QM/MM studies of enzymes”, Current opinion in

Chemical Biology, 2007, 11, 182–187.
(202) M. W. van der Kamp, K. E. Shaw, C. J. Woods and A. J. Mulholland, “Biomolec-
ular simulation and modelling: Status, progress and prospects”, Journal of The
Royal Society Interface, 2008, 5, 173–190.
(203) A. Warshel, “Computer simulations of enzyme catalysis: Methods, progress, and
insights”, Annual Review of Biophysics and Biomolecular Structure, 2003, 32,
425–443.
(204) A. Warshel and M. Levitt, “Theoretical studies of enzymic reactions: Dielec-
tric, electrostatic and steric stabilization of the carbonium ion in the reaction of
lysozyme”, Journal of Molecular Biology, 1976, 103, 227–249.
(205) F. Perruccio, L. Ridder and A. J. Mulholland, “Quantum-mechanical/molecular-
mechanical methods in medicinal chemistry”, Quantum Medicinal Chemistry,
2003, 177–198.
(206) R. A. Friesner, “Combined quantum and molecular mechanics (QM/MM)”, Drug
Discovery Today: Technologies, 2004, 1, 253–260.
(207) R. A. Friesner and V. Guallar, “Ab initio quantum chemical and mixed quan-
tum mechanics/molecular mechanics (QM/MM) methods for studying enzymatic
catalysis”, Annual Review of Physical Chemistry, 2005, 56, 389–427.
(208) H. Lin and D. G. Truhlar, “QM/MM: What have we learned, where are we, and
where do we go from here?”, Theoretical Chemistry Accounts, 2007, 117, 185–
199.
(209) H. Hu and W. Yang, “Free energies of chemical reactions in solution and in en-
zymes with ab initio quantum mechanics/molecular mechanics methods”, Annual
Review of Physical Chemistry, 2008, 59, 573–601.
(210) M. M. Reif and C. Oostenbrink, “Net charge changes in the calculation of relative
ligand-binding free energies via classical atomistic molecular dynamics simula-
tion”, Journal of Computational Chemistry, 2014, 35, 227–243.
(211) G. J. Rocklin, D. L. Mobley, K. A. Dill and P. H. Hünenberger, “Calculating
the binding free energies of charged species based on explicit-solvent simulations
employing lattice-sum methods: an accurate correction scheme for electrostatic
finite-size effects”, The Journal of Chemical Physics, 2013, 139, 184103.
(212) W. Im, S. Berneche and B. Roux, “Generalized solvent boundary potential for
computer simulations”, The Journal of Chemical Physics, 2001, 114, 2924–2937.
BIBLIOGRAPHY 127
(213) P. Schaefer, D. Riccardi and Q. Cui, “Reliable treatment of electrostatics in com-

bined QM/MM simulation of macromolecules”, The Journal of Chemical Physics,
2005, 123, 014905.
(214) T. Benighaus and W. Thiel, “A general boundary potential for hybrid QM/MM
simulations of solvated biomolecular systems”, Journal of Chemical Theory and
Computation, 2009, 5, 3114–3128.
(215) J. Zienau and Q. Cui, “Implementation of the solvent macromolecule boundary
potential and application to model and realistic enzyme systems”, The Journal of
Physical Chemistry B, 2012, 116, 12522–12534.
(216) M. Elstner and G. Seifert, “Density functional tight binding”, Philosophical Trans-
actions of the Royal Society A, 2014, 372, 20120483.
(217) G. Seifert, H. Eschrig and W. Bieger, “An approximation variant of LCAO-X-
alpha methods”, Zeitschrift Fur Physikalische Chemie-Leipzig, 1986, 267, 529–
539.
(218) J. Frenzel, A. F. Oliveira, H. A. Duarte, T. Heine and G. Seifert, “Structural and
electronic properties of bulk gibbsite and gibbsite surfaces”, Zeitschrift für Anor-
ganische Und Allgemeine Chemie, 2005, 631, 1267–1271.
(219) M. Frisch, G. Trucks, H. Schlegel, G. Scuseria, M. Robb, J. Cheeseman, V. Za-
krzewski, J. Montgomery Jr, R. E. Stratmann, J. Burant et al., “Gaussian 98, revi-
sion a. 7; gaussian”, Inc., Pittsburgh, PA, 1998, 12.
(220) S. Hazebroucq, G. S. Picard, C. Adamo, T. Heine, S. Gemming and G. Seifert,
“Density-functional-based molecular-dynamics simulations of molten salts”, The
Journal of Chemical Physics, 2005, 123, 134510.
(221) T. Heine, H. F. Dos Santos, S. Patchkovskii and H. A. Duarte, “Structure and
dynamics of β -cyclodextrin in aqueous solution at the density-functional tight
binding level”, The Journal of Physical Chemistry A, 2007, 111, 5648–5654.
(222) V. Ivanovskaya, T. Heine, S. Gemming and G. Seifert, “Structure, stability and
electronic properties of composite Mo1–xNbxS2 nanotubes”, Physica Status So-
lidi (B), 2006, 243, 1757–1764.
(223) D. Porezag, T. Frauenheim, T. Köhler, G. Seifert and R. Kaschner, “Construction
of tight-binding-like potentials on the basis of density-functional theory: Appli-
cation to carbon”, Physical Review B, 1995, 51, 12947–12957.
(224) G. Seifert, D. Porezag and T. Frauenheim, “Calculations of molecules, clusters,
and solids with a simplified LCAO-DFT-LDA scheme”, International Journal of
Quantum Chemistry, 1996, 58, 185–192.
BIBLIOGRAPHY 128
(225) M. Elstner, D. Porezag, G. Jungnickel, J. Elsner, M. Haugk, T. Frauenheim, S.

Suhai and G. Seifert, “Self-consistent-charge density-functional tight-binding method
for simulations of complex materials properties”, Physical Review B, 1998, 58,
7260–7268.
(226) Y. Yang, H. Yu, D. York, Q. Cui and M. Elstner, “Extension of the self-consistent-
charge density-functional tight-binding method: Third-order expansion of the den-
sity functional theory total energy and introduction of a modified effective coulomb
interaction”, The Journal of Physical Chemistry A, 2007, 111, 10861–10873.
(227) M. Gaus, Q. Cui and M. Elstner, “DFTB3: Extension of the self-consistent-charge
density-functional tight-binding method (SCC-DFTB)”, Journal of Chemical The-
ory and Computation, 2011, 7, 931–948.
(228) M. Gaus, A. Goez and M. Elstner, “Parametrization and benchmark of DFTB3
for organic molecules”, Journal of Chemical Theory and Computation, 2012, 9,
338–354.
(229) T. Frauenheim, G. Seifert, M. Elsterner, Z. Hajnal, G. Jungnickel, D. Porezag,
S. Suhai and R. Scholz, “A self-consistent charge density-functional based tight-
binding method for predictive materials simulations in physics, chemistry and
biology”, Physica Status Solidi(b), 2000, 217, 41–62.
(230) M. Gaus, Q. Cui and M. Elstner, “Density functional tight binding: Application
to organic and biological molecules”, Wiley Interdisciplinary Reviews: Computa-
tional Molecular Science, 2014, 4, 49–61.
(231) T. Krüger, M. Elstner, P. Schiffels and T. Frauenheim, “Validation of the density-
functional based tight-binding approximation method for the calculation of re-
action energies and other data”, The Journal of Chemical Physics, 2005, 122,
114110.
(232) D. Xu, H. Guo and Q. Cui, “Antibiotic deactivation by a dizinc β -lactamase:
Mechanistic insights from QM/MM and DFT studies”, Journal of the American
Chemical Society, 2007, 129, 10814–10822.
(233) D. Riccardi, S. Yang and Q. Cui, “Proton transfer function of carbonic anhydrase:
Insights from QM/MM simulations”, Biochimica et Biophysica Acta (BBA)-Proteins
and Proteomics, 2010, 1804, 342–351.
(234) G. Hou and Q. Cui, “QM/MM analysis suggests that alkaline phosphatase (AP)
and nucleotide pyrophosphatase/phosphodiesterase slightly tighten the transition
state for phosphate diester hydrolysis relative to solution: Implication for catalytic
promiscuity in the AP superfamily”, Journal of the American Chemical Society,
2011, 134, 229–246.
BIBLIOGRAPHY 129
(235) K. W. Sattelmeyer, J. Tirado-Rives and W. L. Jorgensen, “Comparison of SCC-

DFTB and NDDO-based semiempirical molecular orbital methods for organic
molecules”, The Journal of Physical Chemistry A, 2006, 110, 13551–13559.
(236) N. Otte, M. Scholten and W. Thiel, “Looking at self-consistent-charge density
functional tight binding from a semiempirical perspective”, The Journal of Phys-
ical Chemistry A, 2007, 111, 5751–5755.
(237) H. A. Witek and K. Morokuma, “Systematic study of vibrational frequencies cal-
culated with the self-consistent charge density functional tight-binding method”,
Journal of Computational Chemistry, 2004, 25, 1858–1864.
(238) H. A. Witek, K. Morokuma and A. Stradomska, “Modeling vibrational spectra us-
ing the self-consistent charge density-functional tight-binding method. I. Raman
spectra”, The Journal of Chemical Physics, 2004, 121, 5171–5178.
(239) H. A. Witek, K. Morokuma and A. Stradomska, “Modeling vibrational spectra us-
ing the self-consistent charge density-functional tight-binding method II: Infrared
spectra”, Journal of Theoretical and Computational Chemistry, 2005, 4, 639–655.
(240) H. A. Witek, S. Irle, G. Zheng, W. A. De Jong and K. Morokuma, “Modeling
carbon nanostructures with the self-consistent charge density-functional tight-
binding method: Vibrational spectra and electronic structure of C 28, C 60, and C
70”, The Journal of Chemical Physics, 2006, 125, 214706.
(241) M. Elstner, “The SCC-DFTB method and its application to biological systems”,
Theoretical Chemistry Accounts: Theory, Computation, and Modeling (Theoret-
ica Chimica Acta), 2006, 116, 316–325.
(242) M. Elstner, T. Frauenheim, J. McKelvey and G. Seifert, “Density functional tight
binding: Contributions from the American chemical society symposium”, The
Journal of Physical Chemistry. A, 2007, 111, 5607–5608.
(243) Y. Yang, H. Yu and Q. Cui, “Extensive conformational transitions are required
to turn on ATP hydrolysis in myosin”, Journal of Molecular Biology, 2008, 381,
1407–1420.
(244) Q. Cui and M. Elstner, “”Multi-scale” QM/MM methods with self-consistent-
charge density-functional-tight-binding (SCC-DFTB)”, Multi-scale Quantum Mod-
els for Biocatalysis: Modern Techniques and Applications, 2009, 7, 173–196.
(245) N. Ghosh, X. Prat-Resina, M. Gunner and Q. Cui, “Microscopic pKa analysis of
Glu286 in cytochrome c oxidase (Rhodobacter sphaeroides): Toward a calibrated
molecular model”, Biochemistry, 2009, 48, 2468–2485.
(246) G. Li and Q. Cui, “pKa calculations with QM/MM free energy perturbations”, The
Journal of Physical Chemistry B, 2003, 107, 14521–14528.
BIBLIOGRAPHY 130
(247) S. C. Kamerlin, M. Haranczyk and A. Warshel, “Progress in ab initio QM/MM

free-energy simulations of electrostatic energies in proteins: accelerated QM/MM
studies of pKa , redox reactions and solvation free energies”, The Journal of Phys-
ical Chemistry B, 2008, 113, 1253–1272.
(248) Y. Song, J. Mao and M. Gunner, “MCCE2: improving protein pKa calculations
with extensive side chain rotamer sampling”, Journal of Computational Chem-
istry, 2009, 30, 2231–2247.
(249) M. N. Davies, C. P. Toseland, D. S. Moss and D. R. Flower, “Benchmarking pKa
prediction”, BMC Biochemistry, 2006, 7, 18–29.
(250) E. Awoonor-Williams and C. N. Rowley, “Evaluation of methods for the calcula-
tion of the pKa of cysteine residues in proteins”, Journal of Chemical Theory and
Computation, 2016, 12, 4662–4673.
(251) M. H. Olsson, C. R. Søndergaard, M. Rostkowski and J. H. Jensen, “PROPKA3:
Consistent treatment of internal and surface residues in empirical pKa predic-
tions”, Journal of Chemical Theory and Computation, 2011, 7, 525–537.
(252) M. W. van der Kamp and A. J. Mulholland, “Combined quantum mechanics/molecular
mechanics (QM/MM) methods in computational enzymology”, Biochemistry, 2013,
52, 2708–2728.
(253) M. Klähn, S. Braun-Sand, E. Rosta and A. Warshel, “On possible pitfalls in ab
initio QM/MM minimization approaches for studies of enzymatic reactions”, The
Journal of Physical Chemistry. B, 2005, 109, 15645–15650.
(254) D. Riccardi, P. Schaefer, Y. Yang, H. Yu, N. Ghosh, X. Prat-Resina, P. König, G.
Li, D. Xu, H. Guo et al., “Development of effective quantum mechanical/molecular
mechanical (QM/MM) methods for complex biological processes”, The Journal
of Physical Chemistry B, 2006, 110, 6458–6469.
(255) X. Zhu, Y. Yang and H. Yu, “In silico enzyme modelling”, Australian Biochemist,
2014, 2, 12–15.
(256) M. Garcia-Viloca, J. Gao, M. Karplus and D. G. Truhlar, “How enzymes work:
Analysis by modern rate theory and computer simulations”, Science, 2004, 303,
186–195.
(257) D. McQuarrie and J. Simon, “Physical chemistry: A molecular approach 1997”,
Sausalito: University Science Books.
(258) G. M. Torrie and J. P. Valleau, “Nonphysical sampling distributions in Monte
Carlo free-energy estimation: Umbrella sampling”, Journal of Computational Physics,
1977, 23, 187–199.
BIBLIOGRAPHY 131
(259) T. Huber, A. E. Torda and W. F. van Gunsteren, “Local elevation: A method for
improving the searching properties of molecular dynamics simulation”, Journal
of Computer-aided Molecular Design, 1994, 8, 695–708.
(260) A. Laio and M. Parrinello, “Escaping free-energy minima”, Proceedings of the
National Academy of Sciences, 2002, 99, 12562–12566.
(261) B. Ensing, M. De Vivo, Z. Liu, P. Moore and M. L. Klein, “Metadynamics as
a tool for exploring free energy landscapes of chemical reactions”, Accounts of
Chemical Research, 2006, 39, 73–81.
(262) A. Laio and F. L. Gervasio, “Metadynamics: a method to simulate rare events
and reconstruct the free energy in biophysics, chemistry and material science”,
Reports on Progress in Physics, 2008, 71, 126601.
(263) E. Weinan, W. Ren, E. Vanden-Eijnden et al., “Finite temperature string method
for the study of rare events”, The Journal of Physical Chemistry B, 2005, 109,
6688–6693.
(264) L. Maragliano, A. Fischer, E. Vanden-Eijnden and G. Ciccotti, “String method in
collective variables: Minimum free energy paths and isocommittor surfaces”, The
Journal of Chemical Physics, 2006, 125, 024106.
(265) C. Dellago, P. G. Bolhuis, F. S. Csajka and D. Chandler, “Transition path sampling
and the calculation of rate constants”, The Journal of Chemical Physics, 1998,
108, 1964–1977.
(266) C. Dellago, P. Bolhuis and P. L. Geissler, “Transition path sampling”, Advances
in Chemical Physics, 2002, 123, 1–78.
(267) B. Peters and B. L. Trout, “Obtaining reaction coordinates by likelihood maxi-
mization”, The Journal of Chemical Physics, 2006, 125, 054108.
(268) S. Kumar, J. M. Rosenberg, D. Bouzida, R. H. Swendsen and P. A. Kollman, “The
weighted histogram analysis method for free-energy calculations on biomolecules.
I. The method”, Journal of Computational Chemistry, 1992, 13, 1011–1021.
(269) M. D. Joshi, G. Sidhu, J. E. Nielsen, G. D. Brayer, S. G. Withers and L. P. McIn-
tosh, “Dissecting the electrostatic interactions and pH-dependent activity of a
family 11 glycosidase”, Biochemistry, 2001, 40, 10115–10139.
(270) H. Yu and T. M. Griffiths, “pKa cycling of the general acid/base in glycoside
hydrolase families 33 and 34”, Physical Chemistry Chemical Physics, 2014, 16,
5785–5792.
BIBLIOGRAPHY 132
(271) M. E. Soliman, J. J. R. Pernıa, I. R. Greig and I. H. Williams, “Mechanism of gly-

coside hydrolysis: A comparative QM/MM molecular dynamics analysis for wild-
type and Y69F mutant retaining xylanases”, Organic & Biomolecular Chemistry,
2009, 7, 5236–5244.
(272) M. E. Soliman, G. D. Ruggiero, J. J. R. Pernıa, I. R. Greig and I. H. Williams,
“Computational mutagenesis reveals the role of active-site tyrosine in stabilising
a boat conformation for the substrate: QM/MM molecular dynamics studies of
wild-type and mutant xylanases”, Organic & Biomolecular Chemistry, 2009, 7,
460–468.
(273) J. Kongsted, U. Ryde, J. Wydra and J. H. Jensen, “Prediction and rationalization
of the pH dependence of the activity and stability of family 11 xylanases”, Bio-
chemistry, 2007, 46, 13581–13592.
(274) M. R. Hediger, C. Steinmann, L. De Vico and J. H. Jensen, “A computational
method for the systematic screening of reaction barriers in enzymes: Searching
for Bacillus circulans xylanase mutants with greater activity towards a synthetic
substrate”, PeerJ, 2013, 1, e111.
(275) M. D. Joshi, G. Sidhu, I. Pot, G. D. Brayer, S. G. Withers and L. P. McIntosh, “Hy-
drogen bonding and catalysis: A novel explanation for how a single amino acid
substitution can change the pH optimum of a glycosidase”, Journal of Molecular
Biology, 2000, 299, 255–279.
(276) S. H. Kim, S. Pokhrel and Y. J. Yoo, “Mutation of non-conserved amino acids
surrounding catalytic site to shift pH optimum of Bacillus circulans xylanase”,
Journal of Molecular Catalysis B: Enzymatic, 2008, 55, 130–136.
(277) S. Pokhrel, J. C. Joo and Y. J. Yoo, “Shifting the optimum pH of Bacillus cir-
culans xylanase towards acidic side by introducing arginine”, Biotechnology and
Bioprocess Engineering: BBE, 2013, 18, 35–42.
(278) G. Sidhu, S. G. Withers, N. T. Nguyen, L. P. McIntosh, L. Ziser and G. D. Brayer,
“Sugar ring distortion in the glycosyl-enzyme intermediate of a family G/11 xy-
lanase”, Biochemistry, 1999, 38, 5346–5354.
(279) B. R. Brooks, C. L. Brooks, A. D. MacKerell, L. Nilsson, R. J. Petrella, B. Roux,
Y. Won, G. Archontis, C. Bartels, S. Boresch et al., “CHARMM: the biomolecular
simulation program”, Journal of Computational Chemistry, 2009, 30, 1545–1614.
(280) W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey and M. L. Klein,
“Comparison of simple potential functions for simulating liquid water”, The Jour-
nal of Chemical Physics, 1983, 79, 926–935.
BIBLIOGRAPHY 133
(281) J.-P. Ryckaert, G. Ciccotti and H. J. Berendsen, “Numerical integration of the

cartesian equations of motion of a system with constraints: molecular dynamics
of n-alkanes”, Journal of Computational Physics, 1977, 23, 327–341.
(282) P. König, M. Hoffmann, T. Frauenheim and Q. Cui, “A critical evaluation of dif-
ferent QM/MM frontier treatments with SCC-DFTB as the QM method”, The
Journal of Physical Chemistry B, 2005, 109, 9082–9095.
(283) C. N. Schutz and A. Warshel, “What are the dielectric constants of proteins and
how to validate electrostatic models?”, Proteins: Structure, Function, and Bioin-
formatics, 2001, 44, 400–417.
(284) L. Li, C. Li, Z. Zhang and E. Alexov, “On the dielectric constant of proteins:
Smooth dielectric function for macromolecular modeling and its implementation
in Delphi”, Journal of Chemical Theory and Computation, 2013, 9, 2126–2136.
(285) F. Claeyssens, J. N. Harvey, F. R. Manby, R. A. Mata, A. J. Mulholland, K. E.
Ranaghan, M. Schütz, S. Thiel, W. Thiel and H.-J. Werner, “High-accuracy com-
putation of reaction barriers in enzymes”, Angewandte Chemie International Edi-
tion, 2006, 118, 7010–7013.
(286) M. Kato and A. Warshel, “Using a charging coordinate in studies of ionization in-
duced partial unfolding”, The Journal of Physical Chemistry B, 2006, 110, 11566–
11570.
(287) J. Heimdal and U. Ryde, “Convergence of QM/MM free-energy perturbations
based on molecular-mechanics or semiempirical simulations”, Physical Chem-
istry Chemical Physics, 2012, 14, 12592–12604.
(288) M. D. Joshi, A. Hedberg and L. P. Mcintosh, “Complete measurement of the pKa
values of the carboxyl and imidazole groups in Bacillus circulans xylanase”, Pro-
tein Science, 1997, 6, 2667–2670.
(289) Y. Yang, H. Yu, D. York, M. Elstner and Q. Cui, “Description of phosphate hy-
drolysis reactions with the self-consistent-charge density-functional-tight-binding
(SCC-DFTB) theory. 1. Parameterization”, Journal of Chemical Theory and Com-
putation, 2008, 4, 2067–2084.
(290) G. Hou, X. Zhu, M. Elstner and Q. Cui, “A modified QM/MM Hamiltonian
with the self-consistent-charge density-functional-tight-binding theory for highly
charged QM regions”, Journal of Chemical Theory and Computation, 2012, 8,
4293–4304.
(291) H. Yu, I. M. Ratheal, P. Artigas and B. Roux, “Protonation of key acidic residues is
critical for the K+-selectivity of the Na/K pump”, Nature Structural & Molecular
Biology, 2011, 18, 1159–1163.
BIBLIOGRAPHY 134
(292) W. L. Mock, “Theory of enzymatic reverse-protonation catalysis”, Bioorganic

Chemistry, 1992, 20, 377–381.
(293) W. W. Wakarchuk, R. L. Campbell, W. L. Sung, J. Davoodi and M. Yaguchi, “Mu-
tational and crystallographic analyses of the active site residues of the Bacillus
circulans xylanase”, Protein Science, 1994, 3, 467–475.
(294) S. S. Huang, K. M. Johnson, G. T. Ray, P. Wroe, T. A. Lieu, M. R. Moore, E. R.
Zell, J. A. Linder, C. G. Grijalva, J. P. Metlay et al., “Healthcare utilization and
cost of pneumococcal disease in the United States”, Vaccine, 2011, 29, 3398–
3412.
(295) B. Simell, K. Auranen, H. Käyhty, D. Goldblatt, R. Dagan, K. L. O?Brien and
P. C. G. (PneumoCarr), “The fundamental link between pneumococcal carriage
and disease”, Expert Review of Vaccines, 2012, 11, 841–855.
(296) R. A. Adegbola, “Childhood pneumonia as a global health priority and the strate-
gic interest of the Bill & Melinda Gates Foundation”, Clinical Infectious Diseases,
2012, 54, S89–S92.
(297) E. Walther, M. Richter, Z. Xu, C. Kramer, S. Von Grafenstein, J. Kirchmair, U.
Grienke, J. Rollinger, K. Liedl, H. Slevogt et al., “Antipneumococcal activity of
neuraminidase inhibiting artocarpin”, International Journal of Medical Microbi-
ology, 2015, 305, 289–297.
(298) P. Roggentin, B. Rothe, J. B. Kaper, J. Galen, L. Lawrisuk, E. R. Vimr and R.
Schauer, “Conserved sequences in bacterial and viral sialidases”, Glycoconjugate
Journal, 1989, 6, 349–353.
(299) H. Tong, L. Blue, M. James and T. DeMaria, “Evaluation of the virulence of aS-
treptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal col-
onization and development of otitis media in the chinchilla model”, Infection and
Immunity, 2000, 68, 921–924.
(300) C. J. Orihuela, G. Gao, K. P. Francis, J. Yu and E. I. Tuomanen, “Tissue-specific
contributions of pneumococcal virulence factors to pathogenesis”, Journal of In-
fectious Diseases, 2004, 190, 1661–1669.
(301) X.-M. Song, W. Connor, K. Hokamp, L. A. Babiuk and A. A. Potter, “Strepto-
coccus pneumoniae early response genes to human lung epithelial cells”, BMC
Research Notes, 2008, 1, 64–73.
(302) C. Trappetti, A. Kadioglu, M. Carter, J. Hayre, F. Iannelli, G. Pozzi, P. W. An-
drew and M. R. Oggioni, “Sialic acid: ApKa preventable signal for pneumococcal
biofilm formation, colonization, and invasion of the host”, The Journal of Infec-
tious Diseases, 2009, 199, 1497–1505.
BIBLIOGRAPHY 135
(303) E. Walther, Z. Xu, M. Richter, J. Kirchmair, U. Grienke, J. M. Rollinger, A.

Krumbholz, H. P. Saluz, W. Pfister, A. Sauerbrei et al., “Dual acting neuraminidase
inhibitors open new opportunities to disrupt the lethal synergism between Strep-
tococcus pneumoniae and influenza virus”, Frontiers in Microbiology, 2016, 7.
(304) Z. Xu, S. Von Grafenstein, E. Walther, J. E. Fuchs, K. R. Liedl, A. Sauerbrei and
M. Schmidtke, “Sequence diversity of NanA manifests in distinct enzyme kinetics
and inhibitor susceptibility”, Scientific Reports, 2016, 6, 357–366.
(305) C. R. Søndergaard, M. H. Olsson, M. Rostkowski and J. H. Jensen, “Improved
treatment of ligands and coupling effects in empirical calculation and rational-
ization of pKa values”, Journal of Chemical Theory and Computation, 2011, 7,
2284–2295.
(306) A. MacKerell Jr, D. Bashford, M. Bellott, R. Dunbrack Jr, J. Evanseck, M. Field,
S. Fischer, J. Gao, H. Guo, S. Ha et al., “All-atom empirical potential for molec-
ular modeling and dynamics studies of proteins”, The Journal of Physical Chem-
istry B, 1998, 102, 3586–3616.
(307) O. Guvench, E. Hatcher, R. M. Venable, R. W. Pastor and A. D. MacKerell Jr,
“CHARMM additive all-atom force field for glycosidic linkages between hex-
opyranoses”, Journal of Chemical Theory and Computation, 2009, 5, 2353–2370.
(308) O. Guvench, S. S. Mallajosyula, E. P. Raman, E. Hatcher, K. Vanommeslaeghe,
T. J. Foster, F. W. Jamison and A. D. MacKerell Jr, “CHARMM additive all-
atom force field for carbohydrate derivatives and its utility in polysaccharide and
carbohydrate–protein modeling”, Journal of Chemical Theory and Computation,
2011, 7, 3162–3180.
(309) W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey and M. L. Klein,
“Comparison of simple potential functions for simulating liquid water”, The Jour-
nal of Chemical Physics, 1983, 79, 926–935.
(310) P. Schaefer, D. Riccardi and Q. Cui, “Reliable treatment of electrostatics in com-
bined QM/MM simulation of macromolecules”, The Journal of Chemical Physics,
2005, 123, 014905.
(311) J.-P. Ryckaert, G. Ciccotti and H. J. Berendsen, “Numerical integration of the
cartesian equations of motion of a system with constraints: molecular dynamics
of n-alkanes”, Journal of Computational Physics, 1977, 23, 327–341.
(312) R. D. Guthrie and W. P. Jencks, “IUPAC recommendations for the representation
of reaction mechanisms”, Accounts of Chemical Research, 1989, 22, 343–349.
(313) R. M. O’Ferrall, “Relationships between E 2 and E 1c B mechanisms of β -
elimination”, Journal of the Chemical Society B: Physical Organic, 1970, 274–
277.
BIBLIOGRAPHY 136
(314) L. Pauling, “Atomic radii and interatomic distances in metals”, Journal of the
(315) G. Pierdominici-Sottile, N. A. Horenstein and A. E. Roitberg, “Free energy study
of the catalytic mechanism of trypanosoma cruzi trans-sialidase. From the michaelis
complex to the covalent intermediate”, Biochemistry, 2011, 50, 10150–10158.
(316) J. A. Bueren-Calabuig, G. Pierdominici-Sottile and A. E. Roitberg, “Unraveling
the differences of the hydrolytic activity of Trypanosoma cruzi trans-sialidase
and Trypanosoma rangeli sialidase: A quantum mechanics–molecular mechanics
modeling study”, The Journal of Physical Chemistry B, 2014, 118, 5807–5816.
(317) A. G. Watts, I. Damager, M. L. Amaya, A. Buschiazzo, P. Alzari, A. C. Frasch and
S. G. Withers, “Trypanosoma cruzi trans-sialidase operates through a covalent
sialyl-enzyme intermediate: Tyrosine is the catalytic nucleophile”, Journal of the
(318) M. F. Amaya, A. G. Watts, I. Damager, A. Wehenkel, T. Nguyen, A. Buschiazzo,
G. Paris, A. C. Frasch, S. G. Withers and P. M. Alzari, “Structural insights into
the catalytic mechanism of Trypanosoma cruzi trans-sialidase”, Structure, 2004,
12, 775–784.
(319) Q. Cui and M. Elstner, “Density functional tight binding: Values of semi-empirical
methods in an ab initio era”, Physical Chemistry Chemical Physics, 2014, 16,
14368–14377.
(320) J. C. Kromann, A. S. Christensen, Q. Cui and J. H. Jensen, “Towards a barrier
height benchmark set for biologically relevant systems”, PeerJ, 2016, 4, e1994.
(321) A. T. Pereira, A. J. Ribeiro, P. A. Fernandes and M. J. Ramos, “Benchmarking
of density functionals for the kinetics and thermodynamics of the hydrolysis of
glycosidic bonds catalyzed by glycosidases”, International Journal of Quantum
Chemistry, 2017, 117, e25409.
(322) M. Gruden, L. Andjeklović, A. K. Jissy, S. Stepanović, M. Zlatar, Q. Cui and
M. Elstner, “Benchmarking density functional tight binding models for barrier
heights and reaction energetics of organic molecules”, Journal of Computational
Chemistry, 2017, 38, 2171–2185.
(323) R. P. Neves, P. A. Fernandes, A. J. Varandas and M. J. Ramos, “Benchmarking
of density functionals for the accurate description of thiol–disulfide exchange”,
Journal of Chemical Theory and Computation, 2014, 10, 4842–4856.
(324) N. Mardirossian and M. Head-Gordon, “How accurate are the minnesota density
functionals for noncovalent interactions, isomerization energies, thermochemistry,
and barrier heights involving molecules composed of main-group elements?”,
Journal of Chemical Theory and Computation, 2016, 12, 4303–4325.
BIBLIOGRAPHY 137
(325) “Molecular mechanism of the glycosylation step catalyzed by Golgi α-mannosidase

II: A QM/MM metadynamics investigation”, Journal of the American Chemical
Society, 2010, 132, 8291–8300.
(326) A. Chandrasekaran, A. Srinivasan, R. Raman, K. Viswanathan, S. Raguram, T. M.
Tumpey, V. Sasisekharan and R. Sasisekharan, “Glycan topology determines hu-
man adaptation of avian H5N1 virus hemagglutinin”, Nature Biotechnology, 2008,
26, 107–113.
(327) J. K. Hayre, G. Xu, L. Borgianni, G. L. Taylor, P. W. Andrew, J.-D. Docquier and
M. R. Oggioni, “Optimization of a direct spectrophotometric method to investi-
gate the kinetics and inhibition of sialidases”, BMC Biochemistry, 2012, 13, 19–
25.
(328) A. J. Ribeiro, M. J. Ramos and P. A. Fernandes, “Benchmarking of DFT function-
als for the hydrolysis of phosphodiester bonds”, Journal of Chemical Theory and
Computation, 2010, 6, 2281–2292.
(329) L. Goerigk and S. Grimme, “A thorough benchmark of density functional meth-
ods for general main group thermochemistry, kinetics, and noncovalent interac-
tions”, Physical Chemistry Chemical Physics, 2011, 13, 6670–6688.
(330) H. Yu and W. F. Van Gunsteren, “Accounting for polarization in molecular simu-
lation”, Computer Physics Communications, 2005, 172, 69–85.
(331) D. P. Geerke, S. Thiel, W. Thiel and W. F. van Gunsteren, “Combined QM/MM
molecular dynamics study on a condensed-phase SN2 reaction at nitrogen: the
effect of explicitly including solvent polarization”, Journal of Chemical Theory
and Computation, 2007, 3, 1499–1509.
(332) V. L. Yip, J. Thompson and S. G. Withers, “Mechanism of GlvA from Bacillus
subtilis: A detailed kinetic analysis of a 6-phospho-α-glucosidase from glycoside
hydrolase family 4”, Biochemistry, 2007, 46, 9840–9852.
(333) P. V. Harris, D. Welner, K. McFarland, E. Re, J.-C. Navarro Poulsen, K. Brown,
R. Salbo, H. Ding, E. Vlasenko, S. Merino et al., “Stimulation of lignocellulosic
biomass hydrolysis by proteins of glycoside hydrolase family 61: Structure and
function of a large, enigmatic family”, Biochemistry, 2010, 49, 3305–3316.
(334) N. Asano, “Glycosidase inhibitors: Update and perspectives on practical use”,
Glycobiology, 2003, 13, 93R–104R.
(335) F. Marcelo, Y. He, S. A. Yuzwa, L. Nieto, J. Jiménez-Barbero, M. Sollogoub, D. J.
Vocadlo, G. D. Davies and Y. Blériot, “Molecular basis for inhibition of GH84
glycoside hydrolases by substituted azepanes: Conformational flexibility enables
probing of substrate distortion”, Journal of the American Chemical Society, 2009,
131, 5390–5392.
BIBLIOGRAPHY 138
(336) S. Thobhani, B. Ember, A. Siriwardena and G.-J. Boons, “Multivalency and the
mode of action of bacterial sialidases”, Journal of the American Chemical Society,
2003, 125, 7154–7155.
Appendix A
Exploring the catalytic mechanism of

the S. pneumoniae sialidases NanA
A.1 DFTB3/MM simulations

The combined DFTB3/MM method has been shown to be applicable to biomolecular sys-
tems (See details in Chapter 2). We first attempted to apply this method to carry out
explicit solvent simulations of the Michaelis complex of NanA. Surprisingly, an autodis-
sociation of C2-O bond in substrate was observed. Due to the time restraint of my PhD
study, we decided to adopt SCC-DFTB/MM and this problem will be further investigated
by another group member.
A.2 Additional MEP calculations for the second step of

NanA along the reaction path of PathC
The 2D energy profile on the reaction path of the second step for NanA based on the mech-
anism of NanC176 was evaluated with two predefined reaction coordinates, rxn3=d5-d6
and rxn7=d10-d11. This reaction pathway differs from the PathC investigated in Chapter
2 which has a catalytic water molecule involved. d5 refers to the distance between Tyr752
Oη and C2 position of the sugar ring, d6 corresponds to the one between Tyr752 Oη and
Hη, d10 refers to the one between C3 and H32 in substrate, and d12 accounts for that
between H32 of the sugar ring and Asp372 Oε2 (Fig. A.1). rxn3 was harmonically re-
strained from -0.22 to 2.02 Å while rxn7 was scanned from -1.63 to 1.53 Å using a force
constant of 2000.0 kcal/mol/Å2 . The scans were constructed in steps of 0.08 Å. A total
of 280 energy points have been obtained in the 2D MEP calculations. The contour plot
of the energy profiles is shown in Fig. A.2. The energy barrier is around 43.8 kcal/mol
which is significantly higher than the one for PathC in NanA (≈28.0 kcal/mol). This is a
further proof that NanA can not produce Neu5Ac along the reaction path of NanC.
139
APPENDIX A. THE CATALYTIC MECHANISM OF NANA 140
A.3 The extension MEP calculations for PathB and PathC

As the proton transfers between the Tyr752 and Glu647 involved in both PathB and PathC
have not been completed during their PC states, we expanded the MEP calculations with
the reaction coordinate rxn3=d5-d6. The results are shown in Fig. A.3 and Fig. A.4. For
these two simple linear 1D maps, one can see that minor barrier were observed (≈ 3.3
kcal/mol and ≈ 4.2 kcal/mol for PathB and PathC, respectively) to complete the whole
process of reaction. As we described in the main context, NanA is capable of producing
Neu5Ac along PathA with a relatively lower energies. These findings provide further
evidence that PathA is the most likely reaction pathway for NanA.
Tyr752
Glu647
d6
O O H O
+
HO OH d5 O H2N
Arg
O H2N
AcHN O- +
HO HO d10
H
d12
O O-
Asp372
OH
HO
O COOH
AcHN
OH HO
Figure A.1: The alternative mechanism for the second step of NanA along PathC.
1.5 0
3
42 6
9
1.0
12
15
18
21
0.5
24
42 27
30
0.0 33
rxn7(Å )
36
39
42
-0.5 45
-1.0
-1.5
42
0.0 0.5 1.0 1.5 2.0
rxn3(Å )
Figure A.2: Potential energy contour plot for the second step of NanA based on the
mechanism proposed for NanC. Energies are in kcal/mol.
Figure A.3: 1D potential energy plot of the subsequent reaction for PathB in NanA.
The last point (labeled in red) represents a free minimization without restraints on the
reaction coordinates.
Figure A.4: 1D potential energy plot of the subsequent reaction for PathC in NanA.
Appendix B
Comparative studies of catalytic

pathways for S. pneumoniae NanB and
NanC
B.1 Additional MEP calculations for the second steps of

NanB and NanC along the reaction path of PathC
The 2D MEP sampling on the second reaction steps along PathC for NanB and NanC
based on the mechanisms that have a catalytic water molecule involved were evaluated
with two predefined reaction coordinates, rxn3=d5-d6 and rxn7=d10-d12. d5 refers to the
distance between the tyrosine Oη (Tyr653 and Tyr695 of NanB and NanC separately) and
C2 position of the sugar ring, d6 corresponds to the one between the tyrosine Oη and Hη,
d10 refers to the one between C3 and H32 in substrate, and d12 accounts for that between
H32 of the sugar ring and the aspartic acid Oε2 (Asp270 and Asp315 for NanB and NanC,
respectively) (Fig. B.1). For NanB, rxn3 was harmonically restrained from -0.25 to 2.50
Å while rxn7 was scanned from -1.66 to 2.50 Å. For NanC, rxn3 was sampled from -0.19
to 2.45 Å while rxn7 was scanned from -3.17 to 2.59 Å. The scans were constructed in
steps of 0.08 Åusing a force constant of 2000.0 kcal/mol/Å2 . The contour plots of the
energy profiles are illustrated in Fig. B.2 and Fig. B.3. The energy barrier for the two
systems (≈34.8 and 38.2 kcal/mol, respectively) is much higher than the one based on the
mechanisms that without the catalytic water involved. These data support the proposed
mechanisms in this work but contradict with the previously proposed mechanisms.176
B.2 The extension MEP calculations for NanB and NanC

As the proton transfers between the tyrosine and the glutamic acid residue (NanB, Tyr653
and Glu541; NanC, Tyr695 and Glu584) involved in PathA, PathB and PathC have not
144
APPENDIX B. THE CATALYTIC PATHWAYS OF NANB AND NANC 145
been completed during their PC states, we expanded the MEP calculations with the reac-
tion coordinate rxn=d6. The results are shown in Fig. A.3 and Fig. A.4. From the linear
1D maps (Fig. B.4 to Fig. B.9), one can see that more energies are needed (≈ 3.8, 2.3
and 1.8 kcal/mol for PathA, PathB and PathC of NanB, respectively; ≈ 2.3, 1.9 and 2.0
kcal/mol for PathA, PathB and PathC of NanC, respectively;) to complete the whole pro-
cess of reaction (PC2). For the most likely reaction pathway PathB of NanB and PathC
of NanC, we also performed single point calculations on the structure from the optimized
MEP extension results. An energy of 2.7 and 2.2 kcal/mol is required to achieve PC2
along PathB of NanB and PathC of NanC, respectively. The complete potential energy
profiles for the second step for PathB of NanB and PathC of NanC based on M06-2X/MM
calculations are shown in Fig. B.10.
Previous experimental studies showed that the catalytic efficiency of NanA is much
higher than NanB and NanC.173,175,327 This is inconsistent with our results that NanA
has a higher reaction energy barrier (Fig. B.10). Previous work has shown that different
buffers, pH optimum and substrates can affect the enzymatic activity of sialidases.327,336
Additionally, it has been strongly suggested that the lectin/carbohydrate-binding module
(CBM) can greatly enhance the catalytic efficiency of large sialidases by binding the
catalytic domain with the polyvalent substrate.336 However, we only focus on the catalytic
domains of NanA, NanB and NanC with GSBP setup (See Section 2.3.1 in Chapter 2 for
more information) in our simulations, bringing considerable approximations during the
calculations. No structural studies have been performed on the interactions between the
CBM40 domains and the substrates of NanA, it is envisaged that CBM40 domains of
NanA may have the biggest effect on its catalytic efficiency.
B.3 Comparison of the mechanisms for NanA, NanB and

NanC from the intermediate to the product state
The values for the distance at the critical points of the preferred pathways of the three
sialidases (PathA of NanA, PathB of NanB and PathC of NanC) are presented in Table
B.1. At the TS1, the proton transfer between the tyrosine and the glutamic acid residue
is almost completed (d4≈1.18, 1.14 and 1.21 Å, respectively). All Tyr752, Tyr653 and
Tyr695 in the three cases are more negatively charged, that enhances them being a nucle-
ophile (See Chapter 3 for more information). The hydroxyl oxygen of the nucleophilic
tyrosine is moving closer to the C2 position of the substrate, the distance between which
is 2.24 Å in NanB which is the shortest one among the three cases (d3≈2.40 and 2.64
Å for NanA and NanC, respectively), making Tyr653 of NanB a better nucleophile. The
C2 (sialyl)-O (methyl group) bond is already cleaved (d1≈2.07, 2.24 and 2.10 Å, respec-
tively). Additionally, the proton transfer between the aspartic acid residue and the methyl
group has been completed in NanB and NanC (d2≈1.05 Å and 1.02 Å, respectively), ex-
cept in NanA (d2≈1.22 Å, See Fig. 3.10, Fig. 4.13 and Fig. 4.14). Therefore, the reaction
energy barrier of the first step of NanB is the lowest one (Fig. 3.5, Fig. 4.3 and Fig. 4.4).
At the TS2, the tyrosine Hη-Oη bond formation is beginning in NanA and NanC
(d6≈1.15 and 1.23 Å, respectively), while the Tyr653 Hη still binds with Glu541 Oε
instead of moving towards to the Tyr653 Oη in NanB (d6≈1.59 Å). In the meantime,
the sialyl cation O7 is ∼1.95 Å closer to the C2 position of the substrate in NanB while
the catalytic water oxygen moves slower in NanA (∼1.15 Å closer to the anomeric C2).
The cleavage of C3 (sialyl)-H32 (sialyl) bond and the formation of Asp315 Oε2-H32
(sialyl) bond have already been completed in NanC (d10≈1.59 Å and d11≈1.05 Å, Fig.
4.20). With these in mind, NanC reached its TS2 first, followed by NanB and NanA,
respectively. This is consistent with the MEP results.
Table B.1: Reaction coordinate distances (Å) at RC, TS1, IC, TS2 and PC in PathA of
NanA, PathB of NanB and PathC of NanC.
1st step 2nd step

RC TS1 IC IC TS2 PC
PathA of NanA d1 1.50 2.07 3.47 d5 1.52 2.63 3.05
d2 1.62 1.22 1.03 d6 1.95 1.15 1.09
d3 3.91 2.40 1.54 d7 3.26 2.11 1.50
d4 1.68 1.18 1.00 d8 1.79 1.66 1.01
PathB of NanB d1 1.52 2.24 3.53 d5 1.52 2.19 2.91
d2 1.65 1.05 1.02 d6 1.64 1.59 1.19
d3 2.99 2.24 1.57 d9 3.96 2.01 1.52
d4 1.27 1.14 1.01
PathC of NanC d1 1.49 2.10 3.49 d5 1.62 2.48 2.67
d2 2.34 1.02 0.98 d6 1.81 1.23 1.18
d3 3.49 2.64 1.57 d10 1.09 1.59 2.49
d4 1.24 1.21 1.00 d11 2.92 1.05 0.98
Tyr
Glu
d6
O O H O
+
HO OH d5 O H 2N
Arg
O H 2N
AcHN O- +
HO HO d10
H
d12 O
H
H
O O-
Asp
OH
HO
O COOH
AcHN
OH HO
Figure B.1: The alternative mechanism for the second step of NanB along PathC
2.5
0
3
6
2.0
9
12
15
1.5
18
21
24
1.0
27
33 30
33
rxn7(Å )
0.5
36
39
33
0.0 42
45
-0.5
-1.0
-1.5
0.0 0.5 1.0 1.5 2.0 2.5
rxn3(Å )
Figure B.2: Potential energy contour plot for the second step of NanB based on the
mechanism which has a catalytic water involved. Energies are in kcal/mol.
0
3
2 6
9
12
15
18
1
36 21
24
27
30
0
33
rxn7(Å )
36
39
42
-1 36 45
-2
-3
0.0 0.5 1.0 1.5 2.0
rxn3(Å )
Figure B.3: Potential energy contour plot for the second step of NanC based on the
mechanism which has a catalytic water involved. Energies are in kcal/mol.
Figure B.4: 1D potential energy plot of the subsequent reaction for PathA in NanB.
Figure B.5: 1D potential energy plot of the subsequent reaction for PathB in NanB.
Figure B.6: 1D potential energy plot of the subsequent reaction for PathC in NanB.
Figure B.7: 1D potential energy plot of the subsequent reaction for PathA in NanC.
Figure B.8: 1D potential energy plot of the subsequent reaction for PathB in NanC.
Figure B.9: 1D potential energy plot of the subsequent reaction for PathC in NanC.
30
25.9
26.0
25 PathA of NanA
PathB of NanB
PathC of NanC
20
17.8 18.0
TS2
15
11.3
10
8.0
7.4
2.2
0.0 0.0
0
-2.0
IC PC
PC2
-5
Figure B.10: Potential energy profiles for PathA of NanA (black), PathB of NanB (pur-
ple) and PathC of NanC (blue). This potential energy was obtained from single point
calculation based on M06-2X/6-31+G(d,p).

Understanding The Catalytic Mechanisms of Selected Glycoside Hydr

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Understanding The Catalytic Mechanisms of Selected Glycoside Hydr

Uploaded by

Copyright:

Available Formats

University of Wollongong

University of Wollongong Thesis Collection University of Wollongong Thesis Collections

Understanding the Catalytic Mechanisms of Selected Glycoside Hydrolases

Follow this and additional works at: https://ro.uow.edu.au/theses1

The University of Wollongong

June 14, 2018

Carbohydrate-active enzymes (CAZymes) are responsible for the synthesis, modification

Xiao K, Yu H. “Rationalising pKa shifts in Bacillus circulans xylanase with computa-

Xiao K, Yu H. “Exploring the catalytic mechanisms of the S. pneumoniae sialidases

Xiao K, Yu H. “Comparative studies of catalytic pathways for S. pneumoniae NanB and

Xiao K, Montgomery AP, Wang X, Skropeta D, and Yu H. (Invited Talk) “Mechanistic

I would like to acknowledge my supervisors Dr Haibo Yu and A/Prof Danielle Skropeta

C. perfingens Clostridium perfringens

Technical & Miscellaneous

AIDS Acquired immune deficiency syndrome

2,7-anhydro-Neu5Ac 2,7-anhydro-α-N-acetylneuraminic acid

molecular dynamics (MD), combined quantumn mechanics and molecular mechanics

Publications and Conference Presentations v

List of Tables xxiv

2 pKa shifts in BcX 28

3 The catalytic mechanism of NanA 48

4 The catalytic pathways of NanB and NanC 78

5 Summary and future work 105

A The catalytic mechanism of NanA 139

B The catalytic pathways of NanB and NanC 144

1.1 Diversity of CAZymes and auxiliary enzymes, according to the CAZy

1.1 Six types of enzyme catalysts.58,59 . . . . . . . . . . . . . . . . . . . . . 9

2.1 Summary of the starting structures, the equilibrium molecular dynamics

1.1 The carbohydrate-active enzymes

(1) α-D-Glucopyranose (2) β-D-Glucopyranose

The annotation of CAZymes is never a trivial task. Several state-of-the-art annotation

1.1.1 Puckering of carbohydrate structures

B2,5 OH OH B1,4 B2,5 3H 3H B1,4

1,4B 4H 2H 2,5B 1,4B 1H 5H 2,5B

1.1.2 Cremer-Pople puckering coordinates

Figure 1.4: Cremer-Pople puckering coordinates of a pyranose ring. Figure adapted

1.2 Glycoside hydrolases

1.2.1 The conformational catalytic itinerary in GHs

General base Oxocarbenium ion-like TS

Pople itinerary, predicting that a 1 S5 → B2,5 → O S2 itinerary occurs for β -mannanase.

1.2.2 pK a shifts in GHs

Table 1.1: Six types of enzyme catalysts.58,59

Types of enzyme catalysts Examples

1.3 Bacillus circulans xylanase

generally referred to heteropolysaccharides that are based on a backbone structure of β -

1.4 The sialidase superfamily

Glu172 Glu172 Glu172

Glu78 Glu78 Glu78

Covalent Intermediate Michaelis Complex

membranes-associated sialidases and intracellular-associated sialidases, respectively.96,98,99

1.4.1 Three S. pneumoniae sialidases NanA, NanB and NanC

1.5 Computational methods

1.5.1 Molecular dynamics simulations

1.5.2 Combined quantum mechanics/molecular mechanics methods

1.5.2.1 The QM model

where cµi , the molecular orbital coefficients, issuing in KS equations

1.5.3 pK a and pK a shift calculations in enzymes

Then pKa is defined in the form of a negative logarithm of Ka

where kB is the Boltzmann constant and T is the temperature.

1.5.3.1 Thermodynamic integration (TI) based on explicit solvent QM/MM simu-

∆pKa = pKa (E-A) − pKa (model-A)

∆∆G = ∆G(E-A(H/D)) − ∆G(model-A(H/D)) (1.13)

1.5.3.2 The multiconformation continuum electrostatics (MCCE)