Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Quantitative computer-aided studies of artificial and enantioselective enzymes
(USC Thesis Other)
Quantitative computer-aided studies of artificial and enantioselective enzymes
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
QUANTITATIVE COMPUTER-AIDED STUDIES OF ARTIFICIAL AND
ENANTIOSELECTIVE ENZYMES
by
Maria P. Frushicheva
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(CHEMISTRY)
December 2012
Copyright 2012 Maria P. Frushicheva
ii
EPIGRAPH
“Disciplining yourself to do what you know is right
and important, although difficult, is the highroad to pride,
self-esteem, and personal satisfaction”.
- Margaret Thatcher
iii
DEDICATION
To my mother,
Svetlana A. Frushicheva
iv
ACKNOWLEDGMENTS
I am grateful to Professor Arieh Warshel for the outstanding guidance and
mentorship during my graduate studies. Professor Warshel’s passion for science,
scientific thinking and intuition have introduced to me a scientific research philosophy
that I will carry through my career and onward.
Thank you to Professor Peter Z. Qin for the guidance and mentorship during the
first years of my Ph.D. program. Professor Peter Z. Qin with Professor Ian S. Haworth
introduced me to Molecular Dynamics field and the field of computational biophysics.
During the years in Professor Peter Z. Qin’s research group, I learned and developed the
essential skills for my growth as independent and critical thinking research scientist.
I am very thankful to my committee members Professor Aiichiro Nakano and
Professor Steve E. Bradforth for their willingness to be a part of this process. Thank you
to my screening and qualifying committee members Professor Richard L. Brutchey,
Professor Barry C. Thompson and Professor Robert Bau for their service in advance to
my candidacy. I also would like to thank Professor Hanna Reisler and Professor Andrey
Vilesov for their supportive advice and guidance during my graduate studies.
I am highly appreciated the friendship and mentorship of Professor Warshels’s
and Professor Qin’s past and current group members. I was honored to work with Dr.
Maite Roca, who introduced me to the field of computational chemistry. I am very
thankful to Dr. Zhen T. Chu for his help with the code implementation of our scientific
software. I am very proud to acknowledge my dearest colleagues, Dr. Nidhi Singh and
v
Dr. Gian P. Grant, for their irreplaceable friendship, help and support during all the stages
of my Ph.D. program.
I am very thankful for the greatest support of my dearest friends, Ms. Daria
Osinnikova, Ms. Natalia Osinnikova, Ms. Elena Altukhova, Dr. Maria Tatanova, Dr.
Anna M. Popova, Dr. Inessa Bychinskaya and Ms. Beate Burkhart.
I am very grateful to Mr. Petr P. Khlyabich for his friendship, care and support,
for his inspiration and positive spirit which always cheered me up during the difficult
moments of my student life.
I am very thankful for the love, care and support of my mother, Svetlana A.
Frushicheva. She always has been an important part of making me who I am today.
vi
TABLE OF CONTENTS
Epigraph ii
Dedication iii
Acknowledgments iv
List of Tables viii
List of Figures x
Abbreviations xv
Abstract xvii
Introduction 1
Chapter One: Exploring Challengers in Rational Enzyme Design by Simulating
the Catalysis in Artificial Kemp Eliminase 4
Introduction 4
Systems and Methods 6
System 6
Barrier Estimates 8
EVB Calibration Procedure 9
Simulation Protocol 14
Results 15
Validation on Existing Constructs 15
Group Contributions: Initial Screening 18
Analysis of Suggested Mutations 24
Rate Increase in Directed Evolution Mutants 26
Linear Response Approximation 29
Prospect for Computer-Aided Directed Evolution 32
Discussion 36
Conclusions 38
Chapter Two: Methodology: Empirical Valence Bond Approach 44
Empirical Valence Bond Theory 44
EVB Performance 49
Chapter Three: Challengers and Advances in Validating Enzyme Design
Proposals: The Case of the Kemp Eliminase Catalysis 52
Introduction 52
Systems 55
vii
Methods 61
Simulation Protocol 62
Group Contributions 63
Results 65
Modeling the Catalytic Effects in Different Kemp Eliminases 65
Examining the LFER Trends 68
Optimizing the Long Range Electrostatic Effects 73
Examination of the Inactive Designs 79
Conclusions 80
Chapter Four: Towards Quantitative Computer-Aided Studies of Enzymatic
Enantioselectivity: The Case of Candida antarctica Lipase A 88
Introduction 88
Concept of Enantioselectivity 91
System 92
System Preparation 93
Methods 96
EVB Calibration Procedure 96
Empirical Valence Bond Simulations 97
Simulation Protocol 98
Direct Calculation of E 103
Group Contributions: Linear Response Approximation/β 105
Results 107
Calculated Activation Free Energies 107
Calculating the Mutational Effects on E’ 108
Calculating the Total Selectivity and Its Changes by Mutational
Effects 111
Exploring Computer-Aided Refinement of Enantioselectivity 114
Conclusions 117
Perspectives 120
Bibliography 123
Appendices
Appendix A: Overlap of the EVB and ab initio TS Structures in Solution
for Kemp Elimination Reaction 135
Appendix B: The Standard Deviation for Calculated ∆g
‡
cat
for
the Reaction of CALA and Its Mutants 136
viii
LIST OF TABLES
Table 1.1 The ∆g
‡
cat
for the systems explored in this work. 7
Table 1.2 Parameters used in EVB calculations. 12
Table 1.3 Group contributions obtained using the structure of the wild
type enzyme for charge set A and set B. 21
Table 1.4 Comparing the LRA contributions of the solvation free
energies in the Kemp eliminase and DhlA systems. 28
Table 1.5 Predicted catalytic contributions from some ionized
residues. 34
Table 2.1 The efficiency of the EVB screening calculations. 51
Table 3.1 The correlation between the observed and calculated ∆g
‡
cat
(in kcal/mol) for the Kemp Elimination reaction with
different catalytic bases (acceptors) and substrates (donors). 58
Table 3.2 Experimental LFER for the acceptor and donor pK
a
in
solution . 59
Table 3.3 The calculated reaction barriers for the KE70 designed
variant for 20 starting conformations. 66
Table 3.4 Correlation between the calculated and observed acceptor
pKa at ε
p
=4.0 and ε
p
=6.0 in the protein. 70
Table 3.5 Calculated LFER for the acceptor and donor pK
a
in the
protein at ε
p
=4.0. 72
Table 3.6 Group contributions from the distanced protein residues of
the 13G5 catalytic antibody due to the charge change upon
moving from the RS to the TS and from the RS to the PS.
The residue numbers from 2 to 215 and from 216 to 431 are
located at the heavy (H) and light (L) chains of the 13G5
catalytic antibody respectively. The group contributions are
for changing the charge at the indicated site from zero to
plus 1.0 . 75
Table 3.7 The calculated ∆g
‡
cat
(in kcal/mol) for the Kemp Elimination
reaction of inactive enzymes. 80
ix
Table 3.8 Estimating the efficiency of the EVB screening calculations. 85
Table 4.1 Parameters used in the EVB calculations. 100
Table 4.2 The EVB calibrated parameters used in the present
calculations. 102
Table 4.3 Calculated and observed ∆g
‡
cat
for the reaction of CALA
and its mutants. 109
Table 4.4 Calculated and observed selectivities obtained using Eq. 4.5. 112
Table 4.5 Calculated and observed selectivities obtained using the
EVB barriers and the PDLD/S binding energies. 113
Table 4.6 The performance of the EVB, the TS binding and PDLD/S
binding calculations. 117
x
LIST OF FIGURES
Figure 1.1 A schematic description of the kemp elimination reaction. 6
Figure 1.2 The ab initio 2D (A) and 3D (B) free energy surfaces for the
kemp elimination reaction in solution. 9
Figure 1.3 The ab initio and EVB charges of the reactant state (RS),
transition state (TS) and product state (PS). The nature of
the charge used in each state is indicated in the figure, where
the ab initio (ai) charges are given in black, blue and red in
the RS, TS and PS, respectively. The qualitative EVB
charges of set B are given in black, red and red for the RS,
TS and PS, respectively. These charges are based on using
the ai RS and PS charges, where the TS charges are
generated as the average of the RS and PS charges. The
more quantitative EVB charges of set A are represents in
black, blue and blue for the RS, TS and PS, respectively.
These charges are obtained by using the ab initio RS
charges and then using the special PS charges of Eq. 1.1 that
force the EVB TS charges to reproduce the ab initio TS
charges. Using this set we have to realize that the PS
charges do not give the best representation of the PS and if
needed modify the PS free energy by performing
perturbation from the PS charges of set A to those of state B
while being at the PS configuration. 10
Figure 1.4 The structure of the native enzyme (PDB: 2RKX) with the
docked substrate. 16
Figure 1.5 Correlation between the calculated and observed activation
barriers. 17
Figure 1.6 Free energy profiles for the systems studied. The profiles are
given in terms of the mapping parameter rather than the
energy gap in order to have a more convenient and uniform
comparison. 17
xi
Figure 1.7 The group contributions (in kcal/mol) that reflect the
interactions between the protein residues with the charge
change upon moving from the RS to the TS (
s
Q ∆ in Eq.
1.2). The calculations are done with
ε
ij
=4 for all residues.
The largest negative contributions provide a rough guide for
the optimal sites for effective mutations that will enhance
the catalytic effect. 20
Figure 1.8 The electrostatic potential from the change of the substrate
charges upon moving from the RS to the TS (red and blue
for negative and positive potential respectively) and group
contributions (in kcal/mol) of key residues, where the
contributions are defined for the case where each residue is
positively charged. 24
Figure 1.9 Showing the water penetration near Glu101 in the wt (A)
and the R6 3/7 F mutant (B). As seen from the figure there
are fewer water molecules near Glu101 in the mutant. 26
Figure 1.10 Schematic representation of the reactant-state destabilization
effect in the kemp eliminase systems (wt and directed
evolution mutant R6 3/7F (A)) and the transition-state
stabilization in the DhlA (B). 29
Figure 1.11 The regions that provide potential sites for mutations that
involve charge change. 34
Figure 1.12 Group contributions (using charge set A) from residues that
provide optimal sites for changing the charge of different
residues. In this case the group contribution is given for
q=+1 rather than the charge of the specific group. 35
Figure 2.1 A description of the relationship between the diabatic free
energies ε
1
and ε
2
, the adiabatic ground state free energy E
g
and the off-diagonal element H
12
. Δg
≠
and ΔG
0
designate the
activation free energy and the reaction free energy
respectively. 46
Figure 2.2 The correlation between the calculated (in black) and
observed (in pattern) activation free energies for the
indicated systems, which are described in details in section
“Systems” of Chapter 3. 50
xii
Figure 2.3 The overlay of ten EVB calculated activation free energy
barriers. 50
Figure 3.1 (A) A schematic description of the Kemp elimination
reaction. (B) The substrate structures used in the study: 5-
nitrobenzisoxazole (1), 5,7-dinitrobenzisoxazole (2),
unsubstituted benzisoxazole (3) and 6-
glutaramidebenzisoxazole (4). 56
Figure 3.2 The structure of active site of the 34E4 catalytic antibody
with 5-nitrobenzisoxazole substrate. 60
Figure 3.3 Results of 20 EVB mapping runs for the KE70 (wt)
designed variant. 66
Figure 3.4 Correlation between the calculated and observed activation
barriers. 67
Figure 3.5 The observed and calculated LFER for the reaction of 5-
nitrobenzisoxazole in solution as a function of the acceptor
pK
a
(the experimental values of the pK
a
(A) are taken from
ref.
42
). 68
Figure 3.6 Experimental and calculated LFER for the reaction in water
as a function of the corresponding ∆ pK
a
. The experimental
pK
a
(D) were taken from ref.
31
, and pK
a
(A) was taken from
ref.
42
. More details can be found in Table 3.2. 69
Figure 3.7 The observed and calculated LFER as a function of the
acceptor pK
a
for the reaction in the protein. 70
Figure 3.8 The LFER as a function of ∆ pK
a
for the reaction in the
protein. 71
Figure 3.9 Group contributions from the all distanced (A) and ionized
distanced (B) residues. The calculations were done for the
13G5 catalytic antibody. The group contributions are for
changing the charge at the indicated site from zero to +1.0.
The residue numbers from 2 to 215 and from 216 to 431 are
located at the heavy and light chains of the 13G5 catalytic
antibody respectively. 74
Figure 3.10 Predicted sites of the most effective mutations for catalysis
by distanced residues. 78
xiii
Figure 4.1 A schematic description of the acylation step in the catalytic
reaction of the CALA. The reaction is considered as a two-
step mechanism, where step (1) involves a proton transfer
from Ser184 to His366 and step (2) involves the attack of
the negatively charged serine on the carbonyl carbon of the
substrate and a formation of a tetrahedral intermediate. 93
Figure 4.2 The structure of active site of the wild type CALA with the
R and S enantiomers of the 4-nitrophenyl 2-
methylheptanoate substrate. 94
Figure 4.3 The calculated free energy profile for reaction in water. 97
Figure 4.4 The three resonance structures (
i
Ψ ) used to describe the
acylation reaction of CALA.
1
Ψ ,
2
Ψ and
3
Ψ describe the
reactant state, the product of the proton transfer step and the
tetrahedral intermediate, respectively. 98
Figure 4.5 A thermodynamic cycle for mutating R to S. The figure
describes schematically the mutation of CH
3
to H and H to
CH
3
. In practice, the mutations were done by converting
both the CH
3
and H to dummy atoms (d) and then
converting the dummy atoms (d) to H and CH
3
. 103
Figure 4.6 The calculated free energy profile of the wild type CALA (R
enantiomer). 108
Figure 4.7 The correlation between the calculated and observed
activation free energies of the catalytic hydrolysis of the R
and S enantiomers of the substrate by the wild type CALA
and its mutants. 110
Figure 4.8 The calculated (black bars) and observed (empty bars) E’
values for the wild type CALA and the 3 and 4 mutants. The
plot gives positive and negative values, respectively, to
E’(R) and E’(S). 111
Figure 4.9 The calculated (black bars) and observed (empty bars) E
values for the wild type CALA and some mutants. The plot
gives positive and negative values, respectively, to E(R) and
E(S). 112
xiv
Figure 4.10 The LRA/β group contributions for transferring the TS from
R to S for A) wild-type CALA and for mutants B) 3, C) 4
and D) 9. The contribution is negative when the creation of
the given residue from Gly stabilized the S TS and positive
when the creation of the given residue stabilizes the R TS.
Thus, for example, the finding that the group contribution
for F233 in the wt is negative means that the LRA predicts
that forming the F from G stabilized the S TS form. Now
this is consistent with the fact that the F233G stabilized the
R TS. 115
xv
ABBREVIATIONS
EVB Empirical Valence Bond
CAED Computer-Aided Enzyme Design
wt wild type
TS Transition State
RS Reactant State
PS Product State
MO Molecular Orbital
QM/MM Quantum Mechanical / Molecular Mechanics
ESP Electrostatic Potential
ai ab initio
DhlA Haloalkan Dehalogenase
RSD Reactant State Destabilization
GSD Ground State Destabilization
LFER Linear Free Energy Relationships
FEP/US Free Energy Perturbation / Umbrella Sampling
SCAAS Surface Constrained All Atom Solvent
LRF Local Reaction Field
CALA Candida antarctica lipase A
DFT Density Functional Theory
NAC Near Attack Conformation
PT Proton Transfer
xvi
NA Nucleophilic Attack
LRA Linear Response Approximation
CG Coarse Grained
xvii
ABSTRACT
A fundamental challenge in biotechnology and in biochemistry is an ability to
design effective enzymes. Despite a gradual progress on this front, most of the advances
have been made by placing the reacting fragments in the proper places rather than by
optimizing the environment preorganization, which is the key factor in enzyme catalysis.
Improving the preorganization and assessing the effectiveness of different design options
requires the ability to calculate the actual catalytic effect. Current work explores the
computer-aided refinement of catalytic activities (Kemp Eliminase artificial constructs
and catalytic antibodies) and enantioselectivities (stereoselective Candida antarctica
Lipase A enzymes) using the empirical valence bond as the main screening tool. The
observed absolute catalytic effect and the effect of directed evolution experiments are
reproduced and analyzed. The following refinement procedure located the mutation sites
with increased catalytic activities and selectivities which have been tested
experimentally. Those predictions provide a broad avenue for the design of novel
catalysts. With this in mind, our approach provides a quantitative understanding of the
catalytic effect of the various enzyme constructs and the results of directed evolution
experiments. Furthermore, our technique provides the ability to locate the effective
mutations that can increase k
cat
by refining the electrostatic preorganization of the protein
environment, which allows us to refine and develop more efficient catalysts.
1
INTRODUCTION
Most chemical processes in biological systems are performed and accelerated by
enzymes. And catalysis is one the most difficult features of proteins to reproduce.
103
It
requires the amino acid sequence that binds the substrate, stabilizes transition state, and
releases the product with maintaining the fold and stability of the protein.
65
Since protein
design requires accurate testing of a large number of potential sites on a diverse
collection of scaffolds, the probability of building a single active site onto a scaffold is
very low.
65
This problem can be accessed for example experimentally through the
directed evolution experiments. However, this method is limited to a single scaffold study
and requires intensive investigation and extensive sampling. Therefore computational
enzyme design opens a large number of possibilities, because many reactions can be
tested in silico on a large number of potential protein scaffolds before experimental
characterization.
65
Once the designed site is constructed, it can be optimized further using
in vitro evolution methods.
65
Recent years have seen major advances in the field of computational enzyme
design. Starting from 2008 the number of de novo enzyme constructs was designed,
optimized and crystallized.
36, 79, 88
Among them, the artificial Kemp Eliminases were
designed to catalyze the base-catalyzed benzisoxazole ring opening with accelerated
catalytic rates.
79
The successful design of this catalytic reaction was achieved using the
multi-level design protocol.
8, 79
The first step of this protocol was to choose a catalytic
mechanism and then to use quantum mechanical transition state calculations to create an
idealized active site with protein functional groups positioned to maximize transition
2
state stabilization.
79
Then the RosettaMatch algorithm
109
was applied to search for
collection of protein backbone positions capable of supporting these idealized active sites
in a large set of stable protein scaffolds with ligand-binding pockets and high-resolution
crystal structures.
79
Designs were screened for compatibility with substrate and product
and were ranked on the basis of the catalytic geometry and the computed transition-state-
binding energy.
79
The most efficient design constructs were selected for experimental
testing and further optimization by directed evolution experiments. However, the
resulting catalytic rates of artificial Kemp Eliminases were in the same order of
magnitude with that of catalytic antibodies,
17
and far behind of naturally evolved
enzymes. Therefore the computational redesign of artificial Kemp Eliminases (which is
described here in Chapter 1 and Chapter 3) allowed to probe and modify its catalytic
activities on qualitative and quantitative level, and provided insights of the enzyme action
on a molecular level.
24, 25
Methodology used for the computational protein redesign is
described in Chapter 2. The following computer-aided refinement provided the
information about the potential mutation sites which could affect enzyme function on a
quantitative level (via discrete contribution of different protein residues to the catalytic
activity).
24, 25, 76
These screening approaches (used in iterative way) could therefore guide
the directed evolution experiments.
In addition to computer-aided refinement of catalytic activities of artificial protein
constructs, the design of efficient catalysts for chiral chemistry has obtained an increasing
attention in recent years. The significant interest rises by the pharmaceutical industry for
production of isomerically pure pharmaceuticals and fine chemicals.
73, 83
However, the
3
computational redesign of enantioselectivity poses significant computational difficulties,
due to the very small differences that govern selectivity and small differences between
the steric and electronic effects.
8
The concept of the refinement of enantioselectivity is
introduced and quantified in Chapter 4 using the free energy perturbation (FEP)
approach.
26
The accurate correlation between the observed and calculated selectivity
values was reproduced using extensive sampling for obtaining convergent results.
26
4
CHAPTER ONE:
EXPLORING CHALLENGES IN RATIONAL ENZYME DESIGN BY
SIMULATING THE CATALYSIS IN ARTIFICIAL KEMP ELIMINASE
INTRODUCTION
Rational enzyme design is expected to have a great potential in industrial
application and eventually in medicine.
95
Furthermore, the ability to design efficient
enzymes might be considered as the best manifestation of a true understanding of enzyme
catalysis. However, at present there has been a limited success in most attempts of
rational enzyme design, and the resulting constructs have been much less effective than
the corresponding natural enzymes.
95
Furthermore, despite the progress in directed
evolution (e.g., ref.
43
), we do not have unique rationales for the resulting rate
enhancements.
Most attempts to identify the problems with the current rational design
approaches (for review see ref.
95
) have not been based on actual simulations of the given
effect. In fact, it has been argued
76, 106
that the problems are due to the incomplete
modeling of the transition state (TS) and to the limited awareness to the key role of the
reorganization energy. Even a recent attempt to use a molecular orbital-combined quantum
mechanical / Molecular mechanics (MO-QM/MM) approach
3
has not provided a
5
reasonable estimate of the observed catalytic effect or the trend of the mutational effects in
an artificially design enzyme. Thus, reproducing the effect of directed evolution and
eventually obtaining better performance in enzyme design are crucial for understanding
what is exactly missing in current approaches.
Semiquantitative computational studies of the effect of mutations on enzyme
catalysis date back to the empirical valence bond (EVB) simulations of the anticatalytic
effect of mutations of trypsin.
99
Subsequent calculations of known and predicted mutational
effect include EVB studies (e.g., refs
32, 50, 103, 106
and more recent (MO-QM/MM) studies
(e.g., refs
27, 54, 55, 62
). The above studies, and in particular the quantitative one, have
established the importance of the changes in reorganization energy upon mutations.
50, 106
Recent study
76
has explored the ability of the EVB approach to be used in quantitative
screening of design proposals. However, such approaches have not been used in actual
predictive design studies, and we are not aware of any study that used the ability to
determine activation barriers quantitatively as a successful guide in subsequent enzyme
design experiments.
We would like to clarify that it is not extremely difficult to use the EVB to
reproduce the overall catalytic effect or to suggest mutations with large anticatalytic effect
that will reduce the overall catalysis. This was demonstrated in several enzymes (e.g.,
refs
32, 76
). However, it is much harder to suggest mutations that will drastically increase
catalysis. Thus we tried to explore this challenge and chose as a test system the Kemp
elimination reaction.
42
The design of enzymes that catalyze this reaction has been the
center of recent excitements
17, 79
but as will be argued below the design has not obtained
6
effective transition state stabilization. Our study focused on both validation of the power of
the EVB and more importantly on exploring the effect of the directed evolution and on the
requirement for improving the catalysis of the previously designed enzymes.
SYSTEMS AND METHODS
System
The system chosen as a benchmark for the present study is the Kemp elimination
reaction
42
of 5-nitrobenzisoxazole with a carboxylic acid as a base (Figure 1.1).
Figure 1.1. A schematic description of the kemp elimination reaction.
In order to explore the catalysis in this system we quantified first the energetics of the
reaction in water.
106
Here one may try to use the rate constant of the uncatalyzed reaction
in water (1.2
.
10
-6
M
-1
s
-1
at pH =7.25),
79
but this reaction involves a large contribution
from the hydroxide ion OH
-
available at this pH.
42
Furthermore, as discussed repeatedly
(e.g., ref.
106
) it is much more useful and relevant to consider the “chemically filtered”
reference reaction,
106
which is defined as the solution reaction that follows exactly the
same mechanism as the one in the enzyme. The corresponding rate constant was
determined by taking the estimate
94
of the rate of the reaction of 5-nitrobenzisoxazole in
7
acetate buffer (5
.
10
-5
M
-1
s
-1
) and extrapolating it to 55M to obtain the rate constant (k
cage
)
of 0.0025 s
-1
, for the case when the donor and acceptor (the substrate and a carboxylic
acid) are placed at the same solvent cage. Similar conclusions were obtained by other
barrier estimates (“Barrier Estimates” section of Chapter 1), including ab initio
calculations (“EVB Calibration Procedure” section of Chapter 1). The relevant activation
barriers are given in Table 1.1. It must be emphasized that we are not talking on k
cat
/K
M
relative to the uncatalyzed reaction, where the effect of the enzyme is larger, since this
rate enhancement includes the well understood effect of binding to the active site
106
and
the real challenge is to optimize k
cat
.
Table 1.1. The ∆g
‡
cat
for the systems explored in this work.
a
System ∆g
‡
obs
, kcal/mol ∆g
‡
calc
, kcal/mol
Water (cage ) 21.2 21.1
KE07 (wt, PDB: 2RKX)
KE07 (wt, PDB: 2RKX)+H
2
O
20.1
20.1
(20.2) 19.5
20.4
R6 3/7 F (PDB: 3IIP) 17.8 16.1
R7 2/5 B (from R6 3/7F) 17.6 18.0
A9S (from wt) 20.1
b
(17.2) 19.3
A9N (from wt) 20.1
b
(17.5) 19.5
A9T (from wt) 20.1
b
21.6
a
In brackets the results with the qualitative charge set (set B) while
all other results correspond to the quantitative set (set A).
b
Olga Khersonsky
and Dan Tawfik personal communication.
Interestingly, the reference reaction in a solvent cage is not much slower than the
reaction catalyzed by the originally designed enzyme (k
cat
=0.02-0.3 s
-1
).
79
This indicates
8
that a major part of the catalytic effect is due to just placing the donor and acceptor in a
contact distance (see also ref.
17
).
Barrier Estimates
To be more certain about the energetics of the uncatalyzed reaction (which is
taken here as the “chemically filtered” reaction with the same mechanism as the reaction
in the enzyme) we considered several estimates. This included the estimate for the case
when the donor and acceptor (the substrate and a carboxylic acid) are placed at the same
solvent cage (k
cage
=0.0025 s
-1
). A similar estimate was obtained by taking the rate
constant of the Kemp elimination reaction for 5-nitrobenzisoxazole and pyridine as a base
(1.3
.
10
-4
M
-1
s
-1
) and interpolating from the pK
a
of pyridine (5.2) to that of aspartic acid
(4.0), using the observed
42
linear free energy relationship (LFER). The interpolation of
pyridine rate constant to pK
a
= 4.0 and to 55M provided a rate constant of 0.0012 s
-1
.
Note that the LFER of ref.
42
applies to different bases and thus to COO
-
. After applying
transition state theory we obtained
cage
g
≠
∆
=21.2 kcal/mol. Finally the ab initio
calculations (described in “EVB Calibration Procedure” section of Chapter 1) gave a
barrier of about 17-18 kcal/mol which is in the same range as the observed barrier
(considering the fact that we did not include entropic corrections). In addition to the
reference cage reaction we also estimated the reference reaction with acetate at a standard
state of 1M using the estimate
94
of 5
.
10
-5
M
-1
s
-1
. This gives
w
g
≠
∆
=23.5 kcal/mol. In
some cases the rate enhancement of Kemp eliminase enzymes has been given in terms of
9
k
cat
/K
M
relative to the rate of the uncatalyzed reaction. This of course can look like a very
large catalytic effect, but it actually includes the well understood effect of binding to the
active site,
106
whereas the real challenge is to optimize the chemical step (namely k
cat
,
relative to k
cage
).
EVB Calibration Procedure
In order to calibrate the EVB Hamiltonian we need information about the free
energy surface of the solution reaction. Thus we performed the hybrid density functional
theory (DFT) ab initio calculations of the initial gas phase geometry optimization at the
MPW1PW91/6-31+G* level, using the Gaussian03 package (Revision E.01).
23
The
resulting structures were solvated implicitly using the COSMO continuum model
7, 45
(note that the EVB treats the solvent explicitly) at the MPW1PW91/6-311++G** level.
The resulting ab initio surface is described in Figure 1.2.
Figure 1.2. The ab initio 2D (A) and 3D (B) free energy surfaces for the Kemp
elimination reaction in solution.
10
The effective atomic charges were determined from the electronic wave functions
by fitting the resulting electrostatic potential in the neighborhood of these molecules by
the electrostatic potential (ESP) procedure (using Merz-Kollman scheme),
9
and the
corresponding charges are given in Figure 1.3.
Figure 1.3. The ab initio and EVB charges of the reactant state (RS), transition state (TS)
and product state (PS). The nature of the charge used in each state is indicated in the
figure, where the ab initio (ai) charges are given in black, blue and red in the RS, TS and
PS, respectively. The qualitative EVB charges of set B are given in black, red and red for
the RS, TS and PS, respectively. These charges are based on using the ai RS and PS
charges, where the TS charges are generated as the average of the RS and PS charges.
The more quantitative EVB charges of set A are represents in black, blue and blue for the
RS, TS and PS, respectively. These charges are obtained by using the ab initio RS
charges and then using the special PS charges of Eq. 1.1 that force the EVB TS charges
to reproduce the ab initio TS charges. Using this set we have to realize that the PS
charges do not give the best representation of the PS and if needed modify the PS free
energy by performing perturbation from the PS charges of set A to those of state B while
being at the PS configuration.
The EVB surface was then fitted to the ab initio surface, requiring that the barrier
and the exothermicity will be similar. This was followed by adjusting the EVB solution
surface to reproduce the observed activation barrier. The active space EVB surface was
confined to the system in Figure 1.3. Furthermore, the ab initio calculations were used to
11
determine the effective atomic charges of the reactant, product and transition states. In
constructing the EVB surface we noted that the complex ab initio surface should be
usually modeled by three diabatic surfaces. However, for efficient calculations it is
desirable to stay with a two state model, but in the case of two EVB states, the problem is
to guarantee that the TS will have the correct ab initio charge distribution. Usually it is
sufficient to assign to the EVB charges for the reactant and product states were taken as
the corresponding ab initio charges for the RS and PS, however, in the present case the
average of these charges does not produce the correct TS charges. Thus we took a special
procedure using a two state model, where the PS charges are determined by
2
VB ai ai
PS TS RS
Q Q Q = − (1.1)
where ai designates ab initio. This grantees that the average of
VB
PS
Q
and
VB
TS
Q
will
produce
ai
TS
Q
. The corresponding charges are given in Figure 1.3 and are designate in this
work as set A charges. The other parameters of the calibrated EVB surface are also given
in Table 1.2. A comparison between the TS structures in solution, obtained by the EVB
and by the DFT calculations is given in Appendix A.
12
Table 1.2. Parameters used in the EVB calculations.
a
EVB and ab initio charges used in calculations:
ab initio set A set B
atom
number
atom
name
RS TS PS RS TS PS RS TS PS
1 O -0.13 -0.36 -0.77 -0.13 -0.36 -0.59 -0.13 -0.45 -0.77
2 N -0.60 -0.46 -0.61 -0.60 -0.46 -0.32 -0.60 -0.60 -0.61
3 C 0.49 0.06 0.58 0.49 0.06 -0.37 0.49 0.54 0.58
4 C -0.39 -0.41 -0.10 -0.39 -0.41 -0.43 -0.39 -0.25 -0.10
5 C 0.06 -0.08 -0.08 0.06 -0.08 -0.22 0.06 -0.01 -0.08
6 C -0.03 -0.03 -0.16 -0.03 -0.03 -0.03 -0.03 -0.09 -0.16
7 C -0.48 -0.63 -0.39 -0.48 -0.63 -0.78 -0.48 -0.44 -0.39
8 C -0.13 0.03 -0.34 -0.13 0.03 0.19 -0.13 -0.24 -0.34
9 C 0.49 0.73 0.71 0.49 0.73 0.97 0.49 0.60 0.71
10 H 0.15 0.32 0.49 0.15 0.32 0.49 0.15 0.32 0.49
11 O -0.99 -0.82 -0.67 -0.99 -0.82 -0.65 -0.99 -0.83 -0.67
12 C 0.98 0.84 0.74 0.98 0.84 0.70 0.98 0.86 0.74
13 O -0.99 -0.78 -0.62 -0.99 -0.78 -0.57 -0.99 -0.81 -0.62
14 N (NO
2
)
b
0.71 0.85 0.75 0.71 0.85 0.99 0.71 0.73 0.75
15 O (NO
2
)
b
-0.48 -0.53 -0.56 -0.48 -0.53 -0.58 -0.48 -0.52 -0.56
16 O
’
(NO
2
)
b
-0.49 -0.54 -0.55 -0.49 -0.54 -0.59 -0.49 -0.52 -0.55
17 H (C
4
)
b
0.42 0.36 0.20 0.42 0.36 0.30 0.42 0.31 0.20
18 H (C
6
)
b
0.18 0.19 0.20 0.18 0.19 0.20 0.18 0.19 0.20
19 H (C
7
)
b
0.23 0.26 0.18 0.23 0.26 0.29 0.23 0.21 0.18
Morse bond parameters:
( )
( )
0
2
() 1
bb
M
Mb D e
µ −−
∆= −
Bond type D
M
b
0
µ
C
i
-C
i
96.0 1.40 2.0
C
3
-N
2
94.0 1.40 2.0
C
9
-O
1
93.0 1.25 2.0
N
2
-O
1
91.0 1.40 2.0
C
i
-H
i
100.0 1.10 2.0
C
12
-O
j=11,12
91.0 1.40 2.0
C
8
-C
3
96.0 1.40 2.0
C
3
-N
2
110.0 1.14 2.0
O
11
-H
10
97.0 1.10 2.0
Angle parameters:
( )
2
0
( ) 12 Vk
θθ
θ θ θ = −
Angle type 1/2k
θ
θ
0
C
i
-C
i
-C
i
50.0 120.0
C
3
-N
2
-O
1
50.0 120.0
N
2
-O
1
-C
9
50.0 120.0
C
8
-C
3
-N
2
50.0 180.0
13
Table 1.2 (Continued). Parameters used in the EVB calculations.
a
Torsion angle parameters: ( ) ( )
0
( ) 1 cos k n V
φ φ
φ φ φ = + −
Angle type k
ф
n ф
0
C
i
-C
i
-C
i
-C
i
15.0 2.0 180.0
C
i
-C
i
-O
1
-N
2
15.0 2.0 180.0
C
i
-C
i
-C
i
-X 15.0 2..0 180.0
C
i
-C
i
-N
2
-O
1
15.0 2.0 180.0
H
10
-C
3
-N
2
-O
1
15.0 2.0 180.0
C
3
-N
2
-O
1
-C
9
15.0 2.0 180.0
O
13
-C
12
-O
11
-H
10
15.0 2.0 180.0
Improper torsion angle parameters: ( ) ( )
0
( ) 1 cos k n V
φ φ
φ φ φ = + −
Angle type k
ф
n ф
0
C
i
-C
i
-C
i
-C
i
15.0 2.0 180.0
C
3
-C
8
-N
2
-H
10
15.0 2.0 180.0
Nonbonded parameters (repulsion function)
c
:
r
nb
V Ce
α −
=
Atom type C α
O
11
-H
10
4000 4.0
C
12
-O
j=11,12
19999 3.8
Nonbonded parameters (van der Waals)
d
:
( ) ( )
12 6
* * *
2
nb
V r r r r ε
= −
Atom type r
*
ε
*
O
11
-O
13
3.0 0.08
H
i
-H
i
2.5 0.01
O
13
-H
10
2.7 0.14
Other EVB parameters:
Off diagonal element
f
:
0
12 12
() rr
A e H
µ −−
= Gas phase shift:
21
αα α ∆= −
A
12
µ r
0
α
1
α
2
H
12
45.5 0.0 2.5
∆α 49.0 0.0
a
Energies are in kcal/mol, distances in Å and angles in degrees.
b
X(NO
2
) designates the atom X of the NO
2
group; H(C
i
) designates the hydrogen atom
bound to C
i
.
c
Nonbonded interactions, where only the repulsion term is considered for atoms, which
are bonded in one of the VB structures.
d
Nonbonded interactions for atoms, which are never bonded in any of the VB structures.
f
r
0
is the distance between the donor and acceptor.
14
Simulation Protocol
The free energy surface of the reference solution reaction was estimated by ab
initio calculations, in the same way described in our previous works (e.g., ref.
38
), and the
resulting surface is shown in Figure 1.2. The ab initio effective charges of the RS, TS and
PS are given in Figure 1.3. In converting the ab initio results to the corresponding EVB
we have to take into account the complex nature of the reaction with a concerted proton
transfer and C-N bond breaking. Usually this requires three diabatic states, but for the
present purpose we decided to use two states with modified charges that reproduce the ab
initio TS charges. The EVB calculations were carried by MOLARIS simulation
program
48
using the ENZYMIX force field. The EVB activation barriers were calculated
at the configurations selected by the same free energy perturbation umbrella sampling
(FEP/US) approach used in all our studies. The theoretical details of the EVB method are
described in Chapter 2 (“Methodology: Empirical Valence Bond Approach”). The
simulation systems were solvated by the surface constrained all atom solvent (SCAAS)
model
48
using a water sphere of 18 Å radius centered on the substrate and surrounded by
2 Å grid of Langevin dipoles and then by a bulk solvent, while long-range electrostatic
effects were treated by the local reaction field (LRF) method.
48
The EVB region is
consisted of the substrate and the carboxylic group of the glutamic amino acid that serves
as a proton acceptor. Validation studies were done within 22 Å radius of inner sphere.
The FEP mapping was evaluated by 21 frames of 20 ps each for moving along the
reaction coordinate using SCAAS model. All the simulations were done at 300 K with a
time step of 1fs. In order to obtain reliable results the simulations were repeated 20 times
15
with different initial conditions (obtained from arbitrary points in the relaxation
trajectory). The mutant systems were generated from the native enzymes via 100 ps
relaxation runs.
RESULTS
Validation on Existing Constructs
In order to validate our EVB approach we started by trying to reproduce the
observed k
cat
. This was done by calibrating an EVB surface for the reaction in solution
(by forcing it to reproduce the observed activation barrier and calculated exothermicity)
and using the calibrated EVB parameters (Table 1.2) in studies of the reaction in the
protein.
The initial structures of the “native” and mutant proteins were taken from the
corresponding X-ray structures.
43
The reacting substrate was placed in each active site in
a similar orientation to that in the design KE07 model and relaxed. The resulting
orientations (e.g., Figure 1.4) are similar to those in the original design.
16
Figure 1.4. The structure of the native enzyme (PDB: 2RKX) with the docked substrate.
Twenty structures were saved during each relaxation process, and then used to
generate the EVB surface and obtain the activation free energies. The calculated
activation barriers and the corresponding observed values are given in Table 1.1 and
Figure 1.5, and some reaction profiles are shown in Figure 1.6. As seen from the table the
calculated and observed catalytic effects are in good agreement with each other.
17
Figure 1.5. Correlation between the calculated and observed activation barriers.
Figure 1.6. Free energy profiles for the systems studied. The profiles are given in terms
of the mapping parameter rather than the energy gap in order to have a more convenient
and uniform comparison.
18
One of the interesting results of our analysis is the finding that the initial design
(KE07) has not lead to significant rate enhancement above that effect expected from
placing the acid and the base at the same cage (see Table 1.1).
Group Contributions: Initial Screening
In order to obtain an improved rational design it is crucial to be able to explore the
dependence of the TS energy on each residue. Since the electrostatic effect is by far the
most important factor in enzyme catalysis,
106
we started with an attempt to estimate the
electrostatic contributions expected from each residue. This was done by evaluating the
effect of changing the substrate charges (upon moving from the RS to the TS) on the
contribution of each residue. More specifically we assigned (artificially) to each protein
residues a charge of +1 and then calculated the change in the corresponding “group
contribution” when the residual charges (or polarity) of the reacting substrate changes.
The change,
i
G ∆ , is then used to estimate the optimal change in the charges (or polarity)
of the protein residues and thus the optimal mutations.
Our approach is based on the realization that the electrostatic contributions of the
protein residues to the activation barrier can be very roughly approximated by:
61
∆ ∆g
elec
≠
≅ 332 q
j
k
∆Q
i
( )
k
∑
ij
∑
r
ij
ε
ij
(1.2)
where
k
j
q are the residual charges of the protein atom, j runs over the protein residues, k
runs over the atoms of the j
th
residue, and i over the substrate atoms,
ij
ε is the dielectric
19
constant for the specific interaction. The
s
Q ∆ are the changes in the substrate charges
upon going from the RS to the TS (this charges are given in Figure 1.3).
This equation can be formally simplified by grouping together the atomic charges
of each residue and written:
332
i ij elec j
ij
ij
g q Q r ε
≠
∆ ∆ ≅ ∆
∑
(1.3)
where
j
q is the effective “charge” of the j
th
residue (which may represent dipoles in cases
of polar residues). Now we can explore the trend of Eq. 1.3 for any possible mutations if
we artificially assign to all protein residues a charge of +1.0 and then ask what charge
change will lead to the most negative
elec
g
≠
∆ ∆
. This lead to
∆q
j ( )
opt
≅ − α ∂ ∆ ∆g
elec
≠
∂q
j
= − α ∆Q
i
r
ij
ε
ij
i
∑
(1.4)
where now
j
q ∆ are proportional to the electrostatic group contribution, when all the
protein groups are positively charged. In this case if
j
q ∆ is positive the given residue
should be either negative or have its dipole orientation back from the potential due to the
s
Q ∆ .
We can use Eq. 1.2 for a estimating the most effective charge changes in each site
and this was done in examining the trend with two charge sets (set B that used the ab
initio RS and PS as the corresponding EVB charges for the RS and PS, and a much more
rigorous set (set A), that forced the EVB TS charges to reproduce the ab initio TS
charges. The resulting group contribution for set A is given in Figure 1.7 and Table 1.3.
The calculations with the qualitative set B suggested the intuitively appealing idea that
20
the charge of O1 changes significantly upon moving to the TS (this trend is observed in
Figure 1.3). The corresponding group contributions suggested that mutations at regions 6-
12 and 47-51 may be beneficial to catalysis. Although it is very tempting to follow this
hint, it appears that the TS in the Kemp reaction does not follow a simple rule (being the
average of the RS and PS charges) and apparently the charge change on O1 is
unexpectedly small.
Figure 1.7. The group contributions (in kcal/mol) that reflect the interactions between the
protein residues with the charge change upon moving from the RS to the TS (
s
Q ∆ in Eq.
1.2). The calculations are done with
ε
ij
=4 for all residues. The largest negative
contributions provide a rough guide for the optimal sites for effective mutations that will
enhance the catalytic effect.
21
Table 1.3. Group contributions obtained using the structure of the wild type enzyme for
charge set A and set B.
a
Group # Name
Distance to the
region I center
Set A:
∆G
RS TS
Set B:
∆G
RS TS
5 ARG 15.50 -0.16 -0.28
8 ALA 10.31 -0.33 -0.96
9 ALA 6.57 -1.08 -1.93
10 LEU 9.46 -0.56 -1.10
11 ILE 6.91 -0.82 -1.00
12 VAL 9.90 -0.22 -0.60
15 GLY 15.36 -0.20 -0.27
17 VAL 13.05 -0.25 -0.43
18 VAL 11.76 -0.24 -0.35
19 LYS 14.43 -0.20 -0.26
20 GLY 16.30 -0.25 -0.32
32 PRO 14.36 -0.21 -0.46
35 LEU 15.73 -0.18 -0.40
36 GLY 15.37 -0.19 -0.47
39 TYR 15.25 -0.17 -0.48
44 ILE 15.39 -0.18 -0.44
47 LEU 11.79 -0.22 -0.83
48 SER 6.74 -0.41 -3.13
49 PHE 9.12 -0.35 -1.37
50 TRP 3.98 -0.18 -0.17
52 ILE 9.49 -0.29 -0.09
58 LYS 13.58 0.165 0.24
62 MET 12.89 0.15 0.12
78 THR 9.13 0.160 -0.87
80 GLY 6.53 0.33 0.77
81 GLY 7.60 1.37 3.11
82 GLY 10.60 0.64 1.88
83 ILE 11.56 0.74 0.41
84 HIE 14.46 0.59 0.61
88 THR 16.75 0.26 0.28
89 ALA 15.67 0.32 0.20
92 LEU 14.75 0.20 0.08
98 ASP 16.53 0.16 -0.11
99 LYS 12.60 0.28 -0.14
100 VAL 11.81 0.51 0.34
22
Table 1.3 (Continued). Group contributions obtained using the structure of the wild
type enzyme for charge set A and set B.
a
Group # Name
Distance to the
region I center
Set A:
∆G
RS TS
Set B:
∆G
RS TS
102 ILE 11.11 0.86 1.42
103 ASN 8.82 0.75 2.42
104 THR 11.73 0.19 1.16
105 ALA 14.70 0.53 0.73
106 ALA 14.10 0.40 0.89
107 VAL 14.76 0.28 0.62
113 ILE 16.76 0.43 0.42
124 ALA 15.59 0.33 0.11
125 VAL 14.59 0.40 0.40
126 VAL 10.16 0.83 0.06
127 VAL 11.15 0.74 0.82
128 TYR 5.73 0.51 1.85
129 ILE 11.22 0.30 0.69
130 ALA 9.70 -0.18 0.69
132 LYS 15.96 -0.17 0.17
142 THR 12.58 0.22 0.81
143 TYR 13.92 0.17 0.70
144 SER 10.67 0.68 0.67
145 GLY 13.11 0.23 0.44
146 LYS 16.38 0.22 0.33
156 TRP 16.56 0.25 0.39
160 VAL 16.27 0.24 0.34
165 ALA 16.50 0.23 0.23
166 GLY 16.12 0.26 0.10
167 GLU 13.02 0.41 -0.15
168 ILE 12.87 0.28 0.14
169 VAL 7.54 0.42 -0.62
172 SER 11.74 -0.20 0.02
173 ILE 11.51 -0.24 0.19
174 ASP 15.34 -0.21 0.05
175 ARG 15.54 -0.19 -0.07
176 LEU 9.11 -0.49 -0.09
177 GLY 12.36 -0.27 -0.23
178 THR 15.45 -0.19 -0.12
179 LYS 15.72 -0.36 -0.22
181 GLY 16.51 -0.19 -0.15
23
Table 1.3 (Continued). Group contributions obtained using the structure of the wild
type enzyme for charge set A and set B.
a
Group # Name
Distance to the
region I center
Set A:
∆G
RS TS
Set B:
∆G
RS TS
182 TYR 13.55 -0.23 -0.27
183 ASP 16.31 -0.12 -0.04
201 HIS 6.45 -0.87 -0.73
202 GLY 9.06 -0.42 -0.32
203 GLY 11.36 -0.38 -0.25
204 ALA 14.11 -0.31 -0.37
210 PHE 16.09 -0.15 -0.29
223 ALA 9.87 -0.46 -0.80
224 ASN 8.59 -0.60 -0.97
225 SER 12.07 -0.34 -0.53
226 VAL 14.41 -0.31 -0.41
227 PHE 13.79 -0.28 -0.47
228 HIS 13.20 -0.33 -0.54
229 PHE 16.76 -0.21 -0.29
a
Energies are in kcal/mol and distances in Å. The group contributions are
elevated by assigning charge +1 to the given residue. As clarified in the
text the estimates provided by this approach are not quantitative.
It should be noted that the group contribution can only provide very rough hints
since it does not reflect the reorganization energy consistently. In fact, obtaining more
quantitative group contributions is possible (especially for the actual given sequence by
using the linear response approximation that captures the reorganization effect).
105
Also it is instructive to examine the potential due to the
i
Q ∆ of each of the atoms
of the reacting system, and the corresponding group contributions. These features, which
are depicted in Figure 1.8 suggest that the TS may be stabilized by placing a positive
charge or a hydrogen bond near the regions were the negative charge is developing
during the proton transfer process.
24
Figure 1.8. The electrostatic potential from the change of the substrate charges upon
moving from the RS to the TS (red and blue for negative and positive potential
respectively) and group contributions (in kcal/mol) of key residues, where the
contributions are defined for the case where each residue is positively charged.
There is also an option to stabilize the positive charge developing on Glu101. This
can be also done by destabilizing the reactant charge. Unfortunately, the group
contributions are quite small in the case of the Kemp elimination reaction and do not
provide a clear hint for a mutation with a large effect.
Analysis of Suggested Mutations
Trying to exploit the hints from the calculated group contributions we examined
first the intuitive hints of the qualitative TS charge set (Set B) performing EVB
calculations, with set B, of mutations that would change residue 9 to polar residues and
provide stabilization of the charges that might develop on O1. The initial calculated
25
increase in TS stabilization for the mutations is given in bracket in Table 1.1. As seen
from the table the A9S mutant (with the qualitative set B) is predicted to increase the
catalytic effect.
Instructively, the effect of the A9S and A9N mutations was explored recently
(Olga Khersonsky
and Dan Tawfik personal communication) and these mutations did not
lead to any catalytic improvement (see Table 1.1), in an apparent contrast to the trend
predicted by set B charges and to simple qualitative considerations. Thus we looked for
pitfalls, starting with the possibility of having a water molecule in the space between Ala
9 and O1. However, our calculations (with set A) produced small and inconclusive effect,
and the use of the set B could not rationalize the lack of detectable effect of polar groups
on the supposedly developing TS charge of O1. Thus we considered the fact that the TS
charges cannot be represented by the intuitive set and we moved to the more realistic
(and non intuitive) set A. Now we obtained the results listed in Table 1.1 (without
brackets) where almost all the predicted effect of polar stabilization of the O1 developing
charge disappeared. This learning process provided an interesting example where
mutational effect forces one to look more carefully at the theoretical model.
Considering the difficulties in obtaining a major effect by improving the active
site polarity, we looked for possible hints from the directed evolution experiments (here
we focused on R6 3/7 F). Our starting point was the fact that we succeed, for the first
time, to reproduced computationally the observed effect and thus have the tools to
examine its origin. It was found that the mutant has an early TS (as shown in Figure 1.6),
due to both small destabilization of the reactant state and stabilization of the product
26
state. Apparently, while the water molecules play an important role in the wt, they
provide less solvation of the reactant state charges of the mutant (see Figure 1.9).
However, even the directed evolution did not find a major way of TS stabilization.
Figure 1.9. Showing the water penetration near Glu101 in the wt (A) and the R6 3/7 F
mutant (B). As seen from the figure there are fewer water molecules near Glu101 in the
mutant.
Rate Increase in Directed Evolution Mutants
The Kemp elimination reaction is formally similar to other reactions where a
negative charge is being transfer from the ionized acid to the substrate. A good example
is the reaction of haloalkan dehalogenase (DhlA),
67
where we have a negatively charged
Asp group in the RS [(- ) (0)] configuration transfers its negative charge to a leaving
group (Cl) and form a [(-1/2) (-1/2)] configuration in the TS (here the first and second
bracket correspond to the charge on Asp and on the Cl atom). In this case we have shown
by quantitative LRA analysis
67
that, while in solution the system lose a very large
27
solvation energy upon moving to the TS, in the enzyme the loss of solvation energy is
much smaller, resulting (see Table 1.4 (II) and Figure 1.10 (B)) in a major TS
stabilization (relative to the solution reaction). This is accomplished by the
preorganization of two polar hydrogen bonds (Trp125 and Trp175) that stabilizes the -1/2
charge of the living Cl atom. Unfortunately, in the case of the Kemp elimination, the
charge that is being transferred to the substrate is much more delocalized and thus
making possible the preorganization effects less effective (see also ref.
29
). Here (see
Table 1.4 (I) and Figure 1.10 (A)) it seems that the solution found by the directed
evolution involves reducing the solvation of the reactant state by water molecules. This
does not involve significant reactant state destabilization since K
M
increases by only
factor of 3 (which means that the binding energy is similar for the native and mutant
enzyme). However, the destabilization of the ionized Glu101 chosen here can also be
consider as a reactant state destabilization (RSD) effect, but it is clearly not the standard
avenue chosen by DhlA and other naturally evolved enzymes (see systematic analysis in
Table 1.4 and Figure 1.10). Perhaps, the problem is the ability of the water molecules to
approach the proton donor (see Figure 1.9). Here a change of the enzyme scaffold to
block entrance of water molecules near Glu101 can be useful. Perhaps replacing Asn103
by a larger residue will be effective in improving the wt enzyme.
28
Table 1.4. Comparing the LRA contributions of the solvation free energies in the Kemp
eliminase and DhlA systems:
a
(I) Kemp eliminase:
water reaction native reaction mutant reaction
RS TS RS TS RS TS
0
w
Q
Q
U U − -176.0 -49.1 -36.6 -10.2 -16.3 -10.8
0
0
w
Q
U U − 2.8 2.7 5.5 1.0 8.7 4.7
0
p
Q
Q
U U − -1.5 -0.6 -64.0 -31.9 -70.7 -32.1
0
0
p
Q
U U − -1.1 -0.6 -32.2 -16.4 -37.7 -22.6
solv
G ∆
-87.9 -23.8 -63.6 -28.7 -58.0 -30.4
(II) DhlA:
water reference reaction enzyme reaction
RS TS RS TS
0
w p
Q
Q
U U
+
− -158.5 -114.2 -129.7 -102.4
0
0
w p
Q
U U
+
− 2.9 4.4 -62 -59.1
solv
G ∆
-77.8
-54.9 -95.6
-80.8
a
All energies are given in kcal/mol. The results for DhlA are taken from ref.
67
. As can be
seen from our results, in DhlA the RS is better solvated better in the enzyme than in the
water reference reaction, while the TS is better solvated in the enzyme than in the water
reaction. On the other hand in the Kemp eliminase the RS is solvated more in water than
in the enzyme. The notation for the LRA contributions are explained in details ref.
67
and
they represent the difference in the energy of the charge and non polar forms of the
indicated system average over trajectories with the indicated charge state.
29
Figure 1.10. Schematic representation of the reactant state destabilization effect in the
kemp eliminase systems (wt and directed evolution mutant R6 3/7F (A)) and the
transition state stabilization in the DhlA (B).
Linear Response Approximation
The EVB surfaces were used in evaluating the free energy profiles for the studied
systems, and the corresponding results are summarized in Table 1.1 and in Figure 1.6.
Since we are dealing with relatively small changes in the activation barrier we put special
emphasize on trying to obtain stable results. This included extensive averaging and
eventually running the calculations with uncharged Lys222, as the observed effect of this
residue is very small,
36, 79
and yet it leads to instability in the calculations (see below).
We also explored the origin of the overall electrostatic contribution to the
activation barrier by evaluating the LRA contributions to the charging of the RS and the
TS. This analysis was compared to the corresponding analysis of the electrostatic
30
contribution in DhlA.
67
The results, which are given in Table 1.4 and Figure 1.10, show
that in the present case the directed evolution cycles led to destabilization of the RS,
which is much less effective than the mod used in DhlA and other enzymes, where the
enzyme mainly solvated the TS.
One of the problems in the calculations has been associated with the fact that
some simulation runs give an excess catalytic effects by generating configurations with
unstable ground state energy (see discussion in ref.
76
). Trying to reduce this instability we
exploited the same linear response approximation (LRA) treatment used in our study of
bacteriorhodopsin.
12
This treatment estimates the energy change of the entire protein
system, upon moving between two configurations in the reactant state, by evaluating the
EVB energy gaps in these two configurations, using a treatment similar to that used in
ref.
12
.
The relevant free energy is then obtained by:
( )
1 1 1 2
1 1 1 2 2 1 2 1
, ,
1
( ) ( )
2
r r
G r r
ε ε
ε ε ε ε ε ε
∆ → = − − −
(1.5)
where
1
,
i
r ε
indicates an average on trajectories that are running on ε
1
, where the
system is held by a weak constraint near the coordinates of the indicated conformations
(r
i
). This treatment will allow us to explore the change in the protein internal free energy
upon conformational change.
The LRA treatment appeared to significantly reduce the error associated with
having different protein structures and to provide a way to obtain more stable results from
the calculated
cat
g
≠
∆
. In the present case we reduced the difference less than 2 kcal/mol.
31
An interesting related point explored here has been the change in the calculated barriers
obtained for very different structures. For example, we examined the difference in the
catalytic effect obtained for the KE07 protein while starting from the actual structure
(PDB code: 2RKX.pdb) and the one obtained starting from the original template (PDB
code: 1THF.pdb). It appear that the calculation that start from the template structure gave
major (and incorrect) catalytic effect with a barrier of about 11-13 kcal/mol and that this
barrier changed to a barrier of about 24 kcal/mol, once the LRA correction has been
applied. The initial excess catalytic effect obtained with the poor initial structural model
of KE07 is apparently due to having an ES structure were the bound ligand is not in its
most stable configuration.
Another interesting issue that was explored in this work was the possibility that
the substrate has orientations, which are different than the one found by the initial
docking. This was done by rotating the substrate to 90 and 180 around the NO
2
–Glu101
vector, relaxing the substrate and calculating the EVB profile at each orientation. It was
found that the new configurations lead to a complete loss of the catalytic effect (more
than 5 kcal/mol higher barrier). This reduces the likelihood that the substrate is in some
alternative configuration.
32
Prospect for Computer-Aided Directed Evolution
Our previous studies (e.g., refs
92, 100
) have indicated that the effect of ionized
residues at a distance of more than 6 Å can be approximated by using a relatively large
effective dielectric constant for charge–charge interaction. In other words the catalytic
effect can be estimated by Eq. 1.4 considering only ionized residues. This gives
( ) ( )
i ij ij j
opt
i
q Q r r α ε ∆ ≅ − ∆
∑
(1.5)
where
ij
ε
is replaced by
( )
eff ij
r ε
, which is a distance dependent dielectric constant
whose value is about 30 at medium distances. However, for interaction with the
transition state residual charge one should use a smaller
ε
eff
. Here one would only
mutate the residue with the largest
( ) j
opt
q ∆ , relax the protein and then repeat the
calculations, looking for the next mutation. The above expression can be extended to the
use of the group contribution also for polar residues, but in this case the predicted effect
of each mutation (and thus formally
ε
ij
) should be deduced from actual EVB
calculations. The prediction of the above approximation for ionized residues is expected
to be reasonable for residues at more than 6 Å from the substrate, as has been shown in
earlier studies.
106
In the present case we still do not have a direct validation of mutations
that change a single ionized group (rather than multiple mutations). However, directed
evolution rounds (Olga Khersonsky
and Dan Tawfik personal communication) in the
refinement of the wt, identified mutations that are likely to be increase the activity. Most
of these mutations are consistent with the predictions of Table 1.5 and the trend of Figure
33
1.11 as well as Figure 1.12. More specifically, according to our prediction, one should
consider the region with a negative potential from the substrate
s
Q ∆ (the region along the
arrow that extends from the blue to red densities in Figure 1.11) and mutate negatively
charged residues to neutral residues in order to increase the TS stabilization. The same
will be true for taking neutral residues and mutating them to positively charged residues.
On the other hand, taking positively charged residues and mutating them to uncharged
residues will destabilize the TS. The opposite effect is predicted to occur for residues in
the regions with positive potential from the
s
Q ∆ . The prediction of the effect of ionized
residues is expected to be robust for residues at intermediate distances from the substrate.
However, in the case of residues in the region perpendicular to the field from the
s
Q ∆ ,
we expect the prediction from the group contribution to be less reliable, as small change
in the residue position can move it from the region of a positive potential to the region of
a negative one. Overall it seems that the experimental screening is consistent with our
specific prediction (see Table 1.5). However, only actual evaluation of k
cat
will provide a
validation of these predictions. Furthermore, unfortunately the dipole generated by the
s
Q ∆ is relatively small and is not predicted to give substantial interaction with distanced
ionized residues. Perhaps using the KE70 mutants where the proton acceptor is histamine
can lead to a larger dipole effect.
34
Table 1.5. Predicted catalytic contributions from some ionized residues.
site of
mutation
possible
mutation
∆G
RS TS
,
kcal/mol
factor of rate
acceleration /
reduction
observed change
in activity
a
ARG16 neutral -0.12 1.2 (acceleration) R16H (5, 7 rounds)
ASP31 neutral -0.13 1.2 (reduction) -
LYS37 neutral -0.09 1.2 (acceleration) K37L (1round)
ASN109
negatively
charged
0.25 1.5 (acceleration) N109D (7-10 rounds)
LYS146
negatively
charged
0.22 1.5 (acceleration) K146E (2-10 rounds)
LYS162
negatively
charged
0.11 1.2 (acceleration) K162E (10 round)
ARG163 neutral 0.17 1.3 (reduction) -
a
Predicted positive or negative means that generating the corresponding charge
change at the given site will lead to small rate acceleration. The observed column
designates possible increase in activity found in the sequenced KE07 variants
throughout the directed evolution process, and probably the mutations are
beneficial or neutral (Olga Khersonsky
and Dan Tawfik personal
communication). In most cases the direction of the predictions agrees with the
observed activity, but as clarified in the text the experiments do not correspond to
actual measurements of k
cat
.
Figure 1.11. The regions that provide potential sites for mutations that involve charge
change.
35
Figure 1.12. Group contributions (using charge set A) from residues that provide optimal
sites for changing the charge of different residues. In this case the group contribution is
given for q=+1 rather than the charge of the specific group.
A further constraint on the design can be imposed by introducing mutations that
compensate for the loss of stability during the refinement process (see discussion in
ref.
43
). This can be done by our recent focused dielectric approach,
58
which gives an
approximated expression:
332
j fold h ih focus
ih
G q q r ε ∆ ≅
∑
(1.6)
where
focus
ε
is the optimal dielectric constant
96
and the
s
q reflects the given pH.
We can now optimize
j
q under the constraint of large folding energy, and this
lead to
( ) j j elec j fold
qq q Gg α
≠
− ∂ ∂ ∂∆∆ ∂ ∆− (1.7)
The use of the combined TS stabilization and stability constraints is expected to
provide a quite effective way for guiding the mutations of distanced ionized residues.
36
Also it would be interesting to use this approach in an iterative way as a model for
directed evolution.
While the effects of distanced ionized residues are early estimated by our
macroscopic approximation, the effects of nearby ionized residues are harder to obtain
reliably by microscopic simulations since the simulations might not provide the proper
dielectric compensation. This situation appears to be particularly serious with Lys222 due
to the flexibility of this residue. Fortunately, it was found experimentally that this residue
has extremely small effect and in view of the instability of the calculations with the
ionized Lys222 we perform most of the calculations while artificially using non ionized
Lys222.
DISCUSSION
In considering the catalytic factor in the kemp eliminase reaction it is important to
consider proposals discussed in previous design efforts. Here we would like to clarify
that TS stabilization by delocalization effects
79
is unlikely to provide a significant
catalytic factor since the same effect exists in the reference solution reaction. Thus, the
possible effect of π-stacking (which was considered in ref.
79
) is not expected to lead to a
significant rate enhancement above the simple effect of having nearby induced dipoles,
which is much less effective than having preorganized polar environment. In fact, as
realized by Hilvert and coworkers
17
the corresponding dispersion or more precisely
inductive effect is small.
37
The attemptes to refine the electrostatic environment near O1 can be considered by
some as an extension of the idea of placing an acid near O1.
18, 43
However, the idea that
such a base is needed is reminiscence from physical organic chemistry concepts that
capture some of the electrostatic effect, but end up looking at factors that do not play
major role in enzyme catalysis. That is, the issue is not a charge transfer or a covalent
bond to the substrate as might be concluded from gas phase calculations, since we are not
dealing here with a bifunctional reaction with two steps (unless we have a new
chemistry), but with a pure electrostatic stabilization. Invoking the need for a proton
acceptor has similarity to the well-known LBHB effect, whose anticatalitic effect is
discussed elsewhere.
106
It is true that the attempts to focus on the base lead in some cases
to what we consider the correct direction, like placing Lys or His near O1,
43
but this has
little to do with the pK
a
of Lys, as one would assume from the traditional picture. It
actually reflects the electrostatic free energy of interactions, which can also come from
distant ionized residues.
12
There is no doubt that more extensive exploration of the effects of mutations far
from the active site might be beneficial. In fact a useful strategy for this may be offered
by our approach of using a simplified potential as a reference for the explicit
calculations.
76
However, rational improvements of the polarity in the first solvation shell
are clearly more effective. Nevertheless, even this challenge has not been yet overcome.
For example, it might be hard to design an oxyanion hole without optimizing the relevant
scaffold. This is an issue that might require changing residues further from the active site,
38
or in the case of a complete design, it should provide a major constraint on the overall
folding.
Obviously our strategy can be and will be improved in the future, but the main
point is the ability to consider enzyme design by using energy based rational way. In this
respect it is useful to point out a special nontrivial advantage of our CAED approach.
That is, in the case of attempts to improve a specific enzyme, there is an acquired
advantage in the accumulated experience of modeling different mutants (as was the case
here in the study of the evolved mutants). Even significant errors in predicting the first set
of mutants can lead to improvement of the model (e.g., in selecting enzyme specific
dielectric to compensate for convergence difficulties in the treatment of ionized residues)
and to a better understanding of the specific enzymes and thus to a better prediction
ability in further design rounds. Such an advantage is not expected from computational
design approaches that are not base on capturing the physics of the catalytic process.
CONCLUSIONS
The rational design of enzymes with native activity requires the ability to predict
the proper TS stabilization, and this involves the challenge of predicting the overall
preorganization effect. Attempts to estimate the catalytic effect by using gas phase
models or even by looking at the electrostatic interaction between different residues and
the TS are unlikely to reproduce the correct catalytic effect since it is impossible to assess
39
the preorganization effect without including the protein and its reorganization in the
simulations.
The challenge of evaluating the catalytic power of a given mutant is not different
than that addressed in our early 1986 study of computer aided mutations.
99
At this stage it
seems to us that the potential of the EVB has been demonstrated in well defined cases
(e.g., refs
76, 87
), where it was found to reproduce the large effects of mutations that
destroy the catalytic effect of evolved enzymes. Thus our main current challenge is to use
this approach in improving non efficient enzymes.
Attempting the advance CAED we used first the EVB method to the study of the
nature of the observed catalytic effect in the wt and mutated enzymes. Obviously this
major requirement cannot be accomplished by gas phase models, but more significantly it
has not been accomplished by alternative MO-QM/MM studies,
3
which yielded large
discrepancies between the calculated and observed catalytic effects. Our ability to capture
the observed trend established the effectiveness of the EVB in reliable enzyme modeling,
even when the catalytic effects are small.
The next challenge is an understanding the origin of the catalytic effect and hopefully
helping in increasing this effect. In this respect it is useful to reemphasize that the initial
constructs of the Kemp eliminase benefited mainly from the catalytic effect associated
with placing the donor an acceptor at a closed proximity in the same cage, without the
optimization of environmental preorganization effects. This cage effect has been
understood almost quantitatively by the key early workers in the field (e.g., refs
77, 84
),
since the 70’s, and it is a well known factor that has little to do with the secret of
40
enzymatic reactions.
106
As to other factors that were discussed in previous design efforts,
we would like to clarify that TS stabilization by delocalization effects
79
is unlikely to
provide a significant catalytic factor since the same effect exists in the reference solution
reaction.
Using the hints from the group contribution obtained with the qualitative initial
EVB charge set, we attempted to refine the electrostatic environment near O1. This
attempt can be considered by some as an extension of the idea of placing an acid near
O1,
18, 43
but having an acid in this site is a reminiscence of the so called low barrier
hydrogen bond (LBHB) effect, whose anticatalytic effect is discussed elsewhere,
106
or to
the assumption that enzyme work by covalent catalysis, whose problems have been
discussed in ref.
106
. Furthermore, the support for this idea comes mainly from gas phase
calculations,
63
which tend to drastically overestimate LBHB effects. Thus, unless we deal
with a new reaction path, with a proton transfer from the acid, we will not benefit from
having an acid instead of a preorganized hydrogen bond. Thus we focused on the
electrostatic stabilization of the TS by the nearby polar groups.
Surprisingly, focusing on the intuitive idea of increasing the polarity of residue 9
and the stabilization of the developing charge on O1 has not generated rate acceleration.
Here we were forced by the results of the mutation experiments and by examining the ab
initio TS charges to move to a more consistent EVB charges that, in fact, consistently
accounted for the absence of observed mutational effects.
In analyzing the above findings we noted that the Kemp reaction is not the best
system for demonstrating the role of optimizing the active site environment, since the
41
charge that is being transferred to the substrate upon proton transfer is delocalized in a
way that does not lead to large group contributions and to simple hints about clear sites
for possible effective TS stabilization. This type of problem could have been anticipated
in fact in view of the study of ref.
29
.
Furthermore, as we found out in this work, even the current directed evolution
cycles have not exploited any major electrostatic stabilization of the TS, basically using
the change in solvation by water plus small changes in “solvation” by the protein polar
groups to provide small RS destabilization and small TS stabilization. At present is seems
that the lack of clear sites with large change of charge distribution in the RS-TS transition
(e.g., oxyanion type site) makes it hard for the enzyme to generate major catalytic effects.
However, enzymes that have evolved during a long time found ways to exploit
electrostatic effects, even when the charge changes are not localized; the best example is
vitamin B12 enzymes, where the enzyme found a remarkable way to exploit electrostatic
TS stabilization with no change in the charge distribution of the bond that is being broken
(see ref.
86
). Overall, it is not clear if directed evolution explores all options and it is clear
the group contribution approach cannot be very effective with the TS charge distribution
of our system. At any rate, while addressing the challenge of improving poor enzymes it
should also be useful to take optimal and highly catalytic enzymes and then to predict
which mutations will have the largest effect on reducing the catalysis.
Regardless of the small effect of distant residues we also see major potential in
simple refinement of long-range electrostatic interactions. This strategy can be
42
particularly instructive once it is connected with directed evolution experiments, and it
can be used in a systematic refinement with small incremental increase in catalysis.
At present there are still many who believe in dynamical and other esoteric effects
that are presumed to contribute to catalysis (for review see ref.
40
). In many cases it is
clearly suggested that improving such effects will be crucial for optimal enzyme design
(e.g., ref.
64
). However, it seems to us that by far the main factor that actually contributes
to catalysis is the preorganization effect and thus we feel that there is no rational way for
improving dynamics and related effects as these factors do not contribute to catalysis.
40
We also suspect that factors such as π-stacking stabilization are unlikely to be effective in
providing significant TS stabilization, beyond the simple effect of induced dipoles. Thus
the focus should be placed on the refining of the polar reorganization, as we attempted to
do in this work.
Apparently the directed evolution has tried the “trivial” solution of disolvating the
reactant state. Although, this type of catalysis is not used by naturally evolving enzymes
it might assumed by some that it provides a support for the long held disolvation
hypothesis (see discussion in ref.
106
). However, the enormous disolvation catalytic effect
of the Kemp eliminase reaction in say CH
3
CN
29
is not even mildly achieved in the Kemp
eliminase. Here it is important to realize that the disolvation hypothesis reflects an
incorrect thermodynamic cycle (see ref.
106
), where one missed the free energy of moving
the reactant state between water to the less polar environment. Thus it is actually more
interesting to ask why the directed evolution used such an uneffective strategy, than to
incorrectly assume that we have here a support to the disolvation proposal.
43
Finally, we would like to reclarify that we have demonstrated the ability to
reproduce quantitatively the absolute catalytic effects and mutational effects in naturally
evolved enzymes
106
and in designer enzymes. This clearly indicates that the catalytic
power of enzyme is not due to elusive effects (e.g., conformational dynamics), but to
what is by now well understood; the electrostatic preorganization. Thus our difficulties in
improving designer enzymes are not due to overlooking misunderstood factors, but to the
difficulties in optimizing well understood factors. In other words, a method that
reproduces the catalytic rate in known systems should be able to do so in any unknown
sequence and the challenge is to find the unknown optimal sequence. At any rate, it
seems to us that the present study provided a useful analysis of the reasons for the less
that perfect performance of current designer enzymes.
44
CHAPTER TWO:
METHODOLOGY: EMPIRICAL VALENCE BOND APPROACH
EMPIRICAL VALENCE BOND (EVB) THEORY
To simulate chemical reactions in proteins, it is essential to have a reasonable
potential energy surface. The EVB approach provides a very powerful tool for calculation
of the relevant potential energy surface in an analytical form. The EVB provides the
detailed molecular basis of the observed catalytic effect and reproduces the reliable
correlations between the observed and calculated activation free energies with a high
level of accuracy. The EVB method has been proven and successfully used on multiple
systems since its discovery in 1980.
103, 107
The EVB method has been described extensively elsewhere (e.g., refs
15, 41, 103
) and
we only give a few key points here.
The EVB method is a QM/MM method that describes reactions by mixing
resonance states (or more precisely diabatic states) that correspond to classical valence-
bond (VB) structures, which represent the reactant intermediate (or intermediates) and
product states. The EVB potential surface is schematically described in Figure 2.1. The
potential energies of these diabatic states are represented by classical MM force fields of
the form:
( ) ( ) ( )
intra
R,Q R,Q,r,q r,q
ii i
ii i gas ss Ss
H UU U εα = =+ + + (2.1)
45
Here, R and Q represent the atomic coordinates and charges of the diabatic states, and r
and q are those of the surrounding protein and solvent.
i
gas
α is the gas-phase energy of
the ith diabatic state (where all the fragments are taken to be at infinity), U
intra
(R,Q) is the
intramolecular potential of the solute system (relative to its minimum), U
Ss
(R,Q,r,q)
represents the interaction between the solute (S) atoms and the surrounding (s) solvent
and protein atoms. U
ss
(r,q) represents the potential energy of the protein/solvent system
(“ss” designates surrounding-surrounding).
i
ε is given by Eq. 2.1 from the diagonal
elements of the EVB Hamiltonian (H
EVB
). The off-diagonal elements of this Hamiltonian,
H
ij
, are assumed to be constant or they can be represented by a simple function such as an
exponential function of the distances between the reacting atoms. These H
ij
elements are
assumed to be the same in the gas phase, in solutions and in the proteins. The adiabatic
ground-state energy E
g
and the corresponding eigenvector C
g
are obtained by solving the
secular equation:
EVB g g g
H C E C = (2.2)
46
Figure 2.1. A description of the relationship between the diabatic free energies ε
1
and ε
2
,
the adiabatic ground state free energy E
g
and the off-diagonal element H
12
. Δg
≠
and ΔG
0
designate the activation free energy and the reaction free energy respectively.
The EVB treatment provides a natural picture of intersecting electronic states,
which is useful for exploring environmental effects on chemical reactions in condensed
phases.
33
The ground-state charge distribution of the reacting species (“solute”) polarizes
the surroundings (“solvent”), and the charges of each resonance structure of the solute
then interacts with the polarized solvent.
33
This coupling enables the EVB model to
capture the effect of the solvent on the quantum mechanical mixing of the different states
of the solute. For example, in cases where ionic and covalent states are describing the
solute, the solvent stabilizes the ionic state to a greater extent, and the resulting ground
state has more ionic character and more solvation energy.
The simplicity of the EVB formulation makes it easy to obtain its analytical
derivatives (using the Hellmann-Feynman theorem for Eq. 2.2) and, thus, to sample the
47
EVB energy surface by MD simulations. Running such MD trajectories on the EVB
surface of the reactant state can provide the free-energy function Δg that is needed to
calculate the activation energy Δg
‡
. However, since trajectories on the reactant surface
will reach the transition state only rarely, it is usually necessary to run a series of
trajectories on potential surfaces that gradually drive the system from the reactant to the
product state.
33
The EVB approach accomplishes this by changing the system
adiabatically from one diabatic state to another. In the simple case of two diabatic states,
this “mapping” potential, ε
m
, can be written as a linear combination of the reactant and
product potentials, ε
1
and ε
2
:
12
()
m mm
1 ε θ ε θε =−+ ()
m
01 θ ≤≤ (2.3)
When θ
m
is changed from 0 to 1 in n+1 fixed increments (θ
m
= 0/n, 1/n, 2/n, ..., n/n),
potentials with one or more of the intermediate values of θ
m
will force the system to
fluctuate near the TS.
The free energy ΔG
m
associated with changing θ
m
from 0 to m/n is evaluated by
the FEP procedure described elsewhere. The free-energy functional that corresponds to
the adiabatic ground-state surface E
g
(in Eq. 2.2) is then obtained by the FEP-umbrella
sampling (FEP/US) method,
103
which can be written as
( ) ( ) ( ) ( )
' 1'
( ) ln exp
m
m g m
gx G x x E x x
ε
βδ β ε
−
∆ = ∆− − − −
(2.4)
where ε
m
is the mapping potential that keeps x in the region of x’. If the changes in ε
m
are
sufficiently gradual, the free-energy functional Δg(x’) obtained with several values of m
overlap over a range of x’, and patching together the full set of Δg(x’) gives the complete
free energy curve for the reaction. The FEP/US approach may also be used to obtain the
48
free-energy functional of the isolated diabatic states. For example, the diabatic free
energy Δg
1
of the reactant state can be calculated as
( ) ( ) ( ) ( )
' 1'
11
( ) ln exp
m
mm
gx G x x x x
ε
β δ βε ε
−
∆ = ∆− − − −
(2.5)
The diabatic free-energy profiles of the reactant and product states represent the
microscopic equivalent of the Marcus’ parabolas.
53
The EVB method satisfies some of the main requirements for reliable studies of
enzymatic reactions. Among the obvious advantages of the EVB approach is the
facilitation of proper configurational sampling and converging free energy calculations.
This includes the inherent ability to evaluate nonequilibrium solvation effects.
97
It should
also be noted that the reliability of any method addressing enzyme catalysis lies in its
ability to obtain accurate free energy differences between the enzyme and solution
reaction. This is hard to accomplish by, for example, current ab initio QM/MM methods,
which are often perceived as more accurate even though they suffer from significant
sampling difficulties. The use of semiempirical QM/MM approaches can greatly improve
this, but also these methods require calibration, which is easier to accomplish with the
EVB formulation. The EVB surface is carefully calibrated to the experimental
information for the reference reaction in solution for a given reaction type. And finally,
the EVB benefits from the abovementioned ability to treat the solute-solvent coupling
consistently, and therefore reproduces the correct environmental effect on chemical
reactions.
49
EVB PERFORMANCE
In our works we showed that at present the Empirical Valence Bond method is an
accurate screening method to study the effect of mutations.
24, 25, 76
It allows to screen for
activity not only within the same family of proteins but also to explore the catalytic effect
of different catalytic bases and substrates of series of the proteins (as shown in Figure
2.2). As you can see from the Figure 2.2 our calculated free energy barriers are in a good
agreement with the corresponding observed ones.
Using the EVB approach we can screen significant number of mutations candidates in
a few days and make reasonable suggestion for further experimental studies. As you can
see from the Table 2.1 we can screen 14 mutants (twenty runs per each mutant) in 24
hours on 200 nodes. And if you have enough computer resources you will be able to
screen up to 70 mutants per day using 1000 nodes with the reasonable convergence (see
Table 2.1 and Figure 2.3).
50
Figure 2.2. The correlation between the calculated (in black) and observed (in pattern)
activation free energies for the indicated systems, which are described in details in
section “Systems” of Chapter 3.
Figure 2.3. The overlay of ten EVB calculated activation free energy barriers.
51
Table 2.1. The efficiency of the EVB screening calculations.
a
Δg
≠
cat
using EVB
computer time per mutant
(runs, processors)
17 h
(1, 1)
No. of mutants per 24 h per 200 processors
(runs, processors)
14
(20, 20)
No. of mutants per 24 h per 1000 processors
(runs, processor)
70
(20, 20)
a
The calculations were conducted on the University of Southern
California HPCC (High Performance Computing and
Communication) Linux computer, using Dual Intel Xeon (64-bit)
3.2 GHz 2Gb Memory nodes.
52
CHAPTER THREE:
CHALLENGES AND ADVANCES IN VALIDATING ENZYME DESIGN
PROPOSALS: THE CASE OF THE KEMP ELIMNASE CATALYSIS
INTRODUCTION
Advances in rational enzyme design are expected to have a great potential in
industrial application and eventually in medicine.
95
In fact, the ability to design efficient
enzymes can be considered as the best manifestation of a true understanding of enzyme
catalysis. However, most attempts of rational enzyme design, and the resulting constructs
have been much less effective than the corresponding natural enzymes.
95
Moreover,
despite the progress in directed evolution (e.g., ref.
43
), we do not have unique rationales
for the resulting rate enhancements.
In our view the best way to identify the problems with the current rational design
approaches (for review see ref.
95
) is to examine the actual performance of simulations of
catalytic effects. Unfortunately, most current studies have not been based on simulations of
the relevant activation barriers. In this respect, it has been argued,
76, 106
that the problems in
generating highly effective designer enzymes in are due (at least in part) to the incomplete
modeling of the transition state (TS) and to the limited awareness to the key role of the
reorganization energy.
Capturing the reorganization energy by simulation techniques is, in fact, routinely
done by the empirical valence bond (EVB) approach, which is a semiempirical valence
53
bond type, quantum mechanical / molecular mechanics (QM/MM) approach (e.g., refs
103,
106
). More specifically, semiquantiative computational studies of the effect of mutations on
enzyme catalysis date back to the EVB simulations of the anticatalytic effect of mutations
of trypsin.
99
Subsequent calculations of known and predicted mutational effects include
EVB studies (e.g., refs
32, 50, 103, 106
) and more recent molecular orbital-combined quantum
mechanical / molecular mechanics (MO-QM/MM) studies (e.g., refs
27, 54, 55, 62
). The above
studies, and in particular the quantitative one, have established the importance of the
changes in reorganization energy upon mutations.
50, 106
Recent study
76
has explored the
ability of the EVB approach to be used in quantitative screening of design proposals. We
also recently demonstrated the ability of the EVB to reproduce the effect of directed
evolution in refining Kemp eliminases.
24
Here we will focus again on the Kemp elimination reaction
42
since the design of
enzymes that catalyze this reaction has been the center of recent excitements.
17, 79
At this
point it may be useful to clarify that the ability to screen computationally different design
proposals is not a trivial feature that is shared by all current approaches. That is, while the
accomplishments of constructing artificial enzyme from scratch (e.g., ref.
43
) are truly
impressive, it is not clear if the final screening is based on a reliable “scoring function”.
Here the best way to make a clear judgment is to take cases where we know the observed
catalytic effects and see which approach can reproduce them reliably. More specifically,
while mutation predictions would be very convincing, having a clear benchmark (with
known mutational results) and reproducing it by a given computer program should be a
clear condition for a proper validation. With this perspective in mind, we can start by noting
54
that gas phase calculations (or calculations with a small cluster of very few protein residues)
are very unlikely to reproduce the corresponding observed catalytic effects. Furthermore,
even recent attempts to use MO-QM/MM approach (e.g., ref.
3
) has not provided a
reasonable estimate of the observed catalytic effect or the trend of the mutational effects in
artificially design enzymes. The same is true even for recent attempts to improve the MO-
QM/MM calculations
2
and for ONIOM calculations.
44
Some of the problems with current perspectives in the field are illustrated by a
recent Current Opinions,
46
where the focus has been placed on catalytic proposals that are
known to be ineffective (e.g., dynamics and ground state destabilization), while overlooking
computational studies that actually reproduce the observed catalytic effect. Here the issue is
not the different opinions on what might be important in catalysis, but the which effect has
been found by proper computer modeling to contribute even mildly to catalysis where the
results are known (see refs
40, 106
). In other words, if we are inserted in computer-aided
enzyme design (CAED), we should ask what was found in computations that actually
examined all possible catalytic effects, and reproduce the observed catalysis, before we
venture to consider options that were found to give a minor (or no) effect. The issue here is
not what might contribute to catalysis, but what does contribute, and the judgment (in
particularly when it comes to CAED) should be based, at least at some level, on the ability
to reproduce the observed effect of the active site. This will help to eliminate effects that are
not capable to lead to catalysis.
With the above perspective in mind, we followed the preliminary study of ref.
24
,
but focused here on more extensive validation of the power of the EVB in calculating the
55
catalytic effect in Kemp eliminases, on exploring the effect of directed evolution and on the
requirement for improving the catalysis of the previously designed enzymes. We also focus
here on the origin of the difference between the linear free energy relationships (LFER) for
the Kemp elimination reaction in solution and in the designed enzymes as a way to better
understand the problems in obtaining effective catalysis. Furthermore, in view of the
uncertainty about the binding mode of the substrate in the system used in ref.
24
we put
significant effort on studies of catalytic antibodies where we have direct information on the
binding mode.
Overall, we demonstrate that our approach allows one to explore the effect of
different mutations and in principle to predict which mutations can lead to enhanced
catalysis. We also clarify the difference between our strategies to other less quantitative
approaches.
SYSTEMS
The enzymes chosen as a benchmark for the present study catalyze Kemp
elimination reactions
42
with different proton donors and proton acceptors (the reaction
studied are described schematically in Figure 3.1).
56
Figure 3.1. (A) A schematic description of the Kemp elimination reaction. (B) The
substrate structures used in the study: 5-nitrobenzisoxazole (1), 5,7-dinitrobenzisoxazole
(2), unsubstituted benzisoxazole (3) and 6-glutaramidebenzisoxazole (4).
In order to demonstrate the considerations in determining the rate constants for the
reference reactions in these systems we quantified first the energetics of the reaction in
water, with 5-nitrobenzisoxazole as a donor and a carboxylic acid as a base.
106
Here one
may try to use the rate constant of the uncatalyzed reaction in water (1.2
.
10
-6
M
-1
s
-1
at pH
=7.25),
79
but this reaction involves a large contribution from the hydroxide ion OH
-
available at this pH.
42
In this respect we would like to clarify that we have no problem
modeling pH effect in solution reaction (see ref.
37
), but the well understood effect of OH
-
will be different at different pHs, and thus is not so useful in establishing a reasonable
reference state (since the enzyme rate constant does not depend on the OH
-
concentration).
In fact, we are interested in a reaction with a well-defined base (and the effect of the base
concentration has been well understood quite early by the Biochemical community (see
57
discussion in ref.
106
)). Thus, as discussed repeatedly (e.g., ref.
106
) it is much more useful
and relevant to consider the “chemically filtered” reference reaction,
106
which is defined
as the solution reaction that follows exactly the same mechanism as the one in the
enzyme. Thus we determined the corresponding rate constant by taking the estimate
94
of
the rate of the reaction of 5-nitrobenzisoxazole in acetate buffer (5
.
10
-5
M
-1
s
-1
) and
extrapolating it to 55M to obtain the rate constant (k
cage
) of 0.0025 s
-1
, for the case when
the donor and acceptor (the substrate and a carboxylic acid) are placed at the same solvent
cage. Similar conclusions were obtained by other estimates, including ab initio
calculations (“EVB Calibration Procedure” section of Chapter 1) where we obtained a
barrier of about 17-18 kcal/mol, which is in the same range as the observed barrier
(considering the fact that we did not include entropic corrections). At any rate, the
relevant activation barrier for the reference reaction is given in Table 3.1.
58
Table 3.1. The correlation between the observed and calculated ∆g
‡
cat
(in kcal/mol) for
the Kemp Elimination reaction with different catalytic bases (acceptors) and substrates
(donors).
System
a
Base
b
Donor
c
∆g
‡
obs
d
∆g
‡
calc
SD
Water (cage) Glu/Asp 5-NO
2
21.2 21.2 0.30
34E4 antibody (wt, PDB: 1Y0L) E
H50
5-NO
2
17.9 17.3 0.47
34E4 mutant (E
H50
D, PDB: 1Y18) D
H50
5-NO
2
19.9 20.2 0.66
34E4 mutant (Y
L32
K, from wt) E
H50
5-NO
2
19.5 19.9
0.96
HG2 mutant (S265T, from HG2 wt) D126 5-NO
2
17.7 18.2 0.92
1A53-2 (X-ray from Prof. Mayo) E178 5-NO
2
20.0 20.7 1.01
Water (cage) Glu/Asp 5,7-(NO
2
)
2
17.7 17.7 0.30
34E4 antibody (wt, PDB: 1Y0L) E
H50
5,7-(NO
2
)
2
13.6 13.3 0.93
Water (cage) Glu/Asp unsubstituted 23.9 24.0 0.30
34E4 antibody (wt, PDB: 1Y0L) E
H50
unsubstituted 23.5 23.7 0.68
Water (cage) Glu 6-glutaramide 24.2 24.2 0.30
13G5 antibody (wt, PDB: 3FO0) D
H35
6-glutaramide 19.7 19.7 0.37
13G5 mutant (H
H95
N, from wt) D
H35
6-glutaramide 20.4 20.6 0.29
13G5 mutant (E
L34
Q, PDB: 3FO2) D
H35
6-glutaramide 19.8 19.4 0.52
13G5 mutant (E
L34
A, PDB: 3FO1) D
H35
6-glutaramide 18.4 18.3 0.38
Water (cage) His 5-NO
2
19.8 19.8 0.30
KE70 designed variant (wt) H16-D44 5-NO
2
18.5 19.3 0.67
KE70 mutant (D44N, from wt) H16-D44N 5-NO
2
19.1 19.5 1.09
Water (cage) Lys 5-NO
2
15.2 15.2 0.30
HSA (wt, PDB: 3B9L) K199 5-NO
2
17.9 17.7 0.84
BSA (wt, from HSA sequence) K222 5-NO
2
16.4 16.3 1.05
a
The system definition includes the name of the protein, its mutant and the X-ray
PDB code.
b
The base definition includes the name of the base and number in the protein X-ray
structure.
c
All donors are in the family of the benzisoxazoles with different group-
substitutions: 5-NO
2
states for the 5-nitrobenzisoxazole; 5,7-(NO
2
)
2
is 5,7-
dinitrobenzisoxazole, unsubstituted is unsubstituted benzisoxazole and 6-glutaramide
is 6-glutaramidebenzisoxazole.
d
The source of the experimental ∆g
‡
are give in the text.
59
Similar analysis (see the “Barrier Estimates” section of Chapter 1 and Table 3.2)
can also be performed with other bases or with other leaving groups and, in fact the
relevant results can be taken from ref.
31
but with the 55 M cage correction. It must be
emphasized that we are not talking on k
cat
/K
M
relative to the uncatalyzed reaction, where
the effect of the enzyme is larger, since this rate enhancement includes the well
understood effect of binding to the active site
106
and the real challenge is to optimize k
cat
.
Interestingly, and significantly, the reference reaction in a solvent cage is not much slower
than the reaction catalyzed by the originally designed enzyme (k
cat
=0.02-0.3 s
-1
; see
ref.
79
). This indicates that a major part of the catalytic effect is due to just placing the
donor and acceptor in a contact distance (see also ref.
17
).
Table 3.2. Experimental LFER for the acceptor and donor pK
a
in solution.
Donor
a
Acceptor
∆g
‡
obs
,
kcal/mol
pK
a
exp
(D)
b
pK
a
cxp
(A)
c
5-NO
2
Glu/Asp 21.2 4.1 4.1
5-NO
2
His 19.8 4.1 6.0
5-NO
2
Lys 15.2 4.1 10.5
6-glutaramide Glu/Asp
24.2 7.0 4.1
5,7-(NO
2
)
2
Glu/Asp 17.7 0.6 4.1
5,7-(NO
2
)
2
His 16.2 0.6 6.0
5,7-(NO
2
)
2
Lys 12.7 0.6 10.5
unsubstituted Glu/Asp 23.9 6.9 4.1
unsubstituted His 22.0 6.9 6.0
unsubstituted Lys 17.4 6.9 10.5
a
All donors are in the family of the benzisoxazoles with different
group-substitutions: 5-NO
2
states for the 5-nitrobenzisoxazole; 5,7-
(NO
2
)
2
is 5,7-dinitrobenzisoxazole, unsubstituted is unsubstituted
benzisoxazole and 6-glutaramide is 6-glutaramidebenzisoxazole;
b
Experimental pK
a
of the donor is taken from ref.
31
;
c
Experimental pK
a
of the acceptor is taken from ref.
42
.
60
The enzymes studied here are the catalytic antibodies 34E4
17
and 13G5;
18
designed template structures KE70 and KE59;
79
designed proteins HG2 and 1A53-2;
71
human (HSA)
110
and bovine serum albumins (BSA).
28
The X-ray coordinates of the
studied systems were obtained from Protein Data Bank and are listed in Table 3.1. A
typical system (34E4 antibody) is depicted in Figure 3.2. The studied enzyme systems
include 5-nitrobenzisoxazole, 5,7-dinitrobenzisoxazole, unsubstituted benzisoxazole and
6-glutaramidebenzisoxazole substrates (Figure 3.1(B)) with different catalytic bases
(Glu/Asp, Lys and His).
Figure 3.2. The structure of active site of the 34E4 catalytic antibody with 5-
nitrobenzisoxazole substrate.
61
METHODS
The free energy surface of the reference solution reaction of 5-nitrobenzisoxazole with
a carboxylic acid as a base was estimated by ab initio calculations, in the same way
described in our previous works (e.g., ref.
38
), and the resulting surface is shown in Figure
1.2 of Chapter 1 (i.e. Figure S1 of ref.
24
). The ab initio effective charges of the RS, TS and
PS are given in Figure 1.3 of Chapter 1 (i.e. Figure S2 of ref.
24
). The calibration
parameters of the calculated EVB surface are given in Table 1.2 of Chapter 1. The same
approach was used for the other reference reactions.
The reaction in the protein was modeled by the EVB method whose feature have
been extensively described in many previous works (see refs
39, 41
for a recent review) and
is described in Chapter 2 (“Methodology: Empirical Valence Bond Approach”). We note,
however, that the EVB surfaces are calibrated by forcing them to reproduce the above
mention ab initio results for the solution reactions. In doing so, we have to take into
account the complex nature of the reaction with a concerted proton transfer and C-N bond
breaking. Usually this requires three diabatic states, but for the present purpose we
decided to use two states with modified charges that reproduce the ab initio TS charges.
24
This treatment is justified, since we forced the EVB TS to reach the same geometry, and
more importantly the same charge distribution, as the corresponding ab initio solution
system, with the concerted path. Thus the whole issue of catalysis is related to the
difference between the reactant state and transition state and not the exact path between
them. In other words, in the present work we are looking for a fast yet accurate method
62
and once we realized that the standard two state charge distribution does not give the
correct TS charges, we have two options: either to use three states (which is more
expensive) or to use the current strategy, which guarantee the correct TS features, but
requires one to introduce a correction for the energetics of moving from the transition
state to the product state (which is not crucial to the present study). Of course, the second
option is more reasonable, considering what we are looking for.
Simulation Protocol
The EVB calculations were carried by MOLARIS simulation program
15, 48
using the
ENZYMIX force field. The EVB activation barriers were calculated at the configurations
selected by the same free energy perturbation umbrella sampling (FEP/US) approach used
in all our studies (e.g., refs
103, 106
) The simulation systems were solvated by the surface
constrained all atom solvent (SCAAS) model
48
using a water sphere of 18 Å radius
centered on the substrate and surrounded by 2 Å grid of Langevin dipoles and then by a
bulk solvent, while long-range electrostatic effects were treated by the local reaction field
(LRF) method.
48
The EVB region includes the substrate and the functional group of the
proton acceptor (e.g., the carboxylic group of the glutamic acid). Validation studies were
done within 22 Å radius of inner sphere, where we repeated the calculations of the
activation barrier and obtained practically the same results (of course treating the
distanced ionized groups with a high dielectric macroscopic model). The FEP mapping
was evaluated by 21 frames of 20 ps each for moving along the reaction coordinate using
SCAAS model. All the simulations were done at 300 K with a time step of 1 fs. In order
63
to obtain reliable results the simulations were repeated 20 times (see “Results: Modeling
the Catalytic Effects in Different Kemp Eliminases” section) with different initial
conditions (obtained from arbitrary points of the relaxation trajectory). The average was
determining by taking in each case the difference between the calculated minimum at the
RS and the given TS. The mutant systems were generated from the native enzymes via
100 ps relaxation runs.
Group Contributions
In addition to the above EVB approach, it is useful to have faster screening
approaches. Here our initial screening has been based on evaluation of electrostatic group
contributions that are defined as the effect of “mutating” all the residual charges of the
given group to zero. In principle, we could perform such mutations and evaluate the
PDLD/S-LRA binding energy for the given native and mutant proteins.
89
But since we are
dealing with charged and polar residues, it is reasonable to start with the faster screening
approximation introduced and described in ref.
96
. This approach estimates the electrostatic
contribution of different residues to the activation barrier by an expression that can be
formally simplified by grouping together the atomic charges of each residue and written:
332
i ij elec j
ij
ij
g q Q r ε
≠
∆ ∆ ≅ ∆
∑
(3.1)
where
j
q is the effective “charge” of the j
th
residue (which may represent dipoles in cases
of polar residues), the
s
Q ∆ are the changes in the substrate charges upon going from the
RS to the TS and
ij
ε is the dielectric constant for the specific interaction. Now we can
64
explore the trend of Eq. 3.1 for any possible mutations if we artificially assign to all
protein residues a charge of +1.0 and then ask what charge change will lead to the most
negative
elec
g
≠
∆ ∆
. This is obtained by expressing the step in the optimization procedure
step as in a steepest decent optimization of the barrier of Eq. 3.1. This leads to:
∆q
j ( )
opt
≅ − α ∂ ∆ ∆g
elec
≠
∂q
j
= − α ∆Q
i
r
ij
ε
ij
i
∑
(3.2)
where α is a proportionality constant and
j
q ∆ are proportional to the electrostatic group
contribution, when all the protein groups are positively charged. In this case if
j
q ∆ is
positive the given residue should be either negative or have its dipole orientation back
from the potential due to the
s
Q ∆ .
The treatment of Eq. 3.2 can be further refined by demanding that the mutations
will also retain or optimize the protein stability. This can be done by our recent focused
dielectric approach,
96
which gives an approximated expression:
∆G
fold
≅ 332 q
j
q
h
r
ih
ε
focus
ih
∑
(3.3)
where
focus
ε
is the optimal dielectric constant
96
and the
s
q reflects the given pH.
We can now optimize
j
q under the constraint of large folding energy, and this
lead to
( ) j j elec j
opt
fold
q qq Gg β α
≠
− ∂ ∂ ∂∆∆ ∂ ∆ − (3.4)
The use of the combined TS stabilization and stability constraints is expected to
provide a quite effective way for guiding the mutations of distanced ionized residues.
65
It should be noted that the group contribution can only provide very rough hints
since it does not reflect the reorganization energy consistently. In fact, obtaining more
quantitative group contributions is possible (especially for the actual given sequence by
using the linear response approximation that captures the reorganization effect).
RESULTS
Modeling the Catalytic Effects in Different Kemp Eliminases
The most basic requirement for effective enzyme design is the ability to reproduce
the observed catalytic effects of different design constructs and/or those of natural active
sites. Thus we started with the systematic evaluation of the activation barriers for the
different systems considered in Table 3.1. These systems involve different proteins,
different mutants and different bases. Our typical procedure of obtaining the average
activation barrier is illustrated in Figure 3.3 and Table 3.3, where we show the results of
20 EVB free energy profiles. The same averaging procedure has been applied to all the
system studied and the results of the calculations are summarized in Figure 3.4. The main
point that emerged from the Table 3.1 and Figure 3.4 is the fact that our EVB approach
provides a reliable and effective way for screening different design options. In particular
we would like to point out that our results are much more reliable than those reported by
alternative approaches (see “Conclusions” section of Chapter 3).
66
Figure 3.3. Results of 20 EVB mapping runs for the KE70 (wt) designed variant.
Table 3.3. The calculated reaction barriers for the KE70 designed variant for 20 starting
conformations.
a
τ, ps
∆g
‡
calc
,
kcal/mol
τ, ps
∆g
‡
calc
,
kcal/mol
10 18.71 110 18.97
20 18.61 120 19.45
30 18.9 130 19.07
40 18.44 140 19.67
50 19.28 150 19.82
60 20.15 160 18.99
70 18.51 170 20.11
80 20.67 180 19.08
90 18.19 190 20.08
100 19.03 200 19.78
average ∆g
‡
calc
, kcal/mol 19.28
∆g
‡
obs
, kcal/mol 18.51
SD 0.67
a
The given mapping starts from configurations
generated from a single trajectory after the
designated generation time (τ).
67
Figure 3.4. Correlation between the calculated and observed activation barriers.
After establishing the reliability of our EVB approach, we turn our focus to
attempts to explore possible avenues for controlling and changing the catalytic power of
Kemp eliminases. As discussed in our previous work
24
(see also ref.
29
) it is extremely hard
to design an effective Kemp eliminase since the change in charge distribution upon
moving to the TS is not large and furthermore this change is not localized. In fact, as
found in our previous work,
24
the main catalytic effect that has been obtained in Kemp
eliminases (beyond the trivial effect of bringing the donor and acceptor to the same cage)
is the destabilization of the ground state charge distribution of the proton acceptor.
Although this is not the regular avenue taken by native enzymes it is the simplest direction
for improving Kemp eliminases. At any rate, since this effect is associated with the pK
a
of the proton acceptor, one may ask what can be gained by changing the type of the donor
/ acceptor, and the corresponding analysis is given in the next section.
68
Examining the LFER Trends
An interesting and potentially promising aspect of our study is the ability to
explore the linear free energy relationship (LFER) between the catalysis and the nature of
the donor and the acceptor (the base). In solution we have a clear LFER for the 5-
nitrobenzisoxazole as a function of the acceptor pK
a
, for a series of different catalytic
bases (for e.g., pyridine catalytic base with pK
a
exp
=5.13 and ∆ g
‡
exp
=20.3 kcal/mol) with
β=0.93. This correlation is easily reproduced by the EVB model (see Figure 3.5) by
changing of the gas phase proton affinity (which is reflected by the EVB gas phase shift).
Figure 3.5. The observed and calculated LFER for the reaction of 5-nitrobenzisoxazole in
solution as a function of the acceptor pK
a
(the experimental values of the pK
a
(A) are taken
from ref.
42
).
Furthermore, we have a similar correlation between the activation barrier and the
∆pK
a
of the donor and acceptor in the reaction in water (Figure 3.6) and this correlation
is also reproduced by the EVB. It is tempting to assume that we may obtain a similar
69
correlation in the enzyme. Unfortunately, however, in the protein the situation is more
complex. That is, while the gas phase proton affinity is identical in the protein and in
solution, the local environment changes the apparent pK
a
in the protein (the calculated
values are given in Table 3.4). This is further complicated by the fact that the
environment of the proton donor is also changed by the protein. In view of this
complication we start by first exploring the correlation between the observed pK
a
of the
proton acceptor and the activation barrier (Figure 3.7).
Figure 3.6. Experimental and calculated LFER for the reaction in water as a function of
the corresponding ∆pK
a
. The experimental pK
a
(D) were taken from ref.
31
, and pK
a
(A) was
taken from ref.
42
. More details can be found in Table 3.2.
70
Table 3.4. Correlation between the calculated and observed acceptor pK
a
at ε
p
=4.0 and
ε
p
=6.0 in the protein.
System
a
Base
b
Donor
c
pK
a
exp
pK
a
calc
( ε
p
=4.0)
pK
a
calc
( ε
p
=6.0)
34E4 antibody (wt, PDB: 1Y0L) E
H50
5-NO
2
6.4 8.9 7.5
34E4 mutant (E
H50
D, PDB: 1Y18) D
H50
5-NO
2
6.6 5.4 4.6
13G5 antibody (wt, PDB: 3FO0) D
H35
6-glutaramide 6.0 6.4 5.5
13G5 mutant (H
H95
N, from wt) D
H35
6-glutaramide 5.8 7.3 6.2
13G5 mutant (E
L34
Q, PDB: 3FO2) D
H35
6-glutaramide 6.3 7.8 6.6
13G5 mutant (E
L34
A, PDB: 3FO1) D
H35
6-glutaramide 6.6 6.9 6.0
KE70 designed variant (wt) H16-D44 5-NO
2
- 5.5 6.4
KE70 mutant (D44N, from wt) H16-D44N 5-2 - 5.4 6.1
HSA (wt, PDB: 3B9L) K199 5-NO
2
9.2 7.1 7.8
BSA (wt, from HSA sequence) K222 5-NO
2
10.0 9.7 9.8
38C2 antibody (wt, PDB: 3FO9) K
H93
5-NO
2
8.9 5.3 7.2
a
The system definition includes the name of the protein, its mutant and the X-ray
PDB code.
b
The base definition includes the name of the base and number in the protein X-ray
structure.
c
All donors are in the family of the benzisoxazoles with different group-substitutions:
5-NO
2
states for the 5-nitrobenzisoxazole and 6-glutaramide states for the 6-
glutaramidebenzisoxazole.
Figure 3.7. The observed and calculated LFER as a function of the acceptor pK
a
for the
reaction in the protein.
71
As seen from the figures the correlation is very qualitative, with large deviations.
This is significant since we obtained much better correlation between the total calculated
and observed activation barriers (Figure 3.4). Thus we must conclude that the correlation
between the apparent pK
a
(or ∆pK
a
) and the activation barrier cannot provide a simple
guide improving the catalytic power of Kemp eliminases. More specifically, the
correlation between the activation barriers and the pK
a
of the proton acceptor in the
protein (Figure 3.7) does follow the same trend as the LFER in solution (Figure 3.5).
Furthermore, although the LFER in the protein for ∆ pK
a
(the calculated values are listed
in Table 3.5) has a slop in the correct direction, the overall effect on the activation barrier
is small since the ∆pK
a
in the protein is much smaller than the corresponding ∆ pK
a
in
water (Figure 3.8). Thus the effect on the activation barrier is small. A significant part of
the problem reflects the fact that an increase in the pK
a
of the acceptor, due to decrease in
polarity of its site, is usually accompanied by an increase in the pK
a
of the donor (Figure
3.8), due to reduce in polarity at the donor site.
Figure 3.8. The LFER as a function of ∆pK
a
for the reaction in the protein.
72
Table 3.5. Calculated LFER for the acceptor and donor pK
a
in the protein at ε
p
=4.0.
System
a
Base
b
Donor
c
∆g
‡
obs
,
kcal/mol
pK
a
calc
(D) pK
a
calc
(A)
13G5 antibody (wt, PDB: 3FO0) D
H35
6-glutaramide 19.7 10.2 6.4
13G5 mutant (H
H95
N, from wt) D
H35
6-glutaramide 20.4 9.5 7.3
13G5 mutant (E
L34
Q, PDB: 3FO2) D
H35
6-glutaramide 19.8 9.3 7.9
13G5 mutant (E
L34
A, PDB: 3FO1) D
H35
6-glutaramide 18.4 8.9 6.9
KE70 designed variant (wt) H16-D44 5-NO
2
18.5 9.7 5.5
KE70 mutant (D44N, from wt) H16-D44N 5-NO
2
19.1 9.5 5.4
HSA (wt, PDB: 3B9L) K199 5-NO
2
17.9 8.6 7.1
BSA (wt, from HSA sequence) K222 5-NO
2
16.4 7.1 9.7
a
The system definition includes the name of the protein, its mutant and the X-ray PDB
code.
b
The base definition includes the name of the base and number in the protein X-ray
structure.
c
All donors are in the family of the benzisoxazoles with different group-substitutions:
5-NO
2
states for the 5-nitrobenzisoxazole and 6-glutaramide states for
6-glutaramidebenzisoxazole.
Of course, one can try to generate different polarities in both sites, but due to the
complex nature of the change in the charge of the donor upon TS formation, it is hard to
obtain effective active site mutations that will stabilize the developing donor charge (see
ref.
24
). In other words it is much simpler, for example, to reduce the number of water
molecules in the active site than selectively stabilize the ionized form of the proton
donor. With this insight in mind we can look for a way to change the apparent pK
a
of the
acceptor by external charges. This is done in the following session.
73
Optimizing the Long Range Electrostatic Effects
The above studies have established our ability to conduct an effective screening
but did not address the issue of enzyme design. This issue was discussed in our previous
works
24, 76
and in the specific case of the Kemp elimination reaction it has been argued
that the refinement of the local environment cannot be effective, due to the small change
in charge distribution during the reaction. Thus we focused on the small improvements of
the catalytic effect by distanced ionized residues. That is, our previous studies (e.g.,
refs
100, 106
) have indicated that the effect of ionized residues at a distance of more than 6
Å can be approximated by using a relatively large effective dielectric constant for
charge–charge interaction. Here we can use the approach introduced in our previous
study,
24
and described in the “Group Contributions” section of Chapter 3, where we
optimize the effect of distanced ionized residues on the difference in charge distribution
between the RS and TS while simultaneously retaining the protein stability. This is done
by using Eq. 3.4.
Here we report a qualitative attempt to use Eq. 3.3 for the 13G5 system. The
calculated group contributions are given in Figure 3.9 and Table 3.6, where we focus on
the predicted effect of ionized distanced residues. The most effective sites for distanced
ionized residues are also depicted in Figure 3.10. For example, we would predict from
Table 3.6 a rate acceleration of about 5 (TS stabilization is about 0.9 kcal/mol) due to the
V
H98
E mutation, but this is done without exploring the effect on stability (by the use of
Eq. 3.4). The stability consideration should also take into account the fact that we already
have two negatively charged groups (E
H6
and D
H72
) at the same region. It may be simpler
74
to explore our predication by simply perfuming the E
H6
Q mutation (where we predict a
rate reduction by a factor of about 4).
Figure 3.9. Group contributions from the all distanced (A) and ionized distanced (B)
residues. The calculations were done for the 13G5 catalytic antibody. The group
contributions are for changing the charge at the indicated site from zero to +1.0. The
residue numbers from 2 to 215 and from 216 to 431 are located at the heavy and light
chains of the 13G5 catalytic antibody respectively.
75
Table 3.6. Group contributions from the distanced protein residues of the 13G5 catalytic
antibody due to the charge change upon moving from the RS to the TS and from the RS to
the PS. The residue numbers from 2 to 215 and from 216 to 431 are located at the heavy
(H) and light (L) chains of the 13G5 catalytic antibody respectively. The group
contributions are for changing the charge at the indicated site from zero to plus 1.0.
Residue number
(H-heavy chain; L-
light chain)
Residue
name
Distance to the region
I center
∆G
RS
TS
,
kcal/mol
∆G
RS
PS
,
kcal/mol
H2 VAL 18.91 0.83 1.72
H3 GLN 20.06 0.50 1.02
H4 LEU 16.49 0.73 1.64
H5 LEU 19.96 0.31 0.51
H6 GLU 18.98 0.62 1.51
H20 ILE 19.41 0.44 1.04
H21 THR 20.78 0.42 0.96
H22 CYS 17.12 0.73 1.63
H23 THR 19.77 0.21 0.47
H24 VAL 17.83 0.57 1.28
H25 SER 20.33 0.20 0.47
H26 GLY 20.85 0.83 2.25
H27 PHE 16.94 0.52 1.16
H28 SER 18.24 0.31 0.72
H29 LEU 15.87 0.48 1.17
H30 THR 17.59 0.27 0.60
H31 ASN 15.67 0.17 0.54
H38 ARG 16.07 0.39 0.98
H39 GLN 18.34 0.26 0.68
H40 PRO 19.99 0.29 0.46
H44 GLY 18.23 0.14 0.38
H62 ALA 16.05 0.14 0.32
H63 LEU 15.47 0.26 0.69
H65 SER 19.29 0.14 0.33
H66 ARG 20.03 0.15 0.37
H67 LEU 15.26 0.45 1.00
H68 SER 16.67 0.36 0.83
H70 SER 16.55 0.47 1.04
H71 LYS 15.40 0.52 1.07
H72 ASP 19.98 0.27 0.59
H73 ASN 18.93 0.32 0.70
H76 SER 20.51 0.38 0.84
76
Table 3.6 (Continued). Group contributions from the distanced protein residues of the
13G5 catalytic antibody due to the charge change upon moving from the RS to the TS and
from the RS to the PS. The residue numbers from 2 to 215 and from 216 to 431 are
located at the heavy (H) and light (L) chains of the 13G5 catalytic antibody respectively.
The group contributions are for changing the charge at the indicated site from zero to plus
1.0.
Residue number
(H-heavy chain; L-
light chain)
Residue
name
Distance to the region
I center
∆G
RS
TS
,
kcal/mol
∆G
RS
PS
,
kcal/mol
H77 GLN 19.77 0.31 0.69
H78 VAL 15.09 0.86 1.78
H79 PHE 18.97 0.46 1.02
H80 LEU 16.08 0.61 1.38
H82 MET 19.39 0.38 0.77
H91 ALA 19.96 0.52 0.78
H92 VAL 20.08 0.32 0.81
H93 TYR 17.61 0.44 1.11
H94 TYR 15.89 0.55 1.48
H110 HIS 16.55 0.45 1.12
H112 GLY 17.26 0.55 1.39
H113 GLN 20.25 0.47 0.66
H114 GLY 19.69 0.40 1.02
H115 THR 21.18 0.45 0.90
H215 GLU 16.78 -0.20 -0.40
L217 VAL 16.60 -0.20 -0.40
L219 THR 19.48 -0.20 -0.30
L220 GLN 18.71 -0.10 -0.30
L237 CYS 17.58 -0.30 -0.60
L238 ARG 19.33 -0.10 -0.30
L239 SER 16.38 -0.30 -0.80
L240 SER 17.74 -0.30 -0.60
L241 GLN 15.75 -0.30 -0.80
L242 SER 16.83 -0.40 -0.90
L244 VAL 15.14 -0.50 -1.30
L246 SER 15.28 -0.40 -1.10
L248 GLY 15.88 -0.50 -1.20
L254 TRP 15.88 -0.30 -0.70
L257 GLN 19.23 0.16 0.44
L262 SER 18.61 0.32 0.84
L263 PRO 15.59 0.33 0.93
77
Table 3.6 (Continued). Group contributions from the distanced protein residues of the
13G5 catalytic antibody due to the charge change upon moving from the RS to the TS and
from the RS to the PS. The residue numbers from 2 to 215 and from 216 to 431 are
located at the heavy (H) and light (L) chains of the 13G5 catalytic antibody respectively.
The group contributions are for changing the charge at the indicated site from zero to plus
1.0.
Residue number
(H-heavy chain; L-
light chain)
Residue
name
Distance to the region
I center
∆G
RS
TS
,
kcal/mol
∆G
RS
PS
,
kcal/mol
L264 LYS 18.68 0.27 0.81
L266 LEU 18.15 -0.10 -0.20
L267 ILE 16.93 -0.30 -0.60
L270 VAL 15.28 -0.60 -1.40
L271 SER 18.55 -0.30 -0.80
L272 ASN 17.58 -0.30 -0.70
L273 ARG 19.91 -0.10 -0.20
L274 PHE 16.97 -0.10 -0.10
L283 GLY 20.05 -0.20 -0.60
L284 SER 20.15 -0.10 -0.40
L285 GLY 20.31 -0.30 -0.70
L286 SER 19.72 -0.30 -0.80
L287 GLY 17.85 -0.40 -0.90
L288 THR 18.86 -0.30 -0.70
L289 ASP 19.88 -0.10 -0.50
L290 PHE 17.12 -0.40 -0.90
L291 THR 20.41 -0.10 -0.20
L292 LEU 20.22 -0.10 -0.30
78
Figure 3.10. Predicted sites of the most effective mutations for catalysis by distanced
residues.
We focus here on predicting the effects of ionized distanced residues in view of
the fact that the use of Eq. 3.1 for predicting the effect of active site residues (in the
neighborhood of the substrate) is not expected to be particularly effective. Of course, we
can use the full EVB calculations in the same way used in section “Modeling the Catalytic
Effects in Different Kemp Eliminases” of Chapter 3 to predict catalytic mutations, but as
discussed in ref.
24
, it is hard to obtain major catalytic effects in the case of Kemp
elimination reaction and thus it would be more effective to try to exploit the simpler
strategy of using Eq. 3.1 for distanced residues. In this respect it is useful to consider the
His
H95
Asn mutant, where the mutation of His to Asn reduces the rate significantly.
However, here it is possible that in the case of the native enzyme His
H95
is the proton
acceptor and Asp
H35
stabilizes the protonated His
H95
, while in the case of the H
H95
N
79
mutant Asp
H35
is the proton acceptor. The exact role of the Asp-His pair will be the
subject of a subsequent study. Here we mainly bring this point to illustrate the difficulty in
proposing modification of the ionization states of residues in the immediate neighborhood
of the reacting system. On the hand, the effect of distanced residues is much simpler to
predict and the prediction is likely to be reliable.
Examination of the Inactive Designs
One of the questions that should be addressed in enzyme design is the ability to
eliminate design constructs that lead to inactive enzymes. This issue is explored here by
calculating the activity of several inactive enzymes. The results are summarized in Table
3.7. As seen from the table we obtained high activation barriers for the non-active
systems examined. Obviously a more extensive examination is needed, however, despite
the interest in predicting non-active constructs and eliminating them from design
experiments, we place much more importance on the ability to improve the catalytic
activity of mutants that are already slightly active and believe that this is the direction
where would be the most beneficial to use our quantitative methods.
80
Table 3.7. The calculated ∆g
‡
cat
(in kcal/mol) for the Kemp Elimination reaction of inactive
enzymes.
System
a
Base
b
Donor
c
∆g
‡
obs
,
kcal/mol
∆g
‡
calc
,
kcal/mol
Water (cage) Glu/Asp 5-NO
2
21.2 21.2
KE59 designed variant E231 5-NO
2
inactive 24.3
HG2 (wt, X-ray from Prof. Mayo) D126 5-NO
2
inactive 31.7
HG2 mutant (S265A, from wt) D126 5-NO
2
inactive 26.7
a
The system definition includes the name of the protein, its mutant and the
X-ray PDB code.
b
The base definition includes the name of the base and number in the protein
X-ray structure.
c
All donors are in the family of the benzisoxazoles with different group-
substitutions: 5-NO
2
states for the 5-nitrobenzisoxazole; 5,7-(NO
2
)
2
is 5,7-
dinitrobenzisoxazole, unsubstituted is unsubstituted benzisoxazole and 6-
glutaramide is 6-glutaramidebenzisoxazole.
CONCLUSIONS
The rational design of enzymes with native activity requires the ability to predict
the proper TS stabilization, and this involves the challenge of capturing the overall
preorganization effect. Attempts to estimate the catalytic effect by using gas phase
models or even by looking at the electrostatic interaction between different residues and
the TS are unlikely to reproduce the correct catalytic effect since it is impossible to assess
the preorganization effect without including the protein and simulating its reorganization
during the reaction.
The challenge of evaluating the catalytic power of a given mutant is not different
than that addressed in our early 1986 study of computer-aided mutations.
99
At this stage
it seems to us that the potential of the EVB has been demonstrated in well defined cases
81
(e.g., refs
76, 87
), where it was found to reproduce the large effects of mutations that
destroy the catalytic effect of evolved enzymes. Thus our main current challenge is to use
this approach in improving non-efficient enzymes.
It is also important to clarify that we appreciate the advances made in designing
artificial enzymes (and clearly those done with catalytic antibodies), in terms of
generating active sites that bind and fit the given reacting system. However, we do not
believe that the current steps are sufficient for generating effective catalysts and a CAED
must involve the ability to predict the catalysis in the given active site.
At this point it is useful to clarify the difference between our EVB approaches
and current alternative approaches. The essential requirement from a proper screening
approach is the ability to reproduce the observed catalytic effects. Obviously this major
requirement cannot be accomplished by gas phase models (including even gas phase
models with the substrate and very few residues) that were used for the initial screening
in some cases (e.g., refs
31, 79
). Instructive MM and related simulations
43, 44
can tell us
about the optimal donor / acceptor geometries and to help in generating proper scaffolds
for the reacting systems, but are unlikely to be able to predict the catalytic trends in
properly oriented systems. More relevant and instructive would be a comparison of the
EVB to current MO-QM/MM studies of enzyme design. Here it would be useful to
consider several recent studies of the Kemp eliminase and related systems: the
semiempirical MO-QM/MM study of Jorgensen and coworkers have provided reliable
results for the water reference reactions,
1
but the predicted trend in the protein
3
is not
encouraging. More specifically, the MO-QM/MM approach performs nicely in exploring
82
the effect of changing the distance between the donor and acceptor (i.e., the Glu to Asp
mutation).
4
However, the real challenge is to reproduce the effect of changing the
environment (which occurs in directed evolution experiments and is usually responsible
for the catalytic activity) and this challenge has not been yet met by the current MO-
QM/MM studies of Kemp eliminases (which drastically underestimated the barrier in the
enzyme). Interestingly ref.
2
argued that it could improve the MO-QM/MM results.
However, the reported results (and the agreement with the corresponding experimental
results) seem to overlook the energetics of forming the protonated water molecule that is
assumed to be the proton donor. Perhaps the current difficulties with the MO-QM/MM
are due to the fact that the reported studies kept the main chain fixed. Alternatively, Houk
and coworkers
44
have attempted to use ONIOM truncated protein model but obtained a
relatively poor agreement (with a spread of about 12 kcal/mol for experiment deviations
of 2 kcal/mol (see Figure S3 in ref.
44
). This work also presented views that might lead to
some confusion. First, there are problems with the argument that calculations with an
error of 1.5 kcal/mol may not be useful since the observed mutational effects are around
2 kcal/mol. In fact, predictability with 2 kcal/mol error range would be a fantastic tool in
attempts to generate enzyme with large catalytic effects. Second and potentially more
problematic is the idea that the predicted power requires a very high level QM method.
This suggestion is risky (in terms of its possible impact on the experimental community)
and unjustified. That is, predictive approaches like the EVB are not interested in
predicting the absolute QM energy of the substrate since what counts is the change in this
energy upon moving from water to the protein active site. Thus the effort in developing
83
predictive method must be spent on having a good convergence and a proper long rang
treatment and not on getting the best basis set.
Overall, we have no doubt that MO-QM/MM approach with proper sampling
(e.g., with our approach of using a reference potential
78
will be able to provide a proper
screening tool (in particularly when used with an EVB as a reference potential).
However, at present the EVB seems to provide the most effective way for obtaining
reliable screening results perhaps because its ability to allow for sufficiently extensive
sampling.
Some workers may see a great potential for using disolvation effects in enzyme
and causing a catalysis by ground state destabilization (GSD) (see discussion in ref.
106
)
but native enzymes do not catalyze reaction by exploiting disolvation effects.
106
Furthermore, even Kemp eliminases have not been able to exploit this effect
significantly. One of the problems is that even if we could create a strong RS disolvation
for the base it would lead to a very large pK
a
and this will not help at physiological pH.
That is, if we try to destabilize the RS by destabilizing the ionized base (e.g., the ionized
Asp), the base will be protonated by a bulk proton. Here the best option is to use the
polar TS stabilization but unfortunately it is very hard to obtain for the Kemp TS (see
ref.
24
). Perhaps a part of the reason why enzymes do not use disolvation effects is the
available pH range and the available amino acids (see discussion of ODCase in ref.
104
).
Note that the GSD issue has been established with the reliable linear response
(LRA) calculations of the ground state solvation free energy
24
and by detailed
comparison to the related case in dehalogenase where a similar situation is handled by a
84
neutrally evolved enzyme in a completely different way, with ground state stabilization
and with very large transition state stabilization (see ref.
24
). Our conclusion about the fact
that the Kemp eliminases use RSD is not related to the exact structure of the TS, namely
concerted or stepwise, but to the charge distribution of the TS (which has been treated her
in a rigorous by our special procedure).
Since the present work invested a major computational power in validating the
EVB results, one may ask why we have not provided some predictions. The answer
involves two points. First, we do provide several clear predictions with regards to the
effects of mutating distanced residues predictions. This reflect our finding (see ref.
24
that
it is extremely hard to get large catalysis s in Kemp eliminases by simple mutations of
the active site residues (this is why we enough turn our attention to other systems where
the change in the substrate charge upon going to the TS are larger). Second we already
demonstrate our ability to have reasonable predictions of the Asn155 to Ala in subtilisin
in ref.
32
. Thus the present paper is more about what does it take to get a reliable
prediction than about actual predictions. More specifically, in our previous works (e.g.,
ref.
76
) we examine the trade on between speeds and reliability in different approaches for
enzyme design. At present we feel that our fast strategies (like using group contributions
and reorganization energies are not predictive enough, and that it is important to use the
extensive averaging considered here). However, with the current advances in computer
power, this is not such a bad news. That is as seen from Table 3.8 we can screen 14
mutants (twenty runs for each mutant) in 24 hours on 200 nodes and 70 mutants can be
screened using 1000 nodes. We can (and we had) used more parallel approach where the
85
mapping is distributed on several processes but this did not lead to more efficient
screening.
Table 3.8. Estimating the efficiency of the EVB screening calculations.
a
computer time
per mutant
(runs, processor)
No. of mutants per 24 h
per 200 processors
b
No. of mutants per 24 h
per 1000 processors
b
∆g
‡
cat
using EVB
17 h (1, 1) 14 70
a
The calculations were conducted on the University of Southern California HPCC
(High Performance Computing and Communication) Linux computer, using Dual
Intel Xeon(64-bit) 3.2 GHz 2GB Memory nodes.
b
Twenty runs per mutant.
At present there are still many who believe in dynamical and other esoteric effects
that are presumed to contribute to catalysis (for review see ref.
40
). In many cases it is
clearly suggested that improving such effects will be crucial for optimal enzyme design
(e.g., ref.
64
). However, it seems to us that by far the main factor that actually contributes
to catalysis is the preorganization effect and thus we feel that there is no rational way for
improving dynamics and related effects as these factors do not contribute to catalysis.
40
Furthermore, we would like to clarify that TS stabilization by delocalization effects
79
is
unlikely to provide a significant catalytic factor since the same effect exists in the
reference solution reaction. Thus, the possible effect of π-stacking (which was
considered in ref.
79
) is not expected to lead to a significant rate enhancement above the
simple effect of having nearby induced dipoles, which is much less effective than having
86
preorganized polar environment. In fact, as realized by Hilvert and coworkers
17
the
corresponding dispersion or more precisely inductive effect is small.
Our previous work
24
attempted to refine the electrostatic environment near O1.
This effort can be considered by some as an extension of the idea of placing an acid near
O1 (e.g., refs
17, 43
). However, the idea that such a base is needed is reminiscence from
physical organic chemistry concepts that capture some of the electrostatic effect, but end
up looking at factors that do not play major role in enzyme catalysis. In our view, the
issue is not a charge transfer or a covalent bond to the substrate as might be concluded
from gas phase calculations, since we are not dealing here with a bifunctional reaction
with two steps (unless we have a new chemistry), but with a pure electrostatic
stabilization. It is true that the attempts to focus on the base lead in some cases to what
we consider the correct direction, like placing Lys or His near O1,
43
but this has little to
do with the pK
a
of Lys, as one would assume from the traditional picture. It actually
reflects the electrostatic free energy of interactions.
Obviously our strategy can be and will be improved in the future, but the main
point is the ability to consider enzyme design by using energy based rational way. In
this respect it is useful to point out a special nontrivial advantage of our CAED
approach. That is, in the case of attempts to improve a specific enzyme, there is an
acquired advantage in the accumulated experience of modeling different mutants (as was
the case here in the study of the evolved mutants). Even significant errors in predicting
the first set of mutants can lead to improvement of the model (e.g., in selecting enzyme
specific dielectric to compensate for convergence difficulties in the treatment of ionized
87
residues) and to a better understanding of the specific enzymes and thus to a better
prediction ability in further design rounds. Such an advantage is not expected from
computational design approaches that are not base on capturing the physics of the
catalytic process.
Finally, we would like to reclarify that we have demonstrated the ability to
reproduce quantitatively the absolute catalytic effects and mutational effects in naturally
evolved enzymes
106
and in designer enzymes (this work). This clearly indicates that the
catalytic power of enzyme is not due to elusive effects (e.g., conformational dynamics),
but to what is by now well understood; namely the electrostatic preorganization. Thus
our difficulties in improving designer enzymes are not due to overlooking misunderstood
factors, but to the difficulties in optimizing well understood factors. In other words, a
method that reproduces the catalytic rate in known systems should be able to do so in any
unknown sequence and the challenge is to find the unknown optimal sequence. At any
rate, it seems to us that the present study provided a useful analysis of the reasons for the
less than perfect performance of current designer enzymes.
88
CHAPTER FOUR:
TOWARDS QUANTITATIVE COMPUTER-AIDED STUDIES OF ENZYMATIC
ENANTIOSELECTIVITY: THE CASE OF CANDIDA ANTARCTICA LIPASE A
INTRODUCTION
Optimizing enzymes to catalyze selective enantioselective reactions has a major
potential in biotechnology.
10
For example, the use of biocatalyst for efficient synthesis of
enantiomerically pure chiral molecules is of great importance in the production of drugs
by the pharmaceutical industry.
13
Furthermore, understanding the observed
enantioselectivity in different enzyme presents a significant challenge for approaches
aimed at understanding enzyme catalysis. Experimental studies of enantioselective
enzymatic reactions have provided major advances in recent years (e.g., refs
20, 70, 75
). A
major focus of these studies turned to lipases
20, 51, 52, 69, 80, 81
via an examination of
esterification reactions,
68
solvent effects,
49
the temperature effects
52
and substrates
effects.
57, 69, 80
Furthermore, instructive advances have been done with directed evolution
experiments.
20, 81
Enantioselective enzymatic reactions have also been examined by theoretical
approaches, including MM, MD studies
66, 68, 69, 74
and QM/MM,
93
but these interesting
studies have not provided quantitative insight (in our view quantitative prediction
requires one to have calculated values that are within 1-2 kcal/mol from the
89
corresponding observed values). Attempt to use cluster QM model
30
has provided
interesting insight and encouraging results but such approach might find difficulties in
capturing entropic effects and in overestimating strong steric effects. Here the initial
model building may be crucial since the effect of alternative conformations is hard to
assess. Overall it is important to explore QM/MM approaches that involve extensive
sampling and evaluate the actual activation free energies, since such strategies should be
able to explore entropic effects and to allow for exploring, more realistic relaxation of the
active site. Here it is important to validate the method used by careful comparison to
available experimental results and an excellent test case is provided by the observed
enantioselectivity of Candida antarctica lipase A (CALA) and its mutants.
At this point it is important to clarify that the use of force field and energy
minimization method in assessing the stereoselectivity of TS models (e.g., ref.
81
) can
provide interesting insight, but may not be able to capture the quantitative aspects of the
actual TS binding free energy of different enantiomers. Attempts to use average
interaction energies from MD simulations (e.g., refs
11, 74
) can give sometimes the correct
trend. However, the study of ref.
74
required rather arbitrary selection of the regions
included in the averaging and different regions gave different results, while the
insightful study of ref.
11
drastically overestimated the energy change upon mutations. An
interesting attempt to obtain more quantitative results study has been report by ref.
72
, who
explored the enantioselectivity of dehalogenase by using the LRA/β and exploring the so
called near attack conformation (NAC). However, this method has not been validated in a
quantitative way as it did not provide the actual calculated enantioselectivity but rather
90
used the LRA binding energy and the NAC criterion as a way to assess the observed
selectivity.
In trying to obtain more quantitative results it is crucial to improve two aspects of
the modeling; namely the potential surface and the sampling. That is, trying to evaluate
the free energy of mutating the R to S enantiomers using a force field model of their TS
can be useful, but here it is important to determine the correct charge distributions and
structure of the TS and this can be best done by a QM/MM approach (it is important to
capture the change of the TS charge and geometry upon interaction with the enzyme
active site). Unfortunately, the use of a QM/MM approach is unlikely to give reliable free
energies without extensive sampling (in fact, the sampling of the enzyme substrate
configurations is also the most crucial requirement in classical force field studies). Here
the empirical valence bond (EVB) arguably provides the optimal current strategy, since it
combines a reliable semiempirical QM/MM model with the ability for extremely
effective sampling.
This work demonstrates that the EVB approach allows one to explore the effect of
different mutations that switch the catalytic activities between the R and S enantiomers of
the Candida antarctica lipase A (CALA). We also examine faster screening approaches
and find the linear response approximation (LRA) in its LRA/β version to have some
promising aspects. Our study establishes the need for extensive sampling in computational
attempts to quantify the selectivity. Thus although using a few structures may produce the
correct results, trying other structure is likely to give different results.
91
CONCEPT OF ENANTIOSELECTIVITY
The enantioselectivity, which is the main subject of the present work, is defined by
the catalytic efficiency ratio (E) of the enzymatic rate of the two enantiomers. A
convenient way for expressing this ratio is given by:
E( fast) = k
cat
K
M
( )
fast
k
cat
K
M
( )
slow
(4.1)
where the notation “fast” stands for the enantiomer with the larger k
cat
/K
M
. The free
energies that are relevant to k
cat
/K
M
(or more precisely to k
cat
/K
bind
(RS)) can be expressed
in terms of the TS binding free energy
34, 103
ΔG
bind
(TS)= Δg
‡
p
– Δg
‡
w
= –RT ln [k
cat
/ K
bind
(RS)] + RT ln (k
B
T / h) + RT ln k
w
– RT ln (k
B
T / h)
= –RT ln [k
cat
/ K
bind
(RS)] + RT ln k
w
(4.2)
where Δg
‡
p
is the activation barrier that corresponds to k
cat
/K
bind
(RS) (see ref.
34
).
In principle we can (and will) calculate the TS binding free energy, but in the first
stage we can just focus on the difference in k
cat
rather than k
cat
/K
M
. The contribution to
the enantioselectivity in terms of k
cat
will be called here E’:
( ) ( )
'
cat cat
fast slow
E (fast)= k k (4.3)
where
( ) ( )
( )
ln
cat cat fast slow
fast slow
RT k k = g g
≠ ≠
− ∆ −∆ (4.4)
92
SYSTEM
CALA is a serine hydrolase whose catalytic mechanism has been studied
extensively (for e.g., refs
98, 102
). The CALA active site includes the catalytic triad of
Asp334, His366 and Ser184, which acts in the same way as the well studied serine
proteases
98, 102, 103
where a proton transfer from Ser184 to His366 is followed by a
nucleophilic attack of the ester carbonyl by the deprotonated Ser184 (Figure 4.1). The
enzyme catalyzed the reaction by stabilizing the negatively charged oxyanion by an
oxyanion hole, the protonated Asp95 and Gly185, (the same type of oxyanion hole is a
key catalytic factor in various proteases)
98, 102
and by the electrostatic interaction between
Asp334 and the ionized His366 (again in analogy with serine proteases).
16, 103
The
reaction can be rate limiting by either the acylation or the deacylation steps
52, 81
and in the
case of α-substituted esters hydrolysis by CALA studies here it is limited by the acylation
step.
81
93
Figure 4.1. A schematic description of the acylation step in the catalytic reaction of the
CALA. The reaction is considered as a two-step mechanism, where step (1) involves a
proton transfer from Ser184 to His366 and step (2) involves the attack of the negatively
charged serine on the carbonyl carbon of the substrate and a formation of a tetrahedral
intermediate.
System Preparation
The starting coordinates of the unbound CALA were obtained from Protein Data
Bank (PDB: 2VEO, 2.20 Å).
21
The R and S enantiomers of 4-nitrophenyl 2-
methylheptanoate substrate were built into the free enzyme (shown in Figure 4.2) using
the AutoDock 4.0 software.
59
The automated docking of the ligand to the CALA was
94
performed using the standard protocol (similar to our previous studies
89
). The ligand was
prepositioned within the binding pocket similar to the ligand placement shown in ref.
81
.
Using the AutoDock 4.0 program
59
several good ligand conformations were found. Based
on the known biological and modeling data, the preferred ligand conformation was
identified that were used as the starting structures for MD simulations. During the system
preparation, the crystal waters were removed; all hydrogen atoms and water molecules
were added using the MOLARIS software package.
15, 48
Figure 4.2. The structure of active site of the wild type CALA with the R and S
enantiomers of the 4-nitrophenyl 2-methylheptanoate substrate.
95
It must be emphasized at this point that our results do not depend on the model
building used (as long as we have reasonable starting point) since we perform very
extensive averaging that allows the system to find its lowest free energy landscape.
The partial atomic charges of the ligand were determined from the electronic wave
functions by fitting the resulting electrostatic potential in the neighborhood of these
molecules using Merz-Kollman scheme. The electronic wave functions were calculated
with hybrid density functional theory (DFT) at the B3LYP/6-311G** level, performed
with the Gaussian03 package.
23
The generated protein complex system (that includes the protein, bound ligand,
water and Langevin dipoles) was preequilibrated for 200ps at 300K with a time steps of 1
fs using the ENZYMIX force field.
15, 48
The spherical inner part of the system with radius
18 Å was constrained by a weak harmonic potential of the form,
02
' ()
i i i
V Ar r = Σ−
, with
A=0.03 kcal mol
-1
Å
2
to keep the protein atoms near the corresponding observed positions.
Along with the inner spherical constraints, the weak residue constraints of 0.5 kcal mol
-
1
Å
2
were applied on the substrate, Asp95, His366 and Asp334 (as these residues are in
similar position in the R and S enantiomers). The protein atoms outside this sphere were
held fixed and their electrostatic effects excluded from the model.
The mutant systems were generated from the wild type (wt) CALA X-ray structure
using PyMOL molecular graphics software
19
with the following 200 ps relaxation.
96
METHODS
EVB Calibration Procedure
Our study started with a systematic analysis of the reference reaction for histidine
assisted ester hydrolysis in solution, using the relevant experimental information. The
calibration procedure includes the energetics of the proton transfer (PT) from serine to
imidazole and energetics of the following nucleophilic attack (NA) of the ionized serine
and carbonyl group of the substrate (see ref.
103
).
The energy of the proton transfer step in water is determined from the pK
a
values
(pK
a
(Ser) ~ 16 and pK
a
(His) ~7) and is found to be 12 kcal/mol. The energy of forming
the tetrahedral intermediate was calibrated by using the rate constant of the uncatalyzed
reaction in water of the methanol with p-nitrophenyl acetate (600 M
-1
s
-1
)
35
and
extrapolating it to 55M (which guarantees the presence of the OH
-
in the solvent cage).
The resulting rate constant (k
cage
) is around 3.3
.
10
4
s
-1
and an activation free energy
equals 11.3 kcal/mol.
Combining the free energy for the PT step and the nucleophilic attack gives a total
activation barrier (
cage
G
≠
∆ ) of ~ 23.3 kcal/mol for our reference reaction in solvent cage.
The barrier for a concerted path is expected to be very similar to our stepwise estimate
(see ref.
91
). The above estimate was used to calibrate the EVB surface for our reference
reaction in solution and the corresponding free energy surface is shown in Figure 4.3. The
corresponding free energy surface for the enzymatic reaction will be considered below.
97
Figure 4.3. The calculated free energy profile for reaction in water.
Empirical Valence Bond Simulations
The calculations of the activation free energies were performed by the empirical
valence bond (EVB) method. This method that has been described extensively
elsewhere
5, 103
and is described in Chapter 2 (“Methodology: Empirical Valence Bond
Approach”). The EVB is an empirical quantum mechanics/molecular mechanics
(QM/MM) method that can be considered as a mixture of diabatic states describing the
reactant(s), intermediate(s) and product(s) in a way that retains the correct change in
structure and charge distribution along the reaction coordinate. The EVB diabatic states
provide an effective way for evaluating the reaction free energy surface by using them for
driving the system from the reactants to the product states in a free energy perturbation
umbrella sampling procedure. The reason for the remarkable reliability of the EVB is that
it is calibrated on the reference solution reaction and then the calculations in the enzyme
98
active site reflect (consistently) only the change of the environment, exploiting the fact
that the reacting system is the same in enzyme and solution. Thus, the EVB approach is
calibrated only once in a study of a given type of enzymatic reaction.
Simulation Protocol
The EVB for the present reaction has been constructed by using the three states
described in Figure 4.4. The EVB parameters for the surfaces of the solution reaction
were calibrated by using the available experimental information about this reaction
(“EVB Calibration Procedure” section of Chapter 4). The calibrated parameters were kept
unchanged for the generation of the protein EVB surface.
Figure 4.4. The three resonance structures (
i
Ψ ) used to describe the acylation reaction of
CALA.
1
Ψ ,
2
Ψ and
3
Ψ describe the reactant state, the product of the proton transfer step
and the tetrahedral intermediate, respectively.
The EVB calculations were carried by MOLARIS simulation program using the
ENZYMIX force field.
15, 48
The EVB activation barriers were calculated at the
configurations selected by the same free energy perturbation umbrella sampling
(FEP/US) approach used in all our studies (e.g., refs
103, 106
). The simulation systems were
99
solvated by the surface constrained all atom solvent (SCAAS) model
48
using a water
sphere of 18 Å radius centered on the substrate and surrounded by 2 Å grid of Langevin
dipoles and then by a bulk solvent, while long-range electrostatic effects were treated by
the local reaction field (LRF) method.
48
The EVB region includes the carbonyl group of
the substrate and imidazole of the histidine and hydroxyl group of the serine. Validation
studies were done within 22 Å radius of inner sphere, where we repeated the calculations
of the activation barrier and obtained practically the same results (of course treating the
distanced ionized groups with a high dielectric macroscopic model). The FEP mapping
procedure involved the use of 21 frames (5 ps each) for moving along the reaction
coordinate. All the simulations were done at 300 K with a time step of 1 fs. The weak
residue constraints of 0.5 kcal mol
-1
Å
2
were applied on the substrate, Asp95, His366 and
Asp334
to keep the atoms near the corresponding observed positions. The simulations
were repeated 10 times in order to obtain reliable results with different initial conditions
(obtained from arbitrary points of the relaxation trajectory). Furthermore, the hysteresis
in the calculation was examined by performing forward and backward (which gave very
similar results). The average was determining by taking in each case the difference
between the calculated minimum at the reaction state (RS) and the given transition state
(TS). This simulation protocol was applied to both reaction steps, the proton-transfer step
and the nucleophilic attack.
Asp95, which forms a part of the oxyanion hole, was consider to be protonated based on
its calculated pK
a
(pK
a
calc
= 6.5) and mutations experiments.
82
The relevant pK
a
calculations were performed using the MOLARIS package.
85
100
The Van der Waals parameters of the hydroxyl oxygen of Ser184 with protein or water
molecules were fine tuned based on the calculation of the solvation free energy
calc
solv
G = -91 ∆ kcal/mol, which is comparable to the corresponding observed value
obs
solv
G = -92 ∆ kcal/mol.
101
The calibrated EVB parameters are listed in Table 4.1 and 4.2.
Table 4.1. Parameters used in the EVB calculations.
a
The EVB charges used in calculations
atom
name
atom
type
ψ
1
b
ψ
2
ψ
3
His366 C CG 0.035 0.330 0.330
N ND1 0.015 -0.200 -0.200
H HD1 0.187 0.400 0.400
C CE1 0.110 -0.030 -0.030
H HE1 0.075 0.250 0.250
N NE2 -0.520 -0.090 -0.090
C CD2 0.028 -0.310 -0.310
H HD2 0.070 0.260 0.260
Ser184 O OG -0.427 -1.000 -0.200
H HG 0.427 0.390 0.390
substrate C C 0.392 0.392 0.200
O O -0.392 -0.392 -1.000
Morse bond parameters:
( )
( )
0
2
() 1
bb
M
b MD e
µ −−
∆ = −
Bond type D
M
b
0
µ
C
i
-C
i
96.0 1.40 2.0
C
i
-N
i
94.0 1.40 2.0
C
i
-H
i
100.0 1.10 2.0
N
i
-H
i
98.0 1.10 2.0
OG-HG 102.0 0.96 2.0
OG-C 93.0 1.50 0.8
C-O (ψ
1
;ψ
2
)
b
93.0 1.25 2.0
C-O (ψ
3
)
b
93.0 1.40 0.8
101
Table 4.1 (Continued). Parameters used in the EVB calculations.
a
Angle parameters:
( )
2
0
( ) 12 Vk
θθ
θ θ θ = −
Angle type ½ k
θ
θ
0
C
i
-C
i
-N
i
50.0 120.0
C
i
-N
i
-C
i
50.0 120.0
N
i
-C
i
-N
i
50.0 120.0
C
i
-C
i
-H
i
50.0 120.0
C
i
-N
i
-H
i
50.0 120.0
N
i
-C
i
-H
i
50.0 120.0
OG-C-O (ψ
3
)
b
50.0 109.5
Torsion angle parameters: ( ) ( )
0
( ) 1 cos k n V
φ φ
φ φ φ = + −
Angle type k
ф
n ф
0
C
i
-C
i
-N
i
-C
i
15.0 2.0 180.0
H
i
-C
i
-N
i
-C
i
15.0 2.0 180.0
C
i
-N
i
-C
i
-N
i
15.0 2.0 180.0
H
i
-N
i
-C
i
-N
i
15.0 2.0 180.0
H
i
-N
i
-C
i
-H
i
15.0 2.0 180.0
Improper torsion angle parameters: ( ) ( )
0
( ) 1 cos k n V
φ φ
φ φ φ = + −
Angle type k
ф
n ф
0
C
i
-C
i
-N
i
-C
i
15.0 2.0 180.0
H
i
-C
i
-N
i
-C
i
15.0 2.0 180.0
C
i
-N
i
-C
i
-N
i
15.0 2.0 180.0
H
i
-N
i
-C
i
-N
i
15.0 2.0 180.0
H
i
-N
i
-C
i
-H
i
15.0 2.0 180.0
Nonbonded parameters (repulsion function)
c
:
r
nb
V Ce
α −
=
Atom type C α
OG-HG 4000 4.0
C-OG (ψ
3
)
b
19999 3.8
C-O (ψ
3
)
b
19999 3.8
Nonbonded parameters (van der Waals)
d
:
( ) ( )
12 6
* * *
2
nb
V r r r r ε
= −
Atom type r
*
ε
*
H
i
-H
i
2.5 0.01
102
Table 4.1 (Continued). Parameters used in the EVB calculations.
a
Nonbonded parameters (van der Waals)
f
:
/ 12 / 6
**
EVB p w EVB p w
nb
V AA r BB r = −
Atom type A
EVB
B
EVB
OG 1000 25.0
a
Energies are in kcal/mol, distances in Å and angles in degrees.
b
(ψ
i
) designates the EVB state (ψ
1,
ψ
2
or ψ
3
) for the specific EVB parameters. The EVB
states are defined in Figure 4.4.
c
Nonbonded interactions, where only the repulsion term is considered for atoms, which are
bonded in one of the VB structures.
d
Nonbonded interactions for atoms, which are never bonded in any of the VB structures.
f
Nonbonded interactions of the EVB atoms with protein or water. A
EVB
and B
EVB
designates
the van der Waals parameters A and B of the EVB atoms respectively. A
p/w
and
B
p/w
designates the van der Waals parameters A and B of the protein or water atoms respectively.
Table 4.2. The EVB calibrated parameters used in the present calculations.
a
EVB mapping parameters
b
12
Ψ →Ψ
c
23
Ψ →Ψ
c
()
0
rr
ij ij
A e H
µ − −
= 12
H = 12.0
23
H = 46.7
ij i j
α αα ∆= −
=
12
99.0 α ∆
23
= -67.0 α ∆
a
The force field parameters used in EVB calculations are listed in the Table 4.1.
b
The EVB mapping parameters are used for the calibration of the corresponding solution
reaction.
ij
H is the off diagonal matrix element and
ij
α ∆ is the gas phase shift.
c
i
Ψ represents the resonance structures used in the EVB calculations (Figure 4.4).
103
Direct Calculations of E
Although the initial focus in this work has been on the contribution of k
cat
to the
enantioselectivity (namely E’), we also explored our ability to evaluate E. Here it is
useful to exploit that fact that the selectivity reflects simply the difference between the TS
binding energies of the R and S enantiomers. Thus we can use the thermodynamic cycle
of Figure 4.5 and obtain:
( ) ( )
RTln E R − =
∆ ∆G
bind
TS
R → S
( )
= ∆G
bind
TS
S
( )
− ∆G
bind
TS
R
( )
= ∆G
TS( p)
R → S
( )
− ∆G
TS(w)
R → S
( )
(4.5)
Figure 4.5. A thermodynamic cycle for mutating R to S. The figure describes
schematically the mutation of CH
3
to H and H to CH
3
. In practice, the mutations were
done by converting both the CH
3
and H to dummy atoms (d) and then converting the
dummy atoms (d) to H and CH
3
.
104
In using this expression it is enough to mutate the R to S in the TS in the protein
and then just to subtract the corresponding results for the mutation in water (which can be
different than zero due to force field artifacts). The approach of mutating TS has already
been used in our early mutational studies,
32
but at that time we mutate the protein, while
here we mutate the substrate. The actual mutation is conveniently done by labeling the
atoms that distinguish R for S (specifically the H and CH
3
) as atoms that can be either
real or dummy atoms (see Figure 4.5). In practice the mutations were done by converting
both H and CH
3
to dummy atoms and then converting the dummy atoms to H and CH
3
.
The FEP mapping procedure involved the use of 31 frames (2 ps each) for moving along
the reaction coordinate from the real TS to the TS with dummy atoms. All the simulations
were done at 300 K with a time step of 1 fs. The TS structure was taken from the
corresponding EVB calculations and was further relaxed for additional 100 ps for each
studied system. The weak residue constraints of 0.5 kcal mol
-1
Å
2
were applied on the
substrate, Asp95, His366 and Asp334
to keep the atoms near the corresponding observed
positions. The simulations averaged over runs from 4 different initial conditions, in order
to obtain reliable results with different initial conditions (obtained from arbitrary points
of the TS relaxation trajectory). In order to avoid the end point catastrophe in mutating to
dummy atoms we found it is convenient to delete the first and last frame, while exploring
the convergence with larger number of frames.
105
Group Contributions: Linear Response Approximation/β (LRA/β)
An integral part of our studies of enzyme design is the ability to evaluate the
electrostatic contributions of different residues to the free energy of the reactant,
transition state and product state. Arguably the most effective ways of obtaining such
contributions is the use of the linear response approximation (LRA) approach.
47
This
method provides not only a good estimate for the free energy associated with the change
between two potential surfaces,
47
but also offers the unique ability to decompose total
free energies to their individual contributions.
22, 87
That is, using the LRA we can express
the free energy associated with changing the electrostatic potential of the system from
U
elect, A
to
U
elect,B
:
∆G(U
elct, A
→ U
elect,B
) =
1
2
U
elect,B
− U
elect, A
A
+ U
elect, A
− U
elect,B
B
( )
(4.6)
where
α
designates a molecular dynamics (MD) average over trajectories obtained
with
U = U
α
. Accordingly, we can write:
∆G
bind ,elect
LRA
=
1
2
[ U
elec,l
p
l
+ U
elec,l
p
l '
] − [ U
elec,l
w
l
− U
elec,l
w
l '
] (4.7)
where
,
p
elec
U is the electrostatic contribution for the interaction between the ligand and its
surroundings, p and w designate protein and water, respectively, and and ' designate
the ligand in its actual charged form and the “non-polar” ligand, where all of the residual
charges are set to 0. In this expression, the term
U
elec,l
− U
elec,l '
, which is required by the
106
LRA treatment, is replaced by
, elec
U since
,'
0
elec
U =
. Now, the above expression can be
used in evaluating the LRA contribution of the i
th
residue of the protein by writing:
∆G
bind ,elect
LRA,i
=
1
2
U
elec,l
p,i
l
+ U
elec,l
p,i
l '
(4.8)
It is important to note that in contrast to the effectiveness of the LRA in studies of the
absolute catalytic effect (which is directly controlled by the electrostatic preorganization)
the actual E and E’ values are frequently determined by indirect electrostatic effect,
where steric interactions prevent one stereoisomer from reaching the same optimal
electrostatic preorganization as the other. Thus it is important to try to capture steric
effect in a simplified way, which is not more expensive than the LRA treatment. Here we
can apply our LRA/β approach,
89
which is a combination of our electrostatic LRA
method and Aqvsit’s LIE steric (nonelectrostatic) term.
85
That is we express the total
binding free energy as
, ,,
[]
LRA p w
bind bind elec vdw vdw
GG U U β ∆ ≅∆ + −
(4.9)
where β is an empirical parameter that scales the vdW component of the protein-ligand
interaction. In our calculations the scaling parameter β equals to 1.0. A careful analysis of
the relationship between the LRA and LIE approaches and the origin of the β parameters
is given in ref.
85
. This analysis shows that β can be evaluated in a deterministic way
provided that one can determine the entropic contribution and preferably, the water
penetration effect microscopically.
107
At any rate, we can write (see also ref.
85
)
∆G
bind
LRA/ β
= 0.5( U
elec,l
p
l
− U
elec,l
w
l
) + 0.5( U
elec,l
p
l '
− U
elec,l
w
l '
) + β( U
vdW ,l
p
l
− U
vdW ,l
w
l
)
(4.10)
Thus we can write
/ /
ln (E(R)) = (1/RT){ ( ) ( ) }
LRA LRA
bind R bind S
G TS G TS
ββ
∆ −∆ (4.11)
=
,, , ,, ,
''
( 0.5 / ){( ) ) ( ) ) }
pp p pp p
elec l elec l vdW l R elec l elec l vdW l S
ll l ll l
RT UU U UU U ββ − ++ − ++
where the water contribution of Eq. 4.10 has been cancelled. Here again we can evaluate
the group contributions to E(R) in analogy with Eq. 4.8.
RESULTS
Calculated Activation Free Energies
Using the EVB parameters calibrated on the reference solution reaction we evaluated
the EVB free energy surface for the reaction in CALA. The resulting free energy surface
(is shown in Figure 4.6) with a calculated activation barrier
, cat calc
G
≠
∆ of 18.3 kcal/mol.
This barrier is in a good agreement with the observed barrier (
∆G
cat,obs
≠
=17.9 kcal/mol)
obtained using transition state theory and the observed
cat
k of the wild type CALA for the
R enantiomer (
cat
k =0.48 s
-1
).
81
An overall the enzyme stabilizes the transition state by
about 5 kcal/mol than water does. The structural elements responsible for the stabilization
of the transition state in enzyme are shown in Figure 4.1 and 4.2.
108
Figure 4.6. The calculated free energy profile of the wild type CALA (R enantiomer).
Calculating the Mutational Effects on E’
One of our main aims of this work is the evaluation of the contribution to
enantioselectivity for k
cat
(E’). However, before exploring our ability to calculate the E’
of CALA and its mutant, it is important to establish the reliability of our approach in
reproducing the observed catalytic effects in the wt and the different mutants. The
performance of the EVB in reproducing the catalytic effect of the wild type has been
demonstrated in the previous section and the performance with the mutants is considered
in Table 4.3 and Figure 4.7. As seen from the table and the figure the calculated
activation free energies are in good agreement with the corresponding observed values.
109
Table 4.3. Calculated and observed ∆g
‡
cat
for the reaction of CALA and its mutants.
a
mutations
b
label
enan-
tiomer
∆g
0
PT
∆g
‡
cat
(NA)
∆g
‡
cat
(total)
∆g
‡
exp
(total)
E’
exp.
c
E’
calc.
c
water 1 R / S 11.9 11.3 23.2 23.3 - -
wild type 2
R 13.1 5.2 18.3 17.9
3.8 (S) 1.7 (S)
S 14.5 3.5 18.0 17.1
F233L / G237Y 3
R 12.1 4.7 16.8 16.9
7.6 (R) 6.4 (R)
S 14.1 3.8 17.9 18.1
T64M/ F149S/ I150D/
F233N/ G237L
4
R 13.1 4.5 17.6 18.6
11.0
(S)
5.4 (S)
S 14.2 2.4 16.6 17.2
T64M/ F149S/ I150D/
Y183F/ F233N/ G237L
5
R 13.8 4.6 18.4 -
no exp. 14.9 (S)
S 14.5 2.3 16.8 -
Y183F 6
R 15.2 3.2 18.4 -
no exp. 6.4 (S)
S 12.5 4.8 17.3 -
G237Y 7
R 11.4 5.1 16.5 -
no exp. 6.4 (S)
S 13.1 2.3 15.4 -
F233N 8
R 13.2 5.2 18.4 -
no exp. 24.8 (S)
S 13.3 3.2 16.5 -
F233G 9
R 10.6 3.9 14.5 -
E
exp.
=
17 (R)
12.6 (R)
S 13.2 2.8 16.0 -
F149Y/ I150N/ F233G 10
R 12.6 2.9 15.5 -
E
exp.
=
104 (R)
68.5 (R)
S 14.3 3.7 18.0 -
F233L 11
R 11.5 5.6 17.1 -
no exp. 3.3 (R)
S 14.0 3.8 17.8 -
a
The calculated ∆g
‡
cat
(in kcal/mol) reflects an average over 10 conformations
obtained from equally spaced points along the relaxation trajectory. ∆g
0
PT
is the free
energy change in proton-transfer step. The standard deviation is reported in
Appendix B.
b
The X-ray structure 2VEO.pdb of the wild type CALA
21
were used as the initial
geometry for the subsequent wt relaxation and EVB calculations. The initial
structures for the different mutations were generated from the wt CALA X-ray
structure, using the PyMOL molecular graphics software,
19
following by 200ps
relaxation run.
c
The E’(fast)-value is given by ((k
cat
)
fast
/(k
cat
)
slow
). The experimental E’-values for
the wt CALA protein and its 3 and 4 mutants are taken from ref.
81
. The
experimental E-values for the 10 and 11 mutants are taken from ref.
20
.
110
Figure 4.7. The correlation between the calculated and observed activation free energies
of the catalytic hydrolysis of the R and S enantiomers of the substrate by the wild type
CALA and its mutants.
After establishing the performance of the EVB in reproducing the general
catalytic effect, we turn our attention to the performance in calculating E’. The calculated
results in terms of the E’ of the system studied are given in Table 4.3 and Figure 4.8. We
consider these results to be encouraging, considering the difficulties in obtaining stable
results. That is, the simulations involved configurations with different degree of
penetration of water molecules to the active site and sometimes in cases where the R
enantiomer occupied the site usually occupied by the S enantiomer.
111
Figure 4.8. The calculated (black bars) and observed (empty bars) E’ values for the wild
type CALA and the 3 and 4 mutants. The plot gives positive and negative values,
respectively, to E’(R) and E’(S).
However, despite this difficulty, which may lead to significant problems with
QM/MM approaches without sampling, the final calculated average free energies show
converging trend with a RMS of less than 2 kcal/mol and with trend that reproduces the
observed trend. Overall the results presented in Figure 4.8 indicate that the EVB
approach allows one to reproduce the trend in the enantioselectivity E’ with proper
systematic sampling.
Calculating the Total Selectivity and Its Changes by Mutational Effects
Although the main focus in this work has been on the contribution of k
cat
to the
enantioselectivity (namely E’), we also explored our ability to evaluate E by using the
thermodynamic cycle of Figure 4.5 and mutating the R to S in the TS in the protein and
then just to subtract the corresponding results for the mutation in water. The approach of
mutating TS has already been used in our early mutational studies,
32
but at that time we
112
mutate the protein, while here we mutate the substrate. The results of our R to S
mutational procedure are summarized in Table 4.4 and Figure 4.9.
Table 4.4. Calculated and observed selectivities obtained using Eq. 4.5.
a
system
∆ ∆G
bind ,calc
TS
R → S
( )
∆ ∆G
bind ,exp
TS
R → S
( )
E
calc.
E
exp.
wt -1.50 (S) -0.83 (S) 12 (S) 4 (S)
mutant 3 1.41 (R) 1.80 (R) 11 (R) 20 (R)
mutant 4 -2.26 (S) -2.00 (S) 43 (S) 28 (S)
mutant 9 2.71 (R) 1.70 (R) 91(R) 17 (R)
a
Energies in kcal/mol. The energy shift ( )
() TS w
R S G → ∆ of Eq. 4.5,
for the specific implementation of the dummy atoms was 0.39 kcal/mol.
Figure 4.9. The calculated (black bars) and observed (empty bars) E values for the wild
type CALA and some mutants. The plot gives positive and negative values, respectively,
to E(R) and E(S).
While the results are not perfect, they are interesting since we are dealing with
probably the first full formally rigorous free energy calculation of mutational effects of
the absolute selectivity, where all the elements of the calculation involve careful
113
sampling of what is basically a reliable semiempirical QM/MM potential. Most notable
we could reproduce the steric effect of the F233G mutation, which will also be discussed
below.
Another effective strategy was obtained by using the PDLD/S–LRA/β
89
to
calculate the binding energy and the using the EVB calculated E’ in order to evaluate the
total E. The results obtained by this approach are given in Table 4.5. Here we exploited
the fact that fully microscopic, yet accurate, calculations of binding free energy have
serious convergence problems (this can also affect the implicit TS binding in our
mutation approach), while using the PDLD/S-LRA/β provides stable binding results (see
ref.
89
).
Table 4.5. Calculated and observed selectivities obtained using the EVB barriers and the
PDLD/S binding energies.
a
system
,
RS TS
EVB calc
G
→
∆∆
( ) RS →
/
,
PDLD S
bind calc
G ∆∆
( ) RS →
bind,exp
G ∆∆
( ) RS →
,
TS
bind calc
G ∆∆
( ) RS →
,
TS
bind exp
G ∆∆
( ) RS →
E
calc.
E
exp.
wt -0.30 0.19 0.05 -0.49 -0.83 2 (S) 4 (S)
mutant 3 1.10 -0.57 -0.57 1.67 1.80 17 (R) 20 (R)
mutant 4 -1.00 0.36 0.56 -1.36 -2.00 10 (S) 28 (S)
mutant 9 1.50 -0.14 no exp. 1.64 1.70 14 (R) 17 (R)
a
Energies in kcal/mol.
114
Exploring Computer-Aided Refinement of Enantioselectivity
While the above sections consider the validation of our ability to reproduce E’
and E for different mutants, we clearly need to be able to predict which mutations would
increases E. Of course one can think on brute force approach of just trying to calculate E
(or E’) for different generated residues in the active site, were we consider the effect of
changing from polar to non-polar residues or from small to large residues. However,
having a general guide would be a much more promising strategy. Thus we explored
below some initial screening approaches.
The first obvious screening strategy is to use the electrostatic group contribution
approach
60, 61
or the more rigorous LRA/β approach of Eq. 4.10. Here we evaluate the
LRA/β group contributions of each residue to the free energy that correspond to E
(Figure 4.10). Some typical results are depicted in Figure 4.10, where prospective
contribution indicates that generating the given residue from a non-polar residue and in
some respect from smaller residue will stabilize the R TS.
115
Figure 4.10. The LRA/β group contributions for transferring the TS from R to S for A)
wild-type CALA and for mutants B) 3, C) 4 and D) 9. The contribution is negative when
the creation of the given residue from Gly stabilized the S TS and positive when the
creation of the given residue stabilizes the R TS. Thus, for example, the finding that the
group contribution for F233 in the wt is negative means that the LRA predicts that
forming the F from G stabilized the S TS form. Now this is consistent with the fact that
the F233G stabilized the R TS.
The figure gives relatively encouraging message. That is, in the WT we see a
negative peak for F233, this means that formation of F stabilizes the S TS and its removal
stabilizes the R TS. This is consistent with the fact that the F233G has E(R). Similarly we
see in mutant 4 a negative peak F233N, which is consistent with the fact that this mutant
is S and with our prediction that the single mutant F233N has E’(S) which is larger than
the G233F effect (although this prediction should be explored experimentally). It should
116
be noted the actual E values are frequently determined by indirect electrostatic effect,
where steric interactions prevent one stereoisomer from reaching the same optimal
electrostatic preorganization as the other. Thus although the LRA/β of Eq. 4.10 reflects
direct steric effects it is not clear how effective this approach can be when the steric
effect is indirect. This issue has to be explored by direct comparison to experimental
studies of single mutations.
Considering the current uncertainties in using the LRA/β approach for selectivity
screening, it may be unavoidable to use massive EVB / FEP analysis (probably with the
approach of Eq. 4.5). The current performance of the brute force approach is reported in
Table 4.6. The point of this table that even with the brute force approach we can screen
significant number of mutations candidates in a few days and make reasonable suggestion
for further experimental studies. Furthermore, we can save time in mutating the enzyme
by using a simplified (coarse grained (CG)) model as a reference state.
58
Using CG model
we can simultaneously mutate one simplified residue to many explicit residues and
explores the effect on E (see the related example in ref.
58
).
117
Table 4.6. The performance of the EVB, the TS binding and PDLD/S binding
calculations.
a
computer time
per mutant
(runs, processor)
No. of mutants per 24 h
per 200 processors
No. of mutants per 24 h
per 1000 processors
cat
g
≠
∆
using EVB
b
16.5 h (1, 1) 29
c
146
c
TS
bind
G ∆
using FEP
8 h (1,1) 150
d
750
d
/ PDLD S
bind
G ∆
2.5 h (1,1) 480
d
2400
d
a
The calculations were conducted on the University of Southern California
HPCC (High Performance Computing and Communication) Linux computer,
using Dual Intel Xeon(64-bit) 3.2 GHz 2GB Memory nodes.
b
The total computational time of the two reaction steps (the proton transfer and
the nucleophilic attack).
c
Average over ten runs per mutant.
d
Average over four runs per mutant.
CONCLUSIONS
Optimizing enzymes to catalyze selective enantioselective reactions has a major
potential in biotechnology, including in the generation of biocatalyst for efficient
synthesis of enantiomerically pure chiral molecules for the production of drugs by the
pharmaceutical industry.
The recent advances in this field have been due in part to directed evolution
experiments with some qualitative insight from theoretical studies. It seems to us that the
present stage it is important to push the capacity of theoretical simulation as useful tool in
designing enantioselective enzymes. This paper addresses the corresponding challenge of
118
providing predictive calculations despite the fact that we are dealing rather small
difference in activation barriers.
We start by demonstrating that the EVB provides a powerful tool for effective
sampling and free energy calculations in the landscape that reproduce a difference
between the activation barrier of the R and S enantiomers in CALA.
We also explored our ability to tackle the challenging task of evaluation the total
enatioselectivity (E) of the wt and different mutants of CALA. The corresponding
calculations involve mutations of the TS of R to S in the active site of CALA and its
mutants. Our study indicates that a converging prediction of enantioselectivity is a major
challenge (since a major sampling is needed to obtain converging results). This does not
mean that simple energy minimization, and even just inspection, cannot give powerful
guide for getting effective mutations. However, approaches that do not involve extensive
sampling are unlikely to give stable results, just because of the fact that starting from
different initial configurations gives different results. Overall, it seems that the most
rigorous approach is the one where we calculate E by mutating the R to S at the TS.
However, it is not yet clear if this approach leads to the fastest convergence.
Our finding that only an extensive sampling can provide correct estimate of the
enantioselectivity has some general implications. That is, it is sometime assumed that it is
sufficient to start with some reasonable binding model and to just evaluate the energy (or
average energy) near the generated structure. However, our finding indicate that the
results (regardless if the interactions are evaluated by QM/MM or classical model)
would be different with different starting pints and thus it is essential to us extensive an
119
consistent conformational sampling. Thus it is at least recommended to examine the
stability of the calculated results to changes in the initial structure.
In addition to demonstrating our ability to obtain reasonable results for the
observed enantioselectivity, we also explored several options for predictive studies. It
appears that a pure LRA/β screening does have some potential but it would require
careful validation. Now, since the EVB is able to capture mutational effects we might be
forced to rely on extensive simulations that will use a CG model as a basis for
perturbation to different possible mutants. However, with the EVB we clearly have a tool
that allows us to reproduce the observe E and to determine its origin (including case
when it is due to entropic effects). This should be useful in analyzing the origin of the
effect of directed evolution.
An interesting strategies for refining enzyme stereoselectivity is provided by
iterative refinement approaches, where one to exploit the knowledge of the effect of
single mutants while assuming some additivity in the effect of several mutants. However,
in many cases we do not expect additivity and this would make a simple predictive
experimental refinement rather challenging. Here we believe that the ability screen
computationally for double mutants can be of significant importance.
120
PERSPECTIVES
Computer-aided enzyme design provides an understanding of the structure-
function relationships of molecular biocatalysts. Until now the most impressive results in
the field of CAED were obtained by the directed evolution experiments via optimization
of activity, stability and enantioselectivity of the proteins. However, the directed
evolution requires an extensive testing, and therefore faster and more efficient strategies
are needed. The improvements of efficiency of directed evolution can be guided by
computational techniques, for example by evaluation of electrostatic contribution of each
amino acid residues to the transition state energy. Therefore, the combination of
computational and experimental techniques will continue to benefit enzyme design field
effectively.
108
However even with impressive advances in computational enzyme design the
actual rate enhancements obtained by the designed systems are quite poor and far from
that of naturally occurring systems.
8
And the activities of artificial enzymes are not much
higher than those of catalytic antibodies developed 15-25 years ago.
6
The challenges in
the computer-aided enzyme design rise either from incorrect modeling of an ideal active
site, or that the desired active site is not realized by the actual design, or that the designed
active site is not supported by the desired protein scaffold.
6, 8
Therefore computational
enzyme design has a long road ahead for achieving native like levels of catalytic
activity.
6
121
Explanation the reasons of the low success rate of computationally designed
proteins would provide an ultimate understanding of enzymology. The rational design
requires a detailed knowledge of specific enzyme structure and catalytic mechanism,
making it very difficult to predict how an enzyme will behave a priori.
8, 90
Therefore the
mechanistic studies of designed enzymes are needed for understanding the catalytic effect
on a molecular level with the following comparison to those natural enzymes.
6, 14
Naturally occurring enzymes (or even catalytic antibodies) can provide the valuable
information about the enzyme action for the specific reaction type, which could serve as a
starting point for the design process. Therefore, one could think about de novo enzyme
design as a selection of a proper protein scaffold as a starting point.
56
In this view, the
computer simulation study of chemical reaction profile of an existing enzyme catalyzing
the same or similar reaction would provide a deep understanding of enzyme action. It
would also give clues about amino acid residues responsible for the transition state
stabilization (or destabilization). Based on this analysis the several mutations can be
proposed and checked in silico. And this approach provides a powerful in silico test for
guiding an experimental work and achieving improvements in the enzyme design.
In a long term, the CAED has great potential for the future development of
efficient catalysts and the successful design of artificial enzymes seems to be achievable.
However, a further development of the structure-based designer programs is needed. In
the most structure-based design methods, the backbone of the protein remains frozen and
does not reflect the reorganization energy which is a key in enzyme catalysis.
106
At
present this problem is addressed via optimization of the designed constructs by directed
122
evolution experiments. However, the future refinement of computational screening
methods should involve the conformational optimization of the protein backbone and side
chains. Another aspect that should be incorporated in CAED is the thermal stability of the
enzyme which is affected by mutations.
56
A dual control of protein fold and activity
should be included as an additional step in de novo design. Therefore the development of
fast and accurate computational screening techniques is crucial for a future of the CAED
to increase the computer accuracy and calculation speed, and provide an ability to bring
the protein design to industrial level.
123
BIBLIOGRAPHY
1. Acevedo, O.; Jorgensen, W. L., Solvent Effects and Mechanism for a
Nucleophilic Aromatic Substitution from QM/MM Simulations. Org. Lett. 2004, 6, 2881-
2884.
2. Acevedo, O., Role of Water in the Multifaceted Catalytic Antibody 4B2 for
Allylic Isomerization and Kemp Elimination Reactions. J. Phys. Chem. B 2009, 113,
15372-15381.
3. Alexandrova, A. N.; Rothlisberger, D.; Baker, D.; Jorgensen, W. L., Catalytic
Mechanism and Performance of Computationally Designed Enzymes for Kemp
Elimination. J. Am. Chem. Soc. 2008, 130, 15907-15915.
4. Alexandrova, A. N.; Jorgensen, W. L., Origin of the Activity Drop with the E50D
Variant of Catalytic Antibody 34E4 for Kemp Elimination. J. Phys. Chem. B 2009, 113,
497-504.
5. Aqvist, J.; Warshel, A., Simulation of Enzyme-Reactions Using Valence-Bond
Force-Fields and Other Hybrid Quantum-Classical Approaches. Chem. Rev. 1993, 93,
2523-2544.
6. Baker, D., An Exciting but Challenging Road Ahead for Computational Enzyme
Design. Protein Sci. 2010, 19, 1817-1819.
7. Barone, V.; Cossi, M., Quantum Calculation of Molecular Energies and Energy
Gradients in Solution by a Conductor Solvent Model. J. Phys. Chem. A 1998, 102, 1995-
2001.
8. Barozzo, A.; Borstnar, R.; Marloie, G.; Kamerlin, S. C. L., Computational Protein
Engineering: Bridging the Gap between Rational Design and Laboratory Evolution. Int.
J. Mol. Sci. 2012, 13, 12428-12460.
9. Bayly, C. I.; Cieplak, P.; Cornell, W. D.; Kollman, P. A., A Well-Behaved
Electrostatic Potential Based Method Using Charge Restraints for Deriving Atomic
Charges: The RESP Model. J. Phys. Chem. 1993, 97, 10269-10280.
124
10. Beck, G., Synthesis of Chiral Drug Substances. Synlett. 2002, 6, 837-850.
11. Bocola, M.; Otte, N.; Jaeger, K. E.; Reetz, M. T.; Thiel, W., Learning from
Directed Evolution: Theoretical Investigations into Cooperative Mutations in Lipase
Enantioselectivity. ChemBioChem. 2004, 5, 214-223.
12. Braun-Sand, S.; Sharma, P. K.; Chu, Z. T.; Pisliakov, A. V.; Warshel, A., The
Energetics of the Primary Proton Transfer in Bacteriorhodopsin Revisited: It Is a
Sequential Light-Induced Charge Separation after All. Biochim. Biophys. Acta 2008,
1777, 441-452.
13. Breuer, M.; Ditrich, K.; Habicher, T.; Hauer, B.; Kesseler, M.; Sturmer, R.;
Zelinski, T., Industrial Methods for the Production of Optically Active Intermediates.
Angew. Chem. 2004, 116, 806-843.
14. Chen, C. Y.; Georgiev, I.; Anderson, A. C.; Donald, B. R., Computational
Structure-Based Redesign of Enzyme Activity. Proc. Natl. Acad. Sci. U. S. A. 2009, 106,
3764-3769.
15. Chu, Z. T.; Villa, J.; Strajbl, M.; Schutz, C. N.; Shurki, A.; Warshel, A. Molaris,
Version 9.06; University of Southern California: Los Angeles, CA, USA, 2006.
16. Creighton, S.; Hwang, J. K.; Warshel, A.; Parson, W. W.; Norris, J., Simulating
the Dynamics of the Primary Charge Separation Process in Bacterial Photosynthesis.
Biochemistry 1988, 27, 774-781.
17. Debler, E. W.; Ito, S.; Seebeck, F. P.; Heine, A.; Hilvert, D.; Wilson, I. A.,
Structural Origins of Efficient Proton Abstraction from Carbon by a Catalytic Antibody.
Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 4984-4989.
18. Debler, E. W.; Muller, R.; Hilvert, D.; Wilson, I. A., An Aspartate and a Water
Molecule Mediate Efficient Acid-Base Catalysis in a Tailored Antibody Pocket. Proc.
Natl. Acad. Sci. U. S. A. 2009, 106, 18539-18544.
19. DeLano, W. L. The Pymol Molecular Graphics, Version 1.2r3pre; DeLano
Scientific: San Carlos, CA, USA, 2002.
125
20. Engstrom, K.; Nyhlen, J.; Sandstrom, A. G.; Backvall, J. E., Directed Evolution
of an Enantioselective Lipase with Broad Substrate Scope for Hydrolysis of Alpha-
Substituted Esters. J. Am. Chem. Soc. 2010, 132, 7038-7042.
21. Ericsson, D. J.; Kasrayan, A.; Johanssonl, P.; Bergfors, T.; Sandstrom, A. G.;
Backvall, J. E.; Mowbray, S. L., X-Ray Structure of Candida Antarctica Lipase a Shows
a Novel Lid Structure and a Likely Mode of Interfacial Activation. J. Mol. Biol. 2008,
376, 109-119.
22. Florian, J.; Goodman, M. F.; Warshel, A., Theoretical Investigation of the
Binding Free Energies and Key Substrate-Recognition Components of the Replication
Fidelity of Human DNA Polymerase Beta. J. Phys. Chem. B 2002, 106, 5739-5753.
23. Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.;
Cheeseman, J. R.; Montgomery, J., J. A.; Vreven, T.; Kudin, K. N.; Burant, J. C.; Millam,
J. M.; Iyengar, S. S.; Tomasi, J.; Barone, V.; Mennucci, B.; Cossi, M.; Scalmani, G.;
Rega, N.; Petersson, G. A.; Nakatsuji, H.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.;
Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Ktao, O.; Nakai, H.; Klene, M.; Li,
X.; Knox, J. E.; Hratchian, H. P.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.;
Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.;
Ochterski, J. W.; Ayala, P. Y.; Morokuma, K.; Voth, G. A.; Salvador, P.; Dannenberg, J.
J.; Zakrzewski, V. G.; Dapprich, S.; Daniels, A. D.; Strain, M. C.; Farkas, O.; Malick, D.
K.; Rabuck, A. D.; Raghavachari, K.; Foresman, J. B.; Ortiz, J. V.; Cui, Q.; Baboul, A.
G.; Clifford, S.; Cioslowski, J.; Stefanov, B. B.; Liu, G.; Liashenko, A.; Piskorz, P.;
Kmaromi, I.; Martin, R. L.; Fox, D. J.; Keith, T.; Al-Laham, M. A.; Peng, C. Y.;
Nanayakkara, A.; Challacombe, M.; Gill, P. M. W.; Johnson, B.; Chen, W.; Wong, M.
W.; Gonzalez, C.; Pople, J. A. Gaussian 03, Revision E.01; Gaussian, Inc.: Wallingford,
CT, USA, 2004.
24. Frushicheva, M. P.; Cao, J.; Chu, Z. T.; Warshel, A., Exploring Challenges in
Rational Enzyme Design by Simulating the Catalysis in Artificial Kemp Eliminase. Proc.
Natl. Acad. Sci. U. S. A. 2010, 107, 16869-16874.
25. Frushicheva, M. P.; Cao, J.; Warshel, A., Challenges and Advances in Validating
Enzyme Design Proposals: The Case of Kemp Eliminase Catalysis. Biochemistry 2011,
50, 3849-3858.
126
26. Frushicheva, M. P.; Warshel, A., Towards Quantitative Computer-Aided Studies
of Enzymatic Enantioselectivity: The Case of Candida Antarctica Lipase A.
ChemBioChem. 2012, 13, 215-223.
27. Grigorenko, B. L.; Nemukhin, A. V.; Topol, I. A.; Cachau, R. E.; Burt, S. K.,
QM/MM Modeling the Ras-Gap Catalyzed Hydrolysis of Guanosine Triphosphate.
Proteins: Struct., Funct., Bioinf. 2005, 60, 495-503.
28. Hollfelder, F.; Kirby, A. J.; Tawfik, D. S.; Kikuchi, K.; Hilvert, D.,
Characterization of Proton-Transfer Catalysis by Serum Albumins. J. Am. Chem. Soc.
2000, 122, 1022-1029.
29. Hollfelder, F.; Kirby, A. J.; Tawfik, D. S., On the Magnitude and Specificity of
Medium Effects in Enzyme-Like Catalysts for Proton Transfer. J. Org. Chem. 2001, 66,
5866-5874.
30. Hopmann, K. H.; Hallberg, B. M.; Himo, F., Catalytic Mechanism of Limonene
Epoxide Hydrolase, a Theoretical Study. J. Am. Chem. Soc. 2005, 127, 14339-14347.
31. Hu, Y.; Houk, K. N.; Kikuchi, K.; Hotta, K.; Hilvert, D., Nonspecific Medium
Effects Versus Specific Group Positioning in the Antibody and Albumin Catalysis of the
Base-Promoted Ring-Opening Reactions of Benzisoxazoles. J. Am. Chem. Soc. 2004,
126, 8197-8205.
32. Hwang, J. K.; Warshel, A., Semiquantitative Calculations of Catalytic Free
Energies in Genetically Modified Enzymes. Biochemistry 1987, 26, 2669-2673.
33. Hwang, J. K.; King, G. S.; Creighton, S.; Warshel, A., Simulation of Free Energy
Relationships and Dynamics of SN2 Reactions in Aqueous Solutions. J. Am. Chem. Soc.
1988, 110, 5297-5311.
34. Ishikita, H.; Warshel, A., Predicting Drug-Resistant Mutations of HIV Protease.
Angew. Chem. 2008, 120, 709-712.
35. Jencks, W. P.; Gilchrist, M., Nucleophilic Reactivity of Alcoholate Anions toward
Rho-Nitrophenyl Acetate. J. Am. Chem. Soc. 1962, 84, 2910-2913.
127
36. Jiang, L.; Althoff, E. A.; Clemente, F. R.; Doyle, L.; Rothlisberger, D.;
Zanghellini, A.; Gallaher, J. L.; Betker, J. L.; Tanka, F.; Barbas, C. F. I.; Hilvert, D.;
Houk, K. N.; Stoddard, B.; Baker, D., De Novo Computational Design of Retro-Aldol
Enzymes. Science 2008, 319, 1387-1391.
37. Kamerlin, S. C.; Williams, N. H.; Warshel, A., Dineopentyl Phosphate
Hydrolysis: Evidence for Stepwise Water Attack. J. Org. Chem. 2008, 73, 6960-6969.
38. Kamerlin, S. C.; Warshel, A., On the Energetics of ATP Hydrolysis in Solution. J.
Phys. Chem. B 2009, 113, 15692-15698.
39. Kamerlin, S. C.; Warshel, A., The EVB as a Quantitative Tool for Formulating
Simulations and Analyzing Biological and Chemical Reactions. Faraday Discuss. 2010,
145, 71-106.
40. Kamerlin, S. C.; Warshel, A., At the Dawn of the 21st Century: Is Dynamics the
Missing Link for Understanding Enzyme Catalysis? Proteins: Struct., Funct., Bioinf.
2010, 78, 1339-1375.
41. Kamerlin, S. C.; Warshel, A., The Empirical Valence Bond Model: Theory and
Applications. WIREs Comput. Mol. Sci. 2011, 1, 30-45.
42. Kemp, D. S.; Casey, M. L., Physical Organic Chemistry of Benzisoxazoles II.
Linearity of the Bronsted Free Energy Relationship for the Base-Catalyzed
Decomposition of Benzisoxazoles. J. Am. Chem. Soc. 1973, 95, 6670-6680.
43. Khersonsky, O.; Rothlisberger, D.; Dym, O.; Albeck, S.; Jackson, C. J.; Baker,
D.; Tawfik, D. S., Evolutionary Optimization of Computationally Designed Enzymes:
Kemp Eliminases of the KE07 Series. J. Mol. Biol. 2010, 396, 1025-1042.
44. Kiss, G.; Rothlisberger, D.; Baker, D.; Houk, K. N., Evaluation and Ranking of
Enzyme Designs. Protein Sci. 2010, 19, 1760-1773.
45. Klamt, A.; Schrmann, G. J., COSMO: A New Approach to Dielectric Screening
in Solvents with Explicit Expressions for the Screening Energy and Its Gradient. J. Chem.
Soc., Perkin Trans. 2 1993, 5, 799 -805.
128
46. Lassila, J. K., Conformational Diversity and Computational Enzyme Design.
Curr. Opin. Chem. Biol. 2010, 14, 676-682.
47. Lee, F. S.; Chu, Z. T.; Bolger, M. B.; Warshel, A., Calculations of Antibody
Antigen Interactions - Microscopic and Semimicroscopic Evaluation of the Free-Energies
of Binding of Phosphorylcholine Analogs to MCPC603. Protein Eng., Des. Sel. 1992, 5,
215-228.
48. Lee, F. S.; Chu, Z. T.; Warshel, A., Microscopic and Semimicroscopic
Calculations of Electrostatic Energies in Proteins by the Polaris and Enzymix Programs.
J. Comput. Chem. 1993, 14, 161-185.
49. Leonard, V.; Fransson, L.; Lamare, S.; Hult, K.; Graber, M., A Water Molecule in
the Stereospecificity Pocket of Candida Antarctica Lipase B Enhances Enantioselectivity
Towards Pentan-2-Ol. ChemBioChem. 2007, 8, 662-667.
50. Liu, H.; Warshel, A., The Catalytic Effect of Dihydrofolate Reductase and Its
Mutants Is Determined by Reorganization Energies. Biochemistry 2007, 46, 6011-6025.
51. Magnusson, A. O.; Rotticci-Mulder, J. C.; Santagostino, A.; Hult, K., Creating
Space for Large Secondary Alcohols by Rational Redesign of Candida Antarctica Lipase
B. ChemBioChem. 2005, 6, 1051-1056.
52. Magnusson, A. O.; Takwa, M.; Harnberg, A.; Hult, K., An S-Selective Lipase
Was Created by Rational Redesign and the Enantioselectivity Increased with
Temperature. Angew. Chem. 2005, 117, 4658-4661.
53. Marcus, R. A., Chemical and Electrochemical Electron Transfer Theory. Ann.
Rev. Phys. Chem. 1964, 15, 155-196.
54. Marti, S.; Andres, J.; Moliner, V.; Silla, E.; Tunon, I.; Bertran, J., Computer-
Aided Rational Design of Catalytic Antibodies: The 1F7 Case. Angew. Chem., Int. Ed.
2007, 46, 286-290.
55. Marti, S.; Andres, J.; Moliner, V.; Silla, E.; Tunon, I.; Bertran, J., Predicting an
Improvement of Secondary Catalytic Activity of Promiscuos Isochorismate Pyruvate
Lyase by Computational Design. J. Am. Chem. Soc. 2008, 130, 2894-2895.
129
56. Marti, S.; Andres, J.; Moliner, V.; Silla, E.; Tunon, I.; Bertran, J., Computational
Design of Biological Catalysts. Chem. Soc. Rev. 2008, 37, 2634-2643.
57. Martinelle, M.; Hult, K., Kinetics of Acyl Transfer-Reactions in Organic Media
Catalyzed by Candida Antarctica Lipase B. Biochim. Biophys. Acta, Protein Struct. Mol.
Enz. 1995, 1251, 191-197.
58. Messer, B. M.; Roca, M.; Chu, Z. T.; Vicatos, S.; Kilshtain, A. V.; Warshel, A.,
Multiscale Simulations of Protein Landscapes: Using Coarse-Grained Models as
Reference Potentials to Full Explicit Models. Proteins: Struct., Funct., Bioinf. 2010, 78,
1212-1227.
59. Morris, G. M.; Goodsell, D. S.; Halliday, R. S.; Huey, R.; Hart, W. E.; Belew, R.
K.; Olson, A. J., Automated Docking Using a Lamarckian Genetic Algorithm and an
Empirical Binding Free Energy Function. J. Comput. Chem. 1998, 19, 1639-1662.
60. Muegge, I.; Tao, H.; Warshel, A., A Fast Estimate of Electrostatic Group
Contributions to the Free Energy of Protein-Inhibitor Binding. Protein Eng. 1997, 10,
1363-1372.
61. Muegge, I.; Schweins, T.; Warshel, A., Electrostatic Contributions to Protein-
Protein Binding Affinities: Application to Rap/Raf Interaction. Proteins: Struct., Funct.,
Bioinf. 1998, 30, 407-423.
62. Mulholland, A. J., Computational Enzymology: Modelling the Mechanisms of
Biological Catalysts. Biochem. Soc. Trans. 2008, 36, 22-26.
63. Na, J.; Houk, K. N.; Hilvert, D., Transition State of the Base-Promoted Ring-
Opening of Isoxazoles. Theoretical Prediction of Catalytic Functionalities and Design of
Haptens for Antibody Production. J. Am. Chem. Soc. 1996, 118, 6462–6471.
64. Nagel, Z. D.; Klinman, J. P., A 21st Century Revisionist's View at a Turning
Point in Enzymology. Nat. Chem. Biol. 2009, 5, 543-550.
65. Nanda, V., Do-It-Yourself Enzymes. Nat. Chem. Biol. 2008, 4, 273-275.
130
66. Nyhlen, J.; Martin-Matute, B.; Sandstrom, A. G.; Bocola, M.; Backvall, J. E.,
Influence of Delta-Functional Groups on the Enantiorecognition of Secondary Alcohols
by Candida Antarctica Lipase B. ChemBioChem. 2008, 9, 1968-1974.
67. Olsson, M. H.; Warshel, A., Solute Solvent Dynamics and Energetics in Enzyme
Catalysis: The S(N)2 Reaction of Dehalogenase as a General Benchmark. J. Am. Chem.
Soc. 2004, 126, 15167-15179.
68. Orrenius, C.; Haeffner, F.; Rotticci, D.; Ohrner, N.; Norin, T.; Hult, K., Chiral
Recognition of Alcohol Enantiomers in Acyl Transfer Reactions Catalysed by Candida
Antarctica Lipase B. Biocatal. Biotransform. 1998, 16, 1-15.
69. Ottosson, J.; Fransson, L.; Hult, K., Substrate Entropy in Enzyme
Enantioselectivity: An Experimental and Molecular Modeling Study of a Lipase. Protein
Sci. 2002, 11, 1462-1471.
70. Prasad, S.; Bocola, M.; Reetz, M. T., Revisiting the Lipase from Pseudomonas
Aeruginosa: Directed Evolution of Substrate Acceptance and Enantioselectivity Using
Iterative Saturation Mutagenesis. ChemPhysChem. 2011, 12, 1550-1557.
71. Privett, H. K. An Iterative Approach to De Novo Computational Enzyme Design
and the Successful Application to the Kemp Elimination. Ph.D. Dissertation, California
Institute of Technology, Los Angeles, CA, USA, 2009.
72. Prokop, Z.; Sato, Y.; Brezovsky, J.; Mozga, T.; Chaloupkova, R.; Koudelakova,
T.; Jerabek, P.; Stepankova, V.; Natsume, R.; van Leeuwen, J. G.; Janssen, D. B.;
Florian, J.; Nagata, Y.; Senda, T.; Damborsky, J., Enantioselectivity of Haloalkane
Dehalogenases and Its Modulation by Surface Loop Engineering. Angew. Chem. 2010,
122, 6247-6251.
73. Ramesh, N. P., Synthesis of Chiral Pharmaceutical Intermediates by Biocatalysis.
Coord. Chem. Rev. 2008, 252, 659-701.
74. Raza, S.; Fransson, L.; Hult, K., Enantioselectivity in Candida Antarctica Lipase
B: A Molecular Dynamics Study. Protein Sci. 2001, 10, 329-338.
131
75. Reetz, M. T.; Prasad, S.; Carballeira, J. D.; Gumulya, Y.; Bocola, M., Iterative
Saturation Mutagenesis Accelerates Laboratory Evolution of Enzyme Stereoselectivity:
Rigorous Comparison with Traditional Methods. J. Am. Chem. Soc. 2010, 132, 9144-
9152.
76. Roca, M.; Vardi-Kilshtain, A.; Warshel, A., Toward Accurate Screening in
Computer-Aided Enzyme Design. Biochemistry 2009, 48, 3046-3056.
77. Rogers, G. A.; Bruice, T. C., Synthesis and Evaluation of a Model for the So-
Called "Charge-Relay" System of the Serine Esterases. J. Am. Chem. Soc. 1974, 96,
2473-2481.
78. Rosta, E.; Klahn, M.; Warshel, A., Towards Accurate Ab Initio QM/MM
Calculations of Free-Energy Profiles of Enzymatic Reactions. J. Phys. Chem. B 2006,
110, 2934-2941.
79. Rothlisberger, D.; Khersonsky, O.; Wollacott, A. M.; Jiang, J.; DeChancie, J.;
Betker, J.; Gallaher, J. L.; Althoff, E. A.; Zanghellini, A.; Dym, O.; Albeck, S.; Houk, K.
N.; Tawfik, D. S.; D., B., Kemp Elimination Catalysts by Computational Enzyme Design.
Nature 2008, 453, 190-195.
80. Rotticci, D.; Haeffner, F.; Orrenius, C.; Norin, T.; Hult, K., Molecular
Recognition of Sec-Alcohol Enantiomers by Candida Antarctica Lipase B. J. Mol. Catal.
B: Enzym. 1998, 5, 267-272.
81. Sandstrom, A. G.; Engstrom, K.; Nyhlen, J.; Kasrayan, A.; Backvall, J. E.,
Directed Evolution of Candida Antarctica Lipase a Using an Episomaly Replicating
Yeast Plasmid Dagger. Protein Eng., Des. Sel. 2009, 22, 413-420.
82. Sandstrom, A. G. Protein Engineering of Candida Antarctica Lipase A. Ph.D.
Dissertation, Stockholm University, Sweden, 2010.
83. Schmid, A.; Dordick, J. S.; Hauer, B.; Kiener, A.; Wubbolts, M.; Witholt, B.,
Industrial Biocatalysis Today and Tomorrow. Nature 2001, 409, 258-268.
84. Schowen, R. L., Transition States in Biological Processes. Plenum: New York,
1978.
132
85. Sham, Y. Y.; Chu, Z. T.; Warshel, A., Consistent Calculations of Pk(a)'S of
Ionizable Residues in Proteins: Semi-Microscopic and Microscopic Approaches. J. Phys.
Chem. B 1997, 101, 4458-4472.
86. Sharma, P. K.; Chu, Z. T.; Olsson, M. H.; Warshel, A., A New Paradigm for
Electrostatic Catalysis of Radical Reactions in Vitamin B12 Enzymes. Proc. Natl. Acad.
Sci. U. S. A. 2007, 104, 9661-9666.
87. Shurki, A.; Warshel, A., Why Does the Ras Switch "Break" By Oncogenic
Mutations? Proteins: Struct., Funct., Bioinf. 2004, 55, 1-10.
88. Siegel, J. B.; Zanghellini, A.; Lovick, H. M.; Kiss, G.; Lambert, A. R.; St Clair, J.
L.; Gallaher, J. L.; Hilvert, D.; Gelb, M. H.; Stoddard, B. L.; Houk, K. N.; Michael, F. E.;
Baker, D., Computational Design of an Enzyme Catalyst for a Stereoselective
Bimolecular Diels-Alder Reaction. Science 2010, 329, 309-313.
89. Singh, N.; Warshel, A., Absolute Binding Free Energy Calculations: On the
Accuracy of Computational Scoring of Protein-Ligand Interactions. Proteins: Struct.,
Funct., Bioinf. 2010, 78, 1705-1723.
90. Sterner, R.; Merkl, R.; Raushel, F. M., Computational Design of Enzymes. Chem.
Biol. 2008, 15, 421-423.
91. Štrajbl, M.; Florián, J.; Warshel, A., Ab Initio Evaluation of the Potential Surface
for General Base-Catalyzed Methanolysis of Formamide: A Reference Solution Reaction
for Studies of Serine Proteases. J. Am. Chem. Soc. 2000, 122, 5354–5366.
92. Sucato, C. A.; Upton, T. G.; Kashemirov, B. A.; Batra, V. K.; Martinek, V.;
Xiang, Y.; Beard, W. A.; Pedersen, L. C.; Wilson, S. H.; McKenna, C. E.; Florian, J.;
Warshel, A.; Goodman, M. F., Modifying the Beta,Gamma Leaving-Group Bridging
Oxygen Alters Nucleotide Incorporation Efficiency, Fidelity, and the Catalytic
Mechanism of DNA Polymerase Beta. Biochemistry 2007, 46, 461-471.
93. Svedendahl, M.; Carlqvist, P.; Branneby, C.; Allner, O.; Frise, A.; Hult, K.;
Berglund, P.; Brinck, T., Direct Epoxidation in Candida Antarctica Lipase B Studied by
Experiment and Theory. ChemBioChem. 2008, 9, 2443-2451.
133
94. Thorn, S. N.; Daniels, R. G.; Auditor, M. T.; Hilvert, D., Large Rate
Accelerations in Antibody Catalysis by Strategic Use of Haptenic Charge. Nature 1995,
373, 228-230.
95. Toscano, M. D.; Woycechowsky, K. J.; Hilvert, D., Minimalist Active-Site
Redesign: Teaching Old Enzymes New Tricks. Angew. Chem., Int. Ed. 2007, 46, 4468-
4470.
96. Vicatos, S.; Roca, M.; Warshel, A., Effective Approach for Calculations of
Absolute Stability of Proteins Using Focused Dielectric Constants. Proteins: Struct.,
Funct., Bioinf. 2009, 77, 670-684.
97. Villa, J.; Warshel, A., Energetics and Dynamics of Enzymatic Reactions. J. Phys.
Chem. B 2001, 105, 7887-7907.
98. Warshel, A.; Russell, S., Theoretical Correlation of Structure and Energetics in
the Catalytic Reaction of Trypsin. J. Am. Chem. Soc. 1986, 108, 6569-6579.
99. Warshel, A.; Sussman, F., Toward Computer-Aided Site-Directed Mutagenesis of
Enzymes. Proc. Natl. Acad. Sci. U. S. A. 1986, 83, 3806-3810.
100. Warshel, A., What About Protein Polarity? Nature 1987, 330, 15-16.
101. Warshel, A.; Aqvist, J.; Creighton, S., Enzymes Work by Solvation Substitution
Rather Than by Desolvation. Proc. Natl. Acad. Sci. U. S. A. 1989, 86, 5820-5824.
102. Warshel, A.; Narayszabo, G.; Sussman, F.; Hwang, J. K., How Do Serine
Proteases Really Work. Biochemistry 1989, 28, 3629-3637.
103. Warshel, A., Computer Modeling of Chemical Reactions in Enzymes and
Solutions. Wiley Interscience: New York, 1991.
104. Warshel, A.; Strajbl, M.; Villa, J.; Florian, J., Remarkable Rate Enhancement of
Orotidine 5'-Monophosphate Decarboxylase Is Due to Transition-State Stabilization
Rather Than to Ground-State Destabilization. Biochemistry 2000, 39, 14728-14738.
134
105. Warshel, A., Inverting the Selectivity of Aquaporin 6: Gating Versus Direct
Electrostatic Interaction. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 1813-1814.
106. Warshel, A.; Sharma, P. K.; Kato, M.; Xiang, Y.; Liu, H.; Olsson, M. H.,
Electrostatic Basis for Enzyme Catalysis. Chem. Rev. 2006, 106, 3210-3235.
107. Weiss, R. M.; Warshel, A., An Empirical Valence Bond Approach for Comparing
Reactions in Solutions and in Enzymes. J. Am. Chem. Soc. 1980, 102, 6218-6226.
108. Woycechowsky, K. J.; Vamvaca, K.; Hilvert, D., Novel Enzymes through Design
and Evolution. Adv. Enzymol. Relat. Areas Mol. Biol. 2007, 75, 241-294, xiii.
109. Zanghellini, A.; Jiang, L.; Wollacott, A. M.; Cheng, G.; Meiler, J.; Althoff, E. A.;
Rothlisberger, D.; Baker, D., New Algorithms and an in Silico Benchmark for
Computational Enzyme Design. Protein Sci. 2006, 15, 2785-2794.
110. Zhu, L.; Yang, F.; Chen, L.; Meehan, E. J.; Huang, M., A New Drug Binding
Subsite on Human Serum Albumin and Drug-Drug Interaction Studied by X-Ray
Crystallography. J. Struct. Biol. 2008, 162, 40-49.
135
APPENDIX A:
OVERLAP OF THE EVB AND AB INITIO TS STRUCTURES IN SOLUTION
FOR KEMP ELIMINATION REACTION
A comparison between the TS structures of the ab initio (shown in blue color) and EVB
(shown in red color) surfaces for the kemp elimination reaction in solution.
136
APPENDIX B:
THE STANDARD DEVIATION FOR CALCULATED ∆g
‡
cat
FOR THE REACTION OF CALA AND ITS MUTANTS
system label
b
Standard Deviation, kcal/mol
a
R enantiomer S enantiomer
2 1.3 1.5
3 0.6 1.4
4 1.0 1.1
5 1.0 0.6
6 0.9 1.3
7 0.6 0.7
8 0.7 0.7
9 0.9 0.7
10 0.9 0.5
11 0.8 1.0
a
The standard deviation (in kcal/mol) is over 10 conformations.
b
Systems are defined in the Table 4.3 of Chapter 4.
Abstract (if available)
Abstract
A fundamental challenge in biotechnology and in biochemistry is an ability to design effective enzymes. Despite a gradual progress on this front, most of the advances have been made by placing the reacting fragments in the proper places rather than by optimizing the environment preorganization, which is the key factor in enzyme catalysis. Improving the preorganization and assessing the effectiveness of different design options requires the ability to calculate the actual catalytic effect. Current work explores the computer-aided refinement of catalytic activities (Kemp Eliminase artificial constructs and catalytic antibodies) and enantioselectivities (stereoselective Candida antarctica Lipase A enzymes) using the empirical valence bond as the main screening tool. The observed absolute catalytic effect and the effect of directed evolution experiments are reproduced and analyzed. The following refinement procedure located the mutation sites with increased catalytic activities and selectivities which have been tested experimentally. Those predictions provide a broad avenue for the design of novel catalysts. With this in mind, our approach provides a quantitative understanding of the catalytic effect of the various enzyme constructs and the results of directed evolution experiments. Furthermore, our technique provides the ability to locate the effective mutations that can increase k
cat
by refining the electrostatic preorganization of the protein environment, which allows us to refine and develop more efficient catalysts.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
In silico enzyme modeling: A study of entropic contributions to enzyme catalysis and the design of artificial kemp eliminase
PDF
Capturing the complexity of ion channels: simulations of long-time dynamics, conformational changes and the effect of the membrane potential
PDF
Understanding electrostatic effects in the function of biological systems
PDF
Two state free energy barrier evaluation technique for dehalogenation reaction
PDF
Ab initio methodologies in studying enzymatic reactions
PDF
Non-linear surface spectroscopy of photoswitches annd photovoltaics
PDF
Advancing computational methods for free energy calculations in proteins and the applications for studies of catalysis in nucleotide-associated proteins
PDF
Molecular dynamics study of energetic materials
PDF
Sustainable continuous flow syntheses of colloidal inorganic nanoparticle catalysts
PDF
Investigations in cooperative catalysis: synthesis, reactivity and metal-ligand bonding
PDF
The role of ATP in the regulation of Escherichia coli DNA polymerase V activity
PDF
The crystal structure of APOBEC-2 and implications for APOBEC enzymes
PDF
Artificial photosynthesis on titanium oxide passivated III-V semiconductors
PDF
Genome-wide studies reveal the function and evolution of DNA shape
PDF
Responsible artificial intelligence for a complex world
PDF
Design, synthesis, and investigation of a bio inspired CO₂ reduction catalyst
PDF
Deciphering protein-nucleic acid interactions with artificial intelligence
PDF
Design of durable and efficient catalysts for the electro-oxidation of methanol
PDF
Towards robust dynamical decoupling and high fidelity adiabatic quantum computation
PDF
Understanding the mechanism of oxygen reduction and oxygen evolution on transition metal oxide electrocatalysts and applications in iron-air rechargeable battery
Asset Metadata
Creator
Frushicheva, Maria P. (author)
Core Title
Quantitative computer-aided studies of artificial and enantioselective enzymes
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Chemistry
Publication Date
11/19/2012
Defense Date
10/24/2012
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
artificial enzymes,catalytic reactions,computational chemistry,computer-aided enzyme design,enantioselectivity,enzymology,OAI-PMH Harvest
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Warshel, Arieh (
committee chair
), Bradforth, Stephen E. (
committee member
), Nakano, Aiichiro (
committee member
)
Creator Email
frushich@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-115440
Unique identifier
UC11290070
Identifier
usctheses-c3-115440 (legacy record id)
Legacy Identifier
etd-Frushichev-1307-0.pdf
Dmrecord
115440
Document Type
Dissertation
Rights
Frushicheva, Maria P.
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Tags
artificial enzymes
catalytic reactions
computational chemistry
computer-aided enzyme design
enantioselectivity
enzymology