Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Advancing ab initio QM/MM free energy calculations: refining, validating and quantifying the reference potential approach
(USC Thesis Other)
Advancing ab initio QM/MM free energy calculations: refining, validating and quantifying the reference potential approach
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
Advancing ab initio QM/MM Free Energy Calculations: Refining, Validating and Quantifying the Reference Potential Approach by Nikolay V. Plotnikov A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (CHEMISTRY) May 2013 Copyright 2013 Nikolay V. Plotnikov ii Acknowledgements During my PhD program, I was fortunate to have support from many people who would often go the extra mile to help me. I am very grateful to my scientific adviser, Professor Arieh Warshel, for the opportunity of working in his group, his time and energy invested into my education and professional development, and for his guidance and support in getting this work done. I would like to acknowledge my colleagues at Dr Warshel’s research group for their help and stimulating discussions, especially: Dr Zhen T. Chu (implementation of methods in MOLARIS- XG), Dr Shina K. L. Kamerlin (Paradynamics, Chapter 2) and Dr B. Ram Prasad (phosphate hydrolysis, Chapter 4). I am grateful to Professor Anna Krylov for her participation in my PhD program and for her help with QM methods, QChem and with the red fluorescent protein (Chapter 5). I am also thankful to Kirill Khistyaev for his help with parallel QChem and to Dr James Stewart for providing access to MOPAC2009 code. I gratefully acknowledge the University of Southern California’s High Performance Computing and Communications Center for computer time, prof. Yuk Sham and Minnesota supercomputing institute for providing access to computational resources. This work also used the Extreme Science and Engineering Discovery Environment (XSEDE) – in particular Gordon at San Diego Supercomputing Center. I would like to thank all USC professors who contributed into my education during the PhD program, especially: Stephen Bradforth (particularly for his guidance in the 1 st year of my program), Jack Feinberg and Aiichiro Nakano. I am grateful to Prof. Ian Haworth and to Prof. Alex Benderskii for serving on my qualifying and dissertation committees. I would like to thank administrative manager of USC College of Letters, Arts & Sciences Michele Dea for her outstanding work. I am grateful to all those who offered me a hand of help and support. In particular, I have to acknowledge Nikolay Bogdanov, Igor Fedorov, Lena Neo and Peter Witrak, and most of all, my dear parents in Karelia, Russia to whom I am forever indebted. iii Table of Contents 1 Chapter 1. ParaDynamics: An Effective and Reliable Model for Ab Initio QM/MM Free Energy Calculations ................................................................................................................................................... 1 1.1 Introduction .................................................................................................................. 1 1.2 Background ................................................................................................................... 2 1.3 Paradynamics Approach ............................................................................................... 4 1.3.1 General Outline ..................................................................................................................... 4 1.3.2 Empirical Valance Bond Method ........................................................................................... 5 1.3.3 Evaluating the Correction for Moving from the EVB to the ab initio Potential .................... 9 1.3.4 Performing Iterative PD Fitting to an ab initio Potential .................................................... 10 1.4 Results and Discussion ............................................................................................... 12 1.4.1 Performing the QM Calculations ......................................................................................... 13 1.4.2 Performing an Initial EVB Study Using the Default Parameters ......................................... 13 1.4.3 Performing the PD Refinement ........................................................................................... 16 1.4.4 Efficiency Benchmarks ........................................................................................................ 19 1.4.5 Choosing an Appropriate Reaction Coordinate .................................................................. 21 1.5 Comparison between the PD and the MTD Approaches ............................................ 22 1.6 Quantifying the Paradynamics Refinement ................................................................ 24 1.7 Concluding Remarks .................................................................................................. 25 2 Chapter 2. Exploring, refining and validating the Paradynamics model ............................................. 27 2.1 Introduction ................................................................................................................ 27 2.2 Methods and Results .................................................................................................. 31 2.2.1 Paradynamics approach with an arbitrary reference potential .......................................... 31 2.2.2 Validating the LRA by a full FEP treatment ......................................................................... 34 iv 2.2.3 Semi‐empirical potential as a model of a QM(ai) potential ................................................ 45 2.2.4 PD refinement of the reference potential by modifying its functional form ...................... 47 2.2.5 Extension to the condensed phase using the EVB‐type solvent potential ......................... 54 2.3 Concluding discussion ................................................................................................ 57 3 Chapter 3. Methods for Calculating QM(ai)/MM Free Energy Surfaces ............................................. 59 3.1 Introduction ................................................................................................................ 59 3.2 Methods ...................................................................................................................... 59 3.2.1 Calculating free energy perturbation .................................................................................. 59 3.2.2 Calculating free energy functions with the umbrella sampling approach .......................... 63 3.2.3 WHAM approach ................................................................................................................. 66 3.3 Examples .................................................................................................................... 67 3.3.1 Free energy perturbation in the Paradynamics approach .................................................. 67 3.3.2 Obtaining catalytic effect using free energy calculations ................................................... 70 3.3.3 Other reference potential approach formulation ............................................................... 72 3.3.4 Multidimensional QM/MM free energy surfaces ............................................................... 73 3.4 Conclusions ................................................................................................................ 74 4 Chapter 4. Quantifying the Mechanism of Phosphate Monoester Hydrolysis in Aqueous Solution by Evaluating the Relevant ab initio QM/MM Free Energy Surfaces .............................................................. 75 4.1 Introduction ................................................................................................................ 75 4.2 Methods ...................................................................................................................... 77 4.3 Results ........................................................................................................................ 78 4.3.1 PM3 QM/MM results .......................................................................................................... 78 4.3.2 QM(ai)/MM results ............................................................................................................. 84 4.3.3 Nuclear Quantum mechanical and other corrections ........................................................ 92 4.3.4 Emerging Picture ................................................................................................................. 93 v 4.4 Concluding Discussion ............................................................................................... 93 5 Chapter 5. Prospective of the Reference Potential Approach........................................................... 95 5.1 Other applications of the reference potential approach strategy ................................ 95 5.1.1 Calculation of absorption and emission spectra ................................................................. 95 5.1.2 Methods .............................................................................................................................. 96 5.1.3 Propagating MD trajectories on the QCFF‐PI potential ...................................................... 97 5.2 Concluding remarks .................................................................................................... 99 6 References ........................................................................................................................................ 101 7 Appendices ........................................................................................................................................ 111 7.1 Appendix 1. Starting and refined EVB parameters in chapter 1. ............................. 111 7.2 Appendix 2. Simulation protocols used in chapter 2. ............................................... 114 vi List of Figures Figure 1‐1 A conceptual comparison of meta‐ and paradynamics. (A) The MTD approach iteratively adds Gaussian potentials, V 1 ’, V 2 ’, V 3 ’…V N ’, to the real potential of the system, V 0 . At the final iteration, the addition of the Gaussian potential V N ’ to the real potential creates a flat potential, V 0 +V N ’, which allows for sampling of the TS region. (B) In the PD approach, the reference potential (in this case an EVB potential), V 1 ’, is iteratively and systematically refined (V 2 ’, V 3 ’...V N ’) to reproduce the real potential, V 0 . The iterative refinement of the EVB reference potential, V n ’, is aimed at reaching a situation where V 0 and V n ’ almost coincide. Once this is achieved, the computationally expensive sampling on the real potential can be substituted by sampling the reference potential instead (e.g. by EVB FEP/US), plus a minor LRA correction. ................................................................................................................................... 4 Figure 1‐2 The thermodynamic cycle used in calculating the QM(ai) activation free energy. The QM activation free energy (vertical dashed arrow) is calculated as the sum of the EVB barrier (vertical solid arrow) evaluated by EVB FEP/US method, and the perturbations between the QM and EVB surfaces at the RS and TS (horizontal solid arrows), which is evaluated using the LRA approach. ................................ 9 Figure 1‐3 The S N 2 reaction between chloride ion and methyl chloride. This reaction was taken as the test system in the present work. The subscripts A and B are used to distinguish between the two chlorine atoms. ........................................................................................................................................... 12 Figure 1‐4 Describing the EVB resonance structures. The figure illustrates the nature of the EVB bonding along the reaction coordinate (e.g. the RS and TS), where, for each geometry, we use the two relevant diabatic states, with the indicated bonding pattern. ................................................................................. 14 Figure 1‐5 The initial EVB free energy surface. The figure depicts the free energy surface obtained for our SN2 reaction in the gas phase, prior to the refinement of any EVB parameters. The corresponding EVB parameters are given in Table 1. The red arrow represents the EVB free energy barrier, the blue arrow shows the QM barrier obtained with the LRA correction but without PD refinement, the cyan arrow designates the real QM barrier calculated in MOPAC2009 using the harmonic approximation for PM3. 14 Figure 1‐6 Fluctuations of the energy gap (E QM ‐E EVB ) for trajectories run on the initial EVB and PM3 potentials. The energy fluctuations (blue dots) were obtained during MD runs on: (A) the original EVB potential at the RS; (B) the PM3 potential at the RS; (C) the EVB potential with constraints (see the main text) at the TS, and (D) the PM3 potential with constraints at the TS. The averages over these vii fluctuations (dashed orange lines) are used to obtain the LRA estimates of the free energy of transfer from the EVB to the PM3 surface ( G EVB QM ) at the RS (A, B) and TS (C, D), according to Eq. (1.28). .... 15 Figure 1‐7 Illustrating the effect of refining the EVB parameters by the PD approach. The blue line shows the energy difference between the EVB and PM3 potentials before the parameter refinement, while the red line is the result after a series of refinement iterations done for: (A) just the structures generated at the RS, where E EVB has no contribution from H 12 (E EVB (RS) = E 1 ) and the only parameters refined are those of EVB diabatic states; and (B) both geometries lying near the RS, as well as those lying near the TS. Here the refinement is done for H 12 , and the Cl‐C‐Cl angle deformation energy parameters, while other parameters are taken from the earlier iteration round. ................................................................... 16 Figure 1‐8 The EVB free energy surface obtained with the refined parameters. This figure depicts the free energy surface obtained for our S N 2 reaction in the gas phase, after the refinement of the EVB parameters. The red arrow represents the EVB free energy barrier, the blue arrow shows the barrier obtained through the LRA‐correction after PD refinement, the cyan arrow designates the real QM barrier calculated in MOPAC2009 using the harmonic approximation for PM3. ....................................... 18 Figure 1‐9 Fluctuations of the energy gap, E QM ‐E EVB , for trajectories run on the refined EVB and PM3 potentials. The relevant energy fluctuations (dotted lines) were obtained during MD runs on the refined EVB (red) and PM3 (blue) potentials at the RS (A) and TS (B). The average values are shown as dashed lines of the matching colors. ....................................................................................................................... 18 Figure 1‐10 A comparison of the simulation times required for MTD and PD, at different levels of QM theory.(A) Time per QM call (in seconds) for different methods, for calculations of only the single point energy (SPE, black) as well as calculations that calculate both the single point energy and the energy derivatives (Force, red). The calculations of the energy and its derivatives were performed for the studied system using G03 on a single core of a Quad‐Core AMD Opteron2356 CPU with 4GB available memory. (B) The estimated total number of CPU‐hours required for a PD study of the test system presented in this work (black), comprising a total of 12000 QM calls, and for a corresponding MTD study of the same system (red), which would comprise 2∙10 6 QM calls, based on Ref. 38 . Note that the simulation time for method 11 (60667 CPU‐hours) is out of scale. In both cases, the X‐axis represents the relevant QM approach. The QM approaches used for the benchmark are: (1) PM3, (2) RHF/6‐31G(d), (3) RMP2‐FC/6‐31G(d), (4) RB3LYP/6‐31+G(d), (5) RmPW1PW91/6‐31+G(d), (6) RMP2‐FC/6‐31+G(d), (7) RCISD‐FC/6‐31G(d), (8) RCISD‐FC/6‐31+G(d), (9) RMP2‐FC/6‐311+G(d,p), (10) RCCSD‐FC/6‐31G(d) and (11) RCCSD‐FC/6‐31+G(d). .......................................................................................................................... 20 viii Figure 1‐11 Comparison of the free energy surfaces obtained using different approaches for the S N 2 reaction studied in this work. Shown here is the free energy surface in the reaction coordinate defined by r(C‐Cl A )‐r(C‐Cl B ) obtained by: EVB FEP/US after the PD refinement (blue), the MTD approach of Ref. 38 (red) as well as the free energy values of the PM3 optimized reactant, transition and product states (cyan), obtained using the harmonic approximation. ................................................................................ 23 Figure 1‐12 2‐D Free Energy Surface constructed by EVB FEP/US after the PD refinement. The surface is defined in terms of the distances between the central carbon atom and the leaving and attacking nucleophiles respectively (i.e. r(C‐Cl A ), r(C‐Cl B )). ........................................................................................ 23 Figure 1‐13 Steepest descent minimization of the least‐square function of Eq. (1.29). (LEFT) The steepest descent least‐square function minimized with respect to a set of EVB parameters by the optimal steepest descent algorithm implemented in the development version of MOLARIS, here it’s is comprised of the energy derivatives. The geometries for the refinement are taken along the reaction path on the EVB mapping potential for DCE + AcO(‐) (RIGHT) Difference between the energy derivatives before (RED) and after (GREEN) refinement for DCE + AcO(‐) system for the geometries generated near the reactants state and near the transition state on the wB97X‐D potential. ................................................................. 24 Figure 1‐14 2‐D Free Energy Surface constructed by EVB FEP/US after the PD refinement. (LEFT) EVB 1D FES after PD refinement (blue) for DCE + AcO(‐) in the gas phase (RIGHT) 2D EVB FES after PD refinement relative to the omegaB97X‐D/6‐31+G(d) potential. QM optimized values for r(C‐Cl) and r(C‐O) in Angstroms for RS(PS) are 1.820/3.207 (3.358/1.457); for TS ‐ 2.321/2.041. ............................................. 25 Figure 2‐1 The thermodynamic cycle used in the Paradynamics approach to calculate the free energy barriers. The blue curve is the free energy surface, Δg(ξ), obtained for the reference potential with the corresponding barrier , Δg ≠ , (blue arrow); the red curve is the free energy surface for the target potential, for which we want to evaluate the free energy barrier (red arrow). To achieve that we estimate the free energy of moving from the reference potential to the target potential at the RS and at the TS (shown by black arrows). ................................................................................................................. 33 Figure 2‐2 Estimating the free energy of moving from E PM3/MM to E PM6/MM at the TS (A) and at the RS (B) by: 2 points LRA (black dashes), a single point LRA averages of the energy gap, E PM6/MM ‐E PM3/MM , on E PM3/MM (red dashes) and on E PM6/MM (green dashes). Note that the results of the calculations made by forward FEP (magenta squares), backward FEP (cyan triangles) and their average (blue line), as well as the N‐points LRA (black triangles) and TDI (orange stars) are hardly distinguishable due to the overlap (also shown on the insets). ......................................................................................................................... 36 ix Figure 2‐3 (A). The distribution of the energy gap, E PM3/MM ‐ E PM6/MM , obtained from the MD trajectories propagated in FEP calculations of moving from E PM3/MM +E CONS (ξ) to E PM6/MM +E CONS (ξ), where ξ=d(C‐Cl L )‐d(C‐ Cl A ) is the solute RC, in the RS and at the TS regions (designated by red and green respectively); (B) The sampling distribution of the nuclear RC, obtained from the MD runs used in FEP calculation of moving from E PM3/MM +E CONS (ξ) to E PM6/MM +E CONS (ξ) in the RS and in the TS regions (which are designated by red and green respectively). .............................................................................................................................. 38 Figure 2‐4 The free energy functions of the solute RC, ξ, obtained at the RC values corresponding to the TS (A) and to the RS (B), for the full mapping between PM3/MM and PM6/MM. The blue line designates Δg PM6/MM and the red line designates Δg PM3/MM , obtained as weighted averages over all mapping potentials, blue and red dots, respectively. ............................................................................................... 39 Figure 2‐5 The PMF obtained for the PM3/MM potential with the solute charges derived by the Mulliken analysis (red) and by the ESP‐fitting (blue) and for PM6/MM potential with the Mulliken solute charges (green). The PMF’s have been calculated by the WHAM procedure. ........................................... 39 Figure 2‐6 Estimating the free energy of moving from the EVB to PM3/MM at the TS (A) and at the RS (B) in water by: FEP (blue line), 2 point LRA (black dashes), LRA averages of the energy gap, E PM3/MM ‐E EVB , on E EVB (red dashes) and on E PM3/MM (green dashes). Note that the results of the calculations made by forward FEP (magenta squares), backward FEP (cyan triangles) and their average (blue line), as well as the N‐points LRA (black triangles) and TDI (orange stars) are hardly distinguishable due to the overlap (also shown on the insets). ......................................................................................................................... 41 Figure 2‐7 The distributions of the energy gap between the EVB potential and the PM3/MM potential, obtained while doing FEP at the RS (green) and of the gap between the EVB mapping potential and PM3/MM while doing FEP at the TS (red). ................................................................................................. 42 Figure 2‐8 The free energy functions along the reaction coordinate ξ=d(C‐Cl L )‐d(C‐Cl A ) obtained in FEP calculations from EVB potential to PM3/MM in water at the TS (A) and at the RS (B). The blue line designates Δg EVB (weight‐averaged over the blue points), while the red line designates Δg PM3/MM (weight‐ averaged over red points). .......................................................................................................................... 43 Figure 2‐9 The free energy functions along the reaction coordinate for FEP from EVB mapping potential to PM3/MM in water at the TS. The blue line designates the EVB function (weight‐averaged over blue points), while the red line designates the PM3/MM function (weight‐average over red points) ............. 43 x Figure 2‐10 (A) The free energy profile along the nuclear RC obtained by the FEP/US approach for the EVB reference potential (blue dots) and for the PM3/MM (red line); (B) EVB FEP/US free energy profile along the energy gap between the EVB diabatic states. ............................................................................ 44 Figure 2‐11 Fitting the Γ‐functions. (A) Γ EVB (green line) fitted to the EVB gas‐phase PMF/WHAM (red triangles). The EVB potential used in the simulation (blue squares) was modified by removing the effect of the EVB distance constraints for the RC values outside the range: ‐1>ξ>1. (B): PM3 and PM6 gas phase energy scans from MOPAC2009 (red stars and green triangles) and the corresponding Γ‐functions. ...... 47 Figure 2‐12 The gas‐phase EVB PMF obtained by using the WHAM approach and the harmonic constraints on different values of RC. The red line is for the EVB potential which was fitted to PM3(gas), while the green line is for the PMF obtained for the E EVB ‐Γ EVB +Γ PM6 potential ............................................ 50 Figure 2‐13 The PMF’s obtained with explicit (green) and implicit COSMO (red) solvation models. The PMF’s were obtained by adding to the QM/MM potential the negative of Γ COSMO . The figure also depicts the Γ COSMO (blue line) function, derived from PES with COSMO solvation model (blue triangles). ............ 51 Figure 2‐14 The distribution of the nuclear RC during MD simulation on the potential E PM3/MM ‐ Γ COSMO (red) and on the potential E PM3+COSMO ‐ Γ COSMO (green). The histogram for E PM3/MM ‐ Γ COSMO shows a poor sampling at the TS for the explicit model, since Γ COSMO does not make the explicit potential fully flat. .. 52 Figure 2‐15 The PMF’s obtained for the CH 3 Cl and Cl ‐ S N 2 reaction in water. (A) (Red) PMF constructed for E PM3/MM potential (polarized PM3 Hamiltonian with the solvent polarized by the ESP charges) using the WHAM approach. (Blue) PMF obtained by sampling on the flat E PM3(GAS) ‐Γ PM3(GAS) with the EVB‐type solvent driving potential using the FEP/US approach for the potential E PM3(GAS) +E sS (Q PM3 ). ....................... 56 Figure 2‐16 Demonstrating the refinement of the EVB reference potential in condensed phases, using the correction potential –Γ EVB +Γ TARG fitted by Gaussians plus a vector of the new EVB charges derived for the target potential. The free energy profiles obtained by the EVB FEP/US approach along the EVB energy gap (A) and along the nuclear RC (B). (Blue) the original reference potential with parameters refined for the gas phase PM3; (red) the refined EVB reference potential for PM3/MM target potential; (green) the refined EVB reference potential for PM6/MM target potential. ............................................. 57 xi Figure 3‐1 Active center of dehaloalkane dehalogenase. The protein backbone shown in green the catalytic residue, aspartate, is shown as sticks. The dotted red line depicts the direction of the nucleophilic attack of the substrate – 1, 2‐dichloroethane shown as sticks. ............................................. 68 Figure 3‐2 Catalytic effect based on PMF simulations using WHAM approach as reproduced by PM3/MM potential. Blue line is PMF for the reference reaction in water. Black line is PMF for the reaction in enzyme. ....................................................................................................................................................... 69 Figure 3‐3 Application of the PD approach to an enzymatic reaction in dehalogenase. (A) Free energy perturbation of moving from the PM3/MM reference potential to the PM6/MM reference potential at the RS by a 21‐step FEP (RED line) and by the end‐point LRA approach (BLACK line) and at the TS by a 26‐ step FEP (GREEN line) and by the LRA (BLUE line). The black arrow shows the corresponding correction to the activation free energy barrier. (B) RED line: PM3/MM reference free energy surface (PMF); CYAN line: PM6/MM target free energy surface. Both PMF are calculated using the FEP/US approach black dots and blue dots. Their weight‐averages are in red and cyan respectively. ........................................... 70 Figure 3‐4 Catalytic effect based on PMF simulations using WHAM approach as reproduced by PM6/MM potential. Blue line is PMF for the reference reaction in water. Black line is PMF for the reaction in enzyme. ....................................................................................................................................................... 71 Figure 3‐5 Catalytic effect based on PMF simulations using WHAM approach as reproduced by B3LYP//6‐ 31G*/MM potential. Blue line is PMF for the reference reaction in water. Black line is PMF for the reaction in enzyme. ..................................................................................................................................... 72 Figure 3‐6 PMF for PM3/MM (left) and PM6/MM (right) potentials in dehalogenase. The estimates are obtained using WHAM (BLUE), FEP/US (GREEN) with sampling of the target potential. Red dots are the weighted‐averages calculated using the FEP/US approach with configurational averages taken on the reference potential. .................................................................................................................................... 72 Figure 3‐7 2D PMF for PM3/MM (BLUE) and PM6/MM (RED) potentials for the reaction in dehalogenase. The BLUE and RED surfaces are calculated using 2D‐WHAM and mapping along 2 reaction coordinates (d(C‐O) and d(C‐Cl)), while the black surface obtained using the FEP/US approach processing data generated on 1D mapping potential along d(C‐O)‐d(C‐Cl) RC. The cyan arrow shows the LRA free energy perturbation calculated by the PD approach. ............................................................................................. 73 xii Figure 3‐8 The corresponding contour maps for the free energy surfaces shown in Figure 7. (LEFT) PM3/MM FES, (RIGHT) PM6/MM FES for the reaction in dehalogenase. .................................................. 74 Figure 4‐1 The simplest models used to study the proton transfer step of the phosphate monoester hydrolysis (PO 3 ‐ + H 2 O (or 2H 2 O)). (LEFT) the near transition state configuration for the mechanisms involving one water molecule. (RIGHT) and two water molecules. ........................................................... 79 Figure 4‐2 The PM3/MM free energy surfaces calculated for the 1W and 2W proton transfer mechanisms for the (PO 3 ‐ + H 2 O ( or 2H 2 O)) system. (A) The 2D free energy surface in the ξ, R 2 space, where ξ is the 1W PT coordinate, defined as the difference between the proton‐donor and proton‐ acceptor distances. (B) A comparison of the 1D free energy surfaces along the ξ RC for the 1W PT, obtained using WHAM (black dots) and FEP/US (red dots) and for the 2W PT, obtained with WHAM (black triangles) and with FEP/US (green dots). Note that here the proton acceptor is the second water for 2W PT and the oxygen of the hydrated metaphosphate for the 1W PT. .............................................. 79 Figure 4‐3 A description of the QM region used to study the hydrolysis of MHDP. Here the P‐O bond (grey dashes) is broken and the two studied mechanisms for the final proton transfer step are shown as pink dashes (1W PT) and as blue dashes (2W PT). ..................................................................................... 80 Figure 4‐4 The PM3/MM free energy surface calculated for MHDP + 2 QM water molecules (the model of the QM region given in Figure 4‐3) (A) in the R 1 , R 2 space for the cleavage of the P‐O bond (B) for the 1W and the 2W PT mechanisms. ................................................................................................................ 80 Figure 4‐5 A description of a larger model for the QM region (MDP with Mg 2+ and 16 QM water molecule) used to study the hydrolysis reaction (LEFT). Here the P‐O bond is already broken and the system is at the plateau prior to the final proton transfer. (RIGHT) Shown is the QM region solvated by MM water (blue sticks) ............................................................................................................................... 81 Figure 4‐6 The PM3/MM free energy surface for the QM model of Figure 4‐5, calculated in the R 1 , R 2 space for the cleavage of the P‐O bond. .................................................................................................... 81 Figure 4‐7 PM3/MM free energy surfaces for: (A) MTP plus Mg 2+ plus 16 QM H 2 O (B) MDP plus Na + plus 16 QM H 2 O. The obvious effect of Mg 2+ is that it seems to change the probability of the associative/dissociative paths, making the associative more likely. Probably it provides stabilization to the associative complex by decreasing the repulsion between the negatively charged leaving terminal xiii phosphate and the residual methyl(di)phosphate. Also Na + seems to be doing worse than Mg 2+ in terms of lowering the associative TS1. ................................................................................................................. 82 Figure 4‐8 BLYP/MM free energy surfaces calculated for the 1W and 2W proton transfer mechanisms for the (PO 3 ‐ + H 2 O ( or 2H 2 O)) system. (A) The 2D free energy surface in ξ, R 2 space, where ξ is the 1W PT coordinate defined as the difference between the proton‐donor and proton‐acceptor distances. (B) A comparison of the 1D free energy surfaces obtained using FEP/US along the ξ RC for the 1W PT path with the 6‐31G* basis set (black dots) and the 6‐31G basis set (red dots) and for the 2W PT path , obtained with 6‐31G* basis set (blue dots). Note that here the proton acceptor is the second water for 2W PT and the oxygen of metaphosphate for 1W PT. ............................................................................... 85 Figure 4‐9 B3LYP//6‐31G*/MM free energy surfaces calculated for the 1W and 2W proton transfer mechanisms for the (PO 3 ‐ + H 2 O ( or 2H 2 O)) system. (A) The 2D free energy surface in the ξ, R 2 space, where ξ is the 1W PT coordinate defined as the difference between the proton‐donor and proton‐ acceptor distances. (B) A comparison of the 1D free energy surfaces obtained along the ξ RC for the 1W PT path , obtained by WHAM (blue dots) and for the 2W PT path , obtained by WHAM (green dots) and using FEP/US (black dots). Note that here the proton acceptor is the second water for 2W PT and the oxygen of metaphosphate for 1W PT. ........................................................................................................ 85 Figure 4‐10 The free energy surfaces in the R 1 , R 2 space for MHDP plus 2 QM H 2 O (the system with the QM region given in Figure 4‐5), for the first step of the ester hydrolysis (the cleavage of the P‐O bond) obtained by (A) the PMF /COSMO model ; (B) the B3LYP//6‐31G*/MM model. ..................................... 86 Figure 4‐11 B3LYP//6‐31G*/MM free energy surface for MDP plus Mg 2+ plus 16 QM H 2 O, obtained with a 1D mapping potential, which contained an harmonic constraints on R 1 ‐R 2 RC: (A) The 1D free energy surface calculated along the R 1 ‐R 2 RC with FEP/US(blue) and with WHAM(red) and (B) The free energy surface calculated in the R 1 , R 2 space by the FEP/US approach. ............................................................... 87 Figure 4‐12 The Free energy surfaces for the 1W PT path calculated for the MHDP + 2 QM H 2 O QM region of Figure 4‐3 using (A) the PMF/B3LYP//6‐31G*/COSMO and (B) B3LYP//6‐31G*/MM potentials. .................................................................................................................................................................... 88 Figure 4‐13 The Free energy surface for the 2W PT path calculated for the MHDP + 2 QM H 2 O the QM region of Figure 4‐3 using (A) 2D mapping with the PMF/B3LYP//6‐31G*/COSMO model . (B) 1D mapping with the PMF/B3LYP//6‐31G*/COSMO model .......................................................................... 88 xiv Figure 4‐14 Free energy profiles calculated on B3LYP//6‐31G*//MM potential for: the 1W PT mechanism using MHDP plus 1 QM water as a model for the QM region (RED was obtained using WHAM; BLACK curve using FEP/US) and for the 2W PT mechanism using MHDP plus 2 QM water (BLUE curve calculated using WHAM) .............................................................................................................................................. 89 Figure 4‐15 A snapshot of one the studied models for the QM region (MDP with Mg 2+ and 6 QM water molecules) used to study the PT step in hydrolysis of phosphate monoester. Here the P‐O bond is already broken and the system is prior to the final proton transfer. ......................................................... 89 Figure 4‐16 The Free energy surfaces for the 1W PT path calculated for the model of the QM region given in Figure 4‐15, using the B3LYP//6‐31G*/MM potentials. ................................................................ 90 Figure 5‐1 LEFT A monomeric unit of the red fluorescent protein LSmKate1 with the view given from the top of the barrel formed by the protein (shown as green cartoon). A snapshot is taken from a QCFF‐Pi trajectory for the protonated fluorophore. (RIGHT) Zoom of the chromophore in the protonated state and the suggested proton acceptor GLU 160. ............................................................................................ 96 Figure 5‐2 Assignment of ionization states to nearby ionizable residues of chromophore in LSSmKate1 red fluorescent protein at pH=7. Note that the protonation state of GLU160 in simulations is determined by the protonated state of the chromophore. ........................................................................................... 97 Figure 5‐3 Absorption (A) and emission (E) spectra calculated for the protonated (HA) and for the ionized states (A ‐ ) of the chromophore in LSSmKate1 red fluorescent protein. (LEFT) Cyan line/white dots – A(HA), CIS(D); blue line – A(HA), QCFF‐π; cyan line/white dots – A experiment at pH=7; cyan line/red dots – A experiment at pH=11. (RIGHT) cyan line/white dots – A experiment at pH=7; cyan line/red dots – A experiment at pH=11; red line/white stars and orange line/white stars – E experiment 150 at pH=11; orange line ‐ A(A ‐ ), QCFF‐π; purple line ‐ E(A ‐ ), QCFF‐π; cyan line/white dots ‐ A(A ‐ ), CIS(D); magenta line/white dots ‐ E(A ‐ ), CIS(D). Units are: relative intensity versus wavelength of the absorbed/emitted light. ............................................................................................................................................................ 98 xv List of Tables Table 1‐1Evaluating the free energies for transferring from the EVB to the PM3 free energy surface, using the LRA approach outlined in Eq. (1.28). All energies are in kcal/mol. ............................................ 17 Table 2‐1 Parameters for the Γ‐functions presented in Figure 2‐11 composed of 3‐gaussians ( A i exp(‐ α i (ξ‐ ξ i ) 2 ) )each ............................................................................................................................................. 48 Table 2‐2 Comparing the solvation energies of the COSMO and the explicit SCAAS 99 models. ................. 52 Table 4‐1. Summary of the free energy simulation results obtained using PM3/MM (a) ........................... 83 Table 4‐2 Summary of the BLYP//6‐31G*/MM, B3LYP//6‐31G*/MM and B3LYP//6‐31G*(COSMO) free energy calculations (a) ................................................................................................................................... 91 xvi Abbreviations QM(ai)/MM ab initio quantum mechanical/ molecular mechanical RP reference potential (potential energy surface) TP target potential (potential energy surface) LRA linear response approximation FEP free energy perturbation EVB empirical valence bond TDI thermodynamic integration PD Paradynamics MTD Metadynamics PT proton transfer 1W PT one water proton transfer 2W PT two water proton transfer PMF potential of mean force FES free energy surface PES potential energy surface RC reaction coordinate 1D one-dimensional 2D two-dimensional QCP quantized classical path US umbrella sampling WHAM weigthed-histogram analysis method QCFF Quantum chemical extension of the self-consistent force field RS reactant state PS product state TS transition state INT intermediate SCAAS surface constrained all-atom solvent DCE 1,2-dichloroethane xvii PM3 parametric model 3 PM6 parametric model 6 DFT density functional theory BLYP Becke exchange functional with Lee-Yang-Parr correlation B3LYP Becke 3 parametric hybrid functional with Lee-Yang-Parr correlation wB97X-D Becke97 exchange hybrid , long ranged corrected, dispersion corrected SCF self-consistent field HF Hartree-Fock NQM Nuclear quantum mechanical MHDP MeO-HPO3-PO3(2-), methyl monohydrogen diphosphate MDP MeO-PO3-PO3(3-), methyl diphosphate MTP MeO-(PO3)2-PO3(4-), methyl triphosphate xviii Abstract Reliable computational modeling of chemical processes in condensed phases such as computation of the activation free energies requires both extensive configurational sampling and an appropriate level of theoretical treatment. The former is necessary for capturing solute-solvent fluctuations (which is extremely problematic in the energy-minimization approach) and for obtaining convergent free energies, while the latter is required for reliable description of redistribution of electron density during chemical processes and the corresponding energetics. These two factors make computational cost of ab initio QM/MM (QM(ai)/MM) simulations extremely high or even prohibitively expensive for the real system of interest such as chemical reactions in aqueous solution and enzymatic reactions. A powerful general approach, which circumvents these problems, is the reference potential based Paradynamics approach. In this approach the extensive configurational sampling is performed at a lower level of theory using a cheaper (computationally) reference potential rather than directly on the expensive target QM(ai)/MM potential. This is followed by calculating the free energy perturbation of moving from the reference potential to the target potential. In this work, a number of advances is presented which are made to the QM(ai)/MM free energy computation techniques in general and to the reference potential strategy in particular. First, a formulation of the reference potential based Paradynamics model is given. Chapter 1 addresses an important issue of improving the convergence of the linear response approximation (LRA), which is used to calculate the free energy perturbation from the EVB reference potential to the QM target potential. The improvement in the LRA convergence is achieved by iteratively reducing the difference between the two potentials by fitting EVB parameters to the energy and its derivatives of the target potential. A thorough comparative analysis of the computational cost of the Paradynamics approach is given relative to other methods based on the direct sampling of the target potential. In Chapter 2, an extensive study exploring, refining, and quantifying the Paradynamics model is given. First, a different refinement strategy for a general reference potential is formulated, where the difference between the reference potential and the target QM/MM surface is reduced using the correction potentials comprised of Gaussian functions along the reaction coordinates. Additionally we propose a way of accelerating the potential of xix mean force calculations by using a combination of the same correction potentials along the reaction coordinates and a solvent polarizing potential. Secondly, it is rigorously demonstrated how to gradually and in a controlled way improve calculations of the free energy perturbations associated with moving from the reference potential to the target QM/MM surface using the multistep free energy perturbation approach. Furthermore, the LRA treatment is validate by comparing it to the full FEP approach. Finally, results of the PD model prediction for the activation barrier on the target potential are compared to results of PMF calculations. In chapter 3 we demonstrate practical applications of the QM(ai)/MM free energy calculation approach in general and the Paradynamics model in particular by addressing challenging questions in chemistry and related fields. Advances in developing computational tools for the free energy calculations required in the Paradynamics thermodynamic cycle are reported with illustrative examples. First, the Paradynamics model is applied to an enzymatic reaction, and its predictions are confirmed by 1D and 2D PMF calculations. Second, we calculate the catalytic effect of an enzyme. Last, we quantify another reference potential approach where the target free energy surface is estimated entirely by sampling the reference potential, and relating the corresponding probabilities. In chapter 4, a thorough computational study is carried out to quantify the mechanism of hydrolysis of a phosphate monoester, in particular the mechanism of proton transfer step. This is achieved by calculating the relevant QM(ai)/MM free energy surfaces for the possible mechanisms. This chapter also discusses application of the reference potential approach in estimating the nuclear quantum mechanical effect terms of the QCP approach. Finally, in the concluding Chapter 5 another application of the reference potential approach towards quantifying the photochemical cycle (which involves calculation of the excited state proton transfer) by calculating properties (the emission and absorption spectra) of the red fluorescent protein is considered. It is demonstrated how one can overcome the problem of prohibitively expensive QM(ai)/MM sampling for a system in the excited state by using the QCFF-PI reference potential. 1 1 Chapter 1. ParaDynamics: An Effective and Reliable Model for Ab Initio QM/MM Free Energy Calculations 1.1 Introduction Obtaining reliable ab initio QM/MM free energy surfaces for chemical reactions in the condensed phase has long been a challenge of modern computational chemistry (see e.g. Refs. 1- 10 ). The main challenge in such calculations is the high computational cost associated with the repeated evaluation of the ab initio QM/MM potential. Significant research efforts have been invested to advance this problem, resulting in a number of promising strategies 2,5,9-15 Many of these approaches 4,5,10,12,14,16 exploit an idea 17 of utilizing a simplified (e.g. classical) potential as the reference potential. Here, we demonstrate the effectiveness of an updated version of the classical reference potential for QM(ai)/MM calculations, which we refer to as “paradynamics” (PD). This approach is based on the use of an empirical valence bond (EVB) reference potential, which is already similar to the real ab initio potential. The reference potential is fitted to the ab initio potential by an iterative and, to a great degree, automated refinement procedure. The corresponding free energy profile is then constructed using the refined EVB potential, and the linear response approximation (LRA) is used to evaluate the QM(ai)/MM activation free energy barrier. The automated refinement of the EVB surface (and thus the reduction of the difference between the reference and ab initio potentials) is a key factor in accelerating the convergence of the LRA approach. Another increasingly popular approach that has emerged in recent years is the “metadynamics” (MTD) approach, which is often combined with the Carr-Parinello ab initio molecular dynamics or with the QM(ai)/MM approaches in order to study processes that are dominated by changes in chemical structure, such as chemical reactions. In the MTD approach, the configurational sampling is performed directly on the target potential, but limited to the particular region of interest of the free energy surface. The MTD approach adopts the local minima elevation sampling strategy of van Gunsteren and coworkers’ 18-20 and gradually elevates the local minima in the space of selected coordinates “collective variables” by filling them with Gaussians. This prevents the simulated system from returning to areas of the configurational space that have already been already visited, and therefore improves sampling efficiency. Since its original 2 inception in 2002 21 , the MTD approach has skyrocketed in popularity (possibly, amongst other reasons, because of its seemingly automatic implementation), and has been applied to a wide range of problems, including photoinduced ring transformations 22 , intramolecular reactions in complex systems 23 , rational catalyst design 24 and organometallic reactivity 25 , to name just a few examples (see also the review in Ref. 26 ). In the present work, we will use the MTD approach (based on the direct sampling of the target QM(ai)/MM potential) for a comparative study of the recent refinement of the reference potential based PD approach. Our refined approach will be referred to in this work as “paradynamics” (PD) in order to both highlight its similarities to the MTD approach, as well (perhaps) as to highlight advantages achieved with the PD approach in terms of the computational cost. 1.2 Background In order to prepare the reader with the proper perspective to be able to compare the MTD approach to the PD approach using an empirical valence bond (EVB) reference potential, we begin by providing a description of the main points of the MTD approach. The MTD approach involves the clever idea of devising a potential that is much shallower than the original system potential, where the simulation can thus avoid becoming trapped in the minima of the real potential, separated by a large barrier as in the case of the energy surfaces for chemical reactions. The MTD approach moves further in this direction, by trying to generate a potential that will completely flatten the surface of the system in the directions where the presence of barriers is likely. This is achieved by iteratively adding a potential, E’(r), to the real potential, in the form of a sum of Gaussians that gradually approaches –E(r)) (where E(r) is the original potential). In other words, at the k th iteration, the MTD is run on the potential given by: kQM k Er E r Er (1.1) where E k r A i e i r 2 i k i k 1 (1.2) Here r is the space that is spanned by a small number of chemically relevant collective variables (which are used to define the reaction coordinate). The system can “escape” over the lowest transition state as soon as the growing biasing potential and the underlying free energy well 3 counterbalance each other, effectively allowing the simulation to “escape” free energy minima 27 . To better illustrate this principle, one can consider, as an example, the case described in Figure 1A. In the MTD approach, one divides the range of interest spanned by the selected RCs (from a minimum to maximum value) into smaller intervals, and identifies the case where the dynamics starts to lead to the closest minima on V 0 , which are effectively sampled. At this point, a potential V 1 ’ is built at the V 0 minimum (using Gaussians as the functions of the chosen RC), and the dynamics is propagated on V 0 +V 1 ’. This allows for the sampling of higher energy points on the FES, i.e. those that lie on the height of the minima at V 0 +V 1 ’. At the next iteration, the potential V 2 ’ is constructed, and the region near the minima of V 0 +V 2 ’ is effectively sampled. This procedure is iteratively repeated until the maxima of the free energy surface are all sampled, at which point the sum of V 0 +V n ’ is basically flat. The resulting flat potential would allow for an efficient sampling of the V 0 in the r-direction. In the MTD approach, it is assumed that the free energy surface is equal to the resulting - V n ’. It should be noted that although MTD has been formulated in an elegant way, the main computational cost arises from flattening the potential, as this involves evaluating the energy by expensive QM calculations a countless number of times. Since the key issue is the overall efficiency, we may look at related formulations. Thus, we start by clarifying that the philosophy of this approach is actually similar (at least formally) to the overall philosophy of the reference potential approach. In fact, the construction of a potential that makes the landscape flat is ideologically a counterpart of a simplified reference potential. The general use of a reference potential for accelerated sampling has been a key part of QM/MM free energy perturbation (QM/MM-FEP) studies for a long time 8 , long before the appearance of MTD in the field. The same approach has also been used in quantizing nuclear motions 28,29 , as well as in moving from coarse-grained to full potentials in studies of protein folding 30 , as well as related problems 3,31-34 . The formal similarity between the two approaches becomes clearer once, while developing the reference potential, we also take into account the idea of improving this potential so as to minimize the difference (by some measure) between the reference and the target potentials (see e.g. Refs. 3 ). This point can be best illustrated in Figure 1-1, which compares and contrasts the MTD and PD approaches. From this figure, it can be seen that, in the case of MTD, as outlined above, this approach samples the energy landscape while incrementally 4 adding Gaussian functions to the real potential of the system. This is repeated many times until the sum of both potentials reaches zero (which results in a flat potential, as shown in Figure 1-1A). In contrast to this, in our PD approach, our initial reference potential (in this case an EVB potential) is already quite similar to the real potential of the system, and instead we iteratively and systematically refine the parameters of the reference potential, until the difference between the two potentials approaches zero (see Figure 1-1B). A comparison of the computational cost of the two approaches will be found in Section 1.5. Figure 1-1 A conceptual comparison of meta- and paradynamics. (A) The MTD approach iteratively adds Gaussian potentials, V 1 ’, V 2 ’, V 3 ’…V N ’, to the real potential of the system, V 0 . At the final iteration, the addition of the Gaussian potential V N ’ to the real potential creates a flat potential, V 0 +V N ’, which allows for sampling of the TS region. (B) In the PD approach, the reference potential (in this case an EVB potential), V 1 ’, is iteratively and systematically refined (V 2 ’, V 3 ’...V N ’) to reproduce the real potential, V 0 . The iterative refinement of the EVB reference potential, V n ’, is aimed at reaching a situation where V 0 and V n ’ almost coincide. Once this is achieved, the computationally expensive sampling on the real potential can be substituted by sampling the reference potential instead (e.g. by EVB FEP/US), plus a minor LRA correction. 1.3 Paradynamics Approach 1.3.1 General Outline The aim of PD is to provide an efficient and reliable way to perform QM(ai)/MM free energy calculations. The starting point for this is the use of the EVB approach (which has been discussed in detail in elsewhere 35,36 ) as a reference potential, which has proven to be highly successful in the past (e.g. Refs. 3,17,32,37 ). This method has several features, the first of which involves having a reference potential which is reasonably close to the real potential, allowing one to use a perturbation treatment, similar to the approach of Ref. 3 . Secondly, the EVB energy gap 5 offers what is probably the best RC for studies of chemical processes both in solution and in proteins, as was also recently observed in a study that used the energy gap in MTD calculations 38 . Now the use of a reference potential is formally similar to using a potential that is the inverse of the real potential, as is done by the MTD (see Figure 1-1). This requires, however, that the two potentials (i.e. the EVB and the ab initio potentials) are as close as possible. As will be outlined below, this can be achieved in several steps by means of our PD approach. Here, we will start with a brief description of EVB free energy calculations, followed by a consideration of the procedure for moving from the EVB to the ab initio FES, and finally, we will outline the procedure for refining the EVB parameters. 1.3.2 Empirical Valance Bond Method We start our description from the method which due to its low computational cost for long time was the only practical way of calculating QM/MM free energy surfaces. In the Empirical Valence Bond approach the Schrodinger equation (1.3) E H (1.3) is formed by approximating the diagonal ii H iH i and the off-diagonal ij H iH j elements using analytical empirical expressions. Each EVB diagonal element is approximated by the analytical energy function for the i-th VB diababtic state i : i bonded nonbonded i EE E (1.4) The first term of Eq. (1.4) describes bonded interaction within the QM region and the bonded QM/MM coupling; the second term accounts for electrostatic and Van der Waals interations within the QM region and the corresponding nonbonded QM/MM coupling; the third term is the gas phase shift of the diabatic state. The first term of Eq. (1.4) contains the terms describing bond stretching, angle bending and deformation of torsional (and improper torsional) angles: bonded bond angle tors E EE E (1.5) Breaking and forming bonds are described using the Morse potential: 0 2 1 rr bond bonds EDe (1.6) 6 Where D is the potential well depth; 0 r is the bond equilibrium distance; is the parameter controlling the steepness of the well. Angles deformations are described using the harmonic potential with a force constant K with the minimum at 0 : 2 0 angle coupl angles EK (1.7) where coupl is the coupling element which is introduced when an angle formed by atoms i, j, k exists only in one of the EVB states due to breaking/forming bond between atoms i, j or j, k: 00 0 2 0 1, 2, coupl rr r r rr ee rr (1.8) Torsional angles deformation is accounted by: 0 1cos tors coupl tors EK n (1.9) where coupl (Eq. (1.8) is the coupling element which is introduced for a torsional angle characterized by a force constant K , phase angel 0 and periodicity n , formed by atoms i, j, k, l which exists only in one of the EVB states due to breaking/forming bond between atoms i, j or k, l atoms. The second term of Eq. (1.4) in case of a non-polarizable force field incorporates electrostatic interactions and Van der Waals interactions: nonbonded elec VdW EEE (1.10) Electrostatic interactions consist of those within the QM region and of the interaction between the QM and MM regions described with a Coulomb law: ij i j elec ij QM i j QM MM ij ij QQ Qq EC fr C rr (1.11) here C is a constant that depends on the units of energy used (332 for kcal/mol); Q is the charge of a QM atom and q is the charge on the MM atom; ij f r is the screening function defined as: 2 1 ij r ij fre (1.12) 7 Van der Waals interactions within the EVB region are described by the combination of 6-12 Lennard-Jones potential and the Buckingham potential: 12 12 12 12 ij ij r ij ij ij ij VdW ij QM i j QM MM ij ij ij ij ABAB ECe rr rr (1.13) In the EVB method, the effect of solvent is included only in the Hamiltonian diagonal elements. The off-diagonal elements are approximated by the Gaussian functions: 2 ij H Ae (1.14) where ξ is the reaction coordinate defined as the difference between the forming and breaking bond lengths. After diagonalization of Eq. (1.3) the lowest eigenvalue is identified: EVB E H (1.15) with the corresponding normalized eigenvector with its components being c 1 , c 2 , .., c i , .., c n the corresponding derivatives are calculated: EVB E xx H (1.16) For 2x2 EVB Hamiltonian Eq. (1.15) and Eq. (1.16) become: 22 11 2 2 12 12 2 EVB E cE cE cc H (1.17) 22 11 2 2 12 12 2 EVB E cE cE cc H x xx x (1.18) The EVB approach considered above is based on a very intuitive picture of EVB diabatic states, which are closely related to the concept of resonance structures. Due to analytical form of empirical approximations the computational cost of the EVB method is very low for a QM/MM method, which allows to perform an extensive configurational sampling and to achieve quantitatively reliable results which are determined by the quality of the EVB parameterization. The EVB potential surface is obtained by constructing diabatic surfaces (E i ) that approximate the reactant (RS), product (PS) and different intermediate states (the transition state (TS) in 8 particular), and mixing these surfaces with off-diagonal terms that lead to the actual ground state surface, E EVB (for details see e.g. Refs. 35,36,39 ). The relevant activation free energy, EVB g , is evaluated using a mapping potential, map m , which drives the system from one diabatic state to another as we accumulate sampling along the RC between the RS and PS. In the simplest case, which involves only two diabatic states, the mapping potential can be written as a linear product of the reactant (E 1 ) and product (E 2 ) potentials: 12 (1 ) map mm m E E (1.19) where m is incrementally changed from 0 to 1 ((n+1) values) in n fixed increments ∆ . The free energy increment associated with moving between the two adjacent mapping potentials map m and 1 map m is evaluated using the (FEP) formula 40 : 11 ln exp map m map map map m m m m GkT kT (1.20) where < > designates an average over molecular dynamics (MD) trajectories propagated on map m The final free energy for moving from E 1 to E 2 is taken as a sum of all free energy increments: 11 1 n map n map m m m GG (1.21) However, Eq. (1.21) would only provide the free energy barrier, map g , for moving from E 1 to the mapping potential TS map , forcing the system to remain near the EVB TS. Therefore, in order to construct the full EVB FES, we use as our RC, x, the energy gap between the two diabatic states (i.e. x =E 1 – E 2 ), and calculate the free energy of the system at a particular value of the RC using: ln exp map m map EVB EVB m map m Gx kT x E x kT G (1.22) (see e.g. Refs. 41 and 42 for further discussion). 1.3.3 E Once the on the EV Similarly Figure 1-2 dashed arro perturbation approach. In princip is substit and use t EVB FES The term relationsh Where th free ener Evaluating th e EVB surfa VB FES, usi y, the activat 2 The thermodyn ow) is calculated ns between the Q ple, one can tuted by E QM the thermody S, EVB E Gx ms on the ri hip: QM G he first term rgy change a he Correcti ace has been ing: g tion free ene g amic cycle used d as the sum of QM and EVB su directly obt M 8,17 (see Ch ynamic cycl VB , are used ght hand sid i M QM x G of Eq. (1.25 associated w ion for Mov constructed EVB EVB g G rgy barrier o QM QM g G in calculating the the EVB barrie urfaces at the RS tain the QM hapter 3). Ho le depicted i d as referenc de of Eq. (1 i EVB EVB G x 5) is obtaine with moving 9 ving from th d, we obtain, TS B EVB x G on the QM F TS QM Q xG e QM(ai) activati er (vertical solid S and TS (horiz FES with a owever in th n Figure 1-2 ces for QM G 1.24) can be EVB QM G ed from EVB from the EV he EVB to th the activatio RS EVB EVB G x FES, QM g , i RS QMQM x tion free energy. d arrow) evalua zontal solid arrow modified fo his work we 2, where the M QM x . e calculated ii EVB QM xx B calculation VB to the Q he ab initio P on free energ s given by: The QM activati ted by EVB FE ws), which is ev orm of Eq. ( e adopt the a e correspond d by means M ns and the se QM FES at t Potential gy barrier, (1 (1 ion free energy ( EP/US method, a valuated using th 1.22) where approach of ding points o of the follo ( econd term i the relevant EVB g , .23) .24) vertical and the he LRA E EVB Ref. 2 on the owing 1.25) is the point 10 on x i (i stands for RS, TS or PS). The resulting QM free energy barrier is therefore related to the EVB barrier by: QM EVB EVB QM gg g (1.26) where the terms on the right hand side of Eq. (1.26) are the difference between the corresponding right hand side terms of Eq. (1.25) at the TS and at the RS. The free energy of the transfer between EVB and QM potentials (i.e. the second term of Eq.(1.25)) can then be directly obtained by means of FEP calculations using the mapping potential: 1 map mmEVBmQM EE (0 m 1) (1.27) (see Ref. 2 ), and this will be demonstrated in Chapter 2. Alternately, following the approach of Ref. 3 , and using the LRA formulation, we write: 1 2 QM EVB ii EVB QM EVB QM QM EVB QM EVB EE Gx x E E E E (1.28) This is particularly important in case when the EVB and QM potential surfaces differ significantly from each other. However, since the point of the refinement procedure is to bring the two potentials as close as possible to each other, once the EVB parameter refinement is complete, the end-points approach of the LRA should be a good approximation to performing full FEP. 1.3.4 Performing Iterative PD Fitting to an ab initio Potential Let us move to the key point of this chapter which is the reduction of the difference between the EVB and the QM(ai)/MM surfaces. It should be made clear that EVB parameters (explicitly given in Appendix 1) can be fitted to experimental results and/or to reliable ab initio calculations (see e.g. Ref. 41 ). This fitting can, in principle, be done manually to any desired QM potential (whether it is semi-empirical, ab initio or hybrid QM/MM, etc). In order to facilitate this process, we adopt the minimum least-square fitting algorithm, which, to a great extent, provides an automated way to refine the default EVB parameters so as to fit them to the QM potential of interest. One should keep in mind, however, that each QM method has its own intrinsic pros and cons, which sets limitations to its applicability, both in terms of reliability and/or feasibility. Consequently, each QM potential has its own particular landscape and the FES, which may differ 11 significantly from each other, from the experimental results, and from the initial EVB potential. The refinement procedure ensures the proximity of the EVB reference potential to the QM potential of interest. This in turn ensures that the points on the EVB FES that are obtained by sampling the EVB potential during MD or Monte Carlo simulations to the first approximation already represent the QM FES (in other words that the reference EVB potential has a significant overlap in the phase space with the target potential). Thus, a major amount of computer time is saved by running MD trajectories on the EVB potential, while constructing the reference FES, since EVB calculations of the energy and its derivatives are faster than any electronic structure calculation. The first step in our procedure involves automatically fitting the EVB potential to the corresponding QM(ai)/MM potential. It is worth mentioning here that since the EVB method includes solvation effects in the energies of the diabatic states through the uncoupled electrostatic, VdW and induced dipoles terms, it can be sufficient to refine the parameters in the gas phase, and to use this parameter set in subsequent calculations (see e.g. Refs. 41 and 43 ). This is achieved by iteratively refining the EVB parameters for the reaction in solution (as well as also the gas-phase reaction). Our treatment considers the difference, , between EVB and QM/MM energies, as well as between the first energy derivatives at various configurations generated during MD runs on either the QM or EVB surfaces typically at the RS and the TS. can be represented by following the idea of Ref. 1 : 2 3 2 12 111 ,, , NN EVB QM EVB QM ii i i iij jj pr k E pr E r k E pr E r xx (1.29) where E EVB and E QM denote EVB and QM energies respectively, p is the vector of the EVB parameters, i runs over the configurations generated during the MD run and k 1 and k 2 are weighting factors which are applied to the energies and derivatives respectively. For each EVB parameter, p, we iteratively search for the value that minimizes using an augmented steepest descent approach, with some Newton-Raphson character, using the diagonal second derivatives: 22 1kk k k pp g p p p p (1.30) 12 where g is the steepest descent scaling factor, which is increased or decreased at every iteration, depending on whether the value of increases or decreases. The corresponding derivatives are calculated numerically: 2 p dp p dp p pdp (1.31) 2 22 () () pdp p p p dp p pdp (1.32) Note, however, that the relevant first and second derivatives could also in principle be evaluated analytically; however, the EVB calculations are sufficiently cost-efficient to permit the calculating of the derivatives using the numerical approach outlined in Eqs. (1.31) and (1.32). The iterative approach described above comprises the refinement procedures used to bring the EVB potential close to the QM potential, and effectively flattens the QM potential, in the sense that the high energy points of the QM potential are sampled on the underlying EVB potential. 1.4 Results and Discussion In order to best illustrate our PD approach, we will confine ourselves to a simple test case, namely the symmetric S N 2 reaction between chloride ion and methyl chloride (CH 3 Cl + Cl - , see Figure 1-3). Figure 1-3 The S N 2 reaction between chloride ion and methyl chloride. This reaction was taken as the test system in the present work. The subscripts A and B are used to distinguish between the two chlorine atoms. All calculations, unless otherwise specified, were performed using the development version of MOLARIS-XG software package 44,45 . The interface between MOLARIS-XG and the QM program (in this case MOPAC2009 46 ) was implemented through automated python scripts. 13 1.4.1 Performing the QM Calculations The test system was initially studied in the gas phase, using MOPAC2009 46 . Geometries of the RS (the dipole complex) and the TS were optimized at the PM3 level of theory 47,48 . The enthalpic contribution to the activation free energy was found to be 9.7 kcal/mol. In addition, despite its approximate nature (see discussion in e.g. 49-51 ) the harmonic entropic contribution to the activation free energy was evaluated by means of the vibrational analysis in the harmonic approximation, with the rigid rotator approximation, and was estimated to be ~1.6 kcal/mol, giving a total QM g of 11.2 kcal/mol at 298K. Normal modes calculated using PM3 were used to constrain the MD runs near the TS and to prevent the system from sliding downhill along the vibrational mode corresponding to the single imaginary frequency. Particular care was taken in order to leave other vibrational modes, and, in particular, those of high intensities, intact even with the imposed constraints. Specifically, large distance constraints of 250 kcal mol -1 Å -1 were imposed on the C-Cl atom pairs to keep the relevant atoms at a distance of approximately 2.3Å. 1.4.2 Performing an Initial EVB Study Using the Default Parameters In parallel, we performed EVB free energy perturbation/umbrella sampling (FEP/US) calculations using the non-polarizable version of the ENZYMIX force field 44 and the guess EVB parameters (see Table S1 in Appendix 1), using the resonance structures shown in Figure 1-4. That is, the initial EVB runs were performed using the standard EVB parameters, which are given in the Table S1 of Appendix 1, note that as the reaction under consideration is a symmetrical reaction, the relevant parameters for the RS and PS are identical). To improve sampling near the RS and PS, medium harmonic constraints (K=5 kcal mol -1 Å -1 ) were introduced in both states, to keep the respective atoms (2 and 6 in the RS, 1 and 2 in the PS) at a distance of ~3Å. The constraint energy was included in the corresponding diabatic energies. Additionally, an angle constraint between the Cl - -C-Cl atoms was used at both the RS and TS. The barrier height on the EVB free energy surface, EVB g , was adjusted to 12 kcal/mol (in accordance with experiment 52 ), using an H 12 of 50 kcal/mol. The resulting EVB free energy surface is shown in Figure 1-5. Figure 1-4 coordinate (e Figure 1-5 phase, prior represents th refinement, t 4 Describing the e.g. the RS and T 5 The initial EVB r to the refineme he EVB free ene the cyan arrow d e EVB resonance S), where, for eac B free energy sur ent of any EVB ergy barrier, the esignates the real e structures. The ch geometry, we u rface. The figure parameters. The e blue arrow sho l QM barrier calc 14 e figure illustrat use the two releva depicts the free e corresponding ows the QM bar culated in MOPA tes the nature o ant diabatic state energy surface o EVB parameter rier obtained wi AC2009 using the of the EVB bond es, with the indica obtained for our rs are given in T ith the LRA cor harmonic approx ding along the r ated bonding patt SN2 reaction in Table 1. The red rection but with ximation for PM3 reaction tern. the gas d arrow hout PD 3. 15 In the next step, we evaluated the relevant terms to calculate EVB QM G using the LRA approach outlined in Eq.(1.28), in which we run MD trajectories on both the EVB ( EVB E ) and PM3 ( QM E ) potentials at the RS and TS (see discussion above and Ref. 3 ). Figure 1-6 shows the fluctuations in QM EVB E E during the MD run, with the results of the calculations being summarized in Table 1-1. Figure 1-6 Fluctuations of the energy gap (E QM -E EVB ) for trajectories run on the initial EVB and PM3 potentials. The energy fluctuations (blue dots) were obtained during MD runs on: (A) the original EVB potential at the RS; (B) the PM3 potential at the RS; (C) the EVB potential with constraints (see the main text) at the TS, and (D) the PM3 potential with constraints at the TS. The averages over these fluctuations (dashed orange lines) are used to obtain the LRA estimates of the free energy of transfer from the EVB to the PM3 surface (G EVB QM ) at the RS (A, B) and TS (C, D), according to Eq. (1.28). 16 As seen from this table, we obtained a correction of EVB QM g =-7.0, and the resulting QM g is about 5 kcal/mol. The relatively poor agreement with the estimated barrier of 11.2 kcal/mol reflects the large difference between the EVB and PM3 surfaces, and the corresponding difficulties in obtaining a reliable estimate of the free energy change for transferring between the two potentials using the LRA approach when the two potentials are very different. 1.4.3 Performing the PD Refinement As was discussed above, once we reduce the difference between EVB and QM potentials, it is possible to obtain a reliable QM(ai)/MM surface. Thus, we utilized the automated PD parameter refinement procedure in order to find a set of EVB parameters that minimize the energy difference between the EVB and PM3 surfaces. It is important to note that although the refinement protocol is formally automated, we strongly believe that it is important to apply human intuition and logic to the trends and problems that could arise during the refinement procedure. That is, for instance, we noticed in particular that the parameter fitting for the diabatic states is simpler if the mixing term (H 12 ) is relatively small in the RS and PS (as, in this case, the energetics near the RS and PS have very little effect from H 12 and are determined by the parameters for the corresponding diabatic states only, as shown in Figure 1-7 A). Figure 1-7 Illustrating the effect of refining the EVB parameters by the PD approach. The blue line shows the energy difference between the EVB and PM3 potentials before the parameter refinement, while the red line is the result after a series of refinement iterations done for: (A) just the structures generated at the RS, where E EVB has no contribution from H 12 (E EVB (RS) = E 1 ) and the only parameters refined are those of EVB diabatic states; and (B) both geometries lying near the RS, as well as those lying near the TS. Here the refinement is done for H 12 , and the Cl-C-Cl angle deformation energy parameters, while other parameters are taken from the earlier iteration round. 17 Running MD trajectories on the PM3 potential at the RS and PS generates the structures used during the refinement. Once a significant improvement in the EVB parameters has been achieved, the structures generated by running MD on the PM3 potential at the TS can be introduced into the refinement procedure (note that for the system under consideration here, the Cl atoms were frozen and a 500 kcal mol -1 Å -2 positional constraint was applied to the carbon atom in the Z-direction along the Cl-Cl axis, thus constraining C-Cl bonds at 2.3Å). Here, the best form for the mixing term was found to be: 2 12 12 26 exp( ( ) ) HA r r (1.33) where r 12 is the distance between atoms 1 and 2 and r 26 the distance between atoms 2 and 6 (see Figure 1-4 and Table S2 of Appendix 1 for values of the EVB parameters). We refined the H 12 parameters and added the Cl-C-Cl angle bending contribution to the energies of diabatic states. The resulting difference between the EVB and PM3 potentials using the refined EVB parameters is outlined in Figure 1-7 B. Table 1-1 Evaluating the free energies for transferring from the EVB to the PM3 free energy surface, using the LRA approach outlined in Eq. (1.28). All energies are in kcal/mol. Default EVB PD refinement <E PM3 -E EVB > EVB(RS) 18.79 4.08 <E PM3 -E EVB > PM3(RS) -0.12 0.68 G(RS) EVB PM3 9.34 2.38 <E PM3 -E EVB > EVB(TS) 5.65 4.37 <E PM3 -E EVB > PM3(TS) -0.96 -0.36 G(TS) EVB PM3 2.34 2.00 3 EVB PM g -7.00 -0.38 With a new set of the refined EVB parameters we calculated the free energy profile shown in Figure 1-8, which gave a EVB g =12.12 kcal/mol. Calculating the corresponding LRA free energy perturbation terms at the RS and at the TS, we obtained results presented in Table 1-1. The corresponding MD runs used for LRA correction are given in Figure 1-9. The correction for 18 EVB QM g =-0.38 kcal/mol resulted in QM g = 11.74 kcal/mol, which gives much better agreement with the estimated value presented in Section 1.4.2. Figure 1-8 The EVB free energy surface obtained with the refined parameters. This figure depicts the free energy surface obtained for our S N 2 reaction in the gas phase, after the refinement of the EVB parameters. The red arrow represents the EVB free energy barrier, the blue arrow shows the barrier obtained through the LRA-correction after PD refinement, the cyan arrow designates the real QM barrier calculated in MOPAC2009 using the harmonic approximation for PM3. Figure 1-9 Fluctuations of the energy gap, E QM -E EVB , for trajectories run on the refined EVB and PM3 potentials. The relevant energy fluctuations (dotted lines) were obtained during MD runs on the refined EVB (red) and PM3 (blue) potentials at the RS (A) and TS (B). The average values are shown as dashed lines of the matching colors. 19 1.4.4 Efficiency Benchmarks As one of the main points in our discussion is the efficiency of PD in comparison to the currently popular MTD approach, it is therefore important to provide a detailed timing comparison between the two. As was pointed out in the discussion above, the most time consuming part of the QM(ai)/MM calculations are the repeated calls to the QM. Thus, the overall simulation time is roughly proportional to the total number of QM calls necessary to obtain QM g . The length of a single QM call depends on the method being used and on the specific type of calculation requested. That is, when evaluating the total simulation time for the QM free energy calculations, we need to take into account whether the calls to QM are a simple single point energy (SPE) calculation, the time required for which is denoted here as t SPE , or a longer job which also includes an additional calculation of the energy derivatives (denoted “Force” for simplicity), as well as the calculation of the charges by fitting of the electrostatic potential. We also denote the time required for a “Force” job as t Force . Here, we should keep in mind that MTD is run on the QM/MM potential, and therefore, in the MTD approach, every QM call is a “Force” job. In comparison, in PD, only half of the total QM calls (i.e. those for trajectories running on the QM surface) require more expensive “Force” calculations. Additionally, it is worth also noting that, in a gas-phase MTD study of the same system 38 , the total number of the “Force” calls required for a full QM(ai)/MM study was 2·10 6 , whereas in our PD study, we required 6000 “Force” calls, and 6000 SPE calls (i.e. a total of 12000 calls to the QM). In order to estimate the total time required for calculating an ab initio QM/MM free energy barrier, we ran jobs requesting both “Force” and SPE calculations at different levels of theory using the Gaussian03 53 software package for our system in the gas-phase. Figure 1-10 A presents t SPE and t Force required for the corresponding job types on a single core of a Quad-Core AMD Opteron 2356 CPU, with 4 Gb available memory. Of course, the calculations can be parallelized with different levels of efficiency, but since this varies between different approaches, our benchmark timings are reported here for a sequential job on one CPU. In order to make the real cost of the calculation clear to the reader, we estimated the total number of CPU-hours for a full QM(ai)/MM study by our PD approach, and compared this with an estimate for the computational cost of studying the same system using the MTD 38 approach, using: here, the respectiv respectiv Figure 1-1 seconds) for single point studied syste number of C and for a co simulation ti QM approa RmPW1PW (10) RCCSD As ment for our P 6000 are ~200 tim factor de e total time vely, and N vely. 0 A comparison o different method energy and the e em using G03 on CPU-hours requir orresponding MT ime for method 1 aches used for t 91/6-31+G(d), (6 D-FC/6-31G(d) an tioned above PD approach, e “Force” ca mes computat pends on the needed for N MTD and N P of the simulation ds, for calculation energy derivative a single core of a red for a PD stud TD study of the s 11 (60667 CPU-h he benchmark a 6) RMP2-FC/6-31 d (11) RCCSD-FC e, in the case , this involve alculations. F tionally mor e QM metho 0.5 MTD PD t t N the MTD an PD denote t times required fo ns of only the sing es (Force, red). T a Quad-Core AM dy of the test syst same system (red hours) is out of sc are: (1) PM3, (2 1+G(d), (7) RCIS C/6-31+G(d). e of MTD, t es 12000 tot From this re re efficient t od used). 20 MTD Force PD Force Nt N t t nd PD calcu the total nu or MTD and PD, gle point energy ( The calculations o D Opteron2356 C tem presented in d), which would c cale. In both case 2) RHF/6-31G(d SD-FC/6-31G(d), this involves tal QM calls, elationship, than the MT SPE t ulations are umber of Q at different level (SPE, black) as w of the energy and CPU with 4GB av this work (black comprise 2·10 6 Q es, the X-axis rep d), (3) RMP2-FC (8) RCISD-FC/ s 2·10 6 “For , of which 6 it can be se TD approach designated QM calls for ls of QM theory.( well as calculation d its derivatives vailable memory. k), comprising a t QM calls, based o presents the relev C/6-31G(d), (4) R /6-31+G(d), (9) R rce” calls, as 000 are SPE een that our h (though, cl ( as t MTD and r each appr A) Time per QM ns that calculate b were performed . (B) The estimat total of 12000 QM on Ref. 38 . Note t vant QM approa RB3LYP/6-31+G RMP2-FC/6-311+ s per Ref. 38 , E calculation r PD approa early, the pr (1.34) d t PD , roach M call (in both the for the ed total M calls, that the ch. The G(d), (5) +G(d,p), , and ns and ach is recise 21 1.4.5 Choosing an Appropriate Reaction Coordinate At this stage, it is important to comment on our choice of RC, x, in this work. In general, the choice of RC is of crucial importance, as it is not only important to select a RC that properly describes the relevant process of interest, but also, a poorly selected RC will greatly affect the convergence of free energy calculations, as the chosen RC is often used to construct the mapping potential. Now it may be tempting to describe reaction progress by means of the popular approach of using a geometric coordinate, and, in fact, the “collective variables” that are normally chosen to define the FES in MTD tend to be geometric variables such as bond distances, angles, torsions, etc. However, while such a description may be effective in gas-phase studies, it becomes rapidly problematic when attempting to model the multi-dimensional RC in solution and in proteins (particularly as it is important in such cases to be able to efficiently capture the effect of solvent polarization). Fortunately, the PD sidesteps all problems associated with the use of a geometric RC by following our earlier treatment 36 of using as our RC the electronic energy gap (E gap ) between the two diabatic states. This approach, which we believe is at present the best RC when studying chemical processes in solution and in enzymes, is particularly powerful when one tries to represent the entire many dimensional solvent space by a single RC (see e.g. discussion in Ref. 41 ). Specifically, the energy gap coordinate, x(R,s) naturally includes the dependence on the solute geometrical coordinates, R, as well as on the solvent coordinate s, reflecting the solute- solvent interactions, which are dominated by the electrostatic term 41 . As the reaction involves changes in both R (due to changes in the chemical structure of the solute) and s (due to the response of the solvent to the change in the solute charge distribution), it is crucial to correctly represent their coupling at each point along the reaction path. The use of just the solute coordinate, R, requires extreme caution and longer relaxation times in order for the solvent to adjust to the new charge distribution on the solute atoms, and it also misses the so-called non- equilibrium solvation barrier (see Refs. 41,54 ). In contrast, the energy gap represents the entire configurational space of the system (both solute and solvent), and therefore an EVB mapping potential driving the system from the reactant to products along this coordinate allows for more efficient sampling. Additionally, the energy gap coordinate appears to accurately capture the physics of the solvent reaction field, a fact that is now recognized by many workers. In 22 comparison, it is unlikely that a blind MTD search will find the optimal solvent coordinate (see section V). This was specifically illustrated by a recent study 38 , which examined the efficiency of the energy gap as an universal RC for the simulation of chemical reactivity, using the same test case as that presented in the current work. In this work, the authors demonstrated that not only does E gap (which can be used in systems described with any quantum chemistry Hamiltonian) provide significantly more efficient sampling than a geometrical RC (see above), as well as allowing better TS localization, but, more importantly for the cases in which E QM can be reproduced within 2-3RT (which we demonstrate is the case with our PD approach), the use of E gap allows one to calculate the free energy barrier with two orders of magnitude less computational effort than performing MTD calculations on the same system. 1.5 Comparison between the PD and the MTD Approaches The present chapter gives comparison of the MTD approach (with the direct sampling of the target QM potential) and the PD approaches (utilizing the reference potential) in the specific case of QM/MM calculations. However, we would like to emphasize that the PD approach is not a specialized method that performs best in the case of one-dimensional problems. That is, like the MTD approach, it is an extremely generalized approach that can be used in countless complex situations (and is possibly even more generalized than MTD). To elaborate on the dimensionality issue, we start from a rather trivial clarification of the fact that the PD is not actually a highly specialized approach to construct a 1-D free energy profile, using E gap as a the reaction coordinate. Here, we would like to point out that while E gap is in our opinion the most reliable reaction coordinate, nevertheless, the PD approach is not limited to using this reaction coordinate. In order to illustrate this fact, we have provided the PD-refined EVB free energy profile along the conventional geometrical coordinate used to define S N 2 reactions (i.e. the difference in distance between the electrophilic center and the leaving and attacking nucleophiles respectively) in Figure 1-11. This profile closely represents the QM FES constructed using the more expensive approach of Ref. 38 in the whole range of the RC between the RS and the PS. Figure 1-1 Shown here refinement ( product state Figure 1-1 distances bet In Figure between 11 Comparison o is the free energ (blue), the MTD es (cyan), obtaine 12 2-D Free Ener tween the central e 1-12, we a the central f the free energy gy surface in the approach of Ref ed using the harm rgy Surface const carbon atom and lso provide carbon atom y surfaces obtain reaction coordin f. 38 (red) as well monic approximat tructed by EVB d the leaving and the 2-D PD- m and the le 23 ned using differen ate defined by r( as the free energ tion. FEP/US after th attacking nucleo -refined EVB eaving and a nt approaches fo (C-Cl A )-r(C-Cl B ) gy values of the he PD refinement ophiles respective B FES (defi attacking nu or the S N 2 reactio obtained by: EV PM3 optimized t. The surface is ely (i.e. r(C-Cl A ), r ined in terms ucleophiles), on studied in thi VB FEP/US after reactant, transiti defined in terms r(C-Cl B )). s of the dista , where it ca s work. the PD ion and s of the ances an be 24 seen that the coordinates of the relevant stationary points (RS, TS and PS) are close to the PM3 optimized values (i.e. 1.806/2.843, 2.189/2.189 and 2.843/1.806Å respectively). Both surfaces were, as mentioned, obtained using EVB FEP/US after performing the full PD refinement, for the test system considered in this work (see Figure 1-3). Once the PD refinement has been performed, it is trivial to obtain the 2-D FES by post-processing a single computational job, in contrast to the MTD approach, which would require a separate run for each choice of RC. 1.6 Quantifying the Paradynamics Refinement We also demonstrate the refinement of EVB parameters for another S N 2 reaction between 1,2 – dichloroethane and acetate anion. Here we performed the refinement of the EVB parameters using the energy derivatives and a high-level of theory DFT functional omegaB97X-D and calculated the corresponding EVB free energy surfaces Figure 1-13 Steepest descent minimization of the least-square function of Eq. (1.29). (LEFT) The steepest descent least-square function minimized with respect to a set of EVB parameters by the optimal steepest descent algorithm implemented in the development version of MOLARIS, here it’s is comprised of the energy derivatives. The geometries for the refinement are taken along the reaction path on the EVB mapping potential for DCE + AcO(-) (RIGHT) Difference between the energy derivatives before (RED) and after (GREEN) refinement for DCE + AcO(-) system for the geometries generated near the reactants state and near the transition state on the wB97X-D potential. For a large system, however, the number of parameters grows very rapidly and it is becoming important to identify those, which contribute a lot to the least-square function. Furthermore, sometimes it might be better to refine the vector of EVB parameters, p, in the optimal steepest descent approach with the result of k+1-th iteration given by: 25 1kk k sgrad pp p (1.35) Major efforts were invested in automatization of the parametric refinement of EVB parameters, and a significant progress was made in this direction. Still, as was pointed out earlier, this approach heavily relies on human’s chemical intuition and even with a deep understanding of the EVB method might require a heuristic approach. Often a significant amount of human time has to be invested in order to refine parameters for the EVB reference potential. As the computer time becomes more affordable, other, computationally more expensive, potentials of higher levels of theory can be used as a reference, and the PD refinement scheme has to be generalized for any reference potential. Figure 1-14 2-D Free Energy Surface constructed by EVB FEP/US after the PD refinement. (LEFT) EVB 1D FES after PD refinement (blue) for DCE + AcO(-) in the gas phase (RIGHT) 2D EVB FES after PD refinement relative to the omegaB97X-D/6-31+G(d) potential. QM optimized values for r(C-Cl) and r(C-O) in Angstroms for RS(PS) are 1.820/3.207 (3.358/1.457); for TS - 2.321/2.041. 1.7 Concluding Remarks The present chapter clarifies the formulation and implementation of the PD approach, and compares it to the currently popular MTD approach, which was chosen merely as a recently developed simulation protocol involving both an interesting sampling strategy coupled to the particular way of calculating the free energy surfaces. In next chapters, we turn to the cases of calculating the free energy surfaces using the direct sampling combined with the umbrella 26 sampling approach. Here, we highlighted the conceptual similarities between the idea of refining the reference potential and the iterative construction of the negative of the target potential in the MTD approaches, as well as to demonstrating the actual computational requirements of each approach. We demonstrate that the PD approach provides major savings in computational time due to the fact that it requires significantly fewer QM calls to accurately evaluate the activation free energy than the MTD approach. That is, since the main contributions to the barrier height are calculated by the EVB FEP/US approach, where MD is run on the EVB mapping potential, the calls to the QM program are only needed in order to evaluate the free energy difference between the EVB and QM surfaces at the key inflection points on the RC, namely the RS and TS. It should also be pointed out that, to a large extent, the advantage of the PD approach comes from the wealth of chemical information provided by the EVB treatment (as mentioned above). However, the general idea of using a simple reference potential can also by exploited by an adaptation of the proposed approach in which, rather than using EVB as was done in this chapter, the reference potential is evaluated by means of a fast MO semi-empirical approach 55 . Interestingly, as will be demonstrated in the next chapter, the PD option of refining the reference potential can also include using Gaussian type functions that will help minimize the difference between V 0 and V’ 27 2 Chapter 2. Exploring, refining and validating the Paradynamics model 2.1 Introduction Modeling of chemical processes in the condensed phases using QM/MM 56 technique and the simplified reference potential (RP) approach 57 long ago became the mainstream in computer simulations. The recent methodological advances in the electronic structures calculation, in parallel computing and the increased computer power have opened new opportunities for the predictive ab initio QM/MM simulations and for understanding chemical processes in polar environment on a molecular level 5,58-60 . One of the recent advances is a powerful and time- efficient computational scheme for the QM(ai)/MM free energy calculation using the RP approach 61-63 which was implemented in the Paradynamics (PD) model 64 . The PD model is based on a number of earlier works (besides those given in ref. 1, 2). First, on those which have established the way of efficient sampling of the reference potential (RP) using the Empirical Valence Bond 65 (EVB) method combined with the evaluation of the free energy surface by a specialized combination 66,67 of the Free Energy Perturbation (FEP 68 ) and Umbrella Sampling (US 69 ) protocols, with the energy gap as a reaction coordinate 66,67 . Second, on those which introduced the idea of using the EVB RP and the FEP/US approach in semi-empirical 62 and ab initio 61 QM/MM free energy calculations. Third, on the ideas of refining the EVB RP 63 to minimize the energy difference between the EVB RP and the target QM(ai)/MM potential (TP), and using the Linear Response Approximation 70,71 (LRA) for calculating the FEP, while moving from the RP to the TP at the transition (TS) and at the reactant states (RS). The refinement of the EVB RP is important for fast convergence of the FEP in LRA approach. That is, evaluation of the activation free energy barrier by the PD approach using the EVB RP was estimated 64 to require ~200 times fewer calls to the TP compared to MTD studies involving a direct sampling of the TP 72 . The PD approach provides a rigorous way of dealing with the high computational cost of sampling on the QM(ai)/MM potential necessary to get the accurate free energy barrier. In this chapter, we report further progress and advances of expanding the potential of this approach, clarifying its assumptions and introducing several practical advances. 28 At first, it will be shown that moving from the RP to the TP in the PD scheme can be done with both the linear response approximation and the full multistep free energy perturbation. Thus, we will demonstrate that the PD approach is not limited by validity of the LRA, which will also be quantified. Secondly, an alternative strategy to the iterative PD approach 64 based on refining the EVB RP will be put forward. This leads to generalization of the RP functional form used in the PD scheme. As was already mentioned, in some cases the parametric refinement of the EVB parameters might not be fully robust. In order to overcome this problem we developed an approach that at least in some cases can be very effective in constructing and refining the RP in alternative, more general way. This new approach involves fitting the potential energy scan (PES) along the specified reaction coordinate (RC) in the gas phase (or in the implicit solvent) for both the RP and the TP by Gaussian functions. The refined RP is constructed as the original RP modified with the correction equal to the difference between the TP and the RP approximated by the Gaussian functions. An additional step of refinement in case of the EVB RP involves replacing the original vectors of charges for the EVB diabatic states (Reactant State (RS) and Product State (PS)) with the vectors of charges fitted to reproduce the solute charge distribution given by the TP in presence of solvent. Third, in this chapter we will look into the problem of increasing the efficiency of sampling in PMF calculations, and also examine the efficiency of calculating the free energy perturbation at the TS. We will provide a clear recipe of moving from an arbitrary reference potential to an arbitrary target potential in a quantitative manner using removable constraints on the nuclear reaction coordinate. This new advance brings the PD model to the quantitative level of agreement with the PMF calculation involving the direct sampling of the target potential what will be demonstrated to quantify and validate the PD model for a chemical reaction in the condensed phase for a general MO-based QM/MM reference potential. The original RP approach 17 was adopted and implemented in a number of methods for the QM(ai)/MM calculations of activation free energies in enzymes 73-75 and in aqueous solutions 76 , free energies of solvation 77,78 , binding 79 and of the conformational changes 80 . For example, in one of the recent approaches, a semi-empirical AM1/MM potential was used as the RP to do the 29 configurational sampling and to construct the reference Free Energy Surface (FES). Then, at a number of points along the chosen RC, optimization on a DFT QM/MM potential was performed yielding the least energy reaction path. The entropic contribution was approximated by the difference between the activation free energy and the activation energy obtained on the AM1/MM RP. Furthermore, single point energy calculations on the optimized DFT QM/MM least energy path, using CCSD 73 or MP2 74 , were proposed to provide the activation free energy for a high level ab initio QM/MM surface. Another technique, introduced by Yang and coworkers 81 , and adopted elsewhere 82 , provides an estimate of the QM(ai)/MM free energy profiles with a fixed solute at the predetermined RC (which is determined, for instance, through iterative sequential optimization of the QM and MM regions independently, where in each step one of the regions is kept fixed while optimizing the second 81 ). This is followed by a single-step FEP from the MM-type RP to a QM/MM TP (with averaging on the RP). The reference FES is constructed by sampling on the RP. The MM-type RP (for which the ESP-derived solute charges are updated periodically with QM/MM calculations), suggested by Yang and coworkers, seems to be capable of capturing both the fixed solute polarization and the solvent polarization in a physically consistent way. An extension of this approach involves single point energy evaluations 75,83 on a high level QM/MM, performed for the fixed solute geometries, similar to the approach used in the earlier mentioned work 73 . The energy difference between the DFT QM/MM and the high-level QM/MM was assumed to approximate the FEP while moving between these two potentials 82 . In another recent work, another semi-empirical RP, PM3/MM 80 , was used to construct the reference FES, followed by a single step FEP from the RP to the DFT/MM TP with an average being calculated on the RP. The overall RP results, obtained in the above studies, clearly indicate that this approach is a powerful and versatile tool, which allows one to overcome high cost of the direct sampling on the QM(ai)/MM potential. However, there is a great potential for improvements of some of the existing RP treatments, especially considering the fact that some of them reflect overlooking our earlier, and arguably more consistent, RP treatment. For instance, one can foresee certain disadvantages of the minimization done during the PES on the multidimensional DFT QM/MM surface, as it is expensive and rather computationally inefficient, besides it can lead to multiple 30 local minima. One has to keep in mind that the entropic contribution to the activation free energy barrier can be different for the TP and the RP, and that the activation energy is not equal to the activation enthalpy. Another vulnerable point is the slow convergence of the single step FEP, when the averaged is taken only on the RP. This is especially problematic when the difference between the TP and the RP is significant (and their overlap in the phase space is small). In this case, sampling on the RP poorly represents the relevant region of interest on the TP. The above problems are largely eliminated by the PD approach due to, mainly, the following features; (a) using the end point LRA that takes averages both on the RP and on the TP, and (b) using the PD refinement, which ensures the faster LRA convergence due to the higher overlap between the TP and the refined RP. These points will be illustrated and discussed below. An important element of the PD approach is the idea of refining the RP so it will be as closed as possible to the TP. Our main refinement strategy starts with evaluation of the energy gap between the RP and the TP for the geometries generated by MD trajectories propagated at the RS and at the TS on the RP as well as on the TP. This step is followed by the least-square minimization of the energy difference (plus other quantities). Although this approach is very effective, we found out that in cases when the improvement of the EVB requires a new functional we can get an effective refinement by adding Gaussian functions to the RP in a way that minimizes the difference between the RP and TP along the RC. A somewhat similar refinement strategy has also been introduced in other context 84 . A related idea of deforming the original potential to eliminate local minima was proposed by Scheraga and coworkers 85 in order to find the global minimum on a multidimensional surface. Use of a single negative Gaussian in the bias potential was also found to be useful in the MC study performed by Jorgensen and coworkers 86 , however, it was for a rather stand-alone, specific case. A more general strategy of improved sampling by iteratively fitting the negative of the original potential with Gaussians in The Local Elevation Method 87 was formulated by van Gunsteren and coworkers. This idea was adopted, generalized and popularized in the Metadynamics (MTD) approach of Parinello and coworkers 27 . As will be clarified below, unlike these approaches ours does not require expensive iterative sampling while building the negative potential. This step is accomplished by fitting the PES, and using the negative potential approximated by Gaussians in our FEP/US PMF 88 31 calculations. For PMF calculations in the explicit solvent models, to circumvent the high cost of the PES, we additionally proposed using the EVB type solvent driving potential, which polarizes solvent by the solvated solute charges along the reaction path, while using the negative potential derived in the gas phase PES. In summary, we do two separate applications with the Gaussian functions: 1. In the PD refinement, where we perform two PES’s (on the RP and on the TP), fit them both with the Gaussian functions, and take the difference as correction to the original RP; and 2. In the improved sampling strategy: where we perform PES to obtain the negative of the original potential, which is used to create the flat mapping potential in FEP/US PMF calculations. The sampling approach is augmented by the EVB-type solvent driving potential for the condensed phase calculations with the explicit solvent models. At any rate, our Gaussian refinement strategy is as related to the MTD as our idea of building the RP that will be as similar as possible to the TP. As we clarified in our previous work 64 this early RP idea is formally identical to the MTD idea of building the negative of the TP, but allows for more effective implementation. 2.2 Methods and Results 2.2.1 Paradynamics approach with an arbitrary reference potential The central idea of the PD (RP) approach is that the extensive configurational sampling required to calculate the QM(ai)/MM free energy barrier is done on a computationally inexpensive reference potential 61-63,71 , rather than directly on the expensive target QM(ai)/MM potential. Our standard approach, described in Chapter One, involves a construction of the free energy surface (FES) using the EVB 65 method and the FEP/US technique 67,69 . This is followed by evaluating the FEP using the LRA approach 44 , while moving from the EVB RP to the TP at the RS and at the TS. In Figure 2-1 we generalize the RP scheme to include an arbitrary reference potential. In considering the activation barrier for chemical reactions we follow derivation of the previous work 71 and start from the expression for the rate constant, k, in order to clarify the difference between the free energy profile (or the PMF), Δg, and the free energy of system in a particular state, ΔG. That is, the multidimensional rate constant can be expressed as 36,89 : 32 exp exp 22 exp TS TS g kG gx dx (2.1) Where κ is the transmission factor, is the velocity along the RC, , is the width of the TS region, RS g is the value of the free energy profile (or PMF ) at the minimum at the RS ( RS x ) and g is its value at the TS ( x ). Here G is defined by: exp exp 1 exp RS RS gg G gx g dx 1 ln exp RS RS Gg g kT gx g dx (2.2) where the last term of Eq. (2.2) provides a convenient way of incorporating the entropic effect of the ground state into the TS-theory rate constant 36 . For the free energy of the RS we also have: () gx GRS eedx (2.3) In order to stress out this conceptual difference, we enforce in this work the notation with the lower case Δg( ξ) and the capital ΔG, as respectively the value of the free energy profile or PMF at a particular RC value (e.g. at TS, RS) and the free energy of a particular state (e.g. RS or TS defined by Eq. (2.3) or by ). The second key idea of the PD approach, which drastically improves its computational cost, is use of the LRA approximation to calculate the free energy differences between the RP and the TP shown in Figure 2-1. Figure 2-1 free energy s free energy estimate the For exam can write 1 2 REF E T g E where the Finally, t it closer This wa framewo are refine 1 The thermodyn surface, Δg( ξ), ob surface for the t free energy of mo mple, in the e 71 : TARG TARG TARG EE TARG E e q’s are the the third imp to the TP an s discussed rk for the PD ed by seekin 1 , N i k pr amic cycle used i btained for the r arget potential, oving from the re case where ln T TARG E E REF E q kT q E partition fun portant comp nd ensures a in details D refinemen ng the minim 1 , N EVB i E p r in the Paradynam eference potentia for which we wa eference potential the TS posit TARG TARG REF TARG TARG E E E TA E E nctions at th ponent of the fast converg in Chapter nt procedure mum of the le QM i E r r 33 mics approach to al with the corres ant to evaluate l to the target pot tion is ident ln exp TARG ARG E kT he specific R e PD approa gence of the 1. The abo e, where in c east square fu 3 2 2 11 N ij k o calculate the fre sponding barrier the free energy b tential at the RS a tical for the p TARG TARG REF E E E RC values. ach is refinem e FEP calcul ove LRA case of the E function 64 : EVB i j E x p ee energy barrier , Δg ≠ , (blue arro barrier (red arro and at the TS (sh two free en TARG REF EREF E E kT ment of the R lation from t expression EVB RP the , Q i j E x p r rs. The blue curv ow); the red curv ow). To achieve t own by black arr nergy surface TARG F E E (2 RP, which b the RP to th also provid EVB param 2 QM r (2 ve is the ve is the that we rows). es we REF E .4) brings e TP. des a meters .5) 34 by either using a simplifed Newton Rhaphson approach where we refine one EVB parameter, p i , at a time with the result of k+1-th iteration given by: 22 1 ii kk i i k pp g p p (2.6) or just by refining the vector of EVB parameters, p, in the optimal steepest descent approach with the result of k+1-th iteration given by: 1kk k sgrad pp p (2.7) Similar schemes for refining a semiempirical potential can be realized. However, for the sake of generality we will establish a novel generalized refinement strategy. We will consider below the refinement scheme utilizing the EVB RP, and also provide a benchmark example of using a more expensive (compared to EVB) semi-empirical RP to demonstrate that the PD approach is general as far as the RP is concerned. 2.2.2 Validating the LRA by a full FEP treatment 2.2.2.1 The case of 2 general potentials The LRA approach is important for the cost effectiveness of the PD approach. Its overall convergence may be problematic when the difference between the RP and the TP is significant, what is adressed by the PD refinement. Nevertheless, the validity of the LRA treatment does not limit the application of the PD approach. LRA is just the end point approximation of the full FEP treatment which can be easly implemented by: 1 1 1 ln exp m n mm FEP m E EE GkT kT (2.8) where n is the number of simulation windows, between which the perturbation parameter λ (here weight of the TP) is changing from 0 to 1, and E m is the mapping potential given by: 1 m m REF m TARG EE E (2.9) E REF and E TARG designate the RP and the TP, respectively. Better convergence is obtained, however, if we take the average of the forward and backward FEP (see e.g. 88 ): 1 11 12 1 ln exp ln exp 2 m m nn mm m m FEP mm EE EE E E GkT kT kT kT (2.10) 35 With the accurate results of the FEP method we can explore the validity of the end point LRA, which gives the relevant G by 70 : 1 2 REF TARG LRA TARG REF TARG REF EE GE E E E (2.11) Furthermore, to demonstrate how performance of LRA is affected by the difference between the two potentials we also examined the n-points LRA (which is also called the slow growth approach) between the adjacent mapping potentials: 1 1 11 1 1 2 mm n n LRA m m m m EE m GEE EE (2.12) Obviously, with a large n this expression converges to the regular FEP. In the present case we start our examination of the validity of the LRA treatment by evaluating the FEP and LRA perturbations between the PM3/MM 47 and PM6/MM potentials of the MOPAC2009 46 package, with the correct treatment of the solute polarization 56,62 described in below. This is done for the S N 2 reaction in water between methyl chloride and chloride (here PM3/MM is the RP, and PM6/MM is the TP). For the FEP treatment at the TS we used (in addition to the QM/MM potential) the harmonic potential: 2 0 CONS EK (2.13) where ξ is the reaction coordinate defined as ()() AB RCCl RCCl (2.14) Note that 0PM3 0PM6 0 xx We also evaluated the free energy by the thermodynamic integration method (which is very similar to the regular FEP), using: 1 1 2 0 1 2 mm n TDI TARG REF TARG REF EE m G G d EE EE (2.15) The evaluations of ΔΔG were done between the potentials REF E and TARG E , corresponding to PM3/MM E and PM6/MM E which were contained to the TS region by: Similarly Performa remarkab those of between in order harmonic values (s the vicini Figure 2-2 dashes), a sin that the resu line), as well the insets). REF E TARG E y for the RS REF E TARG E ance of the bly good per the full FEP the RP and t to keep the c potentials u specified by ity of these p 2 Estimating the ngle point LRA a ults of the calcula l as the N-points PM3/MM E PM6/MM E we have: PM3/MM 1 E PM6/MM G E FEP and L rformance o P treatment. the TP, whic e trajectories used to cont the correspo points. free energy of m averages of the e ations made by f LRA (black trian 0 100 0 100 0 0 100 R 0 100 LRA calcula of the end-p Note that t ch were mod s at the spec tain the samp onding ξ 0 ) a moving from E PM3 nergy gap, E PM6/M forward FEP (m ngles) and TDI (o 36 0 2 0 0 2 0 0 2 1.75 RS 0 2 1.75 RS ations is com oint LRA tr these are the dified by add cific RC va pling at the R at which the 3/MM to E PM6/MM a MM -E PM3/MM , on E agenta squares), orange stars) are mpared in F reatment wi e estimates o ding the con alues (RS an RS and TS h FEP was ca at the TS (A) and E PM3/MM (red dash backward FEP hardly distinguis Figure 2-2, ith nearly id of the free e straining har nd TS). Not have no bias alculated, an d at the RS (B) b hes) and on E PM6/ (cyan triangles) shable due to the (2 (2 where we dentical resu energy differ rmonic poten te, also, tha s effect at th nd small effe by: 2 points LRA /MM (green dashes and their averag e overlap (also sh 2.16) .17) see a ults to rence ntials at the he RC ect in A (black s). Note ge (blue hown on 37 Of course, the convincing validation requires us to show that the barrier calculated for the TP using the PD approach is the same as the barrier obtained in a separate PMF calculations. In order to be able to compare the ΔΔg of the PD treatment to the corresponding difference between the full PMF calculations we have to take into account the fact that what we get from the PMF corresponds to the partial partition functions, q’s, of Eq.(2.4), whereas the FEP simulations described above sample the whole phase space. Thus strictly speaking in order to compare with the difference between the individual PMFs we have to determine ΔΔg( ξ) through the difference between the free energy functions of RC (at its specific values with removing the effect of the constraining harmonic potential). This, in turn, can be compared to the ΔΔG estimates obtained by the FEP treatment between the unbiased potentials plus the corresponding harmonic constraints. This can be done with a treatment, which is a variant of the EVB FEP/US mapping formulation 66,67,90 . Namely: PM3/MM PM6/MM PM6/MM PM3/MM gg g (2.18) Where the free energy functions are given by PM3/MM PM3/MM ln exp m m m E EE gGkTx kT (2.19) and PM6/MM PM6/MM ln exp m m m E EE gGkTx kT where PM3/MM PM6/MM 1 mm m CONS EE E E In this expression we deal with the change from E REF to E TARG (in the EVB treatment we go from E 1 to E 2 ), and ΔΔG m is the result of moving from E PM3/MM +E CONS to E m by the FEP procedure. The second term removes the effect of the constraining potential and selects the particular values of RC. To demonstrate the point discussed above we start by building a histogram for the MD trajectories to see distribution of the energy gap between the two potentials, (E PM3/MM - E PM6/MM ). As can be seen in Figure 2-3 (A), the distribution of the energy gap for the MD trajectories propagat centered kcal/mol RS and between These est (see Figu Further e harmonic RC. Thus ξ 0 param the free distributi found tha very little Figure 2-3 calculations regions (desi FEP calculat and green re To extrac (2.18) w ed during ca at -3.5 kcal/ . These are at the TS. the activati timates are c ure 2-2) examination c potentials s, one can ev meter of the h energy per ion. Note, h at while incr e effect on th 3 (A). The distri of moving from E ignated by red an tion of moving f espectively). ct the releva with the nuc alculation of /mol. Simila the most pr Adding up on free ener close the val ns of the dis effectively n valuate the f harmonic po rturbation is owever, that reasing the f he distributio ibution of the e E PM3/MM +E CONS ( ξ) nd green respectiv from E PM3/MM +E C ant ΔΔg( ξ) w clear RC. T f the free ene arly, the corr obable diffe these estim rgy barriers lues obtaine stribution of narrowed the free energy p otential. Whi estimated, t all the oth force constan on of the ene nergy gap, E PM3 ) to E PM6/MM +E CON vely); (B) The sam CONS ( ξ) to E PM6/MM which can b his gave th 38 ergy of mov esponding g erence betwe mates also gi for the RP d by perform f the nuclea e sampling a perturbation ile the ξ 0 pa the force c her solute de nt further na ergy gap and 3/MM - E PM6/MM , o NS ( ξ), where ξ=d(C mpling distributio M +E CONS ( ξ) in the e compared he free ener ving from th gap evaluated een the two ives the mo and for the ming the FE ar RC in Fig at the corres s at any valu arameter spe constant con egrees of fre arrowed the d almost no e obtained from th (C-Cl L )-d(C-Cl A ) is on of the nuclear RS and in the T to the corre rgy function he RP to the d near the T potentials c ost probable e TP, which EP between t gure 2-3 (B sponding tar ue of the RC cifies the RC ntrols the w eedom are s distribution effect on the he MD trajector s the solute RC, r RC, obtained fro TS regions (whic esponding P ns of the R TP at the R S is centered constrained a e total differ h is 7.5 kcal the two poten B) shows tha rget values o C specified b C value at w width of the ampled. We in the RC, i e FEP estima ries propagated in the RS and at om the MD runs ch are designated PMF we used RC, Δg PM3 ( ξ) RS, is d at 4 at the rence l/mol. ntials at the of the by the which e RC e also it had ate. in FEP t the TS used in d by red d Eq. ) and Δg PM6 ( ξ) the TS ( ξ demonstr energy o contain s between Figure 2-4 the full mapp weighted ave Figure 2-5 ESP-fitting ( procedure. , in the vicin ξ=0) (~ -4) rated that th of moving fr sampling at the free ener 4 The free energy ping between PM erages over all ma 5 The PMF obtai (blue) and for PM nity of the R and at the R e estimates rom the RP the RS and rgy function functions of the M3/MM and PM6 apping potentials ined for the PM3 M6/MM potential S and TS. In RS ( ξ=-1.75 obtained by to the TP d at the TS) ns at the corr solute RC, ξ, obt 6/MM. The blue li s, blue and red do 3/MM potential w with the Mullike 39 n Figure 2-4 5), which ar the full FEP (which are ) are in an e responding R ained at the RC v ine designates Δg ots, respectively. with the solute ch en solute charges one can read re ~ -4 and ~ P and by the modified by excellent ag RC values. values correspon g PM6/MM and the r harges derived by (green). The PM d the corresp ~ 3.5, respe e LRA evalu y the harmo greement wi nding to the TS (A red line designate y the Mulliken a MF’s have been ca ponding valu ectively. Thu uation of the onic potentia ith the differ A) and to the RS es Δg PM3/MM , obta analysis (red) and alculated by the W ues at us we e free als to rence (B), for ained as d by the WHAM 40 Finally, we compared our careful evaluation to the difference between the PMF’s for PM3/MM and PM6/MM shown in Figure 2-5. Each PMF was obtained by introducing the harmonic constraint at the specific RC values, ξ 0 , along the reaction path. The free energy was calculated by the WHAM 91,92 equations. Inspecting the PMF’s one finds that the difference between the activation free energy barriers is about 7.5 kcal/mol, which is in a perfect agreement with the results obtained for the difference between the potentials by the FEP treatment of Eq. (2.10) and the FEP/US treatment. Thus in the above section we demonstrated, using two general MO-type potentials, that the LRA approach gives a very good approximation for the full FEP/US treatment, while giving a significant time savings compared to the full FEP treatment. Also note with the difference between the RP and the TP of ~ 4 kcal/mol, the LRA approach performs remarkably well when the averages are calculated on both potentials. 2.2.2.2 Using the EVB as the reference potential Next we will expand on the FEP-LRA evaluation of ΔΔG for moving from the EVB RP to a QM/MM potential. In this case it is useful to comment here about evaluation of the free energy perturbation at particular values of the RC, when one uses EVB as the RP. Now, as we have shown in the previous example, the harmonic constraint on the solute RC is quite efficient for estimating the FEP at a particular RC value for a MO-based QM/MM potential (where it’s enough to have K=100 kcal/(A 2 mol) to stay at the specified RC). For the EVB potential we found that a more effective way to keep EVB at its TS (where E 1 =E 2 ) is obtained by using the approximation 93 : 12 12 0.5 0.5 EVB EVB ExE EH (2.20) where we use the fact that Eq. 20 improves overlap with the adiabatic EVB at the TS: 22 11 2 2 12 12 2 EVB E cE cE cc H (2.21) The use of this expression is equivalent to constraining the eigenvector components to be equal (so that the both of EVB diabatic states equally contribute to the adiabatic state). Figure 2-6 line), 2 point Note that the line), as well the insets). Here we the poten E and at the E The estim approxim both the ~ 6 kcal/ PM3/MM In the ne for the F probable at the TS 6 Estimating the t LRA (black das e results of the ca l as the N-points estimate the ntials: REF EVB E E e RS using: REF EVB E E mate of ΔΔG mation to the RP as well a /mol, thus th M activation ext step we c FEP calculat difference b S, which prac free energy of m shes), LRA avera alculations made b LRA (black trian e FEP betwe 100 EVB x 0 100 G by the end e full FEP ( as on the TP he activation free energy constructed t ions at the R between the ctically coinc moving from the E ages of the energ by forward FEP ( ngles) and TDI (o een the EVB 0 2 0 0 0 2 1.5 and d-point LRA Figure 2-6). 94 . The FEP n free energ barrier base the histogram RS and at th TP and the cides with th 41 EVB to PM3/MM gy gap, E PM3/MM -E (magenta squares orange stars) are B and MO-ba and TARG E PM3 TARG E E A approach w . However, at the TS an gy barrier fo ed on this est ms of the en he TS. From RP, E PM3/MM he full FEP e M at the TS (A) an E EVB , on E EVB (red s), backward FEP hardly distinguis ased PM3/M PM3/MM 10 E 3/MM 100 was found t we see the nd RS are, re or EVB is ~ timate. ergy gap E E m these histo M – E EVB , is ~ estimates an nd at the RS (B) d dashes) and on P (cyan triangles) shable due to the MM potential 0 0 00 0 2 0 1.5 to provide ag importance espectively, ~ 4 kcal/mol EVB - E PM3/MM ograms we s ~7 kcal/mol d with LRA in water by: FE E PM3/MM (green d ) and their averag e overlap (also sh ls at the TS u 2 0 (2 2 gain a very e of averagin ~2 kcal/mo l higher tha M (see Figure see that the at the RS an A. EP (blue dashes). ge (blue hown on using .22) good ng on ol and an the e 2-7) most nd ~2 Next, usi ξ=0, is ~ the value ~ 3 kcal/ potential function The diffe the FEP e Figure 2-7 RS (green) a . PM3/MM g EVB g ing Eq. (2.2 ~5 kcal/mol a e found from /mol higher given by E of the RC fo EVB(x ) EVB g erence betwe estimate of 2 The distribution and of the gap bet m GkT ln m GkT 3) we find t and at the RS m FEP at the than the co Eq. (2.20) i or the approx m G k een the free 2 kcal/mol. ns of the energy ga tween the EVB m ln T x ne x that the diffe S ( ξ=-1.5) is RS, the diffe orresponding in the FEP ximate EVB ln kTx energy func ap between the E mapping potential 42 PM3 exp E EVB exp E E kT ference betw s ~6 kcal/mo ference betw g FEP estim calculation, potential (s EV exp E tions at the T EVB potential and and PM3/MM w 3/MM m E E kT m m E E ween the free ol. (See Figu een the free mate. This pr , since if w ee Figure 2- VB(x ) EVB m E kT TS is about d the PM3/MM p while doing FEP at m E e energy fun ure 2-8). Whi energy func robably is d we construct -9) at the TS m E 3 kcal/mol, potential, obtained t the TS (red). (2 nctions at the ile this is clo ctions at the due to use o the free en : (2 which is clo d while doing FEP .23) e TS, ose to TS is of the nergy .24) ose to P at the Figure 2-8 potential to while the red Figure 2-9 TS. The blue (weight-aver In any ca kcal/mol Eq(2.24) 8 The free energ PM3/MM in wat d line designates Δ 9 The free energy e line designates rage over red poin ase, the corr , based on ) is -4 kcal/m gy functions alon ter at the TS (A) Δg PM3/MM (weight- functions along the EVB functio nts) rection to th the FEP/LR mol. In Figu ng the reaction c and at the RS (B -averaged over re the reaction coor n (weight-averag he PM3/MM RA (with th ure 2-10 the 43 coordinate ξ=d(C B). The blue line ed points). rdinate for FEP f ged over blue poi M free energy he approxim EVB free en C-Cl L )-d(C-Cl A ) ob designates Δg EVB from EVB mappi ints), while the re y barrier bas mation of E nergy profile btained in FEP B (weight-average ing potential to P ed line designates sed on Eq. ( EVB potenti e and the PM calculations from ed over the blue PM3/MM in wate s the PM3/MM f (2.23) is abo ial at the T M3/MM PM m EVB points), er at the function out -1 TS by MF are given: th and this Eq. (2.23 reference than the b estimate and thus PMF’s o practicall solvent m and then the EVB using FE the TP is vicinity o range of Figure 2-1 dots) and for he EVB free is in a perfe 3). However e free energy barrier at PM of 4 kcal/m when the TS on both surf ly, if the RC model minim use the PMF potential to EP/US 61 as it s required (w of the TS, w the RC. 0 (A) The free en r the PM3/MM (r energy barr ect agreemen r if the EVB y surface, fo M3/MM PM mol. The a S coordinate faces at the C for the Q mization) we F (again on o the actual t is done in which can be which will st nergy profile alon red line); (B) EVB rier is ~1 kca nt with the r B free ener or which the F, one finds above approa es on the RP TS region QM TS is kn e can evaluat the RP) to o EVB TS sta Figure 2-10 e done by on till give sign ng the nuclear RC B FEP/US free en 44 al/mol highe results calcu gy profile v activation fr a good agre ach evaluate P and on the (see 71 for nown (e.g. f te the vertic obtain the fre ate. In this c . If we do n ne of the app nificant time C obtained by the nergy profile alon er than the P ulated in the vs the EVB free energy b eement with es only the v TP are diffe the full dis from the gas cal transition ee energy of case the PM not know the proaches desc e savings co e FEP/US approa ng the energy gap PM3/MM fre PD model u energy gap barrier is ~ 3 the correspo vertical free erent, we hav cussion of s-phase or f n from the Q f moving fro MF on EVB e exact QM cribed below ompared to a ach for the EVB r p between the EVB ee energy ba using FEP/U p is taken a 3 kcal/mol h onding FEP/ e energy cha ve to evaluat this case). M from the im QM TS on th om the QM T can be estim TS then PM w) but only i a PMF in th reference potenti B diabatic states. arrier, US of as the higher /LRA anges, te the More mplicit he RP TS on mated MF on in the e full al (blue . 45 2.2.3 Semi-empirical potential as a model of a QM(ai) potential Semi-empirical methods serve here as models of a general MO-based potential. Semi-empirical approaches are based on making different approximations to the matrix elements of the Roothan equations which are solved explicitly in the Hartree-Fock method 95 : Fc Sc ε (2.25) where F is the Fock matrix, c is the matrix of eigenvectors in columns, S is the overlap matrix, ε is the diagonal matrix of the orbital energies. A matrix element of Eq. (2.25), the Fock matrix is 1 || 2 ij ij lm F h P ij lm im lj (2.26) The most computationally expensive are the two-electron Coulomb and exchange integrals which have to be computed at each self-consistent field iteration: 1 012 |11 2212 ab c d ab cd j r d d (2.27) In terms of eq. (2.26) the difference between semi-empirical methods such as those used here, PM3 and PM6, and ab initio QM methods (such as Hartree-Fock), is in the basis set used to construct molecular orbitals and in the amount of integrals calculated explicitly. In the HF- approach, all integrals are calculated explicitly, while in the PM3/PM6 methods only few, with the rest of integrals being approximated by empirical functions. Using semi-empirical potentials one can easily incorporate the effect of solvent into semi- empirical QM/MM Hamiltonian 96 by adding the interaction energy of an electron with the electrostatic potential created by solvent (MM atoms) in the one-electron Hamiltonian: kj ij ij MM kj q hh r (2.28) Electrostatic coupling between the solute (the QM region) and the solvent (the MM region) in the corner stone in QM/MM theory 36 . Thus it is crucial for capturing the correct physics (that is for the physically correct solute-solvent coupling) to polarize the solute wavefunction. For a general MO-type QM potential it is given by: / i QM MM QM QM MM ij q HH r (2.29) 46 It should be also noted that the computational cost of incorporating the solvent effect into the QM/MM Hamiltonian is small as it involves calculation of the one-electron integrals. Solvent in the QM/MM approach is also polarized by solvent which is captured via interactions of the solvent MM partial charges with the QM charges ji ij ij qq r (2.30) Note that while the solvent charges are the fixed MM charges, the solute charges reflect the effect of redistribution of the electron density in the solute. Other non-bonding coupling term, van der Waals interactions, are described on the MM level: 12 6 ij ij ij ij A B rr (2.31) When QM region is connected to the MM region via a chemical bond there are few assumptions needed to be made. First, is to ensure the integer charge and its conservation, since the QM region must have an integer charge one has to make sure that on the MM level the charge of the QM region is also integer. This can be done by parameterizing the charge distribution of the MM force field by dividing the MM entries (which have the integer charge) into groups with integer charge and by including all atoms of electroneutral groups in the QM region. Second, it gives rise to the bonding coupling between the QM and the MM regions which is treated in terms of MM force field. That is if there is a bond between a QM atom I and a MM atom j it is described as: If there is an angle formed by atoms i, j, k and either i or j are quantum atoms the angle should be included in the MM forcefield (if i, j or j,k are QM atoms the angle is not included) Finally, if there is a torsional angle formed by atoms i, j, k, l then if 2 or less or them are the QM atoms then a torsional angle should be added to the MM force field: The valence bond is closed by adding a hydrogen atom (link atom) to the QM region 1 A apart from the QM link atom host. Subsequently, charge of the H-atom is redistributed over other QM atoms proportionally to their charge. And the force is distributed between the so called QM and MM host-atoms (see e.g. Ref 97 ) 2.2.4 P 2.2.4.1 In consid evaluatin is routin refineme With the Gaussian optimal s The resu where the Figure 2-1 used in the s range: -1> ξ> functions. It was fo PM3, PM PD refineme Refining the dering the PD ng the PES i nely perform nt procedure gas phase P n functions. steepest desc ult of this fit e PES was p 1 Fitting the Γ-fu simulation (blue s >1. (B): PM3 and ound that ev M6 and EVB ent of the re e intramolec D refinemen n the gas ph med nowada e. PES one can For instance cent with ana , k A tting is a fu performed: unctions. (A) Γ EVB squares) was mod d PM6 gas phase ven a combin B (see Figure ference pot cular potent nt, we noted hase for chem ays. This m n simply fit th e, for the T alytical deriv 1 N k i A unction, Γ, w TARG B (green line) fitt dified by removin energy scans fro nation of ju e 2-11 and T 47 ential by m tial in the ga d that it is po mical reactio akes PES e he TP and th P we minim vatives, with TARG i E r which appro exp k k A ed to the EVB ga ng the effect of th m MOPAC2009 st 3 Gaussia able 2-1). Fo odifying its as phase usin ossible to ta on is a relati easily availa he RP (e.g. E mize the leas h respect to p 2 TARG i E r oximates the kk x as-phase PMF/W he EVB distance c (red stars and gr ans was suff or the detail functional ng Gaussian ake advantag ively trivial able source EVB potenti st-square fu parameters o e TP in the 2 WHAM (red triang constraints for th reen triangles) an fficient to fit s of the fittin form ns ge of the fac operation, w of data for al) by a set o unction, usin of the Gaussi (2 range of th (2 gles). The EVB p he RC values outs nd the correspon t the 1D PE ng algorithm ct that which r our of the ng the ians: .32) e RC .33) otential side the nding Γ- ES for m, see 48 Appendix 2. This procedure can be done for both the RP and the TP. In the next step we use the obtained Γ-functions to refine the original RP (e.g. the EVB RP): EVB EVB EVB QM EE (2.34) This refinement procedure is aimed to bring the RP close to the TP in a given range of the RC: TARGET REF REF REF TARGET EE E (2.35) Now we have the refined RP which provides an improved approximation for the TP what is essential for good convergence of the LRA. This approach is particularly useful when the EVB functional form is sophisticated and the refinement of its parameters is lengthy and tedious. As a quick demonstration of the described refinement procedure we consider the example with the EVB potential, which parameters were refined to match the gas-phase PM3 potential. In Figure 2-12 the effect of the Γ-correction on the PMF of the RP is demonstrated. As can be seen this relatively simple functional form is quite effective in reducing the difference between reference FES and the target FES. Table 2-1 Parameters for the Γ-functions presented in Figure 2-11 composed of 3-gaussians ( A i exp(- α i ( ξ- ξ i ) 2 ) )each A1, A3 A2 (kcal/mol) α 1 , α 3 α 2 (A -1 ) ξ 1 ( ξ 3 ) ξ 2 (A) EVB -3.92 6.98 1.58 3.66 -1.1 (1.1) 0.0 PM3 -4.39 6.80 2.40 2.22 -1.0 (1.0) 0.0 PM6 -5.25 3.35 2.89 3.70 -1.0 (1.0) 0.0 PM3 cosmo -0.18 20.85 0.63 2.47 -1.2 (1.2) 0.0 PM6 cosmo -1.02 16.58 1.34 2.16 -1.1 (1.1) 0.0 2.2.4.2 Application of the Gamma ‐correction in PMF calculations The main application of QM/MM calculations is in studies of reactions in condensed phases. But prior to extension of the refinement approach proposed in the previous section to condensed phases, we would like to report several practical applications of this approach in calculating the PMF, which were found to be quite useful and can be also applied to condensed phase PMF calculations. 49 Since PMF calculations might be quite expensive it is important to look for ways to optimize the corresponding calculations. Here we tried to exploit an element of the MTD approach (although, as will be clarified in the concluding discussion, with an entirely different philosophy). That is, as pointed out by van Gunsteren and coworkers 87 and as implemented elegantly in the MTD 27 approach, flattening the original potential by iteratively adding to it the negative potential approximated by a sum of Gaussians improves the convergence of the free energy calculations. The implementation of this strategy in the MTD approach involves building the negative potential iteratively, see the detailed discussion in the recent paper 64 . However, here we apply the idea of using Gaussians for improving the sampling with idea of the PD refinement described above. In contrast, to iterative building of the negative of the real potential we derive the negative potential by fitting the PES in the gas phase by a sum of Gaussians using Eq. (2.32) and (2.33). Thus, in addition to their use in the PD refinement the Γ-functions of Eq. (2.33) help in convergence of the PMF calculations. This is done through flattening the TP by adding to it the negative of the corresponding Γ-function. That is, following the FEP/US approach to PMF 88 we construct the mapping potentials of the form: 1 mmRSmPS EE E (2.36) where RSQM QM CONS RS EE E (2.37) PS QM QM CONS PS EE E and one can for instance use the E CONS of Eq. (2.13). At any rate, the potential of Eq. (2.36) is used in n-steps FEP by changing the mapping parameter λ from 0 to 1. To get the PMF we start with the FEP approach for the corresponding change in free energy using Eq. (2.10). Then the PMF is evaluated by using the modification of Eq. (2.19): ln exp m TARG m m E Ex Ex gGkT x kT (2.38) Figure 2-1 The red line potential We also where N i i-th mapp 2.2.4.3 When m Obviousl and this approach While th perform that refle explicit 2 The gas-phase is for the EVB p found that th i ( ξ) is the nu ping potentia Direct appli oving to a c ly different Q has to be r h to the cond he straightfor the PES in ects the solv solvent mod EVB PMF obtain potential which w he results fro g umber of tim al. ication to co condensed ph QM(ai)/MM reflected in densed phase rward exten the solvent ation effects dels, which ned by using the W was fitted to PM3( om the differ frames f g es MD visite ondensed ph hase one ha M potentials w the RP. Be es in cases of sion of the models, and s, this strateg are extrem 50 WHAM approach (gas), while the g rent simulati i i frames N N ed a particul ases as to take int will have dif elow we giv f implicit an Γ-function c d subsequen gy is obviou mely expensi h and the harmon green line is for th ion windows i g lar RC value to account th fferent charg ve the exten d explicit so correction to ntly derive t usly impract ive for QM nic constraints on he PMF obtained s can be com e, ξ, while pr he solute-so ge distributi nsion of the olvation mod o the conden the correspo tical when on M(ai)/MM st n different values d for the E EVB - Γ EV mbined by: (2 ropagating o olvent interac on for the so e new refine dels. nsed phases nding Γ-fun ne deals wit tudies. How s of RC. VB + Γ PM6 .39) on the ction. olute, ement is to nction th the wever, obtaining thus prov As a firs effect by and fitte Figure 2 Section 2 efficiency RC is sho Unfortun cavity du features Table 2- Thus, we and this p Figure 2-1 adding to th COSMO solv g reasonable vide a practic st step we e y the COSMO d the Γ COS -13) The fre 2.2.4.1 with y of samplin own in Figur nately, the im uring chemic of solvent. 2). Furtherm e must focus performance 13 The PMF’s ob e QM/MM poten vation model (blu e scan is not cal option th examined th O model 98 . A SMO function ee energy pr the MD sim ng at the TS re 2-14. mplicit solve cal reaction Thus, we m more, the im s on perform e will be exp btained with expl ntial the negative ue triangles). expensive w hat will be co he option of As a test sys n to the PE rofile, Δg PM mulation run region and i ent models (even with may have sig mplicit mode mance of our plored below licit (green) and of Γ COSMO . The f 51 when one us onsidered be f using the Γ stem we too S done in M M3,COSMO , was n on the flat its uniform d have difficu a reasonabl gnificant erro els cannot c r models wi w. implicit COSMO figure also depict ses implicit s elow. Γ-function w k the same S MOPAC200 s obtained b potential, E distribution a ulties in cap le calibration ors in estima capture the ith the explic O (red) solvation ts the Γ COSMO (blu simplified so which captu S N 2 reaction 09 46 for PM by the appro E PM3,COSMO - Γ along the stu pturing chan n) and can m ates of the T physics of cit atomistic models. The PM ue line) function, olvent mode ures the solv n described a M3+COSMO oach describ Γ COSMO . The udied range o ges of the s miss micros TS solvation protein inte c solvent mo MF’s were obtain , derived from PE el and vation above O (see bed in e high of the solute copic n (see eriors. odels, ned by ES with Figure 2-1 E PM3+COSMO - does not mak In paralle derived Q E with the one-elect While th obtained very low solvation fully flat Table 2-2 C X TS PM3 ( X RS PM3 ( ΔΔG solv 14 The distributio Γ COSMO (green). ke the explicit pot el we obtain QM charges m polar E polarized P tron integral he obtained P by another w. This can b n models. Th due to the d Comparing the so fixed) fixed) on of the nuclear The histogram fo tential fully flat. ned the PMF ) by running PM3 polar H PM3/MM Ha s of Eq. (2.2 PMF (see F approach, t be explained he addition o differences in olvation energies r RC during MD or E PM3/MM - Γ COSM F with the e g on the pote PM3,COSMO amiltonian i 28) Figure 2-13) the efficienc d by compa of - Γ COSMO d n the TS solv of the COSMO a ΔG AC (0 -> Q gas -52.1 -67.8 15.7 52 D simulation on t MO shows a poor xplicit solva ential: O 1 m K implemented is in excell cy of sampli aring the bar does not mak vation shown and the explicit SC s ) , kcal/mol the potential E PM sampling at the ation model, 2 RS K d in MOPA lent agreeme ing at the T rriers in the ke the TS re n in the Tab CAAS 99 models. Δ − − 9 M3/MM - Γ COSMO (re TS for the expli , Δg PM3/MM , m K C2009 62 via ent with the TS (shown in e explicit an egion on the le 2-2 below ΔH f,COSMO - ΔH f,GA −61.1 −70.4 9.3 ed) and on the p icit model, since (using the 2 PS (2 a modificatio e PMF separ n Figure 2-1 nd in the im explicit pote w. S , kcal/mol otential Γ COSMO ESP- .40) on of rately 14) is mplicit ential 53 The PMF on the PM3+COSMO potential gave a lower free energy barrier than that obtained by the energy scan, also in the explicit solvent model we obtained a higher free energy barrier. To explain the difference in the activation free energy with the different solvent model we calculated the solvation free energies for the gas-phase optimized PM3 RS and TS using MOPAC2009 with the COSMO 98 model and by the adiabatic charging model 36 . From the results given in Table 2-2 one can see that there is a difference in the solvation of the RS and TS by the explicit and implicit models, with the highest disagreement between the implicit and the explicit solvent models for the TS solvation. Essentially for this S N 2 reaction ΔΔG SOLV is a rough approximation for the contribution to the activation free energy barrier due to the solvation difference (if we add the gas phase barrier of ~12 we get roughly the PM3/MM barrier given in Figure 2-13). The deficiencies of the barriers obtained by this specific COSMO implementation is beyond the scope of this work, since different implicit models can give different results and it is always important to compare the solvation estimates by the different models 100 . In general any attempts to obtain a reliable QM/MM surface should involve careful calibration of the model on the observed solvation energies. Evaluation of the PMF above with an implicit solvent model provides extremely powerful way of doing so. This is significant since the implicit solvent models are frequently used in studies of the reference solution reactions, and such studies require a very tedious manual mapping in order to obtain reliable estimates 101 . Mapping with an implicit solvent model does not present major problems (except missing the non-equilibrium solvation effect 94 which would lead to underestimate of the actual solvation barrier) as well as having other aforementioned disadvantages of continuum models. However, the evaluation of the PMF becomes much more challenging when we deal with the explicit solvent models. Here the selection of the proper mapping potential is not trivial as we have to force the combined solute-solvent coordinate to respond to the change in the solute charge distributions. Here we can take advantage of the fact that any RP-based QM/MM calculation in polar environments (water or enzymes), and in particular approaches that use semi-empirical MO-based QM/MM potentials as the RP 74,80 , can be refined using the ΔΓ-corrections of Eq. 54 (2.35). This correction can be derived by Eq. (2.33) with an implicit solvent model for the original RP and for the TP (e.g. COSMO for which the analytical gradients are available): ,, REF REF REF IMPL TARG IMPL EE (2.41) Even though PES with the implicit solvent is relatively expensive, it describes more consistently the solute changes along the reactions path in the condensed phases than the corresponding gas phase PES, since the solute polarization is captured in a physically more reasonable way by the implicit solvent models. 2.2.5 Extension to the condensed phase using the EVB-type solvent potential In order to implement the approach of Eq. (2.37) in a practical way (that also could be used for the improved sampling in PMF calculations) it is desirable to have a transferrable Γ-function which can be easily derived in the gas phase (or perhaps with an implicit solvent model) and somehow applied to calculations in the condensed phases without need for re-parameterization in each phase. The correction, fitted to the gas phase calculations, contains only information about changes in the solute along the reaction path. Namely, the information about the intra-molecular part of the potential. To incorporate the effect of solvent and still to take advantage of the gas- phase derived Γ-function we consider a mapping potential where the QM and the MM regions are coupled classically by : PM3,gas PM3,gas 1 map CONS RS sS RS CONS PS sS PS EE E E E E QQ (2.42) where Q RS and Q PS are the vectors of the QM solvated charges at the RS and at the PS. () sS 12 6 11 k QM MM ij ij ij k ij ij ij ij Qq A B E rr r Q (2.43) Eq. (2.43) contains the EVB-type solvent driving potential, which polarizes solvent in the correct direction towards the product state by the solvated solute charges, and captures the non- equilibrium solvent effect. However it should be applied with care to S N 1 and other charge separation reactions 66 (since essentially the QM region is described by the gas phase QM potential, which is perturbed by the MM force field without the correct solute polarization). This mapping potential allows separating the QM and MM parts for the straightforward application of the gas-phase Γ-functions and addressing the sampling problems separately (namely that the 55 intra QM barrier is flattened with the gas-phase Γ-function and the solvent is pulled towards the TS by the EVB-type potential). The correct treatment (which was used for most of the calculations in this work) of the solute polarization in the case of MO-based QM/MM potential is of course should follow the Eq. (2.29) At this stage it is interesting to see whether we can extend the approach of Eq. (2.35) to simulations of processes in polar environments. As was mentioned above, the gas phase Γ- function only contains information about the intra-molecular contributions and says nothing about the inter-molecular interactions with solvent. Eq. (2.41) with the implicit solvent model Γ- function can be a higher-level approximation but this requires further examination. In particular, it is interesting to examine whether such a correction if obtained by fitting the PES performed with an implicit solvent model can be directly used to refine the RP for enzymatic simulation. Similarly, application of the EVB-type solvent driving potential combined with Eq. (2.40) should be further explored for purposes of improving the sampling of the MO-based QM/MM RP with a correct solute polarization. A hypothetical situation of refining a general RP (e.g. a MO-type QM/MM PM3/MM potential) is considered in Figure 2-15 (A). Here the original reference potential is given by: REF PM3,gas sS PM3,gas EE E Q (2.44) (with the corresponding FES obtained after removing the bias of Eq. (2.42) using Eq. (2.38), and combining the multiple simulation windows using Eq. (2.39)). The corresponding original FES (as well as the PMF obtained using Eq. (2.29)) are given in Figure 2-15 (A). To refine this reference potential for the PD calculation of the free energy barrier on the PM6/MM potential we use: REF PM3,gas PM6,gas REF EE (2.45) with the mapping potential given by Eq. (2.42). The refined reference FES (for the potential of Eq. (2.45)) as well as the target PM6/MM FES (obtained using PM6/MM Hamiltonian) are given in Figure 2-15 (B). Figure 2-1 (polarized P sampling on E PM3(GAS) +E sS (B) Demon for the E PM6/ The refined r using the FE In actual demonstr the corre correspon The EVB EVB dia correctio term can In other w part of th Since the 15 The PMF’s ob PM3 Hamiltonian n the flat E PM3(GA S (Q PM3 ). nstrating the refin /MM potential (pol reference free ene EP/US approach f QM(ai)/MM ration purpo sponding ga nding correc B RP describ agonal elem n for the in be refined b EVB E words, in ca he EVB RP: INT e original EV EVB E btained for the C n with the solven AS) - Γ PM3(GAS) with nement of the ori arized PM6 Ham ergy surface obta for E TARG =E PM3 - Γ M studies on ose) by an ex as phase pote ction obtaine bes the corre ments 66 . Thus ntra-interacti by using: INTRA E ase of the EV TRA TARG VB RP can b 22 11 2 cE c E CH 3 Cl and Cl - S nt polarized by th h the EVB-type iginal reference p miltonian with the ained by sampling Γ PM3 + Γ PM6 +E SOLV ( ne will substi xpensive QM ential and the ed with the im ect polarizati s for the EV ons within t 2 INTRA 1 c E VB RP, the g EVB be partitioned 21212 2 E cc H 56 S N 2 reaction in w he ESP charges) solvent driving potential, E PM3(GAS e solvent polarize g on the flat E PM3( (Q PM3 ). itute the PM M(ai)/MM p e Γ-function mplicit solve ion of solute VB RP the the solute. M 1, RS TARG sS E Q gas phase co d as : INTRA 1 Ec water. (A) (Red) using the WHA potential using t S) +E sS (Q PM3 ), to th ed by the ESP cha (GAS) - Γ PM3(GAS) wit M6/MM TP (w potential. It n of Eq. (2.45 ent model (e e by includin Δ Γ-correct Moreover, t 2 22, P TA sS cE Q orrection, Δ Γ 20 11, RS sS c E Q PMF constructe AM approach. (B the FEP/US app he PM6/MM targ arges) constructe th the EVB-type which was u is reasonabl 5) can be sub e.g. COSMO ng the effect tion in the the solute-so PS TARG Γ, refines th 2 0 22, P sS cE Q ed for E PM3/MM p Blue) PMF obtai proach for the p get potential. (Red ed using WHAM. solvent driving p used here onl le to assume bstituted wit O). t of solvent i gas phase i olvent intera (2 he intra-mole (2 0 PS (2 otential ined by otential d) PMF . (Blue) otential ly for e that th the in the is the action 2.46) ecular 2.47) 2.48) Figure 2-1 Γ EVB + Γ TARG f by the EVB parameters r EVB referen The only calculate Γ EVB for t we demo phase cal 2.3 Co In this c validation MO-type sufficient the TP at potential conclude for the d coordinat LRA G 16 Demonstrating fitted by Gaussia FEP/US approac refined for the ga nce potential for P y minor pro d with the o the new set onstrate in F lculations. In oncluding d chapter, the n begins by e QM/MM p tly accurate t the TS and in determin ed that the ca difference be te ξ. That is, REF CON EE g the refinement ans plus a vector ch along the EVB as phase PM3; (re PM6/MM target p oblem here original vect of charges t Figure 2-16 h n other word discussion e Paradynam y comparing potentials, a in evaluatio d at the RS. ning the free alculated Δ Δ etween the f , it is found t NSTAR E t of the EVB re of the new EVB c B energy gap (A) ed) the refined E potential. can be tha tor of EVB o improve th how we refin ds, we refine mics model the LRA ap and is also d on of the fre It is also fo e energy cor ΔG between free energy that: RGCONS E 57 ference potential charges derived f and along the nu EVB reference po at the intra- charges, Q 0 he accuracy ne the EVB e the EVB RP is further pproach to t done for EV ee energy pe ound that it i rrection at th the two mod functions, Δ FEP G l in condensed p for the target pot uclear RC (B). (B otential for PM3/ -interactions 0 , and one m of the ΔΓ INT RP origina P for PM3/M refined, qu the full FEP VB as a RP erturbation w is sufficient he specific v dified poten Δ Δg( ξ), of th P TARG g phases, using the tential. The free e Blue) the original /MM target poten s reflected might need t TRA correctio ally derived f MM and PM uantified and P treatment P. The LRA when moving to use a har value of the ntials is a goo he TP and o REF g e correction pote energy profiles o reference potent ntial; (green) the by the Γ EVB to recalculat on. In conclu for the PM3 M6/MM as the d validated. for two arb A is found g from the R rmonic cons RC. Overal od approxim of the RP at (2 ential – btained ial with refined VB are te the usion, 3 gas- e TP. The itrary to be RP to straint l it is mation t that .49) 58 Several practical improvements for the PD reference potential are proposed. More specifically, a novel idea of refining the RP in general case is put forward and tested on the CH 3 Cl+Cl - S N 2 reaction in water for two different reference potentials. The approach is based on modifying the RP with help of the Gaussian functions fitted to the PES’s along the RC performed on the original RP and on the TP. This technique is shown to be efficient for the gas phase simulation and in the implicit solvation models with no further modifications. Moreover, its extension to condensed phases with the explicit solvent model is suggested. While our approach for adding Gaussian functions has common features with the MTD basic idea of flattening the original potential by adding to it its negative 27,87 , we however, propose a different strategy of constructing the reference potential. In contrast to spending large amount of time while iteratively elevating the local minima of the ab initio QM/MM potential 102 we propose a simple approach of building the negative potential from the gas phase or the implicit solvent PES along the RC. The present work demonstrated that the end point LRA approach involving averaging both on the RP and on the TP practically coincides with the full n-step FEP whereas the averages taken only on the RP result results in a higher error. This point is important, since most of the works 79,80,103,104 that adopted the RP idea have not yet moved to the end points LRA treatment. Of course, the error associated with the initial LRA estimate will further decrease as the RP is refined since the refinement process increases the overlap with the TP. Both elements of the PD approach, the refinement of the RP and the LRA estimate of the FEP, are crucial for the efficiency and accuracy when the RP is used. 59 3 Chapter 3. Methods for Calculating QM(ai)/MM Free Energy Surfaces 3.1 Introduction In this chapter, we compile, review and apply a number of algorithms for QM/MM free energy calculations with a task of obtaining quantitative estimates of the activation free energy barriers of chemical reactions in the condensed phase. We focus on the practical aspects of computing the relevant free energy terms of the thermodynamic cycle used in evaluation of the activation free energy barrier by reference potential based Paradynamics model. We begin reviewing the computation of the free energy perturbation between two arbitrary potentials focusing on two particular cases: (1) of moving from the reference potential to the target potential and (2) moving between the two harmonically biased potentials. This is accomplished in several approaches: the free energy perturbation, the linear response approximation, and its slow growth extension and the thermodynamic integration 105 . We then review and formulate several strategies for removing different bias from mapping potentials as well as combining the result of several mapping potential when calculating the Potential of Mean Force. This includes the umbrella sampling approach 69 and the weighted histogram analysis method 92,106 (WHAM). Next, we calculate 1D and 2D free energy surfaces and discuss another implementation of the reference potential based scheme, which involves construction of PMF for the reference potential and for the target potential when sampling is performed on the reference potential only. By implementing several computational free energy algorithm in the same software module one, first, is able to compare the relevant estimates of the free energy changes and minimize probability of non-systematic errors, and, secondly, we have in hands a convenient tool with which to evaluate all steps in the Paradynamics thermodynamic cycle. 3.2 Methods 3.2.1 Calculating free energy perturbation First, let us review an extremely popular in computational physics and chemistry strategy of evaluating the free energy perturbation while from one potential energy surface A E to another potential B E , which is sometimes called “changing the force law” 107 . 60 Usually free energy changes rather than absolute free energies are calculated (except the rigid- rotator and harmonic oscillator approximations) ln FkT Q (3.1) ln A B Q FkT Q (3.2) The FEP strategy 68 : exp exp ln exp N BA A N A EEEdr FkT Edr (3.3) involves computing the configurational averages by MD or MC simulations. Eq. (3.1) is a combination of the first and the second laws of thermodynamic. By definition, it relates the partition function Q to the Helmholtz free energy, F. Experimentally however, changes in the Gibbs free energy function, G, are determined. The difference between the two free energy functions in condensed phases at not too large pressures is negligible, and in this work we assume that and for the convenience of comparison with experiments we use the Gibbs free energy notation. exp exp A B BA B A EE GEE EE (3.4) This is usually done however, by creating a series of n intermediate mapping potentials between the end points potentials A E and B E : 1 mmAmB EEE (3.5) by changing the perturbation parameter λ m from 0 to 1 with an increment of δλ. Since the convergence of FEP depends on the energy gap between the two potentials, implementation of Eq. (3.5) is the most efficient for the fastest convergence with a given n (compared to incrementally changing charges, bond lengths or other parameters) because one can do the perturbation with the equal energy gap increments. In the context of the PD model Eq. (3.5) is applied in several cases. First, while moving from the RP to the TP while estimating the corresponding free perturbation, when AREF EE and 61 BTGT EE . Second, when the first potential favors a chemical reaction being in the reactant state ARS EE and the second potential favors the product state BTGT EE by alternating the force law we construct a series of mapping potential to sample the potential energy surface of this reaction along the reaction coordinate. That is, in case of the EVB method RS and PS are represented by the EVB diabatic state, which are mixed into the adiabatic EVB potential with a coupling 12 H element. In the case of a general adiabatic potential QM E RS and PS are defined by constraining a selected RC to be at the values corresponding to the RS and PS: // RSPS QM QM CONS RS PS EE E (3.6) The corresponding examples were considered in Chapter 2. Each mapping potential gives rise to an independent MD trajectory, and the n resulting trajectories can be propagated in parallel (“embarrassing parallelism”). The results of these simulation windows are combined to calculate the free energy change. This can be done using the classical FEP equation: 1 1 1 ln exp m n mm FEP m E EE GkT kT (3.7) Eq. (3.7) provides the relevant FEP estimate for the forward FEP process. However, it can be used also for the backward process and it was argued 108 that the average of the forward FEP and the backward FEP provides a more reliable free energy estimate: 1 11 12 1 ln exp ln exp 2 m m nn mm m m FEP mm EE EE E E GkT kT kT kT (3.8) Alternatively, the same free energy change can also be estimated by thermodynamic integration approach, which however requires analytical partial derivative of the free energy with respect to the changed parameter. 1 0 TDI G Gd (3.9) It is easy to take the corresponding derivative of Eq. (3.5) with respect to the perturbation parameter, and using a trapezoidal numerical integration formula yields: 62 1 2 1 2 mm n TDI TARG REF TARG REF EE m GEE EE (3.10) Finally, this change can be estimated by the n-points LRA approach which is called the “slow growth” method: 1 1 11 1 1 2 mm n n LRA m m m m EE m GEE EE (3.11) In the limit of a big n all the approaches should give the same result, however to minimize the computational cost of a simulation a small value of n is desired. It has been pointed out 94 that a reliable estimate of the corresponding free energy change should involve sampling and averaging at least on both the RP and on the TP. With the end-points FEP, LRA often converges faster than the other approaches 70 : 1 2 REF TARG LRA TARG REF TARG REF EE GE E E E (3.12) As was pointed out in the previous chapters that the convergence of LRA can be further significantly improved by minimizing the difference between the RP and the TP (what is done in the PD scheme). Note also that the LRA formulation of Eq (3.12) coincides with the end points thermodynamic integration approach of Eq (3.10). All these formulations were applied to the validation of LRA in the PD scheme considered in details in chapter 2 for moving from the RP to the TP. Here we take a more complex chemical enzymatic reaction in dehalogenase (shown in Figure 3-1) to repeat the same calculations. Let us consider another important variation of the FEP approach when instead of gradually changing the force law according to Eq. (3.5) one gradually changes a harmonic bias potential acting on the reaction coordinate and containing sampling of the original potential QM E around the RC 109 , 0 m : 2 0 mQM m EE K (3.13) where 00 0 1 mmRSmPS (3.14) 63 Here 0 RS and 0 PS might correspond to the values of a selected RC at the RS and at the PS (or any other desired final point). Now we will use the mapping potential m E given by (3.13) in equations (3.8), (3.9) and (3.11) to evaluate the relevant free energy changes. While applying FEP and LRA approaches to estimate the free energy perturbation is the same, the TDI approach requires calculation of new partial derivative, since the changing parameter now is 0 . The corresponding formula with the fixed force constant is given by: 00 0 2 RS PS G K (3.15) Using the FEP formulation of Eq. (3.14) provides an effective way of improving the efficiency of sampling along a selected RC, which is crucial in calculations of the potential of mean force. The efficiency of sampling can be monitored by creating a distribution function along the RC. For the best sampling efficiency, one should try to achieve a uniform distribution which can be done by adjusting the force constants. However, increasing the force constant will require increasing the number of simulation windows, n, to ensure the even sampling distribution along the whole range of the RC. A special care should be taken when the initial system configuration is far from the minimum of the constraining potential, since it might create very hot atoms. One way to solve this is to gradually increase the force constant monitoring, in parallel, the efficiency of the sampling. The ultimate goal in the free energy simulation, however, is to obtain the probability distribution (and the corresponding free energies) on the original potential QM E . Therefore, we have to use a tool to recover the corresponding probability distribution from the probability distribution obtained on the mapping potential. Below we consider several implementations of the umbrella sampling strategy for this purpose. 3.2.2 Calculating free energy functions with the umbrella sampling approach In chapter 2 it was demonstrated how the performance of the LRA approach can always be checked by constructing the free energy functions of the RC for the TP and for the RP using the specialized combination of FEP/US approach 69,110 : 64 ln exp i m im Em E EE gGkT x kT (3.16) where i E is the potential of interest; and m G is the free energy change of moving from the initial mapping potential to the mapping potential m E given by one of the equations (3.7)-(3.11) and finding the relevant difference: REF TARG TARG TARG EE E E gg g (3.17) The formalism of Eq. (3.16) is however quite general and has multiple applications in the PD model. In particular, it is used for constructing the reference free energy surface. This is done with the EVB FEP/US procedure with the energy gap as the RC 66 , in PMF 108 calculations and in the RP-based strategy for QM/MM calculations 61 . These cases will be considered below. Points for the free energy functions obtained from running on different mapping potentials are superimposed where one might consider only those points for which ithreshold NN or calculate the weight-average: i i frames i frames N gg N (3.18) Here the summation is carried out over all simulation frames (different mapping potentials), i N is the number of data points on the i-th mapping potential in the range corresponding to computer implementation of the delta-function x in Eq. (3.16) Another implementation of the US approach for PMF calculations 111 was considered in chapter 2, where the potential energy profiles for the RP and for the TP along the RC are fitted with Gaussian functions: 2 exp TARG k k k k Ax (3.19) Next, we add the negative of the derived Γ-function to the original potential thus creating a flat mapping potential to improve the efficiency of the MD sampling 27,87 : QM QM EE (3.20) 65 Further, we create a series of mapping potentials pulling system from the RS specified with a potential 108 : RSQM QM CONS RS EE E (3.21) to the PS, given by: PS QM QM CONS PS EE E (3.22) along a RC, ξ: 1 mmRSmPS EE E (3.23) Next, we use Eq. (3.16) to obtain the free energy profile (PMF) on the unbiased potential. The same protocol is adopted using the mapping strategy given by equations (3.13) and (3.14). It is also possible within the FEP/US formalism to perform the free energy projection on the 2D grid (see e.g. ref. 64 ): 12 1 2 , ln exp i m im Em E EE gGkT x x kT (3.24) Eq. (3.24) can be useful to get additional information on the free energy path, obtained with a 1D-mapping potential (e.g. when mm map EE where 12 map or map E . However, a high quality 2D free energy map will require proper sampling along both 1 and 2 which should be obtained with the corresponding mapping potential 12 , mm EE , e.g.: 22 00 QM/MM 1 1 1 2 2 2 m EE K K (3.25) Furthermore, one can construct the free energy functions of any reaction coordinate. For example, free energy calculations in the EVB approach have been traditionally performed using the energy gap, 12 EE E , between the EVB diabatic states. In this case, x E is Eq. (3.16) and the EVB mapping potential is either given by Eq. (3.23) where 1 RS EE and 2 PS EE or by: 12 12 121 mm m mm EEE H (3.26) 66 The 2-states EVB adiabatic potential is calculated by: 22 11 2 2 12 12 2 EVB E cE cE cc H (3.27) Note that iEVB EE in Eq. (3.16). An early version of the RP-approach 62 employed the EVB mapping potential in the QM(ai)/MM free energy evaluation. This approach can be generalized to any reference potential ln exp TARG m TARG m Em E EE gGkT x kT (3.28) where mREF CONS m EE E (3.29) however since all sampling is performed on the reference potential the convergence of the free energy is relatively slow. Nevertheless, with an extensive sampling, and when the reference potential reasonable approximates the target potential, it can be a decent estimate of the target FES. Also using eq. (3.28) it is becoming trivial to estimate the corresponding free energy barrier on several target potential for which only the corresponding single point energy (no gradients and not even every MD step) evaluations are needed. 3.2.3 WHAM approach FEP and FEP/US approaches as well as the related methods considered in the two previous sections have solid physical background. Another popular statistics-based method WHAM 112 combines estimate of the free energy changes due to moving between the mapping potentials (they are also called free energy shifts or constants) and the free energy profile in a self- consistent manner: 1 , 1 exp exp n i i n tot ii iCONS i N P NG E (3.30) 67 , 1 exp exp i iCONS G PE (3.31) Note that ln gkTP and all other notations are consistent to the ones used above. WHAM equations (3.30) and (3.31) can be extended to the 2D-case, and used with mapping potentials given by Eq. (3.25): 12 1 12 ,12 1 , , exp exp , n i i n tot ii iCONS i N P NG E (3.32) and 12 , 1 2 1 exp , exp , i iCONS G PE (3.33) 3.3 Examples 3.3.1 Free energy perturbation in the Paradynamics approach In Chapter 2 we described a thermodynamic cycle used in the PD model. The free energy perturbations at the RS and at the TS are estimated using the LRA approximation of Eq. (3.12). To contain the sampling within the corresponding regions of interest (at a certain value of RC) a harmonic potential which vanishes at the RC value of interest is typically added, for instance: 0 2 QM1/MM 0 REF EE K (3.34) and 0 2 QM2/MM 0 TARG EE K (3.35) The LRA strategy was validated by comparing the LRA estimate of the free energy perturbation to the difference between the explicitly calculated PMFs on the target potential and on the reference potential. Here we repeat the same experiment, but now we move to a more sophistic consists correspon Figure 3-1 sticks. The d We took PM3 47,48 the inter electrosta MOPAC The diffe reaction ated system of an enzy nding reactio Active center of dotted red line dep k two MO-b and PM6 113 raction energ atic potentia 2009 46 -MOL erence betwe coordinate, m – enzymati yme, haloalk on scheme fo haloalkane dehal picts the direction based QM/M 3 Hamiltonia gy of an el al on this at LARIS 114 in een the brea dC C ic reaction. T kane dehalo for in water ( logenase. The pro n of the nucleoph MM potenti ans. QM/MM lectron on a tom to the c nterface). aking (C-Cl) CldC 68 The system ogenase, wit (aspartate is otein backbone sh ilic attack of the s al with the M electrostat an orbital c correspondin and the form O . In the r studied belo th a substra shown as ac hown in green th substrate – 1, 2-d QM part d tic coupling centered at ng one-elect ming (C-O) range of the ow is shown ate, 1,2-dich cetate): e catalytic residu dichloroethane sh described by was implem QM atom tronic integr bond length e RC betwee n in Figure 3 hloroethane. ue, aspartate, is sh hown as sticks. y semi-emp mented by ad i with the rals 111 (usin hs was taken en -2 A and 3-1. It The hown as pirical dding MM- ng the n as a d 2 A 101 map potential QM/MM (water/pr correspon the WHA Figure 3- Figure 3-2 for the refere The TS r of the fre in the pr were pro PM6/MM FEP estim pping potent ) with equa M trajectorie rotein) were nding refere AM program -2. Catalytic effect b ence reaction in w reaction coor ee energy pe rotein. Eq. (3 opagated in M potentials mates are giv tial of the f al spacing s with the e propagated nce reaction m 91 , which based on PMF si water. Black line rdinate was i erturbation o 3.5) was use n parallel fo were used i ven in Figur form (3.13) and forc MM region d for 75, 00 n in water. T were practi mulations using W is PMF for the re identified at f moving fro ed for the 2 or 50 ps. T in the LRA a re 3-3 A. 69 was create ce constants n represente 00 steps wit They were co cally identic WHAM approac eaction in enzyme =0.2 A. T om the PM3 1-step FEP The end-poi approach of ed where QM of 100 kca ed by 18 A th 0.5 fs tim ombined usi cal. The ca ch as reproduced e. This value w /MM potent estimate. Th int trajector f eq. (3.12). M=PM3/MM al/A 2 mol. Th A sphere of me steps in ing the FEP/ alculated PM by PM3/MM pot was used for t tial to the PM he correspon ries on pure The corresp M (the refer he correspon f explicit so n protein and /US approach MFs are give tential. Blue line the LRA esti M6/MM pote nding traject e PM3/MM ponding LRA rence nding olvent d the h and en in is PMF imate ential tories M and A and Figure 3-3 PM3/MM re approach (B correspondin line: PM6/M averages are The LRA correctio activation kcal/mol potential kcal/mol PM6/MM approach (PM6/MM smooth r energy p approach 3.3.2 O We have barrier on energy su Application of th eference potentia BLACK line) and ng correction to MM target free en e in red and cyan A estimates n to the act n barrier of 2 . In the next in the prote . In Figure M target free h of eq. (3 M) dots res red (PM3/MM profiles obta hes shows th Obtaining ca e seen that t n the target urfaces is e he PD approach t l to the PM6/MM d at the TS by a the activation fre ergy surface. Bot respectively. are extrem tivation free 23.7 kcal/mo t step we cal ein. The exp 3-3 B the e energy pro .16). All c pectively. W M) and cyan ained using at the two ap atalytic effec the PD appr QM/MM po valuation of to an enzymatic r M reference pote a 26-step FEP (G ee energy barrier th PMF are calcu ely close to e energy bar ol one gets t lculated the plicitly calcu e correspond ofiles are gi calculated p When these n (PM6/MM the WHAM pproaches yi ct using free roach can gi otential. One f the catalyt 70 reaction in dehalo ential at the RS b GREEN line) and r. (B) RED line: ulated using the F o the multi- rrier is -15 k the PD-appro correspondi ulated PM6/M ding PM3/M iven. They b points are p points are w ) curves are M (Figure ield essentia e energy cal ive a reliabl e important p tic effect of ogenase. (A) Free by a 21-step FEP d by the LRA (B : PM3/MM refer FEP/US approach step FEP tr kcal/mol. By oach estimat ing free ener MM activati MM referen both were c plotted as b weight-avera produced. C 3-2) and th ally identical lculations le estimate o practical app f an enzyme e energy perturba P (RED line) and BLUE line). The rence free energy h black dots and reatment. Th y adding it te of PM6/M rgy surface f ion free ene nce free ene calculated us black (PM3 aged accord Comparison he FEP/US l results. of the activa plication of t e. In Figure ation of moving fr d by the end-poin black arrow sho y surface (PMF); blue dots. Their he correspon to the PM3 MM barrier a for the PM6 rgy barrier i ergy profile sing the FE 3/MM) and ding to eq. ( of PM3/MM (Figure 3- ation free en the QM/MM 3-3 we saw rom the nt LRA ows the CYAN weight- nding 3/MM as 8.7 6/MM is 8.3 e and EP/US blue (3.18) M free -3 B) nergy M free w that QM/MM would lik catalytic reference calculate protein ( kcal/mol Figure 3-4 for the refere Interestin barriers w about 1 k Next we with whi and for th be 21.2 k The corre of 11.7 k M potential ca ke to examin activity. In F e reaction in d the corresp Figure 3-4). . 4 Catalytic effect b ence reaction in w ngly, while was about 15 kcal/mol. tried to mo ich we also a he reference kcal/mol in w esponding ex kcal/mol 3 . an give quite ne how if d Figure 3-2 th n water on ponding PM The corresp based on PMF si water. Black line difference 5 kcal/mol th ove to a hig attempted to e reaction in water and 10 xperimental e different es different QM he catalytic PM3/MM l MF for PM6/M ponding act mulations using W is PMF for the re between the he correspon gher-level of o calculate th water. The c 0.9 kcal/mol estimates ar 71 stimates of th M/MM poten effect ( wa g level was e MM potenti ivation free WHAM approac eaction in enzyme e PM3/MM nding differe f theory DFT he activation correspondin in protein w re 27.0 and 1 he activation ntials also gi atenz g ) of stimated to ial in water a energy diff ch as reproduced e. M and PM6/ ence betwee T-based B3L n free energi ng barrier (s with the cata 15.3 respecti n barriers in ive different f dehalogena be 11.5 kc and compare ference was by PM6/MM pot /MM activa en the catalyt LYP//6-31G es for the en see Figure 3- alytic effect ively with th protein. Her t estimates o ase relative t cal/mol. We ed it to the o found to be tential. Blue line ation free en tic effects is G*/MM pote nzymatic rea -5) were fou of 10.3 kcal he catalytic e re we of the to the e also one in e 12.4 is PMF nergy s only ential, action und to l/mol. effect Figure 3-5 line is PMF f 3.3.3 O First imp There we the studie Figure 3-6 (BLUE), FE approach wi to the tar from 100 equilibra 5 Catalytic effect for the reference Other refere plementation ere reportedl ed system. C 6 PMF for PM3/ P/US (GREEN) ith configurationa rget potentia 0 simulation ation). Our r based on PMF si reaction in water ence potenti n of the ref ly problems Calculations /MM (left) and with sampling o al averages taken al was made n windows e esults (see F imulations using r. Black line is PM al approach ference pote with conver involved up PM6/MM (right) of the target pote on the reference e every 10 st ach containi Fig. 6) indic 72 WHAM approac MF for the reactio h formulatio ential approa rgence. We p to 50 ps sim ) potentials in d ential. Red dots potential. teps, and the ing 4000 da cate that the ch as reproduced on in enzyme. on ach strategy adopted this mulation tim dehalogenase. Th are the weighted e correspond ata points (fi convergenc d by B3LYP//6-31 y involved u s approach a me with 0.5 fs e estimates are d-averages calcul ding PMFs irst 10 ps w ce of the cal 1G*/MM potentia using eq. (3 and tested it s stepsize. A obtained using W lated using the F were constru ere discarde lculated PM al. Blue 3.28). t with A call WHAM FEP/US ructed ed for MF for the targe kcal/mol 3.3.4 M Using m studied e Figure 3 trajectori WHAM Figure 3-7 surfaces are obtained usi shows the LR The corr the react instance, One way 21x21 Q correspon (RIGHT) 1D-mapp et potential from the est Multidimens mapping pote enzymatic re -8. For thes ies were pro equation (3. 7 2D PMF for PM calculated using ing the FEP/US a RA free energy pe esponding f tion mechan one can see y to avoid the QM/MM M nding free e ) the black s ping potentia is relatively timates base sional QM/M entials given action on th se calculatio opagated for 32)-(3.33) M3/MM (BLUE) g 2D-WHAM an approach process erturbation calcu free energy s ism and det e that the loc e enormous MD trajector energy surfa surface was al where y poor, and ed on the dire MM free en n by Eq (3.2 he PM3/MM ons a 21x21 r 50 ps. The ) and PM6/MM ( nd mapping alon sing data generat ulated by the PD a surfaces and tails of the f ation of the computation ries) and st ace landscap obtain in th () dC Cl 73 the weigh-a ect sampling ergy surfac 25) we calc and on the grid was c correspond (RED) potentials g 2 reaction coo ted on 1D mappi approach. d the contou free energy TS on both nal cost of c till to get pe is to use his approach () dC O w averaged es g PMF. ces culated the PM6/MM p created and ding surfaces s for the reaction ordinates (d(C-O ing potential alon ur maps do p landscape. free energy alculating th some relev e the FEP/U h using the s was constrain stimates fluc free energy potentials. Se the corresp s were cons n in dehalogenas ) and d(C-Cl)), ng d(C-O)-d(C-C provide addi For the stud surfaces is v he full 2D m vant inform US eq. (3.24 same data ob ned. The fre ctuate within surfaces fo ee Figure 3-7 ponding QM tructed usin e. The BLUE an while the black Cl) RC. The cyan itional insig died reaction very close. maps (and run mation about 4). In Figure btained usin ee energy su n 2-3 or the 7 and M/MM ng the nd RED surface n arrow ght on n, for nning t the e 3-7 ng the urface generated using 2D Figure 3-8 PM6/MM FE 3.4 Co This cha energy s reaction t LRA fre Second, refined a computat insight o complete surface, h reference d using the D mapping po 8 The correspon ES for the reactio onclusions apter describ surfaces. Fir the activatio e energy pe several prac and applied f tional protoc n the free en e PD thermo here we pro e potential. 1D mapping otentials. ding contour ma on in dehalogenas bes computa rst, it is dem on free energ erturbation a ctical approa for the studi cols one can nergy surfac dynamic cyc ovided sever g potentials i aps for the free se. ational tools monstrated gy barrier on approach wh aches for co ied enzymat n reproduce ce landscape cle at first st al efficient w 74 in fact repro energy surfaces s and metho that by app n the target P hile moving onstructing t tic reaction. the enzyma s, and on th tep involves ways of doi oduces very w s shown in Figur ods used fo plying the P PM6/MM po from the re the free ene It is shown atic catalytic he reaction m construction ing so for a well the lan re 7. (LEFT) PM or calculatin PD model f otential is ca eference PM ergy surfaces that by usin activity and mechanism. F n of the refer general MO dscape gene M3/MM FES, (R ng QM/MM for an enzym alculated wit M3/MM pote s are consid ng the devel d obtain valu Finally, sinc rence free en O-based QM erated RIGHT) M free matic th the ential. dered, loped uable ce the nergy M/MM 75 4 Chapter 4. Quantifying the Mechanism of Phosphate Monoester Hydrolysis in Aqueous Solution by Evaluating the Relevant ab initio QM/MM Free Energy Surfaces 4.1 Introduction Phosphate hydrolysis reactions are arguably the most important class of chemical reactions in biology, 115-121 and it is, thus, important to elucidate the mechanism of such reactions in solution and in proteins. Combining experimental studies with a theoretical approach is crucial in resolving the underlying reaction mechanisms. QM(ai)/MM free energy studies provide a valuable insight 7,122-126 and a practical way of resolving the fundamental mechanistic controversies for chemical processes in condensed phases in general and the phosphate hydrolysis in particular. The key questions in resolving the nature of the hydrolysis of phosphate monoesters and related systems are the mechanism of P-O bond breaking and the mechanism of the following proton transfer. The first question is defined in terms of the 3D free energy surface (or its contour map) where the system energy is plotted as a function of two distances, R 1 and R 2 . R 1 represents the P-O bond between the phosphate and the leaving group, while R 2 describes the bond that is being formed with the attacking nucleophile. The second question is whether the proton transfer (PT) step, following the P-O bond breakage, occur in a direct intramolecular manner from the attacking nucleophilic water molecule to the terminal phosphate oxygen (the 1W mechanism) or with the assistance of several additional water molecules (which will be called below “the 2W mechanism”). To address the second question one has to examine both mechanism, that is to calculate the corresponding free energy surfaces. Traditionally, the hydrolysis of phosphate monoesters has been focused on the competition between the associative and dissociative mechanisms, 127,128 but recently the interest has been also turned towards the exploration of the possible proton transfer pathways from the nucleophilic water molecule to the phosphate oxygen atom. To the best of our knowledge, these issues have not been resolved experimentally and thus a consistent computational modeling is crucial for reaching an unambiguous resolution. One of the first systematic attempts to resolve these issues computationally (using various solvation models) has been reported by Florian and Warshel. 129 Subsequent extensive and systematic computational studies of the relevant 76 surfaces 127,129,130 found a very similar barrier for the associative and concerted paths, while considering a direct proton transfer from the attacking water molecule. The possibility that phosphate hydrolysis involves a PT with more than one water molecule has emerged from Hu and Brinck’s study, 131 but without any considerations of entropic contributions. Subsequent studies found 2W PT mechanism in phosphate monoester hydrolysis 132-134 but this was done without exploring the 1W path, which was explored in systematic studies of Warshel and coworkers. 129,130,135 A recent attempt 136 to address this challenging issue and to compare the two mechanisms involved energy-minimization study using implicit solvation models with a constrained manual TS search ( by scanning the potential energy surface along the selected RC in the gas phase and then performing relaxation in the given solvent model). In this study 136 it was estimated that the 1W barrier is 3-4 kcal/mol higher than the 2W PT barrier, but taking into account a number of approximations made in that approach those result could not be considered as conclusive findings. Some of the difficulties in using energy minimizations with several explicit QM water molecules were already discussed in Ref. 137 Thus, it was also concluded that a quantitative resolution must involve QM(ai)/MM free energy calculations with sufficiently extensive and converging sampling using molecular dynamics. Here we report the result of such a systematic QM(ai)/MM free energy study. In this chapter we explore several model systems for phosphate monoesters in aqueous solution and evaluate the corresponding free energy surfaces with DFT-based QM/MM and a semi- empirical PM3/MM potentials. The resulting free energy surfaces provide quantitative estimates of the relevant reaction pathways. It is found that the reaction follows an associative/concerted mechanism via nucleophilic water attack and that the proton transfer is more likely to proceed through the 2W path. Additionally, it is found that the PT TS in not much higher than the high- energy plateau which serves as a starting point for the 1W and the 2W paths and that the height of this plateau appears to be the key quantity that determines the overall reaction rate 138 . 77 4.2 Methods Our computational protocol for generating free energy surfaces on QM/MM potentials involved construction of a series of mapping potentials of the form: 2 () ( ) ( ) QM/MM 0 mm iii EE K (4.1) where QM/MM E is the unbiased QM/MM potential; and the second term introduces a harmonic constraint which improves sampling of the reaction coordinate near the equilibrium values specified by 0 with () i K being the force constants. For construction of 2D free energy surfaces in R 1 , R 2 coordinates we typically define a series of mapping potentials by setting new values to 0 1 R and 0 2 R with the spacing of 0.1 A in the range of interest for RCs and use the force constants of about 2 100 kcal A mol . For big QM models, we used 12 R R as the RC with the force constant of 100 K . For proton transfer reactions the RC was defined as the difference in distances between the proton and the proton-donor (D) and the proton-acceptor (A): dD H d A H with 75 K for 1W PT and 30 75 K for 2W PT. K values were introduced gradually to avoid high energy gradients in cases when the initial 0 . Prior to switching the force constants on we run 10 ps relaxation of solvent with the fixed solute periodically updating QM charges every 100 steps. The corresponding MD trajectories were propagated in parallel with the step size of 1 fs at 300K for 5-15 ps. The trajectories were analyzed to ensure adequate sampling along the RC(s). The last 90% of trajectories were used to construct the free energy surfaces using a combination of FEP/US 40,42,69 and WHAM 91,92 algorithms, where we performed cross checking of the FEP/US and the WHAM approaches. MD trajectories were propagated using the QM/MM module of MOLARIS-XG 45 simulation package. For QM/MM calculations with the PM3 47 potential we used a modification 111 of MOPAC2009 46 , for the QM/MM calculations with BLYP and B3LYP//6-31G* we used QChem3.2 139 and QChem4.0, while for the B3LYP//6-31G*/COSMO calculations we used Gaussian09. 53 78 4.3 Results 4.3.1 PM3 QM/MM results Our QM/MM studies started with the use of the semi-empirical PM3/MM 46,92 Hamiltonian. The choice of this relatively low level of theory was made for several practical reasons. First, PM3/MM potential has a low computational cost and, in principle, it can be used as the reference potential in the Paradynamics 64 scheme for QM(ai)/MM free energy calculations. Secondly, due to its low computational cost it can be used in order to rapidly explore the corresponding free energy surface and to establish a reliable yet computationally feasible simulation protocol. Since one of the key aspects of this work is to provide a quantitative insight about the 1W and 2W PT mechanism, we started with the simplest model systems relevant to the PT mechanism. Namely, the metaphosphate anion, 3 PO , with 1 QM water was chosen as starting model system for the QM region in case of 1W PT and 3 PO with 2 QM water molecules as a model system for 2W PT (note that these systems essentially corresponds to taking the leaving group to infinity). The configurations of the QM regions near the TS are shown in Figure 4-1, where the calculated free energy surface for the 1W PT mechanism is given in Figure 4-2 A and compared to the corresponding surface for the 2W PT mechanism in Figure 4-2 B. The estimated free energy barriers are 19.4 kcal/mol for the 1W PT mechanism, and 15.3 kcal/mol for the 2W PT mechanism. Next, we took the methyl hydrodiphosphate (MHDP) with 2 QM water molecules as the QM region (see Figure 4-3). In this case we reduced the solute charge (-3) to (-2) by protonating the oxygen of the alpha-P in order to avoid instabilities associated with a high solute charge solute 140 . The 2D free energy surface was constructed in the R 1 , R 2 space (Figure 4-4 A), where one observes a stable intermediate, 32 PO H O with R 2 ~1.9 A (the possible configurations of which are shown in Figure 4-1). Next, we calculated the free energy surfaces for 1W PT and 2W PT (see Figure 4-4 B) and estimated the free energy barriers for the 1W PT mechanism (16.2 kcal/mol) and for the 2W PT mechanism (15.9 kcal/mol). Thus by combining the free energy surfaces given in Figure 4-4 A and 4B one can estimate the free energy changes relative to the RS of MHDP (see Table 4-1). Figure 4-1 (LEFT) the n Figure 4-2 2H 2 O)) syste proton-dono using WHAM dots). Note th The simplest mo near transition st 2 The PM3/MM f em. (A) The 2D fr r and proton-acc M (black dots) an hat here the proto odels used to stud ate configuration free energy surfa ree energy surface ceptor distances. nd FEP/US (red d on acceptor is the dy the proton tra n for the mechanis aces calculated fo e in the ξ, R 2 spa (B) A compariso dots) and for the e second water fo 79 nsfer step of the sms involving one or the 1W and 2W ace, where ξ is the on of the 1D free 2W PT, obtained r 2W PT and the phosphate mono e water molecule W proton transfe e 1W PT coordina energy surfaces d with WHAM (b e oxygen of the hy oester hydrolysis ( . (RIGHT) and tw fer mechanisms f ate, defined as th along the ξ RC f black triangles) a ydrated metaphos (PO 3 - + H 2 O (or 2 wo water molecul for the (PO 3 - + H he difference betw for the 1W PT, o and with FEP/US sphate for the 1W 2H 2 O)). les. H 2 O ( or ween the btained S (green W PT. Figure 4-3 two studied m Figure 4-4 Figure 4-3) ( Finally, w Figure 4 Figure 4- INT. We A description of mechanisms for t 4 The PM3/MM (A) in the R 1 , R 2 s we moved t -5) and the -6) a reduced e note howe f the QM region u the final proton tr free energy surf space for the clea to a larger Q results are d activation ver, that Mg used to study the ransfer step are s face calculated fo avage of the P-O b QM region m summarized barrier along g 2+ with the 80 e hydrolysis of MH hown as pink das or MHDP + 2 QM bond (B) for the 1 model, which d in Table 4 g the associa PM3/MM p HDP. Here the P shes (1W PT) and M water molecule 1W and the 2W P h included M 4-1. With th ative path, as otential doe P-O bond (grey da d as blue dashes ( es (the model of PT mechanisms. Mg 2+ and 16 his model w s well as a sl s not mainta ashes) is broken (2W PT). the QM region g 6 QM water we observed lightly less s ain the octah and the given in r (see d (see stable hedral coordinat DFT/MM Figure 4-5 hydrolysis re (RIGHT) Sh Figure 4-6 O bond. tion and se M see below) 5 A description o eaction (LEFT). hown is the QM re The PM3/MM fr ems to favo ). of a larger mode Here the P-O bo egion solvated by ree energy surfac or the tetrah el for the QM r ond is already br y MM water (blue ce for the QM mo 81 hedral coord region (MDP wit roken and the sy e sticks) odel of Figure 4-5 dination wit th Mg 2+ and 16 ystem is at the pl 5, calculated in th th 4 ligand QM water mole lateau prior to th he R 1 , R 2 space fo s (in contra ecule) used to stu he final proton tr or the cleavage of ast to udy the ransfer. f the P- Figure 4-7 effect of Mg 2 it provides phosphate an While we beyond t associativ elevated and cow theory w barriers f the overa approach reported 7 PM3/MM free e 2+ is that it seems stabilization to nd the residual m e noticed an the scope of ve path with basin for the orkers HF// with BLYP 13 from the inte all reaction. h and a stud in the next s energy surfaces fo to change the pr the associative c methyl(di)phospha n interesting f the presen h a stable PO e subsequen LANL2DZd 4 the interm ermediate ha However, dy with a hig section. or: (A) MTP plus robability of the a complex by decr ate. Also Na + seem catalytic eff nt work. The 32 O HO int nt PT step. Th dp/MM min mediate is me ave a similar these findin gher level o 82 s Mg 2+ plus 16 QM associative/dissoci reasing the repu ms to be doing wo fect of Mg 2+ e surfaces in termediate in his is in a qu imization re etastable. Th r height and ngs may sim of theory is M H 2 O (B) MDP iative paths, mak ulsion between th orse than Mg 2+ in + compared t n the R 1 , R 2 n the right u ualitatively i esults 132 , how he estimated d the PT step mply reflect obviously re P plus Na + plus 16 king the associativ he negatively ch terms of lowerin to Na + (Figu 2 space show upper corner in agreemen wever, at a d 1W and 2 p is being th the use of equired, and 6 QM H 2 O. The o ve more likely. Pr harged leaving te ng the associative ure 4-7), this w a well-de r, which form nt with Nemu a higher lev 2W PT activ he rate limiti a semi-emp d such a stu obvious robably erminal TS1. goes efined ms an ukhin vel of vation ng of prical udy is 83 Table 4-1. Summary of the free energy simulation results obtained using PM3/MM (a) (a) Energies in kcal/mol. SCAAS and FES designate, respectively, the surface constraint all atom model 141 and free energy surface (Potential of Mean Force). (b) Using the estimate of 15.5 for the height of INT found for MHDP with 2 H 2 O (assuming that when R 1 reaches infinity the energy is similar). The value in parentheses designates the free energy difference between the RS for the PT step and the RS for P-O bond breaking step. System Method ξ or R2, R1 ΔG ‡ from RS description Figure PO 3 (-) with 1 QM H 2 O 22 A SCAAS 2D-FES in ξ, R2 space for 1W PT ξ =-0.09, ~1.8 19.4(+15.5) b TS 1W PT Figure 4-2 (A) PO 3 (-) with 2 QM H 2 O 22 A SCAAS 2D-FES in ξ1, ξ2 space for 2W PT ξ =-0.11, ~1.8 15.3(+15.5) b TS 2W PT Figure 4-2 (B) MHDP with 2 QM H 2 O 22 A SCAAS 2D-FES in R1, R2 space for P-O break 2.75, 1.9 25.0 TS1 Figure 4-4 (A) 1.9, 4.0 15.5 INT Figure 4-4 (A) MHDP with 1 QM H 2 O 22 A SCAAS 2D-FES in ξ, R2 space for 1W PT ξ =-0.08, ~1.9 16.2(+15.5) TS 1W PT Figure 4-4 (B) MHDP with 2 QM H 2 O 22 A SCAAS 2D-FES in ξ1, ξ2 space for 2W PT ξ =-0.06, ~1.9 15.9(+15.5) TS 2W PT Figure 4-4 (B) MDP with Na + and 16 QM H 2 O 18 A SCAAS 2D-FES in R1, R2 space for P-O break 1.9, 2.7 31.5 TS1 Figure 4-7 (B) 1.9, 3.6 23.5 INT MDP with Mg 2+ and 16 QM H 2 O PM3/MM 18 A SCAAS 2D-FES in R1, R2 space for P-O break 1.9, 2.7 23.0 TS1 Figure 4-6 1.9, 3.6 13.5 INT MTP with Mg 2+ and 16 QM H 2 O PM3/MM 18 A SCAAS 2D-FES in R1, R2 space for P-O break 2.00, 1.85 11.6 Int1 Figure 4-7 (A) 1.92, 2.65 26 .0 TS1 1.9, 3.6 18.6 INT 84 4.3.2 QM(ai)/MM results Next we move to the more challenging (and much more computationally expensive) issue of calculating the QM(ai)/MM free energy surfaces. Here, the high computational cost is an additional reason to carry out studies on the simplest (smallest) possible QM region model. In this study, following our previous work 142 , we use a hybrid B3LYP functional with 6-31G* basis set. Additionally we also examined here the performance of the BLYP functional (explored here with the two basis sets 6-31G and 6-31G*) in calculating the activation free energy for 1W PT and 2W PT, since this functional is widely used in CPMD (Car-Parinello MD) studies of similar systems 134,143 . Furthermore, in order to compare the minimization results with an implicit solvent to the physically more rigorous free energy studies with the same model, we also used B3LYP//6-31G*/(COSMO) potential with MD propagated in the presence of inert solvent (water molecules with 0 charges) that provided a better solute thermalization. Our MD simulation were based on the adiabatic approximation, where we evaluated the electronic density by SCF at each time step (this approach which has been used by us for a long time (e.g., see Ref. 144 ) is arguably more rigorous than using auxiliary MD variables to effectively achieve the same purpose. Our approach also describes the long-ranged electrostatics consistently (by our QM/MM treatment), while still being able to perform adequate sampling and at this level of theory. As in the case of the PM3/MM study, we calculated first free energy surfaces using the simplest models of the QM region given in Figure 4-1, for the 1W PT with 3 PO plus 1 QM water and for the 2W PT with 3 PO plus 2 QM water molecules (where the residual methylphosphate is assumed to be at infinity). The calculated free energy surfaces with the BLYP//6-31G*/MM potential are depicted in Figure 4-8 where the 1W PT activation barrier is 6.5 kcal/mol (5.8 with 6-31G*) and the 2W PT barrier is 1.4 kcal/mol. Next we evaluated the free energy surfaces with B3LYP//6-31G*/MM potential, and the corresponding results are given in Figure 4-9. Now we have 14.0 kcal/mol activation free energy for 1W PT and 1.4 kcal/mol for 2W PT. Interestingly, with the above BLYP//6-31G*/MM potential, the difference between the 1W and 2W path of 5.1 kcal/mol, while with the B3LYP//6- 31G*/MM the difference is 12.6 kcal/mol. Figure 4-8 system. (A) T donor and p PT path wit set (blue dots Figure 4-9 or 2H 2 O)) sy the proton-d PT path , ob Note that her We no 4-9 A), s stable int cases. Th BLYP/MM free The 2D free ener roton-acceptor di th the 6-31G* ba s). Note that here 9 B3LYP//6-31G* ystem. (A) The 2D donor and proton btained by WHA re the proton acc ote that the 2 hows a plate termediate, a hat is, in the energy surfaces c rgy surface in ξ, istances. (B) A co asis set (black dot e the proton accep /MM free energy D free energy sur n-acceptor distanc AM (blue dots) an ceptor is the secon D free energ eau at R2~1. as is the case 1W PT sim calculated for the R 2 space, where omparison of the s) and the 6-31G ptor is the second y surfaces calcula rface in the ξ, R 2 ces. (B) A compa nd for the 2W PT nd water for 2W gy map of th .9-2.0 A, wh e of the PM3 mulations we 85 e 1W and 2W pro ξ is the 1W PT c 1D free energy su G basis set (red do d water for 2W PT ated for the 1W a space, where ξ is arison of the 1D T path , obtained PT and the oxyge he 1W PT me here the PT p 3/MM simul had to use a oton transfer mec coordinate define urfaces obtained ots) and for the 2 T and the oxygen nd 2W proton tr s the 1W PT coor free energy surfa d by WHAM (gre en of metaphosph echanism fo process begi lations, we h a harmonic c chanisms for the ed as the differen using FEP/US alo 2W PT path , obt n of metaphosphat ansfer mechanism rdinate defined a aces obtained alo een dots) and usin hate for 1W PT. r PO 3 - + H 2 ins. Since thi had to use co constraint of (PO 3 - + H 2 O ( or nce between the ong the ξ RC for tained with 6-31G te for 1W PT. ms for the (PO 3 - + s the difference b ong the ξ RC for ng FEP/US (blac O system (F is plateau is onstraints in f 100 at R 2 = 2H 2 O)) proton- the 1W G* basis + H 2 O ( between the 1W k dots). Figure not a some =1.85, 1.95 and simulatio range as We also this case hydrolys explicit s surfaces solvation PM3/MM better res profile fo motion is 130 ). Nev as is the c Figure 4-1 4-5), for the B3LYP//6-31 Since on associativ depicted 2.05 (the ef ons we used well. o considered e we at first is (which in solvent (MM are given in n models ar M surface of sult for the la or charge re s likely to e vertheless, th case with the 0 The free energ first step of the 1G*/MM model. n the PM3/M ve TS, we al in Figure 4- ffect of whic no constrai d the MHDP t calculated nvolves break M water) mo n Figure 4-1 re in an e f Figure 4-4. arge charge ecombination ncounter som he dissociativ e PM3/MM gy surfaces in the e ester hydrolysi MM level the lso construct -5. ch was, of c int on R2 an P + 2 QM H 2 free energy king P-O bo odel and an i 10, where on excellent qu Here it is n separation la n (see Ref. me steric re ve mechanis model in the R 1 , R 2 space for is (the cleavage e addition o ted the free 86 course remov nd in the RS 2 O model fo y surfaces in ond ) with th implicit CO ne can obse antitative ag not entirely c arge R 1 , sinc 145 ). Furthe sistance whe sm seems to e absence of MHDP plus 2 Q of the P-O bond of Mg 2+ in th energy surfa ved latter). O S for PT R2 or the QM re n the R 1 , R he B3LYP//6 SMO solven erve that the greement, b clear whethe ce it is quite ermore the v en occurs in o be more fa f Mg 2+ . QM H 2 O (the syst d) obtained by he QM regio ace for the la On the other was fluctua egion given R 2 space for 6-31G* pote nt model. Th e results obta but quite di er the explic challenging very large c n enzyme ac avorable ove tem with the QM (A) the PMF /C on led to sta arge model o r hand, in 2W ating in 1.9- in Figure 4- the first st ential, using he correspon ained by the ifferent from it solvent gi g to obtain pe charge separ ctive site (se er the associa M region given in COSMO model ; abilization o of the QM re W PT -2.0 a -3. In ep in g both nding e two m the ives a erfect ration e ref. ative, n Figure (B) the of the egion Figure 4-1 which conta FEP/US(blue The corr were pro RC is giv the same R 2 surfac pathway energy p The plate In the ne (MHDP B3LYP// Figure 4 B3LYP// calculate shows a barrier o between 11 B3LYP//6-31G ained an harmon e) and with WHA esponding s opagated by ven in Figure e data (see F ce) indicates compared t plateau, at R eau region w xt step, we e with 2 QM /6-31G*/CO 4-12 B). Th /6-31G*/CO d for the 1W barrier of 1 f 1.4 for the the two mec G*/MM free ener nic constraints o AM(red) and (B) surface is giv adding harm e 4-11 A. W Figure 4-11 B s that the M to the MHD R 2 ~2.0-2.2 Å was taken as t evaluated the water mole SMO (see F e free energ SMO (see W PT mecha 12.3 kcal/mo e 2W PT m chanisms bei rgy surface for M on R 1 -R 2 RC: (A The free energy ven in Figur monic constr We then estim B). This surf Mg 2+ ion shi DP surface ( Å with the ~2 the starting p e PT free en ecules). The Figure 4-12 gy barrier f Figure 4-1 anism using ol (Figure 4 mechanism (M ing 11 kcal/m 87 MDP plus Mg 2+ p A) The 1D free surface calculate re 4-11. Her aints on the mated the 2D face (as wel ifts the reac (see Figure 20 kcal/mol point for exp ergy surface free energy A) and ~14 for the 2W 3). The B3 MHDP plu 4-14) where MHDP plus mol. plus 16 QM H 2 O, energy surface ed in the R 1 , R 2 sp re B3LYP//6 RC defined D free energy ll as the sam ction path m 4-10) and l free energy ploring the s es for the QM y barrier for 4 kcal/mol fo PT was fo 3LYP//6-31G s 1 QM wat it is also c 2QM water , obtained with a calculated alon pace by the FEP/ 6-31G*/MM d as R 1 -R 2 . P y surface in R mpling distrib more toward seems to st y change, re subsequent P M region mo 1W PT is ~ or B3LYP//6 ound to be G*/MM free ter model fo compared to r model) wi a 1D mapping po ng the R 1 -R 2 R /US approach. M MD traject PMF in the R R 1 , R 2 space bution on th ds the associ tabilize the elative to the PT step. odel of Figur ~15 kcal/mo 6-31G*/MM ~3 kcal/mo e energy b or the QM re o the free en ith the differ otential, RC with tories R 1 -R 2 e with he R 1 , iative high- e RS. re 4-3 ol for M (see ol for arrier egion nergy rence Figure 4-1 PMF/B3LYP Figure 4-1 (A) 2D mapp Finally, w 4-15) at t energy su 2 The Free energ P//6-31G*/COSM 3 The Free energ ping with the PM we calculate the B3LYP// urface (see F gy surfaces for the MO and (B) B3LY gy surface for the MF/B3LYP//6-31G ed the 1W PT /6-31G*/MM Figure 4-16) e 1W PT path cal P//6-31G*/MM p e 2W PT path ca G*/COSMO mod T free energy M. In this ca . 88 lculated for the M potentials. alculated for the del . (B) 1D mappi y surface for ase we obtain MHDP + 2 QM H 2 MHDP + 2 QM ing with the PMF r the system ned a higher 2 O QM region of H 2 O the QM re F/B3LYP//6-31G m containing r barrier of 1 f Figure 4-3 using egion of Figure 4- */COSMO mode Mg 2+ (see F 7.5 kcal/mo (A) the -3 using el. Figure ol free Figure 4-1 water as a m using MHDP Figure 4-1 step in hydro At this p presented that the m step invo proximity energy su 14 Free energy p model for the QM P plus 2 QM wate 5 A snapshot of o olysis of phosphat point, it is i d above in ev mechanism o olves breaka y of the term urface in R rofiles calculated M region (RED w er (BLUE curve c one the studied m te monoester. He important to valuation of of the phosp age of the P minal phosph R 1 , R 2 space d on B3LYP//6-31 was obtained using calculated using W models for the QM re the P-O bond i o comment f the rate dete phate hydrol P-O bond (R hate (R 2 ). Th . The lowes 89 1G*//MM potent g WHAM; BLAC WHAM) M region (MDP wi is already broken about the c ermining act lysis can be R 1 ) and brin his step is qu st free energ tial for: the 1W P CK curve using F ith Mg 2+ and 6 Q n and the system i combination tivation barr separated in nging a nuc uantitatively gy path on th PT mechanism u FEP/US) and for QM water molecul is prior to the fin of the free rier. It was al nto 2 distinc cleophilic w y characteriz hese surface using MHDP plus r the 2W PT mec les) used to study nal proton transfe e energy sur lready menti ct steps. The water molecu zed using the es determine s 1 QM chanism y the PT er. rfaces ioned e first ule in e free es the most pro 1 R, R g Figure 4-1 B3LYP//6-31 This heig PMF is c reasonab then con 4-14). By looki (Figure 4 energy is passed it stored in R1~3.2 A reaction) coordinat (relative obable mech 2 R can be d 16 The Free energ 1G*/MM potentia ght is the zer calculated in le values of nnect it with ing at the 2 4-11) and wi s released. O ts TS1 value n R1, but we A). The free thus obtaine tes. At any to the RS of anism for th determined f gy surfaces for th als. ro-level of th n the vicinity the plateau h the free en 2D free ener ithout Mg 2+ One can tak e) as the sta e solved this e energy va ed from the rate, the fre f the net reac he reactants from the corr he 1W PT path c he free energ y of 1 R and in the R 1 , R 2 nergy surface rgy surfaces (Figure 4-1 e any point arting point s problem by alues for the combination ee energies c ction) are sum 90 to pass to t responding m calculated for the gy level for t 2 R . In othe 2 map (e.g se e describing s calculated 1) one can s on the free for the PT. y fixing the e RS in the n of the free calculated by mmarized in the high-ene map relative e model of the QM the PT step, er words, we ee Figure 4- g the PT ste for the P-O see that after e energy surf In this way R1 value o PT step (re energy map y the descri n Table 4-2. ergy plateau, e to the react M region given i for which th e at first iden -10 at R 2 =2Å ep (e.g. Figu O bond clea r the P-O bo face in R1, y, there is s or the P-P di elative to th ps in R1, R2 ibed approac , which elev ant state. n Figure 4-15, us he correspon ntify the rel Å, R 1 =3.2Å ure 4-12 - F avage with ond is broke R2 space (a still some en istance (at w he RS of th and in R1, P ch in this se vation sing the nding evant Å) and Figure Mg 2+ en the at R1 nergy which he net PT ection 91 Table 4-2 Summary of the BLYP//6-31G*/MM, B3LYP//6-31G*/MM and B3LYP//6-31G*(COSMO) free energy calculations (a) System Method ξ or R2, R1 ΔG ‡ from RS description Figure PO 3 (-) with 1 QM H 2 O; 22 A SCAAS Δg( ξ, R2) 1W PT; BLYP//6-31G*/MM ξ=-0.16; ~1.8 6.5(+21.0) b TS 1W PT Figure 4-8 PO 3 (-) with 2 QM H 2 O; 22 A SCAAS 2D-FES in ξ1, ξ2 space; 2W PT; BLYP//6- 31G*/MM ξ =-0.10, ~ 1.8 1.4(+21.0) b TS 2W PT Figure 4-8 (B) PO 3 (-) with 1 QM H 2 O; 22 A SCAAS nonpolar solvent 2D-FES in ξ, R2 space; 1W PT; B3LYP//6- 31G*/COSMO ξ =-0.14 14(+21) c TS 1W PT Figure 4-9 PO 3 (-) with 2 QM H 2 O; 22 A SCAAS nonpolar solvent 2D-FES in ξ1, ξ2 space for 2W PT, B3LYP//6- 31G*/COSMO ξ =-0.10 1.4(+21) c TS 2W PT Figure 4-9 (B) MHDP with 2 QM H 2 O; 20 Q atoms + COSMO solvent with 22 A SCAAS nonpolar solvent 2D-FES in R1, R2 space for P-O break, B3LYP//6-31G*/COSMO 2.0, 3.1 21 plateau Figure 4-10 (A) 2D-FES in ξ, R2 space for 1W PT, B3LYP//6- 31G*/COSMO 1.88, -0.15 15(+21) TS 1W PT Figure 4-12 (A) 2D-FES in ξ1, ξ2 space for 2W PT, B3LYP//6- 31G*/COSMO 3(+21) TS 2W PT Figure 4-13 MHDP with 1(2) QM H 2 O; 20 Q atoms + 22 A SCAAS 2D-FES in R1, R2 space for P-O break, B3LYP//6-31G*/MM 2.0, 3.1 21 plateau Figure 4-10 (B) 2D-FES in ξ, R2 space for 1W PT, B3LYP//6- 31G*/MM 1.88, -0.15 12.3(+21) TS 1W PT Figure 4-12 (B) Figure 4-14 2D-FES in ξ1, ξ2 space for 2W PT, B3LYP//6- 31G*/MM 1.88, -0.20 1.4(+21) TS 2W PT Figure 4-14 MDP with Mg 2+ and 6 QM H 2 O; 32 Q atoms + 18 A SCAAS 2D-FES in ξ, R2 space for 1W PT, B3LYP//6- 31G*/MM 1.88, -0.15 18(+22.5) TS 1W PT Figure 4-16 MDP with Mg 2+ and 16 QM H 2 O; 62 Q atoms + 18 A SCAAS 2D-FES in R1-R2 space for P-O break, B3LYP//6-31G*/MM dd=1.0 22.5 plateau Figure 4-11 (a) Energies in kcal/mol. SCAAS and FES designate, respectively, the surface constraint all atom model 141 and free energy surface (Potential of Mean Force). (b) Using the estimate for the height of the plateau found for MHDP with 2 H 2 O (assuming that R 1 reaches infinity) (c)Using the estimate of the height of the plateau found for MDP with 16 H2O (assuming that R 1 >>1) 92 4.3.3 Nuclear Quantum mechanical and other corrections Using the reference potential approach one can estimate the nuclear quantum mechanical (NQM) correction for the PT reaction. The quantized classical path (QCP) 29,146 method 146 , allows one to obtain efficiently the results of path integral centroid 147 simulations, by developing a free particle quantum perturbation from the classical trajectory of the reacting system. That is the QCP approach expresses the free energy surface along the reaction coordinate as: ln exp ( ) fp m mm E E GGkT x EEx (4.2) where m G is the free energy associated with for the mapping potential m E ( e.g. the potential of Eq. (4.1) or in the case of EVB modeling the EVB mapping potential). E is the quantum potential of the quasi particles , given by 2 1 (1 / )[ ] 2 P n m EPE (4.3) where P is the number of quasiparticles and 2nh . The notation < > V designates an average on the indicted potential . The mapping potential in Eq. 2 is evaluated at the position of the classical trajectory coordinate, x, which is considered as the centroid and < > fp designates an average calculated on the quantum free particle potential by Monte Carlo or Langevin dynamics. Calculation of the NQM correction is extremely expensive computationally and estimating it on the QM/MM potential presents a major challenge 53 . The central idea of the QCP approach is using the classical trajectory as the reference centroid for NQM calculations. We also note that we can obtain the NQM –corrected free energy surface on the QM(ai)/MM potentials replacing E in E by the adiabatic QM/MM surface. This can be drastically accelerated by using the first order expansion of E at each call. Another important correction for the estimated barriers might be due to moving to a higher-level of theory ab initio Hamiltonian, here the optimal approach would be to use the PD approach, using the lower level surface as a reference for the free energy of the higher-level surface. 93 4.3.4 Emerging Picture As seen from Table 4-2 the difference between the QM(ai)/MM classical PMF for the 1W and 2W paths for MHDP is about 11 kcal/mol. This difference does not include the corresponding NQM corrections. In the presence of Mg +2 the difference seems to increase by a few kcal /mol. To estimate the agreement of the total calculated barrier with the experimentally observe barrier we might have to move to a higher QM level (since B3LYP is known to systematically underestimate the activation barriers) and this can be done by calculating with the PD model the correction of moving from B3LYP//6-31G* to a higher level of theory. Meanwhile, we can simply consider the corresponding correction obtained in our previous work. 136 The corresponding average correction is estimated to be about 3 kcal/mol and adding this value to the QM(ai)/MM barrier of MHDP about 23 kcal/mol gives a barrier of 26 kcal/mol. The non- equilibrium solvation effect is expected to further increase the barrier by about 2-3 kcal/mol, 148 which makes the overall barrier associated with the hydrolysis of MHDP close to about 29 kcal/mol. This is comparable to the experimentally observed barrier of about 31.2 kcal/mol for the hydrolysis of PP i 3- , 149 but these assumptions require further rigorous validation. 4.4 Concluding Discussion The relative energy of the 1W and 2W paths in the hydrolysis of phosphate monoesters has been explored in a systematic and reliable way in the current chapter by means of QM(ai)/MM free energy calculations. Two possible mechanistic paths for the PT step were compared by generating the relevant QM(ai)/MM surfaces. This work presents a careful theoretical study of the hydrolysis of phosphate monoesters in solution, exploring the relevant free energy surfaces by several approaches including: QM (DFT)/implicit solvent, PM3/MM, QM(DFT)/MM. This allows us to move ahead of traditional manual mapping and incomplete minimization studies using the QM(ai)/implicit solvent to a much more solid evaluation of the relevant surfaces and activation free energies. It is found that in solution the 2W mechanism is more likely than the 1W mechanism. Exploring several models with the increasing degree of complexity it was found in agreement with a previous study 136 that the 1W becomes less probable in the presence of factors that reduce the pKa of the acceptor oxygen such as Mg 2+ (which increases the 1W barrier). 94 The present study indicates that semi-empirical approaches may give quit different results than those obtained by ab initio calculations, including the finding that with the PM3 Hamiltonian we obtain similar barriers for the 1W and 2W paths. Interestingly, it is found that a popular BLYP functional gives much smaller differences between the 1W and 2W barriers than that obtained with a hybrid DFT functional. In the present work we demonstrated by performing full free energy (PMF) calculations involving direct sampling on the QM(ai)/MM potential how to address the mechanistic controversies in chemical reaction. Namely, we compared the 1W and 2W PT mechanisms and the possible mechanisms of the P-O bond breakage (which was just briefly touched in this study) of the phosphate hydrolysis reaction. However, by using the reference potential strategy, it is possible to further improve the quantitative understanding of the studied problems. In particular with the PD approach one can evaluate the free energy perturbation of moving to a higher level of theory QM(ai)/MM potential, while the reference-potential based QCP approach can be used to estimate the NQM correction. 95 5 Chapter 5. Prospective of the Reference Potential Approach 5.1 Other applications of the reference potential approach strategy 5.1.1 Calculation of absorption and emission spectra Another relevant application of the PD approach is modeling properties of the systems in the excited state. Here the problem of the QM(ai)/MM methodology is complicated by the fact that in general quantitative description of the excited states by electronic structure methods is several orders of magnitude more expensive. One particular application of QM/MM modeling is characterization of photochemical properties of fluorescent proteins. Here attention of the experimental community has been focused on designing new fluorescent proteins with tailored photochemical properties such as large difference in wavelengths of the absorbed light and the emitted light, flr abs . Rational design (and improvement of the existing structures) of fluorescent proteins require understanding of the photochemical cycle which the fluorophore, the part of the protein responsible for fluorescence, undergoes. A particular family of proteins with a large stock shift is called red fluorescent protein, since the emitted wavelength of the light (fluorescence) lies in the red light region of the electromagnetic spectrum. Suggested models of the photochemical cycle involved proton transfer in the excited state. We took one representative of the red FP family – LSSmKate1 150 (Figure 5-1) The corresponding chromophore and the suggested proton acceptor, GLU 160, are shown in Figure 5-1 (RIGHT) where the chromophore is protonated and GLU 160 is charged (proton transfer leads to protonation of GLU160 and ionization of the chromophore). 96 Figure 5-1 LEFT A monomeric unit of the red fluorescent protein LSmKate1 with the view given from the top of the barrel formed by the protein (shown as green cartoon). A snapshot is taken from a QCFF-Pi trajectory for the protonated fluorophore. (RIGHT) Zoom of the chromophore in the protonated state and the suggested proton acceptor GLU 160. 5.1.2 Methods To verify our computational methodology we attempted to calculate the absorbance and emission spectra of the LSSmKate1 chromophore by means of QM/MM MD simulation. As was already pointed out quantitatively reliable electronic structure methods have several orders of magnitude higher computational cost compared to the ground state counterparts. Therefore, a brute-force QM(ai)/MM approach to calculating the spectra by propagating QM(ai)/MM MD trajectories (with the appropriate for the excited state description level of theory) and evaluating the energy gap between the ground and the excited states is prohibitively expensive. Here, recent attempts to describe the absorbance spectra of related systems involved propagating QM/MM or MM MD trajectories and periodically estimating the energy gap using TDDFT. Since calculation of the emission spectra requires propagating QM/MM MD trajectories in the excited state the emission spectra could not be calculated in this manner. We adopted a semi-empirical potential, QCFF-PI, parameterized for modeling Pi-conjugated heteroatomic systems, and used it as the reference potential in our calculations. With QCFF-PI we are able to propagate trajectories both in the ground state and in the excited state, thus we have an advantageous tool for not only calculating both the absorbance and emission spectra, but also for modeling chemical processes in the excited state. 97 5.1.3 Propagating MD trajectories on the QCFF-PI potential The QCFF-PI approach 151 (Quantum chemical extension of the self-consistent force field), uses ZDO (Zero Differential Overlap) approximation when all 11 ab are set to zero when ab in the frame of Pariser-Parr-Pople approximation assuming separation of and electrons. Furthermore, the rest of the integrals are approximated by parameterized analytical functions, which makes it computationally affordable. With this approach one can describe conjugated heteroatomic molecules in both the ground electronic state and in the excited state by assuming a formal separation of the and electrons. The electrons are described using the classical force field formalism and the electrons are treated with the semiempirical Pariser- Parr-Pople model. First, we identified ionazable potein residues close to the chromophore and determined the corresponding ionization states (pKa) by using the PDLD/S-LRA 152 approach. At pH=7 the corresponsding ionization states of the nearby residues are shown in Figure 5-2. Also all ionic pairs in the protein were ionized, while other remote residues were kept neutral. Figure 5-2 Assignment of ionization states to nearby ionizable residues of chromophore in LSSmKate1 red fluorescent protein at pH=7. Note that the protonation state of GLU160 in simulations is determined by the protonated state of the chromophore. Once the ionization state were determined QM/MM MD trajectories with the QM region described by QCFF/Pi program were propagated for the protonated (24 QM atoms) and for the 98 ionized forms of the chromophore (23 QM atoms) in the ground state and in the excited state. MD trajectories involved 100,000 MD steps with 0.5 fs time steps at 300K. The MM region was represented by an 18 A sphere (SCAAS) with the protein atoms and 128 generated water molecules. First 30,000 steps were discarded (15 ps equilibration). QCFF/Pi excitation energies for an isolated chromophore and for the polarized by MM charges chromophore were recorded. Then the obtained data points (7000 values) were sorted among 100 bins from the lowest encountered wavelength to the highest one. Then intensity was calculated as fraction of hits per bin normalized to the total number of samples. For the MD snapshots recorded each 1000 MD steps QM/MM single point energies were calculated on the CIS(D)//6-31+G**(FC)/MM potential using QChem 3.2. Thus, the spectra calculated on the target potential were simulated from 700 data points, see Figure 5-3 Figure 5-3 Absorption (A) and emission (E) spectra calculated for the protonated (HA) and for the ionized states (A - ) of the chromophore in LSSmKate1 red fluorescent protein. (LEFT) Cyan line/white dots – A(HA), CIS(D); blue line – A(HA), QCFF- π; cyan line/white dots – A experiment at pH=7; cyan line/red dots – A experiment at pH=11. (RIGHT) cyan line/white dots – A experiment at pH=7; cyan line/red dots – A experiment at pH=11; red line/white stars and orange line/white stars – E experiment 150 at pH=11; orange line - A(A - ), QCFF- π; purple line - E(A - ), QCFF- π; cyan line/white dots - A(A - ), CIS(D); magenta line/white dots - E(A - ), CIS(D). Units are: relative intensity versus wavelength of the absorbed/emitted light. Here it is important however to account for the overlap between the reference potential (on which sampling is performed and MD is propagated) and between the target potential (at which spectra are computed. This is due to the fact that probability of being on the reference potential is not equal to that on the target potential. Therefore, this probability of being in a particular configuration on the TP should be related to the probability of being in the same configuration on 99 the reference potential. This probability can be taken into account through the energy difference (the overlap) between the TP and the RP 69 : TGT REF REF TGT REF TGT REF EE E EE E E EEe EE e (5.1) 5.2 Concluding remarks The direction of a spontaneous chemical process is known to be determined by the free energy change, while the rate of the process according to the transition-state theory is determined by the free energy barrier, which a system must climb in order to get from the reactant state to the product state. Accurate calculations of the free energy barrier for the chemical reactions in the condensed phase, in particular for enzymatic reactions, while needed for applications in pharmaceutics and bioengineering (e.g. design of artificial enzymes, catalysts and enzyme-inhibitors) have been one of the major challenges in computer modeling. One of the most widely used techniques to the free energy calculations in condensed phases nowadays is a hybrid QM/MM 56 strategy, which when combined with the ab initio QM potentials (with which one can currently achieve the desired level of chemical accuracy in the gas phase calculations) is believed to make a great advance towards the predictive and accurate results. However, cost of the direct sampling (required for the free energy convergence) on the ab initio QM/MM potential is still often prohibitive for the real system of interest in the condensed phases. The reference potential 17 Paradynamics approach offers an efficient strategy of solving this problem. It is based on the idea of using a computationally cheap reference potential to construct the reference free energy surface, followed by evaluating the free energy perturbation with the linear response approximation while moving from the reference potential to the target potential at the reactant state and at the transition state 71 . It has been shown that one can achieve about 2 order of magnitude time-savings in the PD approach compared to other computational schemes 64 by using the EVB reference potential and calculating the free energy perturbation by LRA. Since the efficiency of the LRA convergence depends on the proximity of the reference potential to the 100 target potential in this work we presented two refinement strategies in order to ensure the desired LRA convergence 64 . By using these strategies one can minimize the difference between the RP and the TP. The first strategy is based on the parametric refinement of EVB parameters. The second strategy employs a general way of refining of an arbitrary reference potential by adding Gaussian functions. Next we performed the validation of the LRA approach by performing a multi-step FEP. Namely, we conducted a comparative study of calculating the free energy perturbation when moving between 2 MO-based QM/MM potentials (where one of them is the reference potential) using the full n-steps FEP and the LRA approaches for chemical reactions in condensed phases (in water and in protein). Next, we compared prediction of the PD model for the activation free energy barrier to the explicitly calculating Potential of the Mean Force. Next we presented a series of gradual model improvements and advances for performing efficient sampling while constructing free energy surfaces. Thus, we refined and quantified all steps in the PD thermodynamic cycle, and formulated a robust strategy of validating it. The developed tools for free energy calculations were applied to several highly practical problems in computer modeling. In chapter 3 we calculated the catalytic effect of an enzyme. In chapter 4 we quantified mechanistic paths for the phosphate hydrolysis, in particular for the mechanism of the proton transfer. In addressing the practical problems new challenges have emerged. One is evaluation of the nuclear quantum effect 153 , the tunneling correction to the activation free energy barrier of the PT (and the related isotopic effect). Another challenge, increasing the computational cost, is description of the excited states and treatment of large-size QM regions 154 , particularly when a chemical conversion takes place in the coordination sphere of a metal (as what seen in case of Mg 2+ in the phosphate hydrolysis 138 and in many enzymes which have one and more metal nuclei in their active center). In spite of the enormous progress in advancing electronic structure calculations, the brute force treatment of such enzymes at the QM(ai)/MM level of theory is still extremely resource demanding and often impractical 155 . The reference potential strategy offers a general and sometimes the only practical way of circumventing the high computational cost of solving many challenging problems 58 including the one of calculating QM(ai)/MM free energy surfaces with the Paradynamics model. 101 6 References (1) Bentzien, J.; Muller, R. P.; Florian, J.; Warshel, A. Hybrid ab initio quantum mechanics molecular mechanics calculations of free energy surfaces for enzymatic reactions: The nucleophilic attack in subtilisin. J. Phys. Chem. B. 1998, 102, 2293-2301. (2) Štrajbl, M.; Hong, G.; Warshel, A. Ab Initio QM/MM Simulation with Proper Sampling: "First Principle" Calculations of the Free Energy of the Autodissociation of Water in Aqueous Solution. J Phys Chem B 2002, 106, 13333-13343. (3) Rosta, E.; Klahn, M.; Warshel, A. Towards Accurate Ab Initio QM/MM Calculations of Free-Energy Profiles of Enzymatic Reactions. J Phys Chem B 2006, 110, 2934- 2941. (4) Rosta, E.; Haranczyk, M.; Chu, Z. T.; Warshel, A. Accelerating QM/MM Free Energy Calculations: Representing the Surroundings by an Updated Mean Charge Distribution. J. Phys. Chem. B. 2008, 112, 5680-5692. (5) Kamerlin, S. C. L.; Haranczyk, M.; Warshel, A. Progress in Ab Initio QM/MM Free-Energy Simulations of Electrostatic Energies in Proteins: Accelerated QM/MM Studies of pKa, Redox Reactions and Solvation Free Energies†. The Journal of Physical Chemistry B 2008, 113, 1253-1272. (6) Shurki, A.; Warshel, A. Structure/function correlations of proteins using MM, QM/MM, and related approaches: Methods, concepts, pitfalls, and current progress. Adv Protein Chem 2003, 66, 249-313. (7) Hu, H.; Yang, W. Free Energies of Chemical Reactions in Solution and in Ezymes with ab initio Qantum Mehanics/Molecular Mechanics Methods. Annu Rev Phys Chem 2008, 59, 573-601. (8) Muller, R. P.; Warshel, A. Ab Initio Calculations of Free Energy Barriers for Chemical Reactions in Solution. J. Phys. Chem. 1995, 99, 17516-17524. (9) Zhang, Y.; Liu, H.; Yang, W. Free energy calculation on enzyme reactions with an efficient iterative procedure to determine minimum energy paths on a combined ab initio QM/MM potential energy surface. J. Chem. Phys. 2000, 112, 3483-3492. (10) Rod, T. H.; Ryde, U. Quantum Mechanical Free Energy Barrier for an Enzymatic Reaction. Phys. Rev. Lett. 2005, 94, 138302. (11) Olsson, M. H. M.; Hong, G.; Warshel, A. Frozen Density Functional Free Energy Simulations of Redox Proteins: Computational Studies of the Reduction Potential of Plastocyanin and Rusticyanin. J. Am. Chem. Soc. 2003, 125, 5025-5039. (12) Liu, W. B.; Wood, R. H.; Doren, D. J. Density and temperature dependences of hydration free energy of Na + and Cl - at supercritical conditions predicted by ab initio/classical free energy perturbation. J Phys Chem B 2003, 107, 9505-9513. (13) Iftimie, R.; Salahub, D.; Schofield, J. An efficient Monte Carlo method for calculating ab initio transition state theory reaction rates in solution. J. Chem. Phys. 2003, 119, 11285-11297. (14) Crespo, A.; Marti, M. A.; Estrin, D. A.; Roitberg, A. E. Multiple-steering QM/MM calculation of the free energy profile in chorismate mutase. J. Am. Chem. Soc. 2005, 127, 6940-6941. 102 (15) Rosta, E.; Haranczyk, M.; Chu, Z. T.; Warshel, A. Accelerating QM/MM free energy calculations: Representing the surroundings by an updated mean charge distribution. J Phys Chem B 2008, 112, 5680-5692. (16) Pradipta, B. Accelerating quantum mechanical/molecular mechanical sampling using pure molecular mechanical potential as an importance function: The case of effective fragment potential. J. Chem. Phys. 2005, 122, 091102. (17) Luzhkov, V.; Warshel, A. Microscopic Models for Quantum Mechanical Calculations of Chemical Processes in Solutions: LD/AMPAC and SCAAS/AMPAC Calculations of Solvation Energies. J. Comp. Chem. 1992, 13, 199-213. (18) Huber, T.; Torda, A. E.; van Gunsteren, W. F. Local elevation: a method for improving the searching properties of molecular dynamics simulations. J. Comput. Aided. Mol. Des. 1994, 8, 695-708. (19) Christen, M.; Keller, B.; van Gunsteren, W. F. Biomolecular structure refinement based on adaptive restraints using local elevation simulation. J. Biomol. NMR 2007, 39, 265-273. (20) Bakowies, D.; van Gunsteren, W. F. Simulations of apo and holo-fatty acid binding protein: Structure and dynamics of protein, ligand and internal water. J. Mol. Biol. 2002, 315, 713-736. (21) Laio, A.; Parrinello, M. Escaping free-energy minima. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 12562-12566. (22) Donadio, D.; Bernasconi, M. Ab initio simulation of photoinduced transformation of small rings in amorphous silica. Phys. Rev. B. 2005, 71, 073307. (23) Asciutto, E.; Sagui, C. Exploring intramolecular reactions in complex systems with metadynamics: The case of the malonate anions. J. Phys. Chem. A. 2005, 109, 7682-7687. (24) Urakawa, A.; Iannuzzi, M.; Hutter, J.; Baiker, A. Towards a rational design of ruthenium CO 2 hydrogenation catalysts by ab initio metadynamics. Chemistry 2007, 13, 6828- 6840. (25) Michel, C.; Laio, A.; Mohamed, F.; Krack, M.; Parrinello, M.; Milet, A. Free energy ab initio metadynamics: A new tool for the theoretical study of organometallic reactivity? Examples of the C-C and C-H reductive eliminations from platinum (IV) complexes. Organometallics 2007, 26, 1241-1249. (26) Ensing, B.; De Vivo, M.; Liu, Z.; Moore, P.; Klein, M. L. Metadynamics as a tool for exploring free energy landscapes of chemical reactions. Acc. Chem. Res. 2006, 39, 73-81. (27) Laio, A.; Parrinello, M. Escaping free-energy minima. Proceedings of the National Academy of Sciences 2002, 99, 12562-12566. (28) Hwang, J.-K.; Warshel, A. A Quantized Classical Path Approach for Calculations of Quantum Mechanical Rate Constants. J. Phys. Chem. 1993, 97, 10053-10058. (29) Hwang, J.-K.; Warshel, A. How Important are Quantum Mechanical Nuclear Motions in Enzyme Catalysis? J. Am. Chem. Soc. 1996, 118, 11745-11751. (30) Fan, Z. Z.; Hwang, J.-K.; Warshel, A. Using simplified protein representation as a reference potential for all-atom calculations of folding free energy. Theor. Chem. Acc. 1999, 103, 77-80. (31) Messer, B. M.; Roca, M.; Chu, Z. T.; Vicatos, S.; Kilshtain, A.; Warshel, A. Multiscale simulations of protein landscapes: Using coarse grained models as reference potentials to full explicit models. Proteins: Struct. Funct. Bioinformat. 2010, 78, 1212-1227. 103 (32) Wesolowski, T.; Muller, R. P.; Warshel, A. Ab initio frozen density functional calculations of proton transfer reactions in solution. J. Phys. Chem. 1996, 100, 15444-15449. (33) Roca, M.; Liu, H.; Messer, B.; Warshel, A. On the relationship between thermal stability and catalytic power of enzymes. Biochemistry 2007, 46, 15076-15088. (34) Roca, M.; Messer, B.; Hilvert, D.; Warshel, A. On the relationship between folding and chemical landscapes in enzyme catalysis. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 13877-13882. (35) Aqvist, J.; Warshel, A. Simulation of Enzyme Reactions Using Valence Bond Force Fields and Other Hybrid Quantum/Classical Approaches. Chem. Rev. 1993, 93, 2523- 2544. (36) Warshel, A. Computer Modeling of Chemical Reactions in Enzymes and Solutions; Wiley-Interscience, 1991. (37) Wesolowski, T.; Warshel, A. Ab initio free energy perturbation calculations of solvation free energy using the frozen density functional approach. J. Phys. Chem. 1994, 98, 5183-5187. (38) Mones, L.; Kulhánek, P.; Simon, I.; Laio, A.; Fuxreiter, M. The energy gap as a universal reaction coordinate for the study of chemical reactions. J. Phys. Chem. B. 2009, 113, 7867-7873. (39) Kamerlin, S. C. L.; Warshel, A. The EVB as a quantitative tool for formulating simulations and analyzing biological and chemical reactions. Faraday Discuss. 2010, 145, 71- 106. (40) Zwanzig, R. W. High-Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases. J. Chem. Phys. 1954, 22, 1420. (41) Hwang, J.-K.; King, G.; Creighton, S.; Warshel, A. Simulation of free energy relationships and dynamics of S N 2 reactions in aqueous solution. J. Am. Chem. Soc. 1988, 110, 5297-5311. (42) King, G.; Warshel Investigation of the free-energy functions for electron-transfer reactions. J. Chem. Phys. 1990, 93, 8682-8692. (43) Hong, G.; Rosta, E.; Warshel, A. Using the Constrained DFT Approach in Generating Diabatic Surfaces and Off Diagonal Empirical Valence Bond Terms for Modeling Reactions in Condensed Phases. J Phys Chem B 2006, 110, 19570-19574. (44) Lee, F. S.; Chu, Z. T.; Warshel, A. Microscopic and Semimicroscopic Calculations of Electrostatic Energies in Proteins by the POLARIS and ENZYMIX Programs. J. Comp. Chem. 1993, 14, 161-185. (45) Warshel, A.; Chu, Z. T.; Villa, J.; Strajbl, M.; Schutz, C. N.; Shurki, A.; Vicatos, S.; Chakrabarty, S.; Plotnikov, N. V.; Schopf, P. MOLARIS-XG. 9.11 ed.; University of Southern California: Los Angeles, CA, 90089. 2012. (46) Stewart, J. J. P. MOPAC2009, MOPAC2009; Stewart Computational Chemistry: Colorado Springs, CO, 2008. (47) Stewart, J. J. P. Optimization of Parameters for Semiempirical Methods I. Method. Journal of Computational Chemistry 1989, 10, 209-220. (48) Stewart, J. J. P. Optimization of parameters for semiempirical methods. II. Applications. J. Comput. Chem. 1989, 10, 221-264. 104 (49) Kamerlin, S. C. L.; Florian, J.; Warshel, A. Associative versus dissociative mechanisms of phosphate monoester hydrolysis: On the interpretation of activation entropies. ChemPhysChem 2008, 9, 1767-1773. (50) Rosta, E.; Kamerlin, S. C. L.; Warshel, A. On the interpretation of the observed linear free energy relationship in phosphate hydrolysis: A thorough computational study of phosphate diester hydrolysis in solution. Biochemistry 2008, 47, 3725-3735. (51) Kamerlin, S. C. L.; Warshel, A. On the energetics of ATP hydrolysis in solution J Phys Chem B 2009, 113, 15692-15698. (52) Barlow, S. E.; van Doren, J. M.; Bierbaum, V. M. The gas phase displacement reaction of chloride ion with methyl chloride as a function of kinetic energy. J. Am. Chem. Soc. 1988, 110, 7240-7242. (53) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Montgomery, J. A.; Vreven, J. T.; Kudin, K. N.; Burant, J. C.et al. Gaussian 03, Revision C.03, 2004. (54) Villà, J.; Warshel, A. Energetics and dynamics of enzymatic reactions. J. Phys. Chem. B. 2001, 105, 7887-7907. (55) Toniolo, A.; Thompson, A. L.; Martı ́nez, T. J. Excited state direct dynamics of benzene with reparameterized multi-reference semiempirical configuration interaction methods. Chemical Physics 2004, 304, 133-145. (56) Warshel, A.; Levitt, M. Theoretical studies of enzymic reactions: Dielectric, electrostatic and steric stabilization of the carbonium ion in the reaction of lysozyme. Journal of Molecular Biology 1976, 103, 227-249. (57) Levitt, M.; Warshel, A. Computer simulation of protein folding. Nature 1975, 253, 694-698. (58) Kamerlin, S. C. L.; Warshel, A. Multiscale modeling of biological functions. Physical Chemistry Chemical Physics 2011, 13, 10401-10411. (59) Hu, H.; Yang, W. Free Energies of Chemical Reactions in Solution and in Enzymes with Ab Initio Quantum Mechanics/Molecular Mechanics Methods. Annual Review of Physical Chemistry 2008, 59, 573-601. (60) Senn, H. M.; Thiel, W. QM/MM Methods for Biomolecular Systems. Angewandte Chemie International Edition 2009, 48, 1198-1229. (61) Muller, R. P.; Warshel, A. Ab Initio Calculations of Free Energy Barriers for Chemical Reactions in Solution. The Journal of Physical Chemistry 1995, 99, 17516. (62) Luzhkov, V.; Warshel, A. Microscopic models for quantum mechanical calculations of chemical processes in solutions: LD/AMPAC and SCAAS/AMPAC calculations of solvation energies. Journal of Computational Chemistry 1992, 13, 199-213. (63) Bentzien, J.; Muller, R. P.; Florian, J.; Warshel, A. Hybrid ab Initio Quantum Mechanics/Molecular Mechanics Calculations of Free Energy Surfaces for Enzymatic Reactions: The Nucleophilic Attack in Subtilisin. The Journal of Physical Chemistry B 1998, 102, 2293- 2301. (64) Plotnikov, N. V.; Kamerlin, S. C. L.; Warshel, A. Paradynamics: An Effective and Reliable Model for Ab Initio QM/MM Free-Energy Calculations and Related Tasks. The Journal of Physical Chemistry B 2011, 115, 7950. 105 (65) Warshel, A.; Weiss, R. M. An empirical valence bond approach for comparing reactions in solutions and in enzymes. Journal of the American Chemical Society 1980, 102, 6218-6226. (66) Hwang, J. K.; King, G.; Creighton, S.; Warshel, A. Simulation of free energy relationships and dynamics of SN2 reactions in aqueous solution. Journal of the American Chemical Society 1988, 110, 5297-5311. (67) King, G.; Warshel, A. Investigation of the free energy functions for electron transfer reactions. J. Chem. Phys. 1990, 93, 8682. (68) Zwanzig, R. High Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases. J. Chem. Phys. 1954, 22, 1420. (69) Torrie, G. M.; Valleau, J. P. Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. Journal of Computational Physics 1977, 23, 187- 199. (70) Lee, F. S.; Chu, Z.-T.; Bolger, M. B.; Warshel, A. Calculations of antibody- antigen interactions: microscopic and semi-microscopic evaluation of the free energies of binding of phosphorylcholine analogs to McPC603. Protein Engineering 1992, 5, 215-228. (71) Rosta, E.; Klahn, M.; Warshel, A. Towards Accurate Ab Initio QM/MM Calculations of Free-Energy Profiles of Enzymatic Reactions. The Journal of Physical Chemistry B 2006, 110, 2934. (72) Mones, L.; Kulhanek, P.; Simon, I.N.; Laio, A.; Fuxreiter, M. The Energy Gap as a Universal Reaction Coordinate for the Simulation of Chemical Reactions. The Journal of Physical Chemistry B 2009, 113, 7867-7873. (73) Claeyssens, F.; Harvey, J. N.; Manby, F. R.; Mata, R. A.; Mulholland, A. J.; Ranaghan, K. E.; Schütz, M.; Thiel, S.; Thiel, W.; Werner, H.-J. High-Accuracy Computation of Reaction Barriers in Enzymes. Angewandte Chemie International Edition 2006, 45, 6856-6859. (74) Lonsdale, R.; Hoyle, S.; Grey, D. T.; Ridder, L.; Mulholland, A. J. Determinants of Reactivity and Selectivity in Soluble Epoxide Hydrolase from Quantum Mechanics/Molecular Mechanics Modeling. Biochemistry 2012, 51, 1774-1786. (75) Valiev, M.; Yang, J.; Adams, J. A.; Taylor, S. S.; Weare, J. H. Phosphorylation Reaction in cAPK Protein Kinase-Free Energy Quantum Mechanical/Molecular Mechanics Simulations. The Journal of Physical Chemistry B 2007, 111, 13455-13464. (76) Wang, T.; Yin, H.; Wang, D.; Valiev, M. Hybrid Quantum Mechanical and Molecular Mechanics Study of the SN2 Reaction of CCl4 + OH– in Aqueous Solution: The Potential of Mean Force, Reaction Energetics, and Rate Constants. The Journal of Physical Chemistry A 2012, 116, 2371-2376. (77) Woods, C. J.; Manby, F. R.; Mulholland, A. J. An efficient method for the calculation of quantum mechanics/molecular mechanics free energies. The Journal of Chemical Physics 2008, 128, 014109-014108. (78) Wood, R. H.; Yezdimer, E. M.; Sakane, S.; Barriocanal, J. A.; Doren, D. J. Free energies of solvation with quantum mechanical interaction energies from classical mechanical simulations. The Journal of Chemical Physics 1999, 110, 1329-1337. (79) Beierlein, F. R.; Michel, J.; Essex, J. W. A Simple QM/MM Approach for Capturing Polarization Effects in Protein −Ligand Binding Free Energy Calculations. The Journal of Physical Chemistry B 2011, 115, 4911-4926. 106 (80) Cao, J.; Bjornsson, R.; Bühl, M.; Thiel, W.; van Mourik, T. Modelling Zwitterions in Solution: 3-Fluoro- γ-aminobutyric Acid (3F-GABA). Chemistry – A European Journal 2012, 18, 184-195. (81) Zhang, Y.; Liu, H.; Yang, W. Free energy calculation on enzyme reactions with an efficient iterative procedure to determine minimum energy paths on a combined ab initio QM/MM potential energy surface. The Journal of Chemical Physics 2000, 112, 3483-3492. (82) Valiev, M.; Garrett, B. C.; Tsai, M.-K.; Kowalski, K.; Kathmann, S. M.; Schenter, G. K.; Dupuis, M. Hybrid approach for free energy calculations with high-level methods: Application to the S[sub N]2 reaction of CHCl[sub 3] and OH[sup -] in water. The Journal of Chemical Physics 2007, 127, 051102-051104. (83) Yin, H.; Wang, D.; Valiev, M. Hybrid Quantum Mechanical/Molecular Mechanics Study of the SN2 Reaction of CH3Cl+OH– in Water. The Journal of Physical Chemistry A 2011, 115, 12047-12052. (84) Savelyev, A.; Papoian, G. A. Chemically accurate coarse graining of double- stranded DNA. Proceedings of the National Academy of Sciences 2010, 107, 20340-20345. (85) Piela, L.; Kostrowicki, J.; Scheraga, H. A. On the multiple-minima problem in the conformational analysis of molecules: deformation of the potential energy hypersurface by the diffusion equation method. The Journal of Physical Chemistry 1989, 93, 3339-3346. (86) Chandrasekhar, J.; Smith, S. F.; Jorgensen, W. L. Theoretical examination of the SN2 reaction involving chloride ion and methyl chloride in the gas phase and aqueous solution. Journal of the American Chemical Society 1985, 107, 154-163. (87) Huber, T.; Torda, A. E.; Gunsteren, W. F. Local elevation: A method for improving the searching properties of molecular dynamics simulation. Journal of Computer- Aided Molecular Design 1994, 8, 695-708. (88) Kato, M.; Warshel, A. Through the Channel and around the Channel:  Validating and Comparing Microscopic Approaches for the Evaluation of Free Energy Profiles for Ion Penetration through Ion Channels. The Journal of Physical Chemistry B 2005, 109, 19516-19522. (89) Warshel, A.; W. Parson, W. Dynamics of biochemical and biophysical reactions: insight from computer simulations. Quarterly Reviews of Biophysics 2001, 34, 563-679. (90) Aqvist, J.; Hansson, T. On the Validity of Electrostatic Linear Response in Polar Solvents. The Journal of Physical Chemistry 1996, 100, 9512-9521. (91) Grossfield, A. WHAM: an Implementation of the Weighted Histogram Analysis method, WHAM 2.0.6; University of Rochester: Rochester, NY, 2012. (92) Kumar, S.; Rosenberg, J. M.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A. The Weighted Histogram Analysis Method for Free-Energy Calculations on Biomolecules. I. The method. Journal of Computational Chemistry 1992, 13, 1011-1021. (93) Hwang, J. K.; King, G.; Creighton, S.; Warshel, A. Simulation of free energy relationships and dynamics of SN2 reactions in aqueous solution. J. Am. Chem. Soc. 1988, 110, 5297-5311. (94) Villa , J.; Warshel, A. Energetics and Dynamics of Enzymatic Reactions. The Journal of Physical Chemistry B 2001, 105, 7887-7907. (95) Atkins, P.; Friedman, R. Molecular Quantum Mechanics; Oxford University Press, USA, 2010. 107 (96) Warshel, A.; Levitt, M. Theoretical studies of enzymic reactions: dielectric, electrostatic and steric stabilization of the carbonium ion in the reaction of lysozyme. J. Mol. Biol. 1976, 103, 227-249. (97) Walker, R. C.; Crowley, M. F.; Case, D. A. The implementation of a fast and accurate QM/MM potential method in Amber. Journal of Computational Chemistry 2008, 29, 1019-1031. (98) Klamt, A.; Schuurmann, G. COSMO: a new approach to dielectric screening in solvents with explicit expressions for the screening energy and its gradient. Journal of the Chemical Society, Perkin Transactions 2 1993, 799-805. (99) Warshel, A.; King, G. Polarization constraints in molecular dynamics simulation of aqueous solutions: The surface constraint all atom solvent (SCAAS) model. Chem. Phys. Lett. 1985, 121, 124-129. (100) Kamerlin, S. C. L.; Warshel, A. On the Energetics of ATP Hydrolysis in Solution. The Journal of Physical Chemistry B 2009, 113, 15692-15698. (101) Sharma, P. K.; Xiang, Y.; Kato, M.; Warshel, A. What Are the Roles of Substrate-Assisted Catalysis and Proximity Effects in Peptide Bond Formation by the Ribosome?†. Biochemistry 2005, 44, 11307-11314. (102) Barducci, A.; Bonomi, M.; Parrinello, M. Metadynamics. Wiley Interdisciplinary Reviews: Computational Molecular Science 2011, 1, 826-843. (103) Retegan, M.; Martins-Costa, M.; Ruiz-Lopez, M. F. Free energy calculations using dual-level Born--Oppenheimer molecular dynamics. The Journal of Chemical Physics 2010, 133, 064103-064105. (104) Rod, T. H.; Ryde, U. Quantum Mechanical Free Energy Barrier for an Enzymatic Reaction. Physical Review Letters 2005, 94, 138302. (105) Leach, A. Molecular Modelling: Principles and Applications (2nd Edition); Prentice Hall, 2001. (106) Roux, B. The calculation of the potential of mean force using computer simulations. Computer Physics Communications 1995, 91, 275-282. (107) Christ, C. D.; Mark, A. E.; van Gunsteren, W. F. Basic ingredients of free energy calculations: A review. Journal of Computational Chemistry 2010, 31, 1569-1582. (108) Kato, M.; Warshel, A. Through the Channel and around the Channel: Validating and Comparing Microscopic Approaches for the Evaluation of Free Energy Profiles for Ion Penetration through Ion Channels. J. Phys. Chem. B 2005, 109, 19516-19522. (109) Kästner, J. Umbrella sampling. Wiley Interdisciplinary Reviews: Computational Molecular Science 2011, 1, 932-942. (110) King, G.; Warshel, A. Investigation of the Free Energy Functions for Electron Transfer Reactions. The Journal of Chemical Physics 1990, 93, 8682. (111) Plotnikov, N. V.; Warshel, A. Exploring, Refining, and Validating the Paradynamics QM/MM Sampling. The Journal of Physical Chemistry B 2012, 116, 10342- 10356. (112) Kumar, S.; Rosenberg, J. M.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A. THE weighted histogram analysis method for free-energy calculations on biomolecules. I. The method. J. Comp. Chem. 1992, 13, 1011-1021. 108 (113) Stewart, J. Optimization of parameters for semiempirical methods V: Modification of NDDO approximations and application to 70 elements. Journal of Molecular Modeling 2007, 13, 1173-1213. (114) Warshel, A. C., Z. T.; Villa, J.; Strajbl, M.; Schutz, C. N.; Shurki, A.; Vicatos, S.; Chakrabarty, S.; Plotnikov, N. V.; Schopf, P. MOLARIS-XG, 9.11; University of Southern California: Los Angeles, CA 2012. (115) Vetter, I. R.; Wittinghofer, A. Nucleoside triphosphate-binding proteins: different scaffolds to achieve phosphoryl transfer. Q. Rev. Biophys. 1999, 32, 1-56. (116) Cleland, W. W.; Hengge, A. C. Enzymatic Mechanisms of Phosphate and Sulfate Transfer. Chem. Rev. 2006, 106, 3252-3278. (117) Benkovic, S. J.; Schray, K. J.; Academic Press: New York, 1973, p 201-238. (118) Cox, J. R., Jr.; Ramsay, O. B. Mechanisms of Nucleophilic Substitution in Phosphate Esters. Chem. Rev. 1964, 64, 317-352. (119) Kirby, J. A.; Warren, S. G. The Organic Chemistry of Phosphorus; Elsevier: Amsterdam, 1967. (120) Mildvan, A. S. The Role of Metals in Enzyme-Catalyzed Substitutions at Each of the Phosphorus Atoms of ATP. Adv. Enzymol. Relat. Areas Mol. Biol. 1979, 49, 103-126. (121) Westheimer, F. H. Monomeric Metaphosphates. Chem. Rev. 1981, 81, 313-326. (122) Kamerlin, S. C.; Haranczyk, M.; Warshel, A. Progress in ab initio QM/MM Free- Energy Simulations of Electrostatic Energies in Proteins: Accelerated QM/MM Studies of pKa, Redox Reactions and Solvation Free Energies. J Phys Chem B 2009, 113, 1253-1272. (123) Senn, H. M.; Thiel, W. QM/MM Methods for Biomolecular Systems. Angew Chem Int Ed Engl 2009, 48, 1198-1229. (124) Kamerlin, S. C. L.; Vicatos, S.; Dryga, A.; Warshel, A. Coarse-Grained (Multiscale) Simulations in Studies of Biophysical and Chemical Systems. Annual Review of Physical Chemistry 2011, 62, 41-64. (125) Senn, H. M.; Thiel, W. QM/MM methods for biomolecular systems. Angew. Chem. Int. Ed. 2009, 48, 1198-1229. (126) Kamerlin, S. C. L.; Vicatos, S.; Dryga, A.; Warshel, A. Coarse-Grained (Multiscale) Simulations in Studies of Biophysical and Chemical Systems. Annu. Rev. Phys. Chem. 2011, 62, 41-64. (127) Kamerlin, S. C. L.; Sharma, P. K.; Prasad, B. R.; Warshel, A. Why Nature Really Chose Phosphate. Q Rev Biophys 2013, 15, 1-132. (128) Kamerlin, S. C. L.; Sharma, P. K.; Prasad, B. R.; Warshel, A. Why nature really chose Phosphate. Q. Rev. Biophys. 2013, 15, 1-132. (129) Florián, J.; Warshel, A. Phosphate Ester Hydrolysis in Aqueous Solution: Associative Versus Dissociative Mechanisms. J Phys Chem B 1998, 102, 719-734. (130) Klahn, M.; Rosta, E.; Warshel, A. On the mechanism of hydrolysis of phosphate monoester dianions in solution and proteins. J. Am. Chem. Soc. 2006, 128, 15310-15323. (131) Hu, C.-H.; Brinck, T. Theoretical Studies of the Hydrolysis of the Methyl Phosphate Anion. J. Phys. Chem. A 1999, 103, 5379-5386. (132) Grigorenko, B. L.; Rogov, A. V.; Nemukhin, A. V. Mechanism of Triphosphate Hydrolysis in Aqueous Solution: QM/MM Simulations in Water Clusters. J Phys Chem B 2006, 110, 4407-4412. 109 (133) Grigorenko, B. L.; Nemukhin, A. V.; Shadrina, M. S.; Topol, I. A.; Burt, S. K. Mechanisms of guanosine triphosphate hydrolysis by Ras and Ras-GAP proteins as rationalized by ab initio QM/MM simulations. Proteins: Struct. Funct. Bioinf. 2007, 66, 456-466. (134) Glaves, R.; Mathias, G.; Marx, D. Mechanistic Insights into the Hydrolysis of a Nucleoside Triphosphate Model in Neutral and Acidic Solution. J. Am. Chem. Soc. 2012, 134, 6995-7000. (135) Kamerlin, S. C.; Florian, J.; Warshel, A. Associative versus dissociative mechanisms of phosphate monoester hydrolysis: On the interpretation of activation entropies. Chemphyschem : a European journal of chemical physics and physical chemistry 2008, 9, 1767- 1773. (136) Ram Prasad, B.; Plotnikov, N.; Warshel, A. Addressing open questions about phosphate hydrolysis pathways by careful free energy mapping. J Phys Chem B 2013, 117, 153- 163. (137) Kamerlin, S. C.; Haranczyk, M.; Warshel, A. Are Mixed Explicit/Implicit Solvation Models Reliable for Studying Phosphate Hydrolysis? A Comparative Study of Continuum, Explicit and Mixed Solvation Models. Chemphyschem : a European journal of chemical physics and physical chemistry 2009, 10, 1125-1134. (138) Plotnikov, N. V.; Prasad, B. R.; Chakrabarty, S.; Chu, Z. T.; Warshel, A. Quantifying the Mechanism of Phosphate Monoester Hydrolysis in Aqueous Solution by Evaluating the Relevant ab initio QM/MM Free Energy Surfaces. The Journal of Physical Chemistry B 2013, submitted. (139) Shao, Y.; Molnar, L. F.; Jung, Y.; Kussmann, J.; Ochsenfeld, C.; Brown, S. T.; Gilbert, A. T. B.; Slipchenko, L. V.; Levchenko, S. V.; O'Neill, D. P.et al. Advances in Methods and Algorithms in a Modern Quantum Chemistry Program Package. Physical Chemistry Chemical Physics 2006, 8, 3172-3191. (140) Jang, S.; Voth, G. A. A derivation of centroid molecular dynamics and other approximate time evolution methods for path integral centroid variables. J. Chem. Phys. 1999, 111, 2371-2384. (141) Warshel, A.; King, G. Polarization Constraints in Molecular Dynamics Simulation of Aqueous Solutions: The Surface Constraint All Atom Solvent (SCAAS) Model. Chem. Phys. Lett. 1985, 121, 124-129. (142) Prasad, B. R.; Plotnikov, N. V.; Warshel, A. Addressing Open Questions about Phosphate Hydrolysis Pathways by Careful Free Energy Mapping. The Journal of Physical Chemistry B 2013, 117, 153-163. (143) Stanton, C. L.; Kuo, I. F. W.; Mundy, C. J.; Laino, T.; Houk, K. N. QM/MM Metadynamics Study of the Direct Decarboxylation Mechanism for Orotidine-5‘-monophosphate Decarboxylase Using Two Different QM Regions: Acceleration Too Small To Explain Rate of Enzyme Catalysis. J Phys Chem B 2007, 111, 12573-12581. (144) Warshel, A. Bicycle-Pedal Model for the First Step in the Vision Process. Nature 1976, 260, 679-683. (145) Warshel, A.; Russell, S. T. Calculations of Electrostatic Interactions in Biological Systems and in Solutions. Q. Rev. Biophys. 1984, 17, 283-422. 110 (146) Hwang, J.-K.; Chu, Z. T.; Yadav, A.; Warshel, A. Simulations of Quantum Mechanical Corrections for Rate Constants of Hydride-Transfer Reactions in Enzymes and Solutions. J. Phys. Chem. 1991, 95, 8445-8448. (147) Gillan, M. J. Quantum-Classical Crossover of the Transition Rate in the Damped Double Well. J. Phys. C. Solid State Phys. 1987, 20, 3621-3641. (148) Villa, J.; Warshel, A. Energetics and Dynamics of Enzymatic Reactions. J Phys Chem B 2001, 105, 7887-7907. (149) Stockbridge, R. B.; Wolfenden, R. Enhancement of the Rate of Pyrophosphate Hydrolysis by Nonenzymatic Catalysts and by Inorganic Pyrophosphatase. J. Biol. Chem. 2011, 286, 18538-18546. (150) Piatkevich, K. D.; Malashkevich, V. N.; Almo, S. C.; Verkhusha, V. V. Engineering ESPT Pathways Based on Structural Analysis of LSSmKate Red Fluorescent Proteins with Large Stokes Shift. Journal Of The American Chemical Society 2010, 132, 10762- 10770. (151) Warshel, A.; Lappicirella, A. Calculations of ground- and excited-state potential surfaces for conjugated heteroatomic molecules. Journal Of The American Chemical Society 1981, 103, 4664-4673. (152) Sham, Y. Y.; Chu, Z. T.; Warshel, A. Consistent Calculations of pK a 's of Ionizable Residues in Proteins: Semi-microscopic and Macroscopic Approaches. J Phys Chem B 1997, 101, 4458-4472. (153) Habershon, S.; Manolopoulos, D. E.; Markland, T. E.; Miller III, T. F. Ring- Polymer Molecular Dynamics: Quantum Effects in Chemical Dynamics from Classical Trajectories in an Extended Phase Space. Annual Review of Physical Chemistry 2013, 64, null. (154) Isborn, C. M.; Götz, A. W.; Clark, M. A.; Walker, R. C.; Martínez, T. J. Electronic Absorption Spectra from MM and ab Initio QM/MM Molecular Dynamics: Environmental Effects on the Absorption Spectrum of Photoactive Yellow Protein. Journal of Chemical Theory and Computation 2012, 8, 5092-5106. (155) Prasad, B. R.; Kamerlin, S. C. L.; Plotnikov, N. V.; Warshel, A. Studying Catalysis by QM/MM Approaches should not be a Black Box Process. Journal of Thermodynamics & Catalysis 2012, 3, 2157-7544. 111 7 Appendices 7.1 Appendix 1. Starting and refined EVB parameters in chapter 1. Table S1: Initial (default) EVB parameters. See Figure 4 in the main text for the definition and atom numbering in the different VB states. Partial atomic charges of the reacting part (in fraction of an elementary charge) in the different VB states for calculating electrostatic interactions by 2 4 332 1 ij r ij elec ij qq Ee r . Bond parameters for the covalent bonds of the reacting part in the different VB states. All bonds are Morse bonds, where 0 () 2 (1 ) rr bond ED e . Angle parameters for the bending of the adjacent covalent bonds in the reacting part in the different VB states, where 2 0 () angle EK , and no bond-angle coupling is activated. Cl(1) H(3,4,5) C(2) Cl(6) State I -0.168 0.200 -0.438 -0.994 State II -0.994 0.200 -0.438 -0.168 D (kcal mol -1 ) r 0 (Å) β, (Å -1 ) C-Cl 79 1.82 0.8 C-H 100.4 1.10 2.0 K θ , (kcal rad -2 mol -1 ) Θ (rad) H-C-H, H-C-Cl 50 1.911 Cl-C-Cl 17 3.142 112 VdW parameters for atom pairs in the reactive part, which are bonded in one of the VB states. Here, 12 6 r VDW AB E Ce rr , and A and B are zero. VdW parameters for atom pairs in the reactive part, which are never bonded in any VB state. Here, 12 6 VDW A B E rr , and B is zero. Screening term to account for the inductive interaction between the Cl - ---C atom pair, using the relationship 166 4 E screen r . Gas-phase shifts of 13.96 kcal/mol were applied to both state 1 ( 1 ) and 2 ( 1 ), and the initial value of the mixing term (H 12 ) before parameter refinement was 50 kcal/mol. Table S2: Final (refined) EVB parameters. See Figure 4 in the main text for the definition and atom numbering in the different VB states. Partial atomic charges of the reacting part (in fraction of an elementary charge) in the different VB states. 2 4.5 332 1 ij r ij elec ij qq Ee r C (kcal mol -1 ) β (A -1 ) Cl - ---C 69999 4 Cl - ---H 12000 3.6 A (kcal·mol -1 ·A 12 ) Cl - ---Cl 1000000 (A 8 ) Cl - ---C 0.6 113 Bond parameters for the covalent bonds of the reacting part in the different VB states. All bonds are Morse bonds, where 0 () 2 (1 ) rr bond ED e . Angle parameters for the bending of the adjacent covalent bonds in the reacting part in the different VB states. After parameter refinement, bond-angle coupling was activated between the H-C-Cl angle and the C-Cl bond, through the relationship 00 0 2 0 1, 2, ba rr rr rr ee rr , such that now 2 0 angle ba EK . VdW parameters for atom pairs in the reactive part, where 12 6 VDW A B E rr . After refinement, gas-phase shifts of -1.05 kcal/mol were applied to both state 1 ( 1 ) and 2 ( 1 ). Additionally, the best form for the mixing term (see main text) was found to be 2 12 12 26 exp( ( ) ) HA r r , where r 12 and r 26 are the distances between atoms 1 and 2, and 2 and 6, respectively. Here, A=37 kcal/mol and =1.45. Cl(1) H(3,4,5) C(2) Cl(6) State I -0.180 0.183 -0.382 -0.987 State II -0.987 0.183 -0.382 -0.180 D (kcal mol -1 ) r 0 (Å) β (Å -1 ) C-Cl 80 1.806 1.650 C-H 100.4 1.093 1.740 K θ (kcal rad -2 mol -1 ) Θ (rad) H-C-Cl 25.4 1.907 H-C-H 35.5 1.914 Cl-CCl 17 3.142 114 7.2 Appendix 2. Simulation protocols used in chapter 2. I. Implementation of solute polarization by MM atoms in MOPAC2009 interfaced with molaris9.11 In the present work all the QM/MM simulations were performed using a development version of MOLARIS-XG simulation package (MM program) and a development version of MOPAC2009 46 (QM program) which we modified to enable the QM/MM coupling described below. In this QM/MM approach, the electrostatic coupling is implemented according to earlier works 17,56 . The implementation involves evaluation of the electrostatic potential (ESP), vqc, at centers of the QM atoms by the MM program: MM atoms () 332 j ij q vqc i r (S1) Where i is the QM atom number, j is the MM atom number, r ij is the distance between the atoms i and j, [vqc]=kcal/(mol·e) where e stands for the elementary charge. The QM program reads the supplied ESP from the file mol.in created by the MM program: first_line 6 0 # of qmmm atoms, # of link atoms in Region I CL -1.591010336 -3.497323620 -4.177329152 119.381953977 C 0.623273531 -3.927769978 -4.243650888 88.802327810 H 0.627631085 -3.831528682 -5.334074435 77.449540155 H 0.737788528 -3.010768158 -3.634868517 83.899739734 H 0.444587282 -4.863821218 -3.677635261 90.477795343 CL 2.837655032 -4.254371189 -4.197078072 120.024810232 The reading is activated by (DEBUG + MOL_QMMM(new) ) keywords, e.g.: PM6 1SCF CHARGE=-1 SINGLET GRAD LET XYZ DEBUG MOL_QMMM snapshot of MD step 0 CL -1.5910103360 1 -3.4973236200 1 -4.1773291520 1 C 0.6232735310 1 -3.9277699780 1 -4.2436508880 1 H 0.6276310850 1 -3.8315286820 1 -5.3340744350 1 H 0.7377885280 1 -3.0107681580 1 -3.6348685170 1 H 0.4445872820 1 -4.8638212180 1 -3.6776352610 1 CL 2.8376550320 1 -4.2543711890 1 -4.1970780720 1 A (kcal mol -1 Å 12 ) B (kcal mol -1 Å 6 ) Cl - ---C 119246 235 Cl - ---H 75848 120 Cl - ---Cl 126247162 8306 115 These new keywords make the QM program open the file mol.in. The first line is skipped; on the second line the first two numbers (QM atoms + link atoms) are read to determine the total number of atoms in the QM input. vqc(i) on the link atoms can be approximated by the ESP on the host QM atom. We add to the one-electron matrix elements the energy of interaction between the electron and MM atoms, -vqc(i): E.g. to convert kcal/mol into eV: QM/MM QM () 23.0606 vqc i hih i (S2) We have made these few changes to the QM program. The MM-program now reads energy (heat of formation), gradients and the solute charges (note that now the solute wavefunction is polarized and the calculated heat of formation and the solute charges correspond to the polarized solute). Thus, the QM/MM coupling electrostatic energy is included in the calculated heat of formation. However, the electrostatic QM/MM coupling in the energy gradients (both for QM and for MM atoms) is still needed to be calculated by the MM program using the read solute charges. Note that the MOPAC2009 program is developed and distributed by Stewart Computational Chemistry, and authors of this article have worked with the MOPAC2009 code under the non-distribution code-donor agreement. The MOLARIS-XG program is developed in Arieh Warshel’s research group (laetro.usc.edu). II. Computational details Simulation 1. FEP from PM3/MM to PM6/MM for the S N 2 reaction in water between methyl chloride and chloride The FEP was performed between the 2 QM/MM potentials, E REF (PM3/MM) and E TARG (PM6/MM) at the TS (transition state). Note that the solute charges in these simulations are derived according to the scheme of Mulliken. The system was solvated by an 18 A sphere of water 99 and pre-equilibrated on the potential E REF . A harmonic constraint with K=100 kcal/(mol·A 2 ) was used to contain sampling within the TS ( ξ= d(C-Cl 1 ) - d(C-Cl 2 ); RC=0 here). We started from 41 independent MD trajectories, which were propagated for 100,000 MD steps at 300 K in NVT ensemble (MD step size of 0.5 fs) on one of the mapping potentials given by: PM3/MM PM6/MM 1 mm m CONS EE E E (S3) where QM/MM QM/MM polar polar VdW EH E (S4) and with the weight of the potential E TARG being changed in the equal increments of 0.025 from 0 to 1. Thus, the potentials in the adjacent windows are close which in turn guarantees good overlap between them and good FEP convergence. The whole procedure was repeated for the simulation at the RS (reactant state) with the only difference that the sampling was contained at RC=-1.75. Note, however, that use of constraints on the RC here is solely for demonstration and testing purposes, since the RS is effectively sampled on the unbiased potential. Next, we process the MD trajectories to calculate the FEP. Each 10 MD steps we collected the following information: E REF , E TAR , the value of the constraining potential, E CONS , and the two distances, d(C-Cl 1 ) and d(C-Cl 2 ) for the QM subsystem. Next, the second half of each MD trajectory (5000 points per simulation window) for the RS was used in the analysis described below. The same procedure was repeated for the simulation at the TS. Two input files obtained in this way were processed by several computational techniques for evaluation of FEP as well as by the generalized FEP/US to get the free energy functions. Part I: FEP estimates by LRA, FEP and TDI LRA: 1 2 REF TARG LRA TARG REF TARG REF EE GE E E E (S5) The 1-st and the 2-nd terms (averages of the energy gap between the reference potential (RP) and the target potential (TP), taken correspondingly on the MD trajectories propagated on the RP and on the TP) of Eq. S5. average FEP: 1 11 12 1 ln exp ln exp 2 m m nn mm m m FEP mm EE EE E E GkT kT kT kT (S6) 116 of the forward FEP and the backward FEP which are the 1-st and the 2-nd terms of Eq. S6 n-steps LRA where each step of the full FEP is approximated by LRA: 1 1 11 1 1 2 mm n n LRA m m m m EE m GEE EE (S7) TDI (Thermodynamic Integration) with the trapezoid rule: 1 1 2 0 1 2 mm n TDI TARG REF TARG REF EE m G G d EE EE (S8) The aforementioned techniques of Part I obtained from the trajectories generated in Simulation 1 are plotted in Figure 2 of the main text (A) and (B). Next part of the analysis included: Part II. FEP/US, constructing the free energy functions The free energy profiles along ξ=d(C-Cl 1 )-d(C-Cl 2 ) were constructed by: i i ln exp m m m E EE gGkT x kT (S9) Weight-averaged Δg REF profile and the weight-averaged Δg TARG profile over all points were obtained by averaging Eq. S9 over all mapping potentials (simulation frames) with the weighting coefficient: j ij frames j frames N gg N (S10) where N j ( ξ) is the number of times MD visited a particular RC value, ξ, while propagating on the j-th mapping potential, and Δg j is the free energy function estimate by Eq. S9 on the j-th mapping potential. Part III. Analysis of MD trajectories and of sampling efficiency In addition to the above analysis , we created the following histograms: The distribution of the energy gap between the RP and the TP was obtained by sorting the values from all MD trajectories. This analysis gives the most probable energy gap value, which is a rough estimate of the free energy difference for moving between the two potentials. The distribution of the difference d(C-Cl 1 )-d(C-Cl 2 ) was also calculated. It shows how often the certain values of the RC were visited during the simulation. The corresponding results of Part III for Simulation 1 are plotted in Figure 3 of the main text. Simulation 1 was also repeated with a larger value of the force constant, K=1000, used to contain the sampling within a certain region of the RC. Simulation 2: PMF’s for PM3/MM, PM6/MM potentials In this series of simulations we used harmonic constraints on the RC to improve the MD sampling. Afterwards we combined the simulation windows and removed the bias potential using the WHAM 91 program. We started with the PM3 47 /MM and PM6 113 /MM potentials (Eq. S4) with the QM/MM coupling scheme of Eq. (S2). The range of the Reaction Coordinate (RC) 2.5,2.5 defined as: 12 ()( ) dC Cl dC Cl (S11) was divided by the equal increment of Δξ=0.05 A into 101 simulation frames. To contain the sampling within a given simulation window (at the corresponding RC) we used harmonic constraints with the force constant of 100 kcal/(mol·A 2 ): 2 0 CONS EK (S12) 117 Starting from the optimized RS gas phase geometry (dipole complex of CH 3 Cl and Cl - ) we changed the distances d(C-Cl 1 ) and d(C-Cl 2 ) thus moving the system to an approximate RC value. Next, the obtained geometries were equilibrated in the gas phase using the constraint of Eq. S12 for 4000 MD steps with 0.2 fs step size at 273 K. These equilibrated geometries were solvated with 18 A spheres of water; the whole system was equilibrated on the actual QM/MM potential with the corresponding constraint for 20000 MD steps (0.2 fs each) at 273 K. The list of non-bonded pair interactions was updated every 5 steps. Next the temperature was increased to 300 K, and the step size was increased to 0.5 fs. The MD trajectories were propagated on the given surface for 50000 steps (with frequency of updating the non-bonded pair interactions equal to 10 steps). After 10000 MD steps we started collecting data for PMF calculations each 10 steps, which were performed using the WHAM 91 program. The PMF’s constructed for the PM3/MM potentials (with the charge derivation schemes by Mulliken and by the ESP-fitting) and for PM6/MM potential (with the Mulliken charge derivation) are given in Figure 5 of the main text. Simulation 3: FEP from EVB to PM3/MM In this simulation we used the EVB potential as a RP and PM3/MM (Mulliken charges) as a TP. The parameters for the EVB potential are the refined parameters which can be found elsewhere 64 . In this work we just used a slightly different EVB off-diagonal element: 2 12 38exp 1.05 H (S13) also, the bond-angle coupling was on in all calculations and the C…Cl- interactions were additionally described by the harmonic constraint in both EVB states (this energy was included in the EVB diabatic states in kcal/mol): 2 ... 5 3 Er C Cl r (S14) The EVB RP is given by: 22 11 2 2 12 12 2 EVB E cE cE cc H (S15) while the PM3/MM TP is given by Eq. S4. We used a harmonic constraint at the RC=-1.5 (which corresponds to the RS) to check the effect of the constrained on the FEP at a particular RC with the LRA treatment. Other simulation details were similar to the ones described in Simulation 1. At the EVB TS, we found it more efficient to constraint the eigenvectors component to be equal (which means mixing 2 EVB diabatic states equally, note that for the EVB TS E 1 =E 2 and thus c 1 =c 2 ), i.e.: 12 12 11 22 EVB E EE H (S16) That is, we carried out the perturbation from the potential given by Eq. S16 to the PM3/MM potential, with sampling still being contained within the TS by the harmonic constraint at RC=0 as in Simulation 1. The data points were collected every 10 MD steps after discarding the first 50000 steps for equilibration. While for the trajectory propagated at the RS we used exactly the same formalism as described in Simulation 1 (created the input file 41x5000 points, and performed Part I-Part III analysis of the trajectories), for the TS we had to modify the procedure of constructing the free energy functions. Namely, we calculated the actual EVB adiabatic energy (since, while propagating the MD trajectory on the potential of Eq. S16, E 1 can be different from E 2 even when the eigenvector components are forced to be equal): 2 2 12 1 2 12 1 4 2 EVB EEE EE H (S17) Subsequently, we constructed the free energy functions by Eq. S9 and S10. Finally, to construct a PMF for the EVB potential (the PMF for the PM3/MM potential was constructed in Simulation 2) we performed an EVB FEP/US calculation. That is we used the EVB mapping potentials to sample the EVB potential along the reaction coordinate defined as the energy gap between the EVB diabatic states, E 1 and E 2 , also frequently called the product state and the reactant state: 12 1 m E EE (S18) In total we generated 41 simulation frames, and ran them sequentially, for 50000 MD steps each (0.5 fs step size, 300K, non-bonded pair list updated every 5 steps, 18 A simulation sphere of water). Before the first EVB frame we let the system equilibrate for 10000 steps. For each frame after the first 10000 steps we collected data (each 10 steps) to construct the free energy function using the FEP/US approach (4000 per frame). Then the data points were processed using Eq. S9 what gave the results reported in Figure 11 of the main text. 118 Fitting of Γ-functions to the Potential Energy Scans First, we performed the potential energy scan along the RC using the QM program for the methyl chloride and chloride system. The scans were carried out at PM3, PM6 level of theory in the gas phase and in the COSMO implicit solvent model. Second, we used the approach described in Simulation 2, which involved propagating 101 MD trajectories on the EVB potential in the gas phase with additional harmonic constraints with the force constants, K, of 70 kcal/(mol·A 2 ) imposed on the equally spaced values of the RC in the range 2.5,2.5 . Then we constructed the PMF’s for the EVB potentials (using the WHAM approach) with the constraints of the Eq. S14 acting in the full range of the RC and in the range where the constraints were imposed only within the RC region of 1.0,1.0 . Finally, the obtained PES’s and the PMF were fitted by the sums of Gaussians by searching the minimum of the least square functions. We found that it was convenient to perform the fitting with a mathematical software package Maple9.5, which gives the analytical derivatives of Gaussians (which were subsequently used in the optimal steepest descent minimization approach to determine the parameters of the Gaussian functions), another advantage being a convenient graphical user interface. See the script example given below. The data points of the PES’s and of the PMF used for fitting, as well as the derived Γ-functions (the sum of the Gaussian functions approximating the corresponding energy functions obtained in the fitting procedure) are given in the main text with the parameters of Gaussians being reported as well. Simulation 4: Gas-phase EVB PMF for the original EVB potential and for the refined EVB potential In this simulation, we additionally calculated the PMF for the refined EVB potential in the gas phase (the PMF for the original EVB potential and the computational details are described in Fitting of Γ-functions to the Potential Energy Scans): PM6 EVB EVB EVB EE (S19) The comparison of the free energy functions for the original EVB potential (S15) and for the refined EVB potential (S19) is given in Figure 2-12. Simulation 5: AC solvation free energies The PM3 optimized gas-phase geometries were first evaluated for both the RS and for the TS. Next we calculated the work of charging the solute by Eq. S6 (to get the solvation free energies for the TS and for the RS reported in Table 2-2 of the main text) from Q 0 =0 to Q(RS) and Q(TS) by performing the FEP with 26 mapping potentials of Eq. S18, where 1 VdW EE (S20) 2 332 ij VdW ij Qq EE r (S21) That is, we propagated 26 MD trajectories on the RS (and for the TS) with the frozen solute coordinates. After the initial equilibration of the solute with zero charges in an 18 A sphere of water for 20000 MD steps (1 fs, 300 K), each trajectory was sequentially propagated for 20000 MD steps (1 fs, 300 K). Q 0 Q(RS) Q(TS) Cl(L) 0.0 -0.1706 -0.7441 C 0.0 -0.4322 0.1407 H 0.0 0.1974 0.1160 H 0.0 0.1969 0.1158 H 0.0 0.1972 0.1157 Cl(A) 0.0 -0.9888 -0.7441 Also for the same geometries we calculated the solvation free energy by the COSMO approach as implemented in the QM program. The obtained results are reported in Table 2-2 of the main text. Simulation 6: The PMF for COSMO and its improvement by the with Γ-COSMO 119 We started by propagating 101 independent MD mapping trajectories for the MeCl +Cl- system in the gas phase The mapping potentials defined with the equally spaced coupling parameter 0.0,1.0 1 mmRSmPS EE E (S22) where 2 RS QM QM RS EE K (S23) 2 PS QM QM PS EE K QM=PM3+COSMO; K=5 kcal/(mol·A 2 ); 2.5 RS PS A RC is defined by Eq. S11 Each trajectory was initially equilibrated during 20000 MD steps (0.3 fs step size, 273 K) and then was propagated for 100000 steps (1 fs step size, 300K), during the final 75000 steps the data points were collected ,each 30 steps, for the subsequent free energy calculations. Additionally, we used a harmonic constraint on Cl 1 -C-Cl 2 angle (at π rad with a force constant, K=15 kcal/(mol·rad 2 )). All the computer experiments were performed in the gas phase, while the solvent effect was taken into account by the implicit COSMO model as implemented in MOPAC2009. In the free energy calculations we used Eq. S9 where E m is given by Eq. S22 and i=PM3+COSMO (PM3 potential with the COSMO solvent, ε=78.4). To get the PM3/MM free energy function for the RC we used E QM =E PM3/MM given by Eq. S4. The simulation protocol was following. First, by changing d(C-Cl 1 ) and d(C-Cl 2 ) we moved the system near the region of interest of the RC. Subsequently, we relaxed the system on each potential specified by Eq. S22 in the gas phase with PM3+COSMO for 25000 MD steps (0.3 fs stepsize, 273 K, K=10 in Eq. S23). At the next step we solvated the relaxed geometry by an 18 A sphere of water molecules and relaxed on the PM3/MM potential (with the ESP-derived solute charges) for 25000 MD steps (at 273 K, with a step size 0.3 fs and the frequency of updating the non-bonded pair list of 5, K=10). Finally, we increased the temperature to 300K and ran a 100000 steps MD trajectory with a step size of 1 fs (10 nonbond update frequency, K=10) on the same potential. The data points for the free energy calculations were collected each 30 steps for the last 2/3 of the trajectory. In the free energy calculations we used Eq. S9 where E m is given by Eq. S22 and i=PM3/MM with the ESP-derived solute charges. The calculated free energy profiles are given in Figure 2-13 of the main text. Additionally, we explored the sampling efficiency for the PM3/MM and PM3+COSMO run described here by the approach described in Simulation 1 That is, we constructed the distribution of the sampled RC values for the point used in the free energy analysis. See Figure 2-14 of the main text. Simulation 7: The free energy functions obtained using the EVB-type solvent driving potentials. Refining a MO-based QM/MM reference potential The approach of Eq. S22 and S23 was extended by including an additional term in the mapping potential: () sS 12 6 11 k QM MM i j ij ij k ij ij ij ij Qq A B E rr r Q (S24) This was done by using : PM3,gas PM3,gas 1 map CONS RS sS RS CONS PS sS PS EE E E E E QQ where Q RS and Q PS are the vectors of the QM charges at the RS and at the PS (ESP-charges for the PM3 gas-phase optimized RS): 120 Cl L C H H H Cl A PM3 RS Q -0.180 -0.382 0.183 0.183 0.183 -0.987 PM3 PS Q -0.987 -0.382 0.183 0.183 0.183 -0.180 First, by changing d(C-Cl 1 ) and d(C-Cl 2 ) we moved the system to an approximate value of the RC. Then we relaxed the system on each potential derived according to Eq. S22 in the gas phase with PM3 gas phase potential for 2000 MD steps (0.2 fs stepsize, 273 K, K=10 in Eq. S23). Second, we solvated the relaxed geometry by an 18 A sphere of water molecules and relaxed on the potential given by Eq. S25 for 10000 MD steps (at 273 K, with a step size 0.2 fs and the frequency of updating the non-bonded pair list of 5, K=10). Finally, we increased the temperature to 300K and ran a 50000 steps MD trajectory with a step size of 1 fs (10 nonbond update frequency, K=10) on the same potential. The data points for the free energy calculations were collected each 10 steps for the final 3/5 of the trajectory. In the free energy calculations we used Eq. S9 where E m is given by Eq. S25 and the PM3/MM potential is approximated by the classically coupled with MM, according to Eq. S24, gas phase PM3 (solute charges are derived by Mulliken) PM3,gas sS PM3,gas i EE E Q (Error! Reference source not found.) The results of the free energy calculation are presented in Figure 15 (A) where the comparison with the PM3/MM potential obtained in Simulation 2 is made. In the next step we repeated the free energy calculations (with the same data generated on potential of Eq. S25) using for the E i of Eq. S9 the following expression : PM3,gas sS PM3,gas PM3,gas PM6,gas i EE E Q (Error! Reference source not found.) The results of this free energy calculation are presented in Figure 2-15 (B) where the comparison with the PM6/MM potential obtained in Simulation 2 is also made. Simulation 8: Refinement of the EVB reference potential Finally, we refined the EVB potential (used in Simulation 3) in the following way: 22 , 1 sS PM3+COSMO 2 sS PM3+COSMO EVB,gas PM3,gas EVB EVB gas E E cE RS cE PS QQ (S29) 22 , 1 sS PM6+COSMO 2 sS PM6+COSMO EVB,gas PM6,gas EVB EVB gas E E cE RS cE PS QQ (S30) where E EVB,gas describes the intra-interactions within the solute of the EVB potential. Mulliken charges for the PM3+COSMO optimized RS: Cl L C H H H Cl A PM3+COSMO RS Q -0.137 -0.067 0.066 0.066 0.066 -0.994 PM3+COSMO PS Q -0.994 -0.067 0.066 0.066 0.066 -0.137 121 Mulliken charges for the PM6+COSMO optimized RS: Cl L C H H H Cl A PM6+COSMO RS Q -0.226 -0.319 0.181 0.181 0.181 -0.998 PM6+COSMO PS Q -0.998 -0.319 0.181 0.181 0.181 -0.226 The EVB FEP/US calculations were performed using the mapping potentials: 12 EVB,gas PM3,gas 1 m EEE (S31) and 12 EVB,gas PM6,gas 1 m EEE (S32) Each simulation involved 41 frames, run sequentially, for 50000 MD steps each (0.5 fs step size, 300K, non-bonded pair list updated every 5 steps, 18 A simulation sphere of water). Before the first EVB frame we equilibrated the system for 10000 steps. For each frame we collected data (each 10 steps ,after the first 10000 steps ) to construct the free energy function using FEP/US approach (4000 points per frame). The data points were processed using Eq. S9 for E i and E m given by Eq. S29 and S31, and for E i and E m given by Eq. S30 and S32. The obtained free energy functions are reported in Figure 2-16 of the main text, where the comparison with the original EVB potential is also given. Additional notes Typically there are about 800 solvent molecules within the distance of 18.0 A from the solute Parameters for water: water !name of the first solvent 0.9897 !density (gram/cm**3) 0.99 0.99 !bond lenth for bond 1, bond 2 375.0 375.0 !force constant for bond 1, bond 2 1.911 !angle between bond 1 and bond 2 60.0 !force constant for angle 00861.440 026.040 016.000 !VDW a,b and mass for OW 00000.420 000.575 001.000 !VDW a,b and mass for W1 00000.420 000.575 001.000 !VDW a,b and mass for W2 -0.82 0.41 0.41 !charges for the three united atoms Maple9.5 script to fit the PES with 3 Gaussian functions #input file format: RC Energy f := fopen( "PM3_PES.csv", READ ): t1:=readdata(f,[float,float]): fclose(f): #INPUT PARAMETERS npoints:=130; #number of point in PES A0:=-3.8: print("A=",A0); #guess value for the height of Gaussians 122 B0:=5.8: print("B=",B0); C0:=0: R1:=-1.0: #values for the centers of Gaussians R2:=0: R3:=1.0: niter:=20: #THE END temp:=C0: C0:=temp+A0: fvalue:=array(1..npoints+1): counter:=array(1..npoints+1): Eref:=array(1..npoints): Efit:=array(1..npoints): xi:=array(1..npoints): Eref:=array(1..npoints): print("PASS1"); for j from 1 to npoints by 1 do xi[j]:=t1[j,1]; Eref[j]:=t1[j,2]-t1[1,2]; end do: a0:=0.5: b0:=1.0: g0:=0.5: print("PASS2"); Ef := proc( x, alpha, beta, gamma , A1, B1, C1) description "evaluate _PDMTD"; A1*exp(-alpha*(x-R1)^2)+B1*exp(-beta*(x-R2)^2)+C1*exp(-gamma*(x-R3)^2) end proc: print("TROUBLE? npoints=",npoints,"i2=",i2); F:=sum((Eref[i2]-Ef(xi[i2],a,b,g,A,B,C))^2,i2=1..npoints); subs(a=a0, b=b0, g=g0, A=A0, B=B0, C=C0, F): evalf(%); dFa:=diff(F,a): dFb:=diff(F,b): dFg:=diff(F,g): dFA:=diff(F,A): dFB:=diff(F,B): dFC:=diff(F,C): 123 print("PASS3"); counter[1]:=1: fvalue[1]:=subs(a=a0, b=b0, g=g0, F): for iter from 1 to niter do F_old:=subs(A=A0, B=B0, C=C0, a=a0, b=b0, g=g0, F): Gr_a:=subs(A=A0, B=B0, C=C0, a=a0, b=b0, g=g0, dFa): Gr_b:=subs(A=A0, B=B0, C=C0, a=a0, b=b0, g=g0, dFb): Gr_g:=subs(A=A0, B=B0, C=C0, a=a0, b=b0, g=g0, dFg): Gr_A:=subs(A=A0, B=B0, C=C0, a=a0, b=b0, g=g0, dFA): Gr_B:=subs(A=A0, B=B0, C=C0, a=a0, b=b0, g=g0, dFB): Gr_C:=subs(A=A0, B=B0, C=C0, a=a0, b=b0, g=g0, dFC): temp_a:=a0: temp_b:=b0; temp_g:=g0: temp_A:=A0: temp_B:=B0: temp_C:=C0: ss:=1.0e-7: # F_new:=F_old+10: for k from 1 to 7 while (k < 2 or F_new < F_old) do if (k > 1) then F_old:=subs(A=A0, B=B0, C=C0, a=a0, b=b0, g=g0, F): temp_a:=a0: temp_b:=b0; temp_g:=g0: temp_A:=A0: temp_B:=B0: temp_C:=C0: end if: temp_ss:=ss: ss:=temp_ss*10: a0:=temp_a - ss* Gr_a: b0:=temp_b - ss* Gr_b: g0:=temp_g - ss* Gr_g: A0:=temp_A - ss* Gr_A: B0:=temp_B - ss* Gr_B: C0:=temp_C - ss* Gr_C: F_new:=subs(A=A0, B=B0, C=C0, a=a0, b=b0, g=g0, F): print("k",k,"F_old",F_old,"Fnew",F_new); end do: a0:=temp_a: b0:=temp_b; g0:=temp_g: A0:=temp_A: B0:=temp_B: C0:=temp_C: print("iter",iter,"ss",ss,"F",F_old,"alpha",a0,"beta",b0,"gamma",g0,"A",A0,"B",B0,"C",C0); counter[iter+1]:=iter: fvalue[iter+1]:=F_old: end do: print("PASS4"); xylist := zip ( (x,y) -> [x,y], counter, fvalue): plot(xylist,style=point,symbol=cross,color=black,title="STEEPEST DESCENT MINIMIZATION CURVE"); 124 print("PASS5"); for i2 from 1 to npoints do Efit[i2]:=Ef(xi[i2],a0,b0,g0,A0,B0, C0): end do: xylist2 := zip ( (x,y) -> [x,y], xi, Efit); xylist3 := zip ( (x,y) -> [x,y], xi, Eref); G0:=plot(xylist2,style=point,symbol=cross,color=green,title="PDMTD"): G1:=plot(xylist3,style=point,symbol=cross,color=red,style=point):
Abstract (if available)
Abstract
Reliable computational modelling of chemical processes in condensed phases such as computation of the activation free energies requires both extensive configurational sampling and an appropriate level of theoretical treatment. The former is necessary for capturing solute-solvent fluctuations (which is extremely problematic in the energy-minimization approach) and for obtaining convergent free energies, while the latter is required for reliable description of redistribution of electron density during chemical processes and the corresponding energetics. These two factors make computational cost of ab initio QM/MM (QM(ai)/MM) simulations extremely high or even prohibitively expensive for the real system of interest such as chemical reactions in aqueous solution and enzymatic reactions. A powerful general approach, which circumvents these problems, is the reference potential based Paradynamics approach. In this approach the extensive configurational sampling is performed at a lower level of theory using a cheaper (computationally) reference potential rather than directly on the expensive target QM(ai)/MM potential. This is followed by calculating the free energy perturbation of moving from the reference potential to the target potential. ❧ In this work, a number of advances is presented which are made to the QM(ai)/MM free energy computation techniques in general and to the reference potential strategy in particular. First, a formulation of the reference potential based Paradynamics model is given. Chapter 1 addresses an important issue of improving the convergence of the linear response approximation (LRA), which is used to calculate the free energy perturbation from the EVB reference potential to the QM target potential. The improvement in the LRA convergence is achieved by iteratively reducing the difference between the two potentials by fitting EVB parameters to the energy and its derivatives of the target potential. A thorough comparative analysis of the computational cost of the Paradynamics approach is given relative to other methods based on the direct sampling of the target potential. In Chapter 2, an extensive study exploring, refining, and quantifying the Paradynamics model is given. First, a different refinement strategy for a general reference potential is formulated, where the difference between the reference potential and the target QM/MM surface is reduced using the correction potentials comprised of Gaussian functions along the reaction coordinates. Additionally we propose a way of accelerating the potential of mean force calculations by using a combination of the same correction potentials along the reaction coordinates and a solvent polarizing potential. Secondly, it is rigorously demonstrated how to gradually and in a controlled way improve calculations of the free energy perturbations associated with moving from the reference potential to the target QM/MM surface using the multi-step free energy perturbation approach. Furthermore, the LRA treatment is validate by comparing it to the full FEP approach. Finally, results of the PD model prediction for the activation barrier on the target potential are compared to results of PMF calculations. ❧ In chapter 3 we demonstrate practical applications of the QM(ai)/MM free energy calculation approach in general and the Paradynamics model in particular by addressing challenging questions in chemistry and related fields. Advances in developing computational tools for the free energy calculations required in the Paradynamics thermodynamic cycle are reported with illustrative examples. First, the Paradynamics model is applied to an enzymatic reaction, and its predictions are confirmed by 1D and 2D PMF calculations. Second, we calculate the catalytic effect of an enzyme. Last, we quantify another reference potential approach where the target free energy surface is estimated entirely by sampling the reference potential, and relating the corresponding probabilities. In chapter 4, a thorough computational study is carried out to quantify the mechanism of hydrolysis of a phosphate monoester, in particular the mechanism of proton transfer step. This is achieved by calculating the relevant QM(ai)/MM free energy surfaces for the possible mechanisms. This chapter also discusses application of the reference potential approach in estimating the nuclear quantum mechanical effect terms of the QCP approach. Finally, in the concluding Chapter 5 another application of the reference potential approach towards quantifying the photochemical cycle (which involves calculation of the excited state proton transfer) by calculating properties (the emission and absorption spectra) of the red fluorescent protein is considered. It is demonstrated how one can overcome the problem of prohibitively expensive QM(ai)/MM sampling for a system in the excited state by using the QCFF-PI reference potential.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
Advancing computational methods for free energy calculations in proteins and the applications for studies of catalysis in nucleotide-associated proteins
PDF
Two state free energy barrier evaluation technique for dehalogenation reaction
PDF
Development of robust ab initio methods for description of excited states and autoionizing resonances
PDF
Asymmetric response and the reorientation of interfacial water with respect to applied electric fields
PDF
Understanding electrostatic effects in the function of biological systems
PDF
The effect of microhydration on ionization energy and proton transfer in nucleobases: analysis and method development
PDF
Electronic structure analysis of challenging open-shell systems: from excited states of copper oxide anions to exchange-coupling in binuclear copper complexes
Asset Metadata
Creator
Plotnikov, Nikolay V.
(author)
Core Title
Advancing ab initio QM/MM free energy calculations: refining, validating and quantifying the reference potential approach
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Chemistry
Publication Date
04/29/2013
Defense Date
03/21/2013
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
catalysis,free energy calculations,molecular dynamics,multi-scale modelling,OAI-PMH Harvest,quantum chemistry,reference potential
Format
application/pdf
(imt)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Warshel, Arieh (
committee chair
), Benderskii, Alexander V. (
committee member
), Haworth, Ian (
committee member
)
Creator Email
nplotnikov@gmail.com,plotniko@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-248667
Unique identifier
UC11294063
Identifier
etd-PlotnikovN-1629.pdf (filename),usctheses-c3-248667 (legacy record id)
Legacy Identifier
etd-PlotnikovN-1629.pdf
Dmrecord
248667
Document Type
Dissertation
Format
application/pdf (imt)
Rights
Plotnikov, Nikolay V.
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Tags
catalysis
free energy calculations
molecular dynamics
multi-scale modelling
quantum chemistry
reference potential