Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
The effect of microhydration on ionization energy and proton transfer in nucleobases: analysis and method development
(USC Thesis Other)
The effect of microhydration on ionization energy and proton transfer in nucleobases: analysis and method development
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
THE EFFECT OF MICROHYDR ATION ON IONIZATION ENERGY AND PROTON TRANSFER IN NUCLEOB ASES. ANAL YSIS AND METHOD DEVELOPMENT. Copyright 2013 by Kirill Khistyaev A Disse rtation Presented to the FACULTY OF THE USC GRAD UATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Require ments for the Degree DOCTOR OF PHILOSOPHY (CHEMISTRY) May 2013 Kirill Khistyaev Acknowledgements I am very grateful to my advisor Prof. Anna Krylov for the chance to work in her wonderful group. She is not only a good mentor but a great manager who carefully choose promi sing students and ra ise them into real scienti sts. She cares a lot about intere sts and talents of her studen ts. In her group you can work on a proj ects that you are really interested in and capable to do them well. That is exactly how it was in my case. I really enjoyed proj ects I was working at in Anna's group. I was able to fully dedicate my time and thoughts to these pro jects because I shared their goals and importa nce. During my work in Anna's group I was given a chance not only to fulfill the require ments for PhD in Chemistry but also to learn materials far beyond this requirements and to grow profe ssi onally. In many respects it was thanks to the fact that Anna always encourag es her students desire to learn the newest meth ods and technol ogies. I strongly believe that my experi ence and knowledge obtained in Anna's group will be invaluable my future career. I want to thank all my colleagues of our wonderful group. The staff of our group changed significa ntly during my 5-year PhD term but all members of our group always 11 were very kind and helpful. They were always open to interesting discussions and reached me a lot. I am very grateful to Dr. Evgeny Epifano vskiy for teaching me a lot about electronic structure meth ods and how they actually work on computers hard ware. Without his implementation of ccman2 and libtensor librari es the GPU-related part of this work would not be possible. His help with discussions and implementation of GPU-enabled ccman2 library cannot be overestimate d. I would like to acknowl edge Dr. Ksenia Bravaya for helping me with microhydrated th ymine and dimethyluracil pro jects. Without her deep knowledge of the quantum chemistry field, both theory and applicati on, it would be much harder to succeed with this pro jects. I am grateful to Vadim Mozhayskiy who taught me the basics of coupled-cluster metho ds and Q-Chem packag e. I would like to thank Dmitry Zuev and Nikolay Plotnikov for useful discus sions on computational chemistry and technical quest ions. I am grateful to Prof. Aiichiro Nakano who taught me the basics of GPU program ming and inspired me to do the GPU part of this work. I gratefully acknowl edge our collaborators Dr. Musahid Ahmed and his group from Lawrence Berkeley National Laboratory for providing us with great experimental data and discussions that was used in this work. I thank my wife Olga for all understanding and support during my work on my thesi s. Without her I would not be able to spend so much time on my research. Finally, I would like to thank my fam ily in Russia for making me who I am and making it all possible, especially my parents, grandparents and my brother. 111 Table of contents Acknowledgements u List of tables v1 List of figures vu Abbreviati ons xu Abstract x1 v Chapter 1: Introduction and overview 1 1.1 Computational study of microhydration effects . 1 1.2 Electronic structure methods . . . . . . . . . . 4 1.2.1 Methods for the ground electro nic state 4 1.2.2 Methods for excited electronic states 9 1.2.3 Density functional theory methods 11 1.3 Chapter 1 refer ences . . . . . . . . . . . . . 14 Chapter 2: The effect of microhydration on ionization energy of thymine 17 2.1 Introduction . . . . . . . . . . . . . . . . . 17 2.2 Experimental and computational techniques . . . . . . . . . . . 21 2.2.1 Electronic structure calculat ions . . . . . . . . . . . . . 21 2.2.2 Calculation of the Franck- Condon factors and PIE curves 27 2.2.3 Experimental details . . . . . . . . . . . . . . . . . . . 29 2.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . 31 2.3.1 Ionizat ion-induced geometry chang es in thy mine hydrates 31 2.3.2 Vertical ionization energies . . . . . . . 32 2.3.3 The ori gin of IE chang es . . . . . . . . 35 2.3.4 Adiabatic ionization energies and FCFs 38 2.3.5 Theory versus experiment 40 2.4 Conclusions . . . . . 45 2.5 Chapter 2 refer ences . . . . . . . . 47 iv Chapter 3: The effect ofmi crohydration on proton transfer in 1,3-dimethyluracil and 1,3-dimethyluracil dimers 52 3.1 Introduction . . . . . . . . . . . . . . . . . . . . 52 3.2 Experimental details . . . . . . . . . . . . . . . . 55 3.2.1 Proton transfer in thymine-water clusters . 62 3.3 Computational details . . . . . . . . . . 63 3.4 Computational results . . . . . . . . . . . . 64 3.4.1 Proton transfer in monohydrate . . . 70 3.4.2 Proton transfer in dihydrated species 75 3.5 Conclusions . . . . . 78 3.6 Chapter 3 refer ences . . . . . . . . . . . . . 80 Chapter 4: The implementation of the coupled-cluster family of meth ods on graphics processing unit 84 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 84 4.2 GPU architecture . . . . . . . . . . . . . . . . . . . . . . 90 4.3 GPU programming languages by the example of CUDA C 94 4.3.1 Thread organization . . . . . . . . . . . . . . . . . 95 4.3.2 Memory organization . . . . . . . . . . . . . . . . 95 4.4 Review of existing quantum chemistry methods accelerated with GPU. . 97 4.5 Chall enges of the implementation of the CCSD method on GPU 101 4.6 Implementation details . . . . . . . . . . . . . . . . . . . 103 4.6.1 Overview of the code structure and librari es design 103 4.6.2 Description of the GPU-enabled code 107 4.6.3 Results and conclusions . 11 0 4. 7 Chapter 4 refer ences . . . . . . . 11 2 Chapter 5: Future work 5.1 Chapter 5 refer ences . 11 5 11 7 Appendix : Tensor algebra library graphical processing unit implementation 118 v List of tables 2.1 Vertical and adiabatic IEs (eV) of thy mine and thy mine-water clusters computed by EOM-IP-CCS D/cc-pVTZ. . . . . . . . . . . . . . . . . . 34 3.1 Vertical and adiabatic ionization ener gies (eV) of mU, (mU) 2 , water, and various hydrated species. All energies are computed by EOM-IP CCSD/6-3 ll +G(d,p) except for (mU) 2 and (mU) 2 ·H 2 0 which are com- puted with 6-31+G( d,p) basis set. . . . . . . . . . . . . . . . . . . . . 71 Vl List of figures 1.1 Action of different flavors of the EOM excitation operator forms differ- ent sets of target states giving rise to a suite of EOM-CC methods. . . . 12 2.1 Structures and binding ener gies (De, kcal/mol) of the considered thymi ne water monohydrate s, dihydrates and trihydra tes, CCSD/cc-pVTZ at RI MP2/cc-p VTZ geome try. . . . . . . . . . . . . . . . . . . . . . . . . . 20 2.2 Equilibrium structures of thymine and three thy mine monohydrates opti mized by RI-MP2/cc-pVTZ. Bondlengths and changes in bond lengths due to interacti ons with water are shown (in A). . . . . . . . . . . . . . 23 2.3 Equilibrium structures of th ymine dihydrate and trihydrate optimized by RI-MP2/cc-pVTZ. Bondlengths and chang es in bond lengths due to microhydration are shown (in A). . . . . . . . . . . . . . . . . . . . . 24 2.4 Equilibrium structures of ionized thymine and th ymine monohydrates optimi zed by ro B97X-D /cc-pVTZ. Bondlengths and changes in bond lengths due to ionization are shown (in A). . . . . . . . . . . . . . . . 25 2.5 Equilibrium structures of ionized th ymine and thymine di- and trihy drates optimized by roB 97X-D/cc-pVTZ. Bondlengths and changes in bond lengths due to ionization are shown (in A). 26 2.6 Coordinates desc ribing relative water-thymine motion. 28 2. 7 Mass spectrum of micro hydrated thy mine recorded at a photon energy of 10 eV. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 2.8 VIEs (eV) and the cor responding MOs of thy mine and thy mine-wa ter isomers, EOM-IP-CCSD /cc-pVTZ. Changes in VIEs due to microhy drati on and the leading EOM-IP amplitude are given in parenthe ses. . . 33 2.9 VIEs versus the degree of charge transfer to water in different ionized states of the three thy mine monohydra tes. . . . . . . . . . . . . . . . . 36 Vll 2. 10 Changes in VIEs versus the degree of charge transfer to water in differ- ent ionized states of Tl. . . . . . . . . . . . . . . . . . . . . . . . . . 36 2. 11 Changes in VIEs versus charge-dipole interaction energy between the charg es on thy mine and the dipole moment of water molecul e. . . . . . 37 2. 12 Calculated FCFs for the first ionized state of T1 (lo wer panel). Upper panel : FCFs due to water-thymine motion (undefined scale); middle panel : FCFs due to th ymine moiety. . . . . . . . . . . . . . . . . . . . 39 2. 13 Calculated FCFs factors for the first ionized state of T2 (lo wer panel). Upper panel : FCFs due to water-th ymine motion (undefined scale); mid- dle panel : FCFs due to th ymine moiety. . . . . . . . . . . . . . . . . . 40 2. 14 The experimental (raw and smoothed data, and error bars) and calculated (T1 and T2) PIE curve s. The respective adiabatic 0+--0 tran sitions are al so shown. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 2. 15 The differentiated PIE (DPIE) curves and calculated VIEs for th ymine and th ymine clusters with one, two, and three water molecu les. . . . . . 42 2. 16 The PIE curves and calculated AlEs for thy mine and thymine clusters with one, two, and three water molecu les. 43 3.1 Schematic of experimental apparatu s. 55 3.2 Mass spectra of hydrated (with H 2 0) mU and its dimer using 12 eV photons with different backing pre ssure and temperature. . . . . . . . . 56 3.3 The percentage of differ ent dimer forms relative to the all forms of mU present in the beam. The percentage is calculated as the ratio between the signal of the indicated form and the signal of all forms of mU present in the beam. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 3.4 Structures of 1 ,3-di methyluracil and its dimer hydrated with one or two water molecu les. In all structu res, water acts as a proton donor. Hydra- tion of the dimer does not lead to conside rable changes in the relative position of the two mU moieti es, e.g., the distances between C=O and C-CH 3 groups in dry and hydrated (mU) 2 clusters are around 3.3-3.5 A. Te mperature increase results in higher concen tration of mU cluste rs, whereas backing pressure controls degree of hydrati on. . . . . . . . . . 58 V111 3.5 Mass spectrum of hydrated mU and the dependence of the yield of various pro tonated species on photon energy and backing pressure. A. Mass spectrum of hydrated (with D20) mU and its dimer using 12 eV photons. The inset shows the region at mass to charge (m/z) 180 corresponding to [mU(D20h]+. The dashed lines indicate two additional peaks at m/z 181 and 182 arising due to natural isotope abundance (13C) and due to protonated and deuterated species. The intensity ratios between the peaks marked by the arrows at different photon energy for mU(D20h are shown in panel B. The constant behavior of the m/z 181 peak confirms that it arises from isotopic contributions and is not due to PT. Panel C shows similar ratios (for N+1 and N+2 m/z peaks) for N=182 corre sponding to [DmU(H20)2]+. In this case, the N+2 peak is constant, revealing that there is no deuteron transfer between the bases. Panel D: The effect of backing pressure (Ar gas) on PT. The black curve (mU-�0 PT) character izes deuteron transfer from D20 to uracil; the red curve [mU-D20 PT (nor malized)]� deuteron transfer from D20 to uracil divided by the sum of mU and (mUh hydrates, [[(mU) n (D20) m D]+ I L L [(mU) n (D20) m H k D tl +. n,m n,m�Ok + l=O,l The blue curve (mU-mU PT) corresponds to PT between the mU molecules, [[mU(D20) n H]+f[[(mUh(D20) m ]+. E: The appearance energies of deuter- n m ated species [mU(D20) n D]+ for different cluster sizes n. . . . . . . . . . . . 60 3.6 Yield of protonated thy mine (TH) and thy mine-water (T(H 2 0)H+) clus- ters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 3. 7 Monohydrated structu res. Distances in angstroms and binding energies in kcal/mol are given for each structure. The lowest-ene rgy iso mer is mU W1-1a . mU W1-1b, mUW1-2a, and mU W1-2 b are 1.59, 1.58 and 3.06 kcal/mol higher, respectively. . . . . . . . . . . . . . . . . . . . . 65 3.8 Dihydrated structu res. Distances in angstroms and binding energies in kcal/mol are given for each structure. The lowest-ene rgy iso mer is mUW2- 1a. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 3.9 MOs cor responding to the lowest ionized state in mU·H 2 0, (mU) 2 , and (mU) 2 · H 2 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 3.10 Photoionization effi ciency curve (black) of [(mU+D 2 0)]+ and its deriva tive (re d), observed using 8 e V to 12 e V photons. The derivative plot reveals multiple ionized states derived by removing the electron from differ ent MOs. Black arrows points towards the calculated ionization ene rgi es. 68 IX 3.11 Structural changes in ionized mU W1-1a. Left: optimi zed neutral state. Top right : Franck- Condon optimi zed structure of the 1st (lA") adiabatic state of the cation. Bottom right : optimization of the 5t h (3A") adiabatic state of the cation gives opti mized proton-transferred structure of the cati on. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 3.12 Potential energy profiles for low-lying states of [mU·H 2 0l+ along the PT reaction coordinate. The proton is moved from the water molecule to the mU oxygen site. The 5th ionized state, 3A", in which the hole is on the water molecule (see Fig. ??), shows no barrier facilitati ng downhill PT. PT from lower ionized states are possible, however this involves chang es in the electronic wave function character and requires more than 10.6 eV photon energy. The left panel shows the experimental ratio between the [mU(D 2 0) 2 ]+ signal (m/z 180) and [mU(D 2 0) 2 Dl+ the deuterated species at m/z 182; it shows dramatic enhancement in PT when the 3A" state is accessed. . . . . . . . . . . . . . . . . . . . . . 73 3.13 Energy diagram describing relevant ionized states and their ordering at differ ent geometries along PT coordinate. All energies are given in e V and are calculated with EOM-IP-CCSD /6 -3 11+ G**. . . . . . . . . . . 74 3.14 Possible proton-transferred structures in [mU(H 2 0) 2 ]+. Left panels show manually distorted structures used as staring points for optimi zation. Ri ght panel s show the final optimi zed structures of the ionized species. Distances are in Angstroms. . . . . . . . . . . . . . 76 4.1 Calculat ions per second per $1 000, loga rithmic plot. 85 4.2 The evolution of computing platfo rm's peak performance and CPU fre quency demonstrates CPU frequency standstill around 2004. . . . . . . 87 4. 3 Theoretical peak performance for the CPU and GPU, floating-point oper- at ions per second. . . . . . . . . . . . . . . . . . . . . 90 4.4 Sketchy representation of CPU and GPU architect ures. 91 4.5 NVIDIA Kepler GK11 0 architecture. 2880 CUDA cores organized in 15 SMXs are shown. . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 4.6 NVIDIA Kepler GKllO streami ng multip rocessor (SMX) architecture. 192 single preci sion CUDA core s, 64 doubleprecision units, 32 special function units (SFU), and 32 load/store units (LD /ST). 93 4. 7 Kepler Memory Hierarchy. . . . . . . . . . . . . . . . 94 X 4.8 Grid structure. Ti rr ead blocks form a grid that cor respond to one kernel . 96 4.9 Speedup for vari ous *GEMM calls as a function of square matri x size (averag ed over 10 run s). MGEMM correspond to the hetero geneous model with different thresho lds. Ti mes are scaled relative to running DGEMM on the CPU. . . . . . . . . . . . . 100 4.10 Overview of CCIEOM-CC code architecture. 4.11 Overview of the libten sor library structure. A multi-layer design allows for various ext ensions in terms of new al gorith ms and data typ es as well as new hardware architect ures. The layers interact through well-defined interf aces; any layer can be substituted by an alternative implementation 104 without the need to modify the code in the layers above or below. 105 4.12 Diagram of clas ses in libtensor. For each type of tensors there are spe cialized tensor operat ions. General block tensors and operati ons on them are generic implemen tati ons. Concrete block tensors use the generic structures and algorit hms within, but provide a simpli fied interface. The template arg ument in gen_block_te nsor al lows to use differ ent types of tensors, e.g., real , complex or CUDA dense tensors, sparse tensors, or tensors with special propert ies. Any such implementation of tensor have to provide tensor operation classes (tod_add, tod_contract2, etc.). . . . . 106 4.13 Memory layers of GPU implementation of VMM. Multiple copies of the tensor can be stored in all three laye rs. VMM keeps track of the latest copy of the tensor and updates older version as needed. . . . 109 A.1 Schematic order of execution of tensor contraction operati on. 11 9 Xl Abbreviations QM QMIMM cc CCSD EOM EOM-IP-CC quantum mechani cs quantum mechanics/molecular mechanics coupled-cluster coupled-cluster with single and double substitutions equation- of- moti on equat ion-of-motion coupled-clusters EOM-IP-CCSD equat ion-of-motion coupled-clusters single and double substitutions TD-CISD HOMO LUMO IE AlE VIE PIE time-dependent configuration interaction with single and double excitati ons highest occupied molecular orbital lowest unoccupied molecular orbital ionization energy adiabatic ionization energy vertical ionization energy photoionization efficiency xu vuv vacuum ultravi olet FCF Franck -Condon factor PT proton transfer NAB nucleic acid base mU 1 ,3- dimethyluracil DmU d6-1 ,3-di methyluracil GPU grap hics pr ocessing unit EFP effective fr agment potential Xlll Abstract Quantum mechani cs can predict basic properti es of molecul es such as relative ener gies, electronic charge dist ributi ons, dipoles, ionization and excitation ener gies. By solving the Schr o dinger equation with the electronic molecular Hamiltonian we can determine the electronic structure of the molecule that implies physical and chemical properti es of the molecule. In this thesis we use quantum mechani cs to study basic properti es of microhydrated nucl eobases - essen tial building blocks of DNA. Water plays a central role in chemistry and biology by mediating the interact ions between molecu les, altering energy levels of solvated species, modifying potential energy profiles along reaction coordina tes, and facilitati ng efficient proton transport through ion channels and interface s. The effect of hydration on different properti es of molecules and chemical reactions has been inten sively studied for many years both theoretically and experimentally. Numerous sol vation models were developed in an effort to simulate the proper ties and reactions in the bulk water. However, even the interaction between several water molecul es are not completely understood. This work demonstrates how microhydration can affect such properti es as ionization ener gies and control proton transfer mechani sms. It expla ins the experimental results and pro poses mecha nisms of observed effec ts. xiv In order to facilitate theoretical studies the development of new quantum chemistry programs that are able to utilize the resources of modern high-performance hardware is require d. An exact solution for the Schrdinger equation can only be obtai ned for the species with just a few elect rons. However, there are a number of quantum chemistry metho ds that give app roximate solut ions. Among the most widely used and accurate metho ds are coupl ed-cluster methods. In the last chapter the deta ils of the GPU enabled implementation of the post- Hart reeFock ab initio quantum chemistry methods is given. In Chapter 2 a combined theoretical and experimental study of the effect of micro hydration on ionization energies (IEs) of thy mine is pre sented. The experimental IEs are derived from photoi onization efficiency curves recorded using tunable synchrotron VUV radiati on. The onsets of the PIE curves are 8.85±0.05, 8.60±0.05, 8.55±0.05, and 8.40±0.05 eV for thy mine, thymine mono-, di-, and tri-hydrate s, respectively. The com puted (EO M-IP-CCS D/cc-pVTZ) AlEs are 8.90, 8.5 1, 8.52, and 8.35 eV for thymine and the lowest isomers of thymine mono-, di-, and tri-hydrate s. Due to large structural relaxati on, the Franck- Condon factors for the Ot --0 tran sitions are very small shifting the apparent PIE onsets to higher ener gies. Microsolvation strongly affects IEs of thymine - addition of each water molecule reduces the first vertical IE by 0.10-0.15 eV. The adiabatic IE decreases even more (up to 0.4 e V). The magnitude of the effect vari es for different ionized states and for different isomers. For the ionized states that are localized on th ymine the dominant contributi on to the IE reduction is the electrostatic interaction between the delocalized positive charge on thymine and the dipole moment of the water molecule. In Chapter 3 proton transfer in a model system compri sing dry and microhydrated clusters of nuc leobases is investigated. Experiments with mass spectrometry and tunable vacuum ultraviolet synchrotron radiation show that water shuts down ionization-induced XV proton transfer between nucleob ases, which is very efficient in dry cluste rs. Instead, a new pathway opens up in which protonated nucl eobases are generated by proton tran s fer from the ionized water molecule and elimination of a hydroxyl radical. Electronic structure calculati ons reveal that the shape of the potential energy profile al ong the pro ton transfer coordinate depends strongly on the character of the molecular orbital from which the electron is removed, i.e., the proton transfer from water to nucl eobases is bar rier less when an ionized state localized on water is accessed. The computed energet ics of proton transfer is in excellent agreement with the experimental appearance ener gies. Possible adiabatic passage on the ground elect ronic state of the ionized system, while energetically accessible at lower ener gies, is not ef ficient. Th us, proton transfer is con trolled electronically, by the character of the ionized state, rather than statistically, by simple energy considerat ions. Proton transfer from ionized outer water to nucleo bases in dihydrated cluster through the Grott huss-like mechani sm is barrierl ess and the most energetically favorable mechani sm. Chapter 4 describes an implementation of coupled-cluster (CC) post- HartreeFock ab initio quantum chemistry methods, which are widely used in the research, includ ing the research described in previous chapter, on GPU with CUDA C languag e. The implementation of these methods is based on the ccman2 and libtensor librari es and is part of the Q-Chem 4 electronic structure package. These librari es use layered modular architecture which allows relatively easy addition and replacement of low-level modules without changing high-level code (such as CC equat ions). Developed layered architec ture make possible a fast adaptati on of the existing code in the ccman2 library to other languages and technolo gies, such as OpenCL with AMD GPU s or Intel Xeon Phi copro cessor. The development of CC code for new massively parallel architectures is crucial for future research of larger systems with higher accuracy and for QM ! MM metho ds. XVl Chapter 1: Introduction and overview 1.1 Computational study of microhydration effects Computational chemistry is a rapidly developing subfield of theoretical chemistry that uses principles of computer science to solve chemical pro blems by calculat ions. Among computational chemistry methods ab initio quantum chemistry methods play a very spe cial role. These methods are based on the systematic application of the principals of quantum mechanics to chemical systems. The uniq ueness of these metho ds is in the fact that they derive complex physical and chemical properti es of molecul es from the first principl es, that is, the basic physical properti es of such simple parti cles as electron and atomic nucleus and quantum mechanics laws. It is amazing that one can calculate with high accuracy complex properti es of such large systems as nucl eobases tetramers or porphyri nes by just knowing the positions and typ es of at oms. Experiments can give us certain information about molecular properti es and yields of chemical react ions. However, they typically are not able to explain why the properti es have certain valu es and what is the nature of the observed phenomena. Without theory it is often hard to even assign the experimental signals to certain species. However, the ory can do much more than interpretati on of experimental results. It can help develop a mecha nism of an observed phenomena and understand the fundamental physical prin ciples behind it. It is extremely important to understand the fundamental princip les 1 because only then it is possible to extend the obtai ned knowledge to other systems. For example, in the resea rch presented in Chapter 2 of this thesis we had from the experi ment unassigned photoionization effi ciency (PIE) curves of microhydrated thymine clusters with certain ma sses. At first, quantum chemistry calculati ons allowed us to match the differentiated PIE peaks to ionization energies of microhydra tes. Further theoretical investigation led us to a conclusion that the effect of the microhydration on IE energies in these clusters can be expla ined by simple electrostatic interacti on. It gave an insight into the nature of interaction between the water and other molecu les. Thus by applying quantum chemistry meth ods we were able to go beyond just the statement of the experi mental facts about the studied system and derive a broader pictures that have implication to other systems and methods. Ma jor part of this work is dedicated to the study of the effects of microhydration of nucleob ases. It is hard to overestimate the importance of water in chemistry and life sciences. Almost all chemical reactions in living orga nisms and many of the indus trial chemical pro cesses occures in water soluti ons. Thus it is no surpri se that water and its effects on other molecul es and react ions has been intensively studied for a long period of time. Many methods were developed in attempt to model the solvation eff ects. However, we are still far from complete understanding of the behavior of water and its interacti ons with other molecul es. One of the reason is that it is very difficult to model such a complex environment as bulk water. Iso lating limited number of water shell s and studying them with accurate QM methods does not give the correct properti es of the bulk water beca use of the long-range and many-body effects and polarizati on. From the other side, simpler and cheaper metho ds, such as vari ous molecular mechanics methods and semi- empirical metho ds, typically are not able to capture all important interact ions. A 2 number of hybrid QM/MM methods were developed in attempts to overcome the short falls of individual QM and MM techniq ues. However, the results of QM/MM methods can drastically differ between meth ods and even within a single method with differ ent parameter s. In this thesis instead of the study of a complex molecule (such as DNA) in the complex environment of bulk water, the small building blocks of DNA in the smallest possible water environment - microhydrates of nucl eobases in the gas phase - were studied. Both pre cise experimental techniq ues and accurate theoretical methods were applied to discover the exact mechani sms of interaction between water and nude abases. The results obtained for microhydrated nucl eobases are not limited to these systems only but should be similar for other molecules as well. They can be used in other models such as QM/MM and be furth er extended to large systems and bulk water. Quantum chemistry pro vides us with powerful tools for understanding the proper ties of molecu les. Unfortunately, the exact solution for the Schrdinger equation -t he basis of all quantum chemistry meth ods - can only be obtained for the species with only few electro ns. However, there are a number of quantum chemistry methods that give ap proximate solution s. Among the most widely used and accurate methods are coupled-cluster (CC) methods. Most of the computati ons performed in this resea rch were based on the CC metho ds. However, the CC methods have a high computational cost and steep scaling factor (e.g. N 6 for CCSD). In order to be able to apply these meth ods to larger systems such as microhydrated nucl eobases clusters with more than first solvation shell it is crucial to use a faster hardware. Unfortunately, it is no longer possi ble to get better performance with just a single CPU core as it was before. Any modern architecture requires parallel programs for effi cient execut ion. One of the most efficient computation hardware architectures is the Graphical Pro cessing Unit architecture. Many computational science programs have been recently ported to GPU to benefit from their 3 perform ance. Note that there are two main vendors - AMD and NVIDIA, that produce GPUs and there are several languages (e.g. CUDA C, OpenCL, OpenACC) designed to program for GPU. To complicate matters further, there are different competing archi tectures such as Intel Xeon Phi coprocesso r, multi core CPU s, APU s, vector pr ocessors, etc. It is hard to predict which technologies will better work with the CC metho ds in the future and it is very time-consuming to develop a separate programs for every tech nology. The goal of the research presented in Chapter 4 was not just develop a GPU capable implementation of the CC meth ods but to design a library architecture that can be used for implementat ions with other architectures with mini mal effort s. 1.2 Electronic structure methods 1.2.1 Methods for the ground electronic state The ground state is of parti cular intere st in electronic structure theory beca use most of chemical reacti ons occur on the lowest-energy PES. One of the conventional approaches to approximately solve for the ground-state energy is to use Hartree-F ock method, which employs variational principle and is based on the mean-field desc ription of electron-electron interac tions. Using a set of one-electron basis set functi ons, one solves a system of Roothaan equati ons self -consist ently to obtain molecular orbital s and their energ i es. The Hartree-F ock method can be roughly described as taking the following steps. The reader is encouraged to consult electro nic structure texts 1 •2 for more details. 1. A one-electron basis set is selected for each atom of the molecule. In this work we use Pop le's split-valence 3 - 5 and Dunning 's 6 collections of basis sets, which are based on the Car tesian Ga ussian representation of Slater-type orbital s. Using 4 the basis set, orbital s of each type (s, p, d, etc.) are created. They are called atomic orbital s �w 2. Initial molecular orbital matri x C and Pock matrix F are formed using core Hamil- tonian, superposition of atomic densit ies, Hi ickel theory or another similar strat- egy. The orbital overlap matrix Si s computed. 3. Roothaan equation FC= SCE is solved for the matri x C and the diagonal matrix E. Because Pock matri x F itself depends on C, the equation has to be solved self- consi stently. Iterati ons continue until the converg ence criterion (usually IC(i) - c(i-l) I < e) is met. Resulting from this procedure, the C matrix conta ins in its rows the expansion coef- ficients for obtai ning molecular orbitals 'l' i in terms of atomic orbitals (1.1) and the diagonal matrix E conta ins cor responding molecular orbital energi es. Hartree-P ock theory does not include electronic correlati on, but prov ides a starting point for correlated methods. Coupled-cluster theory It is well known that the inclusion of electron correlation is essen tial for the accurate determination of molecular propert ies. Among the vari ous approaches to the correlation pro blem, coupl ed-cluster theory has proven to be very powerful and accurate. Since the first application of the CC methods to the electron correlation problem by Cizek in 5 1966 7 • 8 , they have become one of the most successful and widely used tools of quan- tum chemistry. Although the CC equat ions are more complicated than the configuration interaction equations, they have significant benefits over CI methods. One of the main advantages of CC methods is size extensivity which assures that the results of calcula- tions scale properly with the size of the system. This is a very important property for calculati ons of interaction ener gies, chemical reactions and potential energy surfa ces. Similar to the CI meth od, CC theory rep resents the wave function of the ground state as an expa nsion in terms of the reference deter minant, then all determinants with sin- gle substitut ions, followed by double substitut ions, and so on. Unlike CI, which uses a linear excitation operator to generate the expan sion, in CC theory the wavefunction is written in the following exponential form 8 : ['¥cc) = exp ( T) [<Po) where [<Po) is a single Slater determinant (usually Hart ree-Fock determinant) called the reference determinant and T is a cluster operator, which consists of a series of n- tuple excitation operato rs: Each term in the series can be expressed in the second quantization form : and so on. These equati ons follow the conventional notati on: i, j, ... stand for occupied in the reference determinant I <Po) spin- orbital s, a, b, ... designate virtual (unoccupi ed) spin- orbital s. The creation and annihilation operators a b and a p create or remove an 6 electron from the spin- orbital [p). When T includes all excitation operators up to Tn for n electron s, the result is equivalent to that from ful l-CI(FCI) calculati ons as all possible excited determinants are present in the expa nsion. The CC energy in that case conta ins 100% of the electro nic correlation energy within the given basis set. Using the full expan sion of the cluster operator T, however, is impractical because the computational complexity of the method grows exponentially with the size of the system. FCI results have only been reported for di- and triatomic molecul es in rather small basis sets. To reduce the scaling of the CC method, the T operator is tru ncated. Because of the expo- nential form of the wave function ansatz, CC theory retains size- extensivity even with the truncated cluster operator. This is unlike CI, in which only the full expansion is size-extensive. Advantages of the CC approach are apparent from the fact that for any truncation of T such as T�T 2, still introduc es certain types of quadru ple, hextuple and higher excita tions into the calcu lat ions via the T� ,T�, etc. term s. This arg ument is applicable to any level of truncation of the T operator. These two quali ties make the truncated CC models superior to their CI counterpa rts. Limiting the expa nsion of T to the first two term s, for example, gives rise to coupl ed- clusters with single and double excita tions (CCSD) method, initially implemented by Purvis and Ba rtlett 9 [ l.JlccsD)= exp ( T 1 + T2) [<I>o) Among all iterative CC methods, CCSD pro vides the most attractive balance between the computational cost and accuracy. When chemical accuracy is desired (of 7 the order of a few kcal/ mol), the energy can be further improved using a non-iterative CCSD( T) correction via perturbati on theory 10 . The wave function and energy are obtained by solving the system of CC equati ons for the T amplitud es: (<I>o[ exp (- T ) Hexp ( T)[ <I>o) =Ecc (<Po[ exp ( - T)H exp ( T) - Ecci<I>i) = 0 In practice the CC equati ons are solved numerically using the Jacobi method for the system of linear equat ions in the matri x form. The vector solution is the CC amplitu des and the update procedure is formulated as a recurring expre ssion for the T vector 11 . To improve the converg ence of the scheme, the direct inverse of the iterative subspace (DIIS) 1 2 method is ap plied. For example, The Schr o dinger equation for CCSD is: (H-Eccso)P ccso = (H-Eccso) exp( T 1 + T2) [ <Po)= 0 A set of equat ions sufficient for deter mining the ti and ti l coefficients can be obtained from the proj ections onto a reference determinant [<Po) and all excited determinants generated by the action of T on I <Po) 9 . When this set of equat ions is appropriately factori zed, all terms involve the contraction of a cluster operator T with either two- or four-index quant ities 13 . In the case of CCSD, the quanti ties that are contracted with the amplit udes tf and ti l are one- and two-particle intermediates in which the correspond ing Fock matri x element fpq or anti symmetrized molecular orbital integral ( pq II rs ) is the leading term of an expansion which also conta ins contract ions between Fock matrix elements and integral s with T amplit udes. The calculati ons of these intermediates and in particular the contracti ons of the amplit udes with the intermediates have the largest 8 computational cost in CCSD method. The addition of tr iples correction in CCSD( T) method besides tensor contraction adds a specific operation similar to dot-product oper- ation in matri x algebra but requires an elementwi se divi sion by the sum of elements of delta- matri ces. This operation is used in the tri ples correction to the converg ed CCSD energy expre ssion 10 : t( a ) t( b ) 11£ = L ijkabc ijkabc ijkabc �ia + �jb + �kc 1.2.2 Methods for excited electronic states Equation-of-motion method Equati on-of-motion approach is a powerfull and versatile electro nic structure method allows the desc ription of many mul ti-confi gurational wavefuncti ons within a single- reference forma lism. The formal starting point for this method is the CC wavefunction for a convenient reference state with N elect rons. Applying the general linear excitation operator R on the ground state CC wave function yields the excited state wave function (1.2) (the integer m identifies the excited state). The structure of the excitation operator R is the same as in CI. It can be expanded as (1.3) where R 1 generates all possible singly excited determi nants with respect to the reference deter minant, R 2 generates all doubly excited deter minan ts, and so on. 9 In the limit of the zero cluster operator T = 0, this approach becomes regular CI as the excitation operator R acts directly on the refer ence determinant. If the series for R is not tru ncated, the operator generates all possible excitat ions. In this case the full CI result is recovered, no matter what the cluster operator T is. The idea can be reformulated and generalized by introducing a similarity transfor- mation for the electro nic Hamiltonian H using the cluster operator: fi = exp ( � T)H exp ( T) (1.4) Regard less of the choice of T, the simila rity transformed Hamiltonian fi has the same spectrum as the original Hamiltonian H. That allows us to "fold" the electronic corre- lation in the form of the cluster operator into the Hamiltonian and find the eigenvalues (targ et state energ ies) and eigenvectors (targ et state wave functi ons) in vari ous subs paces of the Pock space. When T and R are truncated (e.g. at single and double excitati on), the EOM models are numerically superior to the cor responding CI models, beca use cor relation effects are "folded in" in the transformeds Hamiltonian. This approach is called equat ion-of-motion coupled-clusters (EO M-CC) method 14 - 22 . If the excitation operator R promotes one or more electrons from occupied to virtual orbital s, it can be written in the second quantization form as (m) R 0 = ro R (m) = '\' r£?-ata· 1 i....J z a z za (1.5) This form of R pre serves the number and spin of elect rons. It is used in EOM-CC for excitation energies (EO M-EE-CC), which yields the ener gies and wave funct ions for electronically excited state s. 10 Operator R can be designed to remove or attach electron s, and change their spin (Fig. 1.1). This gives rise to a suite of EOM-CC methods: for ionization energies (EO M-IP-CC), for attachment energies (EO M-EA-CC), spin-flip (EO M-SF-CC). It is a powerful toolkit that al lows one to start with a convenient well-behaved reference state, and form target states by applying the different flavors of the operator R 23 . EOM operator R is written in the tensor form in the basis of molecular orbital s. The elements of this tensor (lf, lff in Eq. 1.5) are called EOM amplit udes. They are found by solving the eigenproblem (1.6) where (<I> .ul are all ,u:- tuply excited with respect to the reference deter minan ts. Just like the cluster operator T in CC, the operator R can be truncated at differ ent excitation level s. In EOM-CCSD 16 • 17 • 2 4 • 25 , both T and R include terms up to T 2 and R 2 . The EOM-CCSD metho ds give an error in the range of 0. 1 �0. 3 e V for excitation energ ies. 1.2.3 Density functional theory methods Long-range- corrected density functi onals In long-range-co rrected functi onals, a ra nge-s eparated representation of the Coulomb operator 26 • 2 7 is used to mitigate the effects of the self-interaction error. The contribu tion from the short-range part is described by a local fu ncti onal, whereas the long-range part is desc ribed using the exact Hart ree-Fock exchan ge. The separation depends on a parameter y. In ro PB97X 2 8 , y and other parameters are optimi zed using standard training sets. Benchmark results have demonstrated consi stently improved performance relative 11 * * 4-'o * 4-'o * -++ +--+ * EOM-EE-CC (excitation ener gies) +- +-+-+- +-* *+--+ *+- * --++- 4-'; lllij EOM-IP-CC (ionization potential) +- -++-+-+- +- * +-+--+ ** +- +--++-* lila 4J� I EOM-EA-CC (electron attachment) +- *-+--+ EOM-SF-CC (spin-flip) Figure 1.1: Action of diffe rent flav ors of the EOM excitation operator fo rms dif ferent sets of target states giving rise to a suite of EOM-CC meth ods. 12 to non-long-range- corrected functi onals. In roPB97X-D 2 9 functional parameters were reoptimi zed to include empirical atom-atom dispersion correcti ons. Te sts show that for non-covalent systems, roPB97X-D shows slight improvement over other empirical dispersion- corrected density functi onals, while for covalent systems and kinetics it per forms noticeably better. Relative to roPB97X, the new functional is signifi cantly superior for non-covalent interact ions, and very similar in performance for bonding interacti ons. 13 1.3 Chapter 1 references [1] A. Szabo and N.S. Ostlund. Modern Quantum Chemist ry : Introduction to Advanced Electronic Structure Theory. McGraw-Hill, New York, 1989. [2] T. Helgaker, P. Jorgensen, and J. Olsen. Molecular electronic structure theory. Wiley & Sons, 2000. [3] W.J. Hehre, R. Ditch field, and J.A. Pople. Self- consistent molecular orbital meth ods. XII. Further ext ensions of gaussian-type basis sets for use in molecular orbital studies of organic molecu les. 1. Chern. Phy s., 56:2257, 1972. [4] P.C. Hariharan and J.A. Pop le. Th eor. Chim. Acta, 28:213, 1973. [5] R. Kri shnan, J.S. Binkley, R. Seeger, and J.A. Popl e. Self-consistent molecular orbital metho ds. XX. A basis set for correlated wave func tions. 1. Chern. Phy s., 72:650, 1980. [6] T.H. Dunning. Gaussian basis sets for use in correlated molecular calculati ons. I. The atoms boron through neon and hydr ogen. 1. Chern. Phy s., 90:1007-1023, 1989. [7] J. Cizek. On the correlation problem in atomic and molecular systems. Calcu lation of wavefunction components in Ursell-type expa nsion using quant um-field theoretical methods. 1. Chern. Ph ys., 45:4256-4266, 1966. [8] J. Cizek. Adv. Chern. Phy s., 14:35, 1969. [9] G.D. Purvis and R.J. Bartlett. A full coupled-cluster singles and doubles model: The inclusion of disconnected tripl es. 1. Chern. Phys., 76:1910-1918 , 1982. [10] K. Raghavachari, G.W. Tr ucks, J.A. Pople, and M. Head-Gordon. A fifth -order perturbati on compari son of electron correlation theori es. Chern. Phys. Lett ., 157:479-483, 1989. [11] J. Ga uss, J.F. Stanton, and R.J. Bartlett. Coupl ed-cluster open-shell analytic gra dien ts: Implementation of the direct product decomposition approach in energy gradient calculat ions. 1. Chern. Phys., 95 :2623-263 8, 1991. [12] P. Pulay. Chern. Phys. Lett ., 73:393, 1980. [13] J.F. Stanton, J. Gauss, J.D. Watt s, and R.J. Bartlett. A direct product decomposition approach for symmetry exploitation in many-body metho ds. I. energy calculati ons. 1. Chern. Phys., 94:4334-4345, 1990. 14 [14 ] D.J. Rowe. Equat ions- of-motion method and the extended shell model. Rev. Mod. Phys., 40:1 53-166, 1968. [15] K. Emrich. An extension of the coupl ed-cluster formalism to excited states (I). Nucl. Phys., A35 1:379-396, 1981. [16 ] J. Geertsen, M. Rittby, and R.J. Bartlett. The equati on-of-motion coupled-cluster method: Excitation ener gies of Be and CO. Chern. Phys. Let t., 164: 57-62, 1989. [17] J.P. Stanton and R.J. Bartlett. The equation of motion coupled-cluster method. A systematic biorthogonal approach to molecular excitation ener gies, transition probabilit ies, and excited state properti es. 1. Chern. Phys., 98:7029-7039, 1993. [18 ] S.V. Levchenko and A.l. Krylov. Equati on-of-motion spin-flip coupled-cluster model with single and double substitut ions: Theory and application to cyclobu tadiene. 1. Chern. Ph ys., 120(1):175-185, 2004. [19] D. Sinha, D. Mukhopadhyay, and D. Mukher jee. A note on the direct calculation of excita tion-energies by quasi-deg enerate MBPT and coupled-cluster theory. Chern. Phys. Le tt., 129:36 9-374, 1986. [20] S. Pal, M. Rittby, R.J. Bartlett, D. Sinha, and D. Mukher jee. Multiref erence coupled-cluster methods using an incomplete model space - application to ionization-potential s and excita tion-energies of formal dehyde. Chern. Phys. Lett ., 137:2 73-278, 1987. [21] J.P. Stanton and J. Gauss. Analytic energy derivatives for ionized states described by the equat ion-of-motion coupled cluster method. 1. Chern. Phys., 101(10 ):8938- 8944, 1994. [22] M. Nooijen and R.J. Bartlett. Equation of motion coupled cluster method for elec tron att achment. 1. Chern. Phys., 102:3629-3 647, 1995. [23] A.l. Krylov. Equati on-of-motion coupled-cluster methods for open-shell and elec tronically excited spec ies: The hitchhiker 's guide to Fock space. Annu. Rev. Phys. Chern., 59:433 -462, 2008. [24] H. Sekino and R.J. Bartlett. A linear response, coupl ed-cluster theory for excitation energy. Int. 1. Quant. Chern. Symp. , 26:255-265, 1984. [25] H. Koch, H.J .A a. Jensen, P. J ¢r gensen, and T. Helgaker. Excitation ener gies from the coupled clusters singles and doubles linear response functi ons (CCSDLR). Applicati ons to Be, CH+, CO, and H 2 0. 1. Chern. Ph ys., 93(5):3 345-3350, 1990. 15 [26] H. Iikura, T. Tsuneda, T. Yanai, and K. Hirao. A long-range correction scheme for genera lized- gradient- a pproximation exchange functional s. 1. Chern. Phys., 115:3 540, 2001. [27] R. Baer and D. Neuhauser. Density functional theory with correct long-range asymptotic behavior. Phys. Rev. Lett., 94:043002, 2005. [28] J.-D. Chai and M. Head-Gordon. Systematic opti mization of long-range corrected hybri d density functional s. 1. Chem Phys., 128:084106, 2008. [29] J.-D. Chai and M. Head-Gordon. Long-range corrected hybrid density fu nction al s with damped atom-atom dispersion interact ions. Phys. Chern. Chern. Phys., 10:661 5-6620, 2008. 16 Chapter 2: The ef fect of microhydration on ionization energy of thymine 2.1 Introduction Ionization of nucleic acid bases is involved in DNA radiation and photo-damage and may eventually lead to dangerous mutati ons with a risk for cancer and neurodegenerative deceases. Due to their relatively low ionization ener gies, individual nucleo bases are the most likely components of DNA to be oxidized. Their IEs, however, are affected by cou pling to DNA's sugar-phosphate backbone, hydro gen-bonding and rc-st acking between neighboring bases, as well as interact ions with solvating water molecules and counter ions. Quantif ying the exact nature of the se effects has proven difficult, and even the val ues of IEs of solvated DNA are still controv ersial. For example, a recent computational study 1 that attempted to compute the IE of guanine in solvated DNA by a QM/MM approach reported a large increase ( 4 e V) of guan ine's IE, contrary to the conclusions drawn from experimental studies 2 . As part of our effort to understand the role played by differ ent interact ions of NABs with the environment 3 - 8 , we recently characterized the effect of hydrog en bonding and 17 rc-stacking on the ionized states of the gas-phase dimers of AA, TT, AT, and CC in a combined experimental and theoretical study 7 • 8 . We found that stacking and h-bonding can reduce the IEs by 0.4-1 eV via two distinct mechani sms: hole delocalization and electrostatic charge-dipole interac tions. We al so analyzed ionization-induced structural changes in isolated nucleic acid bases 9 and in uracil dimers 4 • 5 . Consistent with delo calized character of the highest occupied molecular orbital (HOMO), structural changes involve several CC and CN bonds, the largest change being for the double CC bond. In bare NABs the relaxation energy is 0.2-0.4 eV, whereas in the dimers the difference between adiabatic and vertical IEs is larger. To quantify the effect of structural relax ation on photoionization efficiency curve s, we computed Franck -Condon factors (FCFs) for the lowest-ene rgy tautomers of NABs 9 . In all cases, we observed that the 0+--0 tran sitions have non-negligible FCFs and that the onsets of the PIE curves indeed represent AlEs. Microsolvation has been found to decrease the IEs of nucl eobases by about 0.1 eV per water molecule 6 • 10 . The absolute values of IEs reported in VUV 6 and electron impact 10 studies are slightly different. An excellent agreement of the IE of isolated thy mine deter mined from the PIE curve (8.90±0.05 e V) 6 with the value derived from MATI spectra (8.9178±0.0010 eV) 11 va lidates the accuracy of the synchrotron VUV measurements 6 . The computational studies 1 2 • 13 performed with B3LYP predicted similar magnitude shifts and pointed out substantial geometric relaxation in hydrated species leading to even larger chang es in AlEs. These studies were motivated by differ ences between the results obtai ned by the two experimental approaches, i.e., using electron impact ioniza tion and VUV photoionization 6 • 10 . A number of tau tomers were calculated to interpret 18 the early electron impact resul ts, while attempts were made to fit the appearance energy measurements to various vertical and adiabatic values. In other nucleobas es, micro hydrati on leads to similar effects 6 , although the magni tude of the IE drop var ies. For examp le, the changes in AlE in adenine-water clusters 6 · 14 are smaller than those for thy mine. The effect of microsolvation on electronically excited states and photoinduced dynami cs in nuclear bases and other model chromophores has been investigated theo retically and experimentally 15 - 22 . In addition to perturbati ons to the electronic spectrum and differential stabilization of excited state s, microsolvation can open up new relax ation channel s includ ing, among ot hers, hydrogen/proton and electron transfer, charge transfe r-to-solvent states, and zwitter- ion formati on. The theoretical treatment of ionized species is challenging owing to their open-shell character and electro nic near-degeneracies 23 · 2 4 . Wave-function approaches using dou blet refe rences often suffer from spin-con tamination and symmetry-breakin g, which result in hole over-localization 23 . DFT metho ds are affected by self-interaction error leading to charge over- delocalizati on. Owing to these def ects, computational studies often observe artif acts of electronic structure metho dology rather than real physical properti es of these systems. We employ the EOM-IP method that is free from the above pro blems and is the method of choice for these systems. EOM-IP describes open-shell target states as ionized states derived from well behaved closed-shell neutral reference wave functi ons (see Section 2.2.1). We also use DFT with a range-s eparated functional (roB 97X-D) that greatly reduces self-interaction errors 25 · 26 . The appearance ener gies of microhydrated thy mine ions [T(H 2 0)n, n=1-3)] have been reported previ ously by our collaborators from Dr. Musahid Ahmed group 6 . In the present work they have performed the measurements with a broader energy range 19 and report improved error bars. Furthermore. derivation of the PIE curves as reported by us for NABs and their dime rs. allows for qualitative interpretati on of the VIEs for thy mine and hydrated thymine as maxima at obtai ned curv es. In the earlier experimental work 6 • the beam contained mixed adenine-thymine and water cluste rs. and in the new experi ments reported here thymine al one was microhydrated with water. T1 (11. 3) T2 (9.1) T3 (8.9) T4 (6.7) Figure 2.1: Structures and binding energies (D0 kcal!mol) of the considered thymine-water monohydrates, dihydrates and trihydrates, CCSD /cc-pVTZ at RI MP2/cc- pVTZ geometry. Previous experimental and theoretical studies on the microhydration of thymine did not provide a detailed physical picture of the ionization processes. Hence. the focus of this work is on quantif ying the effect of microsolvation on IEs and on understanding its origin. We consider several isomers of the microhydrates shown in Fig. 2.1 (all struc tures cor respond to the lowest-energy tautomer of thy mine). In order to unambiguously compare with the experimental measurements. we also perform modeling of the FCFs 20 for the lowest electronic state of the cations. Accurate FCF calculati ons for hydrogen bonded systems of such complexity are rare and provide important benchmarks as well as insight into the spectroscopy of biologically relevant species. All experimental data presented in this chapter were obtained by Dr. Musahid Ahmed group from Lawrence Berkeley National Laboratory. 2.2 Experimental and computational techniques 2.2.1 Electronic structure calculations Open-shell doublet wave fu nctions can be formally derived from a closed-shell systems by addition or subtraction of an electro n. As such, they can be desc ribed accurately and efficiently by the ionization potential (IP) and electron affinity (EA ) variants of equat ion-of-motion coupled-cluster (EO M-CC) methods 2 7 - 30 . EOM-IP, which relies on the N-electron closed-shell refe rence, is free from the symmetry breaking and spin contamination probl ems that are ubiquitous in open-shell calculat ions, and is capable of describing charge localization patterns in ionized clusters 4 • 7 • 23 • 31 . EOM-IP simultane ously includes dynamical and non-dynamical correlati on, describes multiple electronic states in one calculati on, and treats states with different number of electrons on the same footi ng. Using the EOM-CC family of metho ds, electronically excited, ionized, or attach ed states of the thy mine-wa ter clusters can be computed starting from the same closed-shell CCSD (coupl ed-cluster with singles and dou bles) reference wave function of the neutral 2 4 • 32 . 21 The target open-shell wave fu nctions are generated by a Koopmans-like excitation operator R acting on the reference CC wave funct ion: (2. 1) where <I>o ( N) is the refe rence determinant of the N-electron neutral system, T is the coupled-cluster excitation operator including single and double substitut ions, and R consists of lh and 2hlp (1 -hole and 2-hole- 1- parti cle) operators generating (N �I ) electron determinants from the N-electron refere nce. Amplitudes T are found by solv ing CCSD equat ions for the ground-state wave function of the neutral, while amplitu des R are obtained by subsequent diagonalization of the similarity transformed Ha miltonian, fi =e - THer. EOM-IP-CCSD yields accurate energy splitt ings and smooth potential energy sur faces al ong charge transfer coordinates 23 . This method has been succes sfully applied to desc ribe electronic structure of ionized benzene dimers 31 • 33 , water clusters 34 - 36 and dimers of nucleo bases 3 - 5 , 7 • 8 . We also employed DFT with the long-range and dispersion- corrected roB 97X D functional 26 (for geometry optimization and frequency calculat ions). Long-range Hart ree-Fock exchange included in roB 97X-D miti gates the effect of self-interaction error yielding accurate structures and frequ encies 7 • 9 . There are several tau tomers of thymi ne; however, the energy gap between the canon ical form and next most stable tautomer is more than 10 kcal/mol 37 in the gas phase. Since we used thermal vaporization to generate th ymine in the gas phase, there is not enough energy to populate higher-lying tautomer s. Th us, the canonical form should be predominantly present in the molecular beam. 22 0 H3C u " / Ji: :: CH 3 1.374 N us1 0 1.006 1 T2 ...__ /o .957 0.971 1.896 1.021 N u81 (o.ow) T1 0.958\ T3 Figure 2.2: Equilibrium structures of thymine and three thymine monohydrates opti mized by RI-MP2/cc- pVTZ. Bondlengths and changes in bond lengths due to interactions with water are shown (in A). We considered the three most stable monohydrated thymine isomers (Tl-T3, see Fig. 2.1) and the two thymi ne-(H 2 0) 2 structures (Ti l, Tl2) obtained by Hobza et al. using the molecular dynami cs/quenching technique (MD/Q) 37 , The forth most stable monohydrate structure (T4) shown in Fig. 2.1 is 4.4 kcal/mol higher in energy than the most stable TL Therefore we excluded T4 from further conside rati on. Thymine trihydrate structures (Ti ll and Tl l2 ) were obtained by addition of a water molecule to dihydrate structu res. All ground-state geometri es were optimi zed using the RI-MP2/cc- pVTZ and roB 97X-D /cc-pVTZ metho ds, which yielded similar resul ts. For example, the differences in all bond lengths for the thy mine molecule in T1 is less than 0.01 A; 23 1 400 (0 001) ., ./ 10 11 I )j N --:- 375 co ooo) 1 352 (-0 006) (0 002) 1 229 (0 013) 1372 N � (-0 002) 1 365 10 24i (-0016) 1 784 (0 018) 1764 071 5 ______..P--- 0 957 / 0 975 ° 957 1 761 Tll 1.401 (0.002) ./ / 1.011 N 3 ;o 2 ::o) (0.017) 1 372 N 1.361 \ (-0 002) I (-0.020) 1 034 1.722 (0 028) t"" 1 6..... 6 0 --- -.� o: , Tlll o971 I --- /.J957 1.2320 l H 3 C ( :::: ) � 0 ::9 1 ( �-��� ) I � N (- � - ��� ) (0.010) . 1.227 (0.011) 1.017 1.871 1.368 N 1.374 1 (0.011) (-0.006) s: -0.007) 0.973 1.865 0.958\ Tl2 / o9ss 1.232 0 ); -;;;-- (0.011) 1.451 13 90 19 01 H 3 C (-o.oos) (-o 009) 10 20 I N 1.375 (o.oo9) 1.353 �-0.006) (0.003) 1.229 (0.011) 1368 N � (-0.006) 1.368 1.024 i (- O . Ol 3) 1.781 (0.018) 1.763 0.976 �--- 0.95 1 1.755 0.957 Tll2 Figure 2.3: Equilibrium structures of thymine dihydrate aud trihydrate opt imized by RI-MP2/cc- pVTZ. Bondlengths and changes in bond lengths due to micro hy dration are shown (in A). the hydro gen bond between the N-H group of thymine and the oxygen of water is 0.028 A shorter and the hydrogen bond between C=O of th ymine and the hydrog en of water is 0.026 A shorter for RI-MP2 (1.871 and 1.9 03 A respectively) than for roB 97X-D (1.8 99 and 1.929 A, respectively ). Figs. 2.2 and 2.3 show the RI-MP2 ground states geometr ies. The respective Car tesian geometries as well as roB 97X-D structures are given in Supplemen tary Materials for Ref. 38. 24 0958 >CM � (-0013) 1 383 (0 023) 0 958 (-0 0 05)- ,_AJ -.I£..,? Ol) CH 3 /16 7 0 14 01 I �N 7 (-O n6) (0 050) 4) 11 90 1312 N 1446 d-0025) 1(0063) (� 009) T2 and T3 Figure 2.4: Equilibrium structures of ionized thymine and thymine monohydrates opti mized by roB97X-D/cc-pVTZ. Bondlengths and changes in bond lengths due to ionization are shown (in A). The equilibrium structures of the cations were computed with roB 97X- D/cc-pVTZ (see Figs. 2.4 and 2.5). This fu nctional. when used with 6-3 11 ++G(3d f.3pd) basis set. was shown to desc ribe geometri es and binding energies of the weakly bound complexes with mean absolute errors of 0.064 A and 0.22 kcal/mol. respectively 26 . For the ionized thy mine monohydrate s. this functional (with the cc-pVTZ basis) yields structures that are similar to those obtained with EOM-IP-CCSD/6 -3 11 +G(d.p). i.e .. the mean absolute errors are 0.01 A for the thy mine moiety and 0.11 A for th ymine-water hydrogen bond. All IEs were calculated using EOM-IP-CCSD /cc-pVTZ at the equilibrium geome- tri es described above. The cc-p VTZ basis set provides a good balance between accu racy and computational effi ciency. The first IE of th ymine computed using EOM- IP-CCSD with the extended cc-pVTZ basis [augmented by diffu se s and p fu nctions from 6-3 11 ++G(d.p) as was done in Ref. 39] is 9.20 eV. which is 0.07 eV higher than the cc-p VTZ value. Zero point energy (ZPE) correction lowers AlEs of thy mine and thy mine monohydrates (Tl and T2) by 0.08 e V. Thu s. due to error cancellation. non ZPE-corrected AlEs computed with cc-p VTZ are very close to the ZPE-corrected AlEs obtai ned with the extended cc-p VTZ basis set. 25 Tll Tlll 1 202 0 (-0 030) 14 86 (0 026) T12 T112 0957/ 0 957 16 88 � 09 5 8 I (-0 014) 0 9 5 7 -- (-0001) Figure 2.5: Equilibrium structures of ionized thymine and thymine di- and trih y drates opt imized by roB97X-D/cc-pVTZ. Bondlengths and changes in bond lengths due to ionization are shown (in A). The charge distribution analy sis were performed using Natural Bond Orbital Pack age (NBO, v. 5.0) 40 . All calculati ons were conducted using the Q- CHEM electronic structure package 41 , Molecular structu res, frequ encies, and relevant total ener gies are given in Supplemen tary Materials for Ref. 38. 26 2.2.2 Calculation of the Franck-Condon factors and PIE curves Unambiguous compari son with the experimental PIE curves requires calculation of FCFs. While in molecular systems (i.e., ionized NABs), FCFs can be reliably com puted using the doubl e-harmonic approximation with Duschinsky rotati ons 4 2 , as was done in Ref. 9, the calculat ions in clusters are more challenging due to large structural relaxation of soft (and anharmonic) inter-frag ment degrees of fr eedom. To correctly describe the se effec ts, we combine double-harm onic treatment of the thy mine moiety and water with a one-dimensional quantum treatment of the inter-fragment coordinate assuming that the water-thymine and intramolecular th ymine vibrati ons are uncoupled and that the respective FCFs are multiplicative. Using the ezSpectrum program 43 , we first compute FCFs for the thymine moiety using double-harm onic ap proximation with Duschinsky rotati ons 4 2 at the Cs geometry using normal modes and fr equencies for the non-planar structures with one hydrogen of water out of the plane. The water molecule itself was excluded from this calculation and only the geometry of the thy mine moiety (from the monohydrate) was used. Duschinsky rotati ons are important because the normal-mode overlap matrix for the neutral and the ionized states is signifi cantly non-diagonal. In these calculati ons, we used harmonic freq uencies and structures computed by roB 97X- D/cc-pVTZ for both the neutral and the 1st ionized states of the monohydrate s. The effect of water-thymine degrees of freedom on FCFs is described by a one dimensional model, which takes into account anharmonicity and large structural relax ati on. This treatment is similar to the intri nsic reaction coordinate connecting the initial and final state. Wa ter-thymine motion is defined by three coordina tes: r, the distance between the water oxygen and the nearest hydro gen in thy mine, 8, the angle formed by 27 0 , - - �� - - - --- ,... __ _ _ ' •, :E-"' (n ' " · ... ... 'I' Figure 2.6: Coordinates describing relative water-thymine motion. the NH bond in thy mine and the oxygen in water, and <p, the rotation of the water center- of-mass relative to the axis defined by r (Fig. 2.6). In the neutral th ymine monohydrate (T1), these coordinates obtain the valu es r = 1.9 23 A, 9 = 36.7°, and <p = 40.6°, while in the cation they equal r = 1.668 A, 9 = 1.6°, and <p = 2.2°. Along the simplest path, each coordinate is described by a linear equation connecting the valu es in the neutral and in the cati on. This path is used to evaluate two potential surf aces, V(r), one with the thy mine held at its equilibrium neutral geometry and the second with thy mine held at its equilibrium cation geome try. The effective one-dimensional Hamiltonian describing this motion is: A 1 a 1 a2 1 a2 1 a2 A H = - 2m ar- 2maf2- 2mr2 a92 - 2/ a� + V( r ), (2.2) 28 where m is the mass of water and I is the moment of inertia of water rotating in the plane of the molecul e. This equation was then solved to obtain vibrati onal eigenstates for the water motion on the neutral and cation surfac es which, in turn, were used to compute FCFs. Within the approximation that the vibrational mode cor responding to the water motion is decoupled from the vibrational modes of the thy mine, the energy associated with the water-water vibrational tran sitions is additive to the thy mine-only spectra and the FCFs are mult iplicat ive. Each peak appearing in the spectrum asso ciated with the thy mine moiety thus has the spectrum for the water fr agment super imposed on top of it. This leads to both qualitative and quantitative changes in the theoretical spectr um, since the peaks with the largest FCFs in the water vibrational motion are the 1 +--0 and 2+--0 peaks. 2.2.3 Experimental details The experiments were performed on a molecular beam apparatu s coupled to a 3 meter VUV monochro mator on the Chemical Dyna mics Beamline at the Advanced Light Source (ALS ). The thermal vaporization source has been described recently in a pub lication detailing the microsolvation of DNA bases 6 . The nozzle consisted of a 0.953 em diameter disk (1 mm thick) with a 100 J1Ill diameter center hole welded on to one end of a closed stain less steel tube of 0.953 em OD and 15.24 em long. This front end of the sta inless steel tube contained thy mine and could be heated to between 298 and 700 K with a cartri dge heater mounted in an aluminum heating block. The temperature of the tube was monitored with a thermocouple to the heating block. To produce the water complexes, Ar carrier gas at 58.7 kPa was passed over a water reservoir held at room temperature and directed into the stain less steel nozzle. The temperature utilized for generating thymine vapor was 503 K. 29 700 T 600 500 TW � ro 4oo � c � 300 0 0 200 100 0 100 150 200 250 300 m/z Figure 2.7: Mass spectrum of microhydrated thymine recorded at a photon energy of lO eV. Shown in Figure 2.7 is a representative mass spectrum of microhydrated thymine recorded at a photon energy of 10 e V. The main peak is th ymine followed by thymi ne- water cluste rs. where up to five waters clustered around th ymine are detecta ble. Also observed are the thymine dimer with up to four water clusters connected to the dimer. The clusters are ionized by tunable synchrotron radiation in the 8.0 -13.0 eV region. and the ions are detected by a time-of-flight mass spectrometer. For each mass. the yield of the ions is measured as a function of photon energy. which produces PIE spectra. The typical step size for the PIE scans is 50 me V with a dwell time of 30 s at a repetition rate of 10 kHz. The dif ferentiati on of the PIE curve s. following the method used by Berkowitz in interpreting the photoionization of methanol 44 • produces a spectrum sim- ilar to a photoelectron spectrum from which information about vibrati onal progre ssions and other electronic states can be extracted. The differentiation is performed numeri- cally after taking a five points nearest- neighbor average to reduce the effects of noise in the PIE. The accuracy of reported onset energies in the PIE spectra is 0.05 eV. 30 2.3 Results and Discussion 2.3.1 Ionization-induced geometry changes in thymine hydrates We begin with an overview of the structures of thymine and thymine/water clusters summarized in Figs. 2.2 and 2.3. The minimum energy structures of the thy mine mono hydrates (see Fig. 2. 1) are non planar with the hydrogen atom of water out of the plane of thymine (the dihedral angle is 78 ° for T1). However, the energy difference between the planar (optimized with Cs constraint) and non-planar structures is less than 0.5 kcal/mol . Moreover, the effect of non-pl anarity on VIE is also small (about 0.02 eV). Thu s, we employed planar (Cs) geometries in all calculati ons of th ymine monohydrates for computational effi ciency. The binding energy per hydro gen bond increases with addition of each water molecule. Binding energies (non ZPE-corrected, De) are 11.3, 9.1, 8.9 kcal/mol for the T1, T2, and T3 monohydrate s, respe ctively. For the dihydrates they are 23.4 and 20.1 kcal/mol for Tl l and T12, and for the trihydra tes (T1 11 and Tl 12) they are 32.8 and 31. 0 kcal/mol. The opti mized geometries reveal that the only structural parame ters affected by hydrati on are tho se of the thymine part which is in close proximity to the water molecule (Figs. 2.2 and 2.3). The hydrogen bonding with water results in an increase in the bond length of the N-H and C-0 grou ps, and a decrease in the C-N bond length. With the addition of each water molecule the induced structural changes incre ase, with the larg est changes in bond lengths being 0.01 1, 0.018 and 0.028 A for the monohydrate, dihydrate and trihydra te, respectively. The ionization-induced structural relaxation of thy mine hydrates is much larger than that of iso lated thymine 9 . For example, the largest change in bond lengths of the thymine moiety in clusters is 0.062 A, whereas for iso lated th ymine it is only 0.032 A. A possible 31 explanation is a large shift of water molecules towards the positively charg ed N-H group and away from the negatively charg ed oxygen. This results in a shorter hydrogen bond between the N-H group and oxygen of water (0.245 A for T1), and a longer hydro gen bond between the C=O and the hydro gen of water (0.893 A for T1 ). As discussed below, this large structural relaxation results in vanishing FCFs for the 0�0 tran sition. 2.3.2 Vertical ionization energies The VIEs of all lowest ionized states summarized in Table 2.1 and Fig. 2.8 are affected by water. The first VIE in all clusters is reduced relative to thy mine, and the addition of each water molecule leads to a roughly 0.1 eV drop in VIE (except for Tl l- T1 11); however, the magnitude of the effect vari es for different cluste rs. The changes in higher VIEs also vary for different states and cluster structures from -0.24 to 0.74 eV. For monohydrate s, the largest change in the first VIE is 0.12 eV observed in Tl. 32 9.01 9.05 9 08 9.13 � (0.961) (0.12; 0.960) (0.08; 0.961) (0.05; 0.960) .r' � 9.97 10.13 10.10 10.17 (0.16; 0.951) (0.952) (0.03; 0.951) (-0.04; 0951 � <(" � --- 10.36 10.34 10.52 10.51 (0.16; 0.957) (0.18; 0 956) (0.957) (0.01 ; 0.956) .... 10.87 > � 11 .04 11.10 (0.17; 0.949) 11 03 Q) (0.951) (-0.06; 0.951) (0.01; 0.951) w > � 11 .94 11 .97 (0.45; 0.965) (0.42; 0.946) � 12.39 12.30 � (0.972) (0.09; 0 96: ) • � 12.53 12.67 12.58 12.63 (0.939) (0.14; 0.939) ( -0. 19; 0.940) (-0.24; 0.923) �- 13.55 13.60 13.70 (0.22; 0.802) (0.27; 0.829) � 13.82 (0.12; 0.905) (0.832) ..,.. Thymine Tl T2 T3 Figure 2.8: VIEs (eV) and the corresponding MOs of thymine and thymine-water isomers, EOM-IP-CCSD/cc-pVTZ. Changes in VIEs due to microhydration and the leading EOM-IP amplitude are given in parentheses. 33 U.) -+;.. Table 2.1: Vertical and adiabatic IEs (eV) of thymine and thymine-water clusters computed by EOM-IP-C CSD/cc- pVTZ. State T T- H 2 0 T(H 2 0) 2 T(H 2 0) 3 Tl T2 T3 Tl l T12 Ti ll T1 12 1 9.13 9.01 (0.12) 9.05 (0.08) 9.08 (0.05) 8.89 (0.24) 8.93 (0.20) 8.88 (0.25) 8.83 (0.30) 2 10.13 10.10 (0.03) 10.17 (-0.04) 9.97 (0.16) 10.10 (0.03) 10.15 (-0.02) 10.03 (0. 10) 10.08 (0.05) 3 10.52 10.51 (0. 01) 10.36 (0. 16) 10.34 (0.18) 10.44 (0.08) 10.36 (0.16) 10.48 (0.04) 10.30 (0.22) 4 11 .04 11.1 (-0.06) 10.87 (0. 17) 11 .03 (0.01) 11 .04 (0.00) 10.92 (0.12) 11 .08 (-0.04) 10.88 (0. 14) 5 (H 2 0) 12.39 12.30 (0.09) 11. 94 (0.45) 11 .97 (0.42) 11 .93 (0.74) 11 .93 (0.74) 11. 97 11. 85 6 12.67 12.53 (0. 14) 12.58 (-0.19) 12.63 (-0.24) 12.38 (0.01) 12.21 (0.18) 12 .18 11. 87 7 13.82 13.70 (0. 12) 13.60 (0.22) 13.55 (0.27) 13.50 (0.32) 13.52 (0.60) 12.3 1 12.3 1 1st AlE 8.90 8.51 (0.39) 8.65 (0.25) 8.64 (0.26) 8.52 (0.38) 8.30 (0.60) 8.35 (0.55) 8.21 (0.79) Ma 0.23 0.50 0.40 0.44 0.37 0.63 0.53 0.62 a Relaxation energy for the first ionized state, M=VIE-AIE The first six occupied molecular orbital s of th ymine mono hydrates and the four MOs of di- and trihydra tes are localized on either thy mine or water. The shapes of the se MOs are almost the same as in bare thymine/water. The corresponding ionized states are of Koopmans character, e.g., the leading R 1 amplitude values are greater than 0.94 (Fig. 2.8). However, for the states with IEs close to water IEs, the respective MOs (e.g., the sixth orbitals of monohydrate, fifth orbital of dihydrate) become delocal ized, and the cor responding wave functi ons become multi configurational (there are several R 1 amplit udes with valu es greater than 0.15). 2.3.3 The origin of IE changes We considered several possible explanati ons for the observed changes of the VIE in thy mine cluste rs: the geometry change of thymine molecule in cluste rs, charge transfer from th ymine to water resulting in hole delocalizati on, and electrostatic (charge-dipole) interaction between the ionized thymine moiety and water. Below we evaluate differ ent effects and conclude that the dominant contributi on to the VIE changes is due to electrosta ti cs. To estimate the effect of the change in the geome try, we computed IEs of bare thy mine at the equilibrium geometry of thy mine in the T1 monohydrate. The first VIE at this geometry is 9.16 eV, which is 0.03 eV higher than that of thymine at its own equilibrium geometry (9.13 eV). Therefore the geometry change does not explain the change of VIEs. The degree of charge transfer between the thymine and water was evaluated using the NBO analy sis. We found that the distribution of the positive charge is consistent with the shape of the cor responding Hart ree-Fock orbital s. The hole is localized on thymine and the maximum charge transfer in monohydrates is only 0.027 a.u. Moreover, there 35 0.20 • 0.15 - • - T 1 0.10 -• - T2 > Q) - •- T3 u.i 0.05 • > <] 0.00 -0.05 • 0.008 0.010 0.012 0.014 0.016 0.018 0.020 0.022 0.024 0.026 0.028 L'l.q( Hp ) Figure 2.9: VIEs versus the degree of charge transfer to water in different ionized states of the three thymine monohydrates. 0.15 • 0.10 \ > CL1 0.05 w "- <I 0.00 -0.05 • 0.010 0.012 0.014 0.016 O.D18 0.020 0.022 0.024 0.026 0.028 �q(H20) Figure 2.10: Changes in VIEs versus the degree of charge transfer to water in different ionized states of Tl. 36 is no correlation between charge transfer and the change in the IE (Fig. 2.9); in fact, for Tl there is even the opposite dependence, the larger the charge transfer the smaller the change in the IE, as shown in Fig. 2.10. To calculate the charge-dipole interaction energy, we employed a simple classical model following the analy sis in Ref. 7. The energy was calculated as the sum of interac- tion ener gies between the dipole of water and the partial (NBO) charg es on the thymine atoms using the fol lowing equati on: Jl• qi E= -[-cose . if l l (2.3) where J1 is the dipole moment of water, qi is the charge of the ith thymine atom calculated by the NBO anal ysis, r is the distance between the atom and the center of the dipole, and 0 is the angle between the dipole vector and the vector connecting the center of the dipole and the atom. 0.25 0.20 0.15 0.10 � w� o.o5 <1 0.00 -0.05 • • • T1 • T2 _. T3 -0.1 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 t. Ecd' eV Figure 2.11: Changes in VIEs versus charge-dipole interaction energy between the charges on thymine and the dipole moment of water molecule. 37 The ionization introduces a net positive charge on the th ymine resulting in a new charge dist ribut ion. This leads to the change in electrostatic interaction energy, which is calculated as the diffe rence of the interaction energy in the neutral and the ionized state s. Fig. 2. 11 shows computed IEs for different states and structures versus these charg e-dipole interaction ener gies. There is a strong correlation between the change in the electrostatic interaction energy and the change in VIE. We conclude that the change in VIE is expla ined by a simple charg e-dipole interac tion between the partial charg es on thy mine and the water dipole, which either stabilize or destabilize the ionized states relative to the neutral ground state. 2.3.4 Adiabatic ionization energies and FCFs Due to large geometry relaxati ons of the ionized states (see Figs. 2.4 and 2.5), the AlEs are signi ficantly lower that VIEs. The relaxation ener gies for different structures are in the range of 0.4-0.6 e V, which is consid erably larger than the relaxation energy of bare thymine (0.23 e V) 9 . The structures that undergo greater structural relaxation in the ionized states are characterized by a larger relaxation energy. Compare, for example, Ti l and Tl2 (Fig. 2.5). The AlEs of the clusters are consi derably lower than that of bare thymine and are reduced by about 0.1 e V with addition of each water molecule. To compare the calculated AlEs with the experimental PIE onsets, we calculated FCFs for Tl and T2, as desc ribed in Section 2.2.2. The computed intensities are convo luted by a Ga ussian with width of 0.05 e V which cor responds to the experimental width, and are shown in Fig. 2. 12 and 2.13. As one can see, the relaxation in the water-thymine degrees of freedom affects the FCFs resulting in negligible intensity for the 0+--0 tran sitions and shifting the spectra to higher ener gies. The resulting PIE curves obtained by integration of FCFs are shown in Fig. 2.14. The computed PIE onsets are shifted by 38 ::::l cri ;:3 ·w c (!) c -- water-th mine motion 0.3 1-- -__:_;_:= '---='-' ...t....:..e... .:..:..:...= _:_:_:_ =-:c..:: '-'-'--J 1. 0 0.5 0.0 8.4 8.5 8.6 8.7 8.8 8.9 9.0 9.1 9.2 9.3 9.4 Energy, eV Figure 2.12: Calculated FCFs for the first ionized state of Tl (lower panel). Upper panel: FCFs due to water-thymine motion (undefined scale); middle panel: FCFs due to thymine moiety. 0.03 eV to higher ener gies relative to the 0+---0 transiti ons. Moreover, the intensities at the computed onsets are so small that the apparent onsets are shifted by about 0.1 e V to higher energi es. 39 0.4 0.3 0.2 0.1 ::l ro >. 0.10 ....... "(jj c 2 0.05 c 0.00 0.8 0.6 0.4 0.2 0.0 -- water-th mine motion 8.4 8.5 8.6 8.7 8.8 8.9 9.0 9.1 9.2 9.3 9.4 Energy, eV Figure 2.13: Calculated FCFs factors for the first ionized state of T2 (lower panel). Upper panel: FCFs due to water-thymine motion (undefined scale); middle panel: FCFs due to thymine moiety. 2.3.5 Theory versus experiment Figs. 2.15 and 2.16 show differentiated and raw PIE spectra for T, T(H 2 0), T(H 2 0) 2 , T(H 2 0) 3 as well as the computed VIEs and AlEs for different structu res. The com- puted IEs for different structures are summarized in Table 2. 1. For bare thymi ne, the experimental and computed IEs are in excellent agreement 9 . 40 7000 6500 -- T1 6000 -- T2 5500 -- raw experiment 5000 -- smoothed exp. 4500 :::J 4000 til 3500 :i:- 3000 "iii c 2500 Q) O-c>OT 1 � 2000 1500 1000 500 0 -500 8.4 8.5 8.6 8.7 8.8 8.9 Energy, eV Figure 2.14: The experimental (raw and smoothed data, and error bars) and cal culated (Tl and T2) PIE curves. The respective adiabatic 0+--0 tran sitions are also shown. Our present experimental results from the differentiated PIE for thymine are in agreement with the previous experimental results and calculati ons 9 . The onset is 8.85± 0.05 e V. The small peak at the onset ari sing at 8.80 e V could come from frag mentation of higher cluste rs. The two peaks at 11 .8 eV and 12.2 eV do not have theoret- ical counterparts (Fig. 2.15, T). The peak at 11 .8 eV is an experimental artifact because argon used in the gas filter to remove higher harm onics in the incident radiation has absorption lines at 11 .62 eV and 11 .83 eV. These lines strongly perturb PIE intensities and were removed from the experimental data. For th ymine monohydra tes, the experimental PIE curve onset at 8.60 eV is in per- feet agreement with the calculated AlEs for T2 (8.65 eV) and T3 (8.64 eV), but 0.09 eV higher that the calculated AlE for T1 (8.51 eV) (Fig. 2.16, T-H 2 0). This discrep- ancy is due to the unfavorable FCFs for T1 ionization obscuring the onset, as evidenced by the simulated PIE curves (computed by integrating Franck-Condon prog ressions as 41 "" c 2.0 � 10 00 8.0 0.5 04 :J "' 0.3 ?: tii c 2 0.2 <:: 0 1 0.0 80 T 8.5 9.0 T(Hp) 2 85 90 9.5 10.0 10.5 11. 0 Photon energy, eV 95 10 0 105 11 0 Photon energy, eV 11. 5 12.0 12.5 -exp DPIE - T11 - T12 11 5 120 12 5 1.2 1.0 ::J 0.8 "' :>, � 0.6 c: 2 <:: 0.4 0.2 0.0 i--r>- "T""T'"' "r-T-f'r-r .,.... ...,.,- .,..,.... -++ ....,.+ -++-. .....,- � .,...,_, ,J-,--.f '!--.' 80 85 90 95 100 105 11 0 11. 5 120 125 0 14 012 ::i 0.10 "' � 0.08 c (]) � 0.06 0.04 0 02 Photon energy, eV -exp. DPIE - T112 - T111 0 00 f-,--.- 1\-,. ....,JI- .......,.. ...,. ..,....... /"r-.'L,-t-,--.+ .,...,.,. .,..,. � ........... 4-y-, ,....J 8.0 8.5 9.0 9.5 10.0 10.5 11. 0 11. 5 12.0 12.5 Photon energy, eV Figure 2.15: The differentiated PIE (DPIE) curves and calculated VIEs for thymine and thymine clusters with one, two, and three water molecules. described in Section 2.3.4) shown in Fig. 2.14. We used our best estimate of adiabatic energ ies, i.e., including the ZPE (-0.08 eV) and extended basis (+0.07 eV) correcti ons. The calculated onsets are 8.53 eV and 8.70 eV for T1 and T2. They are shifted from the respective AlEs by 0.03 eV (AlE is 8.50 eV) and 0.06 eV (AlE is 8.64 eV). The apparent onsets are shifted even more (by � 0.1 eV), as the intensity of the calculated PIE for T1 below 8.6 eV is signifi cantly smaller than at higher ener gies which makes it difficult to observe low energy signal in the experi ment (8.60 eV). Overall, the experi mental onset at 8.60 eV agrees very well with the computed PIE curves for T1 and T2 42 T 5 · � 8.85 eV ,;, � 3 ""' c " E' ::j '" � 1.5 ;:;, c " � 10 05 T1 T2 T3 00� � �� ��� ��� �� ��� 80 85 90 95 100 105 110 115 120 125 80 85 90 95 100 105 110 115 120 125 Photon energy. eV 10 ,----------------------------. · = T(Hp) 2 0·8 :::· 8. 55 eV ::i 0.6 "' � � 04 E' 02 .•� T12 ot"ll1C j c <m T11 T12 T11 0.0 1--' o...-t ......... -rr � T""""' �� -rr � T""""' �� -rr ............j 80 85 90 95 100 105 11 0 115 120 125 Photon energy, eV Photon energy. eV 0.30 .------------------------------, am. T112 T111 0.20 - � :J � ttl :am:z � 0.15 c " � 0.10 0.05 n1r T112 ooo H ......... � -rr � ,.......... ...-r � "'T""'" .,......., � --r� T""'" � 8.0 8.5 9.0 9.5 10.0 10.5 11.0 11. 5 12.0 12.5 Photon energy. eV Figure 2.16: The PIE curves and calculated AlEs for thymine and thymine clusters with one, two, and three water molecules. and the AlE of 8.56 eV (Tl). We also note that the experimental PIE rises much faster after the calculated T2 PIE onset (8. 70 e V). The VIEs for thymine monohydrate are in good agreement with the peaks maxima in the experimental differentiated PIE curve s, which cor respond to experimental VIEs (Fig. 2.15, T- H 2 0). From the compari son of the experimental differ entiated PIE and calculated VIEs, we conclude that all three isomers may be present in the beam because for evry calculated VIE there is a peak in diffirentiated PIE that could be assigned to this VIE. 43 For th ymine dihydrate, the experimental onset at 8.55 e Vi s in good agreement with the calculated AlE for Ti l (8.52 eV) [see Fig. 2.16, T(H 2 0) 2 ]. The calculated AlE for the Tl2 structure is 8.30 eV, which is 0.25 eV lower than the experimental onset. A possible explanation is that Tl2 underg oes a larger geometry relaxation upon ionization than Ti l. The VIEs of Ti l and Tl2 are very close (8.89 and 8.93 eV respectively), but Tl2 has a much larger relaxation energy (0.63 eV versus 0.37 eV for Ti l) . This can lead to a small FCF for Tl2, so it is not clear if the se isomers can be seen at the onset. The sharp rise of the experimental PIE curve suggests more favorable FCFs, as in Til. Finally, Tl2 is 3.3 kcal/mol higher in energy than Ti l and is not likely to be populated much in the molecular beam. The theoretical VIEs for both thy mine dihydrate structures agree well with the exper imental differentiated PIE peaks [Fig. 2.15, T(H 2 0) 2 ]. There are a few experimental peaks that do not have theoretical VIE counterparts suggesting that more than the two T(H 2 0) 2 isomers considered here are present in the beam. The theoretical AlE for thymine trihydrate Ti ll is 8.35 eV, which is close to the experimental onset at 8.40 eV [(Fig. 2.16, T(H 2 0) 3 ]. The theoretical AlE for the Tll2 structure is 8.21 e V, which is 0.19 e V lower than the experimental value. Possible expla nations of the difference are unfavorable FCFs and smaller concentration of Tll2 in the molecular beam relative to Ti ll. The theoretical VIEs for both trihydra tes are in good agreement with the experimental PIE curve, which can be seen in Fig. 2.15, T(H 2 0) 3 . As suggested for T(H 2 0) and T(H 2 0) 2 , there could also be minor contributi ons from other structures which are not considered here. 44 2.4 Conclusions We have presented a combined theoretical and experimental study of the ionized clusters of thy mine with one, two and three water molecu les. The PIE onsets are 8.85±0.05 eV, 8.60±0.05 eV, 8.55±0.05 eV, 8.40±0.05 eV for bare thymi ne, thymine monohydrate, dihydrate and trihydra te, respectively. The computed EOM-IP-CCSD AlEs are 8.90 eV, 8.51- 8.64 eV, 8.30-8.52 eV and 8.21-8.35 eV for bare thy mine, th ymine monohydrate s, dihydrates and trihydra tes, respe ctively. All th ymine microhydrates undergo significant structural relaxation upon ioniza tion, which results in higher relaxation energies for microhydrates than for the bare thy mine molecule, e.g., the computed relaxation energies for thy mine, T- H 2 0, T(H 2 0) 2 and T(H 2 0) 3 are 0.23, 0.40-0.50, 0.37-0.63, 0.53-0.62 eV, respectively. We found that due to the large structural relaxation in hydrated species, the FCFs for the 0+---0 transiti ons are small thus shifting the apparent onsets to higher energy (0.03- 0.10 eV). The agreement between theoretical and measured PIEs is good, however, due to unfavorable FCFs the PIE onsets overestimate the AlEs. The theoretical VIEs are in good agreement with the maxima of the differentiated PIE curv es. There is a significant effect of microhydration on the AlE of thy mine. Addition of the first water molecule decreases the AlE as much as 0.39 e V, the second water molecule decreases the AlE as much as 0.35 eV, and the third one - as much as 0.31 e V. VIEs al so are affected by the addition of water. The effect is differ ent for differ ent structures and ionized state s. The change of VIE depends on the shape of the corre sponding MOs and the position of the water and is dominated by electrostatic (charge dipole) interac tions. We found that there is no significant charge transfer from thymine 45 to water, and that the chang es of VIEs due to hydra tion-induced structural chang es in the thy mine moiety are small. 46 2.5 Chapter 2 references [1] E. Cauet, M. Va liev, and J.H. Weare. Vert ical ionization potentials of nucl eobases in a fully solvated DNA environment. J. Phys. Chern. B, 11 4:5886-5894, 2010. [2] P. Slavfcek, B. Winter, M. Faubel, S.E. Bradforth , and P. Jungwirth. Ioniza tion ener gies of aqueous nucleic acids: Photoelectron spectroscopy of pyrimidine nucleosides and ab initio calculati ons. J. Am. Chern. Soc., 131: 6460-6 467, 2009. [3] A.A. Golubeva and A.l. Krylov. The effect of rc-stacking and H-bonding on ion ization energies of a nucleob ase: Uracil dimer cation. Phys. Chern. Chern. Phy s., 11:13 03-1 311, 2009. [4] A.A. Zadorozhnaya and A.l. Krylov. Ionizat ion-induced structural chang es in uracil dimers and their spectroscopic sig natu res. J. Chern. Theory Com put., 6:705- 717, 2010. [5] A.A. Zadorozhnaya and A.l. Krylov. Zooming into pi-stacked manif olds of nude abases: Ionized states of dimethylated uracil dime rs. J. Phys. Chern. A, 11 4 :2001- 200 9, 2010. [6] L. Belau, K.R. Wil son, S.R. Leone, and M. Ahmed. Vacuum-ultrav ioled photoi on ization studies of the microhydration of DNA bases (guanine, cytosine, adenine, and thy mine). J. Phys. Chern. A, 111:7 562-7568, 2007. [7] K.B. Bravaya, 0. Kostko, M. Ahmed, and A.l. Krylov. The effect of rc-stacki ng, h-bonding, and electrostatic interact ions on the ionization ener gies of nucleic acid bases: Adenine-adenine, thy mine-thymi ne and adenine-thymi ne dime rs. Phys. Chern. Chern. Phy s., 12: 2292-23 07, 2010. [8] 0. Kostko, K.B. Bravaya, A.l. Krylov, and M. Ahmed. Ionization of cyto sine monomer and dimer studied by VUV photoionization and electronic structure cal culat ions. Phys. Chern. Chern. Phys., 12:286 0-- 2872, 2010. [9] K.B. Bravaya, 0. Kostko, S. Dolgikh, A. Landau, M. Ahmed, and A.l. Krylov. Electronic structure and spectroscopy of nucleic acid bases: Ionization ener gies, ionizat ion-induced structural chang es, and photoelectron spectra. J. Phys. Chern. A, 11 4:12305-123 17, 2010. [10 ] S.K. Kim, W. Lee, and D.R. Herschbach. Cluster beam chemi stry : Hydration of nucleic acid bases; Ionization potentials of hydrated adenine and thy mine. J. Phys. Chern., 10 0: 7933-7937, 1996. 47 [11] K.-W. Choi , J.-H. Lee, and S.K. Kim. Ionization spectroscopy of a DNA base: Vac uum-ultraviolet mass-analyzed threshold ionization spectroscopy of jet-cooled thymi ne. J. Am. Chern. Soc., 12 7:1 5674 -15675, 2005. [12 ] D.M. Close, C.E. Crespo-Hernandez, L. Gorb, and J. Leszczynski. Influence of microhydration on the ionization energy th resholds of thymi ne: Compari sons of theoretical calculati ons with experimental values. J. Phys. Chern. A, 11 0:7 485- 7490,2006. [13] D.M. Close, C.E. Crespo-Hernandez, L. Gorb, and J. Leszczynski. Theoretical elucidation of conflicting experimental data on vertical ionization potentials of mycrohydrated thymi ne. J. Phys. Chern. Lett. , 11 2:4405--4409, 2008. [14 ] V.R. Smith, E. Samoylova, H.H. Ritze, W. Radloff, and T. Schultz. Excimer states in microhydrated adenine cluste rs. Phys. Chern. Chern. Phys., 12:9632-9636, 2010. [15] A.L. Sobolewski and W. Domcke. Computational studies of aqueou s-phase pho tochemistry and the hydrated electron in finit e-size cluste rs. Phys. Chern. Chern. Phys., 9:3818 -3829, 2007. [16 ] A. L. Sobolewski, W. Domcke, C. Dedonder-Lardeuxc, and C. Jouvet. Excited state hydrogen detachment and hydrog en transfer driven by rep ulsive 1n cr* state s: A new paradi gm for nonradiative decay in aromatic biomolecu les. Phys. Chern. Chern. Phy s., 4:10 93-1 10 0, 2002. [17] A.L. Sobolewski and W. Domcke. Phot oinduced electron and proton transfer in phenol and its clusters with water and ammonia. J. Phys. Chern. A, 10 5: 9275-9283, 2001. [18 ] H.-H. Ritze, H. Lippert, E. Samoylova, V.R. Smith, I.V. Hertel , W. Radloff, and T. Schultz. Relevance of ncr* states in the photoinduced proc esses of adenine, adenine dimer, and adeninewater complexes. J. Chern. Phys., 122:2 24320, 2005. [19] Y. He, C. Wu, and W. Kong. Photop hysics of meth yl-subst ituted uracils and thymi nes and their water complexes in the gas phase. J. Phys. Chern. A, 108:943- 949, 2004. [20] M.K. Shukla and J. Leszczynski. Interaction of water molecules with cyto sine tauto mers: An excited-state quantum chemical investigati on. J. Phys. Chern. A, 106: 11 338 -1 1 346, 2002. [21] A. Yoshikawa and S. Matsika. Excited electronic states and photophysics of uracil and water complexes. Chern. Phys., 347:393-404, 2008. 48 [22] C.M. Mari an, F. Schneider, M. Kleinschmidt, and J. Tatchen. Electro nic excita tion and singlet-t riplet coupling in uracil taut omers and uracil-water complexes - a quantum chemical investigati on. Eur. Phys. 1. D, 20:357-367, 2002. [23] P.A. Pieniazek, S.A. Arnstein, S.E. Bradforth, A.l. Krylov, and C.D. Sherrill. Benchmark full configuration interaction and EOM-IP-CCSD results for proto typical charge transfer syste ms: Noncovalent ionized dime rs. 1. Chern. Phy s., 127: 164110, 2007. [24] A.l. Krylov. Equati on-of-motion coupled-cluster methods for open-shell and elec tronically excited spec ies: The hitchhiker 's guide to Pock space. Annu. Rev. Phys. Chern., 59:433 -462, 2008. [25] J.-D. Chai and M. Head-Gordon. Systematic opti mization of long-range corrected hybri d density functional s. 1. Chem Phys., 128:084106, 2008. [26] J.-D. Chai and M. Head-Gordon. Long-range corrected hybrid density fu nction al s with damped atom-atom dispersion interact ions. Phys. Chern. Chern. Phy s., 10:661 5-6620, 2008. [27] D. Sinha, D. Mukhopadhyay, and D. Mukher jee. A note on the direct calculation of excita tion-energies by quasi-deg enerate MBPT and coupled-cluster theory. Chern. Phys. Lett. , 129:36 9-374, 1986. [28] S. Pal, M. Rittby, R.J. Bartlett, D. Sinha, and D. Mukher jee. Multiref erence coupled-cluster methods using an incomplete model space - application to ionization-potential s and excita tion-energies of formal dehyde. Chern. Phys. Lett. , 137:2 73-278, 1987. [29] J.F. Stanton and J. Gauss. Analytic energy derivatives for ionized states described by the equat ion-of-motion coupled cluster method. 1. Chern. Phys., 101(10 ):8938- 8944, 1994. [30] M. Nooijen and R.J. Bartlett. Equation of motion coupled cluster method for elec tron att achment. 1. Chern. Phys., 102:3629-3 647, 1995. [3 1] P.A. Pieniazek, S.E. Bradforth , and A.l. Krylov. Charge localization and Jahn Tel ler distortions in the benzene dimer cation. 1. Chern. Ph ys., 129:074104, 2008. [32] C.M. Oana and A.l. Krylov. Dyson orbital s for ionization from the ground and electronically excited states within equat ion-of-motion coupled-cluster formali sm: Theory, implementat ion, and examp les. 1. Chern. Ph ys., 127:2 3410 6, 2007. [33] P.A. Pieniazek, A.l. Krylov, and S.E. Bradforth . Electronic structure of the benzene dimer cation. 1. Chern. Phys., 127:0443 17, 2007. 49 [34] P.A. Pieniazek, J. Vande Vondele, P. Jungwirth, A. I. Krylov, and S.E. Bradforth . Electronic structure of the water dimer cation. 1. Phys. Chern. A, 11 2:6159 --6170, 2008. [35] P.A. Pieniazek, E.J. Sundstrom, S.E. Bradforth , and A.l. Krylov. The degree of initial hole localization/delocalization in ionized water cluste rs. 1. Phys. Chern. A, 113: 4423�4429 , 2009. [36] E. Kamarchi k, 0. Kostko, J.M. Bowman, M. Ahmed, and A.l. Krylov. Spectro scopic signa tures of proton transfer dynami cs in the water dimer cation. 1. Chern. Phys., 132:19 4311, 2010. [37] J. Rej nek, M. Hanus, M. Kabela c;;, F. Ryj ek, and P. Hobza. Correlated ab initio study of nucleic acid bases and their taut omers in the gas phase, in a micro hydrated envir onment, and in aqueous soluti on. Part 4. Uracil and thy mine. Phys. Chern. Chern. Phy s., 7:2 006-- 2017, 2005. [38] K. Khistyaev, K.B. Bravaya, E. Kamarchi k, 0. Kostko M. Ahmed, and A.I. Krylov. The effect of micro hydrati on on ionization energies of thy mine. Faraday Discuss., 150:313 �330, 2011. [39] A. Landau, K. Khistyaev, S. Dolgikh, and A.I. Krylov. Frozen natural orbitals for ionized states within equat ion-of-motion coupled-cluster forma lism. 1. Chern. Phys., 132:01410 9, 2010. [ 40] E.D. Glendening, J.K. Badenhoop, A.E. Reed, I.E. Carpenter, J.A. Bohmann, C.M. Morales, and F. Weinhold. NBO 5.0. Theoretical Chemistry Institute, University of Wi sconsin, Madison, WI, 200 1. [41] Y. Shao, L. Fusti-Mo lnar, Y. Jung, J. Kussmann, C. Ochsenfeld, S. Brown, A.T.B. Gilbert, L.V. Slipchenko, S.V . Levchenko, D.P . O' Neill, R.A. Distasio Jr, R.C. Lochan, T. Wang, G.J.O. Beran, N.A. Besley, J.M. Herbert, C.Y. Lin, T. Van Voorh is, S.H. Chien, A. Sodt, R.P. Steele, V.A. Rassolov, P. Maslen, P.P. Koram bath, R.D. Adamson, B. Austin, J. Baker, E.F.C. Bird, H. Daschel, R.J. Doerks en, A. Dreuw, B.D. Dunietz, A.D. Dutoi, T.R. Furlani, S.R. Gwaltney, A. Heyden, S. Hirata, C.-P . Hsu, G.S. Kedziora, R.Z. Khalliulin, P. Klunziger, A.M. Lee, W.Z. Liang , I. Lotan, N. Nair, B. Pete rs, E.l. Proynov, P.A. Pieniazek, Y.M. Rhee, J. Ritchi e, E. Rosta, C.D. Sherrill, A.C. Simmonett, J.E. Subotnik, H.L. Woodcock III, W. Zhang, A.T. Bell, A.K. Chakraborty, D.M. Chipman, F.J. Keil , A. War shel, W.J. Hehre, H.F. Schaefer III, J. Kong, A.l. Krylov, P.M.W. Gill, M. Head-Gordon. Advances in methods and algorithms in a modern quantum chemistry program packag e. Phys. Chern. Chern. Phys., 8:3 172�3191, 2006. 50 [ 42] R. Berger, C. Fisc her, and M. Klessinge r. Calculation of the vibronic fine structure in electronic spectra at higher temperatures 1. Benzene and Pyrazi ne. J. Phys. Chern. A, 102:71 57-7167, 1998. [ 43] V.A. Mozhayskiy and A. I. Krylov. ezSpectrum, http ://io penshell .usc.edu/down loads/. [44] J. Berkowitz. Photoi onization of CH 3 0H, CD 3 0H, and CH 3 0D: Diss ociative ion ization mechani sms and ionic structu res. J. Chern. Phy s., 69:3044-3054, 1978. 51 Chapter 3: The ef fect of microhydration on proton transfer in 1,3-dimethyluracil and 1,3-dimethyluracil dimers 3.1 Introduction Excited-state proton transfer is ubiquitous in chemistry 1 - 3 and biology, occurr ing, for example, in photoactive protei ns such as green fiuorescent 4 and photoactive yellow pro teins 5 . In DNA, excited-state proton transfer between the nucl eobases is a pathway contributi ng to photoprotection 6 • 7 . The driving force for excited-state proton transfer in DNA is the increased acidity of electronically excited nucleob ases. Likewi se, oxidized nucleob ases, in which a valence electron is completely removed, al so exhibit enhanced acidity leading to PT between the strands of DNA 7 and competes with electron hole (positive charg e) migration along the stra nds 8 . Studies of isolated model systems, such as clusters of nucleo bases 9 • 10 , reveal that ionization-induced PT is very facile, even in systems with no h-bonds such as the methylated rc-st acked uracil dimer 10 . Elec tro nic structure calculati ons show that ionization-induced PT between nucl eobases is 52 endothermic in the neutral ground state (e.g., 0.8 eV uphill in AT), while it is exother mic in ionized species by 0.4-0.8 eV. Furthermore, ionization-induced PT in h-bonded pairs is barrierl ess suggesting a high effi ciency for this proce ss. Interestingly, even in It-stacked systems that have no h-bonds, PT is only slightly endothermic and involves a moderate barrier (0.2 eV in methylated uracil dimer) ; consequently, this channel opens up very close to the ionization threshold. A very recent study 11 of one-electron oxida tion of DNA in solution has found that the initial steps involve proton transfer from a methyl group of thymi ne; th us, providing experimental evidence of the facile PT from non-hydrogen bonded moieti es in realistic environmen ts. Water is believed to be inst rumental for PT in biological systems 1 2 , such as in water filled ion channel s, as well as through interface s, membranes and in aero sols 13 . The ability of water to form so-called "water wire s" facilitati ng a relay-type transport of pro tons, the Grotthu ss mechani sm, is essential in all these processes 14 · 15 . The proton coupled electron transfer in DNA also involves water wires 11 . Notwith standing the importance of water- mediated ground- and excited-state PT, these pr ocesses are not well understood, and only a few studies have emerged to shed light on the underlying mech anistic details and dynami cs. For examp le, a recent study of small NQ+(H 2 0)n clusters investigated how the shape of h-bonded network controls proto n-coupled water activa tion in HONO formation in the ionosphere 16 . Sequential PT through water br idges in acid-base react ions has been studied by time-resolved experiments in which the reaction has been initiated by an optical trigger exciting the photoacid 17 • Resonant ionization spectroscopy of gas-phase h-bonded clusters was employed to investigate proton versus hydrog en transfer pathways 18 . Va rious experimental technique s, most notably ion based infrared spectrosco py, have been used to quantify important energet ics and dynami cs of ionization-induced PT, and the catalytic action of solvating waters on tautomerization 53 equilibria via PT 19 - 22 . However a detailed understanding of the reaction pathways and directionality of PT in model systems, let alone in real biological and chemical systems remains elusive. In this work, we report that water has a dramatic effect on the PT in ionized species. We focus on methylated uracil cluste rs, capita lizing on our previous experi ence with this model system and absence of low energy tautomers 10 • 23 ; however, similar effects were also observed in microhydrated thy mine species. We consider 1,3-d imethyluracil (mU, structure shown in Fig. 3.4) and its deuterated analog, d6-1,3- dimethyluracil (DmU). Mass spectrometry coupled with tunable VUV radiation molecular beam experi ments show that microhydration changes the branching ratio between different relaxation chan nels and entirely shuts down PT between the bases. Instead, a new pathway opens up, where protonated nucleo bases are produced via PT from the ionized water molecule and elimination of the hydroxyl radical. Electronic structure calculat ions reveal that the shape of the potential energy profile al ong the PT coordinate depends strongly on the character of the molecular orbital from which the electron is removed, i.e., the PT from water to nucleo bases becomes barrier less upon access of an ionized state localized on water. The computed energeti cs of PT is in excellent agreement with the experimental appearance energ ies. We also note that the possible adiabatic proc esses, which become energetically accessible at lower energ ies, are not effi cient. Th us, PT is controlled elec tronically, by the character of the ionized state, rather than statistically, by simple energy conside rati ons. All experimental data presented in this chapter were obtained by Dr. Musahid Ahmed group from Lawrence Berkeley National Laboratory. Ananl ysis of the raw expe rimantal data was done jointly by me and Dr. Musahid Ahmed. 54 3.2 Experimental details To reflectron and detector Source chamber TOF MS chamber E 3 ··· · ······ ··· ·· · ·· ··········· ·-- E2 E1 Skimmer Effusive Source Supersonic Molecular Beam Figure 3.1 : Schematic of experimental apparatus. The experiments were performed on a molecular beam apparatus 2 4 coupled to a 3 meter VUV monochromator on the Chemical Dynamics Beamline at the Advanced Light Source by our collaborators from Musahid Ahmed group from Lawrence Berke ley National Laboratory. The experimental apparatus is described in Fig. 3. 1. To gen erate monomers and dimers in a supersonic jet expansion, a small sample is placed in a thermal vaporization source (3/8" tube with 100 ,urn orifice) and heated to generate suf ficient vapor pressure. Argon gas at 600 Torr was bubbled through water (D 2 0) and then passed over the sample vapors (1,3-mU from Sigm a-Aldrich, at 11 2 °C) before expanding to vacuum through a 100 ,urn orifice and a 2 mm skimmer (1 em downstream) to produce a molecu lar beam at the interaction region of a reflectron mass spe ctrometer (R. M. Jordan) where it is ionized by the VUV light. As the synchrotron light is quasi-c ontinuous (500 MHz), 55 pulsing the electrical fields of the ion optics forms the ion packet in the mass spectrom- eter. Measurements are taken at photon energies between 8 eV and 12 eV with a step size of 50 me V. The ion signals from the microchannel plate detector are collected with a multichannel-scalar card (FAST Comtec 7889) and accumulated over multiple (e.g. 300, 000) sweeps at each point of the scan 2 4 . We repeated the experiments with H20 vapors and d6- 1,3-mU, the latter is unavailable commercially, and therefore it was syn- thesized according to the procedure developed by Davidson and Baudisch (Davidson and Baudisch 19 26) and we described in detail in our previous paper 10 . Increasing solvation (backing P increase) T= 100 C T= 60C Increasing stacking (nozzle T increases) T= 65C T= 50 C D 100 200 YXl 400 500 0 50 100 150 200 250 300 350 4<XJ 450 5CO 0 50 100 150 200 250 300 350 4C0 450 500 0 5(l 100 150 200 250 300 350 400 450 500 �00 � � 1� � ����� � I 0 nso 500 50C/ 140 I 200 1so 200 M -200 1 140 -200 1 1500 000 OO'J 5{)J M, 1 1 0 75 3000 2500 """ 1500 1000 5() 0 1500 1000 500 2000 j 11 I. ,: .11 I. ,:.11 j, ll II [ M , 500 200> 500 I 3001 1 1 L 0 I�. 1 .. J.lllJ. J. . lJ J � 1 .l.l ... L ,_, � 1 5() 000 500 2000 5() :JIIJ I I J t.l [J_ ll l..llil I 500 1 1 - 1 ·- 0 o :JI 1.1 l.tl l __ [ l_______§gQ_ , l.l i llil 1 0 : ./ il ll ill. i l -� j, 500 200>f-'t'-""'--'-L.....W..���--� 3001 1 500 1 � 1 500 000 500 -· -- - '. 1 ':["+ u.w..l oJ...Wl. ......... ��� 500 1 1 l_______§gQ_ 1 1 100 200 :KXJ 400 :;:()0 o 50 100 150 200 2'W m 350 400 450 500 MIZ MIZ MIZ 500 MIZ Figure 3.2: Mass spectra of hydrated (with H20) mU and its dimer using 12 eV photons with different backing pressure and temperature. A judicious combination of experimental source conditions (backing pressure and reservoir heater temperature) allowed us to vary the population of dimethyl uracil monomers, dimers, and their microhydrated clusters with up to 7 water molecules in the molecular beam (Fig. 3.2). The origin of transferred proton is determined by using various combinations of deuterated and non-deuterated species, such as DmU-H20 ver- 56 35 ?F. vl 30 ""ffi c Ol . iii 25 Ti � -= 20 ""ffi E g 15 c � Q; 10 [L L .-- --- All hydrated forms of mU dimer ---.- (mU)2 + ____.____ All forms of mU dimer 1 00 200 300 400 500 600 700 BOO Pressure (Torr) Figure 3.3: The percentage of different dimer fo rms relati ve to the all fo rms of mU present in the beam. The percentage is calculated as the ratio between the signal of the indicated form and the signal of all fo rms of mU present in the beam. Fig. 3.3 shows the percentage of different dimer forms of mU relative to all forms of mU present in the beam. The percentage of all (mU) 2 is roughly constant beyond 250 Torr, with the value above 25%. However, the percentage of bare mU dimers sig nificantly decreases, while the percentage of the hydrated mU dimers incr eases. Th us, increased pre ssure does not inhibit the formati on of the mU dimer, rather, it increas es its hydrati on. Higher mU clusters (tr imers, etc) are not present in the beam in any sig- nificant amount according to the low signal of the respective m/z. The structures of the representative isomers of mU and its dimer hydrated with one or two water molecul es are shown in Fig. 3.4. Because of methylati on, only urac il's oxygens, O(mU) are ava ilable for h-bond ing. Cons equently, in mono-hydrated struc ture s, water acts as a proton donor. The H(H 2 0) ... O(mU) bond length are 1.8 7 A and 1.80 A in mU-H 2 0 and (mU) 2 -H 2 0, respectively. The second water molecule forms 57 1. 80 • ' �· . . 2 .1 6 • ·, ..__�_,._.'-"� 1. 87 0>� c :::J :.s2 CJ) (.) CJ) ro a.> ..0 !.... 0.. T T L...--------,> Figure 3.4: Structures of 1,3- dimethyluracil and its dimer hydrated with one or two water molecules. In all structures, water acts as a proton donor . Hydration of the dimer does not lead to considerable changes in the relative position of the two mU moieties, e.g., the distances between C=O and C-CH3 groups in dry and hydrated (mUh clusters are around 3.3-3.5 A. Temperature increase results in higher con centration of mU clusters, whereas backing pressure controls degree of hydration. an h-bond with the first water molecule in mU-(H20h. In hydrated (mUh, the second water forms an h-bond with the second uracil ring. 58 Previously 10 , we demonstrated that PT between the bases in mU dimers occurs from a methyl group. Thus, the following PT react ions are possible in ionized (mU) 2 CD 2 0) cluste rs: (mU)2(D20)+-+ (mU)2D+ + OD -+ mUD++ mU + OD (3 .1) (mU)2(D20)+-+ (mU)2H+ + D20 -+ mUH+ + mU + D20 (3.2) Likewi se, in (mU)n(D 2 0)m clu sters, the appearance of protonated species is due to the PT between the bases, whereas the deuteron transfer will signify PT from the solvent to uracil. By considering the rati os of the respective rn/z peaks, one can quantify the efficiency of these competing PT channel s. Fig. 3.5A shows a VUV single photon ionization mass spectrum of the molecular beam with ion signals cor responding to the mU monomer (at rnlz 14 0), dimer (at rn/z 280), and their clusters with D 2 0. The inset in Fig. 3.5A shows an enlarged portion of the spectrum around rnlz 18 0 where the main feature cor responds to the mU(D 2 0) 2 ion, a cluster of one mU and two deuterated wate rs. The two ad jacent smaller peaks (at rn/z N+1 and N+2, marked by solid and dashed arrow s, respectively, where N=180) ari se either due to the natural iso tope abundance ( 13 C) or from protonated/deuterated species. As discussed below, similar spectra were obtai ned for the DmU-H 2 0 mixture. The iso topes account for 7.5 % and 1 % of the peaks at rn/z 181 and rnlz 18 2, respectively ; sim ilar values are obtai ned for the other hydrated species. Contrary to the PT yield, the nat ural isotope contributi ons do not depend on the photon energ ies, th us, the energy depen dence of the ratio between N+ 1 and N+2 peaks to the parent peak [N=180 for mU(D 2 0) 2 or N=182 for DmU(H 2 0) 2 l allows us to distinguish between proton/deuterium tran s fer versus natural isotopes. As illustrated in Fig. 3.5B, the (N+1 )/N ratio (solid line) is constant in the mU-D 2 0 beam, whereas the (N+2)/N (dashed line) exhibits a clear 59 14k 24k A I 1.0k \ 500.0 \ 1 6k ''"LL 0.0 176 180 184 800 0 lt t. llt,,__ 0 0 r-- 140 o.2 B ---- (N+2) /N - (N+1 )/N 280 m/z 04 c 0 420 - (N+1 )/N ---- (N+2) /N 0 1il 0.1 (t � (t 0.2 0.0 00 - - ---·---· -- ---- - · · 95 100 105 110 115 9.5 10.0 10.5 11.0 11.5 12.0 Photon energy (eV) Photon energy (eV) 0 -•- mU-D,o PT 0.08 -•- mU-D p PT(norrnalized) E 2 -•- mU-mU PT 0.4 0.06 0 � 0.2 0.04 11:: 0 ; - n=5 - n=4 - n=3 - n=2 -n=1 200 400 600 8oo 0 · 02 9.5 10.0 10.5 11.0 11.5 Pressure (torr) Photon energy (eV) Figure 3.5: Mass spectrum of hydrated mU and the dependence of the yield of var ious protonated species on photon energy and backing pressure. A. Mass spectrum of hydrated (with D20) mU and its dimer using 12 eV photons. The inset shows the region at mass to charge (m/z) 180 corresponding to [mU(D2 0)z]+. The dashed lines indicate two additional peaks at m/z 181 and 182 arising due to natural isotope abundance (13C) and due to protonated and deuterated species. The intensity ratios between the peaks marked by the arrows at different photon energy for mU(D20)z are shown in panel B. The constant behavior of the m/z 181 peak confirms that it arises from isotopic con tributi ons and is not due to PT. Panel C shows similar ratios (for N+1 and N+2 mlz peaks) for N=182 corresponding to [DmU(H 20)z]+. In this case, the N+2 peak is con stant, revealing that there is no deuteron transfer between the bases. Panel D: The effect of backing pressure (Ar gas) on PT. The black curve (mU-020 PT) character izes deuteron transfer from D20 to uracil; the red curve [mU-020 PT (normalized)] - deuteron transfer from D20 to uracil divided by the sum of mU and (mU)z hydrates, '[ [(mU)n(DzO) mD]+f L L [(mU)n(DzO) m H kDz]+. The blue curve (mU-mU PT) cor- n ,m n ,mfO k + l=O,l responds to PT between the mU molecules, '[ [mU(DzO)nH]+t [[(mU)z (DzO)m]+. E: The n m appearance energies of deuterated species [mU(DzO)nD]+ for different cluster sizes n. 60 onset (followed by a sharp rise) at about 10.8 eV. This demonstrates that there is no PT between the bases; rather, there is a deuterium transfer from the solvating D 2 0 to the mU dimer. To confirm that the proton/deuterium transfer can only occur from the solvent, we repeated the exper iment with DmU and non-de uterated water (H 2 0). Fig. 3.5C shows (N+1)/N and (N+2)/N ratios (solid and dashed lines, respectively) for N=182, which corre sponds to the DmU(H 2 0) 2 cation. Here we observe that the N+1 peak (proton transfer) exhibits a thre shold behavior (at 10.8 eV, as in the mU-D 2 0 exper iments, Fig. 3.5B), while (N+2)/N remains constant. Th us, there is no deuterion transfer between the DmU species. Note, that the kinetic isotope effect on the inter -base PT was found to be minor for the stacked mU dimer 10 , and, therefore, the constant behavior of the (N+2)/N peak in the DmU-H 2 0 is not due to HID exchange in the base. Essentially water shuts down PT between the mU bases, which opens up at 8.9 eV in the absence of water 10 . In Fig. 3.5A, there is evidence of mU dimers in the molecular beam. Furth ermore, we can control the degree of dimerization relative to solvation by varyi ng the backing pressure of the carrier gas (Ar), as illustrated in Fig. 3.2. Fig. 3.5D shows the effect of backing pre ssure on the relative effi ciency of PT in the mU- �0 beam. The increase of backing pre ssure increas es the yield of hydrated species, at the expense of bare mU dimer and monomer. However the total amount of all forms of the mU dimers (bare dimer plus all hydrated dimers) remains roughly the same (Fig. 3.3) .The yield of interf ragment PT is given by the signal of all protonated species (dominated by mUH+, more than 85% ). When normalized to the total dimer populati on, the yield of protonated species decreases with backing pre ssure (Fig. 3.5D mU-mU PT blue curve ). This suggests that the interf ragment PT is supp ressed by hydration of dimers rather than reducing the population of dimers via monomer hydration (hence 61 reducing the number of molecules avai lable for cluster ing). The yield of all deuterated forms of mU increas es upon hydrati on, as shown by the ratio of all deuterated forms to all forms of mU present in the beam (black line). Finally, upon normalization to the population of the hydrated species, we observe a constant ratio of all deuterated forms to all hydrated forms of mU (mU-D 2 0 (normalized) red line around 0.25), hence confirming that the increased yield of PT is proporti onal to the degree of hydrati on. This suggests that the rate of PT in the hydrated clusters (and, possibly, its mecha nism) does not depend on degree of hydrati on. 3.2.1 Proton transfer in thymine-water clusters Fig. 3.6 shows the yield of protonated th ymine (TH+) and th ymine- water (T(H 2 0)H+) clusters as a function of ionization energy. We observe that the onsets for TH+ and T(H 2 0)H+ are the same, which is similar to the mU(H 2 0)nH+ appearance ener gies that do not depend on the cluster size. Note that in the absence of water, PT in thymine occurs at 9.20 eV, with a ma jor rise in signal between 9.7 and 9.9 eV 9 . Thy mine pro vides an interesting compari son since it was demonstrated previously 9 , hydrog en bonded and n:-stacked dimers populate the molecular beam in contrast to dimethylura cil, where only n:-st acked dimers exist. The calculati ons suggest that it is hydrog en bonded thymine dimers which give rise to this signal, while the lower onset is expla ined by a dimer with n:-st acked geome try. Upon solvation, PT switches off at these lower ener gies, as is evidenced in the signal for TH+ and T(H 2 0)H+ shown in Fig. 3.6. Using a similar analy sis as performed for mU, a constant ratio of 0.073 is due to isotopes while the onset of PT begins around 10.6 e V, with a ma jor rise at 11 .2 e V. Both the onset and the shapes of the appearance curves are very similar to those for mU shown in Fig. 2 of the main manuscript, suggesting that a similar PT mecha nism from 62 0.25 - TH+ 0.20 - T(H20)H+ 0 ....... 0. 15 co 0::: 0. 10 0.05 +-----�,.!..,.-,.....--�-.---,,.,-����"T""""""T"��....-��---, 9.5 10.0 10.5 11.0 11.5 12.0 12 .5 Photon energy ( eV) Figure 3.6: Yi eld of protonated thymine (TH) and thymine-water (T(H 2 0)H+) clusters. the solvent is occurr ing. This mea ns, whether one has hydro gen bonded or rc-st acked nucleobase dimers and higher cluste rs, upon sol vati on, PT which was quite effective from cluster fr agmentation shuts down completely from the base itself. It is only when the solvent is ionized that PT begins again. 3.3 Computational details To understand the mechanism by which water shuts down PT between the bases, we turn to electronic structure calculati ons. Electronic structure calculati ons were performed following the computational pro tocols developed and validated in our previous studies of ionized species 9 • 10 • 23 • 25 - 2 8 . All calculati ons were performed using Q-Chem 2 9 and employed meth ods ranging from EOM-CC to DFT with range-s eparated fu nctionals and dispersion correction (roB 97X-D). 63 For accurate desc ription of the ionized states we employed EOM-IP-CCSD in which problematic target open-shell wave functions are derived by Koopmans-like operators acting on well-behaved closed-shell reference states 30 . EOM-IP-CCSD simultaneously includes dynamical and non-dynamical correlati on, describes multiple electronic states in one calculati on, and treats states with different numbers of electrons on the same footi ng. It is free from arti ficial symmetry breaking and spin-con taminat ion. All neutral ground- state structures and ionized protonated structures were optimi zed using DFT with the roB 97X-D functional and the 6- 311 + G(d,p) basis set. Ionization energies were calculated using EOM- IP-CCSD with the 6-3 11 +G( d,p) basis set. Ionized excited states geometries were optimi zed at the EOM-IP-CCS D/6-31+G(d,p) level of theory. Mono- and di-hydrated structures were obtained by placing the water at positions that are most favorable for h-bonding and opti mizing these structu res. Previ ous stud ies 9 • 2 8 · 31 and chemical intuition point out that in the most stable microhydrates water forms a hydro gen bond with either C=O or N-H group of a nuclear base. mU has two C=O groups and there are two possible water positions for every group. This gives rise to four monohydrated structures shown in Fig. 3.7. Dihydrated structures were obtained by adding a water molecule to monohydrated structu res, with a subsequent optimization (Fig. 3.8). 3.4 Computational results Previous theoretical studies of microhydrated nucl eobases 2 8 reported a small red-shift ( rv 0.4 e V) in the lowest IE, in excellent agreement with experi ments 2 0 · 2 8 . The calcula tions revealed that the character of the ionized state remains the same as in the isolated base (7t cc orbital ); the red-sh ift was expla ined by the fact that in the lowest-ene rgy 64 mUW1-1a 9.17 mUW1-2a 7.59 mUW1-1b 7.58 mUW1-2b 6.12 Figure 3.7: Monohydrated structures. Distances in angstroms and binding ener gies in kcal/mol are given for each structure. The lowest-energy isomer is mUWl la. mUWl-lb, mUW1-2a, and mUW1-2b are 1.59, 1.58 and 3.06 kcal/mol higher, respectively. microhydrated structures, the nucleobase is acting as a proton donor. Using similar computational protocols described in the Computational Details section, we conducted electronic structure calculations of micro-hyd rated mU dimers . We observe that the hydration by one or two water molecules does not change the relative distance between the two mU moieties (see Fig. 3.4), e.g., the distance between C(=O) and C(-CH3) moieties, which are involved in inter-base PT, in the mU dimer is 3.4 A, whereas in (mU)2(H20) and (mU)2(H20h it varies between 3.3-3.5 A. 65 mU W2-1a 20.93 mU W2-2a 14.72 mUW2-1b 18.25 mUW 2-2b 14.21 Figure 3.8: Dihydrated structures. Distances in angstroms and binding energies in kcallmol are given for each structure. The lowest-energy isomer is mUW2-la. The effect on the lowest ionized state is small, both in terms of energy and the character of the state (Table 3.1 ). We observe a moderate blue shift ( ""'0.1-0.3 e V) in the VIE, which is consistent with the structures of hydrated species (see Fig. 3.4) where uracil acts as a proton acceptor. The character of the lowest ionized state is also unaff ected, as evidenced by the wave function composition and the shapes of the respective MOs (Fig. 3.9). 66 mUWl Figure 3.9: MOs corresponding to the lowest ionized state in mU·H20, (mUh, and (mUh·H20. Thus, neither structure nor energetics of the lowest IE explains the observed behav- ior. However, we note that water blocks the proton-accepting sites in mU; it may also add structural rigidity to the system. The analysis of higher ionized states (see Table S1) reveals that while hydration has relatively small effect on the lowest ionized states of mU, the ionized states localized on water are affec ted much more strongly by the interaction with m U. Specifically, the state corresponding to ionization from a lone pair in water appears at 10.9-11.5 eV in mU mono- and dihydra tes, which is 1-1.5 eV lower compared to the bare water monomer. The lower bound of the energy range is remark- ably close to the observed onset of PT in micro hydrated clusters (1 0.8 e V). These results suggest that the PT channel opens up when the lowest ionized state on solvated water which corresponds to an excited ionized state of the mU(H 2 0)n cluster becomes acces- sible. These results are consistent with the experimentally observed onsets of PT, which are independent of the cluster size (see Fig. 3.5E), in stark contrast to the lowest IE of microhydrated nucleobases which show notable dependence on the number of hydrated waters ( rv 0.1 e V drop in IE per water molecule ) 20· 24· 28 . To gain further insight into the electronic structure of hydrated species and to validate theory, we focus on the PIE curve (obtained by integrating the area under the respective 67 2A' (1 0.8 eV) 60 2A" (9 9 eV) J 50 �-J 40 "' "E • :J 30 0 u !:: E: 20 8.6 eV (AI�:\ 10 \ 0 8.0 8.5 90 95 10.0 10.5 11.0 11.5 12.0 Photon energy (eV) Figure 3.10: Photoionization efficiency curve (black) of [(mU+D2 0)]+ and its derivative (red), observed using 8 eV to 12 eV photons. The derivative plot reveals multiple ionized states derived by removing the electron from different MOs. Black arrows points towards the calculated ionization energies. m/z peak) of the smallest hydrated cation, mU(D20). The differentiation of the PIE curve allow identification of multiple ionized states (the peaks on the diff erentiated PIE curve correspond to the VIEs). The PIE (black) and the differentiated (red) curves are shown in Fig. ?? , along with the computed VIEs. The curve features the ionization onset at "' 8.6 eV and a series of peaks between 8.5 and 11 .5 eV. The com puted AlE is in excellent agreement with the experiment al onset, whereas the computed VIEs match well the peaks of the differentiated curve. Thus, the peaks at 8.9, 9.9, 10.0, 10.8 and 11. 2 eV correspond to vertical ionizations from the 1A", 2A", 1A1, 2A1, and 3A" states, respectively. The character of these states are illustrated by the respective molecular orbitals which are also shown in Fig. ??. As Fig. ?? clearly illustrates, low-lying 68 electronic ionized states cor respond to the ionization from mU, whereas the 3A" state at 11. 2 e V is localized on water, and is similar to the water -localized states observed in microhydrated dime rs. 69 3.4.1 Proton transfer in monohydrate To understand proton transfer in microhydrated mU, we consider the lowest energy monohydrated mU (tl b) as a model structure. The first and fifth ionized states (which correspond to the first ionizations of the mU and water moieties, respectively) were optimized using EOM-IP-CCSD/6-31+G(d,p). The resulting geometries are presented in Fig. 3.1 1. In the first ionized state, in which the hole is localized on mU, the elec- trostatic interactions push water away from the C=O group (the H-0 distance increases by 1.13 A); this displacement acts against PT. In contrast, in the fifth ionized state (the hole is localized on water), electrostatic forces pull water closer to mU resulting into a barierless PT and gives optimized proton-transferred structure of the cation. Figure 3.11: Structural changes in ionized mUWl-la. Left: optimized neutral state. Top right: Franck-Condon optimized structure of the 1st (lA") adiabatic state of the cation. Bottom right: optimization of the sth (3A") adiabatic state of the cation gives optimized proton-transferred structure of the cation. 70 -.J >-' Table 3.1: Vertical and adiabatic ionization energies (eV) ofmU, (mU)2, water, and various hydrated species. All energies are computed by EOM-IP-C CSD/6-311+G(d,p) except for (mUh and (mU)2·H20 which are computed with 6-31+G(d,p) asis set. State H20 mU (mU) 2 mU·H 2 0 mU·(H 2 0) 2 mU Wl-la mU Wl-lb mUW1-2a mU Wl-2 b mUW2- la mUW2- lb mUW2-2b 1 12.22 8.87 8.40 8.93 9.01 9.03 9.08 8.89 9.00 9.18 2 14.42 9.74 8.81 9.93 9.91 9.95 9.94 9.95 9.87 10.11 3 18.93 9.77 9.42 10.00 10.02 9.91 9.92 10.03 10.03 10.19 4 10.66 9.66 10.77 10.77 10.91 10.90 10.74 10.74 11 .03 5 12.16 9.69 11.21 11.14 11.41 11.24 11.24 10.92 11.19 6 9.85 12.19 12.26 12.16 12.22 11.61 11.63 11.51 7 10.46 13.23 13.12 13. 33 13.23 12.14 12.24 12.3 1 8 10.51 13.63 13.67 13. 73 13.68 13.01 12.73 13.22 9 11.67 13.76 13.85 13.87 13.86 13.59 13.65 13.44 10 11. 88 14.20 14.14 14.25 14.27 13.66 13 .65 13.91 11 14.27 14.21 14.16 14.13 14.11 13.87 14.03 12 14.59 14.6 14.52 14.54 14.17 14.05 14.31 1st AlE 8.59 8.59 8.56 8.61 mU2Wl-la 9.24 10.08 10.20 11 .05 11 .14 11.19 12.37 13.10 13.27 13.93 13.94 14.25 To further understand PT in solvated systems, we analyze the potential energy pro files along the PT coordinate in [mU(H 2 0)]+. In order to analyze potential energy pro files along the PT coordinate we generated ten structures using the following expre ssion for the Car tesian coordinates 32 : Xn = Xinitial + (X final -Xinitial) . n I 10 (3.3) The initial and final structures are optimized structures of the neutral mU·H 2 0 and proton-transferred system, mUH+ ·OH. Th us, n can be interpreted as a step al ong the PT coordinate. The profiles are shown in Fig. 3.12 . The energy of the ionized states that are local ized on mU (1A" and 2A", see Fig. ??) increases al ong the PT coordinate (the PT is also endothermic in the neutral state). In contrast, the energy of the fifth state (3A") cor responding to the ionization of water decreases showing that PT from this state is a barrier less downhill process. A similar behavior is observed for other state s, in that energetically, water- ionized states go down, whereas uracil-localized states go up. Alter natively, one can consider a possibility of adiabatic PT, e.g., on the lowest ionized state (1A11). This will of course involve chang es in the electronic state character from the mU-localized one to the water- ionized one and a barrier. The analy sis of energy pro files shows that PT is energetically accessible at rv 10.6 e V (upper bound), i.e., at this energy the system has enough energy to overcome the barrier on the adiabatic PES cor responding to the lowest ionized state of the system. Yet, the onset of PT yield occurs only at 10.8 eV thus suggesting that such an adiabatic pro cess is inefficient. This can be readily rati onalized by analyzing the respective electro nic wave funct ions. The low est ionized state cor responds to the ionization of mU, which reduces the proton affinity of mU. Hence the short-ti me dynami cs will involve structural changes which are not 72 ---------- neutral ---+--- 1 A" 12 --+- 2A" -*- 3A" 11 10 2.19 eV 9 2� { r · -- � � � -r 8 . - 9 3_ e_ V __ -r--����--�--��--� 2.61 eV ---- ---- �- � Figure 3.12: Potential energy prof iles for low-lying states of [mU·H20]+ along the PT reaction coordinate. The proton is moved from the water molecule to the mU oxygen site. The 5th ionized state, 3A", in which the hole is on the water molecule (see Fig. ??), shows no barrier facilitating downhill PT. PT from lower ionized states are possible, however this involves changes in the electronic wave function character and requires more than 10.6 eV photon energy. The left panel shows the experimental ratio between the [mU(D20h]+ signal (rnlz 180) and [mU(D20hD]+ the deuterated species at rnlz 182; it shows dramatic enhancement in PT when the 3A" state is accessed. favorable for PT from water. Indeed, in the Franck-Condon opti mized structures of the lowest electro nic state of [mU·H20]+ (Fig. 3.11) the distance between O(mU) and water hydrog en increases from 1.8 7 A to 2.90 A. In contrast, PT is barrier less starting from the Franck-Condon point in the fifth ionized state. Th us, even though the system may have enough energy to overcome the barrier on the lowest ionized state adiabatic PES, this pathway is not favored dynamically because the gradients in the Franck-Condon region point away from the PT coordinate. In contrast, when the right electronic state is 73 accessed, the PT may occur ballistically on the respective diabatic surface. We observe that PT in hydrated mU species is controlled electronically, by the character of the state, rather than statistically, by energy considerations al one. !It 2.19 mUH + *OH VIE = 11.21 (3A") • � ______________ -rm _u_ H _ •*- OH - ____ 1 0.60 � � -+r- -------------- *0.09 VIE = 8.93 (1A") Figure 3.13: Energy diagram describing relevant ionized states and their ordering at different geometries along PT coordinate. All energies are given in eV and are calculated with EOM-IP-C CSD/6-311+G**· Relevant energy diffe rences in mU Wl-la are summarized in Fig. 3.13. VIEs for the 1st and 5th ionized states are 8.93 and 11 .21 eV, respe ctively. The energy difference between the final PT structure (ful ly relaxed mUH+ ·OH in the lowest ionized state) and initial ionized state is 2.18 eV (50.4 kcal/ mol). This can be described as energy gain due to PT on the diabatic surface cor responding to a water ionized state. The PT struc- ture is 0.09 e V (2. 13 kcal/mol) above the lowest ionized state at the Franck-Condon geom etry. Th us, for the lowest ionized state, the PT is an uphill pr ocess adiabatically. 74 Finally, energy required for PT structure to dissociate producing mUH+ and OH is 0.6 eV (13.76 kcal/mol ). Th us, adiabatically, the energy difference for the following pro cess: [mU·H 2 0]+ --+ mUH+ + OH is 0.69 eV or 15.9 kcal/mol (for the lowest ionized state). Combining this value with VIE we arrive at 9.62 eV - this would be mUH+ appearance energy on the lowest ionized state assuming no barrier. However, as illu s trated by Fig. 5 of the main manuscript, we anticipate a barrier on the lowest adiabatic PES. The experimental onset for mUH+ is much higher ( rv 11 eV) and agrees well with VIE cor responding to the 5th ionized state of mU·H 2 0. 3.4.2 Proton transfer in dihydrated species The appearance energies of mU(H 2 0)nH+ does not depend on the cluster size, as illu s trated in Fig. 3.5E, and to understand thi s, we consider a dihydrated system mUW2-1a (see Fig. 3.8). In this structure, one (inner) water forms a hydro gen bond with the C=O group of mU and the second (outer) water acts as proton donor for ming an h-bond with the first water (and a weaker h-bond with the C-H group of mU). We consider three possible PT structures that can be formed upon ionization. The first structure with the H 3 0+ moiety (Fig. 3. 14A) can be obtai ned by PT from the outer water molecule to the inner water molecule. The second structure (Fig. 3.14C) that has protonated mU and an inner OH radical can be derived by direct PT from the inner ionized water to mU. The third structure with protonated mU and an outer OH radical (Fig. 3. 14E) can be desc ribed as a result of Grotthu ss-like double PT from ionized outer water to the inner water, and from the inner water - to mU. The first structure can be described as an intermediate for the double PT. To gain insight into possible PT pathways in dihydrate s, we prepared such struc tures by manually displacing the parent structure and optimi zed them by using protocols 75 Ini tial A � Si n g � " ( Transfer c 0 n g le Transfer E � � oub le Transfe r Optimized - � ·90 1.55 ' ""-."' � ?'__ � Double Transfer B D F Figure 3.14: Possible proton-transferred structures in [mU(H20h]+. Left panels show manually distorted structures used as staring points for optimization. Right panels show the final optim ized structures of the ionized species. Distances are in Angstroms. described above. The initial structures are shown in the left panel of Fig. 3.14, whereas the resulting optimized structures are shown on the right. The lowest ionization energy of water in this dihydrate correspond to the state with a hole localized on the outer water. Thus, we expect this state to relax via PT to either the first or the third structu res. However, the opti mization of the structure shown in Fig. 3.14A converged to the double PT structure (see Fig. 3.14B) indicating that there is no 76 local minimum cor responding to the first structure (with H 3 0+). Thu s, ionization of water in a dihydrate leads to the double PT structure via a Grott huss-like pathway. The energet ics of this proce ss is in agreement with a higher proton affinity of mU compared with water. However, it is interesting that there is apparently no barrier along the double PT reaction coordinate (the opti mization of the state with the hole localized on the outer water converg es to doubly proton-transferred state). We also note that energy diffe rences between the structures cor responding to the single and double PT (Fig. 3. 14B and Fig. 3.14D, respectively) is 22.7 kcal/mol (computed by roB97X-D). This is in agreement with known facts that in clusters more stable structures are the ones with OH radical on the outside. These energy diffe rences can be ra tionalized by counting the number of h-bonds in the two structu res. The energy difference between 5th ionized state at the initial Franck -Condon geometry and lowest ionized state at double PT stricture is 53.4 kcal/mol. In large water cluste rs, the lowest ionized states correspond to the surface state s, where there are waters that serve only as proton donors 33 . Th us, the IEs cor responding to the surface states should be relatively independent of the cluster size (and even the chemical nature of its core). The experimental onsets for protonated mU(H 2 0)n clus ters are remarkably insen sitive to the cluster size (Fig. 3.5E); this suggest that in larger clusters the surf ace-ionized states lead to multiple barrier less PT yielding solvated pro tonated uracil with the OH radical on the surface. Th us, such clusters of water with molecules with relatively high proton affinity could serve as model systems for studying directionality in Grotthu ss-like PT through water wires and membrane interf aces 34 · 35 . While most of the experi ments in this work focu sed on mU, this nucleo base is by no means unique in that water has a significant effect on PT. Similar experiments per formed on thy mine show that in the absence of water, PT begins at 9.20 e V, with a 77 major rise in signal between 9.7 and 9.9 eV 9 . Thymine provi des an interesting com pari son, because both h-bonded and n-st acked dimers populate the molecular beam 9 , in contrast to mU which forms only n-stacked dimers 10 . The calculati ons suggested that it is h-bonded thy mine dimers which give rise to this signal at 9.7 eV, while the lower onset was expla ined by a dimer with n-st acked geometry 9 . Upon solvation, PT switches off at these lower energi es, as is evidenced in the signal for TH+ and T(H 2 0)H+ shown in Fig. 3.6. The onset for PT is around 10.6 eV, with a maj or rise at 11 .2 eV, which is very similar to the onsets observed in mU. The shapes of the curves for protonated thy mine species are also very similar to those in Fig. 3.5B and C. This suggests that a similar PT mecha nism from the solvent is occurr ing. 3.5 Conclusions We conclude that both in h-bonded and n-st acked nucleobase dimers and larger cluste rs, solvation shuts down PT between the bases, which is rather efficient in "dry" cluste rs. It is only when the solvent is ionized that PT begins again. Our findings illustrate that water has a dramatic effect on PT pathways, not only by serving as a wire for proton transport, but also by shutting down other PT route s. In our model systems, an outer most ionized water molecule acts as an acid (activated by an ionization event), and the nucleobase - as a base, whereas other waters may parti cipate in PT either as spectators or as intermediate proton acceptors, as shown recently in photo induced acid-base reac tions 17 • We explain the remarkable similarity between the appearance onsets in solvated mU and thy mine, as well as insen sitivity of the onsets and shapes of the appearance curves on the cluster sizes to the fact that the lowest ionized states in which the hole is localized on the solvent cor respond to the surface state s, i.e., water molecules acting as proton donors only. Electro nic structure calculati ons show that these IEs are rather 78 insen sitive to the size and/or chemical identity of the cluster core (mU versus mU dimer versus thy mine). Th us, these states become accessible at very similar energies initiat ing facile PT to the accepting base, either directly (in monohydrates ), or through the mediating water molecu les. While our study focuses on PT in a simple model system, one can anticipate that some of these mecha nisms may be operational in solution or in biological environm ents. A growing body of studies illustrating the central role of PT has led to a paradigm shift in the discussion of water and its active role in biology and chemistry. Water is no longer seen as just a solvent, but is an active parti cipant in a variety of pro cesses such as enzyme cataly sis and membrane transport. Water has also been shown to catalyze reactions 36 which are important in biology and at mospheric chemistry 16 • 37 . Proton coupled electron transfer in DNA is mediated by water chains 11 . Autoionization in water also drives a variety of pr ocesses which are critical to life and biology 1 2 , while PT through nano pores, arti ficial membranes and structures have maj or rami ficati ons in energy conver sion and storage technologi es. PT in nano confined geometries 38 have implications for cataly sis and solar energy conve rsion, while ions have been shown to enhance the transfer of protons through aqueous interf aces 39 . In this work we have shown that PT can be very effectively control led by subtly changing how DNA bases hydrog en bond and stack within them selves and upon solvation and thus can provide a template for novel dynamical studies in the temporal, spatial, and spectroscopic domain. 79 3.6 Chapter 3 references [1] A. Rosspeintner, B. Lang, and E. Va uthey. Ultrafast photochemistry in liqu ids. Annu. Rev. Phys. Chern. , 64:247,2013. [2] L.M. Tolbert and K.M. Solntsev. Excited-state proton transfer: From constrained systems to "super" photoaci ds to superfast proton transfer. Ace. Chern. Res., 35 :19 - 27, 2002. [3] S.J. Formosinho and L.G. Arnaut. Excited-state proton transfer reactions ii. intramolecular react ions. J. Photochem. Phot obiol., 75: 21-48, 1992. [4] J.J. van Thor. Photoreacti ons and dynami cs of the green ftuorecent protei n. Chern. Soc. Rev., 38:2 935-2950, 2009. [5] E.C. Carrol l, S.-H. Song, M. Kumauchi, I.H.M. van Stokkum, A. Jailaubekov, W.D. Hoff, and D.S. Larsen. Subpicosecond excited-state proton transfer preced ing isomerization during the photorecovery of photoactive yellow prot ein. J. Phys. Chern. Lett. , 1 :2793-2799, 2010. [6] T. Schultz, E. Samoylova, W. Radloff, LV. Hertel , A.L. Sobolewski, and W. Dom cke. Efficient deactivation of a model base pair via excited-state hydro gen transfer. Science, 306: 1765-1768, 2004. [7] A.K. Ghosh and G.B. Schuster. Role of the guanine N1 imino proton in the migration and reaction of radical cat ions in DNA oligome rs. J. Am. Chern. Soc., 128:417 2- 4173, 2006. [8] T. Charkaborty, editor. Charge Migration in DNA. Springer, 2007. [9] K.B. Bravaya, 0. Kostko, M. Ahmed, and A.I. Krylov. The effect of It-stacki ng, h-bonding, and electrostatic interact ions on the ionization ener gies of nucleic acid bases: Adenine-adenine, thy mine-thymi ne and adenine-thymi ne dime rs. Phys. Chern. Chern. Phy s., 12: 2292-23 07, 2010. [10 ] A. Golan, K.B. Bravaya, R. Kudirka, S.R. Leone, A.I. Krylov, and M. Ahmed. Ionization of dimethyluracil dimers leads to facile proton transfer in the absence of H-bonds. Nature Chern., 4:323-329, 2012. [11] R.N. Barnet, J. Joseph, U. Landman, and G.B. Schuster. Oxidative thymine muta tion in DNA: Wat er- wire-mediated proto n-coupled electron transfer. J. Am. Chern. Soc., 135:3 904- 3914, 2013. 80 [12 ] P. Ball. The importance of water. In I.W. M. Smith, C.S. Cockell , and S. Leach, edi to rs, Astrochemistry and Astrobiology, Physical Chemistry in Action, pages 169- 210. Springer Berlin Heidelberg , 2013. [13] G.A. Voth. Computer simulation of proton solvation and transport in aqueous and biomolecular systems. Ace. Chern. Res., 39:143-150, 2006. [14 ] N. Agmon. The Grott huss mechani sm. Chern. Phys. Lett. , 244:45 6-462, 1995. [15] D. Chandler, C. Dellago, and P. Geissler. Wired-up water. Nat. Chern. , 4: 245-247, 2012. [16 ] R.A. Relph, T. L. Guasco, B.M. Elliott, M.Z. Kamrath, A.B. McCoy, R.P. Steele, D.P. Schofield, K.D. Jordan, A.A. Vi ggiano, E.E. Fer guson, and M.A. Johnson. How the shape of an h-bonded network control s proto n-coupled water activation in HONO formati on. Science, 327:30 8-3 12, 2010. [17] O.F. Mohammed, D. Pines, J. Dreyer, E. Pines, and E.T.J. Nibber ing. Sequential proton transfer through water br idges in acid-base react ions. Science, 310:83-86, 2005. [18 ] C. Manca, C. Tanner, and S. Leutwyler. Excited state hydrogen atom transfer in ammonia-wire and water-wire cluste rs. Int. Rev. Phys. Chern. , 24:457-488, 2005. [19] L. F. Sukhodub. Interacti ons and hydrati on of nucleic acid bases in a vacuum. experimental study. Chern. Rev., 87:5 89-606, 1987. [20] S.K. Kim, W. Lee, and D.R. Herschbach. Cluster beam chemi stry : Hydration of nucleic acid bases; Ionization potentials of hydrated adenine and thy mine. J. Phys. Chern., 10 0: 7933-7937, 1996. [21] N.J. Kim, H.M. Kim, and S.K. Kim. Ionizat ion-induced proton transfer in th ymineammonia van der Waal s cluste rs. Int. J. Mass. Spectr., 261:32, 2007. [22] K. Mizuse, J.-L. Kuo, and A. Fujii. Structural trends of ionized water networ ks: Infrared spectroscopy of water cluster radical cations (H 2 0)n+ (n=3- 11 ). Chern. Science, 2: 868-876, 2011. [23] A.A. Zadorozhnaya and A.I. Krylov. Zooming into pi-stacked manif olds of nucle obases: Ionized states of dimethylated uracil dime rs. J. Phys. Chern. A, 11 4 :2001- 2009, 2010. [24] L. Belau, K.R. Wil son, S.R. Leone, and M. Ahmed. Vacuum-ultrav ioled photoi on ization studies of the microhydration of DNA bases (guanine, cytosine, adenine, and thy mine). J. Phys. Chern. A, 111:7 562-7568, 2007. 81 [25] A.A. Golubeva and A.l. Krylov. The effect of rc-stacking and H-bonding on ion ization energies of a nucleob ase: Uracil dimer cation. Phys. Chem. Chem. Phy s., 11:13 03-1 311, 2009. [26] 0. Kostko, K.B. Bravaya, A.l. Krylov, and M. Ahmed. Ionization of cyto sine monomer and dimer studied by VUV photoionization and electronic structure cal culat ions. Phys. Chem. Chem. Phys., 12:2860-2872, 2010. [27] K.B. Bravaya, 0. Kostko, S. Dolgikh, A. Landau, M. Ahmed, and A.l. Krylov. Electronic structure and spectroscopy of nucleic acid bases: Ionization ener gies, ionizat ion-induced structural chang es, and photoelectron spectra. J. Phys. Chem. A, 11 4:12305-123 17, 2010. [28] K. Khistyaev, K.B. Bravaya, E. Kamarchi k, 0. Kostko M. Ahmed, and A.I. Krylov. The effect of micro hydrati on on ionization energies of thy mine. Faraday Discuss., 150:313 -330, 2011. [29] Y. Shao, L. Fusti-Mo lnar, Y. Jung, J. Kussmann, C. Ochsenfeld, S. Brown, A.T.B. Gilbert, L.V. Slipchenko, S.V . Levchenko, D.P . O' Neill, R.A. Distasio Jr, R.C. Lochan, T. Wang, G.J.O. Beran, N.A. Besley, J.M. Herbert, C.Y. Lin, T. Van Voorh is, S.H. Chien, A. Sodt, R.P. Steele, V.A. Rassolov, P. Maslen, P.P. Koram bath, R.D. Adamson, B. Austin, J. Baker, E.F.C. Bird, H. Daschel, R.J. Doerks en, A. Dreuw, B.D. Dunietz, A.D. Dutoi, T.R. Furlani, S.R. Gwaltney, A. Heyden, S. Hirata, C.-P . Hsu, G.S. Kedziora, R.Z. Khalliulin, P. Klunziger, A.M. Lee, W.Z. Liang , I. Lotan, N. Nair, B. Pete rs, E.I. Proynov, P.A. Pieniazek, Y.M. Rhee, J. Ritchi e, E. Rosta, C.D. Sherrill, A.C. Simmonett, J.E. Subotnik, H.L. Woodcock III, W. Zhang, A.T. Bell, A.K. Chakraborty, D.M. Chipman, F.J. Keil , A. War shel, W.J. Hehre, H.F. Schaefer III, J. Kong, A.l. Krylov, P.M.W. Gill, M. Head-Gordon. Advances in methods and algorithms in a modern quantum chemistry program packag e. Phys. Chem. Chem. Phys., 8:3 172- 3191, 2006. [30] A.l. Krylov. Equati on-of-motion coupled-cluster methods for open-shell and elec tronically excited spec ies: The hitchhiker 's guide to Pock space. Annu. Rev. Phys. Chem., 59:433 -462, 2008. [3 1] D. Ghosh, 0. Isa yev, L. V. Slipchenko, and A. I. Krylov. The effect of solvation on vertical ionization energy of thymi ne: From microhydration to bulk. J. Phys. Chem. A, 115:6028-6 038, 2011. [32] P.A. Pieniazek, S.A. Arnstein, S.E. Bradforth, A.l. Krylov, and C.D. Sherrill. Benchmark full configuration interaction and EOM-IP-CCSD results for proto typical charge transfer syste ms: Noncovalent ionized dime rs. J. Chem. Phy s., 127: 164110, 2007. 82 [33] P.A. Pieniazek, E.J. Sundstrom, S.E. Bradforth , and A.l. Krylov. The degree of initial hole localization/delocalization in ionized water cluste rs. 1. Phys. Chern. A, 113:4423-4429 , 2009. [34] A. Springer, V. Hagen, D.A. Cherepanov, Y.N. Antonenko, and P. Pohl. Pro tons migrate along interfacial water without significant contributi ons from jumps between ionizable groups on the membrane surfac e. Proc. Nat. Acad. Sci., 108(35):14461-14466, 2011. [35] 0.-H. Kwon and O.F. Mohammed. Water-wire catalysis in photoinduced aci d-base react ions. Phys. Chern. Chern. Phys., 14:8974 - 8980, 2012. [36] Y. Matsuda, A. Yamada, K. i. Hanaue, N. Mikami, and A. Fujii. Catalytic action of a single water molecule in a proton-migration reaction. Angew. Chern. Int. Ed. Engl. , 49(29):4898 -490 1, 2010. [37] R.J. Buszek, J.R Barker, and J.S. Fra ncisco. Water effect on the OH + HCl reacti on. 1. Phys. Chern. A, 11 6(19):4 712 -47 19, 2012. [38] J. Kofinger, G. Hummer, and C. Dellago. Single-file water in nanopore s. Phys. Chern. Chern. Phy s., 13: 15403-1 5417, 2011. [39] H. Mishra, S. Enami , R.J. Nielsen, M.R. Hoffmann, W.A Goddard, and A.J. Colussi. Anions dramatically enhance proton transfer through aqueous interfa ces. Proc. Nat. Acad. Sci., 10 9(26):10228-10232, 2012. 83 Chapter 4: The implementation of the coupled-cluster family of methods on graphics processing unit 4.1 Introduction From early days of computational chemistry quantum chemistry calculati ons have been limited by the avai lable computational resource s. Quantum chemistry started from calculati ons of very simple diatomic molecules with simple and inaccurate methods. Nowadays quantum chemistry tools can easily be used to compute systems which con sist of more than 20 heavy at oms, such as DNA bases dimers, with accurate corre lated methods. However, there are many important molecular systems, such as pho toactive protei ns, porphyrin and chlor in-based molecules (e. g. heme and chlorophy ll), carotenoi ds, molecules in organic solar cells, solvated molecules and oth ers, for which the calculati ons with desirable accuracy are still practically inaccessible. Other impor tant systems that require large computational resources are DNA bases clusters and DNA nucleotides that are model systems for interact ions in DNA. There are metho ds that allow splitting complicated systems apart and applying different levels of theory to sep arate parts to achieve reasonable computational costs and accuracy. These are QM/MM, embedd ing, and frag ment- based approaches. Other approaches neglect some high-level 84 interactions between distant parts of the system. For example, local correlation methods eliminate the steep scaling of traditi onal CC methods for mol ecules with well localized electronic structures 1 . However, such methods perform less well for genuinely del ocal- ized systems, systems with strong nontrivial interactions between fragments, or for basis sets augmented with diffuse functions. Thus we can see that there are numerous com putational projects in quantum chemistry which require large computational resources. o to• � ;, & 10 ' "0 c: 8 ' c% 11)" "' ·� 1 3 " 8 1LI·� Moore's Law The Fifth Paradigm • • • • • • • •• • • • • • • • • •• • • ... .. • ., . Logarithmic Plot . � . •• • • .. • • • • • • 10 ... -+---.----.----.----..----..----.--.--.----.--------. Fle.�trm• ;p�Jumicnl .((o!loy \4Jcmml TH[.·r rmmidoJr ltrlliJJrtl irr! Cim·.i; WOO 19JO 1920 WJU 1940 1.950 1\160 1970 1.980 1990 2WO Year Figure 4.1: Calculations per second per $1000, logarithmic plot. During the last century there was a dramatic decrease in the price of computations (Fig. 4.1 2 ). From the beginning of microprocessors era this decrease was mostly due to the increasing computational performance of a single microprocessor. From the 197ot h till the beginning of 21st century microprocessors (and central processors units (CPU) in 85 particular) performance gains were in many respects achieved by increasing micropro cessor clock fr equency : from 2 MHz in Intel 8080 to 3.8 GHz in Pentium 4 (Fig. 4.2 3 ). Higher CPU frequency means more operati ons per second. Thus the increase of the CPU frequency allows faster execution of the same code without any modificati on. For many years theoretical che mists were able to compute properti es of bigger and bigger systems using basically the same algorithms due to increasing CPU frequency. How ever, around 2004 CPU frequency speedup has come to an end with Pentium 4 reaching a frequency of 3.8 GHz. At this point CPUs have hit the "power-wal l": higher CPU frequency requires more power consumption and cause more power dissi pation that can lead to a CPU overheating and requires special cooling systems. This made further increase of the CPU frequency commercially impracti cal. Further performance gains were achieved through the development of multiple cores on a single die CPU. Starting from the production of dua l-cores Pentium D and Athlon X2 in 2005 manufactu res have increased the number of cores in one CPU up to 16 pro cessors (e.g. AMD Opteron 6200 "Interl agos"). Multipl e-core CPUs are able to execute more than one inst ruction stream simultane ously ("in parallel"). However, a code designed for sequential execution with a one-core processo r cannot be executed in parallel on multi ple-core s, unless the compiler auto matically parallelizes this code. Despite decades of work by compiler researc hers, auto matic parallelization has had only limited success 4 . Thus multipl e-cores CPUs require an existing code to be rewritten with explicit paralleli sm in the most time-consuming part s. Parallelism has been employed for many years in high-pe rformance computing that uses supercomputers and computer cluste rs. One of the main prob lems of such sys tems is slow communication between the nodes. Another disadvantage is a very high 86 rvol11tion of Tntd Platforms Floating point pe.ak performance [Hflop /s] CPU frequency [M.U] 10.000 1.000 100 . . . . .. .. ... .. .. era of parallelism . � ............ .... - I / O.:o n : Z DY- '- ' ' -- Pc.-rtiurn 4 Co10 l Quad fn::e speedup ....._, i r.g la pracicicr -.-..J vu llt: 1-'�i:. i .. u ...,._C:Ptlti'PI'JI!C'IfTY 1·------------------------------------------- l993 :995 1S9 7 199S 2001 3>05 Year Figure 4.2: The evolution of computing platform's peak performance and CPU frequency demonstrates CPU frequency standstill around 2004. price of supercomputers and powerful computer clusters which limits its availability to theoretical chemists. Another type of systems that had employed parallelism for a relatively long time is graphical processing unit. GPUs are co-processors that have been heavily optimized for computer graphics processing. In a computer graphics processing numerous uniform operations are applied to large 2D and 3D arrays of data, which correspond to pixels. To handle such computations the single instruction, rrultiple data (SIMD) approach was utilized. SIMD devices have rrultiple processing elements that perform the same oper ation (e.g. rrultiplication) on rrultiple data units simultaneously. Modern GPUs have a higjlly parallel structure with up to 2880 (NVIDIA GK110 architecture) arithmetic logic units (ALUs) or cores able to perform simple math operations. The nature of many sci entific comput ational problems allows efficient use of high parallelism and power of 87 GPU s. For example matrix operati ons can be calculated using GPU with a significant speedup compared to the CPU. The technique of using a GPU to perform computation in applications traditionally handled by CPU is called General-purpo se computing on GPU (GPGPU). GPGPU is a fairly recent trend and for a long time was constrained by limited programmability of GPU and supported data type s. First attempts to imple ment GPGPU had to use the standard grap hics application programming interface (API) and high-level shading langua ges such as Direc tX, OpenGL and Cg, which made the developing of GPGPU programs a challenging task. Release of the Compute Unified Device Architecture (CUDA) developed by NVIDIA in 2007 and later the Khronos Group 's Open Computing Language (OpenCL ), Microsof t's DirectCompute, OpenACC and other GPGPU oriented langua ges have made the development of the GPGPU pro grams much easier and reali stic. However, GPUs made prior to 2010 either didn't support double pre cision data types or double preci sion opera tions were very slow because they have very limited number of ALUs capable of perfor ming double pre cision operat ions. Until recently, the lack of double pre cision support has made GPGPU inapplicable to quantum chemistry probl ems due to large computational errors 5-8. The production of NVIDIA Tesla C2050/C2070 GPUs with a new Fermi architecture with up to 512 stream processors in 2010 changed that. Each stream pr ocessor on this chip is able to perform one double-preci sion floating point operation per every other cycle. The latest release in 2012 of the NVIDIA Te sla K20x GPU with 2688 ALUs made it very promi sing to use GPUs of this type in quantum chemistry calcula tions. To benefit from the implementation on GPU a computational chemistry method must be able to utilize the high paralleli sm of the GPU architecture. The example of such methods is coupled-cluster fam ily of methods. Such meth ods as CCSD, CCSD( T) 88 and many others are widely used for high accuracy quantum chemical computati ons. Some of the most accurate calculat ions for small to medium sized molecules use these metho ds 9 . One of the main disadvantages of the se methods is that they have very high computational costs and a scaling of N 6 for CCSD and N 7 for CCSD(T) where N is the system size. The most computationally expensive operati ons in these meth ods are tensor contract ions, which can be reduced to matrix operat ions. It is known that matrix operation can be very efficiently para llelized on SIMD devices as they mostly include the same arithmetic operation on the large set of data. It was shown that matrix multipl i cation operati ons can be executed on GPU with high efficiency 10 , 11 . For example, the latest NVIDIA Tesla GPU (K20x) reaches impre ssive 1.2 2 Teraflop double-preci sion performance in DGEMM matrix- multi plication 1 2 . In addition, the performance of the CPU s incre ases much faster that the performance of CPU s (Fig. 4.3 11 ). Th us, it is very promi sing to implement CC methods with a GPGPU technique. In additi on, it is important to keep in mind that there are several competing massively parallel architect ures, such as NVIDIA Te sla and AMD FirePro GPU s and Intel X eon Phi Copr ocessor. Modern CPUs with 16 cores and integrated GPUs could potentially also be used as massively parallel pr ocessors. Thus the implementation model should be flexible in order to quickly adapt to different technol ogies. This chapter expla ins our approach to the implementation of the CC family of meth ods on GPU using CUDA C including working implementation of the CCSD(T) metho d. Our implementation is part of the ccman2 and libtensor librari es of Q-Chem electronic structure package 13 . It is based on a modular and layered architecture which allows to adapt the library to different underlying technolo gies. 89 The or eti cal GFL OP/ s 3250 3000 2750 2500 2250 2000 1750 1500 1250 1000 Gef'or<:eG TX UO • - NVIOI A G PU Si ngle Preci sion I - -NVIOIA GP U Double Precision I - --I ntel CP U Single Pr ecision .....,.. I ntel CP U O ruble Precision I I I �on: e GTlt : i lll J GeforceGTX4 1 Y GefcrceGTX111l / / 750 500 250 \t GeFo n: eUIIOGTX ./ T esla C2 050 Sand y Bridge ...._ 0 Gel'ort e 7100 GTX / � / I"'' [e m_ Q1Jl.6. Q. . · . !t Q r«t t } Dl!W _., woexx: r es t . Rl mft� l rl � ......... Sep - 1ff ntium 4 Jun-04 ar e rt own _ vv estmere 1 />ar-o fi P Dec- 0 9 Aug-12 Figure 4.3: Theoretical peak performance for the CPU and GPU, floating-point operations per second. 4.2 GPU architecture This chapter gives a brief description of GPU architectures by the example ofthe newest NVIDIA Kepler GK110 architecture. ATI GPUs architecture has the same basics con- cepts but organization of processors and memory structure are different. Most scientific application for GPU and all known quantum chemistry programs used NVIDIA GPUs, mostly because of the NVIDIA CUDA C language support. GPUs, initially designed for graphics rendering, represent a single instruction, mul- tiple data (SIMD) class of devices. They have multiple processing elements that perform the same operation on multiple data simultaneously. GPU is specialized for compute intensive, highly parallel computations and therefore designed such that more transistors 90 - - I - CPU GPU Figure 4.4: Sketchy �presentation of CPU and GPU ardllU!ctures. are devoted to data processing rather than data c..:hing and flow control as in CPU. This is illustrated in Fig. 4.411 , Flgure 4.5: NVIDIA Kepler GK110 archltecture. 2880 CUDA cores organized In 15 SMXs are shown. 91 Kepler GKl lO based GPUs consist of up to 2880 stream processo rs called cores (Fig. 4.5 14 ). Each core has a fully pipelined arithmetic logic unit (ALU) and float ing point unit (FPU) and able to perform one fused mult iply-add (FMA) inst ruction for single precision arithmetic (with IEEE 754-2008 floating-point standard) per cycle. Cores are organized in 15 groups of 192, called streami ng multipro cessor (SMX) (Fig. 4.6 14 ). Besides 192 single-preci sion cores each SMX has 64 doubl e-preci sion units and 32 load/store units allowing source and destination addresses to be calculated for 32 threads per clock and supporting units load and store the data at each address to cache or DRAM. Each SM has 32 special function units (SFUs) able to execute one fast ap proximate tran scendental inst ruction such as sin, cosine, reciprocal, and square root per thread, per clock. Each SMX has 64 KB of on-chip memory that can be configured as either 48 KB of Shared memory with 16 KB of L1 cache, 32 KB Shared with 32 KB L1 cache or as 16 KB of shared memory with 48 KB of L 1 cache. Shared memory is availabl e for all SMX cores and can be used for cooperation between the core s. SMX has a register with a capacity of 65536 by 32 bits shared by all thread s. Besides 15 SMXs Kepler based GPU can have up to 6G b of local DRAM memory and 1536 KB L2 cache (shared by all SMXs). Kepler architecture introduc es new Read-only cache that previ ously (in Fermi architecture) could be used only by the Texture unit s. Kepler memory hi erarchy is represented in Fig. 4. 7 14 GPUs are connected to the northbri dge chip (or directly to the CPU in latest models) through the PCI Expre ss bus. Northbri dge is connected to the CPU and CPU DRAM. The bandwidth between the CPU and northbri dge and CPU DRAM on modern systems is typically higher than the bandwidth of the PCI Expre ss bus. Moreover, most mod ern CPUs have integrated PCI expre ss and RAM contro llers. Thus the communication 92 Figure 4.6: NVIDIA Kepler GKllO streaming multip rocessor (SMX) architecture. 192 singleprecision CUDA cores, 64 doubleprecision units, 32 special function units (SFU), and 32 load/store units (LD/ST). between CPU and GPU and memory transfer between CPU DRAM and GPU DRAM is limited by the PCI Express bandwidth. Latest GPUs use PCI Express 3.0 with a 16 Gb/s bandwidth. 93 Thread DRAM Figure 4. 7: Kepler Memory Hierarchy. 4.3 GPU programming languages by the example of CUDA C CUDA (an acronym for Compute Unified Device Architecture) is a general purpose par allel computing architecture developed by NVIDIA that leverages the parallel compute engine in NVIDIA GPUs to solve many complex computational problems in a more efficient way than on a CPU. Programmers can use C language with NVIDIA exten sions (C for CUDA or CUDA C) to develop programs executable on the GPU. CUDA architecture also allows using other programming languages, such as CUDA Fortran, OpenCL, DirectCompute and OpenACC. CUDA, OpenCL, and DirectCompute share similar paradigm with special kernel functions written for execution on GPU while Ope nACC uses special directives to parallelize certain parts of the code similar to OpenMP API. 94 4.3.1 Thread organization CUDA prov ides a scalable programming model through three key abstract ions: a hierar chy of thread grou ps, shared memories, and barrier synchronization. CUDA C extends C by allowing the programmer to define C func tions, called kern els, that, when called, are executed N times in parallel by an array of N th reads. Threads form blocks which can be executed independently in parallel (Fig. 4.8 11 ). The maximum number of threads per block is 10 24 11 . Blocks are executed independently in a random order and on a random streaming multiproc essor (SM) concurrently or sequen tially. This allows thread blocks to be scheduled in any order acro ss any number of cores and provide the scalability of the code. Thus no assumption about data integrity in SM shared memory between different block executions can be made. On the contrary, threads within one block are guaranteed to be executed on the same SM. Thus threads within a block can cooperate by sharing data through the shared memory and by synchronizing their execution. Blocks form a grid that performs the kernel execution. The max- Imum number of blocks per kernel is 2 32 -1. The number of CUDA blocks in a grid and the number of threads per block are specified using <<< NumberOfBlocks,NumberOfTHreads >>> execution configuration syntax at the kernel invocat ion. For example, MatMult < < < 3, 32 > > > will execute the kernel with the name MatMult by three blocks of 32 threads each. 4.3.2 Memory organization There are three six memory scopes in CUDA: register, local, shared, global, constant and texture. Register memory is the fastest form of memory on the multip rocessor and is only accessible by the thread. However, it has very limited size and when the data does not fit into register part of the variabl es will be placed in the local memory. Local 95 Ho st Device Kernel � +- .... 1 Gr id 1 Bl ock (0, 0) Bl ock ( 1 , 0) , ' ' Bloc )' Bl ock \ ((),�) ( 1 , 1) ' , I 1 \ ,, ,' : \ ,'Gr id 2 ,' I \ , ' I I \ Kernel -�'� I I \ I I \ 2 .' U\ Figure 4.8: Grid structure. Thread blocks form a grid that correspond to one kernel. memory is located in global memory DRAM, which makes it very slow compared to the register. Local memory is a private memory of the thread. Shared memory is an on chip memory and is almost as fast as a register one. Shared memory is accessible by all threads of the block and has the same lifetime as the block. Global memory resides in GPU DRAM and accessible by all threads of all blocks and kernels. Global memory is potentially 100-150 times slower than register or global memory ; however, on new GPU s (Fermi and Kepler architectures) it is cashed. Constant memory is a read-only GPU DRAM memory, which can be written only from the host (CPU) . Constant mem- ory cashed on all OPUs. Texture memory is also resides in the GPU DRAM memory and is used mostly by graphical applications . 96 Large latency and size difference between different typ es of memory makes the appropriate memory management very important. Inefficient memory use is one of the main bottl enecks of the GP GPU program s. 4.4 Review of existing quantum chemistry methods accelerated with G PU. GPU s are emerging as a very promi sing architecture for computational chemistry and for quantum chemistry in particular. There were several examples of implementation of quantum chemical methods on GPU, such as quantum Monte Carlo 15 • 16 , density func tional theory (DFT) 5 • 7 , Hart ree-Fock self- consistent field (HF SCF) 6 - 8 • 17 , second-order Moller-P lesset (MP2) 18 • 19 and coupled-cluster (CCD, CCSD, CCSD( T)) 2 0 • 2 1 quantum chemistry methods. These implementat ions are bri efly discussed below. One of the first quantum chemical methods implemented on GPU was Quantum Monte Carlo method 15 . The nature of this method is well suited for parallelizati on, which al lowed effi cient implementation of this method on GPU with reported speedup factors up to 350 on an NVIDIA Te sla C1060 GPU compared to a 2.66 GHz Intel Xeon E5430 CPU 16 . The implementation of the DFT and HF SCF metho ds on GPU started from the two electron integral evaluati on al gorith ms 5 • 6 . The speedups of more than 130x in the single preci sion were achieved using an NVIDIA GeForce 8800 GTX GPU in compari son to an Opteron 175 CPU 6 . Later the complete SCF procedure was implemented using CUB LAS library for matrix operati ons 7 . In this implementation specially designed 97 memory management was used which al lowed to preload required data from CPU mem ory to GPU memory and then further to SM shared memory to overcome memory band width and latency limitat ions. This code demonstrated the speedup of up to 650x in the single preci sion for molecules with as many as 453 at oms (2131 basis functi ons) on an NVIDIA GeForce 280GTX compared to an Intel Pentium D 3 GHz. In the same work 7 the 2.0-2.8x speedups over a single GPU was obtai ned on a 3-GPU (280 GTX) system. Further development of the analytical energy gradients calculati ons on GPU made possible to run geometry optimization and first pri nciples molecular dynamics on GPU with a 180x speedup for such systems as a hydronium ion solvated by 30 waters (94 atoms, 405 basis functi ons) and an aspartic acid molecule solvated by 147 waters (457 atoms, 2014 basis funct ions) 8 . The implementation of the direct SCF calcula tion with Hart ree-Fock method on a high-performance computer cluster with GPUs has demonstrated impre ssive 2,452x speedup for Olestra molecule ( 453 at oms, 2131 basis funct ions) when using eight cluster nodes with two NVIDIA Tesla C1 060 GPU s each 17 . Hybrid parallel model with CUDA for GPU and message passing interface (MPI) and POSIX threads for mult i-node multi -GPU configuration was used in this implementa tion. However, despite such impre ssive speedups there was a severe disadvantage in all these calcula tions. GPU s used in the desc ribed calculati ons were either not able to perform 64-bit doubl e-preci sion (DP) operati ons at all or such operati ons were very inef ficient. The absolute energy error generated by 32-bit single-precision (SP) arithmetic as compared to DP calculati ons can be as big as 36 millihartree (20 kcal/mol) which is an order of magnitude worse than required "chemical accuracy" (typically 1 kcal/mol ?. To improve the accuracy DP capability of newer GPU (NVIDIA 280GTX) was used to implement summation of integral s while the integrals th emselves were still calculated in SP . The use of DP accumulation made the error to be less than 1 kcal/mol 7 . Further 98 improvement of this code and use of modern GPU s allowed to achieve impre ssive 114x speedup in RHF/6-31G* calculation of large Olestra molecule using TeraChem on an singe GeForce GTX580 GPU, as compared to GAMESS on the state-of-the-art Intel Westmere 3.33 GHz CPU core 22 . The first reported implementation of correlated quantum chemistry method on GPU was resolution- of-the-identity MP2 metho d. One of the most time consuming routine in MP2 calculati ons is matri x-matrix multiplicati ons, which can be efficiently performed on GPU using CUBLAS 23 funct ions. In the paper 18 it was shown that in SP matri x matrix multiplication for matri ces larger than 750 elements per side the GPU (NVIDIA Quadro FX 5600 GPU) signifi cantly outperf orms CPU (Opteron 170) with up to 13x speedup. However, for smaller matri ces the CPU was more effi cient. Therefore, the hybrid model where matri ces with a size above the certain thre shold were calculated on GPU while the rest were calculated on CPU was pro posed. Speedups of 1.5x to 4.3x were achieved with this algorithm for a series of linear alkan es using the cc-p VDZ basis set. Described algorithm gives reasonable average error 0.3 millihartree due to use of SP operati ons on GPU. To improve the accuracy of matri x-matrix multiplica tion more advanced heterogeneous computing model was pro posed 19 . This model splits each matri x element- wise into 'lar ge' and 'small ' components and calculates only small components on GPU. With this model applied for the 168-atom va linomycin molecule in a cc-pVDZ basis set the speedup of 10.1 times was observed on an NVIDIA Te sla C1060 GPU compared to AMD Athlon 5600+. While for calculati ons with pure GPU matrix operati ons speedups of 13.8 and 7.8 times for single-preci sion (SGEMM) and doubl e-preci sion (DGEMM) were observed. The errors in the correlation energy were -10.0 and -1.2 kcal mo l-l for SP and hetero geneous calculat ions, respectively. Consider ing drastically improved DP performance of the modern GPU, which is only three times 99 smaller than SP for the latest NVIDIA Tesla K20x, the advantages of the proposed het- erogeneous model are questionable. Speedups for various matrix-matrix multiplications using different *GEMM calls from CUDA library can be seen on Fig.4.9 19 . � � w (.9 0 ::::> a.. () .8 <I> > � � a. ::I -6 <I> <I> a. (/) 18 16 14 12 10 8 6 4 2 MGEMM fSalt = 10 ·� -- _ _ _ __ .. ..-:, MGEMM fSalt = 1 o·4 ------- , _ __ .. . . - - - - MGEMM fSalt = 1 a· ········ \ .. SGEMM (Cleaver) ··········· · ···· _ .. -- DGEMM (Cleaver) - - - - ... .- · · \ _ _ . . . ! . ... \. . . · ··· ' . .. . ·· ·· i . . ·· ····· -· / _.,tA 'i';"Fi'JiiV'N;!NNFFNiF!'i'ti NYVy ivv·,: ..-l i "Y� -� - -- ·\_ _r, - -- -- -- -- ·- ' - - ---· '-.. - -----� -- - - - - - - - __ , ;, ( Y .. , . i' ··�' I' IIA11 , .. •ji,WiH W!"ii'!!" ''•! 'W•dJ.!,,!d• � .1 : :'V"� A'V vv �"iv�t�v�N�:� :�����::v:N u�: :�v�: :i: : v : N:�vv.r�::��v��� : �.:v:.v::i�,\ V'V"' l "V 11/" :�,. .. ,�\ ,., ,, .. . /;.' ) I l I ' ' I j I l 'I': : : j i : j I j i \ � I I I I { f : { I � I I \ \ • { � . - o jrv.... 0 2000 4000 6000 8000 1 0000 12 000 Matrix size Figure 4.9: Speedup for various *GEMM calls as a function of square matrix size (averaged over 10 runs). MGEMM correspond to the heterogeneous model with dif ferent thresholds. Times are scaled relative to running DGEMM on the CPU. Recently some of the CC methods were also successfully implemented on GPU. DePrince et al. implemented CCSD and TD-CISD methods in Psi3 electronic structure package 2 0 . They developed a new code that uses CUB LAS and GotoBLAS libraries and performs operations on both CPU and GPU simultaneously. For the large systems where the data structures does not fit entirely in GPU memory, the tensors were tiled and required data copied to the GPU as needed. This approach resulted in 4.3 times faster execution on NVIDIA Tesla C2050 (Fermi) GPU compare to the corresponding CPU-only algorithm on two Intel Xeon X5550 CPUs for the systems as large as system 10 0 with 70 electrons and 250 virtual orbital s. Wenjing et al . used different approach to implement CCSD(T) method on GPU in NWChem that produce optimized CUDA code given the specification of a tensor contraction expre ssion. Instead of using CUBLAS DGEMM special custom CUDA kernels for tensor contrac tions were developed. This approach demonstrated speedup over a factor of 8.4 using one NVIDIA Te sla T1 0 GPU as compared to one CPU core and over 2.6 when utilizing the entire system using hybrid CPU + GPU solution with 2 GPU s and 5 cores (instead of 7 cores per node). Execution of this code on NVIDIA Te sla T20 (Fermi architecture) demonstrated speedup of about 3.4 over T10 GPUs with this code. 4.5 Challenges of the implementation of the CCSD method on GPU As described in the Chapter 1 the proce ss of solving CC equat ions requires calculati ons of many complex tensor contract ions. Consider one of the typical contract ions in CCSD method from T 2 equation 2 4 [ t :Jwabef ef where W abef is an intermediate, i, j rep resent occupied orbitals and a, b, e, fare unoc- cupied orbital s. The computational complexity of this term using big 0 notation is 0(0 2 V 4 ), where 0 is the number of occupied orbitals and Vi s the number of virtual orbital s. The overall computational cost per iteration is typically 0(0 2 V 4 + 0 3 V 3 + 0 4 V 2 ) operati ons. Considering that, for example, one DNA base molecule with the cc- p VTZ basis set has more than 10 00 basis functions one can estimate the tremendous computational resources required to calculate DNA bases cluste rs. Thus it is extremely important that tensor contraction operati ons implemented as efficiently as possible. The 101 described contraction as well as most tensor contracti ons in CCSD method can be rep- resented as matri x operati ons, in particularly as matri x-matrix multiplicat ions. The effi - ciency of the matrix operati ons has been studied for a long time and there are several efficient algorit hms and programming librari es, such as different Basic Linear Algebra Subprograms (BLAS) implemen tati ons. Because of the nature of matrix opera tions and the layout of matri ces in memory, parallelization of the matrix opera tions can give a substantial speedup, which is implemented in many BLAS librar ies, such as ATLAS, Intel Math Kernel Library (MKL), AMD Core Math Library (ACML) and NVIDIA CUBLAS. Therefore CCSD method computational time can be signifi cantly reduced through parallelizati on. In addition to the contraction operati ons efficient implementation of the permutation operation is needed especially for CCSD( T) method. In CCSD( T) method up to 50% of time could be spent for permutation operati ons. There is no permutation operati ons for tensors in standard librari es for GPU. Thus special kernel s for permutation opera tions has to be written. CCSD( T) method requires one specific operation that comes from the fol lowing expression for CCSD( T) energy correc tion: t( a ) t( b ) E = L ijkabc ijkabc ijkabc �ia + �jb + �kc It is similar to dot-product matrix operation but requires the divi sion by elements of � matr ices. Thus it cannot be implemented through the BLAS functions and has to be developed as a special CUDA kernel . 102 4.6 Implementation details 4.6.1 Overview of the code structure and libraries design Our GPU implementation of CC methods is a part of ccman2 library 25 of Q-Chem 4 electronic structure package 13 . Fig. 4.10 gives an overview of differ ent components of CC/EOM-CC codes and librar ies. ccman2 drives the execution, libcc conta ins codes for specific CC equati ons (such as CC ampli tudes update codes, etc) implemented using tensor objects, libsolve conta ins code for DIIS and Davi dson procedures, libctx stores the context of calcula tions. The two interface components that link ccman2 to a host electronic structure plat form are: (i) liblegacy that pro vides the information about orbital spaces, integra ls, MO coefficients; and (ii) virtual memory manager (livmm) that handles the 10 and mem ory. Only vertical dependencies (e.g., ccman2 knows about libcc, but libcc does not depend on ccman2) are present (no circular or horizontal dependenc ies). The two weak dependencies (ccman2 to libvmm and libtensor) are used for initialization only. Initially ccman2 code was developed for CPU calculati ons only. However, the librari es were specifically designed to allow an extension to different computer architect ures. Using modular and layered architecture of the ccman2 package we we able to implement GPU version of the ccman2 with changes only in libtensor, libcc and libvmm librari es. The short description of this librari es are given below. Libtensor library is an open-source object-oriented C++ library of classes and rou tines designed to perform general-purpo se tensor al gebra. The primary pur pose of the library is to enable post- Hartree Pock electronic structure metho ds, such as coupl ed cluster methods. The library supports tensors of arbitrary order (number of dimensions), size, and symmetry. 103 Figure 4.10: Overview of CC/EOM- CC code architecture. The library featur es a straightforward programming interface, full tensor symm etry (point group including non-Abelian subgroup s, permuta tional, spin), flexible memory management via a separate virtual memory component, and shared-memory parallel algorit hms. Modular layered architecture of the library (Fig. 4.11) provides multiple points of extension making it possible to add support for new tensor structures and sym metr ies, al gorith ms and computer architectur es. In layered architecture modules depend only on modules immediately below their level . In such architecture the replacement or addition of a certain module in the layer will require only changes in the current layer and immediately above. For examp le, we can replace BLAS module with different ven dor implementation and we will have to change only Simple Tensor Operati ons level without changing Block Tensor Operati ons and library interf ace. Described layered architecture of libtensor library is very important for the code development in the emerging epoch of hetero geneous computing. Currently there are a 104 Block tensor operations Pthre ads, OpenMP and MPI Serial, but pa ralle l-aware Simple Tensor Operations BLAS/CUBLA S LA PACK Low- level ro utines , ClJ DA kern els Figure 4.11: Overview of the libtensor library structure. A multi -layer design allows for vari ous extensions in terms of new algorithms and data types as well as new hardware architectures. The layers interact through well-defined interface s; any layer can be substituted by an alternative implementation without the need to modify the code in the layers above or below. number of computational architectu res: NVIDIA and ATI GPUs, Intel Xeon Phi copro- cessor, etc. The development of the separate code for every architecture would be very time-consuming . Th us, it is crucial to develop a code that can be easily extended to a different computer architecture by addition of new modules rather than by rewriting the entire code. Implemented data structures and algorithms operate on large tensors by splitting them into smaller blocks, storing them both in memory and files on disk, and apply ing divi de-and-c onquer-type parallel algorithms to perform tensor algebra. The library offers a set of general tensor symmetry algorit hms and a full implementation of ten- sor symmetri es typically found in electronic structure theory : permutati onal, spin and molecular point group symmetry. The Q-Chem electronic structure software uses this 105 library to drive coupled-cluster, equati on-of-moti on, and al gebra ic-dia grammatic con- struction methods. interface dense _tensor_rd_i <lf-- dense _tensor _ wr _i interface template< T > gen_block_ tenso r_rd_i <lf-- gen _block _tensor_ wr _i dense_tensor Simple tensor stored as a multidimensional array template< T > gen_block _tensor Generalized block tensor structure, block type is a template parameter r block _tensor Block tensor with dense blocks Dense tensor operations: tod_copy (Copy tensor elements) tod_add (Add multiple tensors) tod_contract2 (Contract two tensors) tod_ dotprod (Inner product of two tensors) Generalized block tensor operations: gen_bto_copy gen_bto_add gen_bto_ contract2 gen_bto_dotprod Block tensor operations: btod_copy btod_add btod_co ntract2 btod_ dotprod Figure 4.12: Diagram of classes in libtensor . For each type of tensors there are specialized tensor operat ions. General block tensors and operations on them are generic implementati ons. Concrete block tensors use the generic structures and algorithms within, but provide a simplified interface. The template argument in gen_block_tensor allows to use different types of tensors, e.g., real, complex or CUDA dense tensors, sparse tensors, or tensors with special propert ies. Any such implementation of tensor have to provide tensor operation classes (tod_add, tod_contract2, etc.). Fig. 4.12 shows libtensor class diagram. External programs should work with the library through the btensorJ or tensorJ interface s. They are guaranteed to be unchanged even though the underlying code can be signifi cantly chang ed. These interfac es are implemented by different classes, such as btensor and direcL btensor. In order to adopt 106 a library to a new computer architecture the set of operati ons classes, such as tod_add, tod_contract2, tod_copy, etc., has to be pro vided. Then a new block_t ensor that is based on these operati ons can be crea ted. The example will be give below in the desc ription of CUDA C implementation for GPU. A virtual memory management tool in libvmm library implements necessary func tions for storing large tensors on disk. This component provides a memory allocator and an interface to adv ise on the memory use pattern. Inside it maintai ns a schedule for reading blocks from disk to RAM and writing the results back to disk if needed. 4.6.2 Description of the GPU-enabled code In order to enable GPU computati ons in ccman2 new codes were added mostly in libten- sor library. The CPU version of the library left intact and can be using al ong with GPU version. The foundation of the new implementation consists of the math kernels that use CUBLAS matrix operati ons and custom CUDA C kernels for such simple opera- tions as tod_copy, tod_set and tod_add. In additi on, a simple class cuda_uti ls is created to simplify copy operation between the host and device memory and to handle CUDA error s. On top of this foundation new classes cuda_ dense_tensor and cuda_blocL tensor are created. This classes follow the same design pattern as regular dense_tensor and block_te nsor classes but use CUDA operati ons. As menti oned before non-iterative triple energy correction in the CCSD(T) method t( a ) t( b ) E = L ijkabc ijkabc ijkabc llia + fl. jb + llkc cannot be implemented through the CUBLAS function. A new kernel was written in libcc library in order to calculate it. Then a class that calculates triple correction to 107 CCSD energy entirely on GPU using this kernel and cuda_btod_tensor class of libtensor was created in libcc. In addition, a new class cuda_ccsd_pt had to be created in ccman2. All operati ons, required for the calculation of tripl es correction were programmed in this class using cuda_b lock_ tensor and cuda_btod contracti ons. It was required because currently default btod operati ons do not work with cuda_block_tensor. After the creation of special VMM for GPU the calculation of triple correction on GPU can be done with regular ccman2 code and the separate CUDA version of ccsd_ pt class will be removed. CCSD( T) has a high computational cost that scales as N 7 and relatively low require ments for memory (N 4 ). These facts allowed us to store all the tensors required for cal culation of tri ples' correction in the GPU memory. We copy all tensors to GPU memory, compute tri ples' correction and copy back only a minor amount of data, that we sum up on CPU to get tri ples' correction to energy. Low memory require ments of CCSD( T) allows us to compute the energy of the system with more than 300 basis functions on a typical Te sla GPU s with 6 Gb of memory. This can be further improved if hatching of contraction operation is implemented. In this approach only part of the tensors are copied to the GPU and a partial result is calculated at a time. During the execution of tensor operati ons on GPU, the next batch of data can be copied to the GPU memory and then the new batch is executed as soon as the calculation of the previ ous batch is done. In order to use other CC methods, such as CCSD that has higher memory require ments, an implementation of a new virtual memory manager in libvmm is require d. Current implementation of VMM in libvmm library manages data storage between the hard disk and RAM. When tensors are too large to fit in the RAM, VMM stores ten sors that are currently not in use on disk. When a tensor is required it is automatically loaded by VMM from disk to RAM. In order to use such VMM with GPU a new layer 108 of GPU memory has to be added (Fig. 4.13). In this version of VMM tensor can be stored in up to three locat ions: on disk, in host (CPU) RAM or in device (GPU) RAM. In order to minimize the data transfer it should be al lowed to store a copy of the ten sor in multipl e memory laye rs. Thus VMM has to keep track where the "latest" (most recently changed) version is stored and update the "older" version when the program tri es to access them. In addition VMM distinguish two types of tensor copies: read-only and read-write. Read-only copies are never changed thus they can never become the "latest" copies and never have to be copied back, for example from GPU memory to host memory. The GPU-enabled version of VMM is currently under development. GPU memory I CPU memory Figure 4.13: Memory layers of GPU implementation of VMM. Multiple copies of the tensor can be stored in all three layers. VMM keeps track of the latest copy of the tensor and updates older version as needed. 109 4.6.3 Results and conclusions The GPU-enabled version of the CCSD( T) method was implemented. Preliminary benchmarks on previous generation NVIDIA Te sla M2070 based on Fermi architecture ( 448 cores) showed the performance close to the performance of Intel X eon CPU of the same generation. According to Te sla K20X (Kepler architecture) DGEMM benchmarks we can expect three times speed up on the latest NVIDIA GPU. GPU load analy sis shows that our code uti lizes about 50% of the GPU. The reason for underutilization of the GPU could be because one tensor is not big enough to load the whole GPU and we use only one host stream to launch kern els. Thus other kernels are not able to use idle core s. The use of several streams can signi ficantly improve perform ance, especially on Kepler GPU s. Anal ysis of time spent for differ ent operati ons shows that about half of the time is spent in DGEMM calls and the other half is permutation operati ons. The time spent in permutati ons operati ons can be signi ficantly reduced by optimization of the permutation kernel. In parti cular the big performance gain can be achieved by use of shared memory and coa lesced memory access. However, while non-coalesced memory access can result in a significant performance lost in Fermi architecture GPUs it is less important in Kepler GPUs. All tod operati ons required for other CC and EOM-CC methods are implemented and tested on GPU. A new efficient GPU-en abled VMM is required in order to use this metho ds with rea l- sized systems. When such VMM is implemented cuda_block_tensor can be directly plugged in to existing ccman2 code and all available CC meth ods will work on GPU. The pur pose of this research was not only to develop a CUDA C implementation of CC methods but to design an architecture that allows a relatively easy development of new versions for difference computer architecture. Our implementation demonstrates 11 0 how the version for CPU and for heterogeneous computing on GPU can be implemented within the same library, design and interfa ces. A new modules for computing on AMD GPUs (with OpenCL) and Intel Xeon Phi copro cessor can be added to the library with reasonable efforts using the same approach. 111 4. 7 Chapter 4 references [1] S. Saebo and P. Pulay. The local correlation trea tment. ii. implementation and tests. J. Chern. Phys., 88:1 884-1890, 1988. [2] Moore 's law. http : I /www . singularity . com/ chart s/page67 . html, 2009. [3] http : I /www. spiral . net /problem . html, 2009. [4] J.P. Shen and M.H Lipasti . Mcgraw-Hill, 2004. [5] K. Yasuda. Tw o-electron integral evaluati on on the grap hics pro cessor unit. Jour nal of Com putational Chemist ry, 29( 3):33 4-- 342, 2008. [6] I.S. Ufimtsev and T.J. Marti nez. Quantum chemistry on graphical processing units. 1. stra tegies for two-electron integral evaluati on. Journal of Chemical Theory and Com putation, 4(2):222-23 1, 2008. [7] I.S. Ufimtsev and T.J. Marti nez. Quantum chemistry on graphical processing units. 2. direct self -consi stent- field implementati on. Journal of Chemical Theory and Com putation, 5(4):1004-1015, 2009. [8] I.S. Ufimtsev and T.J. Marti nez. Quantum chemistry on graphical processing units. 3. analytical energy gradients, geometry optimizati on, and first principl es molec ular dynami cs. Journal of Chemical Theory and Com putation, 5 (10):261 9-262 8, 2009. [9] H.G. Kmmel. A biogr aphy of the coupled cluster method, chapter 41, pages 33 4-- 348. [1 0] V. Volkov and J.W. Demmel. Benchmarking gpus to tune dense linear algebra. In Proceedings of the 2008AC MI IEEE con f erence on Supercom puting, SC '08, pages 31:1 - 31:11, Piscataway, NJ, USA, 2008. IEEE Press. [11] CUDA C programming guide. http : I /docs . nvidia. com/cuda/ cuda-c-progra mming-guide I index . html, 2012. [12 ] NVIDIA Te sla K20-K20X GPU accelerators - benchmarks. Application perfor mance technical brief. http : I /www. nvidia. com/ docs I IO/ 122 8 7 4 I K20- and-K 20X-ap plication-pe rformance- technical- brief . pdf, 2012. 11 2 [13] Y. Shao, L. Fusti-Mo lnar, Y. Jung, J. Kussmann, C. Ochsenfeld, S. Brown, A.T.B. Gilbert, L.V. Slipchenko, S.V. Levchenko, D.P . O' Neill, R.A. Distasio Jr, R.C. Lochan, T. Wang, G.J.O. Beran, N.A. Besley, J.M. Herbert, C.Y. Lin, T. Van Voorh is, S.H. Chien, A. Sodt, R.P. Steele, V.A. Rassolov, P. Maslen, P.P. Koram bath, R.D. Adamson, B. Austin, J. Baker, E.F.C. Bird, H. Daschel, R.J. Doerks en, A. Dreuw, B.D. Dunietz, A.D. Dutoi, T.R. Furlani, S.R. Gwaltney, A. Heyden, S. Hirata, C.-P . Hsu, G.S. Kedziora, R.Z. Khalliulin, P. Klunziger, A.M. Lee, W.Z. Liang , I. Lotan, N. Nair, B. Pete rs, E.L Proynov, P.A. Pieniazek, Y.M. Rhee, J. Ritchi e, E. Rosta, C.D. Sherrill, A.C. Simmonett, J.E. Subotnik, H.L. Woodcock III, W. Zhang, A.T. Bell, A.K. Chakraborty, D.M. Chipman, F.J. Keil , A. War shel, W.J. Hehre, H.F. Schaefer III, J. Kong, A.L Krylov, P.M.W. Gill, M. Head-Gordon. Advances in methods and algorithms in a modern quantum chemistry program packag e. Phys. Chern. Chern. Phys., 8:3172- 3191, 2006. [14] Whitepaper. NVIDIAs next generation CUDA compute architect ure: Kepler GK110. http://www.nvidia. com/content /PDF/kepler/ NVIDIA- Kepler-G K1 10-Arc hitecture-Whit epaper.pdf, 2012. [15] A.G. Anderson, W.A. Goddard III, and P. Schrder. Quantum monte carlo on graph ical processing uni ts. Com puter Physics Communications, 177(3):298 - 306, 2007. [16] K.A. Wendt, J.E. Drut, and T.A. Lhde. Toward lar ge- scale hybrid monte carlo simulati ons of the hubbard model on grap hics pr ocessing units. Com puter Ph ysics Com munications, 182(8):1651 - 1656, 2011. [17] S. Guochun, V. Kindratenko, I. Ufimtsev, and T. Mart inez. Direct self- consistent field computati ons on gpu cluste rs. In Parallel Distributed Processing (I PDPS), 2010 IEEE International Sym posium on, pages 1-8, 2010. [18] L. Vogt, R. Oliva res-Amaya, S. Ker mes, Y. Shao, C. Amador- Bedolla, and A. Aspuru-Guzik. Accelerating resolution-of-t he-ide ntity second-order mollerp lesset quantum chemistry calculat ions with graphical proce ssing unit s. The Jour nal of Ph ysical Chemistry A, 112(10):2049-205 7, 2008. PMID: 18229900. [19] R. Olivare s-Amaya, M.A. Wat son, R.G. Edgar, L. Vogt, Y. Shao, and A. Aspuru Guzik. Accelerating correlated quantum chemistry calculat ions using graphical pr ocessing units and a mixed prec ision matrix multiplication library. Journal of Chemical Theory and Com putation, 6(1):135-144, 2010. [20] E.A. DePrince III, J.R. Hammond, and S.K. Gray. Many-body quantum chemistry on grap hics processing units. In Proceedings of SciDAC, July 10-14 2011. [21] A.A. Auer, G. Baumgartner, D. E. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X.Y. Gao, R. Harrison, S. Kri shnamoorthy, S. Krishnan, C.C. Lam, 113 Q.D. Lu, M. Nooijen, R. Pitzer, J. Ramanujam, S. Sadayapan, and A. Sibiryakov. Automatic code generation for many-body electronic structure methods: The ten sor contraction engine. Mol. Phys., 104:211-228, 2006. [22] A.V. Titov, I.S. Ufimtsev, N. Luehr, and T.J. Mart inez. Generating effi cient quan tum chemistry codes for novel architecture s. Journal of Chemical Theory and Com putation, 9(1):213 -2 21, 2013. [23] CUBLAS library. http : I /docs . nvidia . com/ cuda/ cublas/index . html, 2012. [24] G.D. Purvis and R.J. Bartlett. A full coupled-cluster singles and doubles model: The inclusion of disconnected tripl es. 1. Chern. Phys., 76:1910-1918 , 1982. [25] E. Epifanovsky, M. Wor mit, T. Kus, A. Landau, D. Zuev, K. Khistyaev, P. Manohar, I. Kaliman, A. Dreuw, and A.l. Krylov. New implementation of high-level corre lated methods using a general block-tensor library for high-performance electronic structure calculati ons. 1. Com put. Chern. , 2012. to be submitted soon. 11 4 Chapter 5: Future work In this work the effect of microhydration on ionization ener gies and proton transfer mecha nism in nucleo bases was studied. During this study the focus was on a small clusters with no more than three water molecul es. This study can be extended to bigger clusters with more water molecu les. In particular, it could be interesting to consider the effects of the complete fist and second solvation shells and compare it with the bulk water. Partially it was done in the work of Ghosh et al. 1 where the EOM-IP-CCSD method was combined with effective frag ment potential method to study the effect of solvation on the vertical ionization energy of thy mine. Using the same technique one can try to study the proton transfer from ionized water to nucl eobases through water wires in larger cluste rs. However, it has to be noted that the water involved in proton transfer can not be described as part of the EFP system and has to be included in the pure quantum part which can result in a relatively large system. The continuation of the development of the GPU implementation of CC meth ods requires the development of GPU-enabled Virtual Memory Manager (VMM) as described in Chapter 4. The development of new VMM will remove the limit to the system size imposed by GPU memory size and will unlock all CC methods imple mented in the current CPU-only version of ccman2 library. Then after minor mod ificati ons it will be possible to use RI and Cholesky decomposition meth ods already implemented in ccman2 on GPU. These meth ods signifi cantly reduce the amount of 115 data storage required for computati ons but increase the computational cost. Consider ing GPU architecture these methods should be especially beneficial because they will reduce the amount of data transfer through the PCI-Expre ss interface that could be the bottl eneck in many computat ions on GPU. The increased computational cost of this metho ds should be resolved by the excess computational power of GPU. The optimization of permutation kernel s can increase the performance of such meth ods as CCSD( T) which heavily use this operati on. The implementation and optimization of the multi -s treaming version of the library that will run several kernels simultaneously along with data transfer operati ons will help to increase the load of the GPU and further increase the perform ance, especially on the new NVIDIA Kepler architecture. Using the same library design and patterns it should be easy to develop implemen tat ions with other langua ges such as OpenCL and OpenA CC. It will allow to use AMD GPU s which potentially can have better performance in the future. The development of the library version for Intel Xeon Phi most likely will require some changes to the kernel design because of the difference in architecture s. However, the basic principl es are the same and thus there should not be any significant obstacles with this implementati on. Finally, the development of the effi cient implementation of the ccman2 library for mul tinode clusters with GPU s will open the possibility to use the incredible computa tional power of supercomputers that is expected to reach exascale performance in the nearest fut ure. 116 5.1 Chapter 5 references [1] D. Ghosh, 0. Isa yev, LV. Slipchenko, and A.l. Krylov. The effect of solvation on vertical ionization energy of thymi ne: From rnicrohydration to bulk. J. Phys. Chern. A, 115:6028-603 8, 2011. 117 Appendix: Tensor algebra library graphical processing unit implementation Details of the GPU implementation of tensor algebra library As described in Chapter 4 ccman2 and libtensor librari es have layered structure. Fig. A.1 gives a schematic representation of tensor contraction operation execution across differ ent laye rs. At the top layer the tensor contraction operati ons are pro grammed in libcc library. This tensor contracti ons use btod_contract2 class from libten sor library. btod_contract2 class is a general representation of block tensor contrac tion operat ions. It call gen_bto_contract2 operati ons with specific traits as a tem plate parameter. For examp le, in order to execute contract ions on GPU one should use cuda_block_tensor_tra its. gen_bto_contract2 forms batches of tensors from block tensors and execute tod_contract2.perform() operation for every batch. tod_contract2 maps required contraction to cor responding mathematical kernel (e.g. kern_mul2) with required linalg parameter (e.g. if cuda_bl ock_tens or_traits were used than linalg_cublas is used). This kernel class calls suitable linalg operation (e.g. linalg_cuda ::i j_ip_jp_x) which then calls DGEMM operation (e.g. cubl asDgemm). 118 btod contract2 � gen_btod_contract2 <Traits> for every batch: � tod contract2 � kern_mui<Lina lg> � Linalg :: ij_ip _jp _x � dgemm Figure A.l: Schematic order of execution of tensor contraction operation. Ideally, only the lowest level linalg opera tions has to be differ ent for imple- mentati ons for different hardware architect ures. However, in order to handle sepa- rate GPU memory without complex virtual memory manager we had to create par- aile! cuda_tod_contract2 and cuda_btod_contract2 clas ses. gen_bto_contract2 clas ses are the same for CPU and GPU implementat ions. when the VMM is created cuda_btod_contract2 class can be replaced with regular btod_contract2 class. 119
Abstract (if available)
Abstract
Quantum mechanics can predict basic properties of molecules such as relative energies, electronic charge distributions, dipoles, ionization and excitation energies. By solving the Schrödinger equation with the electronic molecular Hamiltonian we can determine the electronic structure of the molecule that implies physical and chemical properties of the molecule. ❧ In this thesis we use quantum mechanics to study basic properties of microhydrated nucleobases--essential building blocks of DNA. Water plays a central role in chemistry and biology by mediating the interactions between molecules, altering energy levels of solvated species, modifying potential energy profiles along reaction coordinates, and facilitating efficient proton transport through ion channels and interfaces. The effect of hydration on different properties of molecules and chemical reactions has been intensively studied for many years both theoretically and experimentally. Numerous solvation models were developed in an effort to simulate the properties and reactions in the bulk water. However, even the interaction between several water molecules are not completely understood. This work demonstrates how microhydration can affect such properties as ionization energies and control proton transfer mechanisms. It explains the experimental results and proposes mechanisms of observed effects. ❧ In order to facilitate theoretical studies the development of new quantum chemistry programs that are able to utilize the resources of modern high-performance hardware is required. An exact solution for the Schrödinger equation can only be obtained for the species with just a few electrons. However, there are a number of quantum chemistry methods that give approximate solutions. Among the most widely used and accurate methods are coupled-cluster methods. In the last chapter the details of the GPU enabled implementation of the post-Hartree Fock ab initio quantum chemistry methods is given. ❧ In Chapter 2 a combined theoretical and experimental study of the effect of microhydration on ionization energies (IEs) of thymine is presented. The experimental IEs are derived from photoionization efficiency curves recorded using tunable synchrotron VUV radiation. The onsets of the PIE curves are 8.85 ±0.05, 8.60± 0.05, 8.55± 0.05, and 8.40± 0.05 eV for thymine, thymine mono-, di-, and tri-hydrates, respectively. The computed (EOM-IP-CCSD/cc-pVTZ) AIEs are 8.90, 8.51, 8.52, and 8.35 eV for thymine and the lowest isomers of thymine mono-, di-, and tri-hydrates. Due to large structural relaxation, the Franck-Condon factors for the 0←0 transitions are very small shifting the apparent PIE onsets to higher energies. Microsolvation strongly affects IEs of thymine--addition of each water molecule reduces the first vertical IE by 0.10-0.15 eV. The adiabatic IE decreases even more (up to 0.4 eV). The magnitude of the effect varies for different ionized states and for different isomers. For the ionized states that are localized on thymine the dominant contribution to the IE reduction is the electrostatic interaction between the delocalized positive charge on thymine and the dipole moment of the water molecule. ❧ In Chapter 3 proton transfer in a model system comprising dry and microhydrated clusters of nucleobases is investigated. Experiments with mass spectrometry and tunable vacuum ultraviolet synchrotron radiation show that water shuts down ionization-induced proton transfer between nucleobases, which is very efficient in dry clusters. Instead, a new pathway opens up in which protonated nucleobases are generated by proton transfer from the ionized water molecule and elimination of a hydroxyl radical. Electronic structure calculations reveal that the shape of the potential energy profile along the proton transfer coordinate depends strongly on the character of the molecular orbital from which the electron is removed, i.e., the proton transfer from water to nucleobases is barrierless when an ionized state localized on water is accessed. The computed energetics of proton transfer is in excellent agreement with the experimental appearance energies. Possible adiabatic passage on the ground electronic state of the ionized system, while energetically accessible at lower energies, is not efficient. Thus, proton transfer is controlled electronically, by the character of the ionized state, rather than statistically, by simple energy considerations. Proton transfer from ionized outer water to nucleobases in dihydrated cluster through the Grotthuss-like mechanism is barrierless and the most energetically favorable mechanism. ❧ Chapter 4 describes an implementation of coupled-cluster (CC) post-Hartree Fock ab initio quantum chemistry methods, which are widely used in the research, including the research described in previous chapter, on GPU with CUDA C language. The implementation of these methods is based on the ccman2 and libtensor libraries and is part of the Q-Chem 4 electronic structure package. These libraries use layered modular architecture which allows relatively easy addition and replacement of low-level modules without changing high-level code (such as CC equations). Developed layered architecture make possible a fast adaptation of the existing code in the ccman2 library to other languages and technologies, such as OpenCL with AMD GPUs or Intel Xeon Phi coprocessor. The development of CC code for new massively parallel architectures is crucial for future research of larger systems with higher accuracy and for QM/MM methods.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
Electronic structure of ionized non-covalent dimers: methods development and applications
PDF
Excited state proton transfer in quinoline photobases
PDF
Development of robust ab initio methods for description of excited states and autoionizing resonances
PDF
New electronic structure methods for electronically excited and open-shell species within the equation-of-motion coupled-cluster framework
PDF
Electronically excited and ionized states in condensed phase: theory and applications
PDF
Ultrafast dynamics of excited state intra- and intermolecular proton transfer in nitrogen containing photobases
PDF
Modeling x-ray spectroscopy in condensed phase
PDF
Multiphoton ionization of tTris(2-phenylpyridine)Iridium
PDF
Development of predictive electronic structure methods and their application to atmospheric chemistry, combustion, and biologically relevant systems
PDF
Development and application of robust many-body methods for strongly correlated systems: from spin-forbidden chemistry to single-molecule magnets
PDF
Electronic structure analysis of challenging open-shell systems: from excited states of copper oxide anions to exchange-coupling in binuclear copper complexes
PDF
Temperature-dependent photoionization and electron pairing in metal nanoclusters
PDF
Photoinduced redox reactions in biologically relevant systems
PDF
Advancing computational methods for free energy calculations in proteins and the applications for studies of catalysis in nucleotide-associated proteins
PDF
Predictive electronic structure methods for strongly correlated systems: method development and applications to singlet fission
PDF
Spectroscopic signatures and dynamic consequences of multiple interacting states in molecular systems
PDF
The role of the environment around the chromophore in controlling the photophysics of fluorescent proteins
PDF
Photoexcitation and nonradiative relaxation in molecular systems: methodology, optical properties, and dynamics
PDF
Computational spectroscopy in gas and condensed phases
PDF
Electronic structure of strongly correlated systems: method development and applications to molecular magnets
Asset Metadata
Creator
Khistyaev, Kirill
(author)
Core Title
The effect of microhydration on ionization energy and proton transfer in nucleobases: analysis and method development
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Chemistry
Publication Date
06/26/2013
Defense Date
04/28/2013
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
electronic structure methods,excited states,ionized states,nucleobases,OAI-PMH Harvest
Format
application/pdf
(imt)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Krylov, Anna I. (
committee chair
), Benderskii, Alexander V. (
committee member
), Nakano, Aiichiro (
committee member
)
Creator Email
khistyae@usc.edu,kirhist@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-281184
Unique identifier
UC11292624
Identifier
etd-KhistyaevK-1708.pdf (filename),usctheses-c3-281184 (legacy record id)
Legacy Identifier
etd-KhistyaevK-1708.pdf
Dmrecord
281184
Document Type
Dissertation
Format
application/pdf (imt)
Rights
Khistyaev, Kirill
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Tags
electronic structure methods
excited states
ionized states
nucleobases