Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Computational spectroscopy in gas and condensed phases
(USC Thesis Other)
Computational spectroscopy in gas and condensed phases
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
COMPUTATIONAL SPECTROSCOPY IN GAS AND CONDENSED PHASES by Ronit Sarangi A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulllment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (CHEMISTRY) August 2023 Copyright 2023 Ronit Sarangi Acknowledgements I would like to take this opportunity to express my sincere gratitude to everyone who has con- tributed to the successful completion of my PhD thesis. First and foremost, I would like to thank my advisor Professor Anna I. Krylov for her guidance, support, and encouragement throughout my PhD journey. The feedback and suggestions from Anna have been instrumental in shaping my doctoral research and guiding me towards my future goals. I am especially grateful for her understanding and invaluable advice during the tougher parts of my time here. I am honored to be a part of the iOpenShell group. I am grateful to Prof. Curt Wittig, Prof. Aiichiro Nakano, and Prof. Alexander Benderskii, who have provided me with the academic environment and resources necessary to pursue my research. Their courses were enlightening and — most importantly — fun to participate in. I have learned a lot from them and their contribution towards my scientic growth cannot be understated. In addition, special thanks go to the members of my qualifying and defense committees. Thank you, Prof. Jahan Dawlaty, Prof. Oleg Prezhdo, Prof. Moh El-Naggar, and Prof. Mike Inkpen, for accommodating me despite your busy schedules. I would also like to thank my under- graduate advisor Prof. Amlan K. Roy, who introduced me to quantum chemistry and supported me throughout my Master’s degree and beyond. ii I would like to extend my thanks to the current and past members of Anna’s research group, Kaushik, Yongbin, Maristella, Sven, Sahil, Tirthendu, Pavel, Saikirian, Sourav, Goran, Madhubani, Nayanthara, and Pawel who have generously shared their time, knowledge, and expertise with me. Thank you for being my family away from family. I appreciate all of them for creating a welcoming community and helping me adjust and thrive in USC. Last but denitely not the least, I would like to express my gratitude to my family, who have supported and encouraged me throughout my PhD journey. Their unwavering support and en- couragement have kept me motivated and inspired me to strive for excellence. Thank you to everyone. iii TableofContents Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Equation-of-motion coupled-cluster framework . . . . . . . . . . . . . . . . . . . 3 1.1.1 EOM-CC method for electronically excited states (EOM-EE-CCSD) . . . . 5 1.1.2 EOM-CC method for core-ionized states . . . . . . . . . . . . . . . . . . . 6 1.2 Cost of coupled-cluster calculations . . . . . . . . . . . . . . . . . . . . . . . . . . 8 Chapter 2: On the basis set selection for calculations of core-level states: Dierent strategies to balance cost and accuracy . . . . . . . . . . . . . . . . . . . . . . . 13 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.2 Study design, theoretical methods, and computational protocols . . . . . . . . . . 17 2.2.1 Computational Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 2.3 Results and discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.3.1 Nitrogen molecule, N 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.3.2 Carbon monoxide, CO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 2.3.3 Simple hydrides and the eect of basis on hydrogen . . . . . . . . . . . . . 36 2.3.4 Using mixed basis sets for molecules with multiple edges . . . . . . . . . . 37 2.3.5 Other cost-saving strategies . . . . . . . . . . . . . . . . . . . . . . . . . . 38 2.3.6 Acrolein . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 2.3.7 Glycine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 2.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 Chapter 3: Charge-transfer-to-solvent states provide a sensitive spectroscopic probe of the local solvent structure around anions . . . . . . . . . . . . . . . . . . . . . 53 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 3.2 Computational Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 3.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 3.3.1 Equilibrium dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 iv 3.3.2 UV–vis spectrum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 3.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 3.5 Appendix A: Protocol for excited-states calculations . . . . . . . . . . . . . . . . . 74 3.5.1 Number of waters in the QM subsystem . . . . . . . . . . . . . . . . . . . 74 3.5.2 Basis set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 3.5.3 Number of excited states computed . . . . . . . . . . . . . . . . . . . . . . 77 3.5.4 CIS versus EOM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 3.5.5 QM-MM water exchange . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 3.6 Appendix B: Rigid waters in QM/MM AIMD . . . . . . . . . . . . . . . . . . . . . 79 3.7 Appendix C: Calculations and analysis of the spectra . . . . . . . . . . . . . . . . 80 3.8 Appendix D: Excited-state analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 3.9 Appendix E: Sample inputs for QM/MM AIMD and EOM calculations . . . . . . . 84 Chapter 4: Investigation of new algorithms for tensor contraction using STRUMPACK for their potential application in coupled-cluster calculations . . . . . . . . . . 92 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 4.2 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 4.2.1 Numeric rank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 4.2.2 Binary tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 4.2.3 HSS representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 4.2.4 STRUMPACK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 4.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 4.3.1 Real-life matrices from Q-Chem . . . . . . . . . . . . . . . . . . . . . . . . 101 4.3.2 Compression tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 4.3.3 Tensor contraction using HSS matrices . . . . . . . . . . . . . . . . . . . . 103 4.4 Results and discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 4.4.1 Choice of orbital representation . . . . . . . . . . . . . . . . . . . . . . . . 105 4.4.2 Alkanes at distorted geometry . . . . . . . . . . . . . . . . . . . . . . . . . 108 4.4.3 Canonical versus Locovirt matrix contraction . . . . . . . . . . . . . . . . 113 4.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 Chapter 5: Future directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 5.1 Core-level states of larger systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 5.2 Expanding the scope of compressed matrix contractions . . . . . . . . . . . . . . . 122 5.3 Non-linear spectra of aqueous anions . . . . . . . . . . . . . . . . . . . . . . . . . 123 5.3.1 2PA spectra of aqueous SCN . . . . . . . . . . . . . . . . . . . . . . . . . 123 5.3.2 Challenges in modeling SFG spectra . . . . . . . . . . . . . . . . . . . . . . 125 v ListofTables 1.1 Scaling of electronic structure methods with system size,N, and comparison of maximum system size accessible in the 1990s and 2010s. . . . . . . . . . . . . . . 8 2.1 Basis sets, contraction schemes, and the number of functions per atom a . . . . . . 22 2.2 Core IEs for N 2 , fc-CVS-EOM-IP-CCSD with various basis sets. . . . . . . . . . . 25 2.3 CO, carbon edge. Total and ionization energies; fc-CVS-EOM-IP-CCSD with various basis sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 2.4 CO, oxygen edge. Total and ionization energies; fc-CVS-EOM-IP-CCSD with various basis sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 2.5 Core IEs for H 2 O, NH 3 , and CH 4 computed with mixed basis sets. . . . . . . . . . 37 2.6 Core IEs for H 2 O, NH 3 , and CH 4 computed with mixed basis sets. . . . . . . . . . 37 2.7 Core IEs in CO computed with mixed basis sets on carbon and oxygen edges a . . . 38 2.8 Core IEs for H 2 O, NH 3 , and CH 4 computed with single and double precision CCSD a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 2.9 Core IEs for H 2 O, NH 3 , and CH 4 computed with the FNO-based truncation of the virtual space. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 2.10 Acrolein, core IEs for the oxygen edge. . . . . . . . . . . . . . . . . . . . . . . . . 40 2.11 Acrolein, core IEs for the carbon edge. . . . . . . . . . . . . . . . . . . . . . . . . 41 2.12 Glycine core IEs for all edges with mixed basis sets a . . . . . . . . . . . . . . . . . 43 2.13 Glycine core IEs with Pople basis sets. . . . . . . . . . . . . . . . . . . . . . . . . 44 vi 3.1 Key RDF parameters extracted from MD and AIMD simulations and the experimentally derived values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 3.2 Raw data for excitation energies (E ex , eV) and oscillator strengths (f). . . . . . . 65 3.3 Comparison of peak positions (eV) of computed and experimental UV–vis spectra of SCN (aq) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 3.4 Parameters for gaussians used for tting in Fig. 3.12. . . . . . . . . . . . . . . . . 81 3.5 Peak positions (eV) of the three lowest bands a of aqueous thiocyanate as reported by Fox et. al study[19]. Oscillator strengths are given in parenthesis. . . . . . . . 81 3.6 Wave-function analyses of the reduced 1PTDMs corresponding to the lowest eight excitations in the EOM-EE-CCSD/MM calculations for a representative snapshot sampled from the MD simulations. The wave-function analysis for each 1PTDM includes its norm (jj jj) and participation ratio (PR)[? ? ]. Only the dominant hole and particle NTOs are shown. The singular values ( K ) correspond to normalized 1PTDMs. Isosurface is 0.015. . . . . . . . . . . . . . . . 82 3.7 Wave-function analyses of the reduced 1PTDMs corresponding to the lowest eight excitations in the EOM-EE-CCSD/MM calculations for a representative snapshot sampled from AIMD simulations. The wave-function analysis for each 1PTDM includes its norm (jj jj) and participation ratio (PR)[? ? ]. Only the dominant hole and particle NTOs are shown. The singular values ( K ) correspond to normalized 1PTDMs. Isosurface is 0.015. . . . . . . . . . . . . . . . 83 4.1 Sparsity, compression time, rank, and memory of matrices generated using canonical MOs and localized virtual code developed by Zhenling (locovirt) for system I. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 4.2 Sparsity, compression time, rank, and memory of matrices generated using canonical MOs and localized virtual code developed by Zhenling (locovirt) for system II. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 4.3 Sparsity, compression time, rank, and memory of matrices generated using canonical MOs and localized virtual code developed by Zhenling (locovirt) for system I with distorted geometry. . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 4.4 Sparsity, compression time, rank, and memory of matrices generated using canonical MOs and localized virtual code developed by Zhenling (locovirt) for system II with distorted geometry. . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 4.5 Contraction timings and accuracy for system I. . . . . . . . . . . . . . . . . . . . . 115 vii 4.6 Contraction timings and accuracy for system II. . . . . . . . . . . . . . . . . . . . 116 5.1 Diagonal 2PA moments for 8 lowest excited states and par in atomic units. . . . 125 viii ListofFigures 1.1 Reference and target states for dierent EOM-CC variants. Reproduced from Ref. 19. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2.1 Acrolein structure with atom labels. . . . . . . . . . . . . . . . . . . . . . . . . . 20 2.2 Glycine structure (canonical isomer) with atom labels. . . . . . . . . . . . . . . . 20 2.3 Core IEs for N 2 versus the number of basis functions per atom. Gray shaded area marks0.01 eV interval around the basis-set limit (u-aug-cc-pV5Z). . . . . . . . 26 2.4 CO, carbon edge IEs. Top: Only the carbon basis is uncontracted, whereas the original matching basis is used for oxygen. Bottom: Both carbon and oxygen bases are uncontracted. Gray shaded area marks the0.01 eV interval around the basis-set limit (u-aug-ccp-pV5Z). . . . . . . . . . . . . . . . . . . . . . . . . . 28 2.5 CO, oxygen edge IEs. Top: Only the oxygen basis is uncontracted, whereas the original matching basis is used for carbon. Bottom: Both carbon and oxygen bases are uncontracted. Gray shaded area marks the0.01 eV interval around the basis-set limit (u-aug-ccp-pV5Z). . . . . . . . . . . . . . . . . . . . . . . . . . 29 2.6 Glycine IEs for the carbon, nitrogen and oxygen edges versus the number of basis functions per active edge atom. The total number of basis functions in each calculation is shown in parentheses in the respective panel. The best estimate is obtained with the uC-aug-cc-pV5Z/aug-cc-pVTZ/aug-cc-pVTZ basis; the respective total number of basis functions and the number of basis functions per active edge atom are shown in parentheses. . . . . . . . . . . . . . . . . . . . 45 3.1 RDFs of H (left panel) and O (right panel) atoms on waters with respect to the (a) S, (b) C, and (c) N atoms on SCN , extracted from the MD and QM/MM AIMD simulations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 ix 3.2 Average distances and standard deviations for the 11 nearest water molecules from the S (left) and N (right) ends of SCN extracted from the MD and AIMD simulations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 3.3 NTOs for the lowest excited states from EOM-EE calculations: (a) intramolecular state; (b)s-type CTTS state; (c)p-type CTTS state; (d)p-type CTTS state. . . . 65 3.4 Average exciton, hole, and particle sizes and their standard deviations for MD (left) and AIMD (right) snapshots. . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 3.5 Comparison of aqueous SCN spectra calculated at T=300 K using MD and AIMD with the experimental spectrum (at T=274 K) from Ref. 1. . . . . . . . . . . 69 3.6 CIS/aug-cc-pVDZ spectra with an increasing number of waters in the QM region. 75 3.7 CIS spectra with dierent bases on the QM atoms. The total number of basis functions is given in parenthesis. . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 3.8 CIS/Bas2:20-11 spectra computed with 8 and 10 excited states per snapshot. . . . 77 3.9 CIS versus EOM spectra calculated using the Bas2:20-11 scheme with 8 excited states over 80 snapshots. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 3.10 Histogram of MM waters replacing the QM waters. . . . . . . . . . . . . . . . . . 78 3.11 N–H (left) and N–O (right) RDFs for AIMD runs with and without constraints on water bonds and angles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 3.12 Fitting the MD (left) and AIMD(right) spectra with gaussians. . . . . . . . . . . . 81 4.1 A level 2 binary tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 4.2 Binary trees for level 1(left) and level 2 (right) HSS representation. Adapted from Ref. 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 4.3 Matrices used in this report. The size of: OVOV matrix is (nocc*nvir x nocc*nvir), OOVV is (nocc 2 x nvir 2 ), and VVVV is (nvir 2 x nvir 2 ), where nocc and nvir are the number of occupied and virtual orbitals, respectively. . . . . . . . . . . . . . 102 4.4 Memory ratio (Memory of HSS matrix / Memory of Dense matrix) versus the number of virtual orbitals for CH 4 system for compression of OVOV (left) and VVVV (right) matrices with dierent relative tolerances. . . . . . . . . . . . . . . 105 x 4.5 Time for compression versus the number of virtual orbitals for CH 4 system for compression of OVOV (left) and VVVV (right) matrices with dierent relative tolerances. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 4.6 Memory ratio (Memory of HSS matrix / Memory of Dense matrix) versus the number of C in system for compression of OVOV (left) and VVVV (right) matrices with dierent relative tolerances. . . . . . . . . . . . . . . . . . . . . . . 106 4.7 Time for compression versus the number of C in system for compression of OVOV (left) and VVVV (right) matrices with dierent relative tolerances. . . . . 107 4.8 Time for compression (left) and memory ratio (right) versus number of carbons in the system (I) for compression of VVVV matrix. . . . . . . . . . . . . . . . . . 113 4.9 Time for compression (left) and memory ratio (right) versus number of virtual orbitals in the system (II) for compression of VVVV matrix. . . . . . . . . . . . . 114 4.10 Contraction time comparison for systems I (left) and II (right). . . . . . . . . . . . 115 5.1 1PA and 2PA CTTS spectra of SCN (aq) . The 2PA spectrum is computed for parallel polarization and the units are in the Göppert-Mayer (GM) units. . . . . . 124 xi Abstract Computational spectroscopy is a branch of quantum chemistry concerned with the modeling of various spectroscopic signals and analysis of the underlying electronic structure. The chal- lenge encountered in this eld is twofold: (i) Selection of a reliable and robust quantum-chemical method capable of describing relevant transitions and, (ii) accurate inclusion of environmental eects such as solvent and presence of interfaces. We use equation-of-motion coupled-cluster (EOM-CC) formalism to study core-ionized and charge-transfer-to-solvent linear (UV-Vis) and non-linear multiphoton spectra of solvated anions. We also developed strategies for reducing the steep scaling of coupled-cluster method and proposed using a new tensor contraction algorithm involving compression of two-electron inte- gral tensors. We used the linear algebra library STRUMPACK developed by Xiaoye et al., that is capable of compressing square matrices and performing algebraic operations on them, to study the compressibility of typical matrices (matricized two-electron integral tensors) appearing in coupled-cluster calculations. The structure of this thesis is as follows, Chapter 1 introduces EOM-CC theory and its exten- sions to valence, core-excited, and core-ionized states. Chapter 2 discusses the basis set require- ments of core-valence separation (CVS) EOM-IP-CCSD calculations. Using a set of second–row molecules (N 2 , CO, etc.), we systematically benchmarked the performance of dierent basis sets xii for core-ionization energies. Our goal was to compare the performance of uncontracted small Pople’s basis sets with large basis sets with tight core basis functions. This study resulted in rec- ommendation of a general basis set scheme for EOM calculations involving core-excitations or ionizations. Chapter 3 describes our study of UV-Vis charge-transfer-to-solvent (CTTS) states for a proto- typical aqueous anion. We simulated equilibrium dynamics of thiocyanate anion in bulk water us- ing classical force-elds and density functional theory (DFT) based quantum mechanics/molecular mechanics (QM/MM) ab initio molecular dynamics (AIMD) method and compare the solvation dynamics. We describe the diculties of obtaining accurate UV-Vis spectra in the condensed phase and the sensitivity of the CTTS spectra to the method chosen to simulate dynamics. We also developed a cost-eective protocol that can be generalized to studying other anions in the condensed phase. This study serves as the rst step in our eorts to model multi-photon spec- troscopies such as two-photon absorption (2PA) and sum-frequency generation (SFG) of aqueous anions. In Chapter 4 we discuss an exploratory study aimed at reducing the cost of many-body cal- culations. Specically, we analyze the feasibility of implementing a matrix compression linear algebra package in Q-Chem. We studied the eects of matrix compression algorithms on tensor contraction and storage using compressed matricized two-electron integral tensors. We identi- ed matrices involved in post-Hartree Fock calculations that could be compressed, and bench- marked memory saving and contraction timings against Q-Chem’s internal library. This proof- of-principle study lays the foundation of future work to take advantage of compressibility of real-world matrices involved in quantum chemical calculations. xiii Chapter 5 outlines the future directions for studying anions in condensed phase, i.e., anions in air-water interface used in non-linear spectroscopies such as SFG and modeling 2PA in bulk water using strategies developed in Chapter 3. We also highlight the application of core-level methods to systems in condensed phase and further development in benchmarking compressed matrices. xiv Chapter1 Introduction As experimental spectroscopic techniques become more sophisticated, interpreting the results is becoming increasingly challenging[1]. Theoretical modeling allows one to connect spectroscopic observables with ne details of molecular structure. Dierent spectroscopies require dierent theoretical models to explain spectra and other observables. For example, X-ray spectroscopies such as X-ray absorption[2, 3] or photoelectron spectroscopy operate using excitations and ion- izations of core–level electrons. Non-linear spectroscopies, such as second harmonic generation (SHG)[4–6] are used to investigate interfacial processes and determine nonlinear susceptibility of materials. Computational spectroscopy bridges the experimental observables and the underly- ing physical properties of a system. Spectroscopic modeling requires that the quantum–chemical method of choice be robust and exible enough to accommodate a wide range of electronic struc- tures. Many experiments are conducted in the condensed phase where the environment strongly perturbs the electronic structure of a molecule[7–10] and can lead to new transitions (e.g., anions exhibit broad bands in the UV region only when solvated). Therefore, inclusion of the envi- ronment in the computational model is rather important for accurate description of spectra and 1 properties. However, one also needs to consider the steep scaling of high-level quantum chem- ical methods. This balancing act between accuracy and cost is a major challenge in studying excited-state properties of solvated molecules. Light-matter interactions can result in the production of open-shell species[11] and a reliable theoretical model should be able to treat open-shell and closed-shell species on an equal footing. Equation-of-motion coupled-cluster (EOM-CC) formalism[11–17] is one such method — it can treat multiple open-shell patterns through the use of dierent EOM operators, as described in the next section. The research presented in this thesis involves application of EOM-CC methods to describe core-ionization, UV-Vis, and nonlinear spectra such as two-photon absorption (2PA) in the con- densed phase. I also investigated a new tensor contraction algorithm involving compressed ma- trices with the potential to reduce the storage and computational cost of CC calculations. The results of the research present in this thesis were published in the following: 1. R. Sarangi, M. L. Vidal, S. Coriani, and A. I. Krylov, “On the basis set selection for calcu- lations of core-level states: Dierent strategies to balance cost and accuracy", Mol. Phys., 118, e1769872 (2020) – Chapter 2. 2. R. Sarangi, K. Nanda, and A.I. Krylov, “Charge-transfer-to-solvent states provide a sensitive spectroscopic probe of the local solvent structure around anions", Mol. Phys., e2148582 (2022) – Chapter 3. 2 1.1 Equation-of-motioncoupled-clusterframework EOM-CC is a versatile theoretical framework capable of describing open-shell and electronically excited systems within a single-reference formalism[18]. In the EOM formalism the target–state wavefunction,j exc i, is described as, j exc i =Re T j 0 i; (1.1) whereR is a general excitation operator acting on the reference CC wavefunction,e T j 0 i. At the lowest level of theory, the cluster operator,T , includes all singly and doubly excited determinants giving rise to the coupled-cluster singles and doubles (CCSD) ansatz T =T 1 +T 2 = X ia t a i a y i + 1 4 X ijab t ab ij a y b y ji; (1.2) where t a i and t ab ij are the cluster amplitudes, a y and i are creation and annihilation operators, respectively. Thej 0 i in Eq. (1.1) represents the Hartree-Fock (HF) vacuum, the choice of which along with the choice ofR denes dierent variants of EOM-CC[17, 19, 20] method, as shown in Fig. 1.1. The R operator is truncated in a consistent manner with the T operator, e.g., for EOM-CCSD R =R 0 +R 1 +R 2 : (1.3) 3 The amplitudes ofR and their corresponding eigenvalues are obtained by diagonalizing the sim- ilarity transformed Hamiltonian, H =e T He T , in the space of the target congurations dened by the EOM-CC variant HR =ER: (1.4) Figure1.1: ReferenceandtargetstatesfordierentEOM-CCvariants. Reproduced from Ref. 19. Fig. 1.1 shows several variants of the EOM-CCR operator connecting appropriate reference choice to dierent types of target states. The ionization potential (IP) and double IP (R IP and R DIP ) are singly and doubly ionizing operators, i.e, they connect N-1 and N-2 electron target states and a closed-shell N-electron reference, respectively. Similarly, electron attachment (EA) 4 and double EA (R EA andR DEA ) operators describe N+1 and N+2 electron states starting from an N electron reference. Electron excitation operator (R EE ) is an electron number conserving operator that can describe singlet and triplet transitions. Finally, spin-ip operator (R SF ) acts on high-spin reference to describe multi-congurational states ipping spin of an electron. The advantage of EOM approach lies in the fact that they are size-extensive and their accuracy can be systematically improved by including higher excitations. Fig. 1.1 also highlights the fact that the EOM methods are multi-state methods, i.e., multiple target states are obtained by the diagonalization of the similarity–transformed Hamiltonian. This treatment enables calculations of transition properties between target states. Furthermore, the reference does not have to be the ground state of the system, which extends the applicability of these methods to systems with dicult to treat ground-state wavefunctions. The choice of reference signicantly aects the quality of target states description, and hence, spin-pure and closed shell references are preferred. In my research, I focused on describing electronically excited states and core-ionized states using the EOM-EE and EOM-IP approaches. 1.1.1 EOM-CCmethodforelectronicallyexcitedstates(EOM-EE-CCSD) The EOM approach for calculating electronically excited states is called EOM-EE[21]. The com- putational cost of EOM-CCSD isO (N 6 ) with respect to the system size, N, the same as that for conguration interaction singles and doubles (CISD). Despite the conceptual similarities, the EOM-EE-CCSD method is size-extensive, in contrast to CISD. As shown in Fig. 1.1, the target EE 5 states can be described with the referencej 0 i corresponding to the ground-state wavefunction andR that conserves the number o electrons and total spin-projection R EE =r 0 + X ia r a i a y i + 1 4 X ijab r ab ij a y b y ji; (1.5) where,i;j;::: denote occupied orbitals,a;b;::: denote virtual orbitals in 0 , andr denotes the EOM amplitudes. In Chapter 3, we describe the excited states of thiocyanate anion (SCN ) solvated in water, including local as well as charge-transfer-to-solvent (CTTS) excitations[22]. These excitations are solvent supported — small anions such as halides, OH , and SCN do not have bound excited states in gas phase. This means that solvent needs to be included explicitly in order to properly describe the CTTS excitations, which is expensive given the steep scaling of EOM-EE-CCSD. We carefully benchmarked protocols for selection of an appropriate QM region, basis sets, and other cost-saving approximations such as using single precision calculations, with the goal being to create a general computational protocol that balances cost and accuracy and can be applied to other solvated anions. 1.1.2 EOM-CCmethodforcore-ionizedstates The operatorR can be chosen to add or remove electrons from the reference wavefunction, which leads to the EOM-IP (ionization potential) and EOM-EA (electron attachment) variants[18]. In the EOM-IP-CCSD method,R is a 1 hole (1h) and 2 hole 1 particle (2h1p) operator, R IP = X i r i i + 1 2 X ija a y ji (1.6) 6 Describing core-ionization using the standard EOM-IP ansatz presents unique challenges relative to valence states. First, core-level states lie very high in energy, for example, core ionization energy (IE) of carbon, nitrogen, and oxygen 1s electron is around 290, 400, and 540 eV. While this makes the core IEs are highly element–specic, it also means that core-level states lie above the ionization onset and are coupled with the continuum of valence excitations. For a numerically stable calculation using standard solvers, these continuum states need to be projected out. This can be achieved by core-valence separation (CVS) [23] scheme, which modies theR IP operator to include at least one core orbital in each excitation term[24] R CVSIP = X I r I I + 1 2 X Ija r a Ij a y jI; (1.7) where I denotes a core orbital. Second, a core-hole resulting from core-ionization/excitation de-shields the nucleus and in- troduces signicant perturbation to other orbitals. These orbital relaxation eects must be ac- counted for by the method, which leads to special basis-set requirements for core-level states. We investigated the eect of increasing the variational exibility available to the core and valence orbitals by uncontracting core and valence functions[25] of Pople’s and Dunning’s basis sets. In this study, we calculated core IEs of molecules with second–row atoms, such as carbon, nitrogen, and oxygen, using the CVS-EOM-IP method and compared them against the IEs obtained using large basis sets with additional tight-core functions (aug-cc-pCVXZ)[26]. 7 1.2 Costofcoupled-clustercalculations A common theme in this thesis is the balancing act between cost and accuracy of EOM-CC cal- culations. Table 1.1 shows the scaling of electronic structure methods with molecular size and estimates of the maximum molecular sizes these calculations can be applied to in the 90s and in 2010s[27, 28]. The table highlights how the advancement in high performance computing (HPC) resources has increased the range of application of electronic structure methods to larger systems. However, the dierence in molecular sizes for Hartree-Fock (HF) and CCSD with per- turbative triples (CCSD(T)) is vast, because ofO (N 7 ) scaling of the latter. The storage cost of CCSD(T) scales asO (N 4 ), which means that doubling the system size requires approximately 128 times more computing power and 16 times more storage. This steep scaling of CC family of methods limits their application to relatively small systems. Table1.1: Scalingofelectronicstructuremethodswithsystemsize,N,andcomparisonofmax- imumsystemsizeaccessibleinthe1990sand2010s. Method Scaling Estimate of max. system size 1990s 2010s HF, DFT N 3 50–200 100,000+ MP2 N 5 25–50 1000+ CCSD, EOM-CCSD N 6 10–15 100+ CCSD(T) N 7 8–12 25–30 In addition to the improvement in HPC resources, signicant developments have been made to improve the underlying algorithms of the methods mentioned in Table 1.1. Linear scaling HF and DFT methods[29–31], local CC methods[32–35] that take advantage of the “near-sightedness"[36] of electronic interactions have garnered signicant interest and are available in software packages like ORCA[37]. However, these methods are not widely used due to their sensitivity to cuto parameters and system dependent performance[38, 39]. 8 One avenue of reducing the cost of CC calculations lies in eective handling and storage of matrices involved in these calculations. One of the most computationally and storage-demanding steps in CCSD calculations is a tensor contraction of the form A ab ij = X cd t cd ij I ab cd ; (1.8) wheret cd ij represents a CC amplitude andI ab cd is a two-electron integral tensor. In matrix repre- sentation, the size oft cd ij isN 2 occ N 2 vir , andI ab cd isN 2 vir N 2 vir , whereN occ andN vir are the number of occupied and virtual orbitals in the calculation, and the contraction itself scales asO (N 6 ) using a direct implementation. I explored methods for optimizing the storage of the two-electron integral (TEI) tensors and explored new contraction algorithms involving compressed matrices with the goal of reducing the cost of tensor contractions such as the one presented above. Using a linear algebra library de- veloped by Xiaoyeetal. calledSTRUMPACK[40], which is capable of compressing square matrices and performing linear algebra operations on such compressed objects, we explored methods of integrating it into Q-Chem. The goal of this pilot study was to establish that TEIs are compress- ible and performing tensor contractions in STRUMPACK using compressed matrices is a viable strategy. 9 Bibliography [1] V. Barone, S. Alessandrini, M. Biczysko et al. Computational molecular spectroscopy, Nat Rev Methods Primers.1, 38, (2021) [2] W.D. Derricotte and F.A. Evangelista, Simulation of x-ray absorption spectra with orthogo- nality constrained density functional theory, Phys. Chem. Chem. Phys.17, 14360 (2015). [3] J. Wenzel, M. Wormit, and A. Dreuw, Calculating core-level excitations and x-ray absorption spectra of medium-sized closed-shell molecules with the algebraic-diagrammatic construc- tion scheme for the polarization propagator, J. Comput. Chem.35, 1900 (2015). [4] J. Kongsted, A. Osted, K.V. Mikkelsen, and O. Christiansen, Second harmonic generation second hyperpolarizability of water calculated using the combined coupled cluster dielectric continuum or dierent molecular mechanics methods, J. Chem. Phys.120, 3787 (2004). [5] R.K. Lam, S.L. Raj, T.A. Pascal, C.D. Pemmaraju, L. Foglia, A. Simoncig, N. Fabris, P. Miotti, C.J. Hull, A.M. Rizzuto, J.W. Smith, R. Mincigrucci, C. Masciovecchio, A. Gessini, E. Al- laria, G. De Ninno, B. Diviacco, E. Roussel, S. Spampinati, G. Penco, S. Di Mitri, M. Trovó, M. Danailov, S.T. Christensen, D. Sokaras, T.-C. Weng, M. Coreno, L. Poletto, W.S. Drisdell, D. Prendergast, L. Giannessi, E. Principi, D. Nordlund, R.J. Saykally, and C.P. Schwartz, Soft X-ray second harmonic generation as an interfacial probe, Phys. Rev. Lett. 120, 023901 (2018). [6] P. B. Petersen, R. J. Saykally, M. Mucha, and P. Jungwirth, Enhanced concentration of polar- izable anions at the liquid water surface: SHG spectroscopy and MD simulations of sodium thiocyanide, J. Phys. Chem. B109, 10915 (2005). [7] C. Reichardt, SolventsandSolventEectsinOrganicChemistry. VCH: Weinheim, 2nd edition, 1990. [8] A. DeFusco, N. Minezawa, L.V. Slipchenko, F. Zahariev, and M.S. Gordon, Modeling solvent eects on electronic excited states, J. Phys. Chem. Lett.2, 2184 (2011). [9] A. DeFusco, J. Ivanic, M.W. Schmidt, and M.S. Gordon, Solvent-induced shifts in electronic spectra of uracil, J. Phys. Chem. A115, 4574 (2011). [10] K. B. Bravaya, M. G. Khrenova, B. L. Grigorenko, A. V. Nemukhin, and A. I. Krylov, Eect of protein environment on electronically excited and ionized states of the green uorescent protein chromophore, J. Phys. Chem. B115, 8296 (2011). 10 [11] A. I. Krylov, The quantum chemistry of open-shell species, in Reviews in Comp. Chem., edited by A. L. Parrill and K. B. Lipkowitz, volume 30, pages 151–224. J. Wiley & Sons, 2017. [12] J. Geertsen, M. Rittby, and R. J. Bartlett, The equation-of-motion coupled-cluster method: Excitation energies of Be and CO, Chem. Phys. Lett.164, 57 (1989). [13] J. F. Stanton and R. J. Bartlett, The equation of motion coupled-cluster method. A systematic biorthogonal approach to molecular excitation energies, transition probabilities, and excited state properties, J. Chem. Phys.98, 7029 (1993). [14] J. F. Stanton and J. Gauss, Analytic energy derivatives for ionized states described by the equation-of-motion coupled cluster method, J. Chem. Phys.101, 8938 (1994). [15] M. Nooijen and R. J. Bartlett, Equation of motion coupled cluster method for electron at- tachment, J. Chem. Phys.102, 3629 (1995). [16] S. Pal, M. Rittby, R. J. Bartlett, D. Sinha, and D. Mukherjee, Multireference coupled- cluster methods using an incomplete model space — application to ionization-potentials and excitation-energies of formaldehyde, Chem. Phys. Lett.137, 273 (1987). [17] K. Sneskov and O. Christiansen, Excited state coupled cluster methods, WIREs: Comput. Mol. Sci.2, 566 (2012). [18] A. I. Krylov, Equation-of-motion coupled-cluster methods for open-shell and electronically excited species: The hitchhiker’s guide to Fock space, Annu. Rev. Phys. Chem.59, 433 (2008). [19] D. Casanova and A. I. Krylov, Spin-ip methods in quantum chemistry, Phys. Chem. Chem. Phys.22, 4326 (2020). [20] R. J. Bartlett, Coupled-cluster theory and its equation-of-motion extensions, WIREs: Com- put. Mol. Sci.2, 126 (2012). [21] A. I. Krylov, C. D. Sherrill, and M. Head-Gordon, Excited states theory for optimized orbitals and valence optimized orbitals coupled-cluster doubles models, J. Chem. Phys. 113, 6509 (2000). [22] M. Luria and A. Treinin, Photochemistry of thiocyanate ion in solution, J. Phys. Chem.72, 305 (1968). [23] L. S. Cederbaum, W. Domcke, and J. Schirmer, Many-body theory of core holes, Phys. Rev. A22, 206 (1980). [24] S. Coriani and H. Koch, Communication: X-ray absorption spectra and core-ionization potentials within a core-valence separated coupled cluster framework, J. Chem. Phys.143, 181103 (2015). [25] N. A. Besley, A. T. B. Gilbert, and P. M. W. Gill, Self-consistent-eld calculations of core excited states, J. Chem. Phys.130, 124308 (2009). 11 [26] T. H. Dunning Jr., Gaussian basis sets for use in correlated molecular calculations, I. The atoms boron through neon and hydrogen, J. Chem. Phys.90, 1007 (1989) [27] M. Head-Gordon, Quantum Chemistry and Molecular Processes, J. Phys. Chem.100, 13213 (1996) [28] R. J. Harrison and R. Shepard, Ab-Initio Molecular Electronic Structure on Parallel Comput- ers, Annu. Rev. Phys. Chem.45, 623 (1994) [29] W. Yang and T. Lee, A density-matrix divide-and-conquer approach for electronic structure calculations of large molecules, J. Chem. Phys.103, 5674 (1995) [30] X. He and K. M. Merz Jr., Divide and Conquer Hartree-Fock Calculations on Proteins, J. Chem. Theory Comput.6, 405 (2010) [31] M. S. Gordon, D. G. Fedorov, S. R. Pruitt, and L. V. Slipchenko, Fragmentation Methods: A Route to Accurate Calculations on Large Systems, Chem. Rev.112, 632 (2012) [32] M. Schutz, Low-order scaling local electron correlation methods. III. Linear scaling local perturbative triples correction (T), J. Chem. Phys.113, 9986 (2000) [33] P. E. Maslen, A. D. Dutoi, M. S. Lee, Y. H. Shao, and M. Head-Gordon, Accurate local ap- proximations to the triples correlation energy: formulation, implementation and tests of 5th-order scaling models, Mol. Phys.103, 425 (2005) [34] H. J. Werner and M. Schutz, An ecient local coupled cluster method for accurate thermo- chemistry of large systems, J. Chem. Phys.135, 144116 (2011) [35] P. R. Nagy and M. Kallay, Optimization of the linear-scaling local natural orbital CCSD(T) method: Redundancy-free triples correction using Laplace transform, J. Chem. Phys. 146, 214106 (2017) [36] W. Kohn, Nobel Lecture: Electronic structure of matter-wave functions and density func- tionals, Rev. Mod. Phys.71, 1253 (1999) [37] F. Neese, F. Wennmohs, U. Becker, and C. Riplinger, The ORCA quantum chemistry program package, J. Chem. Phys.152, 224108 (2020) [38] D. G. Liakos, M. Sparta, M. K. Kesharwani, J. M. L. Martin, and F. Neese, Exploring the Accuracy Limits of Local Pair Natural Orbital Coupled-Cluster Theory, J. Chem. Theory Comput.11, 1525 (2015) [39] L. Gyevi-Nagy, M. Kallay, and P. R. Nagy, Accurate Reduced-Cost CCSD(T) Energies: Parallel Implementation, Benchmarks, and Large-Scale Applications, J. Chem. Theory Comput.17, 860 (2021) [40] J. Xia, S. Chandrasekaran, M. Gu, and X. S. Li, Fast algorithms for hierarchically semisepa- rable matrices, Numer. Linear Algebra Appl. 953 (2010) 12 Chapter2 Onthebasissetselectionforcalculationsofcore-level states: Dierentstrategiestobalancecostandaccuracy 2.1 Introduction Owing to their unique capabilities, spectroscopies exploiting core-level transitions are gaining popularity[1–5]. The transitions involving core electrons are element-specic because of large energy gaps (hundreds of electron-volts) between dierent edges, yet they are sensitive to the chemical environment. Compact and localized shapes of core orbitals result in local sensitivity, which is particularly important for designing spectroscopic probes of the electronic structure and dynamics in complex environments. As in the case of valence spectroscopies[6], theoretical modeling is crucially important for the interpretation of the experimental spectra. Thus, the progress in experimental techniques, ranging from advanced light sources to table-top X-ray instruments, has stimulated vigorous theoretical developments[7, 8]. At rst glance, electronic structure behind valence (UV-VIS) and core-level (X-ray) spectro- scopies appear to be similar. For example, modeling photoelectron spectra entails calculations 13 ofN-1-electron states of a neutral molecule. Since the molecular Schrödinger equation contains the solutions for all states, regardless of their energy, one may expect that the same quantum- chemistry method could be used to describe both valence-ionized and core-ionized states. This is, however, not the case: although the Schrödinger equation is the same, the approximations to it, which are used to construct a practical quantum chemistry method (i.e., theoretical model chemistry, in John Pople’s terms[9]) may lead to manifestly dierent outcomes in the valence and core domains. Theoretical model chemistry[9] is dened by the pair of approximations: one to the many- body problem (correlation treatment) and one to a one-electron basis set used to represent molec- ular orbitals and construct Slater determinants. Equation-of-motion coupled-cluster (EOM-CC) theory[10, 13–15, 43, 44] provides an eec- tive and robust treatment of electron correlation and is capable of a treating multiple electronic states on the same footing. Its accuracy can be systematically improved up to the exact limit. These properties make EOM-CC the method of choice for spectroscopy modeling. Challenges in correlated treatment of core-level states and possible solutions have been analyzed in re- cent reviews[7, 8] and original research papers[16–25, 27–34, 47]. Particularly eective is the extension of the EOM-CC methods to core-level states via the core-valence separation (CVS) scheme[24, 25, 29, 35, 36]. Numerous benchmarks illustrated that CVS-EOM-IP/EE-CCSD [24, 25, 27, 29, 30, 37] provides an eective and reliable description of core-ionized and core-excited states, including treatment of non-linear optical properties such as RIXS cross sections[31–33, 38]. The special requirements to one-electron basis sets in calculations of core-level states have been recognized and documented in many papers[16, 20, 39–52]. In a nutshell, obtaining con- verged and accurate results for core-level states requires considerably larger bases than needed 14 for their valence counterparts. This high sensitivity of the results to the one-electron basis is observed already in uncorrelated calculations, e.g., at the Hartree-Fock or Kohn-Sham DFT levels[16, 46, 48, 50–52]. Its physical origin is a strong perturbation caused by the creation of a core hole as a result of removing or exciting core electrons. To describe the ensuing changes in electronic structure, traditionally referred to as orbital relaxation, suciently exible basis sets are needed. Several studies pointed out that at the Hartree-Fock or DFT SCF levels, the cc-pCVTZ basis[53], designed to describe core-valence correlation eects, delivers good perfor- mance for core-ionized states[16, 39, 48, 49]. This is a considerably larger basis that typically used in uncorrelated calculations; hence, a number of strategies towards designing bases sets capable of providing a balanced treatment of the parent neutral and target core-ionized species have been explored. In particular, IGLO bases (individual gauge for localized orbitals), originally developed to improve the description of the electron density around the nuclei (which is needed for NMR spectroscopy), were found to provide good balance between accuracy and cost[40, 47, 50]. A more general strategy was introduced by Besley and coworkers[50], who have shown that core relaxation eects can be eectively described by augmenting standard bases with functions for the next highest nuclear charge (Z + 1) than the element that is being ionized. By usingZ + 1 augmenting functions, SCF core-ionization energies computed with double- bases (6-31G* and cc-pVDZ) are within 0.5 eV from the cc-pCVQZ results[50]. This idea was further developed by Ambroise and Jensen, who proposed to use functions with interpolated exponents (betweenZ andZ+1) within polarization-consistent basis sets [51]. They observed a near-optimal balance of treating the neutral and core-ionized states with bases augmented byZ + 1 2 functions. In correlated calculations, the basis-set requirements are higher, as the basis should be su- ciently exible to treat both orbital relaxation and electron correlation in the parent and target 15 species. Thus, an optimal basis should aord a more exible description of the core and a bal- anced treatment of the electron correlation. In the previous CVS-EOM-CC benchmark studies, series of standard contracted basis sets have been tested[36, 41]. In Ref. [41], Coriani and cowork- ers investigated the convergence with respect to the basis-set size with coupled-cluster methods of increasing complexity (CC2, CCSD, CC3, and CCSDT). The largest bases tested in this study were aug-cc-pCV5Z and d-aug-cc-pCV5Z. The authors observed monotonous decrease of the computed ionization and excitation energies towards the experimental values upon increasing the basis-set cardinal number, which illustrates that the target core-level states are more sensi- tive to the basis set than the ground-state reference. Coriani and co-workers[41] exploited this smooth convergence of the results to extrapolate the computed excitation energies to the com- plete basis set (CBS) limit. They noted that for ionization energies the aug-cc-pCV5Z basis (the largest used for ionization energies in their study) appears to be close to the convergence limit, judging from the small dierences between this and smaller bases. The authors also noted good performance of (aug)-cc-pCVTZ: the core IEs were within 0.1 eV from the aug-cc-pCV5Z results. While additional diuse functions are required to properly describe core-excited states, they were found to be less important for core-ionized states. The relativistic eects were found to be less sensitive to the basis set[16, 36]. These observations conrm that the main reason for extended basis sets in core-level calculations is due to orbital relaxation. Here we systematically explore an alternative strategy, used by Gill and coworkers[16] and by us in recent applications[31, 38]. Instead of following the hierarchy of Dunning’s bases, op- timized to describe electron correlation in ground-state molecules, we build series of basis sets by uncontracting the core and valence functions. We consider Pople’s and Dunning’s sets and show that using uncontracted Pople’s bases[54, 55] is much more eective that using Dunning’s 16 bases[53, 56, 57]. Our results provide a simple guideline for choosing basis sets for calculations of core-level states. In addition to eective basis-set choices, we also briey explore other cost- saving strategies. 2.2 Study design, theoretical methods, and computational protocols In this study, we focus on the calculations of ionized states using the fc-CVS-EOM-IP-CCSD method[29, 58, 59], with the goal of investigating the ability of various basis sets to describe orbital relaxation eects at a correlated level of theory. By focusing on core-ionized states, we can investigate perturbation of the electronic structure due to creation of the core hole. Because excitation of core electrons also creates a core hole, the results should be largely transferable to core-excited states, with the caveat that the calculations of the XAS transitions require additional diuse functions to describe states of Rydberg character. While we provide experimental results when available, our main emphasis is not on the dierences relative to the experiment, but rather on the convergence of the theoretical values to the basis-set limit. To explain the rational behind the design of our study, let us briey discuss the eects caused by the removal of a core electron. Because core orbitals are compact, they screen the nuclear charge much more eectively than the valence orbitals do. Thus, removing a core electron from an atom is roughly equivalent to increasing the nuclear charge by one, in terms of the Coulomb eld experienced by the remaining electrons (this is the rational behind theZ + 1 approach of Besley[50, 51]). This increased Coulomb attraction causes the valence atomic orbitals to collapse toward the nucleus. To describe such collapse, the basis set must have signicant radial exibility; 17 angular exibility is less important. For this reason, one should use at least a triple- (or better) basis. The collapse of the core orbitals has even a larger energetic eect because of the large contribution of core electrons to the total electronic energy. According to Slater’s rules [60], the shielding eect of one 1s electron on the other 1s electron is roughly 0.3 protons. This core collapse has huge energetic consequences; thus, it is essential to describe it well to obtain accurate results. The basis, therefore, should include a sucient number of the core functions. This can be achieved by choosing polarized-core Dunning’s sets (cc-pCVXZ) or by decontracting the core functions, such as the “6-” core function in the split-valence Pople bases, as was done in Refs. [16, 31, 38]. Core-correlation eects are considerably smaller (in energy) than these “radial collapse” eects and, for that reason, one may expect core-correlated basis sets to be less eective than core-decontracted ones [16]. To verify whether this expectation holds when using a high-level correlated method, we consider series of the original contracted Dunning and Pople basis sets of a triple- quality and above and their partially or fully decontracted variants. The full description of the basis sets is given below. 2.2.1 ComputationalDetails All calculations were carried out using the Q-Chem electronic structure program[54, 61]. We employ the fc-CVS-EOM-IP-CCSD method[29] in which the target ionized states are described by the following ansatz: N1 = (R 1 +R 2 )e T 1 +T 2 0 (N); (2.1) where 0 (N) denotes the reference determinant of anN-electron system, the singles and doubles excitation operatorsT 1 andT 2 contain the amplitudes for the reference state obtained by solving 18 CCSD equations. The excitation operators R 1 and R 2 contain the EOM amplitudes obtained solving an EOM eigenproblem. WhileT 1 andT 2 are particle- and spin-conserving operators, the EOM-IP operators are of an ionizing type: R 1 = X i r i i; R 2 = 1 2 X ija r a ij a y ji: (2.2) Following the standard notation, indicesi;j;k;::: denote occupied orbitals anda;b;c;::: denote virtual orbitals, as dened by the choice of the reference determinant 0 . In fc-CVS-EOM-IP- CCSD, the core electrons are frozen at the CCSD step (i.e., respective amplitudes inT 1 andT 2 are zero) and the EOM amplitudes should involve at least one core orbital, as prescribed by the CVS scheme. The denition of the core in our CVS scheme depends on the edge[29]: the edge of interest and all lower edges are frozen while all higher edges are active. In this study we focus on molecules containing rst- and second-row elements (C,N,O, and H). Thus, in calculations of the carbon edge, the standard denition of the frozen core is used: all 1s orbitals of the second-row atoms are frozen. In calculations at the nitrogen edge, only oxygen and nitrogen 1s orbitals are frozen while carbon’s 1s orbitals are active. Likewise, in calculations at the oxygen edge, only 1s O orbitals are frozen and the rest of the core orbitals are active. Our benchmark set comprises two simple diatomics, carbon monoxide (CO) and dinitrogen (N 2 ), three hydrides (water, ammonia, methane), and two polyatomic molecules, acrolein and glycine. This set allows us to investigate basis set eects for carbon, nitrogen, and oxygen edges, including molecules with several atoms of the same type and molecules with more than one edge. 19 The calculations for dinitrogen and carbon monoxide were carried out at the experimental ge- ometries taken from Ref. [63] (R NN = 1.097685 Å andR CO =1.128323 Å). The hydrides’ structures were taken from Ref. [47], where they were optimized with RI-MP2/cc-pVTZ. For acrolein (Fig. 2.1), we used an MP2/cc-pVTZ optimized structure. Glycine calculations were performed for the canonical isomer (the main form of the gas-phase glycine) using the RI-MP2/cc-pVTZ optimized structure taken from Ref. [47] (Fig. 2.2). Figure2.1: Acroleinstructurewithatomlabels. We usedQ-Chem’s default convergence thresholds, except for the EOM amplitudes for which a tighter threshold was used. SCF convergence was 10 8 , CCSD convergence was 10 6 , and the Davidson convergence was 10 7 . In single-precision calculations (cf Sec. 2.3.5), CCSD conver- gence thresholds were 10 4 for amplitudes and 10 5 for energies. The basis sets were decontracted manually and inputed as user-specied bases. For each basis, we considered two decontracted versions: one in which only the core orbitals were decontracted Figure2.2: Glycinestructure(canonicalisomer)withatomlabels. 20 (this converts one core function from the 6-311+G(3df) basis set into six variationally independent functions) and one in which all functions were uncontracted. Using 6-311+G(3df) as an example, the latter procedure amounts to converting a triple- basis into a 5- one. The redundant basis functions, which appear in decontracted Dunning’s sets, were removed from the calculations. We note that in the segmented bases with optimized contractions (such as Pople’s bases), there is a signicant overlap between the exponents of the primitives in the contracted core and in the valence functions; thus, one may expect that further optimization of fully uncontracted bases is possible. In this study we only removed exactly redundant functions and did not attempt to remove strongly overlapping ones. Table 2.1 collects the basis sets used in this study, their contraction schemes, and the number of basis functions per atom for the second row elements. It also introduces short-hand notations for the uncontracted bases. We used pure angular momentum functions (5d, 7f, 9h, 11i,:::) for all bases. For Dunning’s bases, we used the versions with optimized contraction, as implemented in Q-Chem. The aug-cc-pCV5Z and aug-cc-pV6Z bases were taken from the Basis Set Exchange database[64], without optimizing the general contractions (numeric tests indicated that using the variants of these bases with optimized general contractions leads to essentially the same results). 21 Table2.1: Basissets,contractionschemes,andthenumberoffunctionsperatom a . Basis Contraction level Contraction scheme #b.f. 6-311+G(2df) Original (12s6p2d1f)/[5s4p2d1f] 34 uC-6-311+G(2df) Core-uncontracted (12s6p2d1f)/[10s4p2d1f] 39 u-6-311+G(2df) Fully uncontracted (12s6p2d1f)/[12s6p2d1f] 47 6-311+G(3df) Original (12s6p3d1f)/[5s4p3d1f] 39 uC-6-311+G(3df) Core-uncontracted (12s6p3d1f)/[10s4p3d1f] 44 u-6-311+G(3df) Fully uncontracted (12s6p3d1f)/[12s6p3d1f] 52 aug-cc-pVTZ Original (11s6p3d2f)/[5s4p3d2f] 46 uC-aug-cc-pVTZ Core-uncontracted (11s6p3d2f)/[11s4p3d2f] 52 u-aug-cc-pVTZ Fully uncontracted (11s 6p 3d 2f)/[11s 6p 3d 2f] 58 aug-cc-pVQZ Original (13s7p4d3f)/[6s5p4d3f2g] 80 uC-aug-cc-pVQZ Core-uncontracted (13s7p4d3f2g)/[13s5p4d3f2g] 87 u-aug-cc-pVQZ Fully uncontracted (13s7p4d3f2g)/[13s7p4d3f2g] 93 aug-cc-pV5Z Original (15s9p5d4f3g2h)/[7s6p5d4f3g2h] 127 uC-aug-cc-pV5Z Core-uncontracted (15s9p5d4f3g2h)/[15s6p5d4f3g2h] 135 u-aug-cc-pV5Z Fully uncontracted (15s9p5d4f3g2h)/[15s9p5d4f3g2h] 144 aug-cc-pV6Z Original (17s11p6d5f4g3h2i)/[8s7p6d5f4g3h2i] 189 uC-aug-cc-pV6Z Core-uncontracted (17s11p6d5f4g3h2i)/[17s7p6d5f4g3h2i] 198 u-aug-cc-pV6Z Fully uncontracted (17s11p6d5f4g3h2i)/[17s11p6d5f4g3h2i] 210 aug-cc-pCVTZ Original (13s8p4d2f)/[7s6p4d2f] 59 uC-aug-cc-pCVTZ Core-uncontracted (13s8p4d2f)/[13s6p4d2f] 65 u-aug-cc-pCVTZ Fully uncontracted (13s8p4d2f)/[13s8p4d2f] 71 aug-cc-pCVQZ Original (16s10p6d4f2g)/[9s8p6d4f2g] 109 uC-aug-cc-pCVQZ Core-uncontracted (16s10p6d4f2g)/[16s8p6d4f2g] 116 u-aug-cc-pCVQZ Fully uncontracted (16s10p6d4f2g)/[16s10p6d4f2g] 122 aug-cc-pCV5Z Original (19s13p8d6f4g2h)/[11s10p8d6f4g2h] 181 uC-aug-cc-pCV5Z Core-uncontracted (19s13p8d6f4g2h)/[19s10p8d6f4g2h] 189 u-aug-cc-pCV5Z Fully uncontracted (19s13p8d6f4g2h)/[19s13p8d6f4g2h] 198 a For a 2nd row element (C,N,O, etc). 22 2.3 Resultsanddiscussion 2.3.1 Nitrogenmolecule,N 2 The results for N 2 are collected in Table 2.2 and shown graphically in Fig. 2.3. Table 2.2 shows the total CCSD energy of the neutral reference state and two core IEs, corresponding to ionization from u (1s) (lower value, IE1) and g (1s) (higher value, IE2) orbitals. The total energies show anticipated trends: they decrease upon uncontraction and the magnitude of the decrease is larger when the valence orbitals are uncontracted than when only the core orbitals are uncontracted. The magnitude of this decrease is larger for Pople’s bases than for Dunning’s bases, which is also expected because the relative increase in the number of basis functions is larger for Pople’s bases. As noted in the previous EOM-CC benchmark study[41], the IEs decrease monotonously in the series of contracted basis sets of increasing size. Here we observe that the IEs also generally decrease upon uncontraction. This is a manifestation of core-relaxation eects, which lower the energy of the target ionized state. In contrast to the total energies, the drop in IE is always larger when the core orbitals are uncontracted than when the valence orbitals are uncontracted. The magnitude of the change is larger for smaller bases than for larger bases because the increase in the basis size is larger for the smaller bases. We also observe that the changes are rather small when polarized-core basis is used, because these bases already aord sucient exibility in describing core electrons. The results in Table 2.2 show that the IEs reach the basis-set limit within 0.01 eV for aug-cc- pCV5Z (and the respective uncontracted variants), uC-aug-cc-pV6Z/u-aug-cc-pV6Z, and u-aug- cc-pV5Z. The smallest among these bases is aug-cc-pCV5Z, which is not surprising, as this basis 23 has more functions optimized for the core description (although they are optimized for describing the correlation of the core electrons in the ground state). The energy gap between the two core states converges much faster with respect to the ba- sis set than the absolute values of IEs, owing to error cancellation. For example, the dierence between the smallest basis (6-311+G(2df)) and the basis-set limit is less than 0.001 eV. Fig. 2.3 shows the lower core IE ( u ) for all basis sets as a function of the number of basis functions, which clearly indicates the eectiveness of dierent bases in describing core IEs. While it is not surprising that larger bases perform better than smaller bases, as illustrated by the smooth trend of the red curve (original contracted bases sets), the dierence between contracted and core- uncontracted bases is remarkable. For example, uC-aug-cc-pVQZ gives better result than aug- cc-pV5Z, despite being 1.5 times more compact. The performance of core and fully uncontracted Pople’s bases is even more impressive—uncontracted u-6-311+G(3df) (with 52 functions per atom) delivers the same result as uC-aug-cc-pVQZ (87 functions per atom). Overall, the values with fully uncontracted u-6-311+G(3df) are within 0.06 eV from the basis set limit (u-aug-cc-pV5Z). Adding a second set of diuse functions to uC-6-311+G(3df) lowers the IE by 0.002 eV (see Table 2.2). Somewhat unexpectedly, the uncontracted aug-cc-pVTZ bases yield noticeably larger errors relative to the basis set limit than more compact uncontracted u-/uC-6-311+G(3df) bases. Dunning’s bases with core-valence correlation (aug-cc-pCVTZ and aug-cc-pCVQZ, Table 2.2) perform better than their aug-cc-pVXZ counterparts, but are less eective than the uncontracted Pople’s bases, e.g., the aug-cc-pCVTZ result is within 0.16 eV from the basis-set limit, to be com- pared with 0.06 eV for u-6-311+G(3df), despite the former having fewer basis functions (59 versus 52, see Table 2.1). 24 Table2.2: CoreIEsforN 2 ,fc-CVS-EOM-IP-CCSDwithvariousbasissets. Basis CCSD energy a (a.u.) IE1 b (eV) IE2 (eV) IE (eV) 6-311+G(2df) -109.350207 410.6041 410.4996 0.1045 uC-6-311+G(2df) -109.353623 410.0853 409.9802 0.1051 u-6-311+G(2df) -109.357797 409.9451 409.8402 0.1049 6-311+G(3df) -109.354688 410.5026 410.3981 0.1045 uC-6-311+G(3df) -109.357388 410.0750 409.9701 0.1049 u-6-311+G(3df) -109.361379 409.9331 409.8282 0.1049 6-311(2+)G(3df) -109.354828 410.4998 410.3953 0.1045 uC-6-311(2+)G(3df) -109.357517 410.0725 409.9676 0.1049 u-6-311(2+)G(3df) -109.361498 409.9315 409.8266 0.1049 aug-cc-pVTZ -109.361574 410.3500 410.2454 0.1046 uC-aug-cc-pVTZ -109.363030 410.1667 410.0620 0.1047 u-aug-cc-pVTZ -109.366680 409.9827 409.8781 0.1046 aug-cc-pVQZ -109.386793 410.0417 409.9370 0.1047 uC-aug-cc-pVQZ -109.387264 409.9192 409.8143 0.1049 u-aug-cc-pVQZ -109.387920 409.4026 409.7976 0.1050 aug-cc-pV5Z -109.394586 409.9124 409.8073 0.1051 uC-aug-cc-pV5Z -109.394648 409.8791 409.7740 0.1051 u-aug-cc-pV5Z -109.395016 409.8722 409.7671 0.1051 aug-cc-pV6Z -109.397296 409.8853 409.7802 0.1051 uC-aug-cc-pV6Z -109.397313 409.8703 409.7652 0.1051 u-aug-cc-pV6Z -109.397514 409.8665 409.7614 0.1051 aug-cc-pCVTZ -109.365969 410.0192 409.9146 0.1046 uC-aug-cc-pCVTZ -109.366344 410.0033 409.8987 0.1046 u-aug-cc-pCVTZ -109.369113 409.9582 409.8536 0.1046 aug-cc-pCVQZ -109.388436 409.8997 409.7946 0.1051 uC-aug-cc-pCVQZ -109.388576 409.8955 409.7905 0.1050 u-aug-cc-pCVQZ -109.388920 409.8912 409.7861 0.1051 aug-cc-pCV5Z -109.395330 409.8695 409.7644 0.1051 uC-aug-cc-pCV5Z -109.395322 409.8693 409.7642 0.1051 u-aug-cc-pCV5Z -109.395413 409.8685 409.7634 0.1051 a Total energy for the neutral reference state. b Experimental IE u =409.9 eV is taken from Ref. [47]. 25 Figure 2.3: Core IEs for N 2 versus the number of basis functions per atom. Gray shaded area marks0.01eVintervalaroundthebasis-setlimit(u-aug-cc-pV5Z). 2.3.2 Carbonmonoxide,CO The results for CO are collected in Tables 2.3 and 2.4 and shown graphically in Figs. 2.4 and 2.5. The experimental values were taken from Ref. [65]. Overall, the trends for both edges follow very closely the trends observed for N 2 , reinforcing the main nding—impressive eectiveness of un- contracted Pople’s bases in describing the core IEs. For the carbon edge, u-6-311+G(3df) is within 0.01 eV from the basis-set limit (u-aug-cc-pV5Z), whereas for the oxygen edge the dierence is slightly larger (0.1 eV). Here again we observe that while the aug-cc-pCVXZ bases deliver better results than the respective aug-cc-pVXZ variants, they are less eective than the uncontracted Pople’s bases. Using this molecule with two edges as an example, we tested the protocol of using dierent bases for active and inactive edges, e.g., using uncontracted bases for both C and O or using an 26 uncontracted basis for the active edge and an original, contracted basis for the inactive edge. The results show that the dierence between the two schemes is small (except for the smallest basis, 6-311+G(2df)), suggesting an eective compromise for calculations of polyatomic heteronuclear molecules. 27 Figure2.4:CO,carbonedgeIEs.Top:Onlythecarbonbasisisuncontracted,whereastheoriginal matching basis is used for oxygen. Bottom: Both carbon and oxygen bases are uncontracted. Grayshadedareamarksthe0.01eVintervalaroundthebasis-setlimit(u-aug-ccp-pV5Z). 28 Figure2.5:CO,oxygenedgeIEs.Top:Onlytheoxygenbasisisuncontracted,whereastheoriginal matchingbasisisusedforcarbon.Bottom:Bothcarbonandoxygenbasesareuncontracted.Gray shadedareamarksthe0.01eVintervalaroundthebasis-setlimit(u-aug-ccp-pV5Z). 29 Table 2.3: CO, carbon edge. Total and ionization energies; fc-CVS-EOM-IP-CCSD with various basissets. Basis on C Basis on O CCSD energy a (a.u.) IE b (eV) 6-311+G(2df) 6-311+G(2df) -113.133212 296.8822 uC-6-311+G(2df) 6-311+G(2df) -113.134520 296.4147 uC-6-311+G(2df) -113.142902 296.3908 u-6-311+G(2df) 6-311+G(2df) -113.136044 296.3176 u-6-311+G(2df) -113.122722 296.2963 6-311+G(3df) 6-311+G(3df) -113.139488 296.7699 uC-6-311+G(3df) 6-311+G(3df) -113.140761 296.3917 uC-6-311+G(3df) -113.142902 296.3908 u-6-311+G(3df) 6-311+G(3df) -113.142200 296.2955 u-6-311+G(3df) -113.147177 296.2968 aug-cc-pVTZ aug-cc-pVTZ -113.1445197 296.6638 uC-aug-cc-pVTZ aug-cc-pVTZ -113.144989 296.4907 uC-aug-cc-pVTZ -113.145565 296.4911 u-aug-cc-pVTZ aug-cc-pVTZ -113.146335 296.3600 u-aug-cc-pVTZ -113.149699 296.3623 aug-cc-pVQZ aug-cc-pVQZ -113.171609 296.4291 30 uC-aug-cc-pVQZ aug-cc-pVQZ -113.171748 296.3163 uC-aug-cc-pVQZ -113.172132 296.3164 u-aug-cc-pVQZ aug-cc-pVQZ -113.171948 296.3048 u-aug-cc-pVQZ -113.172888 296.3053 aug-cc-pV5Z aug-cc-pV5Z -113.180060 296.3223 uC-aug-cc-pV5Z aug-cc-pV5Z -113.180078 296.2916 uC-aug-cc-pV5Z -113.180128 296.2916 u-aug-cc-pV5Z aug-cc-pV5Z -113.180193 296.2868 u-aug-cc-pV5Z -113.180538 296.2871 aug-cc-pV6Z aug-cc-pV6Z -113.183007 296.3027 uC-aug-cc-pV6Z aug-cc-pV6Z -113.183014 296.2871 uC-aug-cc-pV6Z -113.183029 296.2871 u-aug-cc-pV6Z aug-cc-pV6Z -113.183076 296.2846 u-aug-cc-pV6Z -113.183257 296.2847 aug-cc-pCVTZ aug-cc-pCVTZ -113.149006 296.3988 uC-aug-cc-pCVTZ aug-cc-pCVTZ -113.149135 296.3807 uC-aug-cc-pCVTZ -113.149289 296.3805 u-aug-cc-pCVTZ aug-cc-pCVTZ -113.150101 296.3442 u-aug-cc-pCVTZ -113.152347 296.3453 31 aug-cc-pCVQZ aug-cc-pCVQZ -113.173548 296.3036 uC-aug-cc-pCVQZ aug-cc-pCVQZ -113.173603 296.2988 uC-aug-cc-pCVQZ -113.173707 296.2988 u-aug-cc-pCVQZ aug-cc-pCVQZ -113.173777 296.2918 u-aug-cc-pCVQZ -113.174502 296.2926 aug-cc-pCV5Z aug-cc-pCV5Z -113.180920 296.2854 uC-aug-cc-pCV5Z aug-cc-pCV5Z -113.180916 296.2852 uC-aug-cc-pCV5Z -113.180917 296.2853 u-aug-cc-pCV5Z aug-cc-pCV5Z -113.180928 296.2850 u-aug-cc-pCV5Z -113.181004 296.2851 a Total energy for the neutral reference state. b Experimental IE for the C edge, 296.2 eV, is taken from Ref. [65] 32 Table 2.4: CO, oxygen edge. Total and ionization energies; fc-CVS-EOM-IP-CCSD with various basissets. Basis on O Basis on C CCSD energy a (a.u.) IE b (eV) 6-311+G(2df) 6-311+G(2df) -113.153403 543.7282 uC-6-311+G(2df) 6-311+G(2df) -113.156050 543.1010 uC-6-311+G(2df) -113.159353 543.1019 u-6-311+G(2df) 6-311+G(2df) -113.159201 542.9073 u-6-311+G(2df) -113.167966 543.2097 6-311+G(3df) 6-311+G(3df) -113.160756 543.6443 uC-6-311+G(3df) 6-311+G(3df) -113.163070 543.0814 uC-6-311+G(3df) -113.166218 543.0820 u-6-311+G(3df) 6-311+G(3df) -113.166134 542.8913 u-6-311+G(3df) -113.193741 542.8975 aug-cc-pVTZ aug-cc-pVTZ -113.159233 543.4400 uC-aug-cc-pVTZ aug-cc-pVTZ -113.159820 543.2373 uC-aug-cc-pVTZ -113.166897 543.2388 u-aug-cc-pVTZ aug-cc-pVTZ -113.162900 542.9741 u-aug-cc-pVTZ -113.194170 542.9772 aug-cc-pVQZ aug-cc-pVQZ -113.199349 543.0147 uC-aug-cc-pVQZ aug-cc-pVQZ -113.199742 542.8764 33 uC-aug-cc-pVQZ -113.201736 542.8770 u-aug-cc-pVQZ aug-cc-pVQZ -113.200312 542.8504 u-aug-cc-pVQZ -113.220245 542.8509 aug-cc-pV5Z aug-cc-pV5Z -113.213669 542.8534 uC-aug-cc-pV5Z aug-cc-pV5Z -113.213736 542.8152 uC-aug-cc-pV5Z -113.214943 542.8152 u-aug-cc-pV5Z aug-cc-pV5Z -113.214077 542.8047 u-aug-cc-pV5Z -113.229862 542.8044 aug-cc-pV6Z aug-cc-pV6Z -113.222164 542.8169 uC-aug-cc-pV6Z aug-cc-pV5Z -113.222180 542.8010 uC-aug-cc-pV6Z -113.223047 542.8010 u-aug-cc-pV6Z aug-cc-pV6Z -113.222351 542.7953 u-aug-cc-pV6Z -113.233567 542.7948 aug-cc-pCVTZ aug-cc-pCVTZ -113.197758 543.0234 uC-aug-cc-pCVTZ aug-cc-pCVTZ -113.197912 543.0079 uC-aug-cc-pCVTZ -113.198510 543.0079 u-aug-cc-pCVTZ aug-cc-pCVTZ -113.200001 542.9422 u-aug-cc-pCVTZ -113.202259 542.9441 aug-cc-pCVQZ aug-cc-pCVQZ -113.226705 542.8494 34 uC-aug-cc-pCVQZ aug-cc-pCVQZ -113.226810 542.8449 uC-aug-cc-pCVQZ -113.226909 542.8450 u-aug-cc-pCVQZ aug-cc-pCVQZ -113.227430 542.8274 u-aug-cc-pCVQZ -113.227817 542.8283 aug-cc-pCV5Z aug-cc-pCV5Z -113.235384 542.8028 uC-aug-cc-pCV5Z aug-cc-pCV5Z -113.235385 542.8026 uC-aug-cc-pCV5Z -113.235389 542.8026 u-aug-cc-pCV5Z aug-cc-pCV5Z -113.235461 542.8012 u-aug-cc-pCV5Z -113.235509 542.8013 a Total energy for the neutral reference state. b Experimental IE for the O edge, 542.5 eV, is taken from [65] 35 2.3.3 Simplehydridesandtheeectofbasisonhydrogen Small hydrides, water, ammonia, and methane, represent 3 dierent edges in molecules with hydrogen atoms. We use this set to investigate the eect that the basis set on the H atoms has on the heavy atoms’ core IEs. We compare our ndings with those of the previous study[41]. Table 2.5 shows the results computed with the Pople (3df) and Dunning basis sets on the heavy atom, combined with the matching contracted bases on hydrogen. The results in Table 2.5 show that the dierences between the smaller bases are similar to the results for N 2 and CO and that the IEs converge from above to the basis-set limit. The basis-set limit results are slightly above the experimental values; the largest dierence from the experiment is observed for carbon (0.18 eV). This is similar to the ndings in Ref. [41]. In contrast to the observation in Ref. [41], that the eect of the basis set beyond triple- is moderate (0.1 eV dierence between the triple- to quadruple- results), we observe somewhat larger eects for this set ( 0.4 eV), as well as for the N 2 and CO molecules discussed above. This dierence is likely due to the dierent treatment of core electrons in the ground-state optimiza- tion step in the CVS-EOM-CCSD and fc-CVS-EOM-CCSD (we also note that the structures used in Ref. [41] are slightly dierent). However, going from quadruple to quintuple-, we observe a similar change of< 0.1 eV. Thus, the results of both studies indicate near-convergence to the basis-set limit at the quintuple- basis. As the largest basis in the present calculations, we use uC-aug-cc-pV5Z (the IEs drop by0.03 eV upon uncontraction); below, we refer to these results as the basis-set limit. Table 2.6 collects the IEs computed with uC-aug-cc-pV5Z on the heavy atom and smaller bases on hydrogens. As expected, the eect of the hydrogen basis on the core IEs is not large. For 36 example, using aug-cc-pVQZ or even aug-cc-pVTZ instead of aug-cc-pV5Z changes the IEs by less than 0.001 and 0.005 eV, respectively, while signicantly reducing the number of basis func- tions. Thus, one may consider using a contracted triple- basis (or even smaller) on hydrogens in calculations of larger molecules as a viable cost-saving strategy. Table2.5: CoreIEsforH 2 O,NH 3 ,andCH 4 computedwithmixedbasissets. Basis on active edge Basis on H #b.f. (H) Core IE (eV) H 2 O NH 3 CH 4 6-311+G(3df) 6-311G 3 540.9000 406.3149 291.0717 uC-6-311+G(3df) 6-311G 3 540.2500 405.8179 290.7117 aug-cc-pVTZ aug-cc-pVTZ 23 540.6570 406.0944 290.9248 uC-aug-cc-pVTZ aug-cc-pVTZ 23 540.4573 405.9537 290.8383 aug-cc-pVQZ aug-cc-pVQZ 46 540.2110 405.7655 290.6862 uC-aug-cc-pVQZ aug-cc-pVQZ 46 540.0857 405.6690 290.6209 aug-cc-pV5Z aug-cc-pV5Z 80 540.0530 405.6505 290.6103 uC-aug-cc-pV5Z aug-cc-pV5Z 80 540.0162 405.6197 290.5845 Experimental core IEs a (eV) 539.9 405.6 290.76 a Ref. [66]. Table2.6: CoreIEsforH 2 O,NH 3 ,andCH 4 computedwithmixedbasissets. Basis on active edge Basis on H #b.f. (H) Core IE (eV) H 2 O NH 3 CH 4 uC-aug-cc-pV5Z aug-cc-pV5Z 80 540.0162 405.6197 290.5845 u-aug-cc-pVQZ 48 540.0151 405.6189 290.5843 aug-cc-pVQZ 46 540.0148 405.6185 290.5839 u-aug-cc-pVTZ 25 540.0107 405.6144 290.5805 aug-cc-pVTZ 23 540.0103 405.6139 290.5795 aug-cc-pVDZ 9 539.9974 405.5971 290.5617 cc-pVDZ 5 539.9965 405.5958 290.5621 2.3.4 Usingmixedbasissetsformoleculeswithmultipleedges In this section we further investigate the idea of using mixed bases sets. This strategy has been exploited in numerous previous studies (see, for example, Refs. [46, 50, 52]), which have shown that the results are most sensitive to the basis set on the atom being ionized, while the rest of the 37 Table2.7: CoreIEsinCOcomputedwithmixedbasissetsoncarbonandoxygenedges a . Basis on C Basis on O CCSD energy b (a.u.) IE a (eV) uC-aug-cc-pV5Z aug-cc-pVDZ -113.118361 296.2136 aug-cc-pVTZ -113.159656 296.2589 aug-cc-pVQZ -113.174465 296.2832 aug-cc-pV5Z -113.180916 296.2852 aug-cc-pVDZ uC-aug-cc-pV5Z -113.172430 542.7505 aug-cc-pVTZ -113.193331 542.7850 aug-cc-pVQZ -113.206596 542.8045 aug-cc-pV5Z -113.235385 542.8026 a The uC-aug-cc-pV5Z basis is used on active edges. b Total energy for the neutral reference state. atoms can be treated with standard bases. We use the CO molecule as an example and employ a larger basis (uC-aug-cc-pV5Z) on the active edge, and a smaller basis on the inactive edge. Table 2.7 shows the results of these calculations. We observe that using a quadruple- or even a triple- basis on inactive edges leads to relatively small dierences in IEs (less than 0.05 eV). However, the IEs no longer approach the basis-set limit from above. For example, the calculation with aug- cc-pVDZ on the inactive edge yields smaller IE than the calculation with aug-cc-pVQZ on the inactive edge, which indicates potential imbalance of such approach. 2.3.5 Othercost-savingstrategies In larger molecules, using uC-aug-cc-pV5Z on all heavy atoms quickly becomes prohibitively expensive. For example, for molecules with just 4 second row atoms, the total number of basis functions in uC-aug-cc-pV5Z exceeds 500. Aside the obvious choice of using a smaller set, the cost of the calculations can be reduced by using single-precision execution[53] and truncation of virtual space using frozen natural orbitals approach[68, 69]. Using single-precision execution limits the convergence thresholds. Because in the bench- mark study we desire tight convergence for the EOM energies, here we use single precision for 38 the CCSD step only. Because CCSD is the scaling-determining step in the EOM-IP-CCSD calcu- lations, using single precision leads to noticeable speedup. The results are shown in Table 2.8. In agreement with previous benchmark[53], the eect of using single-precision at the CCSD step is negligible. The FNO results are collected in Table 2.9. We use an occupation criterion to control the truncation of the virtual space (for example, the FNO threshold of 99.99 % means that this much of the total natural occupation is recovered within the truncated orbital space). We observe that the errors due to virtual space truncation for a particular value of FNO threshold are similar for all three hydrides. The errors with FNO threshold of 99.99%, which amounts to freezing 27-28% of the virtual orbitals, are around 0.06 eV. This relatively large value illustrates the importance of virtual orbitals in describing the relaxation eects due to core ionization. Combining the FNO approximation with single precision at the CCSD level leads to noticeable reduction of computational time in methane and ammonia (about 7-fold speedup), while the eect in water was much smaller. Table2.8: CoreIEsforH 2 O,NH 3 ,andCH 4 computedwithsingleanddoubleprecisionCCSD a . Molecule Precision CCSD energy (a.u.) b Core IE (eV) H 2 O Double -76.35986226 540.0151 Single -76.35986228 540.0151 NH 3 Double -56.49013066 405.6189 Single -56.49013076 405.6189 CH 4 Double -40.44666824 290.5843 Single -40.44666822 290.5843 a Active edge basis: uC-aug-cc-pV5Z, H basis: uC-aug-cc-pVQZ. b Total energy for the neutral reference state. 8 decimal places are shown in order to demonstrate that the dierence is only in the 8th decimal place. CCSD convergence thresholds in single-precision calculation: 10 4 for amplitudes and 10 5 for energies. 39 Table 2.9: Core IEs for H 2 O, NH 3 , and CH 4 computed with the FNO-based truncation of the virtualspace. Molecule a FNO thresh b Act. virt. Frzn. virt. CCSD energy (a.u.) c IE d (eV) H 2 O 99.00 48 178 -76.348978 -0.6967 NH 3 55 219 -56.481735 -0.5238 CH 4 65 257 -40.440410 -0.5280 H 2 O 99.90 115 111 -76.358738 -0.1541 NH 3 135 139 -56.489228 -0.1781 CH 4 159 163 -40.446014 -0.1632 H 2 O 99.99 165 61 -76.359781 -0.0656 NH 3 200 74 -56.490056 -0.0670 CH 4 233 89 -40.446609 -0.0614 a Active edge basis: uC-aug-cc-pV5Z, H basis: uC-aug-cc-pVQZ. b Population threshold: this fraction of total natural occupation is recovered by the active virtual orbitals. c Total energy for the neutral reference state. d IE shift relative to the full orbital space values in Table 2.8. 2.3.6 Acrolein Table2.10: Acrolein,coreIEsfortheoxygenedge. Basis on C Basis on O Basis on H CCSD energy a (a.u) IE (eV) 6-311+G(3df) 6-311+G(3df) 6-311G -191.605984 540.0669 uC-6-311+G(3df) -191.608320 539.5005 u-6-311+G(3df) -191.611347 539.3078 aug-cc-pVQZ aug-cc-pV5Z aug-cc-pVTZ -191.712598 539.2886 aug-cc-pVTZ aug-cc-pVTZ -191.661197 539.2722 aug-cc-pVTZ aug-cc-pVDZ -191.643593 539.2678 aug-cc-pVQZ uC-aug-cc-pV5Z aug-cc-pVTZ -191.712657 539.2502 aug-cc-pVTZ aug-cc-pVTZ -191.661307 539.2325 aug-cc-pVTZ aug-cc-pVDZ -191.643713 539.2278 a Total energy for the neutral reference state. Acrolein (shown in Fig. 2.1) is an interesting model system with 3 chemically distinct carbon atoms: C1 is connected to 2 hydrogens and one carbon, C2 is connected to two carbons and one hydrogen, and C3 is connected to one hydrogen, one carbon, and one oxygen. We use this molecule to test how multiple IEs corresponding to the same edge are described with dierent bases and test whether our observations based on CO are transferable to a larger molecule. The 40 Table2.11: Acrolein,coreIEsforthecarbonedge. Basis on C Basis on O Basis on H CCSD energy a (a.u) IE (eV) b Shift (eV) c 6-311+G(3df) 6-311+G(3df) 6-311G -191.538887 291.9013 0.0 292.1980 0.2967 294.5433 2.6420 uC-6-311+G(3df) 6-311+G(3df) 6-311G -191.541545 291.6330 0.0 291.9216 0.2886 294.2588 2.6258 u-6-311+G(3df) 6-311+G(3df) 6-311G -191.545560 291.5079 0.0 291.7872 0.2793 294.1488 2.6409 aug-cc-pVQZ aug-cc-pVTZ aug-cc-pVTZ -191.601682 291.6321 0.00 291.9026 0.2705 294.2649 2.6328 uC-aug-cc-pVQZ aug-cc-pVTZ aug-cc-pVTZ -191.602010 291.5453 0.00 291.8155 0.2702 294.1730 2.6277 aug-cc-pV5Z aug-cc-pVTZ aug-cc-pVTZ -191.611970 291.5364 0.00 291.8071 0.2707 294.1650 2.6286 aug-cc-pV5Z aug-cc-pVTZ aug-cc-pVDZ -191.606131 291.5298 0.00 291.7991 0.2693 294.1585 2.6287 uC-aug-cc-pV5Z aug-cc-pVTZ aug-cc-pVTZ -191.612021 291.5086 0.00 291.7794 0.2708 294.1359 2.6273 uC-aug-cc-pV5Z aug-cc-pVTZ aug-cc-pVDZ -191.606239 291.5012 0.00 291.7702 0.2690 294.1286 2.6274 a Total energy for the neutral reference state. b The IEs are arranged in the order C2, C1, and C3, refer to Fig. 2.1. c Experimental shifts[70] in IEs are 0.0, 0.0, and 2.6 with respect to C1 (the experimental resolution is0.17 eV). 41 available experimental results for the carbon edge, reported as shifts relative to C1, are from Ref. [70]. Table 2.10 and 2.11 collect the results obtained using aug-cc-pV5Z and uC-aug-cc-pV5Z for the active edge. We observe that for both edges uncontracting the core in this basis leads to 0.03-0.04 eV drop in IE. Let us rst discuss the results for the oxygen edge. In these calculations, our largest basis for the inactive edge was aug-cc-pVQZ. Further reducing this basis to aug-cc-pVTZ leads to a change of 0.02 eV. The eect of the basis on the hydrogen is even smaller — for example, reducing the basis on hydrogens from triple- to double- changes the IEs by 0.005 eV only. The trend in IEs computed with Pople’s bases is similar to the previous cases. The results for u-6-311+G(3df)/6- 311+g(3df)/6-311G are within 0.06 eV from the uC-aug-cc-pV5Z/aug-cc-pVQZ/aug-cc-pVTZ (our largest basis in this calculation). The total number of basis functions in these two calculations are 181 and 467, respectively. The results for the carbon edge, shown in Table 2.11, follow similar trends. We observe that the chemical shifts (the dierence between 1s C IEs relative to C1) converge with respect to the basis much faster than the absolute values. As said, the largest basis used in this calculation is uC-aug-cc-pV5Z/aug-cc-pVTZ/aug-cc-pVTZ. The results for u-6-311+G(3df)/6-311+g(3df)/6- 311G are within 0.02 eV from that value. Comparing our best estimates to the experiment (only the shifts were reported in Ref. [70]), we note excellent agreement for C3 (2.63 eV versus 2.6 eV), however, for the C2 shift we consistently obtain0.3 eV, versus near zero shift reported in Ref. [70]. We note that the experimental resolution in this study was 0.17 eV and that the calculations reported in this paper also suggested a larger value of the shift. 42 2.3.7 Glycine Glycine (C 2 H 5 NO 2 , canonical form shown in Fig. 2.2) is a polyatomic molecule featuring multiple core IEs and three dierent edges[47, 71, 72]. Table 2.12 shows the results for the mixed basis sets in which we used uC-aug-cc-pV5Z for the active edge and aug-cc-pVTZ basis for other heavy atoms. For the hydrogens, we used aug-cc-pVDZ and aug-cc-pVTZ. Similarly to the acrolein example, the dierence in IEs between these calculations is 0.01 eV. Table 2.13 shows the results with Pople’s bases. As in other cases, we see that the u-6- 311+G(3df)/6-311+G(3df)/6-311G results are within 0.07 eV from our best estimates. We also performed calculations with the fully uncontracted Pople bases on all atoms and, as in previous cases did not observe much dierence (results not shown). Finally, Fig. 2.6 compares the selected results against the available experimental values and our best estimate. The IEs computed with u-6-311+G(3df)/6-311+G(3df)/6-311G are within 0.2 eV for oxygen and carbon edges, and 1 eV for nitrogen edge from the experiments. The shifts between C1/C2 and O1/O2 are also reproduced well. Table2.12: GlycinecoreIEsforalledgeswithmixedbasissets a . C Edge Basis on H CCSD energy b (a.u.) IE C1 (eV) IE C2 (eV) aug-cc-pVDZ -284.013614 292.5430 295.1946 aug-cc-pVTZ -284.027506 292.5531 295.1988 O Edge Basis on H CCSD energy b (a.u.) IE O1 (eV) IE O2 (eV) aug-cc-pVDZ -284.089201 538.6577 540.2119 aug-cc-pVTZ -284.110647 538.6632 540.2216 N Edge Basis on H CCSD Energy b (a.u.) IE N (eV) aug-cc-pVDZ -284.037027 406.5789 aug-cc-pVTZ -284.053208 406.5922 a Active edge basis: uC-aug-cc-pV5Z, inactive edge basis: aug-cc-pVTZ. b Total energy for the neutral reference state. 43 Table2.13: GlycinecoreIEswithPoplebasissets. C Edge a Basis on active edge CCSD Energy b (a.u.) IE C1 (eV) IE C2 (eV) 6-311+G(2df) -283.922367 293.0310 295.6453 uC-6-311+G(2df) -283.924455 292.7187 295.3349 u-6-311+G(2df) -283.927506 292.5870 295.2341 6-311+G(3df) -283.941538 293.0077 295.5965 uC-6-311+G(3df) -283.943366 292.7032 295.3202 u-6-311+G(3df) -283.946117 292.5704 295.2269 O Edge a Basis CCSD Energy (a.u.) IE O1 (eV) IE O2 (eV) 6-311+G(2df) -283.986182 539.5400 541.0882 uC-6-311+G(2df) -283.991309 538.9286 540.4764 u-6-311+G(2df) -283.997698 538.7314 540.2612 6-311+G(3df) -284.009612 539.4839 541.0594 uC-6-311+G(3df) -284.014362 538.9247 540.4761 u-6-311+G(3df) -284.020468 538.7336 540.2695 N Edge a Basis CCSD Energy (a.u.) IE N (eV) 6-311+G(2df) -283.964956 407.3069 uC-6-311+G(2df) -283.966768 406.8375 u-6-311+G(2df) -283.969069 406.6510 6-311+G(3df) -283.986780 407.2477 uC-6-311+G(3df) -283.988335 406.8096 u-6-311+G(3df) -283.990529 406.6262 a See Fig. 2.2 for notations. b Total energy for the neutral reference state. c Inactive edge and H basis is the contracted version of the basis on active edge Experimental IEs: C1: 292.3 eV, C2: 295.2 eV, O1: 538.4 eV, O2: 540.2 eV, N: 405.4 eV(from Ref. [72]). 2.4 Conclusion We presented a computational study of basis-set eects in calculations of core-ionized states using a correlated method, fc-CVS-EOM-IP-CCSD. In agreement with previous studies, we ob- served that core-level states require higher-quality basis sets than valence states because of the large perturbation on the electronic structure due to removal of a core electron. Although the 44 Figure 2.6: Glycine IEs for the carbon, nitrogen and oxygen edges versus the number of ba- sis functions per active edge atom. The total number of basis functions in each calculation is shown in parentheses in the respective panel. The best estimate is obtained with the uC-aug- cc-pV5Z/aug-cc-pVTZ/aug-cc-pVTZbasis;therespectivetotalnumberofbasisfunctionsandthe numberofbasisfunctionsperactiveedgeatomareshowninparentheses. converged results can be obtained by using very large Dunning’s bases, such as aug-cc-pCV5Z, we investigated a dierent strategy, that is, using core- and fully uncontracted basis sets. Our re- sults indicate that this approach is much more eective. We observe especially good performance for uncontracted Pople’s bases. For example, the results with u-6-311G+(3df) are of nearly the same quality as with aug-cc-pV5Z, despite having 60% fewer basis functions. For the systems we studied, the results with uC-6-311+G(3df) and u-6-311+G(3df) are within 0.07 eV from the basis- set limit. These errors are smaller than the anticipated errors due to an incomplete treatment of electron correlation. Slightly smaller bases, uC-6-311+G(2df) and u-6-311+G(2df) also perform very well. Thus, our recommended approach to core-level calculations is to use the uncontracted 45 variants of the standard bases. The largest gain is achieved by uncontracting the core. The re- sults show that it is sucient to uncontract only the basis used for the active edge, while treating the rest of the atoms with matching contracted bases. Smaller bases can be used on hydrogens, without signicant eect on the core IEs. We also investigated more aggressive cost-saving strategies: using mixed bases on active and inactive edges, using single-precision at the CCSD step, and using the FNO-based truncation of the virtual space. The results pave a way towards cost-eective and accurate calculations of core-level states. Future work entails investigation of basis-set eects for calculations of core- level states of heavier elements. While preliminary calculations conrm that uncontracting the standard bases improves the description of lower edges as well, detailed benchmarks including spin-orbit couplings are necessary for quantitative assessment of optimal basis set choices for heavier elements. This work is currently in progress. 46 Bibliography [1] S. Mobilio, F. Boscherini, and C. Meneghini, editors, Synchrotron Radiation: Basics, Methods and Applications. Springer, 2014. [2] J. A. van Bokhoven and C. Lamberti, editors, X-Ray Absorption and X-ray Emission Spec- troscopy; Theory and Applications. Wiley & Sons, 2016. [3] U. Bergmann, V. K. Yachandra, and J. Yano, editors, X-Ray Free Electron Lasers: Applications in Materials, Chemistry and Biology, Number 18 in Energy and Environment Series. Royal Society of Chemistry, 2017. [4] M. Nisoli, P. Decleva, F. Calegari, A. Palacios, and F. Martín, Attosecond electron dynamics in molecules, Chem. Rev.117, 10760 (2017). [5] M. Ahmed and O. Kostko, From atoms to aerosols: probing clusters and nanoparticles with synchrotron based mass spectrometry and X-ray spectroscopy, Phys. Chem. Chem. Phys. 22, 2713 (2020). [6] O. Kostko, B. Bandyopadhyay, and M. Ahmed, Vacuum ultraviolet photoionization of com- plex chemical systems, Annu. Rev. Phys. Chem.67, 19 (2016). [7] P. Norman and A. Dreuw, Simulating X-ray spectroscopies and calculating core-excited states of molecules, Chem. Rev.118, 7208 (2018). [8] T. Fransson, Y. Harada, N. Kosugi, N. A. Besley, B. Winter, J. J. Rehr, L. G. M. Pettersson, and A. Nilsson, X-ray and electron spectroscopy of water, Chem. Rev.116, 7551 (2016). [9] J. A. Pople, Theoretical models for chemistry, inEnergy,StructureandReactivity: Proceedings of the 1972 Boulder Summer Research Conference on Theoretical Chemistry, edited by D.W. Smith and W.B. McRae, pages 51–61. Wiley, New York, 1973. [10] K. Emrich, An extension of the coupled-cluster formalism to excited states (I), Nucl. Phys. A351, 379 (1981). [11] J. F. Stanton and R. J. Bartlett, The equation of motion coupled-cluster method. A systematic biorthogonal approach to molecular excitation energies, transition probabilities, and excited state properties, J. Chem. Phys.98, 7029 (1993). [12] A. I. Krylov, Equation-of-motion coupled-cluster methods for open-shell and electronically excited species: The hitchhiker’s guide to Fock space, Annu. Rev. Phys. Chem.59, 433 (2008). 47 [13] R. J. Bartlett, The coupled-cluster revolution, Mol. Phys.108, 2905 (2010). [14] K. Sneskov and O. Christiansen, Excited state coupled cluster methods, WIREs: Comput. Mol. Sci.2, 566 (2012). [15] R. J. Bartlett, Coupled-cluster theory and its equation-of-motion extensions, WIREs: Com- put. Mol. Sci.2, 126 (2012). [16] N. A. Besley, A. T. B. Gilbert, and P. M. W. Gill, Self-consistent-eld calculations of core excited states, J. Chem. Phys.130, 124308 (2009). [17] M. Nooijen and R. J. Bartlett, Description of core-excitation spectra by the open-shell electron-attachment equation-of-motion coupled cluster method, J. Chem. Phys.102, 6735 (1995). [18] S. Coriani, O. Christiansen, T. Fransson, and P. Norman, Coupled-cluster response theory for near-edge x-ray-absorption ne structure of atoms and molecules, Phys. Rev. A 85, 022507 (2012). [19] S. Coriani, T. Fransson, O. Christiansen, and P. Norman, Asymmetric-Lanczos-chain-driven implementation of electronic resonance convergent coupled-cluster linear response theory, J. Chem. Theory Comput.8, 1616 (2012). [20] T. J. Watson and R. J. Bartlett, Innite order relaxation eects for core ionization energies with a variational coupled cluster ansatz, Chem. Phys. Lett.555, 235 (2013). [21] S. Sen, A. Shee, and D. Mukherjee, A study of the ionisation and excitation energies of core electrons using a unitary group adapted state universal approach, Mol. Phys.11, 2625 (2013). [22] D. Zuev, E. Vecharynski, C. Yang, N. Orms, and A. I. Krylov, New algorithms for iterative matrix-free eigensolvers in quantum chemistry, J. Comput. Chem.36, 273 (2015). [23] B. Peng, P. J. Lestrange, J. J. Goings, M. Caricato, and X. Li, Energy-specic equation-of- motion coupled-cluster methods for high-energy excited states: Application to K-edge X-ray absorption spectroscopy, J. Chem. Theory Comput.11, 4146 (2015). [24] S. Coriani and H. Koch, Communication: X-ray absorption spectra and core-ionization potentials within a core-valence separated coupled cluster framework, J. Chem. Phys.143, 181103 (2015). [25] S. Coriani and H. Koch, Erratum: "Communication: X-ray absorption spectra and core- ionization potentials within a core-valence separated coupled cluster framework" [J. Chem. Phys. 143, 181103 (2015)], J. Chem. Phys.145, 149901 (2016). [26] A. Sadybekov and A. I. Krylov, Coupled-cluster based approach for core-ionized and core- excited states in condensed phase: Theory and application to dierent protonated forms of aqueous glycine, J. Chem. Phys.147, 014107 (2017). 48 [27] R. H. Myhre, S. Coriani, and H. Koch, Near-edge X-ray absorption ne structure within multilevel coupled cluster theory, J. Chem. Theory Comput.12, 2633 (2016). [28] A. P. Bazante, A. Perera, and R. J. Bartlett, Towards core-excitation spectra in attosecond spectroscopy: A coupled-cluster study of ClF, Chem. Phys. Lett.683, 68 (2017). [29] M. L. Vidal, X. Feng, E. Epifanovski, A. I. Krylov, and S. Coriani, A new and ecient equation-of-motion coupled-cluster framework for core-excited and core-ionized states, J. Chem. Theory Comput.15, 3117 (2019). [30] B. N. C. Tenorio, T. Moitra, M. A. C. Nascimento, A. B. Rocha, and S. Coriani, Molecular inner-shell photoabsorption/photoionization cross sections at core-valence-separated cou- pled cluster level: Theory and examples, J. Chem. Phys.150, 224104 (2019). [31] K. Nanda, M. L. Vidal, R. Faber, S. Coriani, and A. I. Krylov, How to stay out of trouble in RIXS calculations within the equation-of-motion coupled-cluster damped response theory framework? Safe hitchhiking in the excitation manifold by means of core-valence separa- tion, Phys. Chem. Chem. Phys.22, 2629 (2020). [32] R. Faber and S. Coriani, Resonant inelastic X-ray scattering and nonesonant X-ray emission spectra from coupled-cluster (damped) response theory, J. Chem. Theory Comput.15, 520 (2019). [33] R. Faber and S. Coriani, Core-valence-separated coupled-cluster-singles-and-doubles complex-polarization-propagator approach to X-ray spectroscopies, Phys. Chem. Chem. Phys.22, 2642 (2020). [34] Y. C. Park, A. Perera, and R. J. Bartlett, Equation of motion coupled-cluster for core excitation spectra: Two complementary approaches, J. Chem. Phys.151, 164117 (2019). [35] L. S. Cederbaum, W. Domcke, and J. Schirmer, Many-body theory of core holes, Phys. Rev. A22, 206 (1980). [36] J. Liu, D. Matthews, S. Coriani, and L. Cheng, Benchmark calculations of K-edge ionization energies for rst-row elements using scalar-relativistic core-valence-separated equation-of- motion coupled-cluster methods, J. Chem. Theory Comput.15, 1642 (2019). [37] F. Frati, F. de Groot, J. Cerezo, F. Santoro, L. Cheng, R. Faber, and S. Coriani, Coupled cluster study of the x-ray absorption spectra of formaldehyde derivatives at the oxygen, carbon, and uorine K-edges, J. Chem. Phys.151, 064107 (2019). [38] L. Kjellsson, K. Nanda, J.-E. Rubensson, G. Doumy, S. H. Southworth, P. J. Ho, A. M. March, A. Al Haddad, Y. Kumagai, M.-F. Tu, T. Debnath, M. S. Bin Mohd Yusof, C. Arnold, W. F. Schlotter, S. Moeller, G. Coslovich, J. D. Koralek, M. P. Minitti, M. L. Vidal, M. Si- mon, R. Santra, Z.-H. Loh, S. Coriani, A. I. Krylov, and L. Young, RIXS reveals hid- den local transitions of the aqueous OH radical, Phys. Rev. Lett. (2020), submitted; https://arxiv.org/abs/2003.03909. 49 [39] G. Cavigliasso and D. P. Chong, Accurate density-functional calculation of core-electron binding energies by a total-energy dierence approach, J. Chem. Phys.111, 9485 (1999). [40] A. Mijovilovich, L. G. M. Pettersson, S. Mangold, M. Janousch, J. Susini, M. Salome, F. M. F. de Groot, and B. M. Weckhuysen, The interpretation of sulfur K-edge XANES spectra: A case study on thiophenic and aliphatic sulfur compounds, J. Phys. Chem. A113, 2750 (2009). [41] J. P. Carbone, L. Cheng, R. H. Myhre, D. Matthews, H. Koch, and S. Coriani, Advances in Quantum Chemistry, volume 79, chapter 11, An analysis of the performance of coupled cluster methods for K-edge core excitations and ionizations using standard basis sets, pages 241–261. Elsevier Inc, 2019. [42] S. Shirai, S. Yamamoto, and S. Hyodo, Accurate calculation of core-electron binding energies: Multireference perturbation treatment, J. Chem. Phys.121, 7586 (2004). [43] Y. Takahata and D. P. Chong, DFT calculation of core-electron binding energies, J. Electron. Spectrosc. Relat. Phenom.133, 69 (2003). [44] J. Wenzel, M. Wormit, and A. Dreuw, Calculating core-level excitations and x-ray absorption spectra of medium-sized closed-shell molecules with the algebraic-diagrammatic construc- tion scheme for the polarization propagator, J. Comput. Chem.35, 1900 (2014). [45] J. Wenzel, A. Holzer, M. Wormit, and A. Dreuw, Analysis and comparison of CVS-ADC approaches up to third order for the calculation of core-excited states, J. Chem. Phys.142, 214104 (2015). [46] B. Kovac, I. Ljubic, A. Kivimaki, M. Coreno, and I. Novak, Characterisation of the electronic structure of some stable nitroxyl radicals using variable energy photoelectron spectroscopy, Phys. Chem. Chem. Phys.16, 10734–10742 (2014). [47] T. Fransson, I. Zhovtobriukh, S. Coriani, K. T. Wikfeldt, P. Norman, and L. G. M. Pettersson, Requirements of rst-principles calculations of x-ray absorption spectra of liquid water, Phys. Chem. Chem. Phys.18, 566 (2016). [48] I. Tolbatov and D. M. Chipman, Benchmarking density functionals and gaussian basis sets for calculation of core-electron binding energies in amino acids, Theor. Chem. Acc.136, 82 (2017). [49] A. E. A. Fouda and N. A. Besley, Assessment of basis sets for density functional theory-based calculations of core-electron spectroscopies, Theor. Chem. Acc.137, 6 (2018). [50] M. Hanson-Heine, M. George, and N. Besley, Basis sets for the calculation of core-electron binding energies, Chem. Phys. Lett.699, 279 (2018). [51] M. A. Ambroise and F. Jensen, Probing basis set requirements for calculating core ionization and core excitation spectroscopy by the self-consistent-eld approach, J. Chem. Theory Comput.15, 325 (2019). 50 [52] D. Hait and M. Head-Gordon, Highly accurate prediction of core spectra of molecules at density functional theory cost: Attaining sub-electronvolt error from a restricted open-shell Kohn-Sham approach, J. Phys. Chem. Lett.11, 775 (2020). [53] K. A. Peterson and T. H. Dunning, Jr., Accurate correlation consistent basis sets for molecular core-valence correlation eects. the second row atoms Al-Ar, and the rst row atoms B-Ne revisited, J. Chem. Phys.117, 10548 (2002). [54] R. Krishnan, J.S. Binkley, R. Seeger, and J.A. Pople, Self-consistent molecular orbital methods. XX. A basis set for correlated wave functions, J. Chem. Phys.72, 650 (1980). [55] M.J. Frisch, J.A. Pople, and J.S. Binkley, Self-consistent molecular orbital methods 25. Sup- plementary functions for Gaussian basis sets, J. Chem. Phys.80, 3265 (1984). [56] T. H. Dunning, Jr., Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen, J. Chem. Phys.90, 1007 (1989). [57] R. A. Kendall, T. H. Dunning, Jr., and R. J. Harrison, Electron anities of the rst-row atoms revisited. Systematic basis sets and wavefunctions, J. Chem. Phys.96, 6796 (1992). [58] M. L. Vidal, A. I. Krylov, and S. Coriani, Dyson orbitals within the fc-CVS-EOM-CCSD framework: theory and application to X-ray photoelectron spectroscopy of ground and ex- cited states, Phys. Chem. Chem. Phys.22, 2693 (2020). [59] M. L. Vidal, A. I. Krylov, and S. Coriani, Correction to: "Dyson orbitals within the fc- CVS-EOM-CCSD framework: theory and application to X-ray photoelectron spectroscopy of ground and excited states", Phys. Chem. Chem. Phys.22, 3744 (2020). [60] J. C. Slater, Atomic shielding constants, Phys. Rev.36, 57 (1930). [61] Y. Shao, Z. Gan, E. Epifanovsky, A.T.B. Gilbert, M. Wormit, J. Kussmann, A.W. Lange, A. Behn, J. Deng, X. Feng, D. Ghosh, M. Goldey, P.R. Horn, L.D. Jacobson, I. Kaliman, R.Z. Khaliullin, T. Kus, A. Landau, J. Liu, E.I. Proynov, Y.M. Rhee, R.M. Richard, M.A. Rohrdanz, R.P. Steele, E.J. Sundstrom, H.L. Woodcock III, P.M. Zimmerman, D. Zuev, B. Albrecht, E. Alguires, B. Austin, G.J.O. Beran, Y.A. Bernard, E. Berquist, K. Brandhorst, K.B. Bravaya, S.T. Brown, D. Casanova, C.-M. Chang, Y. Chen, S.H. Chien, K.D. Closser, D.L. Crittenden, M. Diedenhofen, R.A. DiStasio Jr., H. Do, A.D. Dutoi, R.G. Edgar, S. Fatehi, L. Fusti-Molnar, A. Ghysels, A. Golubeva-Zadorozhnaya, J. Gomes, M.W.D. Hanson-Heine, P.H.P. Harbach, A.W. Hauser, E.G. Hohenstein, Z.C. Holden, T.-C. Jagau, H. Ji, B. Kaduk, K. Khistyaev, J. Kim, J. Kim, R.A. King, P. Klunzinger, D. Kosenkov, T. Kowalczyk, C.M. Krauter, K.U. Laog, A. Laurent, K.V. Lawler, S.V. Levchenko, C.Y. Lin, F. Liu, E. Livshits, R.C. Lochan, A. Luenser, P. Manohar, S.F. Manzer, S.-P. Mao, N. Mardirossian, A.V. Marenich, S.A. Maurer, N.J. Mayhall, C.M. Oana, R. Olivares-Amaya, D.P. O’Neill, J.A. Parkhill, T.M. Perrine, R. Peverati, P.A. Pieniazek, A. Prociuk, D.R. Rehn, E. Rosta, N.J. Russ, N. Sergueev, S.M. Sharada, S. Sharmaa, D.W. Small, A. Sodt, T. Stein, D. Stuck, Y.-C. Su, A.J.W. Thom, T. Tsuchimochi, L. Vogt, O. Vydrov, T. Wang, M.A. Watson, J. Wenzel, A. White, C.F. Williams, V. Vanovschi, S. Yeganeh, S.R. Yost, Z.-Q. You, I.Y. Zhang, X. Zhang, Y. Zhou, B.R. Brooks, G.K.L. Chan, D.M. Chipman, 51 C.J. Cramer, W.A. Goddard III, M.S. Gordon, W.J. Hehre, A. Klamt, H.F. Schaefer III, M.W. Schmidt, C.D. Sherrill, D.G. Truhlar, A. Warshel, X. Xu, A. Aspuru-Guzik, R. Baer, A.T. Bell, N.A. Besley, J.-D. Chai, A. Dreuw, B.D. Dunietz, T.R. Furlani, S.R. Gwaltney, C.-P. Hsu, Y. Jung, J. Kong, D.S. Lambrecht, W.Z. Liang, C. Ochsenfeld, V.A. Rassolov, L.V. Slipchenko, J.E. Subotnik, T. Van Voorhis, J.M. Herbert, A.I. Krylov, P.M.W. Gill, and M. Head-Gordon, Advances in molecular quantum chemistry contained in the Q-Chem 4 program package, Mol. Phys.113, 184 (2015). [62] A. I. Krylov and P. M. W. Gill, Q-Chem: An engine for innovation, WIREs: Comput. Mol. Sci.3, 317 (2013). [63] P. U. Manohar, J. F. Stanton, and A. I. Krylov, Perturbative triples correction for the equation- of-motion coupled-cluster wave functions with single and double substitutions for ionized states: Theory, implementation, and examples, J. Chem. Phys.131, 114112 (2009). [64] B. P. Pritchard, D. Altarawy, B. Didier, T. D. Gibson, and T. L. Windus, New basis set ex- change: An open, up-to-date resource for the molecular sciences community, J. Chem. Inf. Model.59, 4814 (2019). [65] W. L. Jolly, K. D. Bomben, and C. J. Eyermann, Core-electron binding energies for gaseous atoms and molecules, Atomic Data and Nuclear Data Tables31, 433 (1984). [66] J. Schirmer, A. Tromov, K. Randall, J. Feldhaus, A. M. Bradshaw, Y. Ma, C.T. Chen, and F. Sette, K-shell excitation of the water, ammonia, and methane molecules using high- resolution photoabsorption spectroscopy, Phys. Rev. A47, 1136 (1993). [67] P. Pokhilko, E. Epifanovskii, and A. I. Krylov, Double precision is not needed for many-body calculations: Emergent conventional wisdom, J. Chem. Theory Comput.14, 4088 (2018). [68] A. Landau, K. Khistyaev, S. Dolgikh, and A. I. Krylov, Frozen natural orbitals for ionized states within equation-of-motion coupled-cluster formalism, J. Chem. Phys. 132, 014109 (2010). [69] P. Pokhilko, D. Izmodenov, and A. I. Krylov, Extension of frozen natural orbital approxima- tion to open-shell references: Theory, implementation, and application to single-molecule magnets, J. Chem. Phys.152, 034105 (2020). [70] D. Duot, J.-P. Flament, I. C. Walker, J. Heinesch, and M.-J. Hubin-Franskin, Core shell excitation of 2-propenal (acrolein) at the O 1s and C 1s edges: An experimental and ab initio study, J. Chem. Phys.118, 1137 (2003). [71] R. H. Myhre, S. Coriani, and H. Koch, X-ray and UV spectra of glycine within coupled cluster linear response theory, J. Phys. Chem. A123, 9701 (2019). [72] O. Plekan, V. Feyer, R. Richter, M. Coreno, M. de Simone, K. C. Prince, and V. Carravetta, Investigation of the amino acids glycine, proline, and methionine by photoemission spec- troscopy, J. Phys. Chem. A111, 10998 (2007). 52 Chapter3 Charge-transfer-to-solventstatesprovideasensitive spectroscopicprobeofthelocalsolventstructurearound anions 3.1 Introduction Understanding the microscopic origins of ion-specic macroscopic properties of solutions such as surface tension, adsorption enthalpies and free energies, protein solubility, and chaotropicity has motivated intense research eorts, owing to their importance in biology[1], interfacial and atmospheric chemistry[2], electrochemistry[3], water purication and desalination[4], biofuel production[5, 6], pharmaceutical and biomedical applications[7], and more. The microscopic understanding critically depends on the characterization of the solvation environment of ions in the bulk or at interfaces. Spectroscopic techniques that are sensitive to the local solvent structure around ions, therefore, can serve as a tool for microscopic characterization of these ion-specic properties. 53 Small anions such as halides, hydroxide, and thiocyanate do not support bound excited elec- tronic states in the gas phase. When solvated, they exhibit broad, featureless bands in the deep UV (> 5 eV) region. These bands, which are called charge-transfer-to-solvent (CTTS) excitations[8– 10], arise due to the excitation of an electron into a solvent cavity, suciently close to the molec- ular core to have a non-zero oscillator strength. The CTTS states act as precursors to solvated electrons[11–16]. Because the CTTS excitations are solvent supported, they are highly sensitive to the local arrangement of the solvent around the anion and factors that can aect solvent struc- ture, such as temperature, pressure, and presence of other solvated species[10, 17]. Hence, these states can serve as a basis for spectroscopic techniques aiming to probe local solvent structure and its dynamical uctuations. Many studies of CTTS states have employed UV–visible (UV–vis) spectroscopy[8–10, 18–22]. Recently, applications of two-photon absorption (2PA) and surface-sensitive nonlinear spectro- scopies such as second harmonic generation, electronic sum-frequency generation (SFG), and vibrational SFG to probe anions (such as halides and thiocyanate) in the bulk and at air–liquid interfaces have been reported[23–27]. Supported by robust ab initio modeling, these spectro- scopies can exploit the CTTS transitions and help to elucidate the local solvent structure around the anions. A few computational studies for modeling such spectroscopic experiments have been reported[21, 28], however, robust computational protocols based on high-level electronic struc- ture methods are scarce. Several studies have investigated the CTTS states in a variety of anion–solvent systems exper- imentally and computationally[9, 18–22, 29, 30]. In particular, the CTTS transitions of aqueous 54 iodide [11, 20, 31–34] have garnered considerable attention. In a series of landmark papers[11– 13], Sheu and Rossky simulated this system and its dynamical evolution, including the genera- tion of solvated electrons, using a hybrid QM/MM (quantum mechanics/molecular mechanics) approach. Bradforth and Jungwirth[21] carried out a full ab initio calculation of energies and wave functions of the CTTS states and showed that although the long-range solvent polarization provides a dominant contribution to the binding energy of the CTTS states, it can be captured by the QM/MM calculations using xed point charges. By using MP2 to describe the solute (io- dide) and point charges to describe the solvent, they were able to compute the CTTS band within 0.4 eV from the experiment. Recently, Herbert and co-workers reported calculations of the io- dide CTTS bands within 0.1 eV from the experiment using a projection-based embedding scheme within the time-dependent density functional theory (DFT)[28]. Here, we focus on a more complex system, the aqueous thiocyanate anion. SCN (aq) is one of the most chaotropic ions in the Hofmeister series, i.e., it disrupts the hydrogen-bonding network of water signicantly[35, 36]. Its solvation shell is highly asymmetric and has an eggshell-like shape, with a larger radius around the sulfur end[37]. It is sparsely hydrated in bulk and shows preference for interfaces, which makes it suitable for surface-sensitive spectroscopies[23, 24, 26]. These features of SCN (aq) render it an interesting model system for studying bulk and surface CTTS states. The room-temperature CTTS UV–vis spectra of SCN (aq) exhibit two major peaks at 5.78 eV and 6.86 eV, with a hint of a third (strong) peak around 7.14 eV[19]. However, neither the band positions nor their assignment has been backed up by computational modeling. To reliably model UV–vis spectra of solvated anions, one needs to use a high-level electronic structure method ca- pable of treating local, diuse, and charge-transfer (CT) transitions accurately and on an equal 55 footing, preferably, wave-function based, although successful applications of DFT have been reported[28]. The simulations require suciently diuse basis sets and accurate description of the solvent structure and its thermal uctuations around the anion. Modeling spectroscopy in the condensed phase requires reliable sampling over equilibrium trajectories to incorporate thermal structural uctuations. Here, we perform such averaging by selecting structures (snapshots) from two dierent treatments of equilibrium dynamics. With the two sets of structures, we compute two UV–vis spectra employing a quantum/classical (QM/MM) embedding scheme based on a high-level electronic structure method. By comparing these two spectra with the experimental spectrum and by characterizing individual excitations by a variety of electronic descriptors, we developed a robust, computationally feasible protocol for simulating the UV–vis spectra of SCN (aq) with a high-level ab initio method. We envision that this protocol will serve as a basis for future simulations of higher-order nonlinear spectra of SCN (aq) and other solvated anions. The rst set of snapshots was obtained using classical molecular dynamics (MD) with a stan- dard non-polarizable force-eld. In the MD simulations, we used TIP3P water[38] and the force- eld parameters for sodium thiocyanate of Tesei et. al[39], which were reported to perform well for modeling both bulk and interface solvation. The second set was obtained using QM/MM ab initio MD (AIMD) with a high-quality density functional (!B97X-D)[40, 41]. This choice was in part motivated by the results of Baer and Mundy[42], who studied the dynamics of bulk and interfacial anions (including thiocyanate) using the BLYP functional with Grimme’s dispersion. We computed the UV–vis spectra for the two sets of snapshots using the high-level equation- of-motion coupled-cluster method with single and double excitations (EOM-EE-CCSD)[43, 44] within a QM/MM embedding scheme, following our previous studies[45–47]. The computational 56 cost of EOM-EE-CCSD scales asO (N 6 ) with system size, which limits the size of the QM system. Thus, we developed a general protocol for balancing accuracy versus costs in calculations of CTTS states with such high-level QM methods. In their study of CTTS excitations of iodide in bulk water and water clusters, Bradforth and Jungwirth[21] concluded that there is little advantage in treating the waters even in the rst solvation shell quantum mechanically. Our results, in contrast, indicate that anions such as SCN , which have a non-spherical solvation shell and strongly perturb the solvent structure, require much more sophisticated treatment to achieve a quantitative agreement between the theory and experiment. We show that a robust description of the highly diuse excited states in these systems requires a QM treatment of at least two solvation shells around the anion. We then analyzed structural parameters, such as the radial distribution functions (RDFs) around the solute atoms, compared the computed RDFs with the experimentally derived ones and with each other, and illustrated the strong sensitivity of the CTTS bands to the local solvent structure around SCN (aq) , which is, in turn, sensitive to the quality of interaction potentials (force elds or functionals). By juxtaposing the two sets of simulations, our results indicate that spec- troscopies targeting CTTS transitions oer an experimental probe of the quality of force elds or density functionals. To elucidate how the dierent local structures obtained from the two equilibrium sampling protocols impact individual excitations, we characterized these electronic transitions by com- puting and visualizing natural transition orbitals (NTOs)[48–50]. For solvated anionic systems, however, visualizing the NTOs is not sucient to clearly distinguish between intramolecular excitations and CTTS transitions due to the mixing of local and CT excitations at some cong- urations. Moreover, visual inspection of NTOs for each snapshot is not practical. Therefore, we 57 augmented the NTO analysis for the selected snapshots with quantitative descriptors such as hole, particle, and exciton sizes, computed using the corresponding reduced one-particle tran- sition density matrices (1PTDMs). Together, NTO visualization and statistical analysis of the descriptors provide a comprehensive toolkit for characterizing CTTS and intramolecular transi- tions in solvated anionic systems. The structure of this paper is as follows. The next section describes computational protocols with additional details given in the Supplemental Information (SI). In section 3.3, we discuss the results and examine the structural dierences between the MD and AIMD calculations, illustrat- ing how these dierences are reected in the UV–vis spectra. 3.2 ComputationalDetails Molecular dynamics. MD simulations were performed with NAMD[51], version 2.14, using the initial conguration generated by Packmol[52]. The system comprises an SCN ion solvated in 1,347 water molecules and one Na + counter-ion in a periodic cubic box with a side of 34.4 Å. The water molecules were described by TIP3P model and the parameters for Na + and SCN were taken from Ref. [39]. A timestep of 1 fs was chosen for the simulation. The waters were made rigid using the SETTLE algorithm. The electrostatic interactions were computed using the Particle– Mesh Ewald method with a 14 Å cuto for long-range interactions. Following initial energy minimization, the system was equilibrated for 500 ps in an NVT ensemble maintained at 300 K using the Langevin thermostat. This was followed by 3 ns of production runs in NPT ensemble with 1 atm pressure maintained using the Nosé–Hoover method. From this MD trajectory, we collected 80 snapshots separated by 2.5 ps for the calculations of the spectra. 58 Abinitiomoleculardynamics. Seven snapshots from the MD run were chosen as the initial congurations for the AIMD simulations. In these simulations, the system was divided into the QM and MM parts. The QM part, which included SCN and 20 nearest waters, was described by !B97X-D/6-31+G*. The waters for the QM part were selected based on the distance from the car- bon atom of SCN . The MM part was described by TIP3P waters. The van der Waals parameters for the SCN and Na + ions were taken from Ref. [39]. The QM–MM interaction was described by electrostatic embedding. The system was equilibrated for 1 ps in an NVE ensemble, followed by 2.5 ps production run in an NVT ensemble with the 1 fs timestep. The temperature was main- tained at 300 K using the Langevin thermostat and the long-range interactions were calculated using Ewald’s summation. A total of 84 snapshots were taken at 200 fs intervals from the trajecto- ries for excitation-energy calculations. We note that unless an adaptive QM/MM scheme is used, the quantum waters can diuse away from SCN in the course of the simulation. We tested our snapshots and found that such exchange between quantum and classical waters around SCN was minimal, aecting just a few waters from the outer solvation shell (see Fig. S5 in the SI). Calculation of excited states. We computed excited states and energies using a hybrid QM/MM approach with electrostatic embedding with the EOM-EE-CCSD method[43, 44] for the QM treatment. The MM water molecules were described as TIP3P point charges with a Gaussian blur. In order to establish optimal protocol for these calculations, we investigated the convergence of the spectra with respect to the number of waters in the QM region, basis set, and the number of excited states computed; in these exploratory calculations, we used the conguration interaction singles (CIS) method. We briey describe the resulting protocol below; more details are given in the SI. 59 We determined that the CIS spectrum needs at least 20 waters in the QM region for excitation energies to converge within 0.01 eV, as shown in Fig. S1 in the SI. The rst solvation shell of SCN contains 8 waters; 20 waters correspond roughly to two solvation shells. Given the importance of Pauli repulsion for these states (i.e., without it, the electron density can extend too far into the solvent), it is not surprising that at least two solvation shells are needed for an adequate description of the CTTS bands. For each snapshot, the QM system was determined by choosing 20 water molecules that are closest to the carbon atom of the anion. Because of the potential exchange between quantum and classical waters in the course of AIMD simulation, it means that the QM system in the calculation of spectra slightly diers from the QM system in the AIMD snapshot; our analysis indicates that these adjustments only aected a small number of snapshots and only aected the outermost water molecules. The basis-set convergence analysis (see Fig. S2 in the SI) indicates that a mixed basis set 6-31+G*/6-31G (wherein SCN and eleven nearest waters are described by the 6-31+G* basis and the remaining nine waters are described by 6-31G) yields excitation energies that are blue- shifted by only 0.1 eV relative to our reference calculations in which SCN and 24 nearest waters are described by the aug-cc-pVDZ basis. The mixed basis set contains almost three times fewer basis functions than the full aug-cc-pVDZ (431 versus 1,111), which aords signicant savings in the EOM-EE-CCSD calculations. To test the eect of additional diuse basis functions, we also carried out calculations with the above basis augmented with additionals andp diuse functions on SCN and observed only minor (< 0.1 eV) changes in excitation energies, which indicates that the diuse basis functions on the QM waters are sucient to describe the CTTS states. In the production-level EOM-EE-CCSD/MM calculations, we computed the lowest eight ex- citations (see Fig. S3). The CCSD and EOM-CCSD steps were executed in single precision, which 60 aords a signicant cut in the memory and computational time requirements compared to the double-precision execution, while introducing negligible errors[53]. The core orbitals were frozen in these calculations. We also froze all virtual orbitals with energies above 2.5 hartree; this re- sulted in signicant reduction of the computational cost while introducing negligible errors of the order of< 0.01 eV in excitation energies. In summary, the nal protocol used for the production-level EOM-EE-CCSD/MM calculations is as follows: 1. QM Region: SCN + 20 nearest waters; 2. Mixed basis set: 6-31+G* on SCN and 11 nearest waters, 6-31G on the rest; 3. MM waters: TIP3P point charges with a Gaussian blur; 4. Excited states requested: 8; 5. All core orbitals and virtual orbitals above 2.5 hartree in energy were frozen; 6. CCSD and EOM-EE-CCSD calculations were executed in single precision; 7. The stick spectra from the MD and AIMD snapshots were convoluted using normalized Gaussian functions with a full width at half maxima (FWHM) of 0.1177 eV and added to- gether to produce the cumulative spectrum. Sample inputs for the QM/MM AIMD and EOM-EE-CCSD/MM jobs are provided in the SI. All AIMD and EOM-EE-CCSD/MM calculations were carried out using the Q-Chem software[54, 55]. 61 3.3 ResultsandDiscussion 3.3.1 Equilibriumdynamics Figure3.1: RDFsofH(leftpanel)andO(rightpanel)atomsonwaterswithrespecttothe(a)S, (b)C,and(c)NatomsonSCN ,extractedfromtheMDandQM/MMAIMDsimulations. Figure 3.1 shows RDFs for distances between S, C, and N atoms of the anion and the O and H atoms of the solvating waters, obtained from the MD and AIMD simulations. The RDFs for the MD simulations were computed using the last 2 ns of the production run. For the AIMD simulations, we use the average RDFs of the individual runs. Both simulations show the eggshell- like solvation shell around SCN , i.e., featuring dierent radii of the rst solvation shell around each atom, with N having a tighter shell compared to S, and C having the largest radii, as clearly seen in Fig. 3.1 (right panel). This is consistent with the previous experimental and theoretical studies[42, 56]. 62 Table 3.1: Key RDF parameters extracted from MD and AIMD simulations and the experimen- tallyderivedvalues. RDF MD AIMD Expt. a r max , Å n tot r max , Å n tot r max , Å n tot S–O 3.07 3.94 3.34 5.31 3.1 5.17 C–O 3.35 9.60 3.72 8.13 3.3 8.05 N–O 2.77 4.49 2.90 3.09 2.8 3.1 a From Ref. 5. Figure 3.2: Average distances and standard deviations for the 11 nearest water molecules from theS(left)andN(right)endsofSCN extractedfromtheMDandAIMDsimulations. Table 3.1 compares the positions of the rst RDF maximum (r max ) and the approximate water- coordination number (n tot )—dened as the area under the rst peak—for the computed RDFs of O atoms shown in Fig. 3.1 to those from experimentally derived RDFs. We note that the lat- ter should not be taken as the exact reference, because their quality depends on the underlying renement protocols. In this case, the experimental RDFs are obtained by tting the experimen- tal neutron diraction spectra using the empirical potential structure renement technique[56]. As illustrated in the context of other structure-determination studies, using higher-level meth- ods (such as QM/MM instead of classical MD) can noticeably aect the values of the extracted structural parameters[57–60]. 63 The MD simulations reproduce experimentalr max but overestimate the total number of waters in the rst solvation shell by about 2 waters. The QM/MM AIMD simulations yield bettern tot but overestimate ther max by approximately 0.3 Å relative to the experimental values. The results in Fig. 3.1 and Table 3.1 show that the MD simulations over-structure the solvent around SCN near the N end, as revealed by the high value at the rst RDF peak. In contrast, the AIMD simulations yield fewer waters (compared to MD) in the rst solvation shell and a larger radius of the shell and, therefore, yield a less structured solvent shell. Fig. 3.2 shows the average distances and standard deviations for the 11 nearest water molecules from the S and N ends. The waters in the AIMD snapshots are, on average, farther than in the MD snapshots; the discrepancy reduces beyond 11 waters, i.e., in the second solvation shell. It also shows the dierence between the S and N ends—in the MD simulations, the closest waters to the N end show low standard deviation, meaning that they are tightly bound to the N atom in the course of the 2 ns simulation time (this is what we call over-structuring), but in the AIMD simulations the standard deviation is larger, indicating larger structural uctuations. One dierence between the MD and QM/MM AIMD simulations is that the MM waters are frozen while the AIMD waters are allowed to vibrate in the course of dynamics. In order to assess the impact of this structural constraint on individual water molecules in the MD simulations on the local structure around the anion, we performed an additional AIMD simulation with frozen bonds and angles of water molecules using the RATTLE algorithm. Fig. S6 in the SI compares the N-H and N-O RDFs of constrained water simulations with the ones from Fig. 3.1. We observe no appreciable dierence in the RDFs, which means that constraining the water bond lengths and angles is not responsible for the observed dierences in the solvent structure around the anion. 64 In summary, classical MD and QM/MM AIMD yield rather dierent local solvent structure around SCN . The spectroscopic manifestation of these dierences is discussed below. 3.3.2 UV–visspectrum Figure 3.3: NTOs for the lowest excited states from EOM-EE calculations: (a) intramolecular state;(b)s-typeCTTSstate;(c)p-typeCTTSstate;(d)p-typeCTTSstate. Table3.2: Rawdataforexcitationenergies(E ex ,eV)andoscillatorstrengths(f). MD AIMD # E ex f E ex f 1 5.91 0.17 0.001 0.002 5.79 0.20 0.0240.015 2 6.05 0.15 0.004 0.004 5.93 0.19 0.030 0.012 3 6.17 0.15 0.003 0.004 6.34 0.14 0.010 0.012 4 6.49 0.16 0.026 0.017 6.47 0.15 0.013 0.009 5 6.62 0.17 0.024 0.012 6.58 0.16 0.011 0.010 6 7.22 0.16 0.212 0.186 6.91 0.17 0.078 0.117 7 7.35 0.16 0.157 0.152 7.05 0.19 0.0360.060 8 7.51 0.13 0.356 0.227 7.32 0.17 0.252 0.162 We begin by analyzing the types and character of the excitations observed in our calculations. Fig. 3.3 shows representative natural transition orbitals (NTOs) for low-lying EOM-EE-CCSD transitions[48–50, 61]. The hole NTO is always of HOMO or HOMO-1 (of SCN ) character and the character of transitions can be assigned based on the shape of the particle NTOs. As we 65 show below, the lowest states are dominated by intramolecular -like character (arising from transition), and the higher bands are dominated by the three CTTS state of s- and p- like characters. We note that in isolated linear molecules, the ! transitions are forbidden by symmetry, however, in the asymmetric solvent environment, they become weakly allowed. As we show below, for some solvent congurations, the intra-molecular and the lowest CTTS excitations can mix, giving rise to states of a mixed character and intensity redistribution. Tables S3 and S4 in the SI show NTOs and exciton descriptors[49, 61, 62]—such as norm of 1PTDM and the participation ratio—for two representative snapshots, one from the MD and one from the AIMD simulations. Table 3.2 gives the average values and standard deviations for the excitation energies and oscillator strengths of the lowest eight transitions across all the snapshots sampled from the MD and AIMD simulations. These data suggest that the energy ordering and characters of states is dierent for MD and AIMD snapshots, as explained below. For the MD snapshots, the oscillator strengths for excitations 1, 2, and 3 show very small values that are approximately an order of magnitude lower than those for the higher transitions. The NTO analysis of these three transitions reveals the intramolecular character of these -like transitions with the particle NTOs resembling those shown in Table S3 and localized on SCN . For these snapshots, transitions 4 and 5 are CTTS transitions with ans-like particle NTOs. Tran- sitions 6 and 7 have ap z -like particle NTO aligned along the molecular axis. In contrast, transition 8 has as ap x -like particle NTO that resembles ap orbital perpendicular to the molecular axis. The smaller standard deviations in oscillator strengths for the lowest transitions are also consistent with their intramolecular character, compared to those for higher-lying CTTS transitions, which are more sensitive to the uctuations of the local solvent structure in the course of equilibrium dynamics. 66 In contrast, for the AIMD snapshots, the averages and standard deviations for oscillator strengths are lowest for excitations 3, 4, and 5 by about50% lower than those for transitions 1 and 2 (see Table 3.2). Thus, it is not possible to distinguish the intramolecular -like transi- tions from the CTTS transitions based on the oscillator strengths alone. Table S4 presents the NTO analysis for a representative AIMD snapshot. Visually, the NTO analysis suggests that transitions 1 and 2 are CTTS transitions withs-like particle NTOs. Thus, the energy ordering of excited states is dierent for snapshots from MD and AIMD simulations. Further, NTOs for tran- sitions 3, 4, 5, 6, and 7 reveal mixed intramolecular andp z -like CTTS character, which explains the non-negligible oscillator strengths for these transitions, in contrast to the pure intramolec- ular transitions from the MD snapshot. NTOs of transition 8 characterize it as ap x -like CTTS transitions with ap-like particle NTO perpendicular to the molecular axis. Thus, simple visual- ization of NTOs for AIMD snapshots is not sucient to clearly dierentiate between the states dominated by CTTS or local transitions. In order to quantify inter- and intra-molecular character of the transitions across all snap- shots, we computed the averaged values and standard deviations for ab initio descriptors[48, 61, 62] such as the hole, particle (electron), and exciton sizes (d h , d e , and d exc , respectively). The results are summarized in Fig. 3.4. d exc —dened as the root-mean-square (RMS) electron–hole distance—quanties the extent of delocalization and charge resonance between two parts of the systems; here, between the anion and the surrounding cavity. For inter-fragment transitions such as the CTTS transitions, averaged exc andd e are expected to be larger than those for local- ized intramolecular transitions. This is conrmed in Fig. 3.4; transitions 1, 2, and 3 for the MD snapshots and transitions 3, 4, and 5 for the AIMD snapshots show much smaller average exciton and particle sizes compared to those for the other transitions. The average hole sizes do not vary 67 Figure 3.4: Average exciton, hole, and particle sizes and their standard deviations for MD (left) andAIMD(right)snapshots. much, in contrast to the particle and exciton sizes, across these transitions, as also expected from the visualization of NTOs. Further, compared to MD, AIMD leads to larger exciton and particle sizes, which can be attributed to larger transient cavities around SCN in the AIMD simulations. These dierences in wave-function descriptors are, therefore, consistent with the dierences in computed RDFs in Fig. 3.1. We note that large dynamic uctuations of the electronic descriptors of the CTTS states reect their sensitivity to the shape of the transient cavities around the anion. Thus, these descriptors provide theoretical means to quantify this previously noted feature of the CTTS states[21, 33]. 68 Figure 3.5: Comparison of aqueous SCN spectra calculated at T=300 K using MD and AIMD withtheexperimentalspectrum(atT=274K)fromRef. 1. Table3.3: Comparisonofpeakpositions(eV)ofcomputedandexperimentalUV–visspectraof SCN (aq) . Peak Character MD AIMD Expt. a 1 s-type CTTS 6.52 5.89 5.78 2 p-type CTTS 7.24 6.93 6.86 3 p-type CTTS 7.47 7.24 7.14 4 Intramolecular 5.90 6.45 - a From Ref. 1. Figure 3.5 shows the spectra computed using snapshots from MD and AIMD simulations and compares them with the experimental absorption spectrum of aqueous tetramethylammonium thiocyanate by Fox et. al[19]. We note that the experimental spectrum was recorded at 274 K, whereas the simulations were carried out at 300 K. Based on the data from Ref. 1, at room temperature, the position of the lowest band should blue shift by 0.01 eV and the position of the second band red shifts by 0.06 eV. Fox et. al reported four sets of numbers (see Table S2). One set corresponds to the low- temperature spectrum (T=274 K, the one shown in Fig. 3.5), but the three other sets presumably correspond to 298 K and the reason for the discrepancies between them is not clear. Therefore, 69 the comparison between the theory and experiment is not straightforward. In addition to this issue, we note that the experimental spectrum stops while the band intensity is still rising, so the estimated position and the intensity of the third peak may not be reliable. Moreover, the experimental paper also cautions that due to the low intensity of the lowest band, the tting was ambiguous. Foxet. al assigned the peaks at 5.78 eV and 6.86 eV as CTTS excitations and provided evidence of a third peak at 7.14 eV. The assignment of the bands was based on temperature and solvent sensitivity of the peaks[19]. Table 3.3 summarizes the computed peak positions with the corre- sponding experimental values (we use the set of numbers given in the Conclusion section of Ref. 1). The theoretical values in Table 3.3 were obtained by tting the computed spectra with four gaussians (see SI). Relative to the MD spectra, thes-type CTTS excitation is red-shifted by0.6 eV,p-type CTTS is red-shifted by0.3 eV, and the intramolecular excitation is blue-shifted by0.5 eV in the AIMD spectra. This indicates that the CTTS-type excitations are stabilized in the QM/MM AIMD simulations, thereby, shifting to lower energies. Table 3.3 also shows that the AIMD peaks match well with the experimental peaks; the discrepancies in the peak positions are within the error bars of the quantum-chemistry method. Finally, we correlate the structural dierences in solvation around SCN in the MD and AIMD simulations with the dierences in the UV–vis spectra. The AIMD simulations show a higher r max than MD simulations, which means that the rst solvation shell is larger in the former. This stabilizes the diuse CTTS states by reducing the Pauli repulsion between the extended electron density of the CTTS state and the surrounding water molecules. We also argue that the diuse s-type CTTS excitations are aected more by this stabilization than the directionalp-type CTTS 70 excitations, as indicated by a larger red shift of the s-type CTTS state. The blue shift for the intramolecular excitation can be explained in terms of the stabilization of these compact states by tighter solvation in the MD simulations. 3.4 Conclusions We investigated the CTTS states of aqueous thiocyanate by means of high-level electronic struc- ture calculations. We considered two equilibrium dynamics simulations for SCN (aq) : force-eld- based MD and hybrid QM/MM AIMD calculations. The two simulations result in rather dier- ent local structures of water around the anion. The structural parameters show that MD over- structures waters around nitrogen and overestimates the numbers of water in the rst solvation shell. In contrast, AIMD reproduces the number of waters in the rst solvation shell while over- estimating its radius compared to experimentally derived results. Importantly, AIMD results in less rigid solvent shell, giving rise to larger structural uctuations. From the two sets of snapshots sampled from the MD and AIMD trajectories, we computed excited states using a QM/MM electrostatic embedding scheme wherein we used the EOM-EE- CCSD method for the QM treatment. We identied a convergent setup for the QM and MM subsystems, such that the anion and its two solvation shells comprising 20 nearest waters are treated quantum mechanically. Our results indicate that the lowest excited states computed from the MD and AIMD snapshots show signicant dierences in the average peak positions and os- cillator strengths, which we attributed to the dierent solvent structures around the anion. We augmented these results with the analysis of NTOs for these transitions. For the MD snapshots, the intramolecular and CTTS transitions are easy to distinguish by visualization of the NTOs. 71 In contrast, we nd that many transitions from the AIMD snapshots have mixed intramolecular and CTTS character. Thus, simple visualization of NTOs is insucient for the characterization of individual transitions. Moreover, visual inspection of each snapshot from the simulations is impractical. We demonstrated that wave-function descriptors such as the hole, particle, and ex- citon sizes allow for a more thorough dierentiation between CTTS and local transitions across all snapshots. The computed MD and AIMD spectra are also dierent. We correlated the computed struc- tural dierences in the two sampling methods with the dierences in the peak positions of CTTS and local transitions. The over-structuring of solvent molecules and smaller cavity around the anion computed in the MD simulations over-stabilizes the intramolecular transitions on the an- ion but destabilizes the diuse CTTS states due to the electronic repulsion with solvated waters. We show that the computed UV–vis spectrum with the AIMD snapshots agrees reasonably well with the experimental spectrum. Our results suggest that the force-eld parameters used in our MD simulations are inadequate to describe the hydration of SCN anion and also present an argument in favor of the more computationally expensive AIMD methods in further studies. In agreement to previous studies[21, 33], our results show that CTTS excitations are strongly sensitive to the local structure of water around the anion; therefore, robust modeling of spectro- scopic processes involving these excitations requires an accurate description of solvation dynam- ics of these anions. Conversely, spectroscopies utilizing these states provide a sensitive experi- mental probe of both the local structure of the solvent around anionic solutes and of the quality of force elds and density functionals; we propose to explore this idea in future studies. Our observations also suggest that spectroscopies based on the CTTS transitions can be used to gain 72 detailed insight into local structure of the solvent—if supplemented by reliable modeling. While this idea—learn about microscopic structure from simulations and use spectroscopic observables to validate the simulations—is the same as traditionally used in many spectroscopic studies, the distinction here is that for other types of electronic transitions, inhomogeneous broadening is often hiding the imperfections of the simulations. For example, in our previous studies of sol- vated species (CN, phenol, phenolate, thymine, glycine, water)[45–47, 63, 64], we were getting a reasonably good agreement with the experimental photoelectron spectra (and redox potentials) with equilibrium sampling using classical force elds. In contrast, strong sensitivity of the CTTS states to the shape of the transient cavities exposes the deciencies of theoretical models. Our study also highlights the outstanding diculties in condensed-phase studies. In addition to the limitations of the theoretical methods, the comparison with experimentally derived values is often not straightforward. In the context of solvated anions, we note that the quality of the experimental RDFs depends on the simulation protocols used in the structure rening procedures. The interpretation of the experimental spectra in terms of the positions and intensities of the specic bands depends on tting procedures, and so on. In addition, the comparisons are often hindered by uncertainties in the calibration procedures, which translate into the uncertainties in the experimental intensities, and by a limited range of energies. Thus, more work is needed from both the experimental and theoretical sides, in order to fully understand even such simple system as solvated SCN . We hope that the progress in theoretical capabilities, as illustrated by the present simulations, will inspire further experimental eorts. In summary, we have developed a robust computational protocol for modeling the UV–vis spectrum of the aqueous thiocyanate anion and presented a toolkit for a rst-principles char- acterization of CTTS transitions. Our strategy of developing a reliable computational protocol 73 for modeling the UV–vis spectrum of SCN (aq) can be extended to other aqueous anionic systems and to modeling other spectroscopic signals, such as nonlinear 2PA and SFG spectra. Thus, this work lays the foundation for modeling condensed-phase spectra of anions, which involve both intra-molecular and the CTTS bands, by high-level electronic structure method such as EOM-EE- CCSD. 3.5 AppendixA:Protocolforexcited-statescalculations We systematically varied the number of waters treated explicitly in the QM region, basis sets, and the number of excited states calculated to understand their impact on the UV–vis spectra of SCN (aq) . For these exploratory calculations, we used the conguration interaction singles (CIS) method instead of the more expensive EOM-EE-CCSD method, because its lower cost allowed us to run the excited-state calculations with a broader range of parameters, which helped iden- tify a convergent protocol. In these tests, we used snapshots obtained from the force-eld MD calculations. We describe the tests below. 3.5.1 NumberofwatersintheQMsubsystem Figure 3.6 compares CIS spectra computed with an increasing number of waters in the QM part. The number of waters selected was according to the radial distribution function (RDF) informa- tion of SCN (aq) . We considered the minimal QM subsystem that consisted of the anion and 8 closest waters, consistent with the RDFs, which show8 waters in the rst solvation shell. We then included up to 32 waters, thus going beyond two solvation shells, which has24 waters. We used the aug-cc-pVDZ basis set for all the QM atoms in this calculation. The rest of the waters 74 are described by point charges corresponding to the TIP3P model. The spectra were computed using 80 snapshots from the MD simulations. Fig. 3.6 shows that the CIS/MM spectrum converges with 20 waters in the QM subsystem, i.e., the dierence between the spectra obtained with 20 nearest waters and 32 nearest waters in the QM subsystem is small. Figure3.6: CIS/aug-cc-pVDZspectrawithanincreasingnumberofwatersintheQMregion. 3.5.2 Basisset Next, we tested the eect of the basis set on the spectra by using two dierent schemes: • Bas1: (mixed basis set) aug-cc-pVDZ on SCN and several nearby waters and cc-pVDZ on the rest; • Bas2: (mixed basis set) 6-31+G* on SCN and several nearby waters and 6-31G on the rest. Our reference spectrum is the full aug-cc-pVDZ basis on all QM atoms. The notation used in Fig. 3.7 is: 75 • Bas1:24–24 = 24 waters in the QM region and aug-cc-pVDZ on all 24 waters; • Bas1:24–11 = 24 waters in the QM region and aug-cc-pVDZ on 11 waters closest to SCN and cc-pVDZ on the rest. The same scheme is followed for Bas2. We list the total number of basis functions in parenthesis. Fig. 3.7 shows that in the same basis scheme, using mixed basis sets does not aect the spec- trum signicantly. However, when changing the basis scheme from aug-cc-pVDZ to 6-31+G*, we observe a blue shift of almost0.1 eV. The total number of basis functions is reduced from 877 (in the mixed Bas1 scheme) to 431 (in the Bas2:20-11 scheme), which aords signicant reduction in cost, especially in EOM-EE calculations. The eect of addional diuse functions on the S, C, and N atoms was negligible, with energy dierence less than 0.1 eV. Figure3.7:CISspectrawithdierentbasesontheQMatoms.Thetotalnumberofbasisfunctions isgiveninparenthesis. 76 3.5.3 Numberofexcitedstatescomputed Figure 3.8 compares the spectra calculated using 10 excited states and 8 excited states. There is no signicant dierence between the two spectra and, therefore, we computed EOM-EE-CCSD spectra by requesting 8 excited states only. In these calculations, we used the Bas2:20-11 scheme. Figure3.8: CIS/Bas2:20-11spectracomputedwith8and10excitedstatespersnapshot. 3.5.4 CISversusEOM Fig. 3.9 compares the CIS spectra with the EOM-EE spectra, which veries that the character of main bands is not altered. We observe that the shape of the CIS and EOM-EE-CCSD spectra are similar, and that the CIS spectrum is blue-shifted relative to EOM-EE-CCSD. 3.5.5 QM-MMwaterexchange In the course of the AIMD simulation, waters in the initial QM region may exchange with the waters in the MM region. Fig. 3.10 shows the histogram of the number of MM waters entering the 77 Figure3.9: CISversusEOMspectracalculatedusingtheBas2:20-11schemewith8excitedstates over80snapshots. QM region across all 84 AIMD snapshots. We observe that on average 3 waters were exchanged and that all the exchanged waters are in the outer solvation shell. Figure3.10: HistogramofMMwatersreplacingtheQMwaters. 78 3.6 AppendixB:RigidwatersinQM/MMAIMD In our AIMD simulations, we did not constrain the structure of water molecules, whereas the MD simulations used waters with xed bond lengths and angles. In order to check whether the rigid water structure aected the RDFs, we carried out a 3 ps QM/MM AIMD calculation with xed water bonds and angles using the RATTLE algorithm in Q-Chem. Fig. 3.11 compares the N–H and N–O RDFs of this run with the AIMD RDFs with no constraints on the individual water structure. The RDFs for the constrained simulation are noisier because they were taken from only one simulation whereas the RDFs for the unconstrained were averaged over seven AIMD runs. Importantly, there is no signicant dierence between the two sets of RDFs—the rst peaks and valleys coincide and, in contrast to MD RDFs, we do not see over-structuring. Figure 3.11: N–H (left) and N–O (right) RDFs for AIMD runs with and without constraints on waterbondsandangles. 79 3.7 AppendixC:Calculationsandanalysisofthespectra The stick spectra from the MD and AIMD snapshots were convoluted using normalized Gaussian functions with a full width at half maxima (FWHM) of 0.1177 eV as follows: g(x;x 0 ) = 1 p 2 exp 0:5(xx 0 ) 2 = 2 ; (3.1) FWHM = 2 p 2 ln(2) 2:35482; (3.2) wherex 0 is the peak position. The cumulative spectra were computed as follows: I(E) = X N X i=1 g(E;E N i )f N i ; (3.3) where the rst sum runs over the snapshots and the second sum runs over the number of com- puted excited states;E N i andf N i denote the excitation energy and oscillator strength ofi th excited state from snapshotN. To compare with the experimentally derived peak positions, we mimicked the experimental procedure[19] and tted our computed cummulative spectra with four gaussians (we varied the gaussian positions, widths, and weights). Figure 3.12 shows the tting results for the MD and AIMD spectra along with the excitations reported in Table II of the main text as sticks and their standard deviations as bar plots. The dotted lines represent the gaussians used to t each spectra and the dashed line is the sum of the gaussians. The parameters for these gaussians are listed in Table 3.4. The results of this t are reported in Table III of the main draft. 80 Figure3.12: FittingtheMD(left)andAIMD(right)spectrawithgaussians. Table3.4: ParametersforgaussiansusedforttinginFig. 3.12. MD AIMD Gaussian x 0 f x 0 f 1 5.90 0.17 0.009 5.89 0.18 0.120 2 6.52 0.23 0.080 6.45 0.20 0.085 3 7.24 0.12 0.370 6.93 0.20 0.140 4 7.47 0.14 0.570 7.24 0.21 0.640 Table 3.5 summarizes the results from the Fox et. al study[19]. They report four sets of num- bers. One set refers to two lower temperatures. The origin of the discrepancies between the three other sets is unclear. Table3.5: Peakpositions(eV)ofthethreelowestbands a ofaqueousthiocyanateasreportedby Foxet. alstudy[19]. Oscillatorstrengthsaregiveninparenthesis. Where mentioned A D E Abstract 5.703 6.633 7.067 Text b , T=274 K 5.729 (0.066) 6.694 (0.395) 7.141 (0.414) Table 1, T=298 K 5.739 (0.104) 6.638 (0.449) - Conclusions 5.778 6.856 7.142 a Using labels from Ref. 1. Additionally, a low-intensity transition estimated to be at 4.463 eV and the next bright transition estimated to be at 7.687 eV are mentioned. b These values were derived from the spectrum shown Fig. 1 in Ref. 1. 81 3.8 AppendixD:Excited-stateanalysis Table 3.6: Wave-function analyses of the reduced 1PTDMs corresponding to the lowest eight excitationsintheEOM-EE-CCSD/MMcalculationsforarepresentativesnapshotsampledfrom the MD simulations. The wave-function analysis for each 1PTDM includes its norm (jj jj) and participationratio(PR)[??].OnlythedominantholeandparticleNTOsareshown.Thesingular values( K )correspondtonormalized1PTDMs. Isosurfaceis0.015. Exc. jj jj PR 2 K Hole Particle # NTO NTO 1 0.82 1.0 0.99 2 0.82 2.0 0.50 0.49 3 0.82 1.0 0.99 4 0.86 1.1 0.96 Exc. jj jj PR 2 K Hole Particle # NTO NTO 5 0.86 1.0 0.98 6 0.86 1.1 0.98 7 0.86 1.1 0.95 8 0.86 1.2 0.95 82 Table 3.7: Wave-function analyses of the reduced 1PTDMs corresponding to the lowest eight excitationsintheEOM-EE-CCSD/MMcalculationsforarepresentativesnapshotsampledfrom AIMD simulations. The wave-function analysis for each 1PTDM includes its norm (jj jj) and participationratio(PR)[??].OnlythedominantholeandparticleNTOsareshown.Thesingular values( K )correspondtonormalized1PTDMs. Isosurfaceis0.015. Exc. jj jj PR 2 K Hole Particle # NTO NTO 1 0.86 1.0 0.99 2 0.86 1.0 0.99 3 0.84 1.0 0.99 4 0.84 1.7 0.70 0.29 Exc. jj jj PR 2 K Hole Particle # NTO NTO 5 0.86 1.1 0.97 6 0.85 1.3 0.86 0.13 7 0.84 1.1 0.97 8 0.86 1.2 0.92 83 3.9 AppendixE:SampleinputsforQM/MMAIMDandEOM calculations AIMD sample input for NVE simulation. $rem JOBTYPE = aimd METHOD = wB97XD BASIS = 6-31+G* SCF_ALGORITHM = gdm MAX_SCF_CYCLES = 200 THRESH = 14 ! AIMD keywords TIME_STEP = 42 AIMD_STEPS = 3000 AIMD_INIT_VELOC = thermal AIMD_TEMP = 300 USER_CONNECT = true MM_SUBTRACTIVE = true QM_MM_INTERFACE = janus FORCE_FIELD = charmm27 ! Chelpg keywords CHELPG = true CHELPG_DX = 10 CHELPG_HEAD = 30 CHELPG_H = 50 CHELPG_HA = 50 ! Turns EWALD on EWALD_ON = true ! Additional keywords SYM_IGNORE = true NO_REORIENT = true QMMM_PRINT = 1 AIMD_PRINT = 2 $end $forceman ewald alpha .1558 .0488 box_length 34.3192 34.3192 34.3192 $end 84 EOM-EE-CCSD sample input. $rem ! Standard keywords for running EOM+Janus jobs BASIS = mixed SCF_GUESS = core SCF_CONVERGENCE = 8 MAX_SCF_CYCLES = 100 THRESH = 14 METHOD = eom-ccsd EE_SINGLETS = [8] MEM_TOTAL = 220000 N_FROZEN_CORE = 28 N_FROZEN_VIRTUAL = 47 ! Single precision + Speeding up EOM/CCSD calculation keywords CC_SINGLE_PREC = 1 EOM_SINGLE_PREC = 1 EOM_ARESP_SINGLE_PREC = 1 CC_SP_DM = 1 EOM_NGUESS_SINGLES = 9 EOM_PRECONV_SINGLES = 1 CC_ERASE_DP_INTEGRALS = 1 CC_BACKEND = XM ! QM-MM keywords FORCE_FIELD = charmm27 QM_MM_INTERFACE = janus USER_CONNECT = TRUE MODEL_SYSTEM_MULT = 1 MODEL_SYSTEM_CHARGE = -1 GAUSSIAN_BLUR = true GAUSS_BLUR_WIDTH = 10000 ! Additional keywords SYM_IGNORE = true NO_REORIENT = true ! For calculation of NTOs CC_TRANS_PROP = 1 STATE_ANALYSIS = TRUE MOLDEN_FORMAT = TRUE NTO_PAIRS = 4 $end 85 Bibliography [1] Y. Zhang and P.S. Cremer, Interactions between macromolecules and ions: the Hofmeister series, Curr. Opin. Chem. Biol.10, 658 (2006). [2] D.J. Tobias, A.C. Stern, M.D. Baer, Y. Levin, and C.J. Mundy, Simulation and theory of ions at atmospherically relevant aqueous liquid–air interfaces, Annu. Rev. Phys. Chem.64, 339 (2013). [3] S. Yoshimoto and K. Itaya, Adsorption and assembly of ions and organic molecules at elec- trochemical interfaces: Nanoscale aspects, Annu. Rev. Anal. Chem.6, 213 (2013). [4] M.A. Shannon, P.W. Bohn, M.Elimelech, J.G. Georgiadis, B.J. Mari nas, and A.M. Mayes, Science and technology for water purication in the coming decades, Nature452, 301 (2008). [5] J.A. Cray, A. Stevenson, P. Ball, S.B. Bankar, E. CA Eleutherio, T.C. Ezeji, R.S. Singhal, J.M. Thevelein, D.J. Timson, and J.E. Hallsworth, Chaotropicity: a key factor in product tolerance of biofuel-producing microorganisms, Curr. Opin. Biotechnol.33, 228 (2015). [6] D.J. Timson, The roles and applications of chaotropes and kosmotropes in industrial fer- mentation processes, World J. Microbiol. Biotechnol.36, 89 (2020). [7] L. Zongo, H. Lange, and C. Crestini, A study of the eect of kosmotropic and chaotropic ions on the release characteristics of lignin microcapsules under stimuli-responsive conditions, ACS Omega4, 6979 (2019). [8] M. Smith and M. C. R. Symons, Solvation spectra. Part 1. The eect of environmental changes upon the ultra-violet absorption of solvated iodide ions, Trans. Faraday Soc.54, 338 (1958). [9] G. Stein and A. Treinin, Electron-transfer spectra of anions in solution. Part 1. Absorption spectra and ionic radii, Trans. Faraday Soc.55, 1086 (1959). [10] M.J. Blandamer and M.F. Fox, Theory and applications of charge-transfer-to-solvent spectra, Chem. Rev.70, 59 (1970). [11] W. S. Sheu and P. J. Rossky, The electronic dynamics of photoexcited aqueous iodide, Chem. Phys. Lett.202, 186 (1993). [12] W. S. Sheu and P. J. Rossky, Dynamics of electron photodetachment from an aqueous halide ion, Chem. Phys. Lett.213, 233 (1993). 86 [13] W. S. Sheu and P. J. Rossky, Electronic and solvent relaxation dynamics of a photoexcited aqueous halide, J. Phys. Chem.100, 1295 (1996). [14] A. Staib and D. Borgis, Reaction pathways in the photodetachment of an electron from aqueous chloride: A quantum molecular dynamics study, J. Chem. Phys.104, 9027 (1996). [15] R. Lian, D. A. Oulianov, R. A. Crowell, I. A. Shkrob, X. Chen, and S. E. Bradforth, Elec- tron photodetachment from aqueous anions. 3. Dynamics of geminate pairs derived from photoexcitation of mono- vs polyatomic anions, J. Phys. Chem. A110, 9071 (2006). [16] H. Iglev, M. K. Fischer, and A. Laubereau, Electron detachment from anions in aqueous solutions studied by two- and three-pulse femtosecond spectroscopy, Pure & Appl. Chem. 82, 1919 (2010). [17] T. R. Griths and M. C. R. Symons, Solvation spectra. Part 3. Further studies of the eect of environmental changes on the ultra-violet spectrum of iodide ions, Trans. Faraday Soc. 56, 1125 (1960). [18] M. Luria and A. Treinin, Photochemistry of thiocyanate ion in solution, J. Phys. Chem.72, 305 (1968). [19] M. F. Fox, C. B. Smith, and E. Hayon, Far-ultraviolet solution spectroscopy of thiocyanate, J. Chem. Soc., Faraday Trans.77, 1497 (1981). [20] M. F. Fox and E. Hayon, Far ultraviolet solution spectroscopy of the iodide ion, J. Chem. Soc., Faraday Trans.73, 1003 (1977). [21] S.E. Bradforth and P. Jungwirth, Excited states of iodide anions in water: A comparison of the electronic structure in clusters and in bulk solution, J. Phys. Chem. A106, 1286 (2002). [22] T. W. Marin, I. Janikand, and D. M. Bartels, Ultraviolet charge-transfer-to-solvent spec- troscopy of halide and hydroxide ions in subcritical and supercritical water, Phys. Chem. Chem. Phys.21, 24419 (2019). [23] P. B. Petersen, R. J. Saykally, M. Mucha, and P. Jungwirth, Enhanced concentration of polar- izable anions at the liquid water surface: SHG spectroscopy and MD simulations of sodium thiocyanide, J. Phys. Chem. B109, 10915 (2005). [24] R. M. Onorato, D. Otten, and R. J. Saykally, Adsorption of thiocyanate ions to the dode- canol/water interface characterized by UV second harmonic generation, Proc. Nat. Acad. Sci.106, 15176 (2009). [25] A. M. Rizzuto, S. Irgen-Gioro, A. Eftekhari-Bafrooei, and R. J. Saykally, Broadband deep UV spectra of interfacial aqueous iodide, J. Phys. Chem. Lett.7, 3882 (2016). [26] H. Mizuno, A. M. Rizzuto, and R. J. Saykally, Charge-transfer-to-solvent spectrum of thio- cyanate at the air/water interface measured by broadband deep ultraviolet electronic sum frequency generation spectroscopy, J. Phys. Chem. Lett.9, 4753 (2018). 87 [27] D. Bhattacharyya, H. Mizuno, A.M. Rizzuto, Y. Zhang, R.J. Saykally, and S.E. Bradforth, New insights into the charge-transfer-to-solvent spectrum of aqueous iodide: Surface versus bulk, J. Phys. Chem. Lett.11, 1656 (2020). [28] K. Carter-Fenk, C. J. Mundy, and J. M. Herbert, Natural charge-transfer analysis: Eliminating spurious charge-transfer states in time-dependent density functional theory via diabatiza- tion, with application to projection-based embedding, J. Chem. Theory Comput. 17, 4195 (2021). [29] L. Dogliotti and E. Hayon, Flash photolysis study of sulte, thiocyanate, and thiosulfate ions in solution, J. Phys. Chem.72, 1800 (1968). [30] M. R. Waterland and A. M. Kelley, Resonance Raman and ab initio studies of the electronic transitions of aqueous azide anion, J. Phys. Chem. A105, 8385 (2001). [31] J. A. Kloepfer, V. H. Vilchiz, V. A. Lenchenkov, and S. E. Bradforth, Femtosecond dynamics of photodetachment of the iodide anion in solution: resonant excitation into the charge- transfer-to-solvent state, Chem. Phys. Lett.298, 120 (1998). [32] A. E. Bragg and B. J. Schwartz, The ultrafast charge-transfer-to-solvent dynamics of io- dide in tetrahydrofuran. 1. exploring the roles of solvent and solute electronic structure in condensed-phase charge-transfer reactions, J. Phys. Chem. B112, 483 (2008). [33] V. T. Pham, I. Tavernelli, C. J. Milne, R. M. van der Veen, P. D’Angelo, Ch. Bressler, and M. Chergui, The solvent shell structure of aqueous iodide: X-ray absorption spectroscopy and classical, hybrid QM/MM and full quantum molecular dynamics simulations, Chem. Phys.371, 24 (2010). [34] H. Okuyama, Y. Suzuki, S. Karashima, and T. Suzuki, Charge-transfer-to-solvent reac- tions from i- to water, methanol, and ethanol studied by time-resolved photoelectron spec- troscopy of liquids, J. Chem. Phys.145, 074502 (2016). [35] P. L. Nostro and B. W. Ninham, Hofmeister phenomena: An update on ion specicity in biology, Chem. Rev.112, 2286 (2012). [36] L. C. Smeeton, J. C. Hey, and R. L. Johnston, Investigation of the structures and energy landscapes of thiocyanate-water clusters, Inorganics5, 20 (2017). [37] P. E. Mason, G. W. Neilson, C. E. Dempsey, A. C. Barnes, and J. M. Cruickshank, The hy- dration structure of guanidinium and thiocyanate ions: Implications for protein stability in aqueous solution, Proc. Nat. Acad. Sci.100, 4557 (2003). [38] P. Mark and L. Nilsson, Structure and dynamics of the TIP3P, SPC, and SPC/E water models at 298 K, J. Phys. Chem. A105, 9954 (2001). [39] G. Tesei, V. Aspelin, and M. Lund, Specic cation eects on SCN in bulk solution and at the air-water interface, J. Phys. Chem. B122, 5094 (2018). 88 [40] J.-D. Chai and M. Head-Gordon, Systematic optimization of long-range corrected hybrid density functionals, J. Chem. Phys.128, 084106 (2008). [41] J.-D. Chai and M. Head-Gordon, Long-range corrected hybrid density functionals with damped atom-atom dispersion interactions, Phys. Chem. Chem. Phys.10, 6615 (2008). [42] M. D. Baer and C. J. Mundy, An ab initio approach to understanding the specic ion eect, Faraday Discuss.160, 89 (2013). [43] J. F. Stanton and R. J. Bartlett, The equation of motion coupled-cluster method. A systematic biorthogonal approach to molecular excitation energies, transition probabilities, and excited state properties, J. Chem. Phys.98, 7029 (1993). [44] A. I. Krylov, Equation-of-motion coupled-cluster methods for open-shell and electronically excited species: The hitchhiker’s guide to Fock space, Annu. Rev. Phys. Chem.59, 433 (2008). [45] P. A. Pieniazek, S. E. Bradforth, and A. I. Krylov, Spectroscopy of the cyano radical in an aqueous environment, J. Phys. Chem. A110, 4854 (2006). [46] D. Ghosh, A. Roy, R. Seidel, B. Winter, S. E. Bradforth, and A. I. Krylov, A rst-principle protocol for calculating ionization energies and redox potentials of solvated molecules and ions: Theory and application to aqueous phenol and phenolate, J. Phys. Chem. B116, 7269 (2012). [47] A. Sadybekov and A. I. Krylov, Coupled-cluster based approach for core-ionized and core- excited states in condensed phase: Theory and application to dierent protonated forms of aqueous glycine, J. Chem. Phys.147, 014107 (2017). [48] A. V. Luzanov and O. A. Zhikol, Excited state structural analysis: TDDFT and related models, in Practical aspects of computational chemistry I: An overview of the last two decades and current trends, edited by J. Leszczynski and M.K. Shukla, pages 415–449. Springer, 2012. [49] S. A. Bäppler, F. Plasser, M. Wormit, and A. Dreuw, Exciton analysis of many-body wave functions: Bridging the gap between the quasiparticle and molecular orbital pictures, Phys. Rev. A90, 052521 (2014). [50] A. I. Krylov, From orbitals to observables and back, J. Chem. Phys.153, 080901 (2020). [51] J. C. Phillips, R. Braun, W. Wang, J. Gumbart, E. Tajkhorshid, E. Villa, C. Chipot, R.D. Skeel, L. Kale, and K. Schulten, Scalable molecular dynamics with NAMD, J. Comput. Chem.26, 1781 (2005). [52] L. Martńez, R. Andrade, E. G. Birgin, and J. M. Martínez, Packmol: A package for building initial congurations for molecular dynamics simulations, J. Comput. Chem.30, 2157 (2009). [53] P. Pokhilko, E. Epifanovskii, and A. I. Krylov, Double precision is not needed for many-body calculations: Emergent conventional wisdom, J. Chem. Theory Comput.14, 4088 (2018). 89 [54] A. I. Krylov and P. M. W. Gill, Q-Chem: An engine for innovation, WIREs: Comput. Mol. Sci.3, 317 (2013). [55] E. Epifanovsky, T. B. Gilbert, X. Feng, J. Lee, Y. Mao, N. Mardirossian, P. Pokhilko, A. F. White, M. P. Coons, A. L. Dempwol, Z. Gan, D. Hait, P. R. Horn, L. D. Jacobson, I. Kali- man, J. Kussmann, A. W. Lange, K. U. Lao, D. S. Levine, J. Liu, S. C. McKenzie, A. F. Mor- rison, K. D. Nanda, F. Plasser, D. R. Rehn, M. L. Vidal, Z.-Q. You, Y. Zhu, B. Alam, B. J. Al- brecht, A. Aldossary, E. Alguire, J. H. Andersen, V. Athavale, D. Barton, K. Begam, A. Behn, N. Bellonzi, Y. A. Bernard, E. J. Berquist, H. G. A. Burton, A. Carreras, K. Carter-Fenk, R. Chakraborty, A. D. Chien, K. D. Closser, V. Cofer-Shabica, S. Dasgupta, M. de Wergi- fosse, J. Deng, M. Diedenhofen, H. Do, S. Ehlert, P.-T. Fang, S. Fatehi, Q. Feng, T. Friedho, J. Gayvert, Q. Ge, G. Gidofalvi, M. Goldey, J. Gomes, C. E. González-Espinoza, S. Gulania, A. O. Gunina, M. W. D. Hanson-Heine, P. H. P. Harbach, A. Hauser, M. F. Herbst, M. Hernán- dez Vera, M. Hodecker, Z. C. Holden, S. Houck, X. Huang, K. Hui, B. C. Huynh, M. Ivanov, A. Jász, H. Ji, H. Jiang, B. Kaduk, S. Kähler, K. Khistyaev, J. Kim, G. Kis, P. Klunzinger, Z. Koczor-Benda, J. H. Koh, D. Kosenkov, L. Koulias, T. Kowalczyk, C. M. Krauter, K. Kue, A. Kunitsa, T. Kus, I. Ladjánszki, A. Landau, K. V. Lawler, D. Lefrancois, S. Lehtola, R. R. Li, Y.-P. Li, J. Liang, M. Liebenthal, H.-H. Lin, Y.-S. Lin, F. Liu, K.-Y. Liu, M. Loipersberger, A. Luenser, A. Manjanath, P. Manohar, E. Mansoor, S. F. Manzer, S.-P. Mao, A. V. Marenich, T. Markovich, S. Mason, S. A. Maurer, P. F. McLaughlin, M. F. S. J. Menger, J.-M. Mewes, S. A. Mewes, P. Morgante, J. W. Mullinax, K. J. Oosterbaan, G. Paran, A. C. Paul, S. K. Paul, F. Pavošević, Z. Pei, S. Prager, E. I. Proynov, A. Rák, E. Ramos-Cordoba, B. Rana, A. E. Rask, A. Rettig, R. M. Richard, F. Rob, E. Rossomme, T. Scheele, M. Scheurer, M. Schneider, N. Ser- gueev, S. M. Sharada, W. Skomorowski, D. W. Small, C. J. Stein, Y.-C. Su, E. J. Sundstrom, Z. Tao, J. Thirman, G. J. Tornai, T. Tsuchimochi, N. M. Tubman, S. P. Veccham, O. Vydrov, J. Wenzel, J. Witte, A. Yamada, K. Yao, S. Yeganeh, S. R. Yost, A. Zech, I. Y. Zhang, X. Zhang, Y. Zhang, D. Zuev, A. Aspuru-Guzik, A. T. Bell, N. A. Besley, K. B. Bravaya, B. R. Brooks, D. Casanova, J.-D. Chai, S. Coriani, C. J. Cramer, G. Cserey, A. E. DePrince, R. A. DiStasio, A. Dreuw, B. D. Dunietz, T. R. Furlani, W. A. Goddard, S. Hammes-Schier, T. Head-Gordon, W. J. Hehre, C.-P. Hsu, T.-C. Jagau, Y. Jung, A. Klamt, J. Kong, D. S. Lambrecht, W. Liang, N. J. Mayhall, C. W. McCurdy, J. B. Neaton, C. Ochsenfeld, J. A. Parkhill, R. Peverati, V. A. Rassolov, Y. Shao, L. V. Slipchenko, T. Stauch, R. P. Steele, J. E. Subotnik, A. J. W. Thom, A. Tkatchenko, D. G. Truhlar, T. Van Voorhis, T. A. Wesolowski, K. B. Whaley, H. L. Wood- cock, P. M. Zimmerman, S. Faraji, P. M. W. Gill, M. Head-Gordon, J. M. Herbert, and A. I. Krylov, Software for the frontiers of quantum chemistry: An overview of developments in the Q-Chem 5 package, J. Chem. Phys.155, 084801 (2021). [56] A. Botti, S. E. Pagnotta, F. Bruni, and M. A. Ricci, Solvation of KSCN in water, J. Phys. Chem. B113, 10014 (2009). [57] H. M. Senn and W. Thiel, QM/MM methods for biomolecular systems, Angew. Chem., Int. Ed.48, 1198 (2009). [58] Y.-W. Hsiao, Y. Tao, J. E. Shokes, R. A. Scott, and U. Ryde, EXAFS structure renement supplemented by computational chemistry, Phys. Rev. B74, 214101 (2002). 90 [59] A. Genoni, L. Bučinský, N. Claiser, J. Contreras-Garcíia, B. Dittrich P. M. Dominiak, E. Es- pinosa, C. Gatti amd P. Giannozzi, J.-M. Gillet, D. Jayatilaka, P. Macchi, A. O. Madsen, L. Massa, C. F. Matta, K. M. Merz Jr., P. N. H. Nakashima, H. Ott, U. Ryde, K. Schwarz, M. Sierka, and S. Grabowsky, Quantum crystallography: Current developments and future perspectives, Chem. Eur. J.24, 10881 (2018). [60] J. Bergmann, E. Oksanen, and U. Ryde, Combining crystallography with quantum mechan- ics, Curr. Op. in Struct. Biol.72, 18 (2022). [61] F. Plasser, A. I. Krylov, and A. Dreuw, libwfa: Wavefunction analysis tools for excited and open-shell electronic states, WIREs: Comput. Mol. Sci.12, e1595 (2022). [62] S. Mewes, F. Plasser, A. Krylov, and A. Dreuw, Benchmarking excited-state calculations using exciton properties, J. Chem. Theory Comput.14, 710 (2018). [63] D. Ghosh, O. Isayev, L. V. Slipchenko, and A. I. Krylov, The eect of solvation on vertical ionization energy of thymine: From microhydration to bulk, J. Phys. Chem. A 115, 6028 (2011). [64] S. Gozem, R. Seidel, U. Hergenhahn, E. Lugovoy, B. Abel, B. Winter, A. I. Krylov, and S. E. Bradfort, Probing the electronic structure of bulk water at the molecular length scale with angle-resolved photoelectron spectroscopy, J. Phys. Chem. Lett.11, 5162 (2020). 91 Chapter4 Investigationofnewalgorithmsfortensorcontraction usingSTRUMPACKfortheirpotentialapplicationin coupled-clustercalculations 4.1 Introduction Algorithms for ecient storage and computation involving large matrices are important in many applications[1–3]. The cost of standard techniques for matrix operations such as matrix-vector multiplication or matrix inversion scales prohibitively with matrix sizes: For an N N ma- trix, matrix–vector multiplication scales asO (N 2 ) and inverting the matrix scales asO (N 3 ). This necessitates the need for new methods that take advantage of inherent structure of the matrices[4–6]. Computational chemistry describes complex chemical problems via computer simulations us- ing a set of approximate but well-dened models[7]. The simplest of such models treats electron– electron interaction as a mean eld and is called the Hartree-Fock method. The computational cost for this method scales asO (N 3 ) withN being the system size. Since a mean-eld model ignores electron correlation, the Hartree-Fock method is not very accurate, introducing errors many times that of the desired chemical accuracy. 92 The gold standard for computational chemistry methods are the coupled-cluster (CC)[8, 9] family of methods, in particular the coupled-cluster with single and double substitutions and with perturbative account of triples, CCSD(T), which approaches chemical accuracy for many systems. However, the computational cost of CCSD(T)[10, 11] scales asO (N 7 ) and the storage cost scales asO (N 4 ) with the system size. This steep scaling signicantly limits the application of this method to systems with more than 20-30 atoms[12]. The scaling of many-body theories can be reduced by taking into account “near-sightedness" of electron correlation[13], i.e., that the electron-electron interactions die o at over long dis- tances, roughly around 1 nm. This near-sightedness of electrons is also responsible for the two- electron integral (TEI) matrices — which are expensive matrices to store and operate on — to be sparse. There are many methods that take advantage of this fact resulting in local CC family of methods but their use is limited because they are sensitive to additional parameters. The sparsity in the TEIs is something we can take advantage of without explicitly modifying the CC equations. Sparsity is one of the previously mentioned “structures" in a matrix that is use- ful to exploit for ecient storage and computation. Matrices that have at most a few non-zero en- tries in each row are called sparse matrices, e.g., upper/lower triangular matrices. Computational algorithms exist to take advantage of such structures, for example matrix vector multiplication can be performed for upper triangular matrices inO (N) time rather thanO (N 2 ) . Matrices can be also be rank-structured[14], a term that we will dene in the next section, and developing algorithms to make use of such structured matrices is an extensively studied topic. This Chapter describes how such matrices can be compressed and computed on using a lin- ear algebra library called STRUMPACK [15]. STRUMPACK, or STRUctured Matrix PACKage, is a library developed by Xiaoye and co-workers[15] for the purpose of providing linear algebra 93 routines for sparse and rank-structured linear systems. Here we aim to use it for compressing rank-structured matrices generated by Q-Chem and study their properties, such as compression accuracy, eciency, and linear algebra operations on them. We begin by introducing a few key concepts for understanding matrix computation followed by a brief description of STRUMPACK and its usage. We then describe the real–world matrices used for compression and other tests. Finally, we discuss our results in Section 4.4. 4.2 Theory In this section we dene the key terms related to matrix compression using STRUMPACK . 4.2.1 Numericrank Rank of a matrix is dened as the maximum number of linearly independent columns in the matrix. For A2R mn rank(A)min(m;n): (4.1) This denition is general and measuring rank according to it means nding the dimension of the vector space spanned by the columns (or rows) of a matrix, which is dicult. So we instead introduce numeric rank, which is often an approximation to the actual rank. The numeric rank is obtained by rank-revealing factorizations like QR decomposition or singular value decomposition (SVD) A =UV T ; =diag( 1 ; 2 ;:::; p );p =min(m;n); (4.2) where,U2R mm ,V2R nn , and 1 2 :::; p 0. For a given> 0, numericrank(A) is the largest integerk such that k >. 94 As discussed above, matrices are sparse when they have only a few non-zero elements in each row. Similarly, matrices are called rank-structured if they have low numeric-rank blocks, typically away from the diagonal despite the entire matrix being full rank or non-sparse. Such matrices commonly arise from discretization of certain dierential[16, 17] and integral operators[1], in signal processing etc. Here, we consider a type of rank-structured matrices called Hierarchically Semi-Separable (HSS)[18, 19] matrices, which have low-rank blocks away from the diagonal. These matrices can be readily compressed into their HSS representation which involves a multi-level hierarchical row and column partitioning represented by a binary tree. 4.2.2 Binarytree A binary tree[20] denes the hierarchical clustering of the rows or columns of the matrix such that every node of the tree, is associated with an intervalI . The root node is associated with the interval [1;n] where n is the number of rows (or columns) of the matrix. For every node of the tree with children 1 and 2 , we have I =I 1 [I 2 and I 1 \I 2 =;: (4.3) A perfect binary tree is such that all nodes have two children except the leaf nodes and all leaf nodes have the same depth. An example of a 2 level perfect binary tree is shown in Fig 4.1. AnyA2R mn matrix can be transformed into HSS form following the structure of the binary tree, 1. For each leaf node, the corresponding diagonal block D = A(I ;I ) is untouched 95 Figure4.1: Alevel2binarytree 2. For each non-lead node with children 1 and 2 , the corresponding o-diagonal blocks A 1 ; 2 = A(I 1 ;I 2 ) are approximated as A 1 ; 2 U big 1 B 1 ; 2 V big;T 2 (4.4) 3. The hierarchical nature of the representation also presents another condition U big = 2 6 6 4 U big 1 0 0 U big 2 3 7 7 5 U ; V big = 2 6 6 4 V big 1 0 0 V big 2 3 7 7 5 V ; (4.5) i.e.U at node is given byU , also called translation operator and theU big matrices of the node are obtained using the U big matrices of its children 1 and 2 , which are themselves given by looking at grandchildren of and so on. In the next section we will expand this denition and provide an example of a 2 level HSS representation and the respective binary tree. 96 4.2.3 HSSrepresentation HSS representation is a hierarchical representation based on a recursive row (or column) par- titioning of the matrix. For example, a 2 2 block partitioning of any given matrix A looks like A = 2 6 6 4 A 1;1;1 A 1;1;2 A 1;2;1 A 1;2;2 3 7 7 5 = 2 6 6 4 D 1;1 U 1;1 B 1;1;2 V T 1;2 U 1;2 B 1;2;1 V T 1;1 D 1;2 3 7 7 5 ; (4.6) where A 1;1;1 and A 1;2;2 are the diagonal blocks of the level–1 partition, which are untouched, A 1;1;2 andA 1;2;1 are the o diagonal blocks that are factored intoU,B, andV T matrices akin to conventional SVD seen in Eq 4.2. The purpose of this factorization is to take advantage of the low-rank o-diagonal blocks in A which will make the matrix B smaller and U skinny and tall andV T wide. However, the factorization is not a simple SVD[21, 22] since as we go beyond 2 2 partitioning, i.e., level 1, we place additional constraints on which factors are allowed. The binary tree for a level 1 HSS representation is shown in Fig. 4.2. A 4 4 or level 2 partitioning shows a true HSS representation since now, we factorize the o-diagonal blocks of the diagonal blocks of level 1 as D 1;1 = 2 6 6 4 D 2;1 U 2;1 B 2;1;2 V T 2;2 U 2;2 B 2;2;1 V T 2;1 D 2;2 3 7 7 5 D 1;2 = 2 6 6 4 D 2;3 U 2;3 B 2;3;4 V T 2;4 U 2;4 B 2;4;3 V T 2;3 D 2;4 3 7 7 5 ; (4.7) 97 where the level is indicated by the subscript, and the full partitioning becomes A = 2 6 6 6 6 6 6 6 6 6 6 4 2 6 6 4 D 2;1 U 2;1 B 2;1;2 V T 2;2 U 2;2 B 2;2;1 V T 2;1 D 2;2 3 7 7 5 U 1;1 B 1;1;2 V T 1;2 U 1;2 B 1;2;1 V T 1;1 2 6 6 4 D 2;3 U 2;3 B 2;3;4 V T 2;4 U 2;4 B 2;4;3 V T 2;3 D 2;4 3 7 7 5 3 7 7 7 7 7 7 7 7 7 7 5 : (4.8) We introduce the translation operators that relate theU andV matrices of level 1 to the ones in level 2 U 1;1 = 2 6 6 4 U 2;1 R 2;1 U 2;2 R 2;2 3 7 7 5 U 1;2 = 2 6 6 4 U 2;3 R 2;3 U 2;4 R 2;4 3 7 7 5 V 1;1 = 2 6 6 4 V 2;1 W 2;1 V 2;2 W 2;2 3 7 7 5 V 1;2 = 2 6 6 4 V 2;3 W 2;3 V 2;4 W 2;4 3 7 7 5 : (4.9) This represents a complete two-level HSS representation of a general matrix and the correspond- ing binary tree is shown in gure 4.2. The gure also shows the matrices saved at each level. At level 1 we need to save everything but at level 2, because of the translation operators theU and V matrices need to be saved only at the leaf level. Figure4.2: Binarytreesforlevel1(left)andlevel2(right)HSSrepresentation. AdaptedfromRef. 5 98 Next, we generalize the denition for HSS representation of any given matrix for any level. A matrix A is said to have HSS representation if there exists matrices of the formD i;j ,U i;j ,V i;j , R i;j ,W i;j , andB i;j;j1 associated with the node (i;j) of a binary tree and satisfy the recursions. D i1;j = 2 6 6 4 D i;2j1 U i;2j1 B i;2j1;2j V T i;2j U i;2j B i;2j;2j1 V T i;2j1 D i;2j 3 7 7 5 U i1;j = 2 6 6 4 U i;2j1 R i;2j1 U i;2j R i;2j 3 7 7 5 V i1;j = 2 6 6 4 V i;2j1 W i;2j1 V i;2j W i;2j 3 7 7 5 forj = 1; 2;:::; 2 i 1; 2 i : (4.10) Note thatD 0;1 represents the root of the binary tree and corresponds to the full matrix A. Due to the hierarchical nature of the representation, only theB,R andW matrices and leaf levelD,U, V matrices need to be stored, resulting in “compression" for matrices with low-rank o-diagonal blocks. The HSS rank of the matrix is dened as the maximum numerical rank of all the HSS o-diagonal blocks at all levels of representation. There are algorithms that can perform matrix operations on the HSS representation of a ma- trix, for example, vector and matrix multiplication, additions, solving linear equations and more, eciently[23]. 4.2.4 STRUMPACK STRUMPACK[15] is a software library that provides linear algebra routines and linear system solvers for rank-structured matrices[23]. It is designed to detect and compress matrices that have some kind of low-rank property, for example, matrices that have dense diagonal and low-rank o- diagonal blocks. We aim to use this package to compress matrices (matricized tensors) involved 99 in high-level quantum chemical calculations, such as coupled-cluster (CC) calculations for large systems. Below I present the exploratory research which assesses the feasibility of deployingSTRUMPACK withinQ-Chem[24] to carry out CC calculations. In addition to HSS representation,STRUMPACK provides several rank-structured formats for compressed dense matrix representation, such as the Block Low Rank (BLR), etc. In this work, we focus mainly on the HSS approximation. Currently, there are two ways of constructing an HSS matrix using STRUMPACK: • By giving an explicit dense matrix as input. This is anO (rN 2 ) complexity calculation wherer is the maximum HSS rank. This complexity, combined with anO (N 2 ) memory requirements, make it an expensive process for large matrices. • By specifying two routines: – A matrix-vector multiplication routine to speed up the HSS construction phase. A fast multiplication routine reduces the complexity toO (r 2 N). – An element extraction routine, i.e., a function to evaluate the submatrix A(I,J) for a row index set I and column index set J. We used the rst method in all our calculations since, as of now, there is no routine to generate the matrices we need block by block. The sample code snippet provided below shows the HSS matrix construction from directly passing a dense matrix A, strumpack::DenseMatrix<double> A(m, n); // ... fill the dense matrix A // Create an HSS options object and set some options. strumpack::HSS::HSSOptions<double> opts; opts.set_leaf_size(128); 100 opts.set_rel_tol(1e-2); opts.set_abs_tol(1e-8); // allow the command line arguments to overwrite any options opts.set_from_command_line(argc, argv); // Construct the HSS matrix from the dense matrix, //using the options specified in opts. // This will use a uniform partitioning of the matrix, //using a leaf size from opts.leaf_size(), // and it will immediately start the HSS compression. strumpack::HSS::HSSMatrix<double> H1(A, opts); In the next section we provide details for the tests we carried out in this exploratory phase. 4.3 Methods 4.3.1 Real-lifematricesfrom Q-Chem The rst step was to determine which key tensors used in CC calculations might have the low- rank o-diagonal block property and the extent to which they can be compressed. In our calcu- lations, we use the VVVV, OOVV, and OVOV matrices (tensors that we matricize) and calculate their sparsity, compression time, compression accuracy, etc. Here, O, and V, denote the occu- pied valence, and unoccupied orbitals, respectively. The VVVV matrix signies the TEIs of type hpqjjrsi, and OOVV is used to represent CCt 2 amplitudes. The OVOV matrix on the other hand is used only for compression (since only square matrices are compressible inSTRUMPACK ). Figure 4.3 shows the matrix shapes in terms of occupied (O) and virtual (V) blocks. 4.3.2 Compressiontests Using the matrices mentioned in the previous section generated using Q-Chem, our rst test is compressing them and studying the eects of the HSS options parameters, like absolute and relative tolerances, leaf sizes on the compression process. The tolerance values (absolute and 101 Figure 4.3: Matrices used in this report. The size of: OVOV matrix is (nocc*nvir x nocc*nvir), OOVV is (nocc 2 x nvir 2 ), and VVVV is (nvir 2 x nvir 2 ), where nocc and nvir are the number of occupiedandvirtualorbitals,respectively. relative) control the respective errors in the compression process, and the leaf size controls the size of the smallest size of the partitioning of the dense matrix. In the code snippet shown above, the absolute tolerance is set to 10 8 , and relative tolerance is set to 10 2 , and the leaf size is 128. In our calculations, we keep the absolute tolerance constant at 10 8 and vary the relative tol- erance to see the eects of memory savings and compression time for OVOV and VVVV matrices for: • CH 4 molecule for a series of basis functions (cc-pVDZ, aug-cc-pVDZ, cc-pVTZ, aug-cc- pVTZ, cc-pVQZ, aug-cc-pVQZ, and cc-pV5Z); • Linear chain alkanes (C n H 2n+2 for n=1-8) with aug-cc-pVDZ basis. In the rst system, the number of occupied orbitals remains constant while the number of virtual orbitals varies, which we take as the system size. In the second system, the system size is represented by the number of carbons in the calculation. 102 The goal of these tests is to determine the eects of the tolerance on compression and to choose a suitable value for the next set of calculations. We do not show accuracy here, but it will be shown once we have selected a particular set of parameters to use. We will also include an orbital localization technique[25] developed by the Head-Gordon group to study the eects of choice of orbital localization on matrix sparsity and compression. 4.3.3 TensorcontractionusingHSSmatrices After the compression tests and selecting suitable parameters, we now move on to the contrac- tion of matrices (tensors) in the HSS representation and timing and accuracy comparison with libtensor contractions in Q-Chem. We perform contractions of the type A ijab = X cd V abcd V ijcd ; (4.11) whereV abcd is the VVVV tensor andV ijcd is the OOVV tensor, both with 4 indices and the contrac- tion is over indicesc;d often referred to as a “common geminal". We can simplify this equation by matricizing the tensors A M IJ = X C V M AC V M IC ; (4.12) where we have folded two indices into one and the matricized tensors are represented by the superscript M. This equation corresponds to a matrix multiplication. In libtensor[26, 27], the same contraction is represented as A l (ijjjajb) = contract(cjd; i_vvvvv(ajbjcjd); i_oovv(ijjjcjd)): (4.13) 103 A discussion of how tensors are stored in libtensor and how such contractions are performed is beyond the scope of this work. Our goal is to compare the timing and accuracy of contractions using the following methods: • Libtensor,A l ; • HSS matrix product, A h = hss_mult(compress(i_vvvv), i_oovv) ; • dgemm, A g = OOVV * VVVV ; • The multiplication with dgemm is carried out as a matrix multiplication of OOVV (nocc 2 x nvir 2 ) and VVVV (nvir 2 x nvir 2 ) matrices. We calculate accuracy using the following formula Accuracy = kA h A l k kA l k : (4.14) In the next section we detail the results of the two calculations mentioned above. 4.4 Resultsanddiscussion Figures 4.4 and 4.5 show the memory ratio (mem(HSS)/mem(Dense)) and compression time versus the number of virtual orbitals plot for OVOV and VVVV matrices. We clearly see the dierence between them as OVOV matrices are not as compressible, nor do they take a lot of time to be compressed. The VVVV matrices, however, show remarkable memory savings on compression, but the compression time is greater for them by order of 10 4 . Higher relative tolerance seems to give better memory compression and lower compression time, but it also gives greater error in compression. We also observe that as the number of virtual 104 orbitals reaches 300, the memory ratio converges to 0.2 for all values of relative tolerances. A similar trend is observed in Figures 4.6 and 4.7, where we show the plots with respect to the number of carbons in the alkane chain. For all the calculations, we can use the following values for HSS compression parameters: • Leaf size = 128; • Absolute tolerance = 1e-8; • Relative tolerance = 1e-3 . Figure4.4:Memoryratio(MemoryofHSSmatrix/MemoryofDensematrix)versusthenumber of virtual orbitals for CH 4 system for compression of OVOV (left) and VVVV (right) matrices withdierentrelativetolerances. 4.4.1 Choiceoforbitalrepresentation The matrices used in the test above were generated using canonical molecular orbitals (MOs), but one of our goals was to study how the numerical sparsity would be aected by the choice of orbital representation. This section will delve more into that by: 105 Figure4.5: TimeforcompressionversusthenumberofvirtualorbitalsforCH 4 systemforcom- pressionofOVOV(left)andVVVV(right)matriceswithdierentrelativetolerances. Figure4.6:Memoryratio(MemoryofHSSmatrix/MemoryofDensematrix)versusthenumber ofCinsystemforcompressionofOVOV(left)andVVVV(right)matriceswithdierentrelative tolerances. • Comparing the numerical sparsity of the matrices generated by canonical MOs and an orbital localization routine developed by the Head-Gordon group referred to as “locovirt" method. • Comparing the compression time, rank, and memory of the HSS matrices and correlating them with sparsity. 106 Figure 4.7: Time for compression versus the number of C in system for compression of OVOV (left)andVVVV(right)matriceswithdierentrelativetolerances. We use a simple denition of numerical sparsity here sparsity(S) = Number of elements less than 10 10 Total number of elements 100: (4.15) In this section and the following sections we use the following systems: • System I: Linear alkane chains (C n H 2n+2 , n = 1-8) with cc-pVDZ basis for rst two alkanes and aug-cc-pVDZ basis for the rest ; • System II: C 3 H 8 with dierent basis sets (cc-pVDZ to cc-pVQZ). Tables I and II show the sparsity (S), compression time, rank, and memory ratio (Mem(HSS)/Mem(Dense)) for matrices generated using canonical MOs and locovirt method for systems I and II, respectively. We have plotted and compared the compression time and memory ratio for VVVV matrices gen- erated using the two methods. The previous section and tables also reveal that OVOV matrices are generally not as compressible compared to VVVV matrices, and hence we will not include them in further analysis. 107 For systems I and II, VVVV matrices generated using the locovirt method show systematic im- provement in compression times and memory savings. However, the matrices from the canonical MO method seem to be sparser than the locovirt ones, which stems from the inherent symmetry present in our systems (alkanes). In order to conrm the eects of symmetry on sparsity, we perform the same calculations on slightly distorted alkanes in the next section. Figure 4.8 shows the results for system I, and we can see that the eects of compression kick in from C 3 H 8 where the compressed matrix is almost 10% the size of the uncompressed dense matrix. This ratio is maintained for up to 8 carbon atoms and appears to be size-independent. The compression time, however, increases rapidly with the system size, which will be the main bottleneck in these calculations. Figure 4.9 presents the same for system II where we systematically increase the size of the virtual orbital space by increasing the number of basis functions on a single system. The memory ratio is even lower than the observed for system I (5%), and compression time increases as well. In both cases, the locovirt matrices are slightly more compressible than the canonical ones despite being less sparse. 4.4.2 Alkanesatdistortedgeometry In this section, we present results using systems I and II with distorted geometries. The goal is to observe the eects of eliminating the inherent symmetry of our systems on sparsity and compressibility. Tables III and IV show the results for systems I and II respectively. We observe that in this case, the locovirt VVVV matrices are much sparser than the canonical matrices. However, there is little to no improvement in compression times or memory saving. 108 Table 4.1: Sparsity, compression time, rank, and memory of matrices generated using canonical MOs and localized virtual code developedbyZhenling(locovirt)forsystemI. System Matrix Canonical SCF Locovirt S Time(s) Rank Mem(10 1 ) S(10 2 ) Time(s) Rank Mem(10 1 ) CH 4 /cc-pVDZ OVOV 15.3 0.007 51 9.71 0 0.009 51 9.74 VVVV 8.68 0.136 129 5.56 0 0.124 122 5.12 C 2 H 6 /cc-pVDZ OVOV 8.68 0.039 112 8.06 0 0.047 111 7.81 VVVV 5.56 1.25 224 3.56 0 1.07 208 2.74 C 3 H 8 /aug-cc-pVDZ OVOV 20.60 0.667 268 5.06 3.25 0.65 251 4.81 VVVV 23.20 60.20 562 1.31 1.06 46.80 508 0.972 C 4 H 10 /aug-cc-pvdz OVOV 15.62 2.18 363 4.16 0.145 2.09 339 3.91 VVVV 13.87 151.0 706 1.00 0.129 112.0 593 0.669 C 5 H 12 /aug-cc-pVDZ OVOV 10.36 5.08 458 3.53 0.217 4.86 418 3.26 VVVV 11.36 377.0 845 0.799 0.402 266.0 674 0.476 C 6 H 14 /aug-cc-pVDZ OVOV 13.92 10.30 570 3.05 0.308 10.0 515 2.82 VVVV 19.91 813.0 992 0.679 1.19 574.0 790 0.363 C 7 H 16 /aug-cc-pVDZ OVOV 11.37 17.10 661 2.69 0.545 17.40 606 2.47 VVVV 12.93 1510.0 1117 0.576 3.22 1010.0 863 0.282 C 8 H 18 /aug-cc-pVDZ OVOV 5.67 32.80 769 2.41 0.762 29.20 687 2.22 VVVV 9.71 2660.0 1269 0.497 8.15 1610.0 930 0.215 109 Table 4.2: Sparsity, compression time, rank, and memory of matrices generated using canonical MOs and localized virtual code developedbyZhenling(locovirt)forsystemII. System Matrix Canonical SCF Locovirt S Time(s) Rank Mem(10 1 ) S(10 2 ) Time(s) Rank Mem(10 1 ) C 3 H 8 /cc-pVDZ OVOV 11.30 0.214 186 6.56 25.00 0.214 181 6.370 VVVV 5.52 4.62 320 2.63 89.30 3.20 276 1.720 C 3 H 8 /aug-cc-pVDZ OVOV 20.60 0.663 268 5.06 3.25 0.740 250 4.80 VVVV 23.20 59.80 562 1.31 1.06 46.80 508 0.972 C 3 H 8 /cc-pVTZ OVOV 17.40 2.05 393 5.11 2.28 1.91 384 4.86 VVVV 13.90 286.0 810 0.897 2.05 185.0 653 0.437 C 3 H 8 /aug-cc-pVTZ OVOV 22.30 5.72 529 4.01 3.06 5.32 491 3.83 VVVV 28.30 2370.0 1132 0.456 1.45 1970.0 1044 0.297 C 3 H 8 /cc-pVQZ OVOV 20.30 10.90 708 4.30 4.13 10.60 657 3.05 VVVV 18.50 7790.0 1536 0.382 6.17 5320.0 1229 0.164 110 Table 4.3: Sparsity, compression time, rank, and memory of matrices generated using canonical MOs and localized virtual code developedbyZhenling(locovirt)forsystemIwithdistortedgeometry. System Matrix Canonical SCF Locovirt S Time(s) Rank Mem(10 1 ) S(10 2 ) Time(s) Rank Mem(10 1 ) CH 4 /cc-pVDZ OVOV 0 0.012 51 9.74 0 0.001 51 9.74 VVVV 0 0.140 128 5.63 0 0.127 124 5.13 C 2 H 6 /cc-pVDZ OVOV 0 0.049 116 8.13 0 0.040 110 7.91 VVVV 0 1.230 224 3.61 1.38 0.937 207 2.75 C 3 H 8 /aug-cc-pVDZ OVOV 0.722 0.663 273 5.10 0 0.647 250 4.80 VVVV 0.534 59.90 564 1.36 3.19 46.90 509 0.975 C 4 H 10 /aug-cc-pVDZ OVOV 0.254 2.230 374 4.21 13.0 2.09 336 3.84 VVVV 0.672 155.0 691 1.02 10.50 112.0 584 0.673 C 5 H 12 /aug-cc-pVDZ OVOV 4.00 4.790 465 3.55 49.40 4.820 419 3.20 VVVV 0.781 396.0 847 0.824 38.10 265.0 676 0.476 C 6 H 14 /aug-cc-pVDZ OVOV 11.20 9.790 579 3.05 234.0 9.240 522 2.71 VVVV 0.921 823.0 1008 0.695 106.0 583.0 787 0.365 C 7 H 16 /aug-cc-pVDZ OVOV 33.20 16.60 687 2.64 1230.0 17.20 598 2.34 VVVV 1.11 1500.0 1111 0.585 297.0 1020.0 854 0.283 C 8 H 18 / aug-cc-pVDZ OVOV 42.40 29.30 808 2.39 1980.0 30.30 685 2.12 VVVV 1.34 2610.0 1273 0.510 845.0 1610.0 921 0.216 111 Table 4.4: Sparsity, compression time, rank, and memory of matrices generated using canonical MOs and localized virtual code developedbyZhenling(locovirt)forsystemIIwithdistortedgeometry. System Matrix Canonical SCF Locovirt S(10 4 ) Time(s) Rank Mem(10 1 ) S(10 4 ) Time(s) Rank Mem(10 1 ) C 3 H 8 /cc-pVDZ OVOV 0 0.211 194 6.63 0 0.220 181 6.34 VVVV 0 5.01 327 2.70 7.41 3.24 275 1.74 C 3 H 8 /aug-cc-pVDZ OVOV 0.722 0.664 273 5.10 0 0.654 252 4.80 VVVV 0.534 60.05 564 1.36 3.19 46.80 509 0.975 C 3 H 8 /cc-pVTZ OVOV 0.321 1.98 398 5.11 8.61 1.96 387 4.92 VVVV 0.213 297.0 809 0.918 35.40 187.0 653 0.476 C 3 H 8 /aug-cc-pVTZ OVOV 0.496 5.04 544 4.08 4.96 5.35 496 3.85 VVVV 1.10 2400.0 1140 0.471 13.80 1900.0 1056 0.299 C 3 H 8 /cc-pVQZ OVOV 0.693 10.80 723 4.34 29.20 10.40 659 4.08 VVVV 0.635 8070.0 1544 0.388 189.0 5660.0 1217 0.166 112 Figure 4.8: Time for compression (left) and memory ratio (right) versus number of carbons in thesystem(I)forcompressionofVVVVmatrix. This seems counter-intuitive since we expect the compressibility of matrices should increase with sparsity, but no evidence of this is seen from the numerical results we have. However, through these tests, we have shown that the choice of orbital representation (canon- ical versus localized) does aect the extent to which OOVV and VVVV tensors can be compressed. The issue of how sparsity is correlated to compression is still not completely clear. In the next section we present the results for the matrix contractions for systems I and II. 4.4.3 CanonicalversusLocovirtmatrixcontraction Tables 4 and 5 show the numerical results for the contraction calculations for systems I and II, respectively. Fig. 7 shows a graphical representation of contraction time comparison for canon- ical and locovirt matrices. The libtensor timings were obtained from Q-Chem and we disabled symmetry using the keywordsym_ignore = TRUE. Tables 4 and 5 show that hss_mult contraction timings (compressed VVVV matrix multiplied by OOVV matrix using STRUMPACK ) are lower than libtensor and dgemm timings. Figure 4.10 also shows the contraction times for compressed matrices generated by the locovirt method are 113 Figure 4.9: Time for compression (left) and memory ratio (right) versus number of virtual or- bitalsinthesystem(II)forcompressionofVVVVmatrix. slightly lower than the canonical matrices, but the dierence increases with system size. This implies that the VVVV matrix generated by locovirt method is better suited for compression and handlesSTRUMPACK methods using compressed matrices (like hss_mult) better than the matrix generated by canonical MOs. We also observe from the data that the errors in contraction (matrix multiplication) due to compression is system independent and is around 0.0005. This error is controlled by the rela- tive and absolute tolerance values we set in the beginning of these calculations (0.001 and 1e-8, respectively). 4.5 Conclusion We have shown that the real–world matrices generated by quantum-chemical calculations are generally compressible and the compressibility is dependent on choice of orbital representation. We have also demonstrated the advantage of using HSS matrices in contraction over libtensor 114 Table4.5: ContractiontimingsandaccuracyforsystemI. System n occ n vir Method Contraction time (s) Accuracy Libtensor dgemm hss_mult dgemm(10 13 ) hss_mult(10 4 ) CH 4 5 29 Canonical 0.010 0.001 0.002 0.041 2.54 Locovirt 0.001 0.002 0.030 2.11 C 2 H 6 9 49 Canonical 0.016 0.020 0.043 0.078 3.20 Locovirt 0.020 0.034 0.083 2.55 C 3 H 8 13 128 Canonical 4.92 2.15 0.813 584.0 5.31 Locovirt 2.06 0.672 524.0 4.46 C 4 H 10 17 165 Canonical 23.67 9.18 2.80 1700 4.16 Locovirt 8.19 2.11 1500 3.80 C 5 H 12 21 202 Canonical 63.67 27.70 7.45 3750 5.77 Locovirt 27.60 5.22 3770 4.33 C 7 H 16 29 276 Canonical 345.66 183.0 32.70 11300 4.69 Locovirt 181.0 19.60 11400 4.21 C 8 H 18 33 313 Canonical 756.64 385.0 50.40 16400 11.2 Locovirt 386.0 30.30 18700 5.16 Figure4.10: ContractiontimecomparisonforsystemsI(left)andII(right). in Q-Chem. The advantages of compressing large matricized tensors increases with system size. However, there are several caveats to this work which we will detail next. We have shown that the bottleneck for our calculations is the compression time for large matrices. In STRUMPACK, HSS compression is MPI and OpenMP parallelizable using a class calledHSSMatrixMPI. However, methods handling compressed matrix multiplication with dense matrix is not yet parallelized. 115 Table4.6: ContractiontimingsandaccuracyforsystemII. System n occ n vir Method Contraction time (s) Accuracy Libtensor dgemm hss_mult dgemm(10 13 ) hss_mult(10 4 ) cc- pVDZ 13 69 Canonical 0.571 0.152 0.211 0.137 3.84 Locovirt 0.152 0.161 122000 2.77 aug-cc- pVDZ 13 128 Canonical 4.864 2.14 0.804 584.0 5.31 Locovirt 2.04 0.673 443.0 4.46 cc-pVTZ 13 189 Canonical 22.59 8.56 2.54 4.74 4.63 Locovirt 8.50 1.72 8830.0 3.69 aug-cc- pVTZ 13 309 Canonical 160.11 61.10 8.19 8180.0 5.72 Locovirt 61.0 5.59 3030.0 4.88 cc- pVQZ 13 392 Canonical 463.38 167.0 15.90 138.0 4.81 Locovirt 159.0 9.06 140.0 4.20 We should also keep in mind that we are passing the entire VVVV matrix to the compression function, which is highly memory intensive, as it requires us to store the full VVVV matrix in memory. The goal of our tests was to see whether it was possible to compress these matrices using STRUMPACK , and to prototype tensor contractions and compare it with libtensor. Next, our goal should be to optimize these methods. A limiting factor of STRUMPACK is its inability to compress non-square matrices, which led us to choose tensors which can be easily matricized to square matrices for our tests. The VVVV matrices are important since they are the biggest ones involved in CC calculations but there are other tensors, which cannot be easily converted to square matrices. The inability to multiply or add two HSS matrices is also something that needs to be addressed. 116 Bibliography [1] V. Rokhlin, Application of volume integrals to the solution of partial dierential equations, Computers and Mathematics with Applications11, 667 (1985). [2] L. Greengard and V. Rokhlin, A fast algorithm for particle simulations, J. Chem. Phys.73, 325 (1987). [3] J. Carrier, L. Greengard, and V. Rokhlin, A fast adaptive multipole algorithm for particle simulations, SIAM J. Scientic and Statistical Computing9, 669 (1988). [4] I. Koltracht, Linear complexity algorithm for semiseparable matrices, Integral Equations and Operator Theory29, 313 (1985). [5] S. Chandrasekaran, M. Gu, and W. Lyons, A fast adaptive solver for hierarchically semisep- arable representations, CALCOLO42, 171 (2005). [6] Vandebril R, M. V. Barel, G. H. Golub, and N. Mastronardi, A bibliography on semiseparable matrices, CALCOLO42, 249 (2005). [7] T. Helgaker, W. Klopper, and D. P. Tew, Quantitative quantum chemistry, Mol. Phys. 106, 2107 (2008). [8] T.D. Crawford and H.F. Schaefer III, An introduction to coupled cluster theory for compu- tational chemists, Rev. Comp. Chem.14, 33 (2000). [9] R.J. Bartlett and M. Musial, Coupled-cluster theory in quantum mechanics, Rev. Mod. Phys. 79, 291 (2007). [10] K. Raghavachari, G.W . Trucks, J. A. Pople, and M. Head-Gordon, A fth-order perturbation comparison of electron correlation theories, Chem. Phys. Lett.157, 479 (1989). [11] J. D. Watts, J. Gauss, and R. J. Bartlett, Coupled-cluster methods with noniterative triple excitations for restricted open-shell Hartree-Fock and other general single determinant ref- erence functions. Energies and analytical gradients, J. Chem. Phys.98, 8718 (1993). [12] M. Head-Gordon, Quantum chemistry and molecular processes, J. Phys. Chem.100, 13213 (1996). [13] W. Kohn, Nobel lecture: Electronic structure of matter—wave functions and density func- tionals, Rev. Mod. Phys.71, 1253 (1999). 117 [14] M. Fiedler, Structure ranks of matrices, Linear Algebra and its Applications179, 119 (1993). [15] F. Rouet, X. S. Li, P. Ghysels, and A. Napov, A distributed-memory package for dense hier- archically semi-separable matrix computations using randomization, ACM Transactions on Mathematical Software42, 1 (2016). [16] D. Zweilinger, Handbook of Dierential Equations. Academic Press, Boston MA, 1997. [17] W. Lyons, FastalgorithmswithapplicationstoPDEs. University of California, Santa Barbara, 2005. [18] S. Chandrasekaran, M. Gu, and T. Pals, A fast ULV decomposition solver for hierarchically semiseparable representations, SIAM J. Matrix Analysis Applications28, 603 (2006). [19] Z. Sheng, P. Dewilde, and S. Chandrasekaran, Algorithms to solve hierarchically semi- separable systems, Operator Theory: Advances and Applications176, 255 (2007). [20] W. Lyons, H.D. Ceniceros, S. Chandrasekaran, and M. Gu, Fast algorithms for spectral col- location with non-periodic boundary conditions, J. Chem. Phys.207, 173 (2005). [21] P. G. Martinsson, A fast randomized algorithm for computing a hierarchically semiseparable representation of a matrix, SIAM J. Matrix Analysis and Applications32, 1251 (2011). [22] J. Xia, Randomized sparse direct solvers, SIAM J. Matrix Analysis and Applications34, 197 (2013). [23] S. Chandrasekaran, P. Dewilde, W. Lyons M. Gu, and T. Pals, A fast solver for HSS repre- sentations via sparse matrices, SIAM J. Matrix Analysis and Applications29 (2008). [24] E. Epifanovsky, A. T. B. Gilbert, X. Feng, J. Lee, Y. Mao, N. Mardirossian, P. Pokhilko, A. F. White, M. P. Coons, A. L. Dempwol, Z. Gan, D. Hait, P. R. Horn, L. D. Jacobson, I. Kali- man, J. Kussmann, A. W. Lange, K. U. Lao, D. S. Levine, J. Liu, S. C. McKenzie, A. F. Mor- rison, K. D. Nanda, F. Plasser, D. R. Rehn, M. L. Vidal, Z.-Q. You, Y. Zhu, B. Alam, B. J. Al- brecht, A. Aldossary, E. Alguire, J. H. Andersen, V. Athavale, D. Barton, K. Begam, A. Behn, N. Bellonzi, Y. A. Bernard, E. J. Berquist, H. G. A. Burton, A. Carreras, K. Carter-Fenk, R. Chakraborty, A. D. Chien, K. D. Closser, V. Cofer-Shabica, S. Dasgupta, M. de Wergi- fosse, J. Deng, M. Diedenhofen, H. Do, S. Ehlert, P.-T. Fang, S. Fatehi, Q. Feng, T. Friedho, J. Gayvert, Q. Ge, G. Gidofalvi, M. Goldey, J. Gomes, C. E. González-Espinoza, S. Gulania, A. O. Gunina, M. W. D. Hanson-Heine, P. H. P. Harbach, A. Hauser, M. F. Herbst, M. Hernán- dez Vera, M. Hodecker, Z. C. Holden, S. Houck, X. Huang, K. Hui, B. C. Huynh, M. Ivanov, Á. Jász, H. Ji, H. Jiang, B. Kaduk, S. Kähler, K. Khistyaev, J. Kim, G. Kis, P. Klunzinger, Z. Koczor-Benda, J. H. Koh, D. Kosenkov, L. Koulias, T. Kowalczyk, C. M. Krauter, K. Kue, A. Kunitsa, T. Kus, I. Ladjánszki, A. Landau, K. V. Lawler, D. Lefrancois, S. Lehtola, R. R. Li, Y.-P. Li, J. Liang, M. Liebenthal, H.-H. Lin, Y.-S. Lin, F. Liu, K.-Y. Liu, M. Loipersberger, A. Luenser, A. Manjanath, P. Manohar, E. Mansoor, S. F. Manzer, S.-P. Mao, A. V. Marenich, T. Markovich, S. Mason, S. A. Maurer, P. F. McLaughlin, M. F. S. J. Menger, J.-M. Mewes, S. A. Mewes, P. Morgante, J. W. Mullinax, K. J. Oosterbaan, G. Paran, A. C. Paul, S. K. Paul, F. Pavošević, Z. Pei, S. Prager, E. I. Proynov, Á. Rák, E. Ramos-Cordoba, B. Rana, A. E. Rask, 118 A. Rettig, R. M. Richard, F. Rob, E. Rossomme, T. Scheele, M. Scheurer, M. Schneider, N. Ser- gueev, S. M. Sharada, W. Skomorowski, D. W. Small, C. J. Stein, Y.-C. Su, E. J. Sundstrom, Z. Tao, J. Thirman, G. J. Tornai, T. Tsuchimochi, N. M. Tubman, S. P. Veccham, O. Vydrov, J. Wenzel, J. Witte, A. Yamada, K. Yao, S. Yeganeh, S. R. Yost, A. Zech, I. Y. Zhang, X. Zhang, Y. Zhang, D. Zuev, A. Aspuru-Guzik, A. T. Bell, N. A. Besley, K. B. Bravaya, B. R. Brooks, D. Casanova, J.-D. Chai, S. Coriani, C. J. Cramer, G. Cserey, A. E. DePrince, R. A. DiStasio, A. Dreuw, B. D. Dunietz, T. R. Furlani, W. A. Goddard, S. Hammes-Schier, T. Head-Gordon, W. J. Hehre, C.-P. Hsu, T.-C. Jagau, Y. Jung, A. Klamt, J. Kong, D. S. Lambrecht, W. Liang, N. J. Mayhall, C. W. McCurdy, J. B. Neaton, C. Ochsenfeld, J. A. Parkhill, R. Peverati, V. A. Rassolov, Y. Shao, L. V. Slipchenko, T. Stauch, R. P. Steele, J. E. Subotnik, A. J. W. Thom, A. Tkatchenko, D. G. Truhlar, T. Van Voorhis, T. A. Wesolowski, K. B. Whaley, H. L. Wood- cock, P. M. Zimmerman, S. Faraji, P. M. W. Gill, M. Head-Gordon, J. M. Herbert, and A. I. Krylov, Software for the frontiers of quantum chemistry: An overview of developments in the Q-Chem 5 package, J. Chem. Phys.155, 084801 (2021). [25] Z. Wang, A. Aldossary, and M. Head-Gordon, Sparsity of the electron repulsion integral tensor using dierent localized virtual orbital representations in local second order Møller- Plesset theory, (2022). [26] E. Epifanovsky, M. Wormit, T. Kuś, A. Landau, D. Zuev, K. Khistyaev, P. U. Manohar, I. Kali- man, A. Dreuw, and A. I. Krylov, New implementation of high-level correlated methods using a general block-tensor library for high-performance electronic structure calculations, J. Comput. Chem.34, 2293 (2013). [27] I. Kaliman and A. I. Krylov, New algorithm for tensor contractions on multi-core CPUs, GPUs, and accelerators enables CCSD and EOM-CCSD calculations with over 1000 basis functions on a single compute node, J. Comput. Chem.38, 842 (2017). 119 Chapter5 Futuredirections Chapters 2 and 3 presented the computational studies of basis-set requirements for the CVS-EOM- IP-CCSD method for second–row molecules and UV-Vis spectroscopy of anions in the condensed phase. In Chapter 4 we described a pilot study of algorithms for tensor contraction using com- pressed matrices and a potential implementation in coupled-cluster calculations in Q-Chem. In this chapter I discuss future directions for each of these topics. 5.1 Core-levelstatesoflargersystems For heavy elements (3rd row of the Periodic Table and below) with K- and L-shell excitations, two factors must be taken into account for accurate description of core-level states: (i) spin-orbit coupling, and (ii) scalar relativistic corrections. An implementation of spin-orbit coupling for the CVS-EOM-CCSD method exists[1] and uses perturbative evaluation of spin-orbit coupling using Breit-Pauli Hamiltonian[2]. However, scalar relativistic corrections are not included in the current CVS-EOM-CCSD method in Q-Chem. Whereas approximate or even empirical treatment of relativistic corrections yield accurate results for lighter elements, they play a larger role in altering the energies of core electrons of 120 heavier atoms or molecules. Typically, the relativistic eects are calculated by solving an elec- tronic Hamiltonian based on the four-component Dirac operator, which is computationally ex- pensive. The exact two-component (X2C)[3, 4] Hamiltonian oers a more tractable solution, and it would be interesting to extend its implementation to the CVS-EOM-CCSD framework. Another exciting direction is motivated by double core hole (DCH)[5, 6] spectroscopy in which two core vacancies are created on a single site or two dierent sites in a molecule. DCH spectroscopy is sensitive to both chemical environment and molecular structure. In order to cap- ture these doubly ionized states one needs to extend the CVS scheme to the EOM-CCSD method for calculating double ionization potentials (DIP), referred to as EOM-DIP-CCSD as implemented in Q-Chem. A promising direction would be to extend the fc-CVS approach to doubles core-hole states. Our current approach for core-ionization freezes the core electrons (frozen core) and modies theR IP operator such that at least one core orbital is included in the ionization (CVS). For DIP calculations, theR DIP looks like R DIP = 1 2 X ij r ij ji + 1 6 X ijka r a ijk a y kji; (5.1) i.e., a two hole (2h) and three hole two particle (3h1p) operator. To implement the CVS scheme, the DIP equations need to be re-derived within the CVS scheme. 121 5.2 Expandingthescopeofcompressedmatrixcontractions We have shown that prototyping tensor contractions as HSS–dense matrix multiplication signif- icantly speeds up the contraction process. However, the tensor contraction we explored were of the form A ijab = X cd V abcd V ijcd ; (5.2) which are referred to as representing “direct" interactions, i.e., a product of two tensors with summation over a common geminal represents long-range “direct" interactions. There are other tensor contractions representing short-range “exchange"-like interactions where the contracted indices spam across multiple geminals A ij = X kab V jkab V ikab : (5.3) Tensor contractions of this form present signicant challenges in matricizing the tensors. For example, a tensor with 4 indices, sayV ijcd wherei;j;::: represent occupied orbitals anda;b;::: represent virtual orbitals, is matricized using the following algorithm, for i in range(Nocc): for j in range(Nocc): for a in range(Nvir): for b in range(Nvir): IJ = i * Nocc + j AB = A * Nvir + b Vmat(IJ, AB) = V(i,j,a,b) whereNocc andNvir are the number of occupied and virtual orbitals respectively. The dimen- sion ofVmat isNocc 2 Nvir 2 . We can similarly fold indicesi anda together to make a square matrix of dimensionNoccNvirNoccNvir and this is exactly what is done in resolving the tensor contraction of Eq. (5.2). However, we must be careful since we aim to exploit the inherent 122 rank-structuring of a matrix while might be destroyed when reordering and permuting indices. Unfortunately, there is no easy solution to gure out when inherent structures of a tensor might be aected by permuting and folding indices. This issue is more apparent in contractions shown in Eq. (5.3) where we must fold 3 indices (one occupied, and two virtual) together to matricize the tensor. Additionally, there is no way of making that matrix square, which prevents it from being compressed in STRUMPACK. So, one should rst look for a method of compressing non-square matrices and then we go through each of the terms involved in the CC calculations in Q-Chem and explore their matricization, compression, and linear algebraic operations on them. 5.3 Non-linearspectraofaqueousanions The CTTS states can play a role in non-linear spectroscopies like 2PA and SFG because of the res- onant enhancement of their intensity when the sum frequency photon matches a dipole–allowed transition. We are interested in modeling these spectroscopies and investigating the dierences between non-linear and linear spectra. 5.3.1 2PAspectraofaqueousSCN Using the methodology developed by Nanda and Krylov[7, 8], we calculated the 2PA spectra of SCN in bulk water. We used the same snapshots and basis set protocol as the CTTS spectra calculations and chose! 1 as 2.66 eV. The macroscopic 2PA cross section, 2pa , is measured in the Göppert-Mayer units (1 GM = 10 50 cm 4 s/photon) and is related to the microscopic cross section 2PA as 2pa = 2 3 a 5 0 ! 2 c h 2PA iS(! ;! 0 ; ); (5.4) 123 where, 2pa is the 2PA cross section in GM units, is the ne structure constant,a 0 is the Bohr radius, ! = ! 1 +! 2 , c is the speed of light,h 2PA i is the rotationally averaged 2PA cross- section, and S is the line-shape function. We choose the Gaussian lineshape function with 0.1 eV broadening factor. Since the rotationally averaged 2PA cross section depends on the polarization of the incident light, we report the 2PA spectra in Fig 5.1 for parallel linearly polarized light, along with the AIMD CTTS spectra for comparison. Figure 5.1: 1PA and 2PA CTTS spectra of SCN (aq) . The 2PA spectrum is computed for parallel polarizationandtheunitsareintheGöppert-Mayer(GM)units. The gure shows that there is an enhancement in the intensity of the 2PA spectra for the lower transitions and a small overall redshift of the peak maxima. Since the ratio of the s-type CTTS peak and p-type CTTS peak changes, we interpret it as dierent CTTS transitions behaving dierently in 2PA process. We observe evidence of non-Condon eects in 2PA transitions[9] 124 which could explain the shift in the 2PA peaks. Table 5.1 shows the diagonal 2PA moments (M aa fora =x;y;z) and par for all 8 excited states (g!e) for a representative snapshot. Table5.1: Diagonal2PAmomentsfor8lowestexcitedstatesand par inatomicunits. ES Type M xx M yy M zz par (au) 1 s-type CTTS -5.414 3.5125 14.969 72.24 2 s-type CTTS -11.987 0.584 3.362 114.7 3 Intramolecular 1.794 7.826 -6.074 17.38 4 Intramolecular -1.789 -8.364 7.192 51.36 5 Intramolecular 1.284 12.100 -0.331 39.22 6 p-type CTTS -1.637 -42.319 -19.748 765.3 7 p-type CTTS -3.173 -17.433 0.597 202.1 8 p-type CTTS 6.251 22.417 -22.982 257.8 The data in Table 5.1 does not reveal dominant contributions in any of the moments, which is expected since our system is not centrosymmetric. This makes the NTO analysis challenging because we have to consider every NTO pair, i.e., 3 pairs (x, y, and z) for each excitation in order to characterize transitions. We are investigating the NTOs and the exciton properties like exciton size and electron-hole distance to explain the dierences between the 2PA and UV-Vis CTTS spectra. 5.3.2 ChallengesinmodelingSFGspectra We used QM/MM AIMD method to simulate the solvation of SCN ion in bulk, but doing the same in the air-water interface is more challenging. The diculty of QM/MM simulations in the interface arises from its anisotropic nature[10]. The thickness of the interfacial region of water is estimated to be5 Å[11], which means that the size of the QM region must taken all interfacial waters around SCN into account for the dy- namics and non-linear spectra calculation. This makes such calculations prohibitively expensive. Additionally, the current QM/MM implementation inQ-Chem does not allow for non-cubic boxes, 125 which makes conventional air-water interface dynamics impossible, since these calculations use slab shaped box to simulate the interface. This arises from the Ewald summation[12, 13] formula inQ-Chem, which only allows cubic unit cells and the extension to non-cubic cells would require an overhaul of the Ewald equations. QM/MM dynamics in air-water interface is an essential step in computing the electronic SFG (E-SFG) spectra of the anion. The experimental E-SFG spectra, as reported by Saykally[14], does not show the excitation we refer to as s-type CTTS excitation in our calculations. Theoretical simulation and understanding of this phenomena is crucial in describing how complex anions are solvated in the interface. A possible route to performing these calculations would be to simulate QM/MM AIMD calculations using another package and then carry out spectral calculations using Q-Chem. 126 Bibliography [1] M. L. Vidal, P. Pokhilko, A. I. Krylov, and S. Coriani, Equation-of-motion coupled-cluster theory to model L-edge x-ray absorption and photoelectron spectra, J. Phys. Chem. Lett.11, 8314 (2020). [2] P. Schwerdtfeger, editor, Relativistic Electronic Structure Theory. Elsevier, Amsterdam, 2002. [3] M. Ilias and T. Saue, An innite-order two-component relativistic Hamiltonian by a simple one-step transformation, J. Chem. Phys.126, 064102 (2007) [4] L. Halbert, M. L. Vidal, A. Shee, S. Coriani, and A. Severo Pereira Gomes, Relativistic EOM-CCSD for Core-Excited and Core-Ionized State Energies Based on the Four-Component Dirac-Coulomb(-Gaunt) Hamiltonian, J. Chem. Theo. Comp.17(6), 3583 (2021). [5] J. H. D. Eland, M. Tashiro, P. Linusson, M. Ehara, K. Ueda, and R. Feifel, Double Core Hole Creation and Subsequent Auger Decay in NH 3 and CH 4 Molecules Phys. Rev. Lett. 105, 213005 (2010) [6] P. Linusson, O. Takahashi, K. Ueda, J. H. D. Eland, and R. Feifel, Structure sensitivity of double inner-shell holes in sulfur-containing molecules Phys. Rev. A.83, 022506 (2010) [7] K. D. Nanda and A. I. Krylov, Two-photon absorption cross sections within equation-of- motion coupled-cluster formalism using resolution-of-the-identity and Cholesky decompo- sition representations: Theory, implementation, and benchmarks, J. Chem. Phys.142, 064118 (2015). [8] K. D. Nanda and A. I. Krylov, The eect of polarizable environment on two-photon absorption cross sections characterized by the equation-of-motion coupled-cluster singles and doubles method combined with the eective fragment potential approach, J. Chem. Phys.149, 164109 (2018). [9] E. Kamarchik and A. I. Krylov Non-Condon Eects in the One- and Two-Photon Absorption Spectra of the Green Fluorescent Protein, J. Phys. Chem. Lett.2, 488 (2011) [10] G. Groenhof, Introduction to QM/MM simulations, Methods Mol. Bio. 924, 43 (2013) [11] M. P. Coons and J. M. Herbert, Quantum chemistry in arbitrary dielectric environments: Theory and implementation of nonequilibrium Poisson boundary conditions and application to compute vertical ionization, energies at the air/water interface, J. Chem. Phys.148, 222834 (2018) 127 [12] Z. C. Holden, R. M. Richard, and J. M. Herbert, Periodic boundary conditions for QM/MM calculations: Ewald summation for extended Gaussian basis sets, J. Chem. Phys.139, 244108 (2013) [13] Z. C. Holden, B. Rana, J. M. Herbert, Analytic gradient for the QM/MM-Ewald method using charges derived from the electrostatic potential: Theory, implementation, and application to ab initio molecular dynamics simulation of the aqueous electron, J. Chem. Phys.150, 144115 (2019) [14] H. Mizuno, A. M. Rizzuto, and R. J. Saykally, Charge-transfer-to-solvent spectrum of thio- cyanate at the air/water interface measured by broadband deep ultraviolet electronic sum frequency generation spectroscopy, J. Phys. Chem. Lett.9, 4753 (2018). 128
Abstract (if available)
Abstract
Computational spectroscopy is a branch of quantum chemistry concerned with the modeling of various spectroscopic signals and analysis of the underlying electronic structure. The challenge encountered in this field is twofold: (i) Selection of a reliable and robust quantum-chemical method capable of describing relevant transitions and, (ii) accurate inclusion of environmental effects such as solvent and presence of interfaces. We use the equation-of-motion coupled-cluster (EOM-CC) formalism to study core-ionized and charge-transfer-to-solvent linear (UV-Vis) and non-linear multiphoton spectra of solvated anions. We also investigated strategies for reducing the steep scaling of coupled-cluster method and propose the use of a new tensor contraction algorithm involving compression of two-electron integral tensors. We used the linear algebra library STRUMPACK developed by Xiaoye et al., that is capable of compressing square matrices and performing algebraic operations on them, to study the compressibility of typical matrices (matricized two-electron integral tensors) involved in coupled-cluster calculations.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
Ultrafast spectroscopy of aromatic amino acids and their chromophores in the condensed phase
PDF
Modeling x-ray spectroscopy in condensed phase
PDF
Probing charge transfer and electric fields at the bulk and interface using vibrational spectroscopy
PDF
Ultrafast dynamics of excited state intra- and intermolecular proton transfer in nitrogen containing photobases
PDF
Electronic structure and spectroscopy of radicals in the gas and condensed phases
PDF
Vibrational spectroscopy and molecular orientation at interfaces: correlating the structure of a material's surface to its properties
PDF
Adiabatic and non-adiabatic molecular dynamics in nanoscale systems: theory and applications
PDF
Electronic structure and spectroscopy in the gas and condensed phase: methodology and applications
PDF
Following redox chemistry and excited state dynamics in solution using liquid jet photoelectron spectroscopy
PDF
Catalysis for solar fuels production: molecular and materials approaches
PDF
Potential-induced formation and dissociation of adducts at interfaces and adsorption of phenols onto metal electrodes
PDF
Development of robust ab initio methods for description of excited states and autoionizing resonances
PDF
Electronic structure and spectroscopy of excited and open-shell species
PDF
Elucidating local electric fields and ion adsorption using benzonitrile: vibrational spectroscopic studies at interfaces and in bulk
PDF
Controlled heteroatom functionalization of carbon-carbon bonds by aerobic oxidation
PDF
Interfacial polarization and photobasicity: spectroscopic studies of the electrode environment and photochemical effects
PDF
Molecular orientation and structure of hydrogen bonds at water interfaces
PDF
Study of rotation and phase separation in ³He, ⁴He, and mixed ³He/⁴He droplets by X-ray diffraction
PDF
The effect of microhydration on ionization energy and proton transfer in nucleobases: analysis and method development
PDF
Understanding the roles of posttranslational modifications in aggregation using synthetic proteins
Asset Metadata
Creator
Sarangi, Ronit
(author)
Core Title
Computational spectroscopy in gas and condensed phases
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Chemistry
Degree Conferral Date
2023-08
Publication Date
06/20/2023
Defense Date
06/15/2023
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
AIMD,computational chemistry,molecular modeling,nonlinear spectroscopy,OAI-PMH Harvest,x-ray spectroscopy
Format
theses
(aat)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Krylov, Anna (
committee chair
), Dawlaty, Jahan (
committee member
), El-Naggar, Moh (
committee member
)
Creator Email
ronits.chem95@gmail.com,rsarangi@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC113178791
Unique identifier
UC113178791
Identifier
etd-SarangiRon-11976.pdf (filename)
Legacy Identifier
etd-SarangiRon-11976
Document Type
Dissertation
Format
theses (aat)
Rights
Sarangi, Ronit
Internet Media Type
application/pdf
Type
texts
Source
20230621-usctheses-batch-1057
(batch),
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu
Tags
AIMD
computational chemistry
molecular modeling
nonlinear spectroscopy
x-ray spectroscopy