Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Consistent evaluation of binding free energies and study of the role of electrostatic effects in the stabilization of protein complexes
(USC Thesis Other)
Consistent evaluation of binding free energies and study of the role of electrostatic effects in the stabilization of protein complexes
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
INFORMATION TO USERS This manuscript has been reproduced from the microfilm master. UMI films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter 6ce, while others may be from any type o f computer printer. The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleedthrough, substandard margins, and improper alignment can adversely affect reproduction. In the unlikely event that the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion. Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at the upper left-hand comer and continuing from left to right in equal sections with small overlaps. Each original is also photographed in one exposure and is included in reduced form at the back of the book. Photographs included in the original manuscript have been reproduced xerographically in this copy. Higher quality 6” x 9” black and white photographic prints are available for any photographs or illustrations appearing in this copy for an additional charge. Contact UMI directly to order. UMI A Bell & Howell Infbrmadon Compaiy 300 North Zed) Road, Ann Arbor MI 48106-1346 USA 313/761-4700 800/521-0600 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. CONSISTENT EVALUATION OF BINDING FREE ENERGIES AND STUDY OF THE ROLE OF ELECTROSTATIC EFFECTS IN THE STABILIZATION OF PROTEIN COMPLEXES B y Holly Tao A Dissertation Presented to THE FACULTY OF GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (Computational/Biophysical Chemistry) May 1997 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. UMI Number: 9733149 UMI Microform 9733149 Copyright 1997, by UMI Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code. UMI 300 North Zeeb Road Ann Arbor, MI 48103 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. L M '.E R 5[T Y OF 50LTHER.N CALIFORNIA THE GRa D L a TE s c h o o l UNIVERSITY PARK LOS ANGELES. CALIFORNIA 90007 This dissertation, written by Holly Tao under the direction of k..ST. Dissertation Committee, and approved by all its members, has been presented to and accepted by The Graduate School, in partial fulfillment of re- quirements for the degree of DOCTOR OF PHILOSOPHY 6 -N O Tuate Studies Date 4 , 1 9 9 7 DISSERTATION COMMITTEE A v-u \ Chairperson ..... Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table of Contents T ab le o f C o n te n ts...................................................................................................................... ii List of T ab les.......................................................................................................................................v List of Figures.....................................................................................................................................vi A c k n o w le d g m e n ts .........................................................................................................................xi Chapter 1 Introduction........................................................................................................................1 C hapter 2 M odels and M e th o d s..........................................................................................9 2.1 Introduction.................................................................................................................... 9 2.2 The Intermolecular Potential Forces in the SCAAS M odel................................ 13 2.3 The Langevin Dipole M odel.......................................................................................18 2.3.1 Introduction..................................................................................................18 2.3.2 The PDLD Model and Its Intermolecular Interactions.......................20 2.3.3 Energetic Terms in the PDLD M o d el....................................................26 Chapter 3 The Free Energy C alculations..................................................................................... 29 3.1 B ackground................................................................................................................... 29 3.2 Applications to the Binding Free Energy C alculations........................................30 3.3 The Thermodynamic Cycles in the PDLD/S A pproach...................................... 33 3.4 The Free Energy Perturbation (FEP) M ethod....................................................... 40 3.5 The Linear Response A pproxim ation (LRA) A pproach...............................42 3.6 The Concluding Remarks o f the FEP/MD and LRA calculations..................... 44 3.7 The Solvation Free energies.......................................................................................45 3.7.1 B ackground S ig n ific a n c e................................................................. 45 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table of Contents 3.7.2 The PDLD Approach in the POLARIS P rogram ................................. 48 Chapter 4 Structure and Function of Aspartic P roteinases....................................................... 52 4 .1 Introduction ....................................................................................................................52 4.2 Specificity of Endothiapepsin..................................................................................... 53 4.3 Intermolecular Interactions......................................................................................... 58 Chapter 5 Computational Studies o f Energetic Features............................................................65 5.1 T he M olecular D ynam ics S im ulations...........................................................65 5.2 The Field Evaluations in the PDLD M ethod............................................................ 6 6 5.3 The Residue Contributions to the Binding E n erg y ................................................6 8 5.4 The Absolute and The Relative Binding Free E nergies....................................... 84 5.5 T he U nderlying Sam pling C onfigurations.................................................... 8 6 5.6 The Sensitivity of the PDLD and PDLD/S C alculations......................................92 5.7 The Effect of Partial Charge Distributions............................................................... 94 5.8 The Optimal Approaches in the PDLD/S C alculations......................................... 98 5.9 The Q SA R Analysis.......................................................................................................101 5.10 Concluding R em arks..................................................................................................104 Chapter 6 Studies o f the HIV-1 Protease....................................................................................... 106 6.1 Introduction.................................................................................................................... 106 6.2 M aterials and Computational Procedures................................................................. 109 6.3 C om putational R esults and A naly ses............................................................. 119 6.3 .1 The pKa Calculations.................................................................................119 III Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table of Contents 6.3.2 The Total B inding Energy and Its Com positions..........................121 6.4 The Residue Contributions to the Binding E n e rg y ............................................... 127 6.5 D is c u s s io n s .............................................................................................................. 136 6 .6 Concluding Rem arks.................................................................................................... 139 Reference................................................................................................................................ 141 IV Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. List of Tables Table 3.1 The solvation energy o f Na+ and C l' in kcal/mol (gas —> s o lu tio n ) .........................................................................................................................50 Table 3.2 The table o f calculated solvation energies for charged ion pairs of Na+ and C l '....................................................................................................51 Table 5.1 The X-ray names o f the enzyme-inhibitor complexes and the corresponding observed binding free energies.......................................... 67 Table 5.2 The solvation energy differences o f inhibitors in solution/protein (kcal/m ol)............................................................................ 84 Table 5.3 The PDLD/S results and the energy components (kcal/m ol)...................8 8 Table 5.4 The relative binding energies between eia and ic due to the partial charge o f S in the S 2’ p osition................................................. 95 Table 6 .1 The results o f pKa’s o f Asp 25 in HIV-1 PR and the binding energies o f d ifferent com plexes.............................................................120 Table 6.2 The calculated and observed pKa values of aspartic groups in HIV-1 P R and the D M P323 inhibitor................................................. 121 Table 6.3 The total binding energy and its components in kcal/mol. (£ p = 4 )............................................................................................................124 Table 6.4 The total binding energy and its components in kcal/mol. (eopt=3.782)..................................................................................................125 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. List of Figures Figure 2.1 The com putational flowchart in the POLARIS program for the binding free energy calculations using PDLD and LRA approaches..........................................................................................................12 Figure 2.2 The regions o f the protein system in the SCAAS m o d el.........................15 Figure 2.3 The structure of ligand molecule in the SCAAS m odel............................16 Figure 2.4 The structure o f ligand-protein complex in the SCAAS m odel................................................................................................................... 17 Figure 2.5 The structure of ligand (pepstatin) molecule in the PDLD m odel...................................................................................................................23 Figure 2.6 The structure of pepstatin complex in the PDLD m odel..........................24 Figure 2.7 The regions in the PDLD model....................................................................25 Figure 3.1 The thermodynamic cycle in the binding free energy c a lc u la tio n ......................................................................................................35 Figure 3.2 The free energy scheme for a system following the linear response approximation..................................................................................37 Figure 3.3 The thermodynamic cycle for evaluating the electrostatic energy o f the charging processing in the binding free energy c a lc u la tio n ......................................................................................................39 Figure 4.1 The schem atic structure of endothiapepsin with L363-564 com plex..............................................................................................................55 VI Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. List of Figures Figure 4.2 The cartoon drawing o f pepstatin with endothiapepsin com plex..............................................................................................................56 Figure 4.3 The functional determinants of the natural inhibitor, pepstatin, with endothiapepsin com p lex ....................................................57 Figure 4.4 The protein structure analysis and enzym e classification........................59 Figure 4.5 The chem ical structure of I 1-I4. All protein-ligand structure were obtained from the Protein Data Bank (PDB) with the corresponding nam es......................................................................................60 Figure 4.6 The chem ical structure of 1-5 and the corresponding name in the Protein D ata Bank (PD B )........................................................................61 Figure 5.1 The electrostatic energy distribution in the protein residues of the com plex L 363,564...................................................................................70 Figure 5.2 The electrostatic energy distribution in the protein residues of the complex P epstatin.................................................................................... 71 Figure 5.3 The electrostatic energy distribution in the protein residues of the complex H 2 6 I ...........................................................................................72 Figure 5.4 The electrostatic energy distribution in the protein residues of the com plex H I4 2 .......................................................................................... 73 Figure 5.5 The electrostatic energy distribution in the protein residues of the complex j 3 .................................................................................................. 74 VII Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. List of Figures Figure 5.6 The electrostatic energy distribution in the protein residues of the complex eia................................................................................................ 75 Figure 5.7 The electrostatic energy distribution in the protein residues of the complex j 5 .................................................................................................. 76 Figure 5.8 The electrostatic energy distribution in the protein residues of the complex ic ..................................................................................................77 Figure 5.9 The electrostatic energy distribution in the protein residues of the complex ig ..................................................................................................78 Figure 5.10 The average electrostatic energy contributing to the binding of endothiapepsin complexes as a function o f residues p o sitio n s P 3 ' ........................................................................................ 80 Figure 5.11 The average electrostatic energy contributing to the binding of endothiapepsin complexes as a function o f residues p o sitio n s P 6- P 3 ’........................................................................................ 81 Figure 5.12 The correlation between the calculated results from PDLD/s (Ep=4) and experimental binding energies...............................................87 Figure 5.13 Comparison of calculated PDLD/S results over 16, 8 and 4 configuration.................................................................................................... 91 Figure 5.14 The energetic variations of total binding energy and its compositions as a function of the number of structures.........................93 VIII Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. List of Figures Figure 5.15 The correlation between experimental and calculated binding free energies......................................................................................96 Figure 5.16 The ab initio calculations o f charge distributions in levels of AM I andH F 6-31G *......................................................................................97 Figure 5.17 The correlation of optimal approach in the PDLD/S, £opt=3.321, and the observed binding energies....................................... 99 Figure 5.18 The correlation between the calculated binding energy with nonpolar vdw correction and the experimental binding energ ies..............................................................................................................1 0 0 Figure 5.19 The linear regression method to correlate the calculated binding energies and the experimental values............................................103 Figure 6 .1 Ribbon drawing of the HIV-1 PR with JG-365 com plex....................... 110 Figure 6.2 Ribbon drawing of the hom odim er HIV-1 PR with JG-365 com plex..............................................................................................................I l l Figure 6.3 The specificity pockets in the HTV-1 PR with JG-365 com plex..............................................................................................................112 Figure 6.4 The chem ical structure of inhibitors for the HTV-1 protease..................115 Figure 6.5 The illustration of the thermodynamics cycle used for the pKa calculations in the POLARIS program................................................118 Figure 6 .6 The correlation of the pKa shifts (p-> p+1) and the observed b in d in g e n e rg ie s ........................................................................................ 1 2 0 IX Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. List of Figures Figure 6.7 The results of PDLD/S approaches are summarized with Ep=4............................................................................................................126 Figure 6 .8 The average electrostatic energy contributing to the binding o f HIV-1 PR complexes as a function of residues positions P5-P 4 ............................................................................................................... 129 Figure 6.9 The average electrostatic energy contributing to the binding o f HIV-1 PR complexes as a function of residues positions P 5-P 5 ’ ............................................................................................................... 130 Figure 6 .10 The electrostatic energy distribution in the protein residues o f the complex A-76928 (S ,S )....................................................................131 Figure 6 .1 1 The electrostatic energy distribution in the protein residues o f the com plex A -76889 (R ,R )............................................................ 132 Figure 6 .12 The electrostatic energy distribution in the protein residues of the complex A -74704............................................................................... 133 Figure 6.13 The electrostatic energy distribution in the protein residues of the complex JG -365..................................................................................134 Figure 6.14 The electrostatic energy distribution in the protein residues of the complex M V T-101............................................................................. 135 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Acknowledgments It is a pleasure for me to acknowledge the contributions o f m any people whose inspiration encouraged me to pursue my dreams and becom e who I am today. I would like to thank my parents wholeheartedly for their love and sacrifice, for their experience to teach me to be strong, smart and independent. I owe my debt to my teachers in China w ho led me into science and were responsible for my early education. I would like to thank Dr. Chu with all my gratitude for his patient explanation of program and his unselfish help to my project. I was fortunate to be in a wonderful and stim ulating place which allowed me to refine my research skills and develop my critical thinking abilities. My special thanks to Dr.W arshel who has shown m e an entirely different world as both the scientist and person; I really learned how to get the best from m yself and stretch my abilities. I would like to thank him for his support financially and otherw ise during my years here. I would like to thank many of my friends for their love and support for good times and bad times that I really cherish and treasure. I really owe my thanks to Andy for his unlimited support, encouragem ent and love. I learned to appreciate w hat I have. I want to thank my sisters for their efforts and love. XI Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Chapter 1 Introduction The specificity and selectivity of proteins are closely related to their unique three dim ensional structures and geom etry. The three dim ensional structure and function o f proteins are directly depending on the sequence o f amino acid side chains in the polypeptide. The physical-chem ical properties of the side chains divide the twenty am ino acids into several groups. Each am ino acid offers distinct characteristics, including the three-dim ensional geometry, size, chemical activity, hydrophobicity, etc. The distinctive combination o f amino acids and the complexity of am ino acid molecules themselves indeed deserve the investigations of the functions, structures and dynamics o f proteins at multiple aspects.*-2 T he studies lead one to tackle increasingly diverse problem s emphasizing the fundam entals o f physical, chem ical and biological activities. T hough the extensive investigations have involved the determination and analyses of proteins structures with graphical dem onstrations, the m ajor studies in this work have been devoted to developing and refining the computational methods with the aim of understanding the fundamentals of m olecular associations that govern biological activities. The force field approaches have been applied to quantitatively estim ate free energies o f the binding o f protein-ligand complexes in solution and pKg’s o f charged protein residues. A protein is usually determ ined at the structural levels in w hich the prim ary structure containing the sequence o f amino acids provides fundam ental and useful inform ation for further studies. T he secondary structure involves the formation of the structural architecture of proteins from the non-covalent interactions, such as hydrogen bonds. The tertiary structure o f proteins are determ ined by all possible interm olecular Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. interactions that are responsible for the protein folding. The most advanced structural type, the quaternary structure, is dealing with the functional groups o f proteins as well as the classification o f protein structures. A fter all we conclude that the im portance of intermolecular interactions is directly related to the formation o f the structure, shape and geometry o f protein, as well as the biological functions.^-^ Therefore, the puzzle of the structures and functions of proteins should be solved in such a manner that detailed and thorough interm olecular interactions can be quantitatively described and analyzed at an atomic level. Since the interatomic interactions are dominated by electrostatic interactions, the evaluation o f the electrostatic free energies from the intermolecular interactions would hopefully provide the best w ay o f correlating structure and function from a microscopic point of view. The availability of X -ray structures o f proteins offers excellent opportunities to model m olecules at a m icroscopic level. In this context, the inform ation o f three- dimensional structure has a pivotal role that provides the construction of frameworks for the conceptual models. The resultant graphic model allows us to explore the relationships o f three-dim ensional structure o f biological systems and their significant biological functions through the investigation of the interm olecular interactions considered to be responsible for many phenomenon, for example, the dynamics o f proteins and the specific functionality. The progress of studying the relationship of the structure and function of biological systems have been advanced to a point where the detailed structure at an atomic level can be taken into account to probe the chem ical and physical p r o p e r ti e s .T h e tasks are accom plished through the m erger o f the theoretical developm ents and the advanced computer simulations and technologies. In this work the studies emphasize the quantitative calculations o f m olecular association energy o f protein-ligand com plexes and the Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. consequently rational analyses would provide com prehensive understanding of the origin of the affinity and specificity of the binding. W e have chosen to study the affinity and specificity o f aspartic proteinases, including endothiapepsin and HIV-1 protease. Aspartic proteinases are characterized by the presence o f a deep active site cleft containing two aspartic acid residues that are essential for catalytic activityEndothiapepsin is one of typical fungal extracellular enzyme and has been studied extensively with many renin inhibitors in attempts to understand the specificity and affinity o f the fungal enzyme complex.^ The dim eric structure o f HIV-1 PR is similar to the other aspartic protease, each m onom er contributes half of the active site which includes two catalytic aspartic acids that are in all aspartic proteases. The extensive X-ray studies m ake both endothiapepsin and HIV-1 PR the best structurally characterized enzym es known. The am ple availability o f the three-dim ensional structures o f enzyme com plexes provide opportunity to study the specificity and affinity o f enzym e complexes computationally. The electrostatic interactions are considered to play a major role in many biological activities, such as association of subunits and ion channels in m em branes.• '•*'* Quantitative descriptions of m icroscopic electrostatic interactions m ight be an im portant aspect of studying the relationship o f the structure and function o f protein and D N A molecules. The implementation of the evaluation o f electrostatic energies is a vital process to examine this idea. It is anticipated that the developm ent of a com prehensive program to calculate the electrostatic energy not only offers an advanced approach to make an im portant impact on detailed studies which m ost experimental works can hardly access, but also provide an instructive tool to m ake intellectual predictions of how and why protein or DNA are going to function from m icroscopic viewpoints. The autom ation is our ultim ate goal and its practical applications are enormous and seductive. For example, if we can predict the ability of different ligands to associate with a receptor, for instance, a protein or DNA molecule. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. discriminating potentially strong-binding drugs out of the selective pool using the state-of- the-art com puter sim ulations, the process o f screening perspective drugs would be drastically reduced. Therefore, attractive applications have extended the capacity of the computational methods to assist in the search and modeling o f novel drugs in an innovative approach based on structural information and desired properties. The calculation o f electrostatic energy is, however, not a simple task in many aspects. M ost significant obstacles might be the nature o f long range interactions that need to be taken into account and the subtlety o f inhomogeneous environm ents o f biological s y s t e m s . I n c l u s i o n o f solvent effects presents a m ajor challenge in the com puter m odeling approach. R egarding the solvent models, the question should be w hether the model is reliable and flexible enough to be adopted in the application o f a com plicated system and capable to generate the desired chemical and physical properties. T he models of considering the solvent effects can be generally summarized as the follows: m acroscopic, m icroscopic and sem i-m icroscopic modeling. All models and their strength and weakness have been discussed and analyzed. In particular, the sem i-m icroscopic m odel (PDLD model) o f representing the solvent molecules as dipoles has been successfully implemented and exam ined to estimate m any biological activities. * ‘ ^■20 The results are quite impressive as in the binding energies and pK a’s calculations of charged residues as the m ajor accomplishm ents of this thesis. It is also advised to keep in m ind about the m odel that no matter which model and approach are chosen, the achievement o f accuracy is o f course the most im portant issue o f concern. M oreover, one should concentrate on w hether the proposed model and m ethod can provide the effective approach to help explore and understand the nature o f protein behavior from a microscopic point of view. A fter all, the power o f possessing the advanced techniques in efforts to explore the nature o f enzyme catalysis and inhibition will inevitably enhance our capacities to manipulate and repair the architecture of proteins at a molecular level. 4 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. It is always im portant and necessary to understand different approaches in which the com parisons o f advantages and drawbacks provide useful inform ation fo r the im provem ents that m ight lead to develop new methods. In the present work, w e have investigated and com pared several alternative methods including Protein Dipole Langevin D ipole (PDLD) m ethod, Sem i-m icroscopic PDLD (PDLD/S) m ethod. Free E nergy Perturbation (FEP) and L inear R esponse Approximation (LRA) and Semi m icroscopic LRA (LRA/S) for the free energy calculations. Considering the m icroscopic approach of the FEP m ethod, m any difficulties confronted in the practical calculations lead us to develop alternative approaches, including the semi-empirical PDLD and linear response approximation. This work has em phasized the development and refinem ent o f the Protein D ipole Langevin Dipole (PDLD) approach to the study of solvation and association free energies of organic com pounds, ions, ion pairs, polar and nonpolar ligands in the protein and in solution. The studies include the assessment of solvation energies o f ligand molecules in w ater and protein, binding free energy calculations, the evaluations o f pKg's o f charged groups in protein. The applications are focused on studying the electrostatic properties of the binding site in aspartic proteinase. The electrostatic interactions were found playing a key role in binding o f inhibitors to the active site o f the enzyme. The PDLD m odel is considered as a simplified solvent m odel that captures the average electrostatic forces betw een the solute and solvent and solvent itself. The implementation o f the PDLD model into the POLARIS package has been coupled with the m olecular dynam ics sim ulations to carry out the com putations. The following accom plishm ents are satisfied to com plem ent the PDLD approach and achieve the robust calculations in the POLARIS program. 1. The solvation free energies o f ions and ion pairs have been evaluated using PDLD/S approach. The results are found in good agreement with the observed values. The Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. physics of the compensatory nature o f electrostatic interactions is illustrated in details with the solvation energy calculations o f ion pairs in water. 2. M ethods of calculating the binding free energy are developed and refined. The binding free energies are calculated by different approaches. Among all the methods, it is found that the PDLD/S is the m ost reliable and consistent m ethod for obtaining reasonable results within an acceptable time frame. T he results of calculated binding free energies are found very impressive comparing against the observed values. 3. The group contributions of protein and ligand residues have been determined and analyzed. The results have revealed interesting insights into the detailed binding patterns in a way that the active residues o f protein as well as ligand are quantitatively identified. It is an obviously powerful approach for the structure-based m olecular design. 4. The im portant sam pling issues in PDLD and PDLD/S approach have been investigated. W hat is striking about the PDLD model is its elegant sim plicity and consistence. The dipole representations o f w ater molecules provide a feasible and robust model that is capable to seize the average force fields of solvent by the empirical approach. The computational efficiency has been exhibited in the given model to reach the energetic convergence much faster than any o th er m icroscopic m odel. The model has been successfully applied in many studies, such as pKa‘s calculations, binding free energy and redox potential energy, etc. It has been found that the successful calculations of both binding energy and pKg's in the PDLD m odel are not expected to rely on the massive numbers of configurations. This is partly achieved by including the locally solvent relaxation due to the charging process. 5. The sensitivity of the PDLD/S calculation has been studied on various issues, for example, the question o f what impact the constraint of molecular dynamics will have on the resultant free energies is investigated. To effectively and precisely evaluate the interactions Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. involving the ionizable residues in protein appears to be an essential task and real challenge in free energy calculations. 6 . The close relationship between the partial charges of protein and the binding free energy is quantitatively exam ined. To accurately obtain the partial charges of protein molecules presents a real challenge. The ab initio method has been em ployed to assess the charge distributions. W e found that the best residual charges that were used in the PDLD calculations are very similar to that obtained from the Ab initio calculations. 7. M ethod o f Quantitative Structure Activity Relationship (Q SA R) study of the correlation of binding free energy and its energetic contributors has been conducted. A linear regression method is developed to analyze the relationship of the total energy and its components based on the PDLD/S approach. Analytical m ethods have been developed in attem pts to rationalize the PDLD/S results in which the various interrelated factors of different contributions and the total binding energy w ould be analyzed in a systematic w ay. The ability to optim ize the computational method will undoubtedly enhance the predictive power of the approach. The linear regression m ethod has extended the concept o f QSAR theory and provided an analytical tool to interpret the PDLD/S results. The com putational m odeling provides access to the state-of-the-art visualization tools for displaying m olecular structure in three dim ensions. The integrated m olecular dynamics sim ulations o f the m olecular system under the influence o f the intermolecular forces can hopefully provide a real-tim e frame of m olecular animation. The resources o f computer hardware and software have been used to integrate computation and visualization tools for the problem s in structural and modeling chem istry with particular emphasis on elucidation of studies o f macrom olecular structures, m olecular interactions and molecular design. The equipm ent include program s for num erical and sym bolic computing of 7 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. structurai features o f protein m olecules as well as high perform ance interactive three- dim ensional color graphics interfaces fo r visualizations o f com plex m olecular structures. M oreover, the introduction o f the netw ork applications in the field o f bioinform atics is considered as one o f the vital resources for inform ation conduits both to obtain and disperse scientific data worldwide through the navigation o f powerful network engines. The applications o f PDLD m ethod to the binding free energy and pKg calculations have been extended to the HIV-1 protease and its inhibitor com plexes. The pKg's for ionizable groups of HTV-1 PR (Asp25) have been determ ined quantitatively in the native enzyme and the com plex form, respectively. It is interesting to observe that the binding activity is closely related with the values of pKa‘s o f the catalytically active Asp residues. All HIV-1 PR com plexes share a couple of common structural features. For instance, a favorable pattern of the protein and ligand interactions has been observed in all complex. All inhibitors are bound in the extended conformation and induce sim ilar binding modes in the active site. The group contributions both in the protein and ligand are evaluated by the electrostatic theory. It is found that the binding potency is governed by the specific interactions between the protein and ligand. The im portant and detailed com putational analyses provide the foundations that assist us to design and guide the strategies for the knowledge-basis rational drug-design. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Chapter 2 Models and Methods 2.1 Introduction Representing an infinite system by a finite m odel system is far from being trivial especially when the system is constituted by several inhomogeneous ingredients. The pitfalls of the early approach o f using a macroscopic m odel have been stressed in many studies. For the application o f a free energy calculation, a microscopic m ethod has been advocated by m any re se a rc h e rs.22 H ow ever, studies consistently show that the convergence problem associated with the all-atom m odel for the free energy calculation presents real challenge and requires considerable length o f computation time. Interestingly, it has been found successful in applying the alternative models of representing the solvent molecules as protein dipole Langevin dipoles (PDLD) in the free energy calculations. All models have been im plem ented in the software package POLARIS that are incorporated with the methods for the free energy calculations. We have investigated several alternative methods, including the Free Energy Perturbation (FEP), Linear Response A pproxim ation (LRA), Protein Dipole Langevin D ipole (PDLD) and semi-microscopic Protein Dipole Langevin Dipole (PDLD/S). The molecular dynamics (M D) simulations have been utilized with FEP m ethod for the free energy calculations. O ur previous studies revealed that FEP/M D calculations involved m ajor convergence problem s and lengthy execution tim es.23-25 LRA approach appeared to provide reasonable results for nonpeptide inhibitors o f the same fam ily but it needs significant com puting time. Thus, we have devoted our efforts into the PDLD and PDLD/S studies and found that the force field Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. approach in the PDLD method is more effective in reaching an energy convergence. Despite the convergence problem in the FEP/MD approach, the m olecular dynam ics simulations have been used for sampling the configurations that are collected for the subsequent PDLD calculations. Rem arkably, the PDLD/S method seems to be the best approach to produce sem i-quantitative results in a reasonable execution time. •'^ > 2 6 The overall com putational schem e in the POLARIS program for the binding energies o f protein-ligand complex by several methods is shown in Figure 2.1. As the m ethodology and computational analyses will be discussed in details in the later chapters, we shall focus on the models, namely the surface-constraint all-atom model (SCAAS)22.28 and protein dipole Langevin dipole (PDLD) m o d e l , 2 9 that are used for the FEP/LRA and PDLD/S calculations, respectively. In our approach the model system is basically divided into two regions: The interactive region o f a system is represented by a reaction (microscopic) region in which all atoms are represented explicitly and a continuum (m acroscopic) region for the remainder of the system. A user can choose to represent the solvent m olecules in the reaction region as: (a) a surface-constraint all-atom (SCAAS) m odel; (b) a Protein Dipole Langevin Dipole (PDLD) model; or (c) a hybrid model that contains both o f the explicitly water molecules and the Langevin dipoles as occupied in the fine region that is between the all-atom model and the continuum region (optional model). The special boundary conditions had been implemented on the inter-surface between the m icroscopic region and the continuum region in an attempt to better mimic the infinity system by a finite model. The Local Reaction Field (LRF) have been implemented in the m olecular dynam ics sim ulations and PDLD model in order to correct the long range electrostatic forces that are usually overlooked or present difficulty to develop method to effectively ev aluate.20 Both the special boundary conditions and LRF m ethod have enhanced the performance of free energy calculation significantly. The determination of the energy from the intermolecular interactions in the reaction region is obviously important 1 0 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The Computational Flowchart in the Polaris Program Starting structure (X-Ray) Relaxed A State Relaxed B State SCAAS Model and MD DLDModei and Methods Structure 1 (A) Structure 1 (B) Structure 2 (A) Structure 2 (B) 4 Structure N (A) Structure N (B) DLD/S Results RA Results 11 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. F ig u re 2.1 The com putational flowchart in the POLARIS program for the binding free energy calculations using FOLD and LRA approaches. The two pathways for separated m olecular dynamics simulations are determ ined by two different charge states. This special feature of considering the solvent reorganization energy due to the charging process in the biological system has obviously presented advantages to accurately evaluate and describe the biological activities of using molecular modeling. The assumption of Linear Response Approximation (LRA) allows one to evaluate the free energy difference between the two states, for example, charged (A) and uncharged (B) states, by simply taking the average over the two potential surfaces o f the A and B states. The schematic calculations for FDLD and LRA are distinctive. In the FDLD approach, each structure o f A and B states is collected at equal length over the consecutively m olecular dynam ics simulations; while in the LRA method, the energy is collected over many trajectories of molecular dynamics simulations. Most significantly, the molecular models are different in both approaches. In the FDLD method, the FDLD model is obviously used to represent the interm olecular potentials; in the LRA approach, the surface-constraint all-atom (SCAAS) model is implemented to determine the microscopic potential terms. The X-ray structure serving as a starting point provides a computational framework and represents the most realistic physical description. In the diamond symbol, the FDLD approach involves m ethods for electrostatic and hydrophobic calculations. This calculation scheme is implemented into the thermodynamic cycle for the binding process. For instance, the calculation o f free energy o f ligand in solution (a phase in the binding thermodynamic cycle) is determined by this process. The reorganization energy is evaluated from the average difference between the energies of A and B states. 1 2 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. because the energy from the reaction region is a m ajor component to the total energy of the system. It should, however, be realized that it is essential to include the average forces and energy from the part of the system not explicitly included in the model. In addition to that, the dynam ical effect of the continuum region should also be taken into account in the molecular dynamics simulations. The protein molecules are not rigid and static in any physically realistic system. The fluctuation of the protein molecules about their equilibrium position at a given temperature and the influence o f the solvent will be considered in the model system. The exploration of the conformational space of the protein molecules is accomplished by molecular dynamics simulations. The molecular dynamics simulations provide an effective method to sample the configuration spaces of a protein about its equilibrium position. 2.2. The Intermolecular Potential Forces in the SCAAS Model In the Surface Constrained All Atom M odel (SCAAS), the solvent and protein m olecules are represented explicitly in the reaction region for the m olecular dynam ics simulations and free energy calculations. The rem ained molecules of the entire system that are beyond the reactive region are represented as a continuum cavity. The description of the SCAAS model is given in Figure 2.2. The positions o f solvent molecules in the ligand and protein-ligand systems have been m odeled in a w ay that the density o f solvent equals to 0.998 g/cm^ at the initial state o f the molecular dynam ics simulations. The examples of the protein structures in the SCAAS model used in the simulations are given in Figure 2.3-2.4. The total potential energy for the whole system should be expressed as additive terms that involve microscopic representation o f protein molecules and their surrounding solvent shell up to a radius of 18 and the average forces from the continuum region for the rest o f the 13 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. system. This equation is derived from standard statistical m echanical relationships.^' The potential energy terms in the microscopic region are given: y S S ^ ^ bands angles dihedrals = S^w(^,-^o)+Ë^e(^,-eo)+ 2^t,(I + cos(ny,-a)) (2.2.1) i i i n n /T rr induced +2^(332 M i + 4e,_((-%.)" - i<j ’'ij ^ij i where V^s represents the interactions between all the solute molecules. and represent the solute-solvent interactions and the solvent-solvent interactions, respectively. The first term represents the covalent bond stretching along the bond equilibrium position bq. The second term describes the angle bending along the angle equilibrium position. The third form is used for the dihedral-angle interactions. The next term s include the coulomb interactions and the van der W aals effective non-bond interactions. A simple Lennard-Jones 1 2 -6 potential o f the two-body effects has been implemented in the com puter simulations to provide reasonable intermolecular interactions. The parameters e and C T , are provided by the polarizability and van der W aals radius respectively. In general. The energy param eter e increases with atom ic num ber as the polarizability goes up, a also increases down the group of the periodic table, but decreases from left to right across the period table with the increasing size of the van der W aals radius. The actual parameters are, however, calibrated regardless of what chemical properties of the particular atom(s) are involved. In tackling the nonbonding potential involving two unlike atom s, the venerable Lorentz Berthelot mixing rules are applied: (Jab=l/2(Ca+ab) (2.2.2) eab=l/2(£a+eb) (2.2.3) where a and b indicate two different atoms. The last term includes the induced dipole interactions of the protein. The energy is expressed in kcal/mol and distances in Â. The 14 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. F ig u re 2.2 The regions of the protein system in the SCAAS m odel used for m olecular dynamics simulations. The region 1 contains the charged group of interest. The ligand molecules that attack the enzyme active site are chosen to be the charged group in target. The region 2 is constituted by the protein molecules that are found within a sphere with a radius of R% are represented in a microscopic way. The region 3a made up by the w ater molecules that are found w ithin a sphere with a radius o f are represented explicitly. The region 3b contains the surface water m olecules that are subjected to period boundary conditions. All molecules of both protein and water in region 2 and 3 are allowed free to move. If R4 is chosen to be greater than R4 , the Langevin Dipoles are constructed in the fine region between R31, and R4 . The rest molecules of the system are treated as a continuum that includes the bulk solvent and the protein molecules outside of the region 4. 15 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The SCAAS Model of Ligand in Solvent Ÿ C % F igure 2.3 The structure of ligand molecule in the SCAAS model. The pepstatin molecule is shown in sphere. 16 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The SCAAS Model of Pepstatin Complex in Solvent ( Figure 2.4 The structure of ligand-protein complex in the SCAAS model Strand drawing of pepstatin- endothiapepsin complex is shown. The water molecules have been built penetrating into the protein space. 17 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. parameters qj are the residual charges o f the atoms (in units o f electron charge), a are in A. e are in kcal/mol and a are in Â^. The SCAAS model is im plem ented in the program EN ZY M IX , The static and dynamical properties o f the bulk w ater m olecules have been exam ined using the all atom solvent m o d e l .T h e results dem onstrate that the perform ance o f the model is quite consistent and alm ost insensitive to the size o f the model. These are, however, not the studies we intend to present. As far as the sim ulations of the charged groups and polar molecules are concerned, the calculations of electrostatic energy becom e very sensitive to the size o f the system. The application of the all atom model in the FEP/M D method will be discussed in the following section. As depicted in Figure 2.1, the SCAAS model that has been built around the protein are subject to the molecular dynamic simulations. The relaxed structures of protein system from the end of each state of molecular dynamic simulation are retained for the subsequent PDLD calculations. 2.3 The Langevin Dipole Model 2.3.1 Introduction The developm ent of microscopic m odels and approaches is aim ed at providing a detailed description in terms of com pletely energetic information and structural dynamics that implicate the specifically biological function. These results are perhaps the most direct evidences or factors for structure-function correlation in proteins. Theoretically, the microscopic model possesses more advantage than the macroscopic model in aspects of studying the solvent effects and dynam ics in the protein system. All interactions could be assessed by the explicit representations of the m olecular system. M ost significantly, the 1 8 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. m icroscopic m odel does not require the information o f the renow ned ‘dielectric constant' for the inhom ogeneous system . The m icroscopic approach is prom ising to lead a com prehensive and predictive com putational study by including the detailed solvent associated effects. T he pioneering examples o f successfully using all-atom model to simulate solvent molecules, for exam ple, works done by Barker & W atts ( 1 9 6 9 ) 3 3 and Stillinger & Rahman ( 1971,1974,1977)34-36 had invited opportunity for further application of sim ulations of com plicated m olecular system in polar solution. Considering both the polar group o f solute and possible reorientations of solvent molecules, the sim ulations unfortunately face form idable convergence problem s that could not be handled by current com puter power. The problem s are mostly associated with the accuracy of the force field representations, long range electrostatics treatment and various cutoffs limited by the current computational power. This rather disappointing conclusion of using rigorous all atom model prom pted us to devise alternative representation of solvent m olecules in the m icroscopic regim e. The premise o f applying solvent dipoles dates back to 1 9 4 1 , a pioneering model proposed by Stockmayer includes the solvent effects by calculating the interactions of both dipole-dipole and van d er W aals of dipole p o i n t s . 3 7 The arguable incapability o f this type m odel to reproduce solvent structure, or to represent the associative effects o f solvents were considered as secondary factors, especially when the solvation energy is concerned. As pointed out in the free energy studies, it is an exclusively key process to consistently calibrate the solvation energy and radial distribution function for the em pirical parametrization. It has been found that using the potential terms calibrated in the gas phase would produce the unrealistic solvent energy unless the potential terms w ould be recalibrated using experimental information. By incorporating the experimental data, the quantum m echanical evaluation becom es insignificant and can be negligible. The first attem pt o f using such approaches for evaluating the m icroscopic dielectric effect of 19 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. biological m acrom olecules has been introduced by W arshel & L evitt (1976).^^ They developed a m odel that includes the perm anent dipoles, charges and induced dipoles of proteins and that represents the surrounding w ater molecules by the Langevin dipoles (W arshel, 1978, 1981).*^’^^ They have successfully showed that the calculations of electric fields in proteins w ithout considering the fields obtained from the induced protein dipoles and the surrounding solvent molecules can be very m isleading. (Warshel, 1984. 1985)'* o.41 The feasible explanation is that the physics of solute-solvent interactions of a complex system can be captured by simple models. In contrast, the m acroscopic electrostatic theory that views m atter as a dielectric continuum fails to handle the inhomogeneous system in which the dielectric measured at different sites are not necessarily the same. In additional, the calculation applying macroscopic electrostatic form ula involves using the arbitrary-defined dielectric constants. For instance, w e use C oulom b’s equation to evaluate the energy o f bringing two charges from infinity to a given distance (3Â) inside a protein, where e is defined by reflecting all the electrostatic effects o f the surrounding of the ions. The charge-charge energy in protein would be over-estim ated by at least 2 0 kcal/mol if the net dielectric constant is assumed to be 3, a net electrostatic effect o f the surrounding water and protein is translated implicitly into the actual energetics o f the system. The use o f the B om ’s ^243 and O nsager’s macroscopic model to estim ate the energy associated with moving the charge and ion pairs from vacuum (e= l) to a given inhomogeneous medium can be quite problematic. 2.3.2 The PDLD Model and Its Intermolecular Interactions In this model solvent molecules are replaced by the Langevin-type dipoles that are translationally "frozen" and their interactions with both the protein m olecules and the other solvent molecules should be treated in a way that the average forces and energy produced 2 0 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. by the solvent are approximated. Each dipole is induced to point its local field direction by the dipoles in the protein m olecules and the other solvent according to the Langevin equation: = e % (c o th % ; — Y) (2.3.1 ) Xi where C j is a unit vector in the direction of the local field, p is a param eter, pq is taken as 0.35D . The equation is solved iteratively to generate the self-consistent Langevin dipoles. The relevant polarization law can be reproduced by calibrating the parameter p and the solute-solvent van der W aals distances for different atom types so that the solvation energies of various ionic species with different charges and radii are reproduced. This m odel is prom ising to lead a faster energy convergence than that o f using the all atom m odel. A Local Force Field approach to treat the long range interactions of the Langevin dipoles has a great advantage to reach an energy convergence and provides reasonably good estimates o f the solvation energies of small molecules in considerably short time. The description for the system o f interest in the PDLD m odel is the following: The protein m olecules and the solvent dipoles are represented explicitly in the reaction region and a m acroscopic continuum is used for the rest o f the system. In the reaction region, the m odel contains protein atoms occupying in their X -ray structure with calibrated residual charges representing the protein permanent dipole moments. C onsidering the construction o f solvent dipoles in the solvent region, three cubic grids, 15x15x15 w ith 1 inner spacing, 20x20x20 with 3 inner space and 21x21x21 with 4 inner space, respectively, are built based on the same point that is the center o f ligand m olecules at the protein active site. The resultant grids are then rounded to three spheres with radius o f 15Â, 20Â, 21Â. respectively. The solvent dipoles that are within the van der W aals distance o f the closest protein atom are truncated. T he grid dipoles are calculated by applying the Langevin 2 1 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. equations o f (2.3.1) and (2.3.2). The interactions of the nearest-neighbor Langevin dipoles are negligible. The PD LD m odel considers explicitly the electrostatic interactions of protein/solvent system, the field-dependent hydrophobic energy and nonbonding ven der Waals interactions. T he illustration o f actual protein structures in FDLD models for the calculations have been depicted in Figure 2.4 and 2.5. The total electrostatic free energy is a collection o f several energetic terms according to the four regions existing in the PDLD model as described in Figure 2.6: i^ K m l = AV'ee + + à V tM (2.3.3 ) Region 1 is m ade up by the charged groups of interest. In the system o f protein- ligand com plex, the entire ligand molecules are considered in the region 1. The term A V q q represents the intra and inter molecular interactions of ligand. The geometrical center of the ligand is the same center that has been chosen to construct the dipole grids. Region 2 contains the protein atom s within the sphere w ith the radius o f R%. In our system, R? is chosen to be 18 Â. AVq^ denotes the interactions between the charges in regions 1 and the residual charges of the protein in the region 2 . AVgg represents the energy o f the protein- induced dipoles. It is worth to mention that the protein dipoles in the region 2 are inducible while the dipoles of ligand are not polarizable in the PDLD model. Region 3 contains all the Langevin dipoles that are built around the protein and ligand atoms within the radius of R3 of a sphere. In our calculations, Rga and Rgy are chosen to be 15 À and 20 Â, respectively. AV/gvn represent the energies o f the Langevin dipoles. AV^om is the electrostatic energy associated with contributions o f the infinite system that are not included explicitly in the PDLD model. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The PDLD Model of Pepstatin Ligand in Solvent ii Figure 2.5 The structure of ligand (pepstatin) m olecule in the PDLD model. The ligand m olecule is shown in sticks. 23 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The PDLD Model of Pepstatin Complex in Solvent F igure 2.6 The structure o f pepstatin complex in the PD LD model. Strand drawing of the pepstatin- endothiapepsin in com plex is shown. Langevin Dipoles have been built penetrating into the space of protein. 24 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Bulk F igure 2.7 The regions in the PDLD model.The model is depicted with the structure of endothiapepsin- pepstatin complex. Region 1 contains the charged group o f the interest. The highlighted sticks that represent the ligand molecules are in the region I . Region 2 contains the protein groups found in a sphere with a radius of R ; . The region 3 contains Langevin grids within a sphere with a radius of R3 , including inner grid with a radius o f Rg^, and outer grids with a radius o f Rgy. All protein molecules that are found beyond the R % region are truncated to a backbone chain that are almost subjected with a strong constraint in the molecular dynamics simulations. The electrostatic interactions are considered explicitly in the region 1, 2 and 3. The rest o f the system is treated as a continuum region with bulk solvent effect. 25 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 2.3.3 Energetic Terms in the PDLD Model As a m atter o f fact, the determination of the electrostatic interactions in the PDLD model requires the evaluation o f different electric fields that describe how an electric charge or a dipole influences and is influenced by its surroundings. The m icroscopic interactions between charge-charge, charge-dipole, dipole-dipole consider the energetics in vacuum (e=I). The summary of different electrostatic contributions is described as follows: As illustrated in the PDLD model, the protein structure is represented explicitly in the three dim ensional space. The first term contains the intramolecular interactions of the ligand including the charge-charge interactions between the non-bonding atom s of the inhibitor is given by: V ^ g g = 3 3 2 T ^ (2.3.4) where the energy is given in kcal/m ol, the distances in  and the charges in atom ic units. The sam e units are used for follow ing equations. The next term considers the intermolecular interactions of ligand and enzyme, in terms of the charge-charge interactions between the inhibitor molecules and the enzyme : ^ e . = 3 3 2 X — (2.3.5) ’’ij The third term introduces the induced energetics of protein due to the other protein atoms and the solvent dipoles. The formulas are given by Warshel and Russell (1984): Vq„ = - 1 6 6 I p P^j (2.3.6) i = (2.3.7) where a . is the atomic polarizability of the ith atom of the protein molecules, is the local field on the ith atom of the enzyme molecules induced by the residual charges o f the rest of the protein molecules and the Langevin dipoles in the reaction region. Note that the ligand molecules could not be polarized in this case. The field, a collected field on the ith 26 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. protein atom due to all the other protein charge points, is evaluated w ith a screening function to reduce the induced dipole-induced dipole interactions: (2.3.8) nj ///" ,)) = I-G x p (-r^ / r j ) (2.3.9) where r^ is taken as 3Â. The field that is a collection o f all the dipoles applying on a protein atom charge is evaluated by equation (2.3.10): [ 11, - 3(r,yp.y)r,y ~f-] (2.3.10) yV . '"ij ’’a The field formulation is based on the assumption that a distance between the charge point and a dipole is significantly larger than the length o f the dipole. T he equation (2.3.6) is solved iteratively by evaluating the dipole m due to for the first iteration. Then the local field is evaluated by adding the field from other dipoles. The new dipoles obtained from the new local filed are used for the next interaction until the induced energy converges.22 The contribution from the interactions o f protein molecules and the Langevin dipoles is given by the charge-dipole interactions: G,j„„ = -3 3 2 X (n h l? + 7 ? ^ ) + v “ ^i) (2.3.11) k L L w here is evaluated by equation (2.3.1) and (2.3.2) and is the field on the kth Langevin dipole from the protein perm anent charge distribution and is given by equation (2.3.8). The field on the kth dipole due to all the other dipoles is evaluated by equation (2.3.10). The 1/2 factor reflects the fact that each dipoles pair is evaluated twice. The last term is the energy associated with forming the kth dipole. The contribution from the bulk of the system is given by: 27 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. w here the first term and the second term represent the monopole and dipole moments, respectively, is taken as 80. represents a spherical radius o f the grid with the Langevin dipoles, ry is the distance between the center o f the Langevin dipole sphere and the protein perm anent charge point. Thus, the total free energy o f solute-solute, solute-solvent and solvent-solvent interactions that are depicted by dominated electrostatic representations is determined by: = + + + (2.3,13) where AGg., = AC,,„ + AGs.a (2.3.14) As illustrated from above, the sim plified PDLD microscopic approach involves evaluations o f tens o f thousands o f interactions between charges and their surrounding dipoles (e.g. hydrogen bonds and other polar groups), w ithout confronting the intricate ‘dielectric co n stan t’ from the m acroscopic point o f view . All energetic term s are straightforward and meaningful. W e have discussed and presented the model system s that are im plem ented in the POLARIS program and the analytical methods for defining the total potential function of a m olecular system. T he microscopic electrostatic and dipolar theories are emphasized and beautifully elucidated and em ployed in the PDLD m odel. Compared with the all atom model, the em pirical PDLD model has been developed in efforts to capture the physics of the solvent nature in a com plicated system without confronting the configurational problem s as the m odel is incorporated in the molecular dynam ics simulations that will be discussed in next chapter. 28 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Chapter 3 The Free Energy Calculations 3.1 Background The free energy o f a m olecular system indicates the tendency of the atom s to associate and interact with each other. This is a crucial measurement in the studies o f many phenom ena in the biological system, such as the solvent energy o f small substrates, the association o f ligand-receptor and the stability of protein and nucleic acids, etc.. It is a very attractive goal to develop the methods in attem pt to predict this quantity. The free energy calculations incorporate the com puter-aided approaches, such as, molecular dynam ics or Monte Carlo simulations to evaluate the energies of the m olecular interactions under the influence o f the surrounding environm ents, em phasizing the practical applications of solvations and associations o f ionic, polar and nonpolar m olecules in solution and in protein. The statistical m echanical definition o f free energy is defined by the partition function that provides us the initial expression, the free energy difference AGab is defined by the potential surface of £a and Eg: 1 f dTe~^^^ Where P is called Boltzmann constant and equals to 0.6 kcal/mol at 300K. A and B are two nonphysical related state. A slight modification of the above equation yields the following: g-pU^-es)g-pe, B 29 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. By definition, the free energy equation can be expressed in a form of Boltzm ann ensemble average that involves the evaluations o f all low energy states for A and B o f the system at B: = - ^ l n [ < >g] (3.1.3) Regarding the com putation of free energy, the requisite ensemble is generated by either a molecular dynamics or Monte Carlo simulation. If we consider to determine the relative free energy difference o f protein betw een tw o ligands corresponding to states A and B, the above equation can be applied with M D simulations to sample all low energy configurations of states A and B. If a perturbation involves a change o f a ligand A to nothing, the absolute binding energy o f ligand A can also be obtained.^^ As pointed out above, the successful application o f free energy perturbation requires A and B are physically non-related harmonic states and sampling all low energy configurations in order to converge. The later requisite is indeed a challenge task when A and B states are too dissim ilar. W e may never sample enough o f configurational space even with very long simulations. 3.2 Applications to the Binding Free Energy Calculations To understand the correlation o f the binding specificity and the distinguished structural features, we have conducted extensive studies to determ ine the binding free energies with different com putational methods in w hich the relevant electrostatic interactions betw een the inhibitor and protein/solution are evaluated num erically. The im portance o f electrostatic energy for studying the relationships o f the structure and function in the biological system has been well accepted. The quantitative evaluation of electrostatic contributions using advanced computational methods w ould provide greater insights into the structural-based binding specificity and affinity o f the enzym e-inhibitor complexes. 30 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The strategy to select the best and most efficient approach for electrostatic calculations o f macromolecules is to develop different alternative methods and apply them in the biochemical system to calculate different observed properties and compare the results with the observed data. The perform ance o f different methods have been com pared and discussed on how well the calculations were able to: ( 1) reproduce the solvation free energy o f polar and non polar molecules. (2 ) produce quantitatively reliable binding free energy estimates in reasonable execution time. (3) provide predictive power to dm g design. The ligand binding processes are dominated by a range o f non-bonding interactions: van der Waals (dispersion) forces, electrostatic and hydrophobic interactions. It is essential to develop com prehensive m ethods to evaluate the reaction force field by inclusion of solvent effects w hen considering the stability of a protein and its inhibitor com plex in solution. The electrostatic interactions o f protein, ligand and solution are o f particular interest to consider. Many previous studies have revealed that the FEP/MD approaches with an all atom solvent model seem to be a very expensive task if one wants to obtain the free energy difference between two distinguished states.'^^>'^^ However, it is still not clear whether this approach will converge on a reliable free energy estimate within reasonable com puting times.^^^-^g ^ L in ear R esponse A pproxim ation (LRA) to estim ate the electrostatic com ponents, incorporating with a hydrophobic free energy contribution, is em ployed for a binding free energy calculation.'^-SO This m ethod is a simplified approach over the PEP m ethod that the sam pling of configurational space has been drastically reduced from the simulations of a series hybrid states to only two different states. That the linear approxim ation o f the LRA approach is not uncom m only used in predicting the properties of organic com pounds. The LRA approach in combination with a hydrophobic contribution provide reasonably good results for non-polar inhibitors but at the expense of very long com puting times.^^ 31 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The microscopic PDLD model and approach seems very promising as a remedy for the convergence problems that are encountered in the FEP m ethod. This is a hybrid model that consists o f partial inclusion o f the solvent molecules in a simplified way. The solvent m olecules are represented as Langevin D ipoles (LD) occupying a three-dim ensional spherical grid. Each dipole is induced according to the Langevin function and allow ed to point toward its local field. The protein and the surrounding solvent molecules are modeled in a way that the mean force field and the electrostatic field o f the system are approximated. In the PDLD m odel the calculations o f the solute-solvent interactions revealed that the energy convergence for the free ligands in the solvent dipoles required at least 10 iterations to reach the self-consistent energy using equations (2.3.1) and (2.3.2). In another individual evaluation of the solute-solvent interactions o f the protein-ligand com plex in solution, the energy would converge after 20 iterations. The com putational processes took around 10 and 2 0 minutes of the self-iterative calculations for the free ligand in solution and protein-ligand in solution, respectively. The recent report of using PDLD/S approach to evaluate the solvation free energy of a series of com pounds give impressively reliable results.^* M oreover, the PDLD/S is found to give consistent calculations o f the relevant properties of charged groups in protein, such as p K a calculations,^9.52-53 redox p o t e n t i a l , 2 0 - 5 4 catalytic effects. The present work explains problem s involved in the FEP/M D m ethods and elucidates how and why the LRA approach alleviates the problem s encountered in the FEP/M D m ethod. This work em phasizes the PDLD m ethod for the binding free energy calculations. The improvement of the PDLD m ethod is m ainly made by the various force field evaluations. Our results indicate that only the PDLD/S approach provides encouraging results within reasonable com puting time. Therm odynam ic cycles used in the PDLD/S method are presented in details in the next section. A linear regression method to analyze 32 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. the energy com ponents o f the total binding energy is presented. The results from the PDLD/S method and the linear regression approach are obtained and discussed. 3.3 The Thermodynamic Cycles in the POLARIS program The com putations o f binding free energy can be illustrated by the follow ing two therm odynam ic cycles. The binding free energy calculation of the PDLD/S approach is based on the thermodynamic cycle in Figure 3.1. As indicated from Figure 3.1, the theory of the binding process o f bringing the substrate to the enzyme active site involves the exchange o f the surrounding environment of the substrate from the solvation shell o f water to the binding site of the enzyme.^^ In other words, the exchange reaction of the binding process represents the change o f the dielectric surroundings o f the substrate from a polar medium (water) to a non hom ogeneous less polar dielectric medium (protein), AG;. In this cycle, AG, represent the energy changes of bringing the charged (A)/uncharged (B) ligand and the protein m olecules from the polar surrounding to the less polar protein medium: = - ( AG^o/ + AG^gi ) * (— ------— ) (3.3.1) ^in ^out ^^ sol = vn + ^^bulk + ^^liyd (3.3.2) 4 (3.3 :)) AGigvn 3.nd A G bulk are given in the (2.3.11) and (2.3.12), respectively. The evaluation of the energy, AG^^j, is associated with the determination of the field-dependent hydrophobic surface area due to the charging processing: ^G„^d = TAS,^.j (3.3.4) The modification of the early approach to estimate the effective surface area, AS^^j, in the combination of the electrostatic calculations gave excellent results in the recent studies 33 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. /M j 4 (A,B) Ligand of Interests (1): charged (A) and uncharged (B) states 34 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure 3.1. The thermodynamic cycle in the binding free energy calculation. A and B represents the charged and non charged states of ligand. The binding process is typically illustrated by the energy change of AG4 that involves the free protein and ligand in solution at infinite separation and the ligand-protein complex in solution. The process is further cascaded into several consecutive changes that are represented in the energies o f A G ;, AG 2 . and AG 3 . The equation of binding is given in the text. (1) We start with the ligand (charges o f interest) and protein molecules at infinite separation in water and obtain the corresponding hydration energies. These energies are evaluated in the PDLD model. (2) Bringing both the ligand and protein molecules from the outer medium to the inner surrounding as protein, 8 ^ . gives a free energy AG ; that reflects the corresponding change in the solvation energies of the ligand and protein. represents all other dielectric properties of protein, such as the induced dipoles, etc., that are not included explicitly in the PDLD calculation. (3) The ligand are brought to the binding site by the corresponding interactions of charge-charge and charge-dipole under the same dielectric environment, . The free energy is given by AG]. (4) The medium surrounding of the liganded protein complex is brought from 8 ^ back to Eout. The energetic change is given by AG 3 that is obtained by scaling the solvation energy of protein-ligand complex by a factor (l/£ ,^ -1/ Eout )• 35 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. of the solvation free energies of the polar and non-polar molecules.^^’^ ® The determination of hydrophobic surface area in the equilibrium system at room temperature involves Tiled- dependent evaluations o f solvent dipoles on the protein surface (w ithin 3 o f surface protein atoms): A S f,y j= A + B j ,C F ( ^ f ) (3.3.5) i ^ ^ ^phobic \ 1 - ?< W Çphilic Ç phobic (3.3.6) 0 The procedure o f calculating the field-dependent hydrophobic energy o f protein is described as follows: First, all the Langevin dipoles within the 3 of the surface protein atoms are identified and retained for the further calculations. Then the field on each surface Langevin dipole is assessed by including the effects from protein charges as well as neighbor Langevin dipoles. The value o f the field on each surface dipole is com pared with ^phobic and ^phiiic to give the corrected field F(^i) according to the equation (3.3.6). Each hydrophobic dipole is translated into a unit of surface area in cm^, as expressed in the constant C o f equation (3.3.5), C=1.26 cm^. The constant A and B are calibrated from a series of calculations by fitting the solvation energies to the experimental values. The non-bonding interactions between the charged ligand and the protein molecules represent the energy required to bring the ligand to the protein active site. This energy change is given by AG]; A G 2 = ( A V q, + Vq q) * - ^ (3.3.7) ^in where Wqq and V g^are given in equation (2.3.4) and (2.3.5), respectively. The last step, AG 3, represents the process o f bringing the charged ligand in the protein active site from the protein m edium back to the w ater environm ent. AGp+' represents the protein-ligand complex: 36 Repro(juce(j with permission of the copyright owner. Further reproctuction prohibitect without permission. AC3 = A G 5 ' * ( - L - J r ) € -out = AG|Pg™ + A G £ îl+ A G g ;' The total binding energy without reorganization energy is yielded as follows: fbind _ % = = ( AGi + AG2 + AG3 B) where AG4 = [ A G g ' - (A G ^ + A G ^, )1 ‘ € (3.3.8) (3.3.9) (3.3.10) (3.3.11) ^out The reorganization energy, X, is recognized as a steric energy to restore the electrostatic interactions that connect the energy states A and B as illustrated in Figure 3.2: A =< Vg - -A G ^g —< V a — y g > g + A G , MB (3.3.12) therefore, A = - ^ ( < A y >^ - < A y > g) = - ^ ( < AC/^,-^ >^ - < AC/^,-^ > g) (3.3.13) S ? I X F igure 3.2 The free energy scheme for a system following the linear response approxim ation. The PDLD/S calculation is based on fixed structural information, A G ^,„^ = 37 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. O A and o B designate the relevant energies are evaluated on states A and B, respectively. B indicates the uncharged state o f the inhibitors. Let us recall the outline of the com putational fram ew ork summarized in Figure 2.1, the assumed harm onic oscillator of each component that follows the linear response approximation is depicted above. The total binding energy is given: - ^^btnd(B-^A) = " ( < ^^bind ^^bind > g) (3-3-14) >A +A (3.3.15) Alternatively, the total binding energy is also given by: ^bind(B~*A) ^ ^ b in d ^A The feature o f including the reorganization energy in the binding free energy calculation is very unique. It is important to consider the energy change o f the solvent relaxation due to the m utation of charged group of interest to the neutral state. The PDLD/S can be referred to the scaled microscopic model that acquire the precision of the electrostatic calculation by scaling a dielectric constant 6 p , as indicated in the equation (3.3.11), reflecting the macroscopic philosophy o f binding process. >6.57,58 The dielectric constant also denotes the effects o f the remaining dielectric and the rest of the continuum would be taken into account implicitly. The constant has no connection with the m eaning o f the m acroscopic property of protein. It has been found that considering the charging process in the calculations allows one to use a small scaling constant, Ep, in the PDLD/S approach. The electrostatic binding free energy o f the semi-PDLD approach is based on the therm odynam ic cycle depicted in Figure 3.3. The p+1 represents the protein-inhibitor complex; p and 1 indicate the protein and ligand, respectively. The superscript w represents the dielectric surrounding is a water-like polar environment. The superscript p represents a lower dielectric surrounding o f the protein structure around the active site. The protein surrounding is a very inhomogeneous and less polar medium as far as its dielectric constant 38 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. AG, (p + /)g A G f AG? AG=0 AG] p '^ + C Figure 3.3 The thermodynamic cycle for evaluating the electrostatic energy of the charging processing in the binding free energy calculation. is concerned. The letters A and B represent the charged and uncharged state o f the ligands of interest. The cycle involves the calculation o f the energy difference due to the change of inhibitors from the charged state A to the uncharged state B. The energy changes, AG; and AG3, represent the solvation energy difference of the changes of the ligand states, from the charged ligand state (A ) to the uncharged ligand state (B), in solution and protein, respectively: ^ ^ 3 - ~^Ra ) > b ) = i[ ( < AA G Ç i > j + < >^ + < AVg„ + Afgg ) + « à A G Ç l > J + < AA CC^ >B + < A t'g . + AWgg >b)] = — [(< AAGig,„ > x + < AAC^m + < ^V gg ) where (3.3.16) (3.3.17) (3.3.18) (3.3.19) (3.3.20) +(< AAG,gj,„ >B+< ^ G b u lk >fl + < >fl)] AA = A^ - A g The energy, AG 2 represents the energy change associated with the process of bringing the non-charged ligand into the active site. This energy can be evaluated from the 39 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. <AG>b (3.3.10 and 3.3. II) in the thermodynamic cycle o f Figure 3.1. The equations of AG I and AG] are determ ined by the LRA approach. Therefore, the total electrostatic binding energy due to the change o f the charge state is given by: A G f" = AGj - AG, + AGz (3.3.21 ) 3.4 The Free Energy Perturbation (FEP) Method In the FEP/M D methods and a Linear Response Approximation (LRA) approach, the all atom solvent model is im plem ented in an effective way o f representing the solvent molecules around the protein active sites.^^-^ In the free energy perturbation approach, the free energy difference betw een any two dissimilar states r and p can be evaluated by equation: s c a r -> A p ) = - K bT \ t\ < e x p - [(V, - V ^ ) ! K qT] >p (3.4.1 ) where Kg is the B oltzm ann’s constant, T is the absolute temperature. and are the potential energy surfaces o f the reactants and products, respectively. The bracket o is the average ensem ble o f structures representative of the system at products. In the practical simulation, it is important to collect not only the configurations giving low energies for the reactant state but also those possessing low energies for the product state. One approxim ation in the FEP calculation is that we assume all low energy states can be effectively attained when the sim ulation is long enough. Obviously, this is not valid for all cases. In particular the product state is significantly different than the reactant state. In case of that the reactant and product are too dissimilar states, an um brella sample m ethod has been employed to alleviate the inconsolable sampling problem. If can be represented by any point on the potential energy surface of a reaction as a linear combination of and V„=V , + A,„(Vp-V,) (3.4.2) 40 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. w here and are the potential energy surfaces of the reactants and the products, respectively. A , is a w eight coefficient between 0 and 1. Then the Gibbs free energy change from the reactants, to the products, V^, can be evaluated by below: AG(Ao A„) = X 5 G (A „ -> (3.4.3) Increasing the coupling param eters A , in a systematic way from 0 to 1 brings the potential energy from V ^to Vp. A suitable choice o f the increment, A A ,, is required to make each o f the AV small enough to yield an accurate 5G. Since the free energy is a state function, the total free energy can be expressed as the summation o f the free energy increments, ôG, that can be obtained from equation (3.4.1). A typical application o f the free energy method is usually assumed that a simulation that involves macromolecules would retain essentially structural features after 10-100 of picoseconds. This requires the accurate force field representations and large cutoffs. The calculations of the free energy difference will inevitably encounter the inherent convergence problems due to the com plex nature o f the potential energy surface and the large dimensionality o f the s y s t e m . ^ I t is found that the sim ulations of the charged group in an aqueous solution face challenges when the net charge o f the system has been changed.'^^ The results reveal that a larger non-bonded cutoff is preferred to yield a better solvation free energy. The development of an efficient m ethod to evaluate the long range electrostatic effects becomes important for the simulation o f the polar molecules in solution. The local reaction field (LRF) method developed by Lee and W arshel provides an effective approach for the correction o f the long range electrostatic effects.29 However, the com putational approaches still require the sim ulations of many hybrid states betw een the reactants and products. In fact, m any researchers are still skeptical about w hether using the FEP m ethod for the autom ated binding free energy calculation will provide reliable values within reasonable com puting tim e. The lim ited potential energy functions and representations (such as using various cutoffs) have 41 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. barricaded the progress to attain the accurate biom olecular simulations in the free energy perturbation calculations. Incorporating with the im proved representations in the simulation, such as period boundary condition and long range electrostatic interactions, would enhance the perform ance o f the free energy calculation, but hardly escape from the convergence trap by the solvent configurations. W hat we should keep in mind about the FEP method is that in theory the approach will be able to produce the accurate results if using infinite sim ulating time, perfect parameters, large cutoffs and including many solvent molecules, etc. As a m atter of fact, we obviously did not reach that stage yet. 3.5 The Linear Response Approximation (LRA) Approach In order to sim plify the FEP/M D approach, the L inear Response Approximation (LRA) has been used to carry out the calculations. The L R A approach had been proposed in early studies o f redox potential energy calculation.^"^ W e have extended this method to the binding free energy calculation in aims to provide an alternative free energy study that is assumed that the solvent response o f macrom olecular interactions is followed by the well accepted linear approxim ation. The illustration o f the linear approxim ation is shown in Figure 3.2. The assum ption of the LR A approach that is m ade for Marcus type (Marcus, 1964)50 harmonic potential energy surfaces for the electrostatic forces gives the difference of the electrostatic free energy going from one state (the uncharged state B) to another (the charged state A): > g ) (3.5.1) where, AV = V ^ - V b (3.5.2) The energies, and V y s^, represent the potential surfaces o f states A and B, respectively, w ithin the approxim ation o f harm onic free energy functions of equal curvature. The validity o f the linear response in the solvation o f ions and metal ions have 42 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. been confirm ed in the study of Roux et al. (1 9 9 0 )^ and Âqvist, J. ( 1 9 9 0 , 1 9 9 4 ) . It has been indicated that the calculation o f ratios of AGeiec: (<AV>a+<AV>b) were less accurate for very weak field. O bviously, there is no direct evidence to point out w hether the associated errors are caused by enharm onic effects or the representations o f potential functions and more investigations are needed. Considering the LRA approach, the binding free energy is evaluated from the difference of the intermolecular interactions of ligand in the protein/solution host and the free ligand in solution state: AAG^w = (3.5.3) where = | ( < > ^ + < AV^^p'^ > g ) (3.5.4) ^ i ( < > g ) (3.5.5) where AV = -V g (3.5.6) Therefore, the trajectories o f ligand in water and ligand-protein in solution are generated independently from MD sim ulations, in which the trajectories on the charged ligand state (A) and uncharged ligand state (B) are considered fo r the average free energy calculations. The average free energy of each phase in the thermodynamic cycle is evaluated by the LRA approach. Therefore, the charging process has been considered in the total binding energy. To elaborate the total potential energy V in term s o f electrostatic interactions and nonbonding ven der W aals contributions, we decom pose the total energy into different component parts: ^ + ^g// + ^QW + ^Igv/i + ^bom * V " L + + v ' J z + v ü : + Where the Wq q is a collection of intra charge-charge interactions of ligand. The terms, and V g^, refer to the additive energies of interactions of ligand with protein and ligand with water, respectively. O bviously, the V g^ term is zero in the free ligand phase 43 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. calculation. is the solvent dipole contribution in case o f that the PDLD model is hybridized with the all-atom model. stands for the contribution that is not represented the interactions between the reactive region and the continuum region in the model. The van der W aals interactions are catalogued into the following: the intra-vdw interactions within the ligand Vyjy^; the inter-vdw interactions o f the ligand with protein, ligand with protein, protein with water, ; and protein-ligand com plex with water, . Clearly, The terms and are excluded in the free ligand phase. 3.6 The Concluding Remarks of the FEP/MD and LRA Calculations The FEP/M D approach is implemented in the ENZYMDC program. The application of m olecular dynam ics sim ulations, incorporating with special spherical boundary conditions, m ake the polarization effects o f the edge solvent m olecules having finite contributions. This treatment is, however, a very expensive approach to simulate the larger proteins even with the combination o f the pow erful Local Reaction Field (LRF) method. The LRF m ethod is capable of com pensating the long range electrostatic energy. Our previous studies o f using FEP calculations that based on the adiabatic charging approach indicated that the evaluation of the electrostatic energy needs a larger cut-off for the water- w ater interactions to yield better results o f the calculations.^^ A s far as the electrostatic energy calculations o f the charged groups are concerned, the convergence problem is still a m ajor source for the errors. The FEP calculations o f antibody-antigen interactions produce the binding energies that are not substantially better that the PD LD /S results. As noted, the averaging c.p.u. times is 100 times longer for the FEP calculations than the PDLD calculations. It should be stressed that the accuracy o f a free energy perturbation simulation is directly related to the adequacy of the sam pling o f the phase space. The application of using current com puter power and advanced methods for sampling the phase space is still a 44 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. slow and expensive procedure. The exploration o f the whole dim ensional system using com puter-aided m olecular simulations presents a great challenge for the free energy calculations. As m entioned in the method, the LRA approach is a sim plified approach over the rigid FEP method to estim ate the free energy difference. The m ethod has been used for such free energy calculations as the redox potential energy estimates'^^ and the calculations o f the affinity o f antibody-antigen interactions^^. The results o f this approach are not fantastic encouraging as expected, more studies are needed. The LR A approach is, in fact, a pragm atic and feasible method. Intrinsically, the LR A m ethod m ight confront the conceptual limitations from its simple harmonic assumption that describes the complicated nature o f the potential energy surface o f the protein. Again, it is hard to examine the validity o f the harm onically potential surface o f protein from the directly experim ental works. H ow ever, a recent study o f the binding free energy calculations reveals that these draw backs could be partially removed by carefully choosing the m olecules studied and running considerably long computing times to collect data as much as possible.'^* 3.7 The Solvation Free energies 3.7.1 Background Significance M any biological processes can be described as solvation processes. Thus, the calculation of the relative difference and absolute quantity o f the solvation energy of the different solutes in a solvent is an essential task for any model to study the chemical and/or physical process in a condensed phase. Solvation process implies that a solute is being solvated by a solvent in a more conventional sense. In a more precise description of the solvation process, the solvation should be considered as an inherently molecular interaction 45 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. with its surrounding medium, concerning about the m icroscopic rather than macroscopic environm ent that is responsible for the solvation process. M ost solvation processes are. however, recognized initially at a macroscopic level that involved ambiguity arises from arbitrary-defined constants. For example, the free energy change associated with moving a charged ion from vacuum to a given homogeneous environm ent is frequently referred to as the self-en erg y o f the ion charge and was first realized by the B om formula: 1 AG = - 3 3 2 ^ ( 1 - - ) (3.7.1) l a e where a is the effective radius o f the charge; and e denotes the m acroscopic-defined dielectric constant. With reasonable choice of the value of dielectric constant and the radius of the charge, one has no difficult to obtain satisfying result o f the self-solvation energy of any charged ion in a homogeneous solution. Unfortunately, one faces difficult of applying the equation in determ ining the macroscopic e o f an inhom ogeneous medium, such as proteins, and leads to rather meaningless energetics that reveals no information o f detailed solvation process and locally environmental changes. A nother example involves the calculation of electrostatic energy of an ion pair in a given solvent is according to the following formula: where b is the radius of the solvent cavity around the ion pair in Â, p is the dipole moment of the ion pair and e refers the dielectric constant o f m edium . The free energy is unfortunately depended on two phenomenological parameters, nam ely the cavity radius, b, and the ill-defined dielectric constant, e. As mentioned above, choosing the appropriate parameters, one could always obtain any desired results of sim ple systems. T o illustrate the difficulty o f using the ill-defined ‘dielectric co n stan t’ to quantitatively determine the free energy, we give an example of attempting to evaluating the 46 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. energy associated with bringing tw o charged from infinity to a given distance by the Coulom b’s law: AG = 332 — = - Ü 2 _ (3.7.3) The £eff in the equation represents the effect o f all other interactions that are not taken into account explicitly between the charge Q ’s and the surrounding medium in the given model. W hen the approxim ation represents the interactions in an inhom ogeneous surrounding, such as a protein in solution, the calculation became very problematic and produced rather meaningless energy o f the charge interactions in protein w ithout revealing any detailed inform ation. This is because the m acroscopic dielectric constant appears to be quite different from site to site in protein. Thus the microscopic simulations are required and reasonable to probe the dielectric properties of the solvent. Although the macroscopic theory appears accurate for evaluating the charge-charge interactions on the protein surface, it is desirable to explore the detailed contributions, i.e.. charge-charge interaction in protein, solvent effect and hydrophobic energy, which evidently requires a microscopic m odel to accomplish so that the fully description and understanding o f the solvation process can be achieved. The importance o f accurately evaluating the solvation energy have been shown that it is considered as a vital factor to correlate the function and structure relationship in the biological system. For instance, the quantity shows that how much polarity of the environment around electron carriers is and implies that the energy directly controls the redox potential.To correctly assess the electrostatic energy, it is crucial to obtain the accurate absolute solvation energies as demonstrated in the PDLD studies. As indicated in many of our early studies, the assessm ent of solvation energies of small molecules in solution is not the major challenge in this work, this is partly because 47 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. the flexible adjustm ent remains open for the van der Waals parameters or the atomic Bom radii. To face the real challenge o f evaluating the charge-charge interactions in m acrom olecules that has been addressed repeatedly in our previous studies, we should focus on the evaluation free energy o f charges in protein interior. It is found that the microscopic approach involves electrostatic compensation effect among contributions, for exam ple, the charge-charge interactions in vacuum (Vq^) are offset by the change in the solvation energy. The exam ple o f solvation energies of ion pair in water is demonstrated in the next section. 3.7.2 The PDLD Approach in the POLARIS Program D espite the arbitrary-defm ed constants in the macroscopic model, the calculations are rather m eaningless with no capability to give the explicitly energetic information of the solute and solvent molecules. The detailed knowledge of the solvation process that involves solvating a m olecule into its surrounding medium is really what is important to assist us to understand the nature of the solvent effects. The microscopic model appears a promising approach that w ill provide detailed description of change o f the locally surrounding properties o f the solute m olecules during the solvation p r o c e s s . A s realized in our early studies, the solvation process plays a crucial role in many biological activities. The solvent effects are directly related with chemical and physical properties of the system, such as pKa’s o f charged group in solution and protein; the redox potentials and binding free energies that m easure the affinity of macrom olecular association in solution. All of these m easurem ents o f the system are associated with the free energies of protein or small molecules in solution. The solvation energies of ions and organic com pounds would be effectively evaluated using the sem i-m icroscopic PDLD model that the average free energy has been 48 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. estimated from the collection of explicitly microscopic energy terms associated with moving the molecule from the vacuum to the solution. In the m odel, the solvent molecules are represented in a sim plified way in which the mean force field is evaluated by the Langevin equation. The applications include the self-energy or solvation free energy o f a series o f organic com pounds and twenty am ino acids in solution; and ionized residues in solution and in protein. O ne o f im portant purposes o f accurately evaluate the solvation energy is designated to calibrate the param eters in the PDLD calculations. For exam ple, the param eters o f A and B in the equation (3.3.5) of hydrophobic energy calculations are obtained from solvation free energies of series of molecules. The ion pair plays an im portant role in many biologic processes, including enzyme catalysis, subunit interactions and photosynthesis. Ion pair interactions are in fact the most common exam ples o f electrostatic interactions in proteins. Despite the definition for the self-energy of charges in proteins seems abstract, the values can be obtained from standard thermodynamic or kinetic properties. The accurate evaluation of ion pair interactions in proteins appears to be a key point to assess many properties in the biological system. For instance, the pKa calculations o f ionizable group in proteins and the binding free energies of protein-ligand com plex, as will be discussed in details in later chapters. As demonstrated in many studies, the electrostatic interactions play a key role in contributing to the solvation process. Even though m any other contributions, such as hydrophobic interactions, appear to be im portant, it is still desirable to develop m ethods that model the m icroscopic interactions from the perspective o f electrostatic theory. W e have show ed examples o f examining the solvation energies o f ions and the electrostatic compensation o f ion pairs o f Na+/Cl' in water using POLARIS program. The solvation energies of Na+ and Ck from gas phase to the solvation phase are give in Table 3.1. The total solvation free energy consists of interactions of charge and Langevin dipoles, AVigvn; the hydrophobic energy, AVhyd; the 49 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. energy from the continuum region that is not represented explicitly in the given model. AVborn: and the van der W aals interactions from solvent dipoles, AVvdw. The van der Waals energy in the PDLD model is given by: V 'v rfw = + <yj) / + < J j ) l i f r f (3.7.4) where i and j denote the atoms in the pairw ise interactions and r,y is the distance (Â) between the /th and jth atoms, e, is taken as 0.3 kcal/mol for all atom types, a represents the cutoff distance between the given atom type and closest Langevin dipole. The calculated solvation energies o f N a+ and Cl" ion in water were found in excellent agreement with the observed values. Table 3.1 The solvation energy o f Na± and Cl- in kcal/mol (gas-->solution) Igvn ^^hyd ^ V born AVydw AGsoi AGexp Na+ -92.52 1.19 -7.13 -1.83 -100.29 -100.3 ci- -68.81 1 .0 0 -7.13 -1.17 -76.11 -76.6 The solvation energies o f one ion were determined in the presence o f the second ion at distance of 8 , 10 and 12 Â. The results were presented in Table 3.2. The calculations were carried out using PDLD/S approach in the POLARIS program. The total energy is constituted by the energy of the charge-charge interactions, AGq^; the energy o f the Langevin dipoles, AGigvn; and the bom energy from the continuum region, AGbom. As shown in Table 3.2, the apparently com pensations between these terms are exhibited in all the calculation. The total energies are found to be quite consistent even thought the components are different. 50 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table 3.2 T he table of calculated solvation energies for charged ion pairs of Na±-and Cl in water in the PDLD/S results are given in kcal/mol 2 nd ion Dist (Â) born AGsoi Na+ -92.52 -7.13 -99.65 ci- 8 -31.12 -76.70 3.55 -104.27 10 -24.90 -76.82 3.58 -98.14 12 -20.75 -85.99 3.56 -103.18 Na+ 8 31.12 -104.97 -17.87 -91.72 10 24.90 -100.70 -17.82 -93.62 12 20.75 -94.98 -17.88 -92.11 ci- -68.81 -7.13 -75.94 Na+ 8 -31.12 -56.17 3.54 -83.75 10 -24.90 -60.93 3.52 -82.31 12 -20.75 -61.56 3.63 -78.68 ci- 8 31.12 -80.43 -17.81 -67.12 10 24.90 -76.30 -17.85 -69.25 12 20.75 -73.74 -17.78 -70.77 51 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Chapter 4 Structure and Function of Aspartic Proteinases 4.1 Introduction W hy do proteins have their uniquely three-dimensional structures and functions? How does the primary com bination o f am ino acid sequence determine the uniquely three- dim ensional structures? In what w ay do we describe the uniqueness o f the three- dim ensional structures? M any questions regarding the correlation betw een the different representations o f protein structures, and how the structures relate to the specificity and functionality, etc., can be listed on and on. In this particular specificity and affinity studies of protein complexes, we have attempted to understand that why and how different ligands show distinctive binding affinities to the same protein. W e have extensively investigated how the protein responds to different ligand as expressed in the structural change especially in the active site and the electrostatic interactions between the enzyme and ligands. What Is significant in the difference of the active sites helps us to understand why one inhibitor possesses more binding pow er over the other. Moreover, we are eagerly striving to find the answers to such questions as what are the m ost essential factors of bringing the ligand into the binding site o f protein and how they have been related. As indicated by many studies, there is no single factor to lead such a com plicated process of binding. This is not contradictory to the well accepted statem ent that the electrostatic energy is a major role in the course of binding. The hydrophobic effect and steric energy should also be considered. In this particular case, we attempt to draw a general relationship between these factors. M ore specifically, we represent these factors in a correlated function using graphical approach. At this point, we are convinced that it is unlikely to approach the answers of these im portant questions entirely from the experim ental studies. The developm ent of 52 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. com putational techniques would definitely extend our capacities in the study of molecular and biological science. It is not surprised to observe that many studies dealing with the biological systems that require detailed knowledge o f how molecules are interacted and the effect o f the surrounding environment develop the advanced techniques, such as computer- aided molecules sim ulations and the integrated graphics for visualizing structural display and interactive movement. The participation o f computational approach that has integrated the traditionally experim ental studies w ould definitely enhance our capacity to probe the structure and function relationship o f biological system . One argues, in fact, that the com putational study is the state-of-the-art science and should be viewed as an extensive experimental science. 4.2 Specificity of Endothiapepsin T he intrinsic binding free energy o f protein and ligand m easures the stability of ligand-protein complex. In an alternative remark, the binding free energy can be redeemed into a dissociation constant, K^,, representing an equilibrium state o f ligand phase, protein phase and ligand-protein complex in solution. The quantity indicates how efficient a ligand is going to bind to the protein in the active site — the crucial first step in the course of enzyme catalysis reaction. U nderstanding the interactions that stabilize the protein-ligand complex provides a fundamental description o f biological system at a m olecular level. The knowledge will guide the further applications that involve molecular based interactions. We chose to study the binding specificity and affinity o f endothiapepsin. The protein has been selected as the basis for comparative modeling of the human renin that is rarely extracted to purify and determined structurally. Human renin is known as a specific aspartic proteinase playing a key role in regulating the form ation o f the angiotensin II, a hypertensive agent.5^’ 5^ The inhibitors o f human renin are expected in lowing blood pressure. Many inhibitors of renin are found to have K, values in the nanomolar range with endothiapepsin. 53 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The similarity o f sharing the tetrahedral transition-state analogue m ight reveal the binding specificity and leads to discover of putative drugs. The aspartic proteinases are an important class o f proteolytic enzym es. They comprise a group o f bilobal enzymes that are significance with a deep pocket in the active site containing tw o aspartic acid residues that are essential for catalytic activity. Endothiapepsin, from the fungus E neothia p a ra sitic , is a typical exam ple of an aspartic proteinase. The X-ray structures o f endothiapepsin and a series o f synthetic and natural protein-inhibitor complexes have been determined in recent years.^^-^* The inhibitors H142 and L363,564 are known as effective drugs in lowering blood pressure in hum ans. The secondary structures that contain numbers of a-helices, strands and ^-sheets are illustrated in Figure 4.1-4.2. The similarities in the secondary structure have been shared in all the complexes. H261 is also a highly potent renin inhibitor. The natural compound, pepstatin. is found to be an effective inhibitor o f almost all of the aspartic proteinases. A detailed illustration of the active site of the pepstatin com plex has been shown in Figure 4.3. The example of three-dimensional depiction of pepstatin molecule interacts with the environm ents including the two catalytic Asp residues (32, 215) provides realistically m icroscopic picture o f the active site. The structural description dem onstrates that the aspartic proteases, being com prised by a pseudo-dyad relating two topologically sim ilar lobes, contains m ixed antiparallel and parallel P-strands and som e h elices.T he crystallographic investigation on the endothiapepsin-inhibitor com plexes indicate that the residues in the P i-P i’ position acts as a dipeptide analogue in the binding cleft. The two catalytic aspartates (32 and 215) lie in the centre o f the extended cleft to accommodate the active site. A unique feature of the active site of this aspartic proteinase is characterized by a formal negative charge that is believed to exist between the two carboxylates of Asp32 and Asp 215 of the enzym e. There is a general consensus that the form ation of the enzym e- inhibitor complexes involve a protonation of a carboxylate of either Asp32 or Asp215. 54 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The Structural Illustration of L363,564 Complex Figure 4.1 The schematic structure of endothiapepsin with L363-564 complex. The color arrangements are displayed as the following: dark stands for a-helixes; grey stands for strand; the lightest grey stands for turns. The catalytically active Asp32, A sp 2 l5 and a hydroxyl group of STA4 are indicated in balls-and- sticks. 55 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The Structural Illustration of Pepstatin Complex Figure 4.2 The cartoon drawing of pepstatin with endothiapepsin complex. The important functional determinants are indicated. 56 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The Structural Illustration of the Active Site of Pepstatin P 2 1 5 R 218 S T M E R B O P32 Figure 4.3 The functional detenninants o f the natural inhibitor, pepstatin, with endothiapepsin complex. The ‘flap’ region of residues 77-80 is shown. The protein residues of 34, 217, 218 and 219 having close interactions with ligand are drawn in sticks. The catalytic groups Asp 32 and 215 and a hydroxyl group are shown in ball-and-stick. 57 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The com putational FEP/M D studies revealed that the energy barrier between the protonation of the carboxylates o f Asp 32 and Asp 215 could be as low as 1 kcal/m ol and was strongly dependent on the configuration state.*®’** T he negatively charged environment of the active site might suggest a pH dependent mechanism for the hydrolysis of a scissile bond between the P p P i' position. The initial structural study suggests that the negative charge is m ore likely to be associated with Asp32.*^’ *^ The sum m arized structural inform ation including the detailed secondary structures are presented in Figure 4.4. It is not surprising to observe that the existing sim ilarities of the secondary structures in term s of the num ber of helixes, stands are found am ong all the protein-inhibitor com plexes. The chemical structures of all inhibitors are given in the Figure 4.5-4.Ô. The im portant difference of these two types o f inhibitors is that Figure 4.5 contains peptide inhibitors while Figure 4.6 has shown the non-peptide inhibitors. These two groups of inhibitors have exhibited distinctive electrostatic interactions as will be discussed in the next chapter. The high resolution structures o f the enzym e-inhibitor com plexes allow us to consider how the differences in the specificities are achieved. All inhibitors in the enzyme are bound in the extended conform ation in the specificity pockets. The X -ray studies indicate that most inhibitors (except H I42) contain a hydroxyl group at Pi position which enhances the degree o f charge-charge interactions between the ligand and protein. 4.3 Intermolecular Interactions It is important to realize that the intermolecular interactions are responsible for most physical and chem ical properties of biological system. T he com plicated nature of intermolecular interactions presents no conceptual difficulty. In this context, the behavior of a molecular can be completely described by the Schrodinger equation: 58 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Structural Information o f Proteins Hydrolase ( Acid Proteins) Structure: E ndothia a sp a rtic p ro tein a se (Endothiapepsin ) complex with pepstatin (4er2 Source: C hestnut b lig h t fu n gus {en doth ia p a ra sitic a ) Resolution: 2.0 No R-factor given Reference: V eerapandian,B -,C ooper,J.B .,SaIi,A -,B lundell,T.L .,J.M oI.B io., 1990,216,1017 Enzyme Classification number: E.C .3.4.23.22 (from PDB file: 3.4.23.6) Number of Residues: 330 Ligand : Iva-Val-Val-Sta-Ala-Sta (Six residues) Class: M ainly Beta Architecture: Sandw ich Topology: C om plex Proteases Secondary Structures: Helix (8): E47-E49, E58-E60, E109-E112, E l 2 6 -E l 28, E138-E142, E225-E232, E270-E273, E304-E307 Strand(24): E0-E6, E14-E20, E25-E32, E70-E74, E81-E87, E89-E91, E94-E96, E99-E104, E l 19-El 22, E l 50-E l 54, E l 63-E l 67, E l 80-El 8 6 , E192-E194, E l 9 6 -E l 99, E207-E214, E221-E223, E245-E249, E257-E261, E264-E268, E283-E285, E287-E289, E300-E302, E310-E315, E320-E325 Figure 4.4 An example of the protein structural analyses and enzyme classification of endothiapepsin with pepstatin complex. 59 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. B()C ------------- H tS PRO ■ -P H E H IS STA ‘ LEU HIS Y O H N OH Il (L363,564) IV A — VAL ' V A L — STA — A LA ai o I OH o 0 I2 (Pepstatin) BOC --------- H I S ------------- PRO -----------PHE ------------- H IS ----------------- LEU V A L ILE H IS OH I3 (H261) P R O H I S PRO-------------- PH E H I S -----------------LEU — —V A L ILE H IS I4 (H142) F igure 4.5 The chem ical structure of I 1-I4 . All protein-ligand structure were obtained from the Protein Data Bank (PDB) with the corresponding names. 60 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. O H J3 O H i< Eia 0_ N r i O H O H K Ic O H O H H N Ig Figure 4.6 The chemical structure of 1-5 and the corresponding name in the Protein Data Bank (FOB) 61 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. + Ei{R) - E\Xi{R) = 0 (4 .3.1 ) Zfi The solution of the equation is based the Bom -Oppenheim er assum ption that assumes the separated motions of the electrons and nuclei. This means that the interactions of atoms are prim arily governed by electrom agnetic interactions, the n uclear and gravitational interactions being completely negligible on the scale o f chemistry. This has proved to be the proper practice in most o f the cases in a way that no appreciable error is found. Therefore, the intermolecular potential energy surfaces are rely entirely on the proper representations of electrostatic interactions. T he analytical method o f solving the quantum m echanical equation for such a m ultidim ensional problem is alm ost im possible to achieve at present. Regarding the solution o f Schrodinger equation, the Bom-Oppenheimer approxim ation provides a way to drastically simplify the potential terms to a sum of inter- and intram olecular functions. To treat the macromolecular system in solution, the intermolecular potentials of the total system can be represented by the additive terms that describe the interactions o f solute-solvent and solvent-solvent that would be evaluated in a self-consistent way. The detailed elaboration will be presented in the next chapter. O ne note beyond this approximation that needs to be em phasized is the correction terms that correspond to the coupling of the electronic and nuclear m otions have to be considered in order to enhance the precision o f the intermolecular potential surface. There are two kinds of correction term s for the coupling o f the electronic and nuclear motions. The first is the diagonal corrections that shift the energy levels. The treatment is also called the adiabatic approxim ation that gives the best possible potential energy curves and surfaces. The second is non diagonal corrections that produce and lead to the energy transition between the quantum states. Since the thesis deals with the interm olecular 62 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. potential based on the B om -O ppenheim er assum ption, the detailed discussions and applications of coupling corrections are really beyond the scope of this work. Considering the basic electrostatic concepts in the macromolecules, we will address the issue from both m icroscopic and macroscopic perspectives. If we treat the electrostatic interactions at an atom ic level, the electrostatic energy is a collection o f atoms in a specified system , for example, a protein in solution, in which each atom is characterized by a given residual charge (Q/): U = 1 6 6 l Q i ( I ^ ) = ^ l Q i V i (4.3.2) i j> i ^ij ^ i where / and j run over all atom s o f protein and solvent and r,y is the distance betw een atom i and j in  and V; is the microscopic electrostatic potential on the /th atom. The total electrostatic energy is the collections of all possible pair interactions between atoms. W e can also define the microscopic electric field at any point of the system as: = -V ,V = 3 3 2 lG y ^ (4.3.3) y ' ■ ( / ■ where V ,- is the gradient operator with respect to the x, y, z coordinates o f the point r, and rÿ=r, -ly; and V includes the potentials from all atoms. The m acroscopic field, E, can be derived from the average collections of microscopic field: E = < (r)> x (4.3.4) where E is the m acroscopic electric field in a volume elem ent, x, o f the system . The microscopic field appears to clarify the definitions and distinctions of microscopic picture and the traditional m acroscopic electrostatic theory. The uniqueness o f m icroscopic filed should be obvious that one could evaluate the macroscopic E from but it is impossible to deduce the specific ^ from E. In the conventional approach of macroscopic view, the straightforward approach of electrostatic calculation is the application o f the Coulomb law: 63 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. AG = 332— — = _2iê_ (4.3.5) w here the Egff represents the electrostatic property o f dielectric continuum of the surroundings o f the charges Q i and Q 2. For example, if the two charges are placed 1 0 apart in the w ater, the m acroscopic dielectric constant, e, is taken as 80. It somehow appears to be a reasonable approxim ation to represent the electrostatic energy o f the interactions betw een the charges, Qi and Q%, in the w ater. It is obvious that the m acroscopic theory fails to deal with inhomogeneous system in which the dielectric properties of the system can not be represented by a simple dielectric constant. This chapter discusses the functional and structural characteristics in the secondary and tertiary structure o f the proteinases complexes. The 3-dimensional structure of protein could be described and analyzed in a highly organized manner in which the structural and geometrical features are detailed and compared. The highly ordered structural features have been conserved in each complex. The developm ent o f methods and models that allow accurate evaluation o f the details of inter- and intramolecular interactions of protein system are a vital process to study the correlation o f structure and binding specificity and affinity. The m icroscopic electrostatic descriptions m ight provide direct approaches that probe the relationships of structure and function o f biological system. The fundamentals o f electrostatic theory have been com pared and discussed from both microscopic and macroscopic perspectives. The traditional m acroscopic picture suffers from the inextricable deficiency that treats the medium surrounding as a dielectric continuum. 64 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Chapter 5 Computational Studies of Energetic Features 5.1 The Molecular Dynamics Simulations The three-dim ensional coordinates for the molecules I; to I4 were taken from the Protein Data B ank while the molecules I to 5 were kindly supplied to us by Âqvist. et al.-^ The X-ray nam es o f all the inhibitor-enzyme com plexes and the corresponding observed binding energies are listed in the Table 5.1. The X-ray structures serve as the starting point for all the m olecular dynamics simulations that are outlined below. The param eters of the partial charges and the neutral groups o f all inhibitor residues are given in the same fashion as that of the standard amino-acid library. The center o f the whole system was the same as that of the ligand. The total constrained potential energy o f the M D simulation equals to a quadratic constraint equation form ed by introducing a constant, k^wj, with the unit of kcal/molÂ'2. = (5.1.1) / where d f are the X-ray coordinates o f the system. All atoms in a spherical region with a 16 radius from the center were subjected to constrained molecular dynam ics simulations. That means it is a user-defined param eter, that its non-zero value usually yields to an external force that prevents the atoms swinging further away from their initial position. PC coM j is norm ally selected from 0.01 to I.O kcal/molÂ-^. The rem ainder atoms of the system were under the sim ulations o f a stronger constraint, w here K^ons was taken as 30kcal/m olÂ'2. This means that the atoms beyond the sphere (w ith the 16 À radius) are almost identical to their initial positions during the entire process o f m olecules dynamics simulations. O ne reason o f using a very strong constraint for this region might be that it is 65 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. unlikely to have strong interactions to occur between ligand and protein atoms so much far away from the active center. W e used the SCAAS solvent model for the M D simulations. The solvent molecules occupied a spherical space with an 18 cutoff radius. The density of water was taken as 0.998 g/m l. The intermolecular interactions between ligand and protein within an 21 cutoff radius w ere evaluated explicitly by the microscopic treatment. The contributions from the non-bonding pairw ise interactions o f the protein-protein, protein- water and water-water w ere evaluated with IO cutoffs radii and a long range treatment is integrated to deal with the interactions beyond the 10Â. (Lee & W arshel, 1992) The pairlists were updated every 30 steps. This means that the pairwise interactions between the new pairlists in the cutoff radius due to the atomic movements are realized very 30 steps of the sim ulation. The sim ulations o f the charged A state and the uncharged B state of an inhibitor in the protein started from the sam e X-ray structure. The initial structure was subject to a short relaxation o f 1 ps. Then the subsequent simulations for both states A and B would generate the corresponding configurations from the relaxed structure. The time interval between each configuration was 1 ps. 5.2 The Field Evaluations in the PDLD Method The protein configurations obtained from the m olecular dynamics simulations were subject to the PDLD calculations. The PDLD model is im plem ented in the POLARIS program . T he three dim ensional structure of protein system provides the vitally computational framework. The Langevin dipoles occupied a three-dimensional grid with a radius o f 18 of the m odel system . The center of the Langevin dipole grid is the same center as that o f the protein system . The average num bers of the total dipoles for all inhibitors in solution w ere ranging 2100-2200 while there w ere about 1300-1400 solvent 6 6 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. dipoles observed in the inhibitor-enzym e complexes system. The best optim ized three out of the thirty sets o f the Langevin grids were chosen for the further calculations to generate the self-consistent Langevin dipoles. These iterative evaluations o f the solvent and the protein induced fields were the m ost time consuming steps in the PDLD approach. A local reaction field approach has been employed to perform the long-range interactions o f the Langevin Dipoles.30 (See Lee and W arshel, 1992, for more details) This im provem ent provides essential correction o f the long range electrostatic energies that are excluded from the evaluation o f the dipole-dipole interactions using a cutoff radius. The field, ^ , o f the Langevin dipole in equation (2.3.1) is rewritten as: (I,■ + «,') (5.2.1 ) The short range field, , due to the Langevin dipoles within the cutoff radius, was endothianeosin ^ Inhibitor K ,(M ) ^^bindinç (kcal/m ol) L-363,564 1.6E-11171] -14.67 l2 Pepstatin 5 .0E -11173] -14.00 I3 H261 9.0E -11174] -13.65 I4 H142 1.6E-10171] -13.31 1 J3 -11.67126] 2 J5 -9.84126] 3 Eia -10.67126] 4 Ic -7.93126] 5 Ig - 6.57126] ^The references o f the dissociation constants and the observed binding energies are given in the bracket []. 67 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. evaluated at every iteration. The long range field, due to the rest of the Langevin dipoles, was updated at every iteration for the first five iterations and then was updated only once in every 10 iterations. The effective field on the ith Langevin dipole due to the nearest neighbors was truncated. This treatment enhances the performance of the POLARIS program significantly, in particular when there are more than thousands of solvent dipoles in the forced filed calculations. 5.3 The Residues Contributions to the Binding Energy It is of obvious interest to investigate the detailed energetic distributions in the protein complex that the energetic inform ation will reveal the insights into the specificity and selectivity of the enzyme. The significant and essential energetics in the active site can be thoroughly explored by calculating the residue contributions. In this way the vital factors that stabilize the ligand into the active site o f the protein can be explicitly identified and assessed. As pointed out from the catalysis study, the initial binding process opens the door for the entrance (or invasion) o f substrate (or inhibitor) into the active site of protein. The energetics of attraction and repulsion between a protein and ligand can be assessed by the electrostatic theory in which the additive energy o f each residue of protein as well as energies o f ligand are evaluated. The m ethodologies o f this calculation are rather straightforward if that the electrostatic energy is the major factor bringing the protein and ligand together is valid. In fact, the arguments o f the im portance of electrostatic in many biological activities have been presented quantitatively in the calculation of the antibody- antigen interaction in our earlier s tu d ie s .T h e energy of each residue is a collection of the energy o f every atom in the residue. This approach has been reported in recent study of the 68 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. electrostatic factor in Ras p21.^^ The accumulative energy on every atom is collected by equation: A G ,„ = X 3 3 2 - ^ ^ (5.3.1) i* j where i runs over the w hole residue on protein and j covers all atoms on ligand. The other electrostatic effects that are not represented explicitly in the model are represented in Eetf- The results are given in Figure 5.1-5.9 . The corresponding energy distributions of inhibitor residues have been determined in the same fashion as that o f protein residues and the results are sum m arized in Figure 5.10 and 5.11. The graphs indicated that energy from the interaction of the negative charged Asp diad was significant to the total binding energy. It accounts for the m ajority (-50% ) of the total electrostatic energy o f residues contributions. It is also observed that the quantity of this energy contribution is very sensitive to the geometry o f the two carboxylates and one hydroxyl group in the S j- S f pocket. The energetic variances o f this pocket in different complex might reflect specifically structural activities im posed by inhibitors. Moreover, other im portant interactions that involve the close contact o f flap region (Asp77, Gly78) have been determined quantitatively in different complexes. It is interesting to note that the active regions are conserved in all the complexes. More specifically, the regions that are mainly responsible for the stabilization are given in residues 7-18, 31-44, 75-85 (the flap region), 112-129, 216-228, 248-251, 277-283 and 291-311. For different com plexes, these active regions have been conserved while the observed variances may stand for distinctions. It is not surprising to observe that the catalytic region possesses the majority of the electrostatic energy in most cases o f protease com plexes. The region contains the catalytically essential two aspartates with one Asp residue ionized to facilitate the cleavage of the surrogate bond. T he results support the suggestion that the catalytic mechanism involves a protonation reaction of one of the aspartates (32) in the active region. 69 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. I i I X a I 1 s a I w I M I ,r I i *L % s Qom/ivoof) <8j9U 3 & 70 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 3 a £ i I I I I <5 i I u I •r! t s tc (J 0 W /J D 3 3 1 ) < 8 j 9 U 3 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. s $ s t s 1 X s I « I I u % 5 I w 0^ E < r < £ I 72 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. •r, t a ejj (jow /ivoot) K 8 j9U 3 73 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. i T3 I I I V w S 0 1 "C .2 Û V i >c I < £ I i i I I T i iri t g (jOUt/lV3}() ^ J 9 U 3 74 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. n ï i •o I I 0) g c I (A 5 U S S I g 7 oc I I N = a. I I I « ir! & E (]0 W /]V 3 )f) ^ J 9 U J 75 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. *n i "9 1 I I I * < s i 1 iâ u I i 5 e J s Q C t i 1 V s a » I ir k 3 e£ (jO U t/JVO ^) < SJ3U 3 76 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. I •o I s s I s *Û I I I < N O ac w i w & z (jotu/ivjii) £Su9U3 11 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. e t g - g 1 X I I X i % i •= S u e k. 5 I I (joui/jüo^) <3jbu3 O - •rî t & 78 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The electrostatic contributions from ligand residues have also been assessed quantitatively. It is found that in general the energetic contributions of P i-P i’ account for the majority to the corresponding total energy. As noted in Figure 5-10 and 5-11, the contributions from P2- P 2 ’ region determ ine the strength o f the total electrostatic interactions between protein and ligand, while others that are further away from the core of active site, for example, residues beyond P 3 and P3’ make insignificant impact into the binding potency, but the structural features in these regions deter the binding specificities. This finding is supported by the results from the energetic distributions in protein in which the strength o f ‘flap’ region is clearly dom inated by the length o f the inhibitors. These results support the same observations from the evaluation o f protein contributions that the most important stabilizing factor that discriminates the strong potency from the other is the nonbonding interactions in the S i-S i’ pocket containing two aspartates (32 and 215) and a hydroxyl group. If this is valid, the molecules should be designed in a way that the optimal energetics in the S2-S2’ pocket will be adjusted as long as the binding affinity is concerned. This is supported by the X-ray studies o f the protein-ligand com plexes that the extent of affinity in the S 2-S 2 * pocket determ ines the potency o f ligand w hile the length and structures for the rest of the ligand will define the specificity o f binding.®^ The above point is not contradicting to the observation o f X-ray study that the binding affinity is attributed to the nonbonding interactions between the ligand side chains and the enzyme binding pocket. However, a few inhibitors (such as H I42) lacking the hydroxyl group in the P |- P i’ position have shown exceptional binding affinity to the protein. The results indicate the importance o f other contributions that are allowed for the optimal adjustment in the binding process. Nevertheless, the determination o f the energetics between enzyme and inhibitor is fundam ental to understand the interactions that stabilize the endothiapepsin-inhibitor complexes. The participation of robust modeling of the electrostatic interaction provides a key for success of molecular based drug design. 79 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Comparison of Electrostatic Distributions of residues in Different Ligands o h L363.564 pepstatin H261 H142 < - 8 ' p i - p r P 6 P2 P2' P3' P5 P4 Residues Positions Figure 5,10 The average electrostatic energy contribtiting to the binding o f endothiapepsin complexes as a function o f residues positions Pg-P] '. 80 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. I I V I C 0 • £ 3 M 'B Ô 1 < Comparison of Electrostatic Distributions of residues in Different Ligands O' - 2 ' jS.pdb eia.pdt jS.pdb ic.pdb ig.pdb - 3 ' -4 ' P6 P5 P4 P3 P2 Residue Positions p i - p r P2' P3' F igure 5.11 The average electrostatic energy contributing to the binding of endothiapepsin complexes as a function of residues positions Pg-P]'. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. For a com putational convenience, the electrostatic interactions o f the protein in solution were calculated from the same trajectory of that in the ligand-protein complex after Ips of the m olecular dynamics simulations. The computational approximation is made that the protein m olecules adopt the sam e configuration in solution as that in the protein complex. In fact that it is not a bad assumption is substantiated by many X-ray structures of proteins and their ligand-binding com plexes which exhibit slightly structural differences,*^ but there are exceptions.*^ T he inclusion o f the protein phase in the therm odynam ics equilibrium o f the binding process has been taken into account for the free energy calculation. The recognition o f the protein phase in the binding process has frequently been omitted by other free binding energy calculation. To be m ore conscious about the structural difference o f ligand upon binding , it is realized that the average structure o f the ligand in the solution differs from that in the protein. The difference that is caused by the distinctively electrostatic properties of the ligand definitely deserves attention. To examine the energy difference, we have carried out the studies by running the M D sim ulations o f the inhibitors only in solution. The M D simulations would generate the configurations of the ligand about its starting position in the solution. The subsequent PD LD calculations on these configurations w ould give the average solvation energy of the ligand in solution. The average solvation energy differences o f moving ligands from water to protein medium over different configurations are given in Table 5.2. It should be stressed that as a general trend, the sim ulations of the polar inhibitors in solution yielded more negative solvation free energy than that o f using the same configurations o f inhibitors in protein. The solvation free energy of inhibitors in protein reflects the structure of the protein active sites. The solvent accessibility o f the polar inhibitors in w ater can be reached to the optimal value. On the other hand, the non-polar solute tend to have a m inim um exposed surface area in solution. The results of the 82 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. simulations demonstrated that the inhibitors with rich electrostatic property tend to lose the transitional and rotational flexibility in going to the form ation o f the enzym e-inhibitor complexes w hile the non-polar inhibitors have increased degrees of freedom in the protein active site. The translation of this improvement in the into the corresponding change of the total binding free energy needs further studies. 83 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table 5.2 The solvation energy differences of inhibitors in solution/protein (kcal/mol) (kcal/mol) A AC;;^''(kcal/m ol) If -92.83 -96.92 -4.09 ^2 -61.96 -70.42 -8.46 -97.51 -105.49 -7.98 I4 -104.31 -107.83 -3.52 1 -37.73 -32.60 5.13 2 -41.58 -38.47 3.11 3 -32.37 -30.66 1.71 4 -37.63 -29.82 7.81 5 -34.33 -27.60 6.73 The solvation energies of ligand in water and protein are evaluated using the PDLD/S approach. The PDLD/S gives reliable solvation energies of a series organic and amino acids compounds as reported in early studies, the same construction o f the grid dipoles to evaluate the solvation energies of ligand molecule is expected to retain the accuracy in the calculations. As a m atter of fact, we have to understand that the role o f the solvation energy is difficult to be completely realized in the binding calculations. This is partly because that, in m ost cases, the information of solvation energy is not available experimentally. Therefore, we consider the calculations provide row data in this respect. The results appear to reveal valuable insights of the properties o f ligand molecules as discussed in the text. 84 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 5.4 The Absolute and The Relative Binding Free Energies W e can appreciate w hat the POLARIS program could offer into the insights of the absolute and the relative binding free energy o f endothiapepsin to its inhibitors by taking and analyzing the accountability o f every contributor. As realized in the total binding free energy, in addition to the electrostatic interactions, it is essential to consider other factors that are involved in the binding process, such as the relevant hydrophobic and steric contributions. T he details o f the methods o f estim ations o f hydrophobic energy and reorganization energy are given in previous chapter. The hydrophobic energy is calculated from the corrected hydrophobic surface area of the solute that is evaluated from a comparative process by an effective field approach. (Luzhkov & W arshel, 1992) The steric effect considers the contribution of binding from the electrostatic change due to the charging process. T he results o f the total free energies and all the contributions are listed in Table 5.3 and Figure 5.12. The total binding energy, being com posed by several contributors, AGq^, AGiyg^, AG^un^, AG^yj and the reorganization energy, X, is evaluated by scaling a ‘dielectric constant’, Ep, for the electrostatic terms as presented in the PDLD/S method. AGq^ is a collection o f the charge-charge interactions betw een the inhibitor molecules and the proteins molecules that are represented explicitly in the reaction region. They range from -23 kcal/mol to -8 kcal/mol. In general, the free energy o f the interactions between molecules o f enzym e and the polar inhibitors (I1-I4) have more negative energy contributions than that o f the non-polar inhibitors (1-5). This suggests that the polar inhibitors are better stabilized in the active site. A missing hydroxyl group between P| and Pj' in the H142 caused at least about 10 kcal/m ol less negative contribution to the AGq^ com pared with other polar inhibitors. The im portantly structural feature o f ow ing the hydroxyl group of the inhibitors is dem onstrated from the free energy difference of the charge-charge interactions. On the other hand, this favorable energy contribution is 85 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. com peted with the adverse energy that com es from the solvent, AG[yg„. The polar molecules have better chance to dissolve in water than the non polar molecules. Therefore the larger com pensations from the solvent for the polar inhibitors have also tied into the total binding results. As com pared from the table, the hydrophobicity is increasingly exhibited with the increm ent o f charge-charge interactions, indicating that the activation cleft for a polar inhibitor is better protected from the solvent disturbance. As indicated in the energetic contributions, the entropy represents the favorable term in contributing to the total binding. The solvent and the rest of the protein effects that are not explicitly included from the missing part o f the system are evaluated by the energy, AG^uik- The calculated binding free energy were found in an excellent agreement with the observed data in the reasonable effective dielectric effect. 5.5 The Underlying Sampling Confîgurations Based on the PO LARIS approach o f binding free energy calculations, we will address the sam pling issues from two aspects that com prise the m olecular dynam ics simulation and the FOLD calculations. Regardless the formidable convergence problem in the M D/FEP approach, the molecular dynamics simulations offer an useful tool to relax and generate the structure applying the force field approach. In principle, the m olecular system will be eventually brought to the equilibrium states under the influence of the intermolecular interactions if given accurate potential functions and representations no m atter what the initial state o f the system is. This requires the simulations as long as possible. However, the facts that the limited potential functions and using of various cutoff radii needed by the current computer power are very problematic, especially, the simulations involve such large and complicated system as protein. The simulations may never lead to the equilibrium state. One solution to relieve the problem is to choose the incipient state o f the simulation closing 86 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. to the equilibrium state. As m entioned in the previous section, the introduction of an additional constraint term will acquire pow ers to monitor the dynam ic m ovem ent of the system. By adjusting the constraint, the structures are allowed to move in a very cautious pace. If we choose the initial state as the X -ray structure to run the molecular dynamics simulation, the trajectories of the molecular system can be instructively obtained. -10 - 12 " bû -16" -18 -16 -12 -10 -14 -8 6 Experimetnal Binding (kcal/mol) F igure 5.12 The correlation between the calculated results from PDLD/s (£p=4) and experimental binding energies for a series of endothiapepsin-inhibitor complexes. The total binding free energies were evaluated over 5 configurations that were simulated and collected by the molecular dynamics. The variances in the average free energies are indicated in the range o f error bars. The fluctuations are evaluated by the lengths of the relaxation (I-4ps) and the strengths o f the constraint molecular dynamics (kcon=001~10)- 87 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table 5.3 The PDLD/S results and the energy components fkcal/moH Exp. Cal. AAGou AAGqw AAGhvd X Il -14.67 -17.237 -23.336 16.809 -16.923 6.213 h -14.00 -15.900 -20.899 13.830 -14.463 5.632 h -13.65 -16.093 -22.993 18.606 -15.787 4.081 h -13.31 -15.038 -15.027 14.250 -16.853 2.592 1 -11.67 -15.126 -12.006 9.996 -16.547 3.431 2 -10.67 -14.746 -8.957 8.570 -15.682 1.323 3 -9.84 -13.838 -13.056 10.791 -15.724 4.151 4 -7.93 -13.648 -8.235 8 .2 2 0 -15.918 2.285 5 -6.57 -11.049 -9.947 9.050 -11.099 0.947 Exp. and Cal. represent the experimental and calculated values, respectively. is the collection of the charge-charge interactions between the inhibitors and enzyme. AAGgty represents the energy differences of binding from the solvent molecules, including the correction from the long range electrostatic energy evaluated by a local reaction field approach. The term, AAG/,y^ is the change in the hydrophobic free energy evaluated from the local field-dependent hydrophobic surface area. X . is recognized as a reorganization energy to restore the force for the electrostatic interactions. 88 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. O ther than the resources for the structural supplies o f the PDLD calculations, the molecular dynam ics also help to refine the local structures. O ur experience o f training the program to m eet our purpose lead us to believe that in practice the relaxation should not be needed too long. More uncertainties are involved after the com plex system experience a very long sim ulation given less than perfected parameters and representations. Therefore, a short relaxation is recommended and found to be useful for local structural refinement. The second aspect concerns the sampling problems in the PDLD calculation. The application o f the PDLD m ethod in the binding free energy calculation faces a major question regarding the numbers of configurations that are required to give a self-consistent energy in protein and solution. The twofold configurations refer to the first, the numbers of the protein structures, and then the numbers o f the solvent dipole grids that corresponding to the protein structure. C onsidering the num ber o f protein structures for the self-consistent force field evaluation, it is found that 4-8 structures are usually required in our studies as dem onstrated in Figure 5.13. It appears that the mean force field of the solute-solvent interactions could be effectively obtained over 4-16 configurations, more structures (more than 16) are not necessarily to improve the result at the cost o f very expensive calculations. The relevant explanation might be the fact that the average interactions of solute and solvent could be effectively captured in the PDLD calculations over several configurations. N evertheless, four configurations are proven to be the low est limit for the PDLD calculations. As presented in the previous section, thirty sets o f solvent grids are generated with one protein configuration that is taken from the MD sim ulations following a dipole-grid construction procedure that is detailed in the previous chapter. Then, the integrated non iterative calculations are carried out for each set of the solvent grid. The three grid sets with the lowest energies are chosen to yield the average Boltzm ann energy o f solute-solvent 89 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. interactions. This m ethod has been exam ined by calibrating the solvation free energies of a series of organic com pounds, reproducing excellent agreem ent with the observed values. W e conclude that the average solvent-solute interactions can be m odeled by the above setting o f solvent model in the solvation free energy calculations o f organic compounds and ions. W e used the sam e setting for the solvation calculation o f protein molecules to expect that the construction in the PDLD model can be successfully utilized in the macromolecular computation. O ur investigations to date suggest that the m olecular dynam ics sim ulation is a pow erful tool to explore the configurational spaces despite the facts that given the limited force field representations (including cutoffs, long range electrostatics, etc.) hampers the progress o f sampling enough configurational space o f m acromolecules. W hen we initiate the M D sim ulation from the X -ray structure, the system is m anually m onitored by introducing a constraint term in the M D simulations. The additional term is given in the equation (5.1.1). On the other hand, the PDLD approach have showed relatively stable performance o f the calculation o f the averaging force fields in protein and solvent system. The averaging solute-solvent interactions over 4-16 micro configurations generated from M D simulations provide a computational framework o f using PDLD model for the free energy calculations. 90 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Comparison of the calculated binding energies over 16,8 and 4 configurations - 10 ' • 4 configurations A 8 configurations ■ 16 configurations I - 12 ' -1 8 ' -20 -12 -10 8 6 -16 -14 Observed binding (kcal/mol) Figure 5.13 Com parison o f calculated PDLD/S results over 16, 8 and 4 configurations. It is interesting to observe that the calculations from 16 configurations are not necessarily needed to improve the results. 91 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 5.6 The Sensitivity of the PDLD and PDLD/S Calculations The stability and convergence of the PDLD calculations have been exam ined by Russell and W arshel (1985) on the pKa calculations. W e have also exam ined these sensitive factors in the bind free energy calculations. As the results indicated, the calculations o f different dimensional systems for a same enzym e-inhibitor com plex give different results. This is because the initial conditions of the sim ulations to generate each configurational space about its equilibrium positions are different. As presented in the binding free energy calculations, the application of the PDLD/S approach is sensitive to the structural changes. This is reflected in the variant energetics o f each structure. It is noted that even if each individual component is different for each configuration, the variances of the average o f total binding free energy are not significant. In other words, the evaluations o f total binding energy turn out to be quite consistent. The energetic variances of the total binding energy and its com positions for pepstatin complex have been dem onstrated in Figure 5.14. Apparently, the compensative nature of electrostatic interactions is a key factor that is exhibited in this approach. This compensative effect m ight be one reason to explain that why different solvents give similar solvation energy of NaCl in water (-102 kcal/mol) and methanol (-100 kcal/mol), even though each component appears distinctive. This is generally observed in the binding fee energy calculation in w hich the relative energy differences of AGq^ and AGigvn are changed in a way to opposite against each other. This suggests that the strong charge-charge interactions of protein are usually offset by the solvent effects. It is a m atter of degree o f com pensation, how ever, the fact that the compensative effects in protein and solvent are obviously more complicate than that of the single ion pair in solvent leads to consider effects other than the solvent contribution, such as the hydrophobic and steric changes. The hydrophobic effects appear to be the favorable contribution to the binding. 92 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 20 ' 10 ' 1 - 1 0 - -20 -30 2 3 4 5 ( N ^ A GQw # GTot ■ GQu 1 A Ghyd N um ber of Structures F igure 5.14 The energetic variations of total binding energy and its com positions as a function of the number of structures. G q ^ stands for energy from solvent; Gq ^ represents the energy from charge-charge interactions; G^yd represents the hydrophobic energy; X is the reorganization energy from the steric effect: and G r o t's obvious the total binding energy. 93 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. In exam ining the change o f electrostatic energy due to the effect o f changes in geometry, it is important to realize that the protein equilibrium geom etry changes upon the ionization o f its acidic group. This particular consideration is im portant for the aspartic proteinase because the X-ray structures were obtained under the ionization state of the active site. F or all inhibitors in this work, Asp32 was treated as an ionized group for all calculations. The consideration o f the effect o f changes in geom etry is also exam ined by calculating the geometrical r.m.s. (root of mean square) o f each configuration w ith respect to its initial position (X-ray structure). The r.m.s. values range from 0.9 to 1.2 for all configurations. This differences lead the fluctuations of the electrostatic energies around 2 kcal/mol. M ore studies are needed, however, to determine the correlation factor. 5.7 The Effect of Partial Charge Distributions As m entioned in the m ethod, the PDLD calculations are directly affected by the partial charges possessed by each protein atom. Our experience o f assigning the partial charge distributions in the PDLD model is based on the calibration o f the solvent energy of a series o f m olecules, for exam ple, organic molecules; twenty am ino acid; single charged ions and ion pairs. In the parameterization, the solvent energy consists all the contributions as described in the PDLD m odel. The assignm ent o f partial charge on each atom is especially im portant when the charged atom is either a new type o f atom that requires to calibrate or in the position of the atom is so sensitive to a structural change that can cause significant energetic shift. In this section, we examine the sensitive issue of im portance of the accurate partial charges in the POLARIS program to calculate the electrostatic free energy from two viewpoints. The first underlying point is em phasized on the role o f partial charge for each atom in the POLARIS program. The second part is directed to discuss the 94 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. methods o f how these partial charges were obtained in the current computational setting and how reliable they are com pared with the results from an ad hoc approach. W e have investigated the binding energy as a function of different partial charges of sulfur that has been existed in the both com plexes o f eia.pdb and ic.pdb, respectively. It is interesting to observe that the calculated energy difference between these two com plexes is increasing with the change o f the charge from 0.6 to 0.8. A t charge o f 0.6, the energy difference was barely recognized. Until the charge on S was 0.8, the energy change went to 3 .1 1 kcal/m ol, w hich was in a reasonable range com pared with the experim ental difference, 2.74 kcal/m ol. The results are given in Table 5.4. The partial charges have been evaluated by the ab initio calculation. W e have em ployed the G aussian 94 program to carry out the calculations to exam ine the partial charge distributions. The structure is initially relaxed at AMI level. The partial charges have been evaluated at both A M I and H F6-3IG * levels, the results are shown in Figure 5.16. the So’ DositionfS=0.6.0.7 and 0.81 are sum m arized Exp. (kcal/mol) 8= 0.8 8=0.7 8 =0 . 6 eia -10.67 -19.885 -19.849 -19.675 ic -7.93 -16.775 -18.703 -19.535 AAG 2.74 3.11 1.146 0.14 W e have show ed the corrected correlation o f binding energy and the observed binding energy in Figure 5.15. As dem onstrated in this case, the partial charges would 95 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. influence the electrostatic calculations to a great extent. Care should be exercised in the assignment o f sensitive charge distributions. -12 -14‘ I I - 20 ' -22 -12 -10 9 8 •7 6 Experimental Energies (kcal/mol) Figure 5.15 The comparison of the correlation between experimental and calculated binding free energies (in kcal/mol) for com plexes 1-5 in term s of changing the partial charges o f S atom in eia and ic compounds. 96 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. A M I results o f charge distributions on heavy atoms (-0.4954) O H- H -0.6389) / H (0.6791) / CH 7243) H ( O .I 1 8 4 \ (-0.7671) \ c ------- / \ O (-0.4998 H H Ab inito (HF6-31G*)results o f charge distributions on heavy atoms H- H \ ( ^ -0.4542) V: (-0.5451) O (-0.5520) (0.7939) / / •CH H (0.2420\. (-0.4523) /r " (-0.4998 H H F igure 5.16 The ab initio calculations o f charge distributions in levels o f A M I and HF6-31G*. respectively. 97 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 5.8 The Optimal Approaches in the PDLD/S Calculations To appreciate the optimal method in the PDLD approach, one has to realize that the m icroscopic evaluations o f every contribution are tem pted to take all the m olecular interactions explicitly into account. However, the limitations o f using various regional cutoffs and the convergence problems have practically hampered the progress to completely reach the goal. That m ight explain the reason of why the calculation of scaling the PDLD result w ould extend the precision significantly. It has been proved that, in the PDLD/S calculations, the m issing effects are implicitly taken into account in the dielectric constant. £p, as indicated in the therm odynam ic cycle in the binding process. It is w orthw hile to mention that the dielectric constant, £p, does not represent the m acroscopic property of protein system in any way. The extended scaling term would be considered as the correction that could take into account those effects that are not included explicitly in the PDLD m o d el. W e found that the approaches are quite consistent and produced fairly stable results in both pKa’s calculations and ligand binding free energy estimations. The fact that the existing method has been very successful in reproducing the solvation and total binding free energy o f well defined systems establish an important foundation that we can stand on for further studies. W e proceed to calculate the optimal binding energies for inhibitors 1-5 with the effective dielectric constant, Eopt. The result is given in Figure 5.17 and the £opi was found to be 3.321. Alternatively, we proposed an approach by incorporating the non bonding ven der Waals interactions into the total energy results. As noted that the entire approach o f PDLD method is based on the electrostatic interactions, the additional correction from vdw interactions w ould be expected to im prove the calculations. The total energy can be dissected into two parts as shown below: AGto'a' = AG PDLD/s + a AG vdw (5.8.1 ) 98 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. w here stands for the total energy, AGP["LD/s and AG''^*'^ indicate the calculated binding energy from the PDLD/S method and the energy o f the nonpolar vdw interactions, respectively. The a o f AG^^*'^ represents the weight factor that is calibrated by fitting the linear relationship for a series of com plexes, a was found to be 0.15. The result is given in Figure 5.18. - 8 ' -12 -12 11 9 -10 -8 ■ 7 6 Observed Binding (kcal/mol) Figure 5.17 The correlation of optimal approach in the PDLD/S, £opt=3.321, and the observed binding energies. The variances of binding energies are indicated in the corresponding range of the error bars. 99 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. As dem onstrated by both optimal approaches, the results are very stable and appear in excellent agreem ent with the experimental binding energies. The studies reveal that the calculations are expected to inevitably converge as the approach becom es more complete and depicts a realistically physical picture. - 8 ' I & - 10- I < -12 -12 -10 11 •9 8 7 6 Experimental Binding (kcal/mol) F ig u re 5.18 The correlation between the calculated binding energy with nonpolar vdw correction and the experimental binding energies. 1 0 0 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 5.9 The QSAR Analysis If we are going to claim that the approach o f the PDLD model and FDDL/S method for the binding free energy calculation is the best out o f all the other approaches with current com puter power, then in w hat ways do we claim that it is a reliable and predictive approach for the binding energy calculations. As the results indicate that the relative binding free energy shows reasonable trend comparing against the obtained values, it is, at least, an encouraging fact that the approach is successfully reproducing free energy o f binding. Once we have enough experience and confidence in producing the total binding energy, we move on to study more details beyond the total binding energy. If we look further into the com positions o f the total binding energy, the question is how we can interpret these contributions, and then in what ways these energetic com ponents are related with each other and to the total energy. It is extremely difficult to gain confidence in each calculated contribution by comparing against the experimental values. It is because, in most cases, we lack the corresponding experim ental results, nor could we experim entally access the quantitative measurement. The goal o f using the quantitative-structure-activity-relationship (QSAR) studies is to rationalize the results with the mathematical m ethod. W e are interested in extracting systematic information of relationship of the electrostatic energy with the other type of energetic factors, such as hydrophobic and steric energies contributing to the total free energy of the binding process. The paradigm o f Q SA R is an extending o f Ham m ett’s concept that was first studied in 1935.^^'^^ The applications of the concept into the biological system have received a lot attentions r e c e n tly In general, the method derives a function relating biological activity ( or any m olecular properties ) with some parameters describing the features of the molecules: 1 0 1 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. Biological Activity = ^ f i{ p a r a m e te r s ) (5.9.1 ) i In this application, the relationship is form ulated by analyzing the correlation between the total binding free energy, that is considered as a biological activity on the left- hand side o f the equation, and param eters o f electrostatic, hydrophobic energy, steric contribution and solvent reorganization energy, for a series o f protein-ligand complexes on the right-hand side o f the equation. The total binding free energy describing the associating tendency o f protein-ligand in solution is closely dependent on different energetic contributors that are taken from the average micro configurations. Each contributor is calculated independently. Understanding the correlation o f these energetic contributors and the total free energy is ultim ately useful and provides remarkable insights into the natural molecular interactions that w e can rely on for the structural-based molecular design. As presented in the PDLD/S m ethods the total binding free energy, , i s com prised of several different energy com ponents: AGg„, AG q„ , AG^^, and X. If we assume that the total binding energy is linearly proportional to its every component, the correlation of different com ponents and the total binding energy can be analyzed by the introduction of the linear regression method.^i Graphical methods are particularly useful to explore the relationship between these com ponent variables. The assum ption is made that all these components are independent variables which can be calculated exactly while the total binding, AG^"'j, is subject to random errors o f m easurem ent. For a convenient expression, let us suppose = AGg„, Xj = AG q^, = AG^^, = A and y = AG'^lj, the equation is given by: y = a x ^ + b x 2 + c x j + d x ^ + e (5.9.2) For an ith set of coordinates to be x,', x^, x^, x^ and y', then: = y ‘ - (ax{ + bx ' 2 + C X 3 + d x \ + e ) (5.9.3) 1 0 2 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. where is the residual value. In order to fit the best line we have to minimum )” • i This condition is satisfied by partially differentiating w ith respect to each i coefficients, a, b, c, d and e. Each equation equals zero. The coefficients and the best value y can be determ ined by solving a linear equation. The results are sum m arized in the figure 5.19. The corresponding coefficients are obtained: a=1.612, b=2.014, c=1.377, d=0.473 and e=-0.053. T he correlation coefficient R is equal to 0.963. Î I 6 7 8 -9 -10 -12 -13 -14 -15 -16 16 -15 -14 -13 -12 -11 -10 -9 8 6 7 Experimental Values (kcal/mol) Figure 5.19 The linear regression method to correlate the calculated binding energies and the experimental values. The presum ption that assumes that each component is linearly proportional to the total result allows one to apply in the case o f study. The weight of each contribution could be analyzed and determined. The mathematical m ethod is useful to identify and eliminate the systemic or mechanic errors that arc persistent in each com ponent and generate more reasonable results with optimal correlation between each contributor and the total energy. 103 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. This method provides analytical blueprint to rationalize the calculated PDLD/S results. The approach gives quantitative results that coincidentally agree with the argument that the electrostatic interactions between protein, ligand and solvent play a dominant role in the stabilization o f protein and ligand com plex. The results also p oint out the other im portant fact, that the hydrophobic contribution could not be negligible in the binding process. 5.10 Concluding Remarks Computational modeling of real molecular system offers an innovative approach to study and understand the insights into the physical and chem ical properties of macromolecules. One o f the most seductive goals o f this approach is the predictive power of the calculations that will ultimately enhance our ability to study many applications, such as autom ated and effective drug screening procedure by the com putational methods. The predication o f binding energy of protein and ligand interactions is one o f them. Developing methods are a great challenge. The optimal approaches have offered a com plete treatment that is attem pting to consider all effects into the free energy. The approach is motivated to achieve a completely physical picture that is mostly limited to reach by current technologies and not so-perfect method. The FEP/MD approach requires much longer com puting time to produce the binding energies that do not appear substantially better than the PDLD/S results.^3 The LRA approach is a very promising and effective method. The claim of robust calculations is still prem ature. Further studies are needed. The PDLD/S m ethod provides the optimal approach to produce mighty results in reasonable computing times. In the PDLD/S calculations, the calculated binding free energies for different inhibitors are in good agreement with the available experimental data. T he results show the importance o f structural information for obtaining reliable electrostatic energies. The PDLD 104 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. methods really provide an effective approach to evaluate various force fields introduced by the protein molecules and the solvent dipoles in the reaction region. The introduction of the local reaction field approach has successfully included the long range interactions of the Langevin dipoles. The improvement increases the perform ance o f the POLARIS program especially when required to include m ore than thousands o f the solvent molecules in the system. T o develop effective m ethods that will provide predictive pow er for automated drug design is a true challenge. T he m olecular dynam ics sim ulations provides a useful method to explore the configurations about the equilibrium space, It is, however, not clear how m any configurations are required to represent the whole dim ensional system in the calculations. N evertheless, the application of the force field approach in the PDLD and PDLD/S m ethods to reproduce the binding free energy offers an effective way to study the correlated force fields in enzymes and solutions and the stability o f the protein molecules. With the help o f the graphical m ethod, the relationship o f each com ponent variable that leads to the total binding energy is determined. The results indicate that the electrostatic energy plays a major role in the determination of the stability of the proteins. 105 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. Chapter 6 Studies of the HIV-1 Pro tease 6.1 Introduction The applications of structure-based, com puter-assisted molecular design have been used extensively in searching for inhibitors of protease in many recent studies.^3-95 The com putational approaches incorporated with the X -ray crystal studies have provided a powerful com bination to probe the putative drug.^^ For example, UCSF8, a putative drug was designed by com putational m ethods and cocrystallized with HIV-1 protease, has shown an exceptional potency resulting from the proposed residues in the binding mode.^^ The innovative approaches have offered enormous advantages that have not only enhanced our capacities to understand the m icroscopic process but also shortened the screening process to effectively search for both peptide and nonpeptide based drugs. The peptide- based m olecules are in general biologically unstable, poorly absorbed, and rapidly m etabolized.^* T he problems associated with the peptide-based inhibitors are not uncommon in the therapeutically challenged. The sam e barriers have been found using the renin inhibitors.^^'^o* The com puter-aided selection for nonpeptide inhibitors from the available databases has provided the synthetic fram ew orks with predictable potency and desired structural activities. In the course of drug design, the ability o f predicting the binding association o f drug to protein is an im portant process in attempts to select prospective m olecules. The successful implementations offer an innovative route to novel leading compounds and might play an important role in rational drug design. The reported PDLD m odel and PDLD/S approach to study the nature of enzyme inhibition have been successfully reproducing the trend of binding affinity for endothiapepsin to several of its i n h i b i t o r s . T h e Q SA R analysis reveals that the 106 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. electrostatic energy is a major factor in regulating the binding potency. Though other contributions, such as the hydrophobic and steric are found playing im portant roles in contributing to the total binding energy, they are correlated w ith the electrostatic energy in a way that the compensatory effects have been exhibited to yield the consistent result of the total binding energy. W e have extended the free energy studies to the applications o f the HIV-1 protease in aims of proving that the PDLD approach can provide a consistent approach to quantitatively evaluate the binding free energies with the structural information. Further studies o f energetic com ponents and detailed contributions would offer profound insights into the observations o f different stabilization o f the protease com plexes at a molecular level. Since the HIV-1 virus was identified as the m ajor agent to cause the acquired immunodeficiency syndrome (ADDS), * 02,103 much research has been conducted in efforts to understand the origin of the virus and develop effective anti-viral drugs to prevent the infection. Up to date, the approved anti retroviral therapies for AIDS are the 2 ’,3’- dideoxynucleosides: AZT, ddc and DDL The drugs are effective to block the DNA chain replication by inhibiting the viral reverse transcriptase (RT).i®^ However, the problems with drug resistance and many reported side effects associated with the long term usage of these com pounds have shifted the research interest into the developm ent o f effective inhibitors of the HIV-1 protease. It has been found that the HIV-1 PR acts as an essential agent to promote the development of infectious virous. Therefore deactivation of this protein is a crucial point to prevent the process o f production of virions. The X-ray structure of the native form o f HTV-1 aspartic protease as well as structures of the protein com plexes have been well characterized and determ ined in many recent studies. The availability of the X-ray structures allows one to use computational approaches to study the binding affinity and specificity. As a characteristic of aspartic protease, two catalytic aspartates 25/25’ are found in a deep cleft forming a hydrogen bond netw ork with a 107 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. hydroxyl group (except no such group) at Pi-Pi’ position o f inhibitor. This is the significant characteristic that HTV-l PR shared with aspartic proteases except that the active form o f HIV-1 PR is a hom odim er o f lI-k D a subunits (M eek et. al., 1 9 8 9 ).'07 The structural configuration of the protein residues within the active site (Figure 6.1 ) formed at the dim eric interface is almost identical from those o f the monomeric aspartic proteases. It is reasonable to suggest that the HIV-1 PR w ould proceed the sam e catalytic mechanism as that o f aspartic protease. As a m atter o f fact, the recent study o f DM P323 by the NMR determ ination has revealed a significant increase o f the pKa’s o f A sp 25/25’ in the protein active site upon the binding o f a seven num ber ring inhibitor.It is tempted to be seen as the result o f the protonation o f the catalytic residues in the active site. Therefore, the formation o f protonations have been suggested on two carboxylates of Asp 25/25’. It is believed that the pKa values are sensitive to local environment o f the active site. Thus, the know ledge of the pKa values of protein residues would provide an evident source to understand the local changes and interactions between protease and inhibitors. Other than that distinguished local interactions between the enzyme and different inhibitors expected, a similar binding pattern being shared am ong all protein and inhibitor interactions has been suggested by a recently com putational study o f HIV-1 PR and its com plexes."® In the previous chapter, a similar binding pattern has also been identified o f endothiapepsin with its inhibitors. T his work is a continuous and extended study o f the investigation of the electrostatic role in the protein-ligand interactions that regulate the activity o f enzyme. The application o f free energy calculations for the binding has been carried out using the semi- em pirical PDLD approach. The calculations are arranged into two parts. The first component contains the pKa’s calculations of protein residues, Asp25, in the native form of enzym e and in the protein-ligand com plexes. In particular, the pK a’s of several active 108 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. residues in the liganded-protein have been estim ated in the D M P323 complex. It is interesting to note that the pKa shift o f catalytic residue due to binding of different inhibitors is proportional related to their binding energies. The second part focuses on the relative and absolute binding free energy estim ations o f a series o f HTV-l protease to its ligands. The extensive studies and analyses of the electrostatic interactions between protein and ligands in solution lead to detailed knowledge o f the interactions that stabilize the protein complex. The conclusions o f this work provide a better understanding of molecular original o f ligand binding and specificity o f HTV-l protease. As m ore results of the applications of the PDLD/S approach are obtained, the PDLD/S m ethod appears to be the evidently robust approach in predicting the pKa’s and binding energy based on the structural inform ation o f the biological system . The com prehensive analyses of the energetic com positions by the Q SAR approach have revealed the im portant notation as expected that the electrostatic interactions are the m ajor force to regulate the local environm ents of the active site. 6.2 Materials and Computational Procedures W e obtained X-ray structural information o f five potent protein-ligand complexes from the Protein D ata Bank with the ligands available in the HTV-l protease active site. M VT-101 is a reduced amide-containing inhibitor.• It was crystallized with HTV-l PR in an aim at understanding the structural and functional characteristics of the enzyme-ligand com plex at an atomic level, which will be useful in determining the molecular original of substrate specificity and in the design o f H IV -1 PR inhibitors. T he JG-365 inhibitor containing a hydroxyethyl-amine linkage is considered as highly potent molecules to the HIV-1 protease.i>2 Its high potency was expected from the strong interactions involving the hydroxyl group in the P; position. The examples o f structural illustrations of JG-365 109 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. The Structural Illustration of JG-365 Complex F igure 6.1 Ribbon drawing of the HIV-1 PR with JG-365 complex. The catalytic residues are indicated in black. (Asp25/25’ and a hydroxyl group) 1 0 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. The Structural features of JG-365 Complex Figure 6.2 Ribbon drawing of the HIV-1 PR homodimer with JG-365 complex. The catalytic residues are indicated. (Asp25/25’ only) Two homodimer of H IV -1 PR forming active lobes are denoted in dark and grey, respectively. I l l R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. The Structural Illustration of Active Site Complex rz!> T = :': Figure 6.3 The specificity pockets in the HTV-1 PR with JG-365 complex. The active site contains two identical tripeptide sequence, Asp-Thr-Gly, on the interface of the two lobes of homodimer of H IV -1 PR. That a hydroxyl group firom POH residue at P i position is a crucially functional group interacting with two catalytic groups are represented in black. Other close interactions between Asp29 and Asp30 are indicated. The dots denote the vdw radii of atoms. 1 2 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. complexes are shown in Figure 6.1 and 6.2. The detailed active site description involving the interfaces o f two lobes from each m onom er is depicted in Figure 6.3. The hydroxyl group was also found in the same Pi position o f A -74704 inhibitor.• ^3 The C2 symmetrical inhibitor, A-74704, is less potent than JG-365. The chemical sym m etries of the inhibitors that cause distinctive binding patterns have been investigated by exam ining the detailed interactions between protein and ligand that reflect the difference o f the total binding energies. A76928 (S,S) and A -76889 (R,R) have been selected for the calculations."^ XK263 and DMP323 containing a seven-number ring are the result o f the modeling studies in attempts to mimic the catalytic water in the active site."^ Replacing both phenyl groups by two hydroxyl groups has led to 10 times decrease o f the K, value for DMP323 HIV-1 protease. The calculations o f pKa values of the catalytic residues and relative and absolute binding free energy have been evaluated using the POLARIS softw are program. Since the binding process is occurred in the solution, the energetic impact of the interactions between protein and solvent molecules becom es essential force that is responsible for the biological activities. In the POLARIS program , the m olecular associations o f protein, ligand and solvent are described and quantitatively evaluated by the m icroscopic Protein Dipole Langevin Dipole (PDLD) model, in which the energies are divided into the following types: the electrostatic interactions between protein and ligand, protein and protein, protein and w ater, ligand and water, w ater and water; the hydrophobic energy from the solvent- inaccessible area and the steric energy representing the change o f different charging state of ligand. The thermodynamic cycle o f the binding process yields the total binding energy is: = [ACS' - (ACL + AGS » * - - ir) + (6.2.1) ^in ^out ^in w here A G ^ , AGj^/y, AG^/^ designate the solvation free energies o f protein-ligand complex, ligand, and protein, respectively; V gg and Vg„ represent the microscopic 13 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. JG -365 CH, O D e 0 Thr 0 N ie 0 Gin M V T -101 O V al O H V al O O Phe Phe O A -7 4 7 0 4 O V al O H Phe O CH, ryf'^=-VYV.\'V'0 CHj 0 Phe O H V al O A -76928 (S,S) ? ' J H V al O A -76928 (R,R) 14 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. O H H O X K 263 O H H O ' O H H O D M P 3 2 3 Figure 6.4 The chemical structure of inhibitors for the HIV-1 protease. interactions of the charged atoms within the ligand and between the atoms in the ligand and protein, respectively; is the dielectric constant of water; refers to a ‘dielectric’ scaling constant. The evaluation o f each energetic contributor at different phases (including ligand in solution, protein in solution and protein-ligand in solution) is determined individually. This means that the separated molecular dynamics simulations of ligand in solution would be carried out in the POLARIS program in attempts to consider the solvation effect due to the configurational difference of ligand upon moving from solution into the protein active site. The illustration o f com putational framework of the POLARIS program that includes the m olecular dynam ics sim ulations to generate the configurational structures and the corresponding PDLD/S calculations of the protein structures from the MD results was given in Figure 2.1. To clarify the differences of two distinct yet corporate parts in the POLARIS program, we have reinstated that in the molecular dynamics simulations for the PEP and LRA calculations, the SCAAS model is im plem ented to represent explicitly 15 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. solvent atoms; while in the PDLD/S calculations, the Protein D ipole Langevin Dipole (PDLD) model is applied to evaluate the average energy of solvent-solute interactions. The presum ption of the linear response approximation allow s the m ean PDLD/S energy being taken from the average energies of two different charged states o f the interested group. This unique feature in the PDLD approach provides the consistent treatment of the protein structural relaxation upon formation of charges. The form ula adopts the sam e concepts as discussed in the LRA approach that assum es the energies of both states contain several collective harmonic terms: ^^PDLD/S ~ ^ r '’ + < ) (6 .2 .2 ) 2, - V . (6.2.3) where rp represents the protein structure generated with the corresponding charged state q. The two charge states consist o f the charged state and zero charge state for each atom. AV corresponds the potential difference of the protein structures generated from the change of the charge states from q=0 to q = ^ . The averaging process is also im plem ented in the evaluation of pKg’s. (see Sham, Y. et al. submitted) Com paring with other m icroscopic approaches, such as FEP and LRA, the semi-microscopic PDLD/S calculations o f the same system possesses significant advantage in a way that the energy convergence is much faster and reliable to achieve without confronting the configurational problem . This is generally attributed to the simplified representations of solvent molecules while retaining the energetic features of solvent molecules in nature. It has been argued that the dipole m odel could not reproduce the solvent structure and capture the associations o f solvent m olecules. To consider the calculations of energetic features of the solution system using current computer pow er, these insufficiencies are only seen as secondary factors from our extensive experiences with the applications o f PDLD model. The model has been applied into many studies and show s very consistent perform ances, for exam ple, the binding energy 16 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. calculation for body-antibody,'^ endothiapepsin;*'^ the solvation free energies^* and pKa’s calculations in recent studies. In the evaluation o f the energetic distributions in the protein residues and in the ligand residues, the sim plified approach o f using C oulom b’s law for the energies of charged and uncharged residues are represented explicitly in the protein model. The solvent and other effects are im plicitly taken into account by scaling a ‘dielectric constant’, £p. As indicated in the PDLD/S method, the binding energy is evaluated on the average o f both charged and uncharged stated o f the residues. Therefore, it is desirable to obtain the residue contributions over the average of both the charged and uncharged forms, respectively. The pKa of ionizable group in proteins is illustrated in a therm odynamic cycle, as shown in Figure 6.5. PK'a = P : ^ A ’ )] (6.2.4) where the superscripts p and w denote the protein and water, respectively. AAGsoi is the difference in solvation energy between the ionized (A-) and the protonated (AH) states. A G y = A G f ^ i ( A H ^ A - ) + AGZi(H'") + A G H A H - ^ A ' + H ^ ) (6.2.5) A G 2 = - A G Z T '’^^''> -A G i,o n d (6 .2.6 ) AG3 = A G Z iiA H -4 . A ") - A G ZiiR-^) - A G ^ A H - ^ A ' + H ^ ) (6.2.7) therefore, AG = AG, + AG2 + AG3 (6.2.8) = A G ^ ^ iiA H A G h i A H -» A ") - AG^^nd (6.2.9) 117 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure 6.5 The illustration of the thermodynamics cycle used for the pKa calculations in the POLARIS program. 18 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 6.3 Computational Results and Analyses 6.3.1 The pKa Calculations As pointed out in the endothiapepsin studies, the affinity o f ligand binding has been prom oted by the protonations at specific residues in the active sites o f protein. It is also believed that the protonation is strongly dependent upon the local environment. W e have conducted experim ents to investigate some aspects o f the pKg changes upon binding of ligands. First, we have determined the pKa’s o f Asp25 in native form and liganded form, respectively, of the complexes, JG-365, M VT-101, XK263 and D M P323. The results are given in Table 6.1. The studies of crystal structure reveal that the ionization would take place at PD range o f 2 -7 of the side chain o f ASP by the isotope m easurement. The pKa’s o f protonated Asp 25/25’ in the active enzym e w ere found to have distinct values ca. 3.5 and 5.7.* It should not be surprising to observe that the calculations yield the consistent results o f pKg’s o f Asp25 in the native enzym e. The averaging value was found to be around 4.7, which is reasonable above 3.5 to against the acid dénaturation. It is interesting to observe that the shift of pKa’s upon binding of different ligand are linearly proportional to the observed binding energies. The correlation o f the pKg shifts and the observed binding energies is given in Figure 6.6. The relationship between the binding energies and the pKa‘s is certainly interesting and might provide insight know ledge to understand the catalytic m echanism . However, the application o f this correlation is not intermediately obvious. It should be reasonable to believe that the local surrounding environments of ionizable residues w ould result in the distinguished pKa’s o f catalytic residues upon binding to different inhibitor and therefore lead to the specific stability between the ligand and enzyme interactions in the active site. The quantitative determination of this relationship m ight need further validation from experimental viewpoint. The inform ation is obviously useful and crucial for selecting potential molecules in the computer-aided drug design. 19 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. T able 6.1 The results o f pK^’s o f Asp 25 in HIV-1 PR and the binding energies of different complexes: ^^P_+P+/ DMP323 4.8 8.1 3.3 -13.10115 JG-365 4.9 7.2 2.3 -13.07112 XK263 4.6 6.3 1.9 -11.57115 MVT-101 4 .6 5.5 0.9 -8.23111 The calculated pKa’s have been obtained by the PDLD/S approach, p and p+1 designate the protein and protein-ligand complex, respectively. Energies are given in kcal/mol. ^ The references o f the corresponding experimental binding energies. DMP323 JG-365 XK263 MVT-101 -14 -12 -13 11 -10 9 8 Observed Binding (kcal/mol) Figure 6.6 The correlation of the pKa shifts (p-> p+1) and the observed binding energies. 1 2 0 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. In the investigation of the protein-ligand interactions o f D323 com plex, significant pKa’s shifts have been found and m easured for the residues A sp25/25’ in the active site and some other residues that are also believed being actively participated in the binding process with considerably conform ational change as supported by the N M R studies. We have conducted the experiments to calculate the pKa shift of active residues upon binding for the series o f the enzyme-inhibitor complexes. The result are given in Table 6.2. the DMP323 inhibitor ^ Residue pKa (calculated) pKa (observed) Native form Asp25 4.2 3.5ni6] DMP323 complex Asp25 8.1 8.21108] Asp29 2.5 2.11108] Asp30 4.8 4 01108] Asp60 4.2 3.11108] ^ Determined by the PDLD/S approach w ith 6p=4. The references are given in the bracket []. 6.3.2 The Total Binding Energy and Its Compositions To understand the protein-ligand binding process, it is crucial to develop theoretical models and methods that capture the essentially associated free energy for each phase in the therm odynam ics cycle that displays the binding process from free protein and ligand to form the protein-ligand in solution. The PDLD/S approaches that divide the total free energy into several components offer innovated m ethods to investigate the interactions of protein and ligand molecules in solution. As our previous studies indicated, the conserved 121 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. patterns in the binding site have been observed in the endothiapepsin with its nine inhibitors. All inhibitors are bound in the extended conformation and binding with enzyme has induced significantly conformational change o f the enzyme. The total binding energies have been calculated em ploying the POLARIS program . The electrostatic binding free energies from PD LD model are instructively scaled by a ‘dielectric constant’, £p. As indicated in the therm odynam ic cycle, the purpose o f using this ‘dielectric constant’ is to increase the precision o f the m icroscopic approach w hile retaining the m acroscopic philosophy in the binding process. It represents the contributions that w ere not included explicitly in the given model. It is w orth mentioning that the scaling constant does not represent the real property o f protein. In the standard PDLD/S calculations, the total binding energies are obtained with £p=4. The results are sum m arized in Table 6.3 and Figure 6.7. W e have also determined the total binding energy using the effective dielectric constant, 8opt, that was found between 2 -4 . The binding energies that have been calculated by taking the optim al scaling constant £opt=3.782 are appreciated as com parable results in the PDLD/S approach. The optim al value of £opt that represents the correction of the contributions from the proteins and solution that are not included in a m icroscopic way is consistent w ith that obtained from the pKg studies o f using PDLD m ethod. In case o f binding energy studies, the induced dipole interactions are not explicitly included in our calculations. It is, however, to be included implicitly in the dielectric scaling constant, £p. As also argued in the pKa studies, the finding of effective £opt=2 is reasonable when only the induced dipoles are not included explicitly. This m ay also be part o f the reason why both binding energy calculation and pKa calculation o f em ploying PD LD method give the similar range o f optimal value o f £opt- As a matter of fact, the results that the optimal £opt in the binding is consistently larger than that obtained from pKg calculations might indicate that the m issing com ponent or the im plicit energy is even hard to be recognized in the binding energy calculations. The structural studies have indicated that the symmetric 1 2 2 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. unliganded protein has gone substantial conform ational changes to make extensive interactions with sym m etric and/or asymmetric ligand molecules in binding process. The energetic differences induced by the ligand structures have been presented in the distinguishable components as introduced in the PDLD model. Though the m eaning o f the renowned ‘dielectric constant’ o f protein have been repeatedly discussed and analyzed, the im portance of understanding the term needs to be reemphasized in a way that the practical usage o f the term that represents the macroscopic meaning with the consistent treatm ent has been involved. First of all, the inhomogeneous and reorganized natures o f the dipoles w ithin the protein lead to the defeat o f the conventional definition of the dielectric concepts. This means that the dielectric features of protein can not be represented by a single value. In fact, the values of dielectric properties should be quite different from site to site of the proteins. For example, the value of ‘true’ protein dielectric constant in a region near ionized groups is significantly larger than 2 which is usually referred as a nonpolar environm ent. The consistent evaluation o f the ‘dielectric constant’ in the active site of trypsin gave the value of 9 which is a fairly polar medium in the m acroscopic s e n s e . T h e n the function o f the ‘dielectric constant’ as depicted in the therm odynam ic cycle of the binding process is solely used as a scaling constant that is used to take into account of the effects that are not treated in a microscopic way. It can be considered as a continuum correction term that is distinguishable from the conventionally macroscopic meaning. The dielectric constant, Ep, in the PDLD/S approach does not represent the truly dielectric properties o f protein regardless o f the representation of the symbol e. 123 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table 6.3 The total binding energy and its components in kcal/mol. Exp. Cal. AAGg„ AAGgjy AAG;,y^ X A-76928 -14.890 -15.406 -12.341 25.799 -31.881 3.017 (S,S) A-76889 -13.520 -15.033 -8.966 22.854 -29.563 0.642 (R,R) JG-365 -13.070 -13.086 -18.829 31.793 -28.890 2.840 A-74704 -11.340 -12.093 -16.736 28.360 -28.943 5.226 MVT-101 -8.230 -10.679 -14.270 31.882 -29.744 1.453 124 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table 6.4 The total binding energy and its components in kcal/mol. (er>pt=3.782) Exp. Cal. AAGgvy X A-76928 -14.890 -14.367 -13.054 27.368 -31.881 3.200 (S,S) A-76889 -13.520 -14.122 -9.484 24.244 -29.563 0.681 (R,R) JG-365 -13.070 -12.068 -19.917 33.726 -28.890 3.013 A-74704 -11.340 -11.017 -17.703 30.085 -28.943 5.544 MVT-101 -8.230 -9.476 -15.094 33.821 -29.744 1.541 125 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. -10 MVT-101 A-74' JG-36: • PDLD/S DMP323 -1 5 ‘ A-76889 (R,R) A-76928 (S,S) -16 -16 -14 -12 -10 8 Observed Values (kcal/mol) Figure 6.7 The results o f the correlation of PDLD/S results and the observed binding energies arc summarized with £p=4. 126 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 6.4 The Residue Contributions to the Binding Energy It is of significant interest to investigate the detailed energetics between protein and ligand in solution. As revealed in the early studies o f endothiapepsin, all the enzyme- inhibitor com plexes share a number o f com m on features. O ne o f the rem arkable finding is that some favorable interactions between protein and ligand have been persistent in the group calculations. For instance, in all protein-ligand complexes, the strengths o f the ‘flap’ region indicated by the electrostatic interactions between the inhibitor and the residues in the ‘flap’ region have been conserved for a series o f inhibitors. Only some specific interactions that involve the catalytic groups are responsible for maintaining the binding affinity while other interactions may contribute to the effects of binding specificity. In this study, the total mean potential energy o f protein and ligand in solution was decom posed into small group contributions as represented in units o f a residue. The ability to quantitatively identify the active residues in protein and ligand has been obviously useful in the structurally based drug design. T he investigations have been conducted based on the evaluation of electrostatic energies explicitly in the PDLD model and all other effects w ere represented im plicitly by a dielectric constant. The calculations are done by using C oulom b’s law for charged and uncharged groups in equation (5.3.1). For unionized group, the effects of the solvent are expected to be small, the energy can be reasonably evaluated by using £p=4; w hile in case o f charged group, large effective £eff is expected. £eff is evidently found to be 40 in our recent study o f the meaning o f group contributions (in preparation).^^ The im portance of the electrostatic interactions that are involved in the binding site to regulate the activity of the enzym e functions have been realized and needs to be reinstated here. It is also observed that the electrostatic interactions plays a key role in the therm odynamic phase o f the binding process. Thus, the finding o f a strong relationship of the electrostatic effects and the binding affinity is not surprising. O ther effects that are not evaluated explicitly are 127 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. included by scaling the results with a ‘dielectric constant’. The results o f residue contributions o f ligand are shown in F igure 6.8 and 6.9. The graphs o f residue contributions of enzym e are summarized in Figure 6.10-6.14. As expected from the evaluations o f the energetic distributions am ong protein and ligand residues, the results reveal that inhibitors with a hydroxyl group at P i-P i’ position have possessed a consistently stronger energy than that of the same P i-P i’ position without the hydroxyl group. T his explains why M VT-101 has shown weaker potency because of suffering from lacking the hydroxyl group at P i-P i’ position. The result is also consistent with the group contributions as shown in Figure 6.8 that all residue contributions o f ligand of M VT-101 are flatter than others. The energy from P % position o f M VT-101 is of no particular significance as compared with others at the same Pi position. The C l symmetry-related inhibitors The crystal structure o f uniquely symmetrical dimer of HTV-1 protease suggests a better drug design paradigm that involves searching for potent pseudo C i symmetric, diol- containing inhibitors. T o consider the stereochemistry of 0% inhibitors, A -76928 (S,S) and A-76889 (R,R) have been found to give very similar binding energies from PDLD/S approach (-15.406, -15.033, respectively). It is, however, interesting to observe that the binding pattern from the group contributions o f ligand residues is quite different as shown in Figure 6.9. 128 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. The Electrostatic Distributions in the Ligand Residues of A-74704, JG-365 and MVT-101 I -4 ‘ A-74704 JG-365 MVT-101 - 5 ' P I' P5 P4 P3 PI P2’ P4' P2 P3' Ligand Residues F ig u re 6 . 8 The average electrostatic energy contributing to the binding o f HIV-1 PR com plexes as a function of residues positions P5 -P4 ’, A74704, JG-365 and MVT-101. 129 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. The Electrostatic Distributions in the Ligand Residues of A-76928(S,S) and A76889(R,R) A-76928(S,S) A76889(R,R) I c d I - 2 ' - 3 ' P3 P2 PI p r P2' P3' Residue positions F igure 6.9 The average electrostatic energy contributing to the binding of HIV-1 PR com plexes as a function o f residues positions P5-F4 ’. Even though the calculations produce almost equal binding energies for the com plex A-76928(S,S) and A76889(R,R), the binding patterns involve all inhibitor residues are quite different. It is interest to observe that the energies of specific interactions in the Pi position are very close. 130 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. W 3 œ fS I % « a I •N I s i s s o 3 2 .H I I (lom /p^i) Q ja u ^ bSdjsav 131 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. CD ■ D O Q. C g Q. ■ D CD C/) C/) 8 ( O ' 3 CD 3. 3 " CD CD ■ D O Q. C g o 3 ■ D O CD Q. O C " O CD C/) C/) Electrostatic Distributions in the HIV-1 Protease (A-76889 (R,R)) (Experimental Binding: -13.52) O J K > 2.0 ? 0.0 A sp25 Asp2S -4.0 60.0 80.0 100.0 120.0 140.0 160.0 180.0 200.0 40.0 0.0 20.0 P r o t e i n S e q u e n c e W to F igure 6.11 r- i I I > V fi I I (A 0 u a 1 I 'C I f I oc fN 5 I : I f f t . (JOW/JV331) X S j a u j a S o jS A V 133 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. i! e I 3 A •C i w ’ S 3 I I c = < N I £ S (low/foo^) ^J9U 3 aSouaxy 134 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. H S I I a , I > s ê I .2 O w I S « £ è (jOUt/lV03() £SJ9U^ 98vJ9AV 135 135 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 6.5 Discussions T he PO LA RIS program provides the opportunities to consistently evaluate biological properties o f m acrom olecular system under the same com putational framework. It is desirable to exam ine the relationship between the function and structure o f biological system in a way that the energetic distributions can be num erically determ ined. The im portance o f keeping consistent approaches in both pK^ calculations and binding free energy evaluations has been repeatedly emphasized and presented in many o f our previous studies. In this work, the studies provides us an opportunity to exam ine both properties, namely the binding free energy and pK^ of catalytic residues in the active enzym e. The conservation o f the sim ulation conditions and the charge properties o f protein-ligand complex in solution for every calculations appear necessary and provides blueprint for establishing the autom ated computational model. The PDLD model and method explore the detailed electrostatic interactions between protein and ligands that are m icroscopically reflected and evaluated in both properties. The results also consistently reveal that the specific interactions between protein and ligand determining the binding affinity and pK^’s shift upon binding. The special catalytic mechanism of aspartic protease suggests that the ionization states o f the catalytic Asp residues are strongly dependent on the local environment of the active site in protein. Therefore, the investigation of the pKa shifts upon binding o f ligand and the energetics that describe the detailed interactions o f protein and ligand would provide the fundam ental understanding o f the interactions that stabilize the protein-ligand complexes. To precisely assess the charge-charge interactions in the protein appears to be a key point in the calculation o f electrostatic energy in protein. It has been accepted that the evaluation o f charge-charge interactions between the surface group o f protein is fairly accurate even with the m acroscopic model (where the macroscopic dielectric constant e is about 40). This is because the charge-charge interactions between the surface groups 136 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. involve large compensation effects resulting from the organization of solvent and/or protein molecules that lead to a large effective Eefr- The charge-charge interactions in the interior of protein presents a real challenge. It is not surprising to find that the Eeff in the interior of protein is m uch larger that w hat was thought.i'* The realization o f the rather polar environm ent around the charged-residues is reasonable and reflected in evaluating the group contributions in this case and the recent studies of electrostatic control of G TP and GDP binding in the oncoprotein p21. It is found that one can obtain a reliable contribution from ionized residues by taking Coulom b’s law with a large effective dielectric constant, £eff>15, and w ith optimal value of £eff=40. One more point should be m entioned; the relatively strong charge-charge interactions o f charged groups cannot be justified by only taking the non-relaxed group contributions. This is because the large solvent effects around the charged group cannot be sufficiently reflected by increasing the effective £p value by a maximum factor o f two. The validity of using effective £eff to treat the interactions between charged groups is established that the lower limit o f £eff=15 is only for special case while £eff=40 usually provides reasonable energies of charged groups. To consider the charge-charge interactions in protein, A V g^ during the binding process, it is instructive to use a small £p, that represents only a scaling constant in order to obtain a better precision in the PDLD/S calculation. The scaling is clearly not related to any meaning of macroscopic m odels and only serves as a com pensation factor to correct the overestimated electrostatic energies in our microscopic m odel. It has been proven from many of our previous studies, i.e., pKa calculations and binding energy, that the treatments provide robust results in a reasonable computation time. The compensation effects o f the charge-charge interaction and the solvent reactions have been exhibited in Table 6.3 and 6.4. The detailed analyses of the group contributions indicate that the charged group, i.e.. ionized Asp25 (A) and A sp25 (B), possess significant energetics and the patterns of binding have been persistent in almost all complexes, except the complex MVT-101, whose 137 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. inhibitor lacks a hydroxyl group at P, position. That appears to be an obvious disadvantage and explains the main reason causing the weakest potency of M VT-101 com pared to other complexes in study. M any previous studies indicate that in order to consistently obtain the reliable energetics o f protein in the microscopic model, the crucial point is to construct and train the model to not only accurately evaluate the solvation (or self-en erg y) energy o f charges in protein and solvent, but also to give correct descriptions of m icroscopic environm ent. The concept o f the key role o f solvation energy in controlling the biological activity have been proven and accepted in recent studies. As dem onstrated in the pK^ studies, the shift of pK^ for a charged residue moving from w ater to protein is directly related to the dielectric models used. To choose the properly dielectric m odel that reflects the actually local environm ent o f the charged group in protein and solution is a vital process. For example, a charged residue has been found to have more favorable s e lf en e rg y ca. 35 kcal/m ol, in protein than in water, if the protein is considered as an entirely nonpolar entity, neither with the solvent effects due to the charging process, nor with the water penetration. As shown in early studies, even if we have taken into account the solvent reorganization energy due to the charging process, it was still not anticipated that such a large shift in the solvation energy w ould occur. To consider the relaxation of solvent effect due to the charge process allows one to use a small dielectric constant, £p, and that a consistent assessm ent of electrostatic energies is indeed found in the PDLD approach. T he finding that the electrostatic interactions play a major role in regulating the binding potency and pKa shift upon binding suggests that these interactions are essential in controlling the activity o f the HIV-1 protease. As indicated in the binding energy and pKg’s calculations, the extent o f the interactions between protein and ligand m olecules is directly determ ined by som e specific interactions that is dependent on the local structure of the 138 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. active site. Therefore, it is quite reasonable to speculate that there should be som e connections between the binding energy and the pKa of the active residues. In this work, we have found that the correlation o f the binding energy and the pKg o f Asp25 shift due to association w ith ligands is linear. The results suggest that there m ight be distinctive ionization state o f the catalytic residues o f Asp in the active site that is very likely to be induced by the specific ligand structures. 6 .6 Concluding Remarks This work presents system atic studies o f the perform ance o f microscopic PDLD method for calculations of electrostatic governed binding free energy and pKa’s calculations in solution and macromolecules. The calculated pKa and binding free energies are found in good agreem ent with the experim ental values. The studies provides evident supports to prove that the PDLD/S approach can be generally applied and expected to produce robust results o f free energies. This work has also been devoted to investigating the detailed electrostatic interactions that are responsible to stabilize the protein complex. As expected, the results support the notation that electrostatic free energies is the m ost important factor to regulate the binding activities that have been seen in terms o f pKa changes and binding energies. It is interesting to observe that the calculated correlation between the shifts of pK a’s o f catalytic residues from uninhibited protein to inhibited protein and the corresponding binding energies is linear. The uniqueness of taking account o f the energy from the charging process into the PDLD/S energy have undoubtedly enhanced the performance o f the program that delivers quite consistent calculations. The inclusion o f a dielectric constant into the PDLD/S calculations have resulted in significant im provem ent of precision. The process reflects an optimal approach to com pletely include all the effects that m ight be missed or not be 139 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. accounted for explicitly in the PDLD/S. The consideration of the reorganization energy of the system allow one to use a small scaling constant, Ep, to obtain the average o f the PD LD/S en ergies. As dem onstrated in both pKa and binding calculations, the computational outcome achieved by considering the steric energy led the consistent results that are less likely to be dependent on the configurational structures. It is reasonable to expect that this approach would produce consistently electrostatic calculations. The point is even more valid when the calculations have been started from the X-ray structure and the constrained m olecular dynamics have also been applied to generate the structures that are close to the observed structure. One important approach in the PDLD/S method for binding is the consideration of a completely thermodynamic cycle for the binding process in the calculations. As defined in the term of dissociation constant that involves the formation o f the protein-ligand complex, three phases are considered: the protein in solution , ligand in solution and protein-ligand in solution. The content of PDLD/S approach involves the com prehensive therm odynamic analyses and provides more than convincible and reliable results that reflect a com pletely physical picture o f binding process. 140 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. References 1.K raut,J,Science,1988,242,533-540 2.C reighton,Th.E.,Protein,1983,W .H .Freem an& C o.,N ew York 3.Careri,G,The Enzym e Catalysis Process, 1988, 3,NATO ASO Series,New York 4.C ooper,A .,H ouben,J., The Enzym e Catalysis Process, 1988, 11, NATO ASO Series, New York 5. W arsheI,A .,B iochem istry,1981,20,3167-3177 6.W arshel,A .,R ussell,S .T .,M olecular D ynam ics and Protein Structure, 1 985, 23, Polycrystal Book Service, Illinois 7.Structure and Function o f the A spartic Proteinases, 1991,D um m ,B .M .,E d .,Plenum Press, New York 8.A spartic Proteinases and T heir Inhibitors, 1985,Kostka,V ., Ed.,W alter de G ryuter & Co.,Berlin,New York 9.PearI,L.H .;B lundell,T.,FEB S L ett,1984,174,96 10.Navia M .A ., et al.,N ature,1989,337,615 1 l.P erutz,M .F .,S cience,1978,201,l 187 12. Jord an ,P .C .,J.P h y s,C h em .1978,91,6582 13. qvist,J.;W arshel,A .,B ioP hys.J.,1989,56,171 14.C hurg,A .K .;W arshel,A .,B iochem istry,1986,25,I675 15.W arsheI,A .Proc.N atl.A cad.Sci.U SA ,1978,75,5250 16.W arsheI,A .; qvist,J,A nnu.Reb.B iophy.Biophys.Chem .,1991,20:267-298 17. W arshel, A. ;Tao,H .;FothergiIl,M .;C hu,Z .T.,Isra.J.C hem .,1994,34,253-256 18.L ee,F.S.;C hu,Z .T .;B olger,M .B .,W arshel,A .,Prot.E ng.l992,5:215 19.Sham ,Y .Y .;C hu,Z.T.;W arshel,A .,Subm itted,1996 20.L angen,R .;B rayer,J.M ol.B oI.,1992,224:589-600 2 1 .M erz,K .M . Jr.;K oIlm an,P.A .,J.A m .C hem .Soc.,1989,111:5649-5658 141 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 2 2 .W arsh el,A -,C o m p u ter M odeling o f C hem ical R eaction in E nzym es and Solutions,1991, W iley Interscience,New York 23.K ollm an,P. A .;M erz,K .M . Jr., A cc.C hem .R es.,1990,23,246-252 24.B everidge,D .L.;D iC apua,F.M .,A nn.R ev.B iophys.B iophys.,C hem .,1989,18,431-492 25.Straatsm a,T ,P.;M cC am m on,J.A .,A nn.R ev.Phys.C hem .,1992,43,407-435 26. qvist,J.;M edina,C .;Sam uelsson,J.E .,Prot.E ng.,1994,7,385-391 27.W arshel,A ;,C hem .Phys.L ett.,1978,55,454 28.W arshel,A .;K ing,G .,C hem .Phys.L ett.,1985,I21,124 29.Lee,F.S .;C hu,Z .T., W arshel, A ., J.C om p.C hem .,1993,14:161-185 30.L ee,F.S.;W arshel,A .,J.C hem .Phys.,1992,97:3100 31 .King,G. ;W arshel, A., J.C hem .Phys.,1989,91:3647-3661 32.K ing,G .;L ee,F.S.;W arshel,A .,J.C hem .Phys.,1991,95,4366-4377 33.B arker,J.A .;W atts,R .O .,C hem .Phys.Lett,1969,3,144-145 34.R ahm an,A .;Stillinger,F.H ., J.C hem .Phys.,1971,55,3336-3359 35.Stillinger,F.H .;R ahm an, A ., J.C hem .Phys.,1974,60,1545-1557 36.Stillinger,F.H .,Phil.T rans.R .Soc.L ond.,1977,278,97-110 37.Stockm ayer,W .;J.C hem ..Phys.,1941,9,398-402 38.W arshel,A .;Levitt,M .,J.M ol.B io.,1976,103:227-249 39. W arshel, A .,B iochem istry,1981,20,3167 40. W arshel, A .;R ussel,S.,Q uart.R ev.B iophys.,1984,17:283-422 41.Russel,S. ;W arshel,A .,J.M ol.,B iol.,1985,185:389-404 42. B om ,M ;O ppenheim er,J.R .,A nn.Phys.(G erm any),1927,84,457 43.Bom ,M ;H uang,K .,D ynam ical Theory o f Crystal Lattices, Oxford U niversity Press, L ondon,1954, pp l6 6 4 4 .0 n sag er,L .,J. A m .C hem .Soc.,1936,58,1486 45.K ollm an,P.A .,C hem .R ev.,1993,93:2395-2417 142 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 46.Pearlm an,D .A .;K oIlinan,P.A .,J.C hem .Phys.,1991,94,6,4532-4545 47.M itchell,M .J.;M cC am m on,J.A .,J.C om put.C hem .,1991,12:271-275 48.Pearlm an,D .A .;K ollm an,P.A .,J.C hem .Phys.,1991,94,4532-4545 49.Straastsm a,T.P.;M cCam m on,J. A., J.C hem .Phys., 1991,95,1175-1188 50.W arshel,A .;H w ang,J.K .,J.C hem .Phys.,84,1986,4938-4957 51.W arshel,A .;Chu,Z.T.,ACS Sym posium Series: Structure and R eactivity in Aqueous Solution. C haracterization o f C hem ical and B iological System s, C ram er,C .J.,and Truhlar,D .T. editors, W ashington,D .D .,1994 52.W arshel,A .,J.Phys.C hem .,1979,83,1640 53.W arshel,A .;Sussm an,F.;H uang,J.-K .,B iochem istry,1986,25,8368 54.Parson,W .W .;C hu,Z.-t.;W arshel,A .,B iochem .B iophys.A cta,1990,1017 55.Fersht,A .,Enzym e Structure and M echanism ,2nd Edition,1985,W .H . Freeman and Company, New York, 11-12 55.W arshel,A .;J.Phys.C hem .,1979,83,1640 56.Luzhkov,V .;W arshel,A .,C om p.C hem .,J.C om p.C hem .,1992,13,199 57.W arshel,A .;N aray-Szabo,G .;Sussm an,F.;H w ang,J.K .,B iochem isty,1989,28,3629 5 8 .K ing,G .;L ee,F.S.;W arshel, A., J.C hem .P hys.,1991,95,4366 59.H w ang,J.-K .,W arshel,A .,J. A m .C hem .Soc.,1987,109,716-720 60.K ing,G .;W arshel,A .,J.C hem .Phys.l990,93,8682-8692 61 .W arshel,A .,Pontif.A cad.Sci.Scr. V an ,1984,55,59-81 62.W ong,C .F.;M cC am m on,J.A .,J.A m .C hem .Soc.,1986,108,3830-3832 63.M arcus,R .A .,A nnu.R ev.Phys.C hem .,1964,15,155-196 64.R oux,B .;Y u,H .A .;K arplus,M .,J.Phys.C hem .,1990,94,4683-4688 65. qvist,J.,J.P hys.C hem .,1990,94,8021-8024 66.C reighton,S.;H w ang,J.K .;W arshel,A .;Parson,W .W .;N orris,J.,1988,B iochem istry, 27:774 143 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 67.W arshel,A .,Proc.N atl.A cad.Sci.,U SA ,1978,75,5250 68. W arshel,A ., A cc.C hem .R es.,1981,14,284 69.Skeggs,L.T.;D oren,F.E.;L evine,M .;L entz,K .E. and K han,J.R .,1980, In: T he Renin- A ngiotensinogen System (Johnson,J.A . and A nderson,R .R .,eds) Plenum Press, New York, 1-27 70.G reenIee,W .J.,M ed.R es.R ev.,1990,10,173-236 71.V eerapandian,B .;C ooper,J.B .;Sali,A .;B Iundell,T.L.,J.M oi.B io.,1990,216:1017-1029 72.Sali,A .;V eerapandian,B.;Cooper,J.B.;FoundIing,S.I.;H oover,D .J.;BIundell,T.L.,EM BO J.,1989,8:2179-2188 73.C ooper,J.B .;Foundling,S.I.;B lundeIl,T.L.;B oger,J.,Jupp;R .A .,K ay,J.,B iochem istry, 1989,28:8596-8603 74.R ao,C .M .,et al.,J.M ed .C h em .l9 9 3 ,3 6 :2 6 14-2620 75.R ich,D .H .,J.M ed.C hem .,1985,28:263 7 6 .R ich ,D .H .;B o p arai,A .S .;B ern ato w icz,M .S ., B iochem . and B iophys. R esearch Com m ., 1982,104:3,1127-1133 77.Jam es,M .N .;Sielecki,A .;Salituro,F.;R ich,D .H .;H ofm ann,T.,Proc.N atl.A cad.Sci.B lo chem istry,1982,79:6137-6141 78.Foundling,S.I.;C ooper,J.;W atson,F.E.;C leasby,A .,N ature,1987,327:349-352 79.B lundell,T .L .;C ooper,J.B .;S ali,A .and Z hu,Z .,1991,In: Structure and Function of Aspartic Proteinases(Dunn,B.M .,ed,)PIenum Press,New York,443-453 80.R ao,B .G .;Singh,U .C .,J. A m .C hem .Soc.,1991,113:6735-6750 81 .R ao,B .G .;T ilton,R .F .;S ingh,U .C .,J.A m .C hem .S oc.,1992,114:4447-4452 82.D avies,D .R .,A nnu.R ev.B iophys.B iophys.C hem .,1990,19:189 83.Jam es,M .N .;Sielecki,A .R .,B iochem istry,1985,24:3701 84.M uegge,I.;Schw eins,T.;Langen,R .;W arsheI,A .,Structure,1996,4,475-489 144 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 85.D hanaraj,V .;D eaIw is,C .;B ailey,D .,C ooper,J.B .;B lundelI,T .L ., In: Innovations in Proteases and T heir Inhibitors,1993,pp 141-159 86.H einz,D .W .;B aase,W .A .;D ahlquist,F.V .;M eathew s,B W .,N ature,1993,36I,56I-564 87.D ing,J.P.;K oelIner,G .;G runert,H .P.;Saenger,W .,J.B ioI.C hem .,1991,266,15128- 15134 88.H am m ett,L .P .,J.A m .C hem .S oc.,1937,59,96 89.H ansch,C .;M aloney,P.P.,Fujita and M uir,R .M .,N ature,1962,194,178 90.W eininger,D .,J.C hem .Inf.C om put.Sci.1988,28,31 9 1.Leaver,R.H .;Thom as,T.R.,Analysis and Presentation o f Experim ental Results,1974, Jojn W iley & Sons,New Y ork,6 92.A llen,M .P.;Tildesley,D .J.,Com puter Sim ulation of Liquids, 1987,Oxford Press. 93.H ansson,T .; qvist,J.,P rot.E ng.,1995,V 8,l 1,1137-1144 94.D esJarlais,R.L.;Seibel,G .L.;K untz;I.D .;Furth,P.S.;A lvarez,J.C.;O rtiz De M ontellano. P.R .;D eC am p,D .L .;B abé,L .M .;C raik,C .S.,1990,Proc.N atl.A cad.Sci.U SA ,87,6644- 6648 95.D esJarlais,R .L .;Sheridan,R .P.;Seibel,G .L .;K untz,I.D .;V enkataraghavan,R .,I988,J. M ed.C hem .,31,722-729 96.W lodaw er,A.;M iller,M .;Jask61ski,M .,Sathyanarayana,B.K.;Baldw in,E.;W eber,I.T.; Selk,L .M .;C law son,L .;Schneider,J.;K ent,S.B .,1989,Science,245,616-621 97.Plattner,J.J;N orbeck,D .W .,in Drug D iscovery Technologies,Clark,C.R .,M oos,W .H ., Eds. (Ellis H orw ood,Chichester,England,1990),92-126 9 8 .S c h n eb li,H .P .;B rau n ,N .J.,1 9 8 6 ,In Proteinase Inhibitors,V ol. 12(A .J.B arrett and G .Salvensen,eds.).Elsevier Science,N ew Y ork,613-627 99.G reen lee,W .J.,M ed .R es.R ev .,1 9 9 0 ,1 0 ,173,, 100.R osen,S.H .,et al.,J.M ed.C hem .,1993,36,449 101 .K leinert,H .D .,et al.,Science,1992,257,1940 145 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 102.Poiesz,B J;R uscetti,F.W .;G azdar,A .F.;B unn,P.A .;M inna,J.D .;G allo,R .C .,1980, Proc.Natl. A cad.Sci.,U S A ,77,7415-7419 103.B latter,W .A .,1991,FA SEB J.,5,2340-2348 104.M itsuya,H .;Y archoan,R.;B roder,S.,1990,Science,249,1533-1544 105.Richman,D.D.;FischI,M .A.;Grieco,M .H .;Gottlieb,M .S.;V olberding,P.A.;Laskin, O .L.;Leedom ,J.M .;G roopm an,J.E.;M idvan,D .;H irsch,M .S.;Jackson,G .G .;D urack,D .T.; N usinoff-Lehrhan,S. ;G roup, A .C.W ., 1987,N .Engl. J.M ed.,317,192-197 106.Larder,B. A .;K em p,S.D ., 1989,Science,246,1155-1158 107.M eek,T.D .;D ayton,B.D.;M etcalf,B.W .;D reyer,G .B.;Strickler,J.E.;Gom iak,J.G .; Rosenberg,M .;M oore,L.;M agaard,V .W .;Bebouck,C.,Proc.N atI. Acad.Sci.,1989,86,1841- 1845 108.Yam azaki,T.;Nicholson,L.;Torchia,D.A.;W ingfield,P.;Stahl,S.J.;K aufm an,J.D .; Eyerm ann,C .J.;H odge,C .N .;Lam ,Y .S.;R u,Y .;Jadhav,P.K .;C hang,C .H .;W eber,P.C .,J.A m .Chem .Soc., 1994,116,10791-10792 109. H arte,W .E .,Jr.;B everidge,D .L .,J.A m .C hem .Soc.,1993,15,3883-3886 110.V erkhivker,G .;A ppeIt,K .;Freeer,S.T.;V illafranca,J.E.,Pro.Eng.,1995,7,667-691 11 l.M iller,M ;Schneider,J.;Sathyanarayana,B.K .;Toth,M .V .;M arshaIl,G.R.;Claw son,L.; S elk,L .;K ent,S.B .H .;W lodaw er,A .,Science,1989,246,1149-1152 112.Jaskolski,M .;Tom asseli,A .G .;Saw yer,T.K .;Staples,D.G .;H einrickson,R.L.; Schnerider,J.;K ent,S.,B .H .;W lodaw er,A .,1991,B iochem istry,30,1600-1609 113.E rickson,J.,et.al.,1990,Science,249,527-533 114.Housur,M .V.;et aI.,J.A m .C hem .Soc.,1994,116,847-855 115.Lam ,P.Y .,Jadhav,P.K .;Eyerm ann,C .J.;H odge,C .N .;R u,Y .;B acheler,L.T.;M eek, J.L.;O tto,M .J.;R ayner,M .M .;W ong,Y .N .;C hang,C .H .;W eber,P.C .;Jackson,D .A .; Sharpe,T.R .;V iitanen,S.,Science,1994,263,380-384 116.H yland,L.J.;Tom aszek,T. A. ;M eek,T.D .,Biochem istry,1991,30,8454,8463 146 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 117.King,G.;Lee,F.S.;Warshel,A., J.Chem.Phys.,1991,95,4366 147 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
Computer simulation of polar solvents with the SCAAS model
PDF
Calculations of electrostatic interactions in proteins
PDF
Computer modeling and free energy calculatons of biological processes
PDF
Crystallographic stydies on Zn(2), Mn(2) and Pt(2) complexes of phosphonates
PDF
Computational investigation of the catalytic aspartic acids of HIV-1 protease
PDF
Computer simulation of chemical reactions in aqueous solutions and biological systems
PDF
Characterization of glycoproteins synthesized by human breast epithelial cells
PDF
Computer simulation studies of charge transfer through biological and artificial membrane channels
PDF
Ab initio methodologies in studying enzymatic reactions
PDF
Chemical, physicochemical, and biological studies on the mucoproteins of plasma and serum
PDF
Characterization of portohepatic glucosensors in sympathoadrenal counterregulation
PDF
A comparative study of the chemistry and biology of Heparin
PDF
A study of phonetic (sound) reinforcement and generalization learning.
PDF
Accounting and access control for multicast distributions: Models and mechanisms
PDF
Basic studies in the analytical chemistry of beryllium
PDF
A synthetic model approach to H-bonded oxyhemoglobin
PDF
Characterization of H3abp and its binding site: A complex that specifically binds a cell cycle regulatory element of the hamster histone H3.2 promoter
PDF
An investigation of the effects of output variability and output bandwidth on user performance in an interactive computer system
PDF
Algorithms for the discrete logarithm problem and algebraic aspects of quantum computation
PDF
A nutritional evaluation of the essential fatty acids in the rat
Asset Metadata
Creator
Tao, Holly (author)
Core Title
Consistent evaluation of binding free energies and study of the role of electrostatic effects in the stabilization of protein complexes
Degree
Doctor of Philosophy
Degree Program
Computational and Biophysical Chemistry
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
biophysics, general,Chemistry (Physical Chemistry),chemistry, biochemistry,Chemistry, pharmaceutical,OAI-PMH Harvest
Language
English
Contributor
Digitized by ProQuest
(provenance)
Advisor
Warshel, Arieh (
committee chair
), [illegible] (
committee member
), Petruska, John (
committee member
)
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c17-279601
Unique identifier
UC11350175
Identifier
9733149.pdf (filename),usctheses-c17-279601 (legacy record id)
Legacy Identifier
9733149.pdf
Dmrecord
279601
Document Type
Dissertation
Rights
Tao, Holly
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the au...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus, Los Angeles, California 90089, USA
Tags
biophysics, general
chemistry, biochemistry