Simulating the helicase motor of SV40 large tumor antigen
SIMULATING THE HELICASE MOTOR OF SV40 LARGE TUMOR ANTIGEN

by Yemin Shi

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (COMPUTATIONAL BIOLOGY AND BIOINFORMATICS)

May 2012

Copyright 2012 Yemin Shi

Acknowledgments

I am deeply grateful to my supervisor, Dr. Xiaojiang Chen, for his support, encouragement, patience and guidance; for giving me the opportunity to learn state-of-the-art structural biology research; for helping me explore the elaborate helicase motor problem; and for always giving me good advice in both academics and life. Without him this doctoral thesis would not have been possible.

I sincerely thank Dr. Ariel Warshel for guiding me into the theoretical world of molecular simulation. His unsurpassed knowledge and rigorous scholarship have been invaluable to my doctoral training.

Thanks to Dr. Remo Rohs, Dr. Fengzhu Sun and Dr. Alber Frank for agreeing to serve on my dissertation committee, for revising my work, and for quickly answering my questions and doubts.

Thanks to my friends Yiyu Li, Yang-ho Chen, Xiting Yan, Peter Chang, Feng Qi, Chao Dai and Shen Soh for their company on this long journey. Thanks to Dahai Gai, Courtney, Ronda, Bo, Jessica, Maocai, Hanbin, Spyridon Vicatos, Z.T. Chu, Jie Cao and all my friends at USC; it is their support through all these years that made this achievement possible.

I would like to acknowledge the financial, academic and technical support of the University of Southern California, particularly the Department of Molecular and Computational Biology, for the award of a Research Assistantship that provided the necessary financial support for this research.

Finally and most importantly, I am deeply grateful to my dear mother Xuejun Xie and my father Zhaoping Shi for their love, encouragement and support during all these years.
Table of Contents

Acknowledgments
Table of Contents
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
    1.1 Overall structure of SV40 LTag helicase domain
    1.2 Overview of chapters
Chapter 2: Helicase cooperative model
    2.1 Binomial distribution
    2.2 Binomial model of ATPase activity and helicase activity
    2.3 Results and discussions
Chapter 3: ATP binding study
    3.1 Overview
    3.2.1 LTag helicase ATP binding model
    3.2.2 Mg²⁺ coordination
    3.2.3 Apical water coordination
    3.2.4 Interaction energy
    3.2.5 Monomer conformational change
    3.2.6 The cooperative movement triggered by ATP binding
    3.3 Discussion
    3.4 Methods
Chapter 4: DNA translocation study
    4.1 Overview
    4.2 A general analysis of DNA translocation process
    4.3 Discussion
    4.4 Methods
Chapter 5: Conclusions, discussions and future directions
    5.1 Conclusions and discussions
    5.2 Future directions
References
Appendices
    Appendix A: R² of the allosteric models
    Appendix B: Interface distance

List of Tables

2.1 LTag helicase hexamer inactive monomer configurations and activities
A.1 R² value of six models in fitting helicase inactivation experiments

List of Figures

1.1 Electron microscopy images of ring-shaped helicases
1.2 SV40 LTag helicase structures in three nucleotide bound states
1.3 Structure of SV40 LTag monomer and ATP binding pocket
2.1 Theoretical models for ATPase activity with cis-mutant doping
2.2 Theoretical models for helicase activity with mutant doping
3.1 The three-staged ATP binding procedure
3.2 The ATP conformational change and the cross locking system
3.3 The timeline of the major hydrogen bonds formed during the binding procedure
3.4 The movement of the helicase monomer sub-domain
3.5 Mg²⁺ coordination transition procedure
3.6 The interaction energy profile between the ATP-Mg²⁺ and the binding pocket
3.7 A refined iris model for helicase cooperative movement
4.1 A hypothetical system that leads to DNA translocation
4.2 Two types of free-energy maps for translocation process
4.3 A structural model of LTag hexamer in complex with ssDNA
4.4 The effective free-energy surface for the translocation process in LTag
4.5 The simulated time dependence of the translocation process of low barrier
B.1 The gap size variation after transforming the state of one monomer

Abstract

Helicases are motor proteins that use the energy derived from NTP binding and hydrolysis to translocate and unwind DNA/RNA during replication.
Understanding how the NTP hydrolysis cycle is energetically coupled to DNA movement is the key to understanding how this molecular motor drives DNA replication. The helicase domain of simian virus 40 large tumor antigen (SV40 LTag) is a ring-shaped AAA+ domain that participates in viral DNA replication and host cell growth control. Recent structural studies of SV40 LTag have provided a set of high-resolution structures in different nucleotide binding states. In this thesis we therefore use the LTag helicase as a model protein and present the first systematic simulation study of the mechanism of the LTag helicase motor.

Our work includes three major parts. First, we model the LTag ATPase activity and helicase activity based on biochemical experimental results. This model indicates that the LTag helicase subunits work in highly cooperative patterns: when origin DNA is present, the helicase translocates DNA in a sequential pattern; when fork DNA is present, the helicase works in a semi-sequential pattern; otherwise, subunit cooperativity is not significant. Second, we present the first simulation study of the ATP binding/hydrolysis procedure using the non-equilibrium molecular dynamics method. The results suggest a three-stage Locker-binding model. We evaluate the energy profile using the LRA version of the semi-microscopic Protein Dipoles Langevin Dipoles method (PDLD/S), and the energy profile matches the experimental results. Third, we investigate the electrostatic energy that guides the single-stranded DNA (ssDNA) translocation process and propose a unidirectional translocation model. To accomplish this, an ssDNA/LTag complex model is built using structural information from the LTag helicase and the E1 protein-DNA complex, a two-dimensional effective electrostatic free-energy landscape is calculated from the ssDNA/LTag model, and the unidirectional model is proposed by evaluating this landscape.
The time dependence of the coupled protein-DNA motion is explored by simulating the translocation process using a renormalized method. Altogether, our theoretical and simulation study advances the understanding of the fundamental molecular mechanism underlying the directional movement of ring-shaped helicase motors.

Chapter 1
Introduction

Helicase motors are essential components of almost all complexes that catalyze reactions of nucleic acid metabolism [38]. The catalytic reactions involve the translocation and unwinding of various DNA/RNA duplexes during replication, using the energy from enzymatic nucleotide triphosphate (NTP) hydrolysis [48, 61]. The efficiency of the helicase reactions affects the integrity of genetic information. Defects in helicases lead to several human disorders, such as Bloom's syndrome and mental retardation [48]. Given a suitable DNA/RNA substrate in vitro, most helicases are capable of catalyzing the unwinding procedure [48]. Although various models have been proposed in structural studies [17], the atom-level catalytic mechanism is still unclear because of the limited availability of structural information and of computational power [61]. To understand these catalyzed unwinding processes and develop a clear mechanistic model, we need to combine information from different research domains to investigate the possible coordination among the helicase subunits and how the energetics of the NTP binding/hydrolysis cycle is allosterically coupled to DNA movement.

Helicases can assemble into different oligomeric structures [48]. Our research focuses mainly on ring-shaped helicases because of their versatile biological functions and the availability of crystal structures. The subunits of a ring-shaped helicase usually assemble around the nucleic acid strand and form a ring structure. This structure helps to promote processivity during nucleic acid strand unwinding [28, 29].
Figure 1.1 shows electron microscopy (EM) images of the major ring-shaped helicases [48], including bacterial, archaeal, eukaryotic and viral helicases. Our model helicase, the SV40 LTag helicase, is a viral helicase.

Figure 1.1: Electron microscopy images of ring-shaped helicases [48].

Simian virus 40 is a polyomavirus capable of transforming normal cells into tumor cells [69]. During the transformation, LTag regulates the viral life cycle by binding to the origin sequence of viral DNA replication and promoting the synthesis of viral DNA. LTag also modulates the cellular signaling pathways of the host cell to stimulate progression of the cell cycle by binding to a number of cellular control proteins [69]. The SV40 LTag helicase is an efficient hexameric helicase that belongs to helicase superfamily III as well as to the AAA+ protein superfamily. In viral DNA replication, the SV40 LTag helicase motor unwinds and translocates the DNA using the energy generated from the ATP binding/hydrolysis cycle at the nucleotide pocket. SV40 LTag helicase is a versatile helicase capable of unwinding DNA, RNA, and RNA-DNA hybrids [48, 61]. Moreover, recent structural studies have provided a set of high-resolution hexameric structures of SV40 LTag helicase in various nucleotide binding states, which facilitates our simulation study of the LTag helicase motor [13, 14].

1.1 Overall structure of SV40 LTag helicase domain

The SV40 LTag is composed of three major functional domains, namely, the DnaJ homology domain (DnaJ, residues 1–82) [8], the origin DNA binding domain (OBD, residues 131–259) [4], and the helicase domain (residues 251–627) [34]. The DnaJ domain participates in the remodeling of protein complexes and is dispensable for replication [34]. The OBD recognizes the replication origin of SV40 and initiates the replication process [4, 55], during which the LTag double hexamer is assembled and the helicase domain is formed.
The double hexamer recruits the other replication regulation proteins and forms a replication initiation complex. In the following elongation stage, the LTag double hexamer acts as a helicase to unwind the DNA bidirectionally [62]. The helicase domain is the major working unit for DNA unwinding and translocation.

Three high-resolution structures of the LTag helicase domain have been reported in refs. [14, 34]: the Apo state (PDB ID 1svo), the ATP bound state (PDB ID 1svm) and the ADP bound state (PDB ID 1svl) (Figure 1.2). These structures reveal an iris-like motion of the hexameric helicase during the drastic conformational switches that are triggered by ATP binding and hydrolysis. Accompanying the iris motion of the LTag hexamer are the longitudinal movements of the six β-hairpins along the central channel [14].

Figure 1.2: SV40 LTag helicase structures in three nucleotide bound states. A: Apo (empty) state. B: ATP bound state (the orange molecules are ATPs). C: ADP bound state (the orange molecules are ADPs). The radius of the central channel is around 12 Å, 8 Å, and 10 Å for the Apo, ATP bound and ADP bound states, respectively.

Figure 1.3: Structure of SV40 LTag monomer and ATP binding pocket. (A) Side view of the hexameric structure of LTag helicase; the dimensions marked are 128 Å and 85 Å. (B) Side view of the LTag monomeric structure. (C) Structure of the LTag helicase binding pocket viewed from the C-terminal end (bottom). The cis-residues are in copper and the trans-residues are in blue. The ATPs are painted in yellow in the middle figure. The ATP is colored by element type in the left mini-view: nitrogen, carbon, oxygen and phosphorus atoms are painted in blue, cyan, red and gold, respectively. The N1-C4'-C5'-PB dihedral angle and the N1-C4'-PB bending angle are used to represent the conformational change in the ATP binding procedure.

The detailed structures of LTag helicase are illustrated in Figure 1.3. The height of the helicase is about 85 Å and the diameter is about 128 Å.
The central channel diameter is about 8-14 Å, depending on the nucleotide binding state. Each LTag subunit of the helicase contains three structural domains, D1, D2 and D3 (Figure 1.3 (B)). D1 is the N-terminal Zn domain, which is essential for LTag hexamerization [34, 72]. D2 is a typical AAA+ domain with Walker A (P-loop) and Walker B motifs, which are important for ATP binding [34]. D3 is composed mostly of long helices and is interrupted in sequence by D2 roughly in the middle of D3, while D1 is structurally well separated from D2/D3 (Figure 1.3 (C)).

Six binding pockets are located at the interfaces between adjacent monomers. The monomer that contributes the P-loop to a given interface is named the cis-monomer, and the other monomer forming the interface is named the trans-monomer. For an ATP to bind to the nucleotide pocket, the only possible route is through the gap between the two neighboring monomers (or subunits) from the C-terminal end (bottom). The binding pocket residues on the cis-monomer can be divided into two groups: one group includes I428, D429, K432, T433 and T434 on the P-loop, and the other group includes N529 and D474 on the Sensor I motif (Figure 1.3 (C)). The binding pocket residues on the trans-monomer contain an arginine finger tR540 (t designates trans), a lysine finger tK418, and residues tR540, tD502 and tR498. The ATP interacts with the cis-residues and trans-residues mainly through its phosphate groups and the ribose. The adenosine group inserts into the hydrophobic pocket formed between two helices, H9 and H13, on the cis-monomer (Figure 1.3 (C)).

1.2 Overview of chapters

Despite the advances in LTag helicase studies mentioned above, the detailed paths of these conformational switches and the corresponding energetics associated with 1) the ATP binding/hydrolysis process and 2) the DNA translocation/unwinding process are still unknown.
For the former process, there have been intensive studies of nucleotide hydrolysis [26, 60]; however, studying the binding process remains challenging because of the difficulty in both experiment and simulation [1]. For the latter process, although a few models have been proposed, none of them is based on a rigorous energetic study [14, 59]. In this thesis, we attack these problems by simulating the two processes and investigating their energetics during the conformational transformation.

This thesis is organized as follows. In Chapter 2, we model the cooperative movement based on the ATPase/helicase activity results. In Chapter 3, we perform a systematic simulation to reproduce the key steps of ATP binding, and propose a three-stage binding model for the LTag binding pocket. The resulting energy profile is comparable to the experimental data. In Chapter 4, we propose a unidirectional single-stranded DNA translocation model in the LTag helicase central channel. The model is based on a two-dimensional electrostatic free-energy profile. The time dependence of the proposed model is validated using the renormalization approach [37]. In Chapter 5, we summarize our current work and the future directions of helicase motor simulation studies.

Chapter 2
Helicase cooperative model

The six monomers of the LTag helicase work in a cooperative pattern. In this chapter, we investigate the inactivation curves of the LTag helicase activity and ATPase activity obtained from experiments, and propose a probabilistic model framework to explain the mechanism of LTag cooperative movement.

2.1 Binomial distribution

We introduce some concepts before describing the probabilistic model. If the outcome of a trial can be classified as either a success with probability 0 ≤ p ≤ 1 or a failure with probability (1-p), this trial is a Bernoulli trial. A Bernoulli random variable X is used to describe the outcome of a Bernoulli trial.
If the outcome is a success, then X=1; if the outcome is a failure, then X=0. The distribution of a Bernoulli random variable X can be represented by the probability mass function (2.1):

\[ p(0) = P\{X = 0\} = 1 - p, \qquad p(1) = P\{X = 1\} = p, \tag{2.1} \]

where 0 ≤ p ≤ 1 is the probability that the trial is a success.

A binomial random variable represents the number of successes that occur in a sequence of n Bernoulli trials. Its probability mass function with parameters (n, p) is given by (2.2):

\[ p(k) = \binom{n}{k} p^{k} (1 - p)^{n - k}, \tag{2.2} \]

where n is the number of Bernoulli trials and k is the number of successful trials.

In our biochemistry experiment, the activity state of each monomer of the LTag helicase can be modeled as a Bernoulli random variable, while the activity state of the hexamer can be modeled as a sequence of six Bernoulli trials. The number of active monomers in a hexamer therefore follows a binomial distribution.

2.2 Binomial model of ATPase activity and helicase activity

The experimental data show a relationship between the concentration of inactivated helicase monomers (p) and the helicase/ATPase activity (A). This macroscopic relationship provides useful information about the microscopic mechanism of the allosteric conformational change. With this in mind, we propose a theoretical model framework to fit the data. The framework reproduces the macroscopic relationship under different model assumptions, so it can be used to discriminate between candidate microscopic mechanisms. We used the R² between the experimental data and the theoretical models as a measure to screen the candidate models for future simulation, and then inferred the underlying allosteric mechanism from the selected candidates.

One may notice that all high-resolution structures are homo-hexamers, that is, all six monomers are in the same nucleotide binding state (homo-state); no hetero-state hexamer structure has been observed.
This indicates that the SV40 LTag monomers work in a concerted manner [14], i.e. all the monomers need to reach the same next state before any one of them can proceed to the state after that. It is worth pointing out that our work investigates the inter-monomer cooperative movement between each pair of homo-state structures. Such an inter-state cooperative movement does not necessarily require all the monomers to move at the same pace. We therefore consider three major classes of inter-monomer movement models: the sequential model, the random model and the semi-sequential model. The sequential model curve is usually considered the lower boundary of the helicase activity, since the cooperation requirement for activating a hexamer ring is the most stringent in this model. A random model shows a linear relationship between the helicase activity and the inactive monomer concentration p; there is no cooperative requirement. The remaining models are semi-sequential models that usually fall between the sequential model and the linear random model. The inactivation curve is of course also affected by the assumptions made in each model; we will discuss this later.

The activity and the inactive monomer concentration are connected by two relationships: 1) the relationship between the helicase activity and the configuration of its hexameric ring; 2) the relationship between the probability of a ring configuration and the mutant concentration in the solution. Here, a configuration means a unique distribution pattern of the inactivated monomers in a hexamer. The uniqueness is defined by the number of inactivated monomers and their distribution pattern. For example, suppose the monomers in a hexamer are labeled 1, 2, 3, 4, 5 and 6, two of which are inactivated. In a continuous pattern where the inactive monomers are adjacent to each other, the inactive monomers can be either 1, 2 or 2, 3. The number and distribution pattern are the same, so they reduce to one configuration.
On the same hexamer ring, there are six possible ways for two inactive monomers to sit next to each other. All these patterns degenerate into the same adjacent configuration of two inactive monomers. In another case, if there are two hexamers, one with inactive monomers 1 and 2 and the other with inactive monomers 1 and 3, then the configurations of these two hexamers are different, since the distributions of the inactive monomers are different. We now investigate the two relationships.

For relationship 1, we want to express the hexameric activity in terms of the activities of its different configurations. In our model framework, A_i^(m) is the hexamer activity of the ith equivalent configuration under model m, t_i is the probability of the ith configuration, and i runs over all C_k equivalent configurations with k inactive monomers. The average activity level of a helicase with k inactive monomers can be represented as equation (2.3):

\[ \langle A_k^{(m)} \rangle = \sum_{i=1}^{C_k} t_i A_i^{(m)} \tag{2.3} \]

Different configurations do not necessarily correspond to different activity levels; this also depends on the model assumptions. For example, when there are two inactive monomers in a helicase ring (Table 2.1, K=2), they can be adjacent or have one active monomer in between. In model 2, the activities are 3/6 and 2/6 respectively, while in model 4 both activities are 4/6.

For relationship 2, we need the probability of a hexameric configuration as a function of the mutant concentration in the helicase assay. The number of inactive monomers k in a hexameric ring is a binomial random variable. Its probability mass function is given by equation (2.2) with parameter n=6 for hexamers.
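The configuration bookkeeping described above can be checked with a short script. The following Python sketch is an illustration rather than the code used in this work. It assumes that two patterns of inactive monomers count as the same configuration when one can be rotated or reflected onto the other (an assumption that reproduces the t_k column of Table 2.1), and the function names canonical, configuration_probabilities and binomial_pmf are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb

N = 6  # number of monomers in the hexamer ring


def canonical(inactive, n=N):
    """Return a canonical label for a set of inactive positions on the ring.

    Assumption: patterns related by a rotation or a reflection of the ring
    are the same configuration, so the smallest image is used as the label.
    """
    images = []
    for shift in range(n):
        for sign in (1, -1):
            images.append(tuple(sorted((sign * i + shift) % n for i in inactive)))
    return min(images)


def configuration_probabilities(k, n=N):
    """Map each distinct configuration with k inactive monomers to t_i."""
    counts = Counter(canonical(c, n) for c in combinations(range(n), k))
    total = comb(n, k)
    return {config: Fraction(count, total) for config, count in counts.items()}


def binomial_pmf(k, p, n=N):
    """Equation (2.2): probability of k inactive monomers at mutant fraction p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


if __name__ == "__main__":
    for k in range(N + 1):
        print(k, configuration_probabilities(k))
```

With two inactive monomers the script finds three configurations (adjacent, one active monomer in between, and opposite), and the adjacent one collects six of the fifteen placements, in line with the counting argument in the text.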
Since all hexameric configurations can appear in the helicase assay, the average activity with respect to the inactive monomer concentration p for model m is given by equation (2.4):

\[ \langle A^{(m)} \rangle = \sum_{k=0}^{n^{(m)}} \binom{n}{k} p^{k} (1 - p)^{n - k} \sum_{i=1}^{C_k} t_i A_i^{(m)} \tag{2.4} \]

where n^(m) is the maximum number of inactive monomers that a hexamer configuration can contain and still be active in model m. This model framework can be applied to any ring-shaped helicase of n monomers.

Table 2.1 lists our major candidate models. Model 1 is the sequential model: the only active hexamer is the one with no inactive monomer. Model 2 is the adjacent wild type (W-W) model, in which only a W-W monomer binding pocket has activity. In Model 3, both W-W and W-M (mutant) monomer binding pockets have the same biological activity level. Model 4 is the random model: an inactive monomer affects only its trans pocket, so the helicase activity is proportional to the number of active monomers in the hexameric ring. Model 5 and Model 6 are both variants of Model 2, with the additional assumption that 3+ (for Model 5) or 2+ (for Model 6) consecutive inactive monomers result in zero activity of the hexamer. We have tried other models with additional local configuration assumptions (data not shown), such as assigning the activity of n' (2<n'<n) consecutive monomers according to their local configuration patterns. Models with these additional assumptions do not show a significant advantage in fitting the experimental data, so we only discuss the six aforementioned models (Table 2.1).

K   Config.     t_k     Model 1   Model 2   Model 3   Model 4   Model 5   Model 6
0   (diagram)   1/1     1         1         1         1         1         1
1   (diagram)   1/1     0         4/6       6/6       5/6       4/6       4/6
2   (diagram)   6/15    0         3/6       5/6       4/6       3/6       0
2   (diagram)   6/15    0         2/6       6/6       4/6       2/6       2/6
2   (diagram)   3/15    0         2/6       6/6       4/6       2/6       2/6
3   (diagram)   6/20    0         2/6       4/6       3/6       0         0
3   (diagram)   12/20   0         1/6       5/6       3/6       1/6       0
3   (diagram)   2/20    0         0         6/6       3/6       0         0
4   (diagram)   6/15    0         1/6       3/6       2/6       0         0
4   (diagram)   6/15    0         0         4/6       2/6       0         0
4   (diagram)   3/15    0         0         4/6       2/6       0         0
5   (diagram)   1/1     0         0         2/6       1/6       0         0
6   (diagram)   1/1     0         0         0         0         0         0

Table 2.1. LTag helicase hexamer inactive monomer configurations and activities. Model 1 is the sequential model, which assumes that one inactive monomer deactivates the hexamer ring. Model 2 assumes that only a binding pocket between two adjacent wild type (W) monomers is active. Model 3 assumes that both W-W and W-M pockets are active. Model 4 is the random model, which assumes that a pocket is inactive only if its trans-monomer is inactive. Model 5 is a variant of Model 2 with the additional assumption that more than 2 consecutive inactive monomers deactivate the hexamer. Model 6 is another variant of Model 2 with the additional assumption that more than 1 consecutive inactive monomer deactivates the hexamer. The activity A_i^(m) of configuration i in Model m is listed, where m = 1..6 and i = 1..C_k; C_k is the number of equivalent configurations with k inactive monomers. t_k is the probability of each configuration among all hexamers with k inactive monomers.

2.3 Results and discussions

To evaluate the feasibility of these models, we plotted all six candidate models on the same figure together with the data from the biochemistry experiments on ATPase activity or helicase activity in the presence of mutant doping. The experimental data include:

1) ATPase assay (Figure 2.1):
• ATPase doped with cis-mutant;
• ATPase + ssDNA, doped with ATPase cis-mutant;
• ATPase + fork DNA, doped with ATPase cis-mutant;
• ATPase + DNA origin sequence, doped with ATPase cis-mutant;

2) Helicase assay (Figure 2.2):
• Helicase + fork DNA, doped with cis-mutant;
• Helicase + DNA origin sequence, doped with cis-mutant;
• Helicase + fork DNA, doped with K512A/H513A-mutant;
• Helicase + DNA origin sequence, doped with K512A/H513A-mutant.

The cis-mutant carries a mutation in the cis-residues of the binding pocket. This mutation deactivates ATP binding. The K512A/H513A-mutant carries mutations of the tip residues of the β-hairpin, which is located in the central channel and interacts intensively with the ssDNA during translocation and unwinding.
This mutation deactivates DNA translocation. Table A.1 (Appendix A) lists the R² for each model when fitting the ATPase and helicase activity data.

Figure 2.1 (A) shows that the ATPase activity is close to the random model when there is no DNA in the solution. The R² for Model 4 is 0.993, the highest among all six models (Appendix A). This linear relationship between the doping mutant and the ATPase activity in the random model indicates that the LTag helicase monomers can still hydrolyze ATP without assembling into a hexameric ring. This is consistent with our experimental observations that trimers or even dimers can hydrolyze ATP [15].

Figure 2.1 (B) shows that the results fall between Model 2 (semi-sequential) and Model 4 (random). The R² for Model 2 (R²=0.980) is slightly larger than that for Model 4 (R²=0.974). The ATPase activity shows a certain degree of allosteric behavior upon addition of ssDNA: the data lie below the random model, so less activity is measured at the same doping concentration. The activity reduction is mainly caused by the increased cooperative requirement among the monomers for the helicase to reach the same level of activity. Recent work [15] showed that ssDNA also stimulates the ATPase activity. If this stimulation mainly happens in the central channel of the hexamer ring, the allosteric pattern could be the result of hexamerization, while some LTag monomers might still hydrolyze ATP without forming the ring; the combined effect would shift the results upward from Model 2 toward the random model.

Figure 2.1 (C) and (D) show that Model 6 is the best model in fitting 1) the ATPase activity with cis-mutant doping and fork DNA and 2) the ATPase activity with cis-mutant doping and DNA with origin sequence.
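The model curves compared in Figures 2.1 and 2.2 follow from equation (2.4) and the activities in Table 2.1. The Python sketch below is a minimal illustration, not the original analysis code. The dictionary TABLE_2_1 transcribes Table 2.1, the names average_activity and r_squared are hypothetical, and the R² shown is the ordinary coefficient of determination, which is assumed here because the text does not state which definition was used. No experimental data are included.

```python
from math import comb

N = 6  # monomers per hexamer

# Transcription of Table 2.1. For each k (number of inactive monomers) the
# list holds one (t, activities) pair per configuration, where `activities`
# are the values for Model 1 .. Model 6 in that order.
TABLE_2_1 = {
    0: [(1.0, [1, 1, 1, 1, 1, 1])],
    1: [(1.0, [0, 4/6, 6/6, 5/6, 4/6, 4/6])],
    2: [(6/15, [0, 3/6, 5/6, 4/6, 3/6, 0]),
        (6/15, [0, 2/6, 6/6, 4/6, 2/6, 2/6]),
        (3/15, [0, 2/6, 6/6, 4/6, 2/6, 2/6])],
    3: [(6/20, [0, 2/6, 4/6, 3/6, 0, 0]),
        (12/20, [0, 1/6, 5/6, 3/6, 1/6, 0]),
        (2/20, [0, 0, 6/6, 3/6, 0, 0])],
    4: [(6/15, [0, 1/6, 3/6, 2/6, 0, 0]),
        (6/15, [0, 0, 4/6, 2/6, 0, 0]),
        (3/15, [0, 0, 4/6, 2/6, 0, 0])],
    5: [(1.0, [0, 0, 2/6, 1/6, 0, 0])],
    6: [(1.0, [0, 0, 0, 0, 0, 0])],
}


def average_activity(model, p):
    """Equation (2.4) for `model` (1..6) at inactive monomer fraction p.

    The sum runs over every k; terms beyond n^(m) vanish on their own
    because the tabulated activities are zero there.
    """
    total = 0.0
    for k, configs in TABLE_2_1.items():
        weight = comb(N, k) * p**k * (1 - p) ** (N - k)
        total += weight * sum(t * acts[model - 1] for t, acts in configs)
    return total


def r_squared(observed, predicted):
    """Coefficient of determination (assumed definition of R^2)."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - q) ** 2 for o, q in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, [round(average_activity(m, p), 3) for m in range(1, 7)])
```

Under these tabulated activities, average_activity(1, p) reduces to (1 - p)**6 and average_activity(4, p) to 1 - p, which are the sequential lower bound and the linear random model discussed in Section 2.2.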
Model 6 is a semi-sequential model, which assumes that only a W-W pocket can hydrolyze ATP and that two or more adjacent inactive monomers in a hexamer ring deactivate the ring. These results indicate that the semi-sequential ATPase pattern applies to the starting and middle stages of the DNA unwinding procedure. The R² values of Model 6 are 0.992 and 0.994 respectively (Table A.1).

Figure 2.1. Theoretical models for ATPase activity with cis-mutant doping. Models 1-6 refer to the models listed in Table 2.1. A. ATPase doped with cis-mutant; B. ATPase + ssDNA, doped with ATPase cis-mutant; C. ATPase + fork DNA, doped with ATPase cis-mutant; D. ATPase + DNA origin sequence, doped with ATPase cis-mutant.

In Figure 2.2, the helicase activity is measured in the presence of fork DNA and of DNA with origin sequence (we use the term "origin DNA" for DNA with origin sequence in the following paragraphs), and two groups of doping mutants, namely the cis-mutant and the β-hairpin tip mutant, are mixed into the solution gradually.

Figure 2.2 (A) illustrates that Model 6 (semi-sequential) best fits the data when fork DNA is present (R²=0.988), while Figure 2.2 (B) shows that Model 1 (sequential) fits the data best when origin DNA is present (R²=0.993). This difference indicates that the conformation of the double-stranded DNA affects the unwinding activity. The fork DNA is "easier" to unwind since the DNA is already separated at the fork point, while the origin DNA needs additional work to initiate the melting process. This additional work might be the reason that all six monomers are required to work together. Panel (B) shows a strict sequential pattern. The same pattern is seen in panels (C) and (D), where the only difference is the doping mutant.
The helicase activity with K512A/H513A mutants in the presence of fork DNA best matches the semi-sequential Model 6 (R² = 0.994), and the helicase activity in the presence of origin DNA best matches the sequential Model 1 (R² = 0.983). These similar results indicate that the additional work needed to start melting the origin DNA requires all six binding pockets to hydrolyze ATP properly and all six β-hairpins to translocate and unwind properly. Fork DNA unwinding and translocation require less work than for origin DNA, so a few non-consecutive inactive monomers in the helicase can be tolerated.

In addition, comparing the ATPase activity of Figure 2.1 (C, D) with the helicase activity of Figure 2.2 (A, B), we can see that the cooperativity requirement is stricter for the helicase activity. This indicates that some energy is dissipated after ATP binding and hydrolysis, so that DNA unwinding needs all the monomers to work together.

Figure 2.2: Theoretical models for helicase activity with mutant doping. Models 1-6 refer to the models listed in Table [reference missing]. (A) Helicase + fork DNA, doped with cis-mutant; (B) Helicase + DNA with origin sequence, doped with cis-mutant; (C) Helicase + fork DNA, doped with K512/H513-mutant; (D) Helicase + DNA with origin sequence, doped with K512/H513-mutant.

Chapter 3 ATP binding study

The thermodynamic principles that control the binding of a nucleotide to its target protein are well understood [57]; however, revealing the atomic-level mechanism of the nucleotide binding process is still challenging. In this section, we present a systematic simulation of the ATP binding process; the energy coupling in the key steps is consistent with the experimental results.

3.1 Overview

ATP binding is a ligand docking process. Current docking programs take only ligand flexibility into account while keeping the receptor rigid or partially rigid.
This treatment reduces the degrees of freedom and speeds up the structural search. However, lacking full receptor flexibility might yield a docked structure in which the ligand has poor binding affinity [1]. The molecular dynamics (MD) method simulates the physical movement of atoms and molecules by integrating Newton’s laws of motion [30]. The forces between particles are modeled using a molecular mechanics force field, so MD is capable of simulating the flexibility of both the ligand and its receptor in a docking procedure. Determining the ligand entry pathway can be a difficult task in both simulation and experiment, and most molecular simulation studies have focused on the escape pathway instead of the entry pathway [1, 57]. The escape pathway can provide useful reference information, such as candidate intermediate structures and energetics, for the entry pathway. In our case, we have the docked structure (ATP bound structure 1SVM) and the major part of the initial structure (Apo structure 1SVO), and a similar study of the ATP escape pathway has been reported [2]. Therefore, we can integrate this information into our molecular dynamics setup to simulate the ATP binding pathway in the LTag helicase. Similar ATP binding studies have also used molecular dynamics to study conformational changes, for example in ATP-binding cassette transporters (ABC transporters) [23]. However, brute-force MD is still infeasible for our helicase study. The highly cooperative structure and the scale of our SV40 LTag system (2166 residues, or 35448 atoms) restrict the time scale of MD simulation to nanoseconds or microseconds, which is much shorter than the millisecond scale of the actual biological process. To address this problem, we resort to non-equilibrium methods that apply external forces to biomolecules to accelerate processes. Non-equilibrium methods are widely applied in studies of larger and more complex systems [41, 54].
In our study, we specifically use the targeted molecular dynamics (TMD) method, in which the external force is applied in the form of a holonomic constraint that reduces the root mean square deviation (RMSD) between the current structure and the final (target) structure [54]. TMD is suitable for calculating transition pathways between two known protein conformations. The combination of MD and TMD has been intensively exploited in dynamics studies of various systems, such as Ras p21 in the signal transduction pathway [40], the F1-ATPase system [2, 39], the GroEL complex [41], and the human α-7 nAChR receptor [10]. Here we adopted TMD to calculate the whole transition pathway and used MD to simulate the conformational change accurately in certain key time slots. Several other candidate methods also use a perturbation to accelerate the simulation towards the desired structure, such as steered molecular dynamics (SMD) [20], biased molecular dynamics (BMD) [47] and the RP-TMD method. SMD mimics atomic force microscopy and constrains the motion of the center of mass of a selected group of atoms, and it is capable of evolving optimal pathways. However, constraining only the center of mass usually cannot take full advantage of the known final structure within a limited simulation time. BMD perturbs the structure only when it diverges from the target structure, and the simulation can still be slow [71]. RP-TMD introduces modifications that restrict the scale and direction of the perturbation so that the simulation tries to avoid crossing high energy barriers. Nevertheless, as long as there is a constraint force, crossing some high energy barriers cannot be avoided. In addition, extra parameters such as the perturbation amplitude need to be adjusted for each structure in a black-box manner [64].
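To make the TMD idea described above concrete, the sketch below computes the RMSD to a target structure and a constraint residual that a TMD run would hold at zero while the target RMSD is lowered step by step. It is a minimal illustration, assuming unweighted coordinates without superposition and a linear schedule for the target RMSD; neither choice is taken from the simulation setup of this work, and all function names are hypothetical.

```python
import math


def rmsd(coords, target):
    """Root mean square deviation between two equally sized sets of (x, y, z)."""
    if len(coords) != len(target) or not coords:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    squared = sum((a - b) ** 2 for p, q in zip(coords, target) for a, b in zip(p, q))
    return math.sqrt(squared / len(coords))


def target_rmsd_schedule(initial_rmsd, n_steps, final_rmsd=0.0):
    """Assumed linear decrease of the target RMSD rho over n_steps >= 2 steps."""
    step = (final_rmsd - initial_rmsd) / (n_steps - 1)
    return [initial_rmsd + step * i for i in range(n_steps)]


def constraint_residual(coords, target, rho):
    """Phi = RMSD(coords, target)^2 - rho^2; the constraint is satisfied at 0."""
    return rmsd(coords, target) ** 2 - rho ** 2
```

For example, `target_rmsd_schedule(2.0, 5)` yields the targets 2.0, 1.5, 1.0, 0.5 and 0.0 Å; at each step the constraint force would pull the structure until `constraint_residual` returns to zero.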
With these considerations, we adopted TMD to simulate the ATP binding process and used molecular dynamics to relax the intermediate structures generated by TMD; this protocol reduces the effort of parameter setting. Experimental studies have indicated that an ATP tight-bound state and an ATP loose-bound state exist in the ATP escape pathway, and the associated energy barriers have also been measured [2, 5, 23]. These studies provide references for our ATP binding simulation of the helicase motor. Before introducing the simulation, we describe the sequence and structure of the LTag ATP binding site in detail. As described in Chapter 1 (Figure 1.3), the LTag binding site is located at the “interface” between two adjacent monomers. ATP binding triggers conformational changes in both adjacent monomers, and the changes propagate along the hexamer ring (we performed a 2 ns molecular dynamics simulation; see Figure B.1). The ATP binding site contains highly conserved protein sequence motifs, namely the Walker A and Walker B motifs [65]. The Walker A motif, or phosphate-binding loop (P-loop), has the pattern GXXXXGK(T/S), where X denotes any standard amino acid. The lysine is crucial for nucleotide binding [16]. The P-loop interacts with the magnesium-nucleotide complex through the β and γ phosphates. Crystal studies indicate that the P-loop by itself does not necessarily bind or utilize nucleotide; its immediate neighbors before and after it help to form the nucleotide binding region [51]. This region is characterized by a glycine-rich P-loop preceded by a β sheet and followed by an α helix. The Walker B motif has the pattern (R/K)XXXXGXXXXLhhhhD, and a recent study shows that the motif is better described as hhhhDE [16], where h denotes a hydrophobic residue, D coordinates the magnesium ion, and E is essential for ATP hydrolysis.
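The Walker A pattern GXXXXGK(T/S) translates directly into a regular expression. The following is a small sketch using Python's standard `re` module; the function name is hypothetical, and the flanking residues in the usage example are invented for illustration (only the eight-residue P-loop sequence GPIDSGKT is taken from this chapter).

```python
import re

# G, any four residues, G, K, then T or S. The lookahead lets overlapping
# matches be reported as well.
WALKER_A = re.compile(r"(?=(G.{4}GK[TS]))")


def find_walker_a(sequence):
    """Return (1-based start position, matched motif) for each Walker A hit."""
    return [(m.start() + 1, m.group(1)) for m in WALKER_A.finditer(sequence.upper())]
```

For instance, `find_walker_a("MAAGPIDSGKTLL")` reports a single hit starting at position 4 of this toy sequence. Positions are relative to the string passed in, so mapping them to LTag residue numbers requires the offset of the fragment within the full-length protein.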
The ATP binding pocket of LTag helicase is a typical ATP binding domain (Figure 1.3); the P-loop sequence in LTag helicase is GPIDSGKT (residues 425-432). The corresponding Walker B motif sequence is RLNFELGVAIDQFLVVFED (residues 455-473).

Based on the high-resolution structures of LTag, we simulated the ATP binding process of LTag and the associated conformational changes (see the Method section). We first used the TMD approach to calculate the transition pathway from the Apo state (with the ATPs placed outside the binding pocket) to the ATP bound state, and examined the ATP binding process that powers this conformational transition. The results suggest that an ATP molecule goes through a three-stage process before being “locked” inside the nucleotide pocket. Meanwhile, the configurations of the binding pocket along the ATP binding pathway were evaluated using the linear response approximation (LRA) version of the semi-macroscopic protein dipoles Langevin dipoles method (PDLD/S). PDLD/S-LRA is capable of evaluating binding free energies significantly faster than microscopic methods, such as PDLD or free energy perturbation, with comparable accuracy [33]. In addition, the simulated conformational transition reveals an atom-level refinement of the iris-like cooperative movement of the LTag hexameric helicase.

3.2 Results

There are three major conformational transition stages of the LTag molecular motor, associated with ATP binding, ATP hydrolysis and ADP release. Intensive studies have been performed on the nucleotide hydrolysis step [25, 60]. Since the LTag ATP binding pocket is a typical binding pocket, and a preliminary study using the empirical valence bond (EVB) method gave a hydrolysis energy profile similar to the one illustrated in ref. [60], we leave the detailed study of hydrolysis to future work. In this section, we focus on the ATP binding stage.
We built a pre-Apo state model by placing six ATP molecules 20 Å away from the nucleotide binding pockets, and an ATP docking state model by putting the ATP in the binding pocket of the Apo state. We simulated a 1 ns (nanosecond) pathway from the pre-Apo state to the ATP binding state, and another reference pathway from the pre-Apo state to the ATP docking state and finally to the ATP binding state (see Section 3.4, Method). We use the latter as a reference pathway because the ATP docking state model might help to cross-verify the existence of the two intermediate states; however, this model might also introduce artifacts into the trajectory.

3.2.1 LTag helicase ATP binding model

In this section, we compared the simulated trajectory with the latest biochemical experimental results, and studied the conformational changes of the cis and trans residues involved in ATP binding, the movement of ATP, and the dynamic hydrogen bond formation during the ATP binding process. The results of this study suggest a cross-locking model of the binding pocket for ATP binding.

The cross locking mechanism of the ATP binding pocket

Four key snapshots from the open state to the closed state of the simulation trajectory are illustrated in Figure 3.1, namely the Apo state (at the start point of the simulation), the weak binding state (WS, at around 0.1 ns), the tight binding state (TS, at around 0.4 ns) and the ATP bound state. It is important to note that the time in the TMD simulation only represents the order of the conformational changes, not necessarily the actual time needed for the events to occur. Similar to the F1-ATPase ATP binding model [2], the binding procedure can be divided into a docking stage from Apo to WS, and a binding transition stage leading to TS. In addition, there is a shrinking stage from TS to the ATP bound state, which corresponds to the major conformational change triggered by ATP binding [14].
For convenience of description, we first define the following terms and measurements.

1. The cis- and trans-residues around the ATP pocket move closer to each other during ATP binding and form a cross-locking system of three pairs of “locker” residues. The two members of each locker pair have opposite polarity. As illustrated in Figure 3.1, we name these locks lock1 (ribose and LYS419), lock2 (ASP429 and LYS418/419) and lock3 (ASP474 and ARG540) (Figure 3.1 (A) and (B)). Residue ASP429 blocks the ATP binding pathway after TS and acts as a gate that prevents further nucleotide binding or escape. ASP429 is hinged at GLY431, which is located in the relatively fixed part of the P-loop together with THR433. Together, ASP429-GLY431-THR433 form a gate mechanism for the binding pocket (Figure 3.1 (B)).

2. To study the conformational change of the ATP itself, described as twisting and bending during ATP binding, the variation of an ATP dihedral angle (N1-C4’-C5’-PB) and an ATP angle (N1-C4’-PB) between the adenosine group, the ribose and the phosphate group is traced during the binding process (Figure 3.1 (C)).

Figure 3.1: The three-stage ATP binding procedure. (A, B) The bottom and side views of the binding pocket residues and the ATP-Mg2+ complex (Mg2+ shown as a blue sphere). The residues T433 (the fixed end), G431 (the hinge) and D429 (the gate) form the ATP gate (B). Lock 1 is shown as a double-headed arrow with a solid line, Lock 2 as a dashed line and Lock 3 as a dotted line. The bottom two rows show views from the bottom and the side of the ATP binding pocket for the four key snapshots during the ATP binding process, namely the Apo state, WS, TS and the ATP bound state. The oxygen atoms near the gate are shown as red balls in the ATP bound state.

3. Most of the binding procedures are described from the bottom (C-terminal) view of the helicase.
We denote the positions near the central channel surface as the inner side and the positions away from the central channel as the outer side. The N-terminus is labeled the upper side and the C-terminus the bottom side.

ATP docking between Apo and WS

The docking stage starts from the position where the ATP is around 5 Å away from the pocket. The phosphate group of the ATP-Mg2+ complex diffuses to and docks into the binding pocket (Figure 3.1 (ATP)). The positively charged residues in the binding pocket, such as LYS418, LYS423, ARG540 and ARG498, begin to orient themselves towards the negatively charged phosphate group of the oncoming ATP (Figure 3.1 (Apo)). The dihedral angle remains at about -140 degrees (Figure 3.2 (B)), the ATP angle fluctuates above 130 degrees (Figure 3.2 (C)), and the ATP gate angle is around 140 degrees, fully open for ATP to enter the binding pocket. At the end of the diffusion, the phosphate group reaches a position ~3.5 Å away from its position in the final TS (Figure 3.2 (A)). At this stage, the phosphate group has established strong contacts with the binding pocket, i.e. the ATP has docked onto the P-loop of the binding pocket. This state is named the weak binding state, similar to the definition used for the ATP escape procedure in F1-ATPase [2]. In the weak binding state, the adenine is still outside the pocket.

Figure 3.2: The ATP conformational change and the cross-locking system. (A) The binding direction of the ATP-Mg2+ complex. In the PDB file, PA, PB and PG represent αPi, βPi and γPi, respectively. The black, red, green and blue lines represent the distances from αPi, βPi, γPi and the ribose to the relatively stable ASP474 Cα on the inner side of the binding pocket. (B) The dihedral angle of ATP. (C) The bending angle of ATP. (D) Lock1: the distance between the ATP ribose O4’ and LYS419 CE. (E) Lock2: the distance between ASP429 CG and LYS418/419 CE. (F) Lock3: the distance between ASP474 CG and ARG540 CZ.
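The ATP twisting and bending measures plotted in Figure 3.2 (B) and (C) are an ordinary dihedral angle over four atoms (N1-C4’-C5’-PB) and a bond-type angle over three atoms (N1-C4’-PB). The sketch below shows one way to compute them from Cartesian coordinates with the standard atan2 formula; it is an illustration only, the sign convention is an assumption that may differ from the analysis tool used for the figures, and the coordinates in the usage note are made up.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def angle_deg(p1, p2, p3):
    """Angle in degrees at the middle point p2 (e.g. N1-C4'-PB)."""
    v1, v2 = _sub(p1, p2), _sub(p3, p2)
    cos_a = _dot(v1, v2) / math.sqrt(_dot(v1, v1) * _dot(v2, v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))


def dihedral_deg(p1, p2, p3, p4):
    """Signed dihedral in degrees about the p2-p3 bond (e.g. N1-C4'-C5'-PB)."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))
```

With the four points (1, 0, 0), (0, 0, 0), (0, 0, 1) and (0, 1, 1), `dihedral_deg` gives a quarter turn; applying the two functions to every frame of a trajectory yields time series of the kind shown in Figure 3.2 (B) and (C).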
ATP binding from WS to TS

The next stage is the binding transition stage from WS to TS. The adenosine is inserted into the gap between H9 and H13 (Figure 1.3 (C)), which changes the ATP conformation by twisting the base into the hydrophobic pocket that accommodates it. Our simulation reveals that the ATP finishes its major conformational change before 0.2 ns, while the cross-locking residues complete their movement after 0.2 ns. The ATP insertion is affected by the movement of the phosphate group to its binding position. As illustrated in Figure 3.2 (A), the distances between the pocket residue ASP474 and the phosphate groups (αPi, βPi, γPi), and between ASP474 and the ribose C3’, decrease simultaneously. The relative conformations of the three phosphate groups are nearly stable. As binding progresses, the interaction forces between the phosphate and the binding pocket accumulate, and these forces promote the adenine insertion. The adenine turns about the phosphate axis by about 120 degrees (Figure 3.2 (B)) and is finally buried in the gap between helices H13 and H9 (residue groups ARG548-LYS554 and SER430-GLY431) [14]. The insertion is paired with the ATP bending. Figure 3.2 (C) shows that the ATP adenine group and the ribose bend about 40° toward the phosphate group, while the ATP gate closes by about 25° (Figure 3.2). During the insertion, the residue pairs of the cross-locking system also move closer to each other, approaching the minimum allowed distance, as shown by the decreasing slopes in Figure 3.2 (D)-(F). At this point the “lock” is fully “locked”. Figure 3.2 (D)-(F) also illustrates that the three locks reach their fully locked state sequentially; the kink in each curve indicates that the major locking movement has finished. The first slowdown point occurs on lock1 after 0.2 ns (Figure 3.2 (D)), followed by the second slowdown point on lock2 (Figure 3.2 (E)) and then the third on lock3 (Figure 3.2 (F)).
The sequential locking procedure is similar to the “binding zipper” model in F1-ATPase, where the binding affinity increases progressively [46]. At the same time, the ATP angle completes minor adjustments to reach its TS value (Figure 3.2 (C)). The gate is fully closed to prevent the insertion of another ATP (Figure 3.1 (TS)). Three residue pairs cross each other and form a cross-locking system. The tight binding state is therefore characterized by the full closure of the binding pocket; this key state is similar to that observed in the ATP unbinding process of F1-ATPase [2].

Figure 3.3: The timeline of the major hydrogen bonds formed during the binding procedure. Hydrogen bonds between the ATP and the cis-residues are plotted as black lines, between the ATP and the trans-residues as grey lines, between the binding lock residues as red lines, and between the apical water and its coordinating residues as blue lines.

The mutation study [15] indicates that there are three groups of critical residues for ATP binding: 1) D429, S430, G431, K432 and T433 on the P-loop; 2) N529 and D474 on the Sensor I motif; and 3) the arginine finger tR540 and the lysine finger tK418. By analyzing the trajectory, we found that all three groups form strong hydrogen bonds with the phosphate group. The ATP interacts with the P-loop residues first, and then with the rest of the residues in the binding pocket. As illustrated in Figure 3.3, hydrogen bonds are first formed between the βPi group and residues D429 and S430. At the same time, G431 forms hydrogen bonds with the αPi group. In the binding transition stage, the βPi group begins to interact further with the P-loop residues. For example, new hydrogen bonds are formed between K432 and the βPi group, and then between T433 and αPi (Figure 3.3). Then the lysine finger tK418 forms hydrogen bonds with the αPi and βPi groups.
In addition, the arginine finger tR540 forms hydrogen bonds with the γPi group, and the Sensor I residue N529 also forms a weak hydrogen bond with the γPi group. Further, the ATP establishes coordination with the pocket residues through an apical water during binding; for example, the apical water forms hydrogen bonds with the arginine finger tR540 in the docking stage, with another arginine residue tR498 and the Sensor I residue N529 in the transition stage, and with D474 in the later shrinking stage (Figure 3.3, blue lines). The mutation study shows that the loss of tR540, tR498, N529 or D474 significantly reduces the ATPase activity; the formation of these hydrogen bonds might help to prepare the nucleophilic attack for the subsequent ATP hydrolysis [15].

As shown in Figure 3.3, the number of hydrogen bonds increases linearly with time. The sequential formation of these hydrogen bonds ensures nearly constant force generation, which may lead to the smooth closing motion of the binding pocket throughout the binding transition. This observation matches the corresponding sequential binding procedure in F1-ATPase ATP binding. Until the system reaches the weak binding state, the majority of the hydrogen bonds are formed between the ATP-Mg2+ complex and the cis-residues on the P-loop. As the binding procedure develops, the trans-residues, such as tR540, tK418 and tR498, form hydrogen bonds within the binding pocket. These bonds close the binding pocket to form the tight binding state. The observation is consistent with the experimental result [15] that the ATP-Mg2+ complex first attaches to the cis-residues of the binding pocket and then interacts with the trans-residues.

Shrinking stage from TS to ATP bound state

The binding of ATPs to the pockets triggers a global change after the tight binding stage: the D2/D3 domain of each monomer folds upwards towards the D1 domain (Figure 3.4).
From the global point of view, the six domain folding movements are integrated into a shrinking movement of the C-terminal domain inwards to the central channel and upwards towards the N-terminus (Figure 3.4 (E)). We name this stage, from TS to the ATP bound state, the shrinking stage. From the Apo state to WS, to TS and to the ATP bound state, the cooperative conformational change of the six D2/D3 sub-domain folding movements resembles a closing iris, and is thus named the iris model [14]. During the shrinking stage, the conformations and positions of the bound ATPs remain relatively stable, since all of the binding pockets are closed. The major conformational change is caused by the upward movement of D2/D3 (Figure 3.4 (C) and (E)). It is interesting to note that, given a fixed N-terminus: 1) the central channel residues in the middle section move faster than those at the bottom; and 2) the residues in the bottom section move faster in the shrinking stage than in the binding transition stage. For example, residues H513 and D455 are located at the middle and bottom of the central channel, respectively. The average RMSD slope of the α carbon of residue H513 is ~12 Å/ns in the binding transition stage and ~14 Å/ns during the shrinking stage (Figure 3.4 (A)). The average RMSD slope of the α carbon of residue D455 is ~5 Å/ns during the binding transition stage and ~14 Å/ns during the shrinking stage (Figure 3.4 (B)). These observations indicate that the β-hairpin movement originates from the state transition of the binding pocket. After TS, the movement is mainly caused by the domain folding. The fact that the middle section (H513) of the central channel moves faster than the bottom section (D455) may cause DNA stretching in the middle section of the central channel, which agrees with the recent crystal structure with DNA in the central channel. Figure 3.4 (C) illustrates the change of the domain angle (LYS331-ASN366-HIS513) between D1 and D2/D3. This angle also changes at two different rates.
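The stage-wise rates quoted above (in Å/ns) are average slopes of a displacement curve over a time window. One simple way to obtain such a slope is an ordinary least-squares line fit restricted to the frames of one stage, as in the sketch below. This is an assumption about the procedure rather than a description of how the values in Figure 3.4 were obtained, and the function names and stage boundaries passed in are hypothetical.

```python
def least_squares_slope(times, values):
    """Slope of the best-fit straight line through (time, value) pairs."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need two equally sized series with >= 2 points")
    mean_t = sum(times) / len(times)
    mean_v = sum(values) / len(values)
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def stage_slope(times, values, t_start, t_end):
    """Slope restricted to frames with t_start <= t <= t_end (one stage)."""
    pairs = [(t, v) for t, v in zip(times, values) if t_start <= t <= t_end]
    return least_squares_slope([t for t, _ in pairs], [v for _, v in pairs])
```

Calling `stage_slope` once with the time window of the binding transition stage and once with that of the shrinking stage gives two rates that can be compared directly, with time in ns and displacement in Å.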
The rate in the shrinking stage is four times faster than during the binding transition stage. The results from these figures all suggest that the major D2/D3 movement is triggered after ATP is tightly bound to the pocket, which is consistent with observations in the literature [17].

Figure 3.4: The movement of the helicase monomer sub-domain. (A) The Cα displacement of the tip residue H513 on the β-hairpin. (B) The Cα displacement of the tip residue D455 on the central channel at the C-terminal end. (C) The average folding angle of D2/D3, represented by the angle between the Cα atoms of LYS331, ASN366 and HIS513. (D) The average channel radius at the top (black), middle (red) and bottom (blue) sections. (E) The angle that can change between the two domains (domain folding angle) is indicated by the red arrow; the Cα positions of the top K331, middle H513 and bottom D455 are shown in the helicase. The H513 and D455 displacements are shown as grey arrows.

We found that the radius of the central channel at the C-terminal end decreases faster than that in the middle section, while the central channel radius at the N-terminus remains essentially unchanged (Figure 3.4 (D)). This result implies that the shrinking force comes from the D2/D3 domain, and it can be used for pushing the DNA from the C-terminus to the N-terminus through the channel (Figure 3.4 (E)).

3.2.2 Mg2+ coordination

The coordination of Mg2+ plays an important role in ATP binding. Experimental studies of the similar F0F1-ATPase show that the addition of Mg2+ increases the binding affinity of the nucleotide and helps the system proceed to the tight binding state [2, 68]. The binding pocket residues coordinate the Mg2+ ion directly or through intervening water molecules. Among these intervening water molecules, the apical water, WAT1, near the γPi helps to stabilize the pocket residues, and attacks the γPi during hydrolysis.
In our simulation, we observed the coordination of Mg2+ with the intervening waters during the ATP binding procedure (Figure 3.5). Mg2+ has a strong propensity to assume an octahedral coordination [68]. During the ATP docking stage, the Mg2+ ion forms a six-fold octahedral coordination complex with the β and γ oxygens and four water molecules. The whole complex (ATP-Mg2+ and five coordinated water molecules) docks into the binding pocket until the weak binding state is reached. There is a flattening stage in the distance profile between the cis-residue T433 and the Mg2+ (Figure 3.5), which indicates that T433 is searching for the best position to attack the Mg2+ in the complex. At the binding transition stage, T433 begins to attack the Mg2+. The invasion pushes one of the coordinated waters, WAT3, close to its neighbor WAT2, which forces WAT2 to leave its stable coordination position with the Mg2+ cation (Figure 3.5 (C)). As we can see from Figure 3.5 (A), there is a steep decrease in the distance between T433 and Mg2+ together with a sharp increase in the distance between WAT2 and Mg2+. On the other hand, the variation of the distance between WAT3 and Mg2+ is subtle (Figure 3.5 (B)). The stable coordination distance between Mg2+ and a ligand is ~2.0 Å. The coordination transition indicates that the hydration waters are not necessarily stripped all at once. As in the case of F1-ATPase, the ATP may progressively exchange its hydrogen bonds with the hydration waters for hydrogen bonds with the ATP-pocket residues [2].

Figure 3.5: Mg2+ coordination transition procedure. (A) The distance profile between the Mg2+ cation and the coordinated residues. (B) The box plot of the distance between Mg2+ and the coordinated residues. (C) The attack of T433 on the Mg-ATP complex. Left: the initial position of T433 before invasion. Middle: T433 invasion and WAT2/WAT3 relocation. Right: the new stable structure.
(D) The distance between the apical water and the coordinated residues: D474 (black), N529 (red), ATP γPi O1 (green) and O2 (blue), tR498 (cyan) and tR540 (magenta). All the coordination distances converge to stable values after 0.6 ps on the simulation time scale. These coordinations in the ATP bound state are illustrated on the right side.

3.2.3 Apical water coordination

When the ATP-Mg2+ complex diffuses near the pocket, the negatively charged phosphate group interacts with charged or polar amino acids, such as arginine (R540, R498), lysine (K418, K419) and aspartate (D502, D474). However, at the beginning, these charged groups may form hydrogen bonds with waters. When the ATP-Mg2+ complex comes in, the waters may act as temporary bridges that are weakened and broken by molecular vibrations during ATP-Mg2+ binding, and are eventually replaced and expelled by the ATP-Mg2+ complex. However, some of these water molecules act as hydrogen-bonded bridges between the charged amino acids and the ATP-Mg2+ during the entire binding process. The 2.0 Å crystal structures of LTag in different nucleotide bound states also reveal some of these fixed water molecules in the binding pocket before and after ATP-Mg2+ binding. Here, we focus on the apical water and the water molecules coordinated with Mg2+, since they are directly related to the hydrolysis of ATP. Our simulation results show that the apical water is unusually coordinated through four residues: two cis-residues, D474 and N529, and two trans-residues, tR540 and tR498 (Figure 3.5 (D)). No particular order of coordination is observed during the binding procedure. The distances between the four coordinated residues and the oxygen of the γPi group vary until the shrinking stage, when all the coordination distances converge to a stable hydrogen bond distance of around 3.5 Å. The coordination procedure can be considered as a shrinking cage for the apical water (Figure 3.5 (D)).
The vibration of the water molecule decreases until the cage shrinks to the stable state, when the apical water is in a position ready for the nucleophilic attack.

3.2.4 Interaction energy

The PDLD/S-LRA method is used to evaluate the binding energy of a series of 20 key snapshots (intermediate structures) sampled from the TMD simulation trajectory. The results give a rough energy profile of the interaction between the Mg-ATP complex and the binding pocket. Figure 3.6 (A) and (B) illustrate the results using dielectric constants of 20 and 40, respectively. The calculated trends do not depend on the choice of the protein dielectric constant. The energy profile starts at -6 kcal/mol, which corresponds to the initial interaction between the Mg-ATP complex and the water, and we use -6 kcal/mol as the baseline for measuring the binding energy. There is an energy barrier of 8 kcal/mol from WS to TS (Figure 3.6 (A)). This barrier coincides in time with the Mg2+ coordination exchange, where WAT2 (Figure 3.5 (C)) escapes from its stable position due to the invasion of residue T433. The coordination transition is similar to the transition between the Mg-ATP diphosphate coordination state and the Mg-ATP triphosphate coordination state. In the diphosphate coordination state, the Mg2+ coordinates with ATP through the β and γ phosphates. In the triphosphate coordination state, the Mg2+ coordinates with ATP through all three phosphates. The transition energy barrier is around 11 kcal/mol (18 kBT) at 300 K in water [36], slightly larger than our simulation result. One possible explanation is that the conformation of the binding pocket may facilitate the coordination transition of Mg2+ by decreasing the barrier by about 3 kcal/mol. The barrier is followed by an energy valley of 13 kcal/mol, which lasts throughout the binding transition stage and ends at the beginning of the shrinking stage. Then comes another energy barrier of 5 kcal/mol.
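The unit conversion behind the statement that 11 kcal/mol is about 18 kBT at 300 K can be checked with a few lines of code. The sketch below uses the molar gas constant R = 0.0019872 kcal/(mol K), which plays the role of kB for energies expressed per mole, and the standard factor 1 kcal = 4.184 kJ; the function names are illustrative.

```python
GAS_CONSTANT_KCAL = 0.0019872041  # kcal / (mol K); k_B expressed per mole
KCAL_TO_KJ = 4.184                # thermochemical calorie


def kcal_to_kbt(energy_kcal_per_mol, temperature_k=300.0):
    """Express a molar energy in units of k_B*T at the given temperature."""
    return energy_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k)


def kcal_to_kj(energy_kcal_per_mol):
    """Convert kcal/mol to kJ/mol."""
    return energy_kcal_per_mol * KCAL_TO_KJ
```

At 300 K, kBT corresponds to roughly 0.6 kcal/mol, so `kcal_to_kbt(11.0)` evaluates to a little over 18, in line with the value quoted from [36], and the 8 kcal/mol barrier found in the simulation corresponds to somewhat more than 13 kBT.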
There are three hydrogen bonds formed with N529 at this time (Figure 3.3): one with the γPi group oxygen, one with tR498 and one with the apical water. The adjustment in the shrinking stage helps to prepare the apical water to attack the γPi in the following ATP hydrolysis stage. The energy profile stabilizes at -12 to -14 kcal/mol, and the binding energy is about 8 kcal/mol. The experimental binding energy of TNP adenine nucleotide analogues is -8 kcal/mol (-33 kJ/mol) [18], which can be used as a reference for our simulation results. Therefore, we conclude that our simulation results fall within a reasonable range in comparison with previous studies [2].

The major energy barrier is in the binding transition stage. Most of the binding energy is released during the docking and binding transition stages. On the other hand, most of the domain-scale conformational changes happen after the binding transition stage (Figure 3.4). This sequence may imply that the domain-scale conformational change is triggered by ATP binding. Similar models have been reported; for example, in the F0F1-ATPase model, the energy transduction takes place during the binding transition stage as well [2]. Some recent studies of the similar bacteriophage T7 helicase [21, 35] also reported that the global conformational change is triggered either by ADP release or by ATP binding. The motor domain engages with DNA after ATP binding [17].

Figure 3.6: The interaction energy profile between the ATP-Mg2+ complex and the binding pocket. The negative time slot represents the conformation before the docking stage. The top profile (A) is generated with ε = 20, and the bottom profile (B) with ε = 40.

3.2.5 Monomer conformational change

The above simulation results indicate that the domain-wise conformational changes happen in the ATP binding transition and the shrinking stages.
The most significant conformational change is the D2/D3 domain movement towards the D1 domain (D2/D3 folding). The major folding movement occurs in the shrinking stage of the ATP binding process, with ~20% occurring in the ATP binding transition stage. Our previous work illustrated a ~17 Å movement of the tips of the β-hairpin [14]. In this simulation study, we found that these two movements can be derived from the angled D2/D3 folding movement toward the N-terminal D1 domain, with an angle of approximately 20°. The hinge point for the angled folding movement is around the joint of helices H5 and H6 (Figure 3.7 (A)). From the bottom view (Figure 3.7 (C)), the folding pushes the ATP-interacting cis-residues in an anti-clockwise direction toward the neighboring trans-residues, forming the cross-lock interactions that lock the ATP in the binding pocket. Figure 3.7 (C) illustrates an interesting movement of the β-hairpin during the folding. The tip of the β-hairpin moves upward along the central channel with a screw motion, which is consistent with the simulation results in section 1.

3.2.6 The cooperative movement triggered by ATP binding

Because the LTag monomer conformational changes triggered by ATP binding occur in the context of a hexamer, the six monomers within a hexamer have to cooperate with each other during the conformational switch. Our simulation shows that the most significant cooperative movement is the formation of the ATP binding pocket and the concomitant domain-wise folding of D2/D3 in the first transition stage. The cis-residues for ATP binding sit at the front and face the folding direction. The folding movement pushes the cis-residues to the position with the shortest locking distance (bonding distance) to the corresponding trans-residues of the anti-clockwise neighboring monomer for ATP binding.
At the same time, the folding movement of the neighboring monomer slides the trans-residues, which are located on the right side of the monomer (Figure 3.7 (E)), to the contact position for the incoming cis-residues. At the end of the cooperative movement, the two sides of the ATP binding pocket reach the shortest bonding distance and form the cross-locking interactions for ATP binding (Figure 3.7 (E)). Accompanying the folding, the six β-hairpins rotate and move along their slanted axes, as illustrated in Figure 3.7 (D) and Figure 3.7 (F). The ATP binding pocket is located at the base of the β-hairpin, so the folding movement triggered by ATP binding could be amplified through the lever effect of the six β-hairpins and transferred to the tip residues. The binding of six ATPs is therefore coupled with both the screw movements of the six β-hairpins towards the N-terminus in the central channel and the collective angled folding movement of the six D2/D3 domains towards the D1 domains, like the iris of a camera (Figure 3.7 (E)). However, we could not perform a reliable energetic analysis of the nature and extent of the cooperativity between the subunits within a hexamer at this time, because experimental kinetic data on the cooperativity of LTag helicase are lacking.

Figure 3.7: A refined iris model for helicase cooperative movement. (A) A side view of a monomer in the context of an LTag hexamer, viewed from the outside. The yellow, blue and cyan helices are the alpha-helices H15, H6 and H8, respectively. (B) Bottom view of the monomer. The dotted line with two round ends is the axis around which the D2/D3 part moves. The circle with a cross inside indicates the position of the central channel. The red and blue residues are cis- and trans-residues, respectively. (C) A side view of the monomer in the context of the LTag hexamer, viewed from inside the hexamer. The movement of the tip residue of the β-hairpin (H513) is illustrated by a series of red dots.
The moving trajectory is at about 15° to the axis of the central channel. (D) Side view of the monomer perpendicular to the rotation axis. The D2/D3 movement is illustrated by a series of tip residues, such as H513 and D455. The green, cyan and yellow residues correspond to the positions in the WS, TS and ATP bound states. The circles in yellow, dark blue and cyan represent the axis positions of H15, H6 and H8, respectively. (E) The cooperative iris movement of the D2/D3 domain from the bottom view. (F) The cooperative upward movement of the β-hairpin along the central channel in a screw manner. The upward arrows represent the H513 movement at the tips of the β-hairpin. The curved arrows illustrate the domain folding movement of D2/D3 about the axes drawn as solid lines.

3.3 Discussion

ATP binding and hydrolysis are essential for the function of the LTag helicase motor. We have performed a simulation study of the ATP binding process of LTag helicase in order to understand the energetics of ATP binding and the associated conformational changes that underlie LTag helicase function in DNA unwinding. Based on our simulation results, we proposed a cross-locking model for the ATP binding procedure of LTag helicase. The binding model can be divided into three main stages: the docking stage from the Apo state to the weak binding state, the binding transition stage from the weak binding state to the tight binding state, and the shrinking stage from the tight binding state to the ATP bound state. The first two binding stages are similar to the “binding zipper” model proposed for the F1-ATPase system. During the ATP binding process, the Mg-ATP complexes diffuse to the binding pocket in the docking stage. The phosphate group then begins to interact with the binding pocket residues, such as those of the P-loop, and the WS conformation is formed.
In the WS conformation, the bonding interactions between the three pairs of lock residues are not yet formed, the three locks are fully open, and the adenine group is completely outside the pocket. The WS progresses to the TS during the binding transition stage. The Mg-ATP complex gradually forms hydrogen bonds with the residues in the binding pocket through the phosphate group and the ribose. These interactions induce conformational changes in both the ATP and the lock residues around the pocket. For the ATP, as the adenine inserts into the hydrophobic gap between H9 and H13, the dihedral angle between the adenine/ribose and the phosphate group increases by about 150 degrees. The ATP also bends down to a right angle. For the binding pocket, the three locks close sequentially: first lock1 (ribose and LYS419), then lock2 (ASP429 and LYS418/419), and finally lock3 (ASP474 and ARG540). The hydrogen bond analysis shows that the Mg-ATP complex first interacts with the P-loop and cis-residues, and then forms hydrogen bonds with the trans-residues. The major stabilizing hydrogen bonds form first with the cis-residues on the P-loop and N529, and then with the trans-residues tK418 and tR540. This agrees with the results of the mutation study [15] and with the ATP binding observations from the F0F1-ATPase system [2, 46]. The number of hydrogen bonds increases linearly over the binding process, which is consistent with the results of the “binding zipper” model [2]. The apical water is important for the nucleophilic attack in ATP hydrolysis. Our simulation shows that the position of the apical water is stabilized during the shrinking stage. The intra-ring conformational change and the relocation of residues compress the “cage” space around the apical water, and after certain adjustments, the coordinated residues are stabilized near the apical water in the ATP binding stage. At the end of the binding transition, the gate of the ATP pocket is fully closed.
Negatively charged side chains, such as ILE428, ASP429, and ribose bases, all gather outside the gate. This may help to prevent the approach and binding of another ATP (Fig. 3 (TS)). In the binding transition stage, all the significant movement is concentrated in the binding pocket. Only ~20% of the domain-wise conformational changes occur at this stage, which includes a subtle D2/D3 domain movement. The major domain-wise conformational change (~80%) is accomplished in the shrinking stage. It is interesting to note that the radius of the central channel decreases more in the C-terminal bottom portion than in the middle portion during the conformational change triggered by ATP binding, which could mean that the upward movement of D2/D3 toward the N-terminal D1 domain generates a pushing force that moves DNA through the central channel. This movement is part of the iris motion of the LTag hexamer associated with the ATP binding and hydrolysis processes.

3.4 Methods

All simulations are performed with the CHARMM program package [7], and the binding energy profile is calculated with the POLARIS module of the Molaris program package [33]. The CHARMM27 all-atom force field [42] and the TIP3 water model [24] are employed. Standard spherical truncation is used for the long-range electrostatic interactions, and the cutoff radius for the non-bonded interactions is 14 Å. The SHAKE algorithm is used to constrain the bonds involving hydrogen atoms during the simulation [53]. We have built two models for the Apo state LTag helicase. The first Apo structure is built for the TMD simulation of the ATP binding procedure. Six ATPs extracted from the ATP bound state are placed 20 Å away from the original Apo state helicase structure [14]. The O software program is used to adjust the spatial positions of the ATPs [22]. The ATPs are relaxed for 10 ps at 300 K. We use the Dowser program to place the internal water molecules for the Apo structure [75]. A water sphere of 70 Å is built to wrap around the Apo structure.
36 chloride ions and 28 sodium ions are used to neutralize the system. We minimized the system for 3000 steps and then equilibrated it for about 50 ps while heating from 0 to 300 K, followed by another 200 ps of equilibration at 300 K. The second model is built to verify the concerted model of ATP binding. The system is built by replacing one of the Apo state monomers with the corresponding ATP bound state structure. The position of the new monomer is determined by aligning its D1 domain to that of the original Apo state monomer. The system is quenched for 10000 steps and equilibrated for 50 ps while heating from 0 to 300 K, followed by another 200 ps of equilibration at 300 K. The ATP bound state is scanned by the Dowser program to place the missing internal water molecules. A 70 Å TIP3 water sphere wraps the system. Again, we quenched the system for 10000 steps and equilibrated it from 0 to 300 K. The TMD simulation uses an additional energy term based on the RMSD between the initial structure and the final (target) structure. The energy term has the form:

$$V = \frac{1}{2}\,k\,\left[\mathrm{rmsd}(t) - \mathrm{rmsd}^{*}(t)\right]^{2},$$

where k is the force constant (20 kcal·mol^-1·Å^-2), rmsd(t) is the root mean square distance of the current simulated structure from the target structure, and rmsd*(t) is the predefined target rmsd value at time t. Since the forward and backward trajectory pathways are supposed to be the same, the real crystal structure data provide a good starting point. Therefore, our TMD simulation started from the equilibrated ATP bound coordinates and ended at the equilibrated Apo state after 1.5 ns. The step size is 2 fs. This strategy was also employed in the previous E. coli MurD study [49]. It is important to note that in the original crystal structures, an Apo state monomer might not correspond to the ATP bound state monomer with the same segment name. We aligned each pair of monomers between the Apo and the ATP bound states and saved the 15 ($\binom{6}{2}$) pair-wise alignment scores.
We then aligned the six sequential monomers in the Apo state with those in the ATP bound state. For example, let the segments of the Apo and ATP bound states be represented by ABCDEF and A’B’C’D’E’F’, respectively. We first align the monomer sequence ABCDEF with A’B’C’D’E’F’, then with B’C’D’E’F’A’, next with C’D’E’F’A’B’, and so forth. The final alignment is the one with the best overall alignment score. In our study, we used the last 1 ns of the trajectory. The extra 0.5 ns is removed since it relates to the surface diffusion as ATP approaches the Apo helicase, which is outside the scope of the current research. To consolidate our results, we tried two other TMD simulations. One follows the normal pathway from the equilibrated Apo state to the equilibrated ATP bound state. Its conformational change is similar to the results above. The other TMD simulation involves an intermediate Apo state with ATPs bound to the pockets. The ATP positions are determined by aligning the ATP bound monomer with the Apo monomer. The intermediate Apo state is quenched and equilibrated in the same way as described above. This TMD simulation starts from the ATP bound state, goes through the intermediate Apo state and ends at the Apo state. The simulation results are similar to the results presented above and therefore support our conformational pathway. The PDLD/S-LRA method (the Linear Response Approximation version of the semi-macroscopic PDLD) is designed to evaluate protein-ligand binding free energies efficiently through a thermodynamic cycle that is a fast approximation of the rigorous Free Energy Perturbation (FEP) approach [56]. The PDLD methods have been described in a series of theoretical papers, covering the PDLD method [66], the semi-macroscopic version, PDLD/S [32], and the fast approximation version, PDLD/S-LRA [43].
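The cyclic matching of Apo segments to ATP bound segments described earlier in this section can be sketched as follows. This is a minimal illustration and not the script used in this work: the function name, the 6×6 score matrix layout and the convention that a higher score means a better alignment are all assumptions made for the example.

```python
def best_cyclic_shift(score):
    """Return (shift, totals) for the cyclic relabelling with the best total score.

    score[i][j] is a hypothetical pairwise alignment score between Apo
    monomer i and ATP bound monomer j; higher is assumed to be better.
    A shift of k pairs Apo monomer i with ATP bound monomer (i + k) mod n,
    i.e. ABCDEF against A'B'C'D'E'F' for k = 0, B'C'D'E'F'A' for k = 1, ...
    """
    n = len(score)
    totals = [sum(score[i][(i + k) % n] for i in range(n)) for k in range(n)]
    best = max(range(n), key=totals.__getitem__)
    return best, totals


# Synthetic example: the true correspondence is offset by two segments.
n = 6
demo = [[1.0 if j == (i + 2) % n else 0.1 for j in range(n)] for i in range(n)]
shift, totals = best_cyclic_shift(demo)
```

Only the six cyclic shifts are scanned because the monomers keep their order around the ring; a general assignment search is not needed under that assumption.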
Although it gives relatively accurate results for a fast approximation, the original microscopic PDLD suffers from the numerical problem of compensation between large numbers and usually needs several calculation runs to average out the errors, which decreases the overall performance. The semi-macroscopic version solves this numerical problem by modeling the charges and permanent dipoles explicitly while treating the other effects implicitly [31, 33]. The LRA is adopted in PDLD/S-LRA to capture the reorganization energy, and it therefore provides consistent results. The well-established PDLD/S-LRA method has been widely applied to related biological systems, such as the F1-ATPase [63] and HIV protease [56]. We have also successfully applied the PDLD/S-LRA method to the LTag DNA translocation analysis, which will be discussed in Chapter 4 [37]. In the ATP binding simulation, we used the PDLD/S-LRA method to evaluate the 20 snapshots (intermediate structures) from the simulated TMD trajectory. The PDLD/S-LRA method evaluates the change in electrostatic free energy upon transfer of a given ligand (l) from water to the protein by starting with the effective PDLD potentials (3.1):

$$U^{p}_{elec,l} = \left[\Delta G^{l+p}_{sol} - \Delta G^{l'+p}_{sol}\right]\left(\frac{1}{\varepsilon_p} - \frac{1}{\varepsilon_w}\right) + \Delta G^{l}_{sol}\left(1 - \frac{1}{\varepsilon_p}\right) + \frac{U^{l}_{q\mu}}{\varepsilon_p} + \frac{U^{l}_{intra}}{\varepsilon_p},$$

$$U^{w}_{elec,l} = \Delta G^{l}_{sol}\left(\frac{1}{\varepsilon_p} - \frac{1}{\varepsilon_w}\right) + \Delta G^{l}_{sol}\left(1 - \frac{1}{\varepsilon_p}\right) + \frac{U^{l}_{intra}}{\varepsilon_p}, \qquad (3.1)$$

where ΔG_sol denotes the electrostatic contribution to the solvation free energy of the indicated group in water (e.g., $\Delta G^{l+p}_{sol}$ denotes the solvation of the protein-ligand complex in water). The values of the ΔG_sol terms are evaluated by the Langevin dipole solvent model. $U^{l}_{q\mu}$ is the electrostatic interaction between the charges of the ligand and the protein dipoles in vacuum (this is a standard PDLD notation).
This approach provides a reasonable approximation for the corresponding electrostatic free energies (3.2):

$$\Delta G^{elec}_{bind} = \frac{1}{2}\left[\left(\left\langle U^{p}_{elec,l}\right\rangle_{l'} + \left\langle U^{p}_{elec,l}\right\rangle_{l}\right) - \left(\left\langle U^{w}_{elec,l}\right\rangle_{l'} + \left\langle U^{w}_{elec,l}\right\rangle_{l}\right)\right] \qquad (3.2)$$

where the effective potential U is defined in Eq. 3.1, and ⟨ ⟩_l and ⟨ ⟩_l′ designate MD averages over the coordinates of the ligand-complex in its polar and non-polar forms. It is important to realize that the average in Eq. 3.2 is always taken with both contributions to the relevant U_elec evaluated at the same configurations. That is, the PDLD/S energies of the polar and non-polar states are evaluated at each averaging step by using the same structure. The 20 structures are sampled evenly from the initial docking stage, through the binding transition stage (WS to TS), to the shrinking stage (TS to ATP bound state). All 20 structures are relaxed for 500 ps at 300 K. Each structure is then evaluated in 10 different runs. The mean values of these 20 structural evaluations are connected to give a rough energy profile (Figure 3.6). We also studied the influence between adjacent binding pockets when the conformation of the related monomers is changed. This influence may promote ATP binding in the adjacent pockets. This hypothesis was examined by 1) transforming one of the monomers into the next nucleotide binding state, e.g. from the Apo to the ATP bound state, while leaving the remaining monomers unchanged, and 2) performing a 2 ns MD simulation to investigate the variance of the gap distance between adjacent monomers. Here we use the Cα distance between residues ASP429 and LYS418 (the Lock 2 residue pair), and between ASP474 and ARG540 (the Lock 3 residue pair), to measure the gap variance.
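The end-point averaging in Eq. 3.2 reduces to simple arithmetic once the effective potentials have been evaluated on each set of configurations. The sketch below shows only that bookkeeping; it does not call Molaris or POLARIS, and the function name, argument names and numbers are illustrative placeholders rather than results from this study.

```python
def lra_binding_energy(u_p_on_polar, u_p_on_nonpolar, u_w_on_polar, u_w_on_nonpolar):
    """Linear response estimate of the electrostatic binding free energy.

    Each argument is a list of effective potentials (kcal/mol) evaluated on
    configurations generated with the polar (l) or non-polar (l') ligand,
    in the protein (p) or in water (w). All names are hypothetical.
    """
    def mean(values):
        return sum(values) / len(values)

    protein_term = mean(u_p_on_nonpolar) + mean(u_p_on_polar)
    water_term = mean(u_w_on_nonpolar) + mean(u_w_on_polar)
    return 0.5 * (protein_term - water_term)


# Placeholder values chosen only to exercise the arithmetic.
dg_elec = lra_binding_energy(
    u_p_on_polar=[-20.0, -22.0],
    u_p_on_nonpolar=[-10.0, -12.0],
    u_w_on_polar=[-15.0, -15.0],
    u_w_on_nonpolar=[-5.0, -7.0],
)
```

Averaging over both end states is what lets the estimate include the reorganization of the environment when the ligand charges are switched on.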
The results in Figure B.1 (Appendix B) show that the gap distance varies randomly. The gap enlarged by the transformed monomer does not propagate significantly along the helicase ring, apart from some variance within 1 Å, which could be caused by normal thermal fluctuation. Therefore, we cannot establish a significant influence pattern without a longer simulation. The figures are plotted with the R software package [50], the VMD software suite [19] and PyMOL [11].

Chapter 4 DNA translocation study

The translocation mechanism in the central channel is another key problem in understanding the molecular motor. Because the coordinated movement in DNA translocation is relatively complicated, we used a simplified model to calculate the electrostatic interaction and capture the main effect involved in the translocation movement.

4.1 Overview

The crystal structures of the LTag hexameric helicase in different nucleotide-bound states revealed large conformational changes triggered by ATP binding and hydrolysis, which include the longitudinal movement of a β-hairpin and a DR/F loop structure along the central channel [14, 34]. The movements of the β-hairpin and the DR/F loop were suggested to play a role in DNA translocation and unwinding [13, 14, 58]. A similar β-hairpin is also seen in the central channel of the N-terminal structure of M. thermoautotrophicum MCM (mtMCM), and positively charged residues on these β-hairpins have been shown to be critical for DNA binding and helicase function of mtMCM and LTag.

Despite the progress in structural studies, it is unclear at present how the conformational changes triggered by ATP binding and hydrolysis lead to DNA translocation. Gai et al. suggested a structural mechanism relating the motion of the β-hairpins to the DNA translocation process [14], and, similarly, Enemark et al. provided insight into the DNA interaction [12].
However, these reports did not establish a clear relationship between the protein structural changes and the translocation process. Other recent work [9, 45] provided additional important structural and kinetic information, but these studies did not present clear energy considerations that would allow one to accept or reject a proposed mechanism. Recent theoretical attempts to explore the directionality in PcrA helicase [73, 74] provided an interesting insight into DNA translocation in a monomeric helicase system. However, that work did not consider the rate-determining barriers (those associated with the ATPase reaction) and involved a somewhat unjustified interpolation. Thus, the origin of the translocation directionality had not been resolved uniquely by structure-energy studies before the present study. Our work introduces a renormalization strategy for the study of a rather complex hexameric helicase system. We first simulated a single-stranded DNA translocation process using the available LTag hexameric helicase and E1 helicase/DNA structures, and then used a reduced model to simulate the actual translocation process and examine its time dependence. The unidirectional property of the translocation is determined by constructing a free-energy surface of helicase movement and ssDNA translocation. The time dependence of the LTag translocation procedure is simulated by Langevin dynamics, and the translocation trajectory is then projected to a longer time scale comparable to that of the real biological process. This study provides insights into the exquisite relationship between the electrostatic energy landscape and the directionality of the translocation process, and a general approach to structure-function correlation for translocases and related motor proteins.

4.2 A general analysis of DNA translocation process

To study the translocation, we first present a general analysis of the conditions for an efficient hexameric translocase.
The relevant biological system should be able to convert the energy of ATP hydrolysis into vectorial translocation of DNA or related molecules. In this case, we consider a system in which a hexameric helicase, such as LTag, encircles the translocated ssDNA. To analyze a translocation process it is useful to start with a hypothetical model system that supports such a process. This is shown in Figure 4.1, in which the ssDNA is described as having equally spaced dents, representing points of strong interaction between the protein and the DNA (e.g., the phosphate groups), and the protein is described as a gray object with a dent that represents the region with the strongest interaction with the DNA. R_0 is a reference point for the spatial translocation of the DNA. In the process of moving from 1 to 2 (Figure 4.1), the protein position is shifted while retaining the interaction with the DNA via site 6. In the 2nd and 3rd steps, the transition to E_1 leads to a major reduction in the protein-DNA interaction, and the return of the protein conformation to T (3 to 4) occurs without shifting the DNA. The overall cycle results in translating the DNA from 6 to 5.

Figure 4.1: A hypothetical system that leads to DNA translocation. The figure describes a “protein” (gray and black) that has a strong interaction with the DNA (nucleotide positions 5, 6, and 7 are indicated) at the indicated black protrusion. The system starts at the T state with strong bonding at site 6 of the DNA. Moving from T to D pushes the DNA down, and the subsequent motion to E leads to a relatively weak interaction between the protein and the DNA. The return to T restores a strong interaction, but now with site 5 [stage 4 (T_2)]. Thus, the overall process pushes the DNA downward. The indexes T_1 and T_2 designate the same T state but with a translation step.

The system of Figure 4.1 suggests an effective translocation.
To judge the corresponding efficiency, we generated an energy-structure description in terms of generalized coordinates. This is shown in Figure 4.2 (A), where we describe the effective free energy of the hypothetical system in terms of two coordinates (Q and R). The first coordinate, Q, describes the conformational change from the ATP bound state (T) to the ADP bound state (D) and then to the Apo state (E) configuration. The second coordinate, R, designates the DNA translocation. If the energy of the system behaves like the surface in Figure 4.2 (A), we will have a vectorial process along the path designated by 1–2–3–4. That is, on this path the system starts from the minimum at 1, moves to 2 when the protein moves from T to D, and then moves to 3 when the protein moves to E. The transition from E to T then takes the system along the least energy path to point 4 rather than back to point 1, and thus results in translocation.

Figure 4.2: Two types of free-energy maps for the translocation process. (A) A map that describes the energetics of the translocation process of Figure 4.1. The surface involves 2 effective coordinates: the protein structural changes (the Q axis) and the DNA translocation (the R axis). The indexes T_1 and T_2 designate the same T state but with a translated DNA. (B) A 2D map that describes an ineffective translocation. The surface is built in the same way as in (A), but now the motion from T to D does not involve a shift in the minimum. In this case, the system has an equal chance of moving on paths A and C.

In the case considered in Figure 4.2 (A), the map is based on the energetics of the actual system studied, and it allows one to explore the translocation directionality by Langevin dynamics simulation or by conceptual analysis.
On the other hand, if the system behaves as in Figure 4.2 (B), we will have an extremely inefficient translocase, because the system has an equal chance of moving from 6 to 7, moving from 6 to 5, or even staying at 6. This situation will not lead to effective translocation unidirectionality. Thus, the condition for effective action is that the minima of the surface shift while moving from T to D.

To validate the hypothetical analysis above, we generated two types of structure-function correlations to see how the real system behaves. In our case, we try to produce maps of the type shown in Figure 4.2 from the actual energetics of the system. Our main point is that the surface generated from a given model system does not necessarily support a unidirectional process, and failing to obtain unidirectional energy valleys in the surface would indicate that the model is problematic. With the above considerations, we generated an effective free-energy surface for the LTag system, using the crystal structure of papillomavirus E1, a viral initiator protein, with ssDNA and ADP bound [12] (PDB entry 2GXA). The sequence identity between E1 and SV40 LTag is 28%, and the crystal structure of LTag in the ADP conformation (PDB entry 1SVL) aligned well with the E1 helicase in the ADP state with ssDNA. We first superimposed the structure of the LTag ADP bound state (PDB entry 1SVL) on the E1 structure (PDB entry 2GXA), which is also in the ADP bound state and has the ssDNA in the central channel. Then we superimposed the structures of the LTag ATP bound state (PDB entry 1SVO) and Apo state (PDB entry 1SVM) onto the structure of the ADP bound state, based on the zinc domain (D1). This is because D1 functions as a collar of the LTag helicase, and its conformation is relatively stable during the DNA translocation. The internal structures of all of the conformations were kept unchanged during the transformation process.
In the subsequent step, we generated a series of intermediate conformations along the vectors that connect each pair of conformations. That is, starting from the crystal structure of LTag helicase with bound ATPs, we constructed nine structures going from the crystal structure of the ATP bound state to the ADP bound state. This was followed by another nine structures going from the ADP bound structure to the Apo structure. Finally, nine more structures were constructed going back from the Apo structure to the initial ATP bound structure. These 30 structures were used in our calculation to represent the helicase structural changes during the ATP hydrolysis cycle. The co-crystal structure of the E1 helicase-ssDNA complex contains an ssDNA with six nucleotides per helical turn, which corresponds to an average helical twist angle of 60° from one nucleotide to the next. The average distance between two adjacent nucleotides in the ssDNA is about 6.8 Å. We therefore constructed an ssDNA of 50 DT units (a DT is a simplified atom representing a phosphate of the ssDNA backbone), using these twist angle and spacing parameters, and superimposed it on the crystal structure of the ssDNA inside the helicase channel. Focusing on the negatively charged phosphate backbone, we constructed the model depicted in Figure 4.3. The constructed helical ssDNA was assumed to spin while being translated so as to maximize the electrostatic interactions with the LTag residues.

Figure 4.3: A structural model of the LTag hexamer in complex with ssDNA. (A) The crystal structure of the LTag protein in the ATP bound conformation with an ssDNA (phosphate chain, in ball-and-stick model) inserted into the protein channel. (B) Side view of the LTag central channel. For simplicity, only chains A and D of the hexamer are shown. (C) Critical residues in the central channel of LTag in different conformations. The structures corresponding to the ATP, ADP, and Apo states are shown in pink, light blue, and green, respectively.
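The construction of the simplified phosphate helix can be sketched as below. This is an illustrative reconstruction, not the script used for the model in Figure 4.3. Two assumptions are made that the text does not state at this point: the 6.8 Å spacing is read as the straight-line distance between adjacent phosphates, and the rise per nucleotide along the channel axis is taken as 3.4 Å, the value used later in this chapter for one unit of the translocation coordinate. The helix radius then follows from those two numbers and the 60° twist.

```python
import math


def helix_radius(spacing, rise, twist_deg):
    """Radius at which adjacent beads are `spacing` apart (assumed geometry)."""
    in_plane_chord = math.sqrt(spacing ** 2 - rise ** 2)
    return in_plane_chord / (2.0 * math.sin(math.radians(twist_deg) / 2.0))


def build_phosphate_helix(n_beads=50, twist_deg=60.0, rise=3.4, spacing=6.8):
    """Return (x, y, z) coordinates in Å for a chain of phosphate beads.

    The helix axis is the z axis; bead i is rotated by i * twist_deg and
    lifted by i * rise. All parameter names are hypothetical.
    """
    radius = helix_radius(spacing, rise, twist_deg)
    beads = []
    for i in range(n_beads):
        angle = math.radians(twist_deg * i)
        beads.append((radius * math.cos(angle), radius * math.sin(angle), rise * i))
    return beads


beads = build_phosphate_helix()
```

With a 60° twist, every sixth bead sits directly above the first one, which matches the six nucleotides per turn quoted above. In practice the chain would still have to be superimposed on the ssDNA of the E1 co-crystal structure, a step this sketch omits.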
To study the energetics of the above structural model, we focused only on the ionized phosphate groups of the DNA. This is because the phosphates are the primary source of the protein-DNA interaction for hexameric helicases [37]. The electrostatic energy of the protein-DNA model was evaluated using the same PDLD/S-LRA method (see section 3.4 for details). Using this approach, we first mapped the free-energy surface for the protein-DNA electrostatic interaction. The resulting surface indicates that the DNA-protein interaction energy is much stronger in the T state than in the D and E states.

Our energy surface does not include the protein and ATP internal energy, and thus the relative heights of the three minima have to be adjusted. This adjustment involved the following considerations. The transfer from T_1 to T_2 (Figure 4.4) involves a change of about −8 × 6 kcal/mol in free energy, because it reflects a change from (ATP + water) to (ADP + Pi) in aqueous solution [63, 68] for 6 ATP molecules. The transfer from T_1 to D_1 is assumed to involve a contribution of about −1 × 6 kcal/mol from the protein and the reacting system, while the transfer from D_1 to E_1 is assumed to involve a contribution of −7 × 6 kcal/mol from the reacting system. This estimate is based on the situation in F1-ATPase [63, 68], another hexameric molecular machine. In this respect our assumption is fully consistent with the fact that LTag and F1-ATPase have similar kcat values (0.3 s^−1 and 0.2 s^−1 for LTag and F1-ATPase, respectively) [15, 63, 68]. The interactions with the DNA and between the subunits change the barrier by only 1–2 kcal/mol. Furthermore, from our experience with F1-ATPase, we expected the chemical barriers to be higher than or equal to the conformational barriers. At any rate, the activation barrier for the transition between T and D is taken as 18 kcal/mol, and the transition from D to E is taken to have a barrier of 17 kcal/mol.
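As a rough consistency check on the 18 and 17 kcal/mol barriers, one can convert them to rate constants with the Eyring expression of transition state theory. This check is an addition for illustration: the use of the Eyring form with a transmission coefficient of one and a temperature of 300 K are assumptions of the sketch, not statements about how the barriers were chosen.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34  # Planck constant, J*s
R_GAS = 1.98720425864e-3   # gas constant, kcal/(mol*K)


def eyring_rate(barrier_kcal_mol, temperature=300.0):
    """Rate constant in 1/s for a free-energy barrier, transmission coefficient 1."""
    prefactor = K_B * temperature / H_PLANCK
    return prefactor * math.exp(-barrier_kcal_mol / (R_GAS * temperature))


rate_t_to_d = eyring_rate(18.0)  # T -> D barrier from the text
rate_d_to_e = eyring_rate(17.0)  # D -> E barrier from the text
```

Under these assumptions the 18 kcal/mol barrier gives a rate of a few tenths per second, the same order of magnitude as the kcat values quoted above, and lowering the barrier by 1 kcal/mol speeds the step up by roughly a factor of five.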
These values are based on the similar results for F1-ATPase and on the fact that our final results are not strongly affected by the barriers, except that the translocation rate becomes smaller if the barriers increase. We also assumed that the ΔG values for the different steps are similar to those of F1-ATPase, and again our overall results were not affected by this assumption, because the real driving force is the ATP hydrolysis, which gives a downhill gradient of about 7 × 6 kcal/mol regardless of the nature of the ATPase. We would also like to clarify that the downhill energetics is not the origin of the directionality. Because the barrier for a fully simultaneous hydrolysis reaction in all six subunits is estimated to be 6 × 18 kcal/mol, an energy barrier that would be overcome at room temperature only in about 10^28 years, we can conclude that at least this chemical reaction step should occur in a non-correlated or independent way. However, the finding that the chemical reactions are uncorrelated does not preclude the possibility that the conformational change occurs simultaneously after the hydrolysis reactions are completed, although it is more likely that the hydrolysis reaction occurs in several subunits and creates a spring-loaded type of effect on the conformational transition. At any rate, we do not try to explore the detailed steps of the overall conformational transition here and represent them by a single coordinate. Thus, the overall drop in energy in any complete transition has to represent the effect of all six subunits. The above contributions were added to the protein−DNA electrostatic interaction to provide the overall free-energy surface. The resulting surface is shown in Figure 4.4.

Figure 4.4: The effective free-energy surface (in kilocalories per mole) for the translocation process in LTag. This surface includes the adjustment that reflects the internal energy of the LTag states (see the text in this section).
The indices T1 and T2 designate the same T state but with translated DNA. R″ represents the DNA coordinate (see text for details) and Q″ ($Q'' = Q/\lambda_Q$, $\lambda_Q = (\hbar/2)\,\omega_Q \delta_Q^2$) represents the protein structural changes.

In our model, the ssDNA was assumed to move along the channel with a combined rotation and translation that maximizes the electrostatic interaction; the vertical axis represents this combined movement. For simplicity, we used a coordinate R″ whose change by 1.0 unit represents a rotation of 60° and a vertical translocation of 3.4 Å along the channel. We grouped the movement of the six subunits into one effective coordinate. This means that the details of the partially sequential conformational change are represented in a coarse way. Thus, the large reduction in free energy on going from E1 to T2 should have been distributed over six steps, where it would provide the driving force needed to complete the overall conformational change. Neglecting this detail can make it impossible to reproduce the exact number of nucleotides transferred per ATP used. It is quite possible that the number of nucleotides translocated per ATP is larger (or smaller) than what was predicted here once we are able to deduce the details of the conformational changes, but this improved treatment will probably have to wait until more information is available from single-molecule and related experiments. However, the details are unlikely to affect the overall calculated directionality, which is the focus of this study.

Figure 4.4 shows that the least-energy path forces the system to move from the minimum at R″ = 2.5 and T1 to R″ = 1.6 and D1. The system then moves to R″ = 1.5 and E, and subsequently continues to R″ = 1.2 and T2, where it has completed a translocation and is ready for the next cycle.
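The effective DNA coordinate can be converted back to physical displacements with simple arithmetic. The short sketch below applies the stated convention (one unit of R″ equals a 60° rotation and a 3.4 Å translation) to the R″ values quoted for the least-energy path; the function and variable names are hypothetical and serve only this illustration.

```python
DEGREES_PER_UNIT = 60.0    # rotation per unit of R'' (convention stated in the text)
ANGSTROM_PER_UNIT = 3.4    # translation per unit of R'' (convention stated in the text)


def displacement(r_start, r_end):
    """Return (rotation in degrees, translation in angstroms) for a change in R''."""
    delta = r_start - r_end
    return delta * DEGREES_PER_UNIT, delta * ANGSTROM_PER_UNIT


# R'' values along the least-energy path quoted in the text.
path = [("T1", 2.5), ("D1", 1.6), ("E", 1.5), ("T2", 1.2)]

for (state_a, r_a), (state_b, r_b) in zip(path, path[1:]):
    rotation, shift = displacement(r_a, r_b)
    print(f"{state_a} -> {state_b}: {rotation:.1f} deg, {shift:.2f} A")

total_rotation, total_shift = displacement(path[0][1], path[-1][1])
print(f"full cycle: {total_rotation:.1f} deg, {total_shift:.2f} A")
```

This only restates the coordinate convention; it carries no information beyond the R″ values given for the four states.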
To identify the residues that contribute to ssDNA translocation, we evaluated the electrostatic group contributions to the interaction of the DNA with the protein [37], and took the difference between the potentials at D and T at R″ = 2.4 and the difference between the potentials at D and T at R″ = 2. Residues that decrease the first difference and increase the second difference create the pattern of Figure 4.4. Mutating these residues and examining the resulting translocation will be extremely instructive for future studies.

Thus, the free-energy surfaces allow one to decide whether the translocation is unidirectional. Next, we used a renormalization method to quantify the time dependence of the corresponding process. This was done by fitting a simplified free-energy surface to the surface of Figure 4.4 and simulating the system with Langevin dynamics (LD) (see section 4.4). The LD simulations used effective frictions and reduced masses that represented the dynamics of the complete system.

The measured translocation rate of a different hexameric helicase, T7 gp4, is around 132 nt/s [27], which corresponds to roughly 0.007 s for a motion of one nucleotide. At present, a process on this time scale cannot be simulated easily by our LD approach in a reasonable time frame. Thus, we adopted an interpolation philosophy of scaling down the potential to values that allow direct simulation (we considered barriers up to 10 kcal/mol) and then increasing the barrier to interpolate the trend to the actual high-barrier case [6]. The results of the simulations with barriers of 4 kcal/mol are depicted in Figure 4.5. As seen from the figure, the system moves stochastically but in a unidirectional manner. The driving force for this process is the ATP hydrolysis reaction. The translocation times calculated for the different potentials are then used to estimate the translocation time with the full potential by interpolating to the corresponding barrier (see [6] for a related treatment).
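The interpolation idea described above can be sketched as a fit of the simulated translocation time against barrier height at low barriers, followed by evaluation of the fitted trend at the full barrier. The sketch below assumes that the logarithm of the time grows linearly with the barrier (an Arrhenius-like trend); the text does not state the functional form that was used, so this is an assumption. The input times are synthetic placeholders generated from that same assumed law, not simulation results from this work, and the names `fit_line`, `extrapolate_time` and `T0` are hypothetical.

```python
import math

RT = 0.592  # kcal/mol at about 298 K


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate_time(barriers, times, target_barrier):
    """Fit ln(time) against barrier height and evaluate the fit at target_barrier."""
    slope, intercept = fit_line(barriers, [math.log(t) for t in times])
    return math.exp(intercept + slope * target_barrier)


# Synthetic placeholder data (NOT results from the thesis): times that follow
# an assumed exponential law with a hypothetical prefactor T0.
T0 = 1e-12  # hypothetical attempt time in seconds
low_barriers = [4.0, 6.0, 8.0, 10.0]  # kcal/mol, within the range that can be simulated directly
synthetic_times = [T0 * math.exp(b / RT) for b in low_barriers]

estimate = extrapolate_time(low_barriers, synthetic_times, 18.0)
print(f"estimated time at 18 kcal/mol: {estimate:.3g} s (synthetic input)")
```

With real LD output, `synthetic_times` would be replaced by the mean translocation times measured at each scaled barrier, and the quality of the linear fit in log space would indicate whether the assumed trend is reasonable.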
The interpolation to a barrier of 18 kcal/mol gave a translocation time of 0.004 s per nucleotide, which is in qualitative agreement with the observed value (0.007 s). An improved agreement may be obtained by dividing the T to D step into six individual steps, but this is not the purpose of the present study, which focused on the origin of the directionality.

Figure 4.5: The simulated time dependence of the translocation process for a low barrier. The figure displays the time dependence of the R″ and Q″ coordinates and snapshots along the translocation path. The specific simulation was done for a barrier of 4 kcal/mol for the T to D transition. Simulations with higher barriers give similar results (but, of course, with longer translocation times).

To explore the possibility that the dynamics of the translocation process is somehow non-stochastic, we also changed the friction constants of the solvent and solute coordinates (γQ and γR, respectively) to see whether this changes the translocation time. We found that in cases with very low barriers (e.g., <3 kcal/mol) the values of the friction constants had a significant impact on the translocation time. However, in the high-barrier limit (with barriers >5 kcal/mol) the values of these constants had very little impact on the translocation time. This indicates that in the present case, where the chemical steps have high barriers, the translocation time is controlled by the free-energy landscape. It should be pointed out that it is hard to obtain a vectorial translocation without imposing it on the model; obtaining such a result while considering only the electrostatic interactions indicates that they dictate the physics of the system.

4.3 Discussion

The mechanism of DNA translocation by helicases is an issue of fundamental interest, both in terms of replication control and in terms of the general issue of the conversion of chemical energy to mechanical work.
We introduced a physically consistent energy-based analysis of the action of a hexameric helicase by developing energy diagrams that allow us to translate the structural information into translocation efficiency. Furthermore, a specialized renormalization approach allows us to explore the dynamics of the translocation process.

We studied the molecular origin of the unidirectional translocation action of the LTag helicase using the available structural information on LTag and on E1 with ssDNA. Focusing on the electrostatic energy of the system allows us to construct energy diagrams in the 2D space of the protein structural change and the DNA translocation. The resulting diagram provided a new view of the nature of the translocation process. It appears that the electrostatic-based surface leads (without special parameterization that would force a specific directionality) to a vectorial translocation of the ssDNA.

Because the modeling of the translocation process by brute-force MD simulations is impractical at present, a multiscale renormalization approach [25], similar to the ones used in studies of proton translocation [6], catalytic landscapes [52] and the F1-ATPase rotary mechanism [44], was exploited in this work. Basically, we represented the system by an equivalent coarse-grained system with the relevant free-energy surface. Performing LD or related simulations on the simplified surface provides an effective way of understanding the functions of complex systems.

The use of Langevin dynamics or related formulations in instructive studies of biological motors is not new [73]; however, reproducing motor directionality with a consistent “first principle” structure-function correlation has not been accomplished before this work. It should be mentioned that previous studies [73, 74] presented a pioneering attempt to explore the translocation directionality of ssDNA in PcrA, a monomeric helicase, and had some elements similar to the current study.
However, in our view, the origin of the directionality has not been resolved. A key problem in modeling directionality is the generation of a physically based free-energy surface that reflects the energetics of the system under consideration. Here, as also recognized in ref. [73], the attempt to draw directionality information from so-called steered molecular dynamics (SMD) is unlikely at present to lead to a unique potential of mean force (PMF) in systems as complex as helicase/DNA complexes, because of enormous hysteresis and convergence problems. Thus, ref. [73] focused on the interaction between the protein and the nucleotides by using the LRA approach [31] in its LIE version [3]. Although this approach was validated by us for DNA polymerase [70], it does not give quantitative results for the interaction of the protein with the highly charged phosphate groups without introducing a rather large dielectric constant [70].

The key advance in the present work is the construction of a free-energy surface in the complete space of the DNA and protein motion. Our studies provide a complete surface that leads to directionality, whereas Figure 2 of ref. [73] gave the profile of the protein–DNA binding energy for two points (the DNA binding with and without ATP) on the full surface. Interpolating between these two points to obtain the relevant barriers is very challenging, and it is hard to justify the interpolation used in ref. [73], which seems to be equivalent to deducing the barrier between two bound states based only on the information at these states. Another important issue is the need to consider the dependence of the free-energy surface on the ATPase coordinates, which is a key element of the present work.
Although our study revealed the origin of translocation directionality (i.e., translocation in one direction only) by focusing on the DNA main chain to construct a 2D energy diagram, resolving the issue of 3′ → 5′ or 5′ → 3′ movement will have to involve an evaluation of the interaction of the protein with the bases (the interaction with the main chain is identical in both cases). Such a study will greatly benefit from direct structural information about the complete protein-DNA complex.

It is important to note that taking into account the driving force of the ATP to ADP transition is crucial, but it does not explain the translocation directionality. That is, moving from ATP to ADP can push the DNA up or down or just leave it in its position. Obtaining the specific coupling that pushes the ssDNA unidirectionally is the key outcome of this model and the basis for the directionality. This outcome is not intuitively obvious without using our two-dimensional surfaces.

Currently, there is significant interest in whether molecular motors operate by random stochastic forces or by some type of coherent motion. The present study suggests that the translocation process may be driven by stochastic random motions, which are dictated by the free-energy landscape. This conclusion is based on the finding that using a physically based friction and changing the friction by two orders of magnitude did not change the translocation time once we moved to the high-barrier limit (which corresponds to the actual situation in helicases).

Another interesting insight that emerged from the present energy-based analysis is that the system must operate by uncorrelated chemical hydrolysis, because otherwise it would require an effectively infinite time. Nevertheless, it is still possible that the conformational change occurs after the chemical step and is completed in a somewhat simultaneous way.
The exact way in which the uncorrelated or stepwise ATP hydrolysis activates the conformational changes is one of the most intriguing unsolved puzzles about this system, and its solution will require a combination of structural, single-molecule, and computational studies. Additionally, a more quantitative analysis of the directionality problem should combine the electrostatic group contribution analysis described above with experimental studies of the effect of mutations on the translocation efficiency. Overall, we believe that the most significant value of our work is in introducing a powerful new way of analyzing translocation processes and thus opening the way for a more systematic analysis of the emerging structural and biochemical results.

It is quite significant that the main physics of the translocation process could be simulated while considering only the electrostatic energy of the system plus the barriers for the chemical transformations. The restoring force of the electrostatic free energy reflects van der Waals repulsive energies and entropic effects, but they follow more or less the linear response approximation and establish the electrostatic reorganization energy and the effective dielectric constant. The finding that electrostatic effects can provide a powerful structure-function correlation for molecular motors and other systems is consistent with other studies [63, 67].

4.4 Methods

To simulate the translocation process we need to generate an effective free-energy surface and to simulate the dynamics on this surface. Obviously a full microscopic evaluation of the relevant free-energy surface is too challenging, in part because the full structure of the complex is not available and in part because of extreme convergence problems. At present, we believe that the most effective strategy is to focus on the electrostatic contribution to the free energy of the system, and this is done here with the PDLD/S-LRA approach [33].
The PDLD/S-LRA method is described in Chapter 3. Here, we used this approach with a dielectric constant εp = 20. This high value reflects the fact that we deal with a highly charged system and that our regular PDLD/S treatment usually considers charge-charge interactions macroscopically, using another dielectric constant (εeff) with a high value of 40. The nature of these dielectric constants and the justification for their values are considered extensively in other studies [67]. Furthermore, we used the PDLD/S-LRA treatment only for residues within a cylinder of radius 18 Å placed along the helicase channel, and then evaluated the effect of the charges on the more distant residues using the macroscopic Coulomb's law with εeff = 40. This type of treatment has been validated in extensive studies of mutational effects [67].

We also used, in an initial screening, a simplified treatment based on the evaluation of electrostatic group contributions [43]. This approach evaluates the electrostatic group contributions to the binding energy by scaling the electrostatic interactions with a dielectric constant εx, using εx ≈ 4 for polar residues and εx = εeff ≈ 40 for ionized residues. This approach was examined in several test cases [43] and provides reasonable results for an initial screening.

The present work has not considered van der Waals steric forces, because calculations that include such interactions converge extremely slowly and would give meaningful results only after free-energy perturbation calculations that are not practical at present for a system of this size. Fortunately, however, studies of many charged systems have shown that once the steric effects are sampled correctly, the main free-energy contribution comes from the electrostatic interactions [67].

The dynamics of the effective coordinates of the system was explored by introducing an LD approach similar to the one used in our studies of proton translocation processes [67].
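To make the idea of running LD on an effective free-energy surface more concrete, the following is a minimal one-dimensional sketch. It is not the scheme used in this work: it assumes overdamped Langevin (Brownian) dynamics integrated with the Euler-Maruyama method on a hypothetical double-well potential, in reduced units, and all names and parameter values (`double_well_force`, `first_passage_steps`, `kT`, `gamma`, `dt`) are illustrative assumptions.

```python
import math
import random


def double_well_force(x, barrier):
    """Force for U(x) = barrier * (x^2 - 1)^2 (minima at x = -1 and +1, top at x = 0)."""
    return -4.0 * barrier * x * (x * x - 1.0)


def first_passage_steps(barrier, kT=0.6, gamma=1.0, dt=1e-3,
                        max_steps=2_000_000, rng=None):
    """Steps needed to go from the left minimum to the right one.

    Overdamped Langevin dynamics in reduced units (an assumption of this
    sketch). Returns None if no crossing occurs within max_steps.
    """
    rng = rng or random.Random(0)
    x = -1.0
    noise_scale = math.sqrt(2.0 * kT * dt / gamma)  # fluctuation-dissipation relation
    for step in range(1, max_steps + 1):
        drift = double_well_force(x, barrier) * dt / gamma
        x += drift + noise_scale * rng.gauss(0.0, 1.0)
        if x >= 1.0:
            return step
    return None


# Higher barriers should, on average, lengthen the crossing time.
for barrier in (1.0, 2.0):
    runs = [first_passage_steps(barrier, rng=random.Random(seed)) for seed in range(5)]
    crossed = [r for r in runs if r is not None]
    if crossed:
        mean_steps = sum(crossed) / len(crossed)
        print(f"barrier {barrier}: mean first-passage time ~ {mean_steps * 1e-3:.1f} (reduced units)")
```

In the actual model the dynamics runs on a two-dimensional surface in the protein and DNA coordinates with effective frictions and masses; this sketch only illustrates how barrier height and friction enter a Langevin-type propagation.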
That is, to explore the time dependence of the coupled protein-DNA motions, we approximated the effective surface obtained by the PDLD/S-LRA approach by a multi-minima empirical valence bond (EVB)-type potential surface. In this way the system is represented by a mixing potential of the form

$$H_{lm,lm} = \varepsilon_{lm} \approx \frac{\hbar\omega_Q}{2}\left(Q - \delta_Q^{l}\right)^2 + \frac{\hbar\omega_R}{2}\left(R - \delta_R^{m}\right)^2 + \alpha_{l,m} \qquad (4.1)$$

where Q and R are the effective dimensionless coordinates of the protein (solvent) and the DNA, respectively. R is related to the dimensional coordinate, R′, by $R = R'\sqrt{M\omega_R/\hbar}$, whereas Q is defined by $Q = -(\varepsilon_{2,m} - \varepsilon_{1,m})_{el}/(\hbar\omega_Q\delta_Q)$. Here, l = 1, 2, 3 for the ATP, ADP and empty forms, respectively, whereas m = 0, 1, 2, 3 for the different positions of the DNA. Finally, $\alpha_{l,m}$ is the difference between the minima of the diagonal energies. The actual potential surface is obtained by diagonalizing the system Hamiltonian,

$$H C_g = E_g C_g \qquad (4.2)$$

Based on the effective potential surface from Eq. 4.2, it is possible to run Langevin dynamics (LD) simulations and to explore the time dependence of the translocation process.

Chapter 5 Conclusions, discussions and future directions

Based on the recent biochemical and structural information on the LTag helicase, this thesis presented the first simulation study of a ring-shaped helicase motor, the SV40 LTag helicase. The study integrated various established molecular simulation methods and proposed two new protocols for the structure-function study of proteins with cooperative movement. The proposed probabilistic model, ATP binding model and translocation model, obtained from intensive simulation, enhance our understanding of the fundamental molecular mechanism underlying the directional movement of ring-shaped helicase motors.

5.1 Conclusions and discussions

In this thesis, we first studied the ATPase and helicase activity results from the biochemical experiments, and proposed a probabilistic model for the cooperative conformational change in the hexameric ring.
The model connects the microscopic activity assumptions about the hexameric ring configuration with the macroscopic activity curve from the assay, so it can be applied to explain the underlying cooperative mechanism. Based on the probabilistic model we derived six candidate models to fit the ATPase and helicase activity data. The fitting results for four groups of ATPase mutant doping experiments indicated that, in the presence of DNA, the helicase monomers hydrolyze the nucleotides according to a semi-sequential model. Without DNA, the helicase hydrolyzes ATP according to a linear model, which indicates that hydrolysis may not require formation of a hexameric ring. The fitting results for four groups of helicase mutant doping experiments show different cooperative models depending on the type of DNA present in the assay. When viral origin DNA is present, the helicase monomers translocate along and unwind the DNA according to a sequential model. When fork DNA is present, the helicase monomers cooperate in a semi-sequential pattern. The sequential pattern indicates that all six monomers are required to initiate DNA unwinding, while the semi-sequential pattern implies that two inactive adjacent monomers are tolerated for DNA unwinding. This probabilistic framework can be applied to the cooperative model of any ring-shaped helicase.

We then proposed a three-stage ATP binding model based on dynamics simulation. We built a pre-Apo model that placed the ATP outside the binding pocket and then used the non-equilibrium targeted molecular dynamics (TMD) method to simulate a dynamic trajectory. The trajectory suggested a three-stage ATP binding model in which two key states, the weak binding state and the tight binding state, were observed. The ATP docked to the pocket in the weak binding state and was eventually “locked” into the binding pocket by three pairs of residues with charge-charge interactions across the pocket.
Such a “cross-locking” ATP binding process is similar to the “binding zipper” model reported for the F1-ATPase hexameric motor. The simulation also illustrated a coordination transition mechanism of Mg²⁺ in the Mg-ATP complex. During binding, the coordinated water molecules were gradually stripped off and replaced by residues in the binding pocket. The ATP-Mg complex exchanged its hydrogen bonds to hydration waters for bonds to the catalytic site. These local movements triggered the large conformational changes of the LTag helicase. This simulation study also showed that the DR/F-loop close to the C-terminus contracts faster than the β-hairpin in the central channel. The combined effect resulted in a screw-like movement of the ssDNA along the central channel. The ATP binding process and the accompanying conformational changes in the context of a hexamer led to a refined cooperative iris model.

We also quantitatively evaluated the interaction energy profile using the key structures of the binding process. The linear response approximation version of the semi-microscopic Protein Dipoles Langevin Dipoles method (PDLD/S-LRA) was adopted to calculate the binding energy. The two significant energy barriers on the profile are consistent with the coordination change of the magnesium cation and with the configurational preparation for nucleophilic attack by the apical water. The overall binding energy released is of the same order of magnitude as that in water.

One may question whether the TMD method artificially forces the system across energy barriers. In this case, the structural divergence between the Apo state and the ATP-bound state is relatively small (the RMSD is about 4 Å), so a 1 ns trajectory can capture the major conformational change. Moreover, all six ATPs showed a similar binding pattern, from docking to the weak binding stage to the strong binding stage, which is consistent with the experimental observation of the nucleotide binding/escaping pathway.
These results indicate that the non-equilibrium TMD method, when combined with the normal molecular dynamics method, is efficient in capturing the major conformational change in a system with relatively small conformational divergence.

In Chapter 4, we simulated the electrostatic guidance of the helicase translocation and proposed a unidirectional ssDNA translocation model. The molecular origin of the helicase action was explored. A model was built based on the different crystal structures of the LTag hexameric helicase and the single-stranded DNA from the E1 protein of the same helicase superfamily. The single-stranded DNA was simplified as a sequence of ionized phosphates on the backbone. The coupling between the conformational change of the protein structure and the vectorial translocation of the DNA was evaluated using a 2D effective electrostatic free-energy surface. The simulated motion along the free-energy surface resulted in a vectorial translocation of the DNA. The electrostatic energy of the system appears to reproduce the directionality of this process. We then used a renormalization method to study the time dependence of the translocation. The estimated translocation time is of the same order of magnitude as that of a similar helicase motor in bacteriophage T7. Thus, we provided a consistent structure-based molecular description of the energetics and dynamics of the translocation process. This analysis may have general implications for relating structural models to translocation directionality in helicases and other DNA translocases.

5.2 Future directions

There are quite a few interesting directions to be explored in the future, for example, the ATP hydrolysis process. The LTag helicase binding pocket is a typical ATP binding pocket with Walker A, Walker B and sensor motifs. One prominent direction is to simulate the ATP hydrolysis reaction, which can be segmented into several steps based on the possible intermediate resonance states.
The empirical valence bond (EVB) method incorporates quantum mechanics to capture the energy of covalent bond breaking and formation, and can therefore be applied to simulate the reaction in the hydrolysis process. Based on the energies calculated with EVB, one can identify the rate-limiting steps of the reaction. One may also mutate the residues and simulate the same reaction. The results might indicate the electrostatic and steric contributions of the residues in the reaction site, so that their functions in the pocket can be studied systematically in parallel with biochemical experiments.

Another direction is to use the latest coarse-grained (CG) model to further investigate the DNA translocation and unwinding process. The CG model has recently been used successfully to investigate the F1-ATPase energy profile [44]. Some preliminary results, as well as the theoretical model, indicate that the allosteric movement could be close to the semi-sequential pattern. The translocation combines shifting and rotational movement. Moreover, the ssDNA model can be refined to capture the interactions of both the phosphates and the bases. After building the refined model, one might need to explore the possible configurations (between ssDNA and helicase) before arriving at a rough idea of the detailed mechanism. This can be done in two ways: 1) try some hypothetical patterns, such as sequential (one monomer transforms to the next nucleotide binding state after its adjacent monomer starts to transform to that state) or semi-sequential (a few monomers transform after one of their adjacent monomers starts to transform), and see whether the energy map agrees with one of these patterns; 2) use Monte Carlo simulation to sample the possible DNA/protein configuration space, with the major electrostatic interaction energy used to guide the Monte Carlo sampling. In the first method, the hypothetical model should be based on accurate analysis and modeling of the interaction.
The second method might take longer because of the large configuration space. One might first build a coarse map of the configuration space and then advance to a map with a refined grid. Overall, the two directions should be able to capture more refined details underlying the DNA translocation movement.

References

1. Aci-Sèche, S., M. Genest, and N. Garnier, Ligand entry pathways in the ligand binding domain of PPARγ receptor. FEBS Letters, 2011. 585(16): p. 2599-2603.
2. Antes, I., et al., The Unbinding of ATP from F1-ATPase. Biophysical Journal, 2003. 85: p. 695-706.
3. Åqvist, J., V.B. Luzhkov, and B.O. Brandsdal, Ligand Binding Affinities from MD Simulations. Accounts of Chemical Research, 2002. 35(6): p. 358-365.
4. Arthur, A.K., A. Hoss, and E. Fanning, Expression of simian virus 40 T antigen in Escherichia coli: localization of T-antigen origin DNA-binding domain to within 129 amino acids. Journal of Virology, 1988. 62: p. 1999-2006.
5. Beke-Somfai, T., P. Lincoln, and B. Norden, Mechanical Control of ATP Synthase Function: Activation Energy Difference between Tight and Loose Binding Sites. Biochemistry, 2010. 49(3): p. 401-403.
6. Braun-Sand, S., M. Strajbl, and A. Warshel, Studies of Proton Translocations in Biological Systems: Simulating Proton Transport in Carbonic Anhydrase by EVB-Based Models. Biophysical Journal, 2004. 87(4): p. 2221-2239.
7. Brooks, B.R., et al., CHARMM: a program for macromolecular energy, minimization and dynamics calculations. J. Comput. Chem., 1983. 4: p. 187-217.
8. Campbell, K.S., et al., DnaJ/hsp40 chaperone domain of SV40 large T antigen promotes efficient viral DNA replication. Genes Dev., 1997. 11: p. 1098-1110.
9. Chen, Z., H. Yang, and N.P. Pavletich, Mechanism of homologous recombination from the RecA-ssDNA/dsDNA structures. Nature, 2008. 453(7194): p. 489-494.
10. Cheng, X., et al., Targeted Molecular Dynamics Study of C-Loop Closure and Channel Gating in Nicotinic Receptors.
PLoS Comput Biol, 2006. 2(9): p. e134.
11. DeLano, W.L., The PyMOL Molecular Graphics System. 2002.
12. Enemark, E.J. and L. Joshua-Tor, Mechanism of DNA translocation in a replicative hexameric helicase. Nature, 2006. 442(7100): p. 270-275.
13. Gai, D., Y.P. Chang, and X.S. Chen, Origin DNA melting and unwinding in DNA replication. Current Opinion in Structural Biology, 2010. 20(6): p. 756-762.
14. Gai, D., et al., Mechanisms of conformational change for a replicative hexameric helicase of SV40 large tumor antigen. Cell, 2004. 119: p. 47-60.
15. Greenleaf, W., et al., Systematic study of the functions for the residues around the nucleotide pocket in simian virus 40 AAA+ hexameric helicase. J. Virol., 2008. 82(12): p. 6017-6023.
16. Hanson, P. and S. Whiteheart, AAA+ proteins: have engine, will work. Nat. Rev. Mol. Cell Biol., 2005. 6(7): p. 519-529.
17. Hopfner, K.-P. and J. Michaelis, Mechanisms of nucleic acid translocases: lessons from structural biology and single-molecule biophysics. Current Opinion in Structural Biology, 2007. 17(1): p. 87-95.
18. Huang, S., K. Weisshart, and E. Fanning, Characterization of the nucleotide binding properties of SV40 T antigen using fluorescent 3'(2')-O-(2,4,6-trinitrophenyl)adenine nucleotide analogues. Biochemistry, 1998. 37(44): p. 15336-44.
19. Humphrey, W., A. Dalke, and K. Schulten, VMD: Visual Molecular Dynamics. Journal of Molecular Graphics, 1996. 14: p. 33-38.
20. Isralewitz, B., M. Gao, and K. Schulten, Steered molecular dynamics and mechanical functions of proteins. Current Opinion in Structural Biology, 2001. 11(2): p. 224-230.
21. Johnson, D.S., et al., Single-Molecule Studies Reveal Dynamics of DNA Unwinding by the Ring-Shaped T7 Helicase. Cell, 2007. 129(7): p. 1299-1309.
22. Jones, A., O Software. 2008, Biomedical Centre, Institute of Cell and Molecular Biology, Uppsala University.
23. Jones, P. and A.
George, Molecular-Dynamics Simulations of the ATP/apo State of a Multidrug ATP-Binding Cassette Transporter Provide a Structural and Mechanistic Basis for the Asymmetric Occluded State. Biophysical Journal, 2011. 100(12): p. 3025-3034.
24. Jorgensen, W.L., et al., Comparison of simple potential functions for simulating liquid water. J. Chem. Phys., 1983. 79: p. 926-935.
25. Kamerlin, S.C.L., et al., Coarse-Grained (Multiscale) Simulations in Studies of Biophysical and Chemical Systems. Annual Review of Physical Chemistry, 2011. 62(1): p. 41-64.
26. Kamerlin, S.C.L. and A. Warshel, On the Energetics of ATP Hydrolysis in Solution. The Journal of Physical Chemistry B, 2009. 113(47): p. 15692-15698.
27. Kim, D.-E., M. Narayan, and S.S. Patel, T7 DNA Helicase: A Molecular Motor that Processively and Unidirectionally Translocates Along Single-stranded DNA. Journal of Molecular Biology, 2002. 321(5): p. 807-819.
28. Kovall, R. and B.W. Matthews, Toroidal Structure of λ-Exonuclease. Science, 1997. 277(5333): p. 1824-1827.
29. Kuriyan, J. and M. O'Donnell, Sliding Clamps of DNA Polymerases. Journal of Molecular Biology, 1993. 234(4): p. 915-925.
30. Leach, A.R., Molecular Modelling: Principles and Applications. 2nd ed. 2001: Prentice Hall.
31. Lee, F.S., et al., Calculations of antibody-antigen interactions: microscopic and semi-microscopic evaluation of the free energies of binding of phosphorylcholine analogs to McPC603. Protein Engineering, 1992. 5(3): p. 215-228.
32. Lee, F.S., et al., Calculations of antibody-antigen interactions: microscopic and semi-microscopic evaluation of the free energies of binding of phosphorylcholine analogs to McPC603. Protein Engineering, 1992. 5(2): p. 215-228.
33. Lee, F.S., Z.T. Chu, and A. Warshel, Microscopic and Semimicroscopic Calculations of Electrostatic Energies in Proteins by the POLARIS and Enzymix Programs. J. Comp. Chem., 1993. 14(2): p. 161-185.
34.
Li, D., et al., The Structure of the Hexameric Replicative Helicase of SV40 Large Tumor Antigen. Nature, 2003. 423: p. 512-518.
35. Liao, J.C., et al., Mechanochemistry of T7 DNA helicase. J. Mol. Biol., 2005. 350: p. 452-475.
36. Liao, J.C., et al., The conformational states of Mg.ATP in water. Eur. Biophys. J., 2004. 33(1): p. 29-37.
37. Liu, H., et al., Simulating the electrostatic guidance of the vectorial translocations in hexameric helicases and translocases. Proc. Natl. Acad. Sci., 2009. 106: p. 7449-7454.
38. Lohman, T.M. and K.P. Bjornson, Mechanisms of helicase-catalyzed DNA unwinding. Annual Review of Biochemistry, 1996. 65: p. 169-214.
39. Ma, J., et al., A Dynamic Analysis of the Rotation Mechanism for Conformational Change in F1-ATPase. Structure, 2002. 10: p. 921-931.
40. Ma, J. and M. Karplus, Molecular switch in signal transduction: reaction paths of the conformational changes in ras p21. Proc. Natl. Acad. Sci., 1997. 94: p. 11905-11910.
41. Ma, J., et al., A Dynamic Model for the Allosteric Mechanism of GroEL. J. Mol. Biol., 2000. 302: p. 303-313.
42. MacKerell, A.D., et al., All-atom empirical potential for molecular modeling and dynamics studies of proteins. J. Phys. Chem. B, 1998. 102: p. 3586-3616.
43. Muegge, I., H. Tao, and A. Warshel, A fast estimate of electrostatic group contributions to the free energy of protein inhibitor binding. Prot. Eng., 1997. 10: p. 1363-1372.
44. Mukherjee, S. and A. Warshel, Electrostatic origin of the mechanochemical rotary mechanism and the catalytic dwell of F1-ATPase. Proceedings of the National Academy of Sciences, 2011. 108(51): p. 20550-20555.
45. Myong, S., et al., Spring-Loaded Mechanism of DNA Unwinding by Hepatitis C Virus NS3 Helicase. Science, 2007. 317(5837): p. 513-516.
46. Oster, G. and H. Wang, Reverse engineering a protein: the mechanochemistry of ATP synthase. Biochim. Biophys. Acta, 2000. 1458: p. 482-510.
47. Paci, E. and M.
Karplus, Forced unfolding of fibronectin type 3 modules: an analysis by biased molecular dynamics simulations. Journal of Molecular Biology, 1999. 288(3): p. 441-459. 48. Patel, S.S. and K.M. Picha, Structure and function of hexameric helicases. Annual Review of Biochemistry, 2000. 69: p. 651-697. 49. Perdih, A., et al., Targeted Molecular Dynamics Simulatin Studies of Binding and Conformational Changes in E. coli MurD. PROTEINS: Structure, Function and Genetics, 2007. 68: p. 243-254. 50. R Development Core Team, R: A Language and Environment for Statistical Computing. 2008. 86 51. Ramakrishnan, C., V.S. Dani, and T. Ramasarma, A conformational analysis of Walker motif A [GXXXXGKT (S)] in nucleotide-binding and other proteins. Protein Eng., 2002. 15(10): p. 783-798. 52. Roca, M., et al., On the relationship between folding and chemical landscapes in enzyme catalysis. Proceedings of the National Academy of Sciences, 2008. 105(37): p. 13877-13882. 53. Ryckaert, J.P., G. Ciccotti, and H.J.C. Berendsen., Numerical-integration of cartesian equations of motion of a system with constraints - molecular-dynamics of N-Alkanes. J. Comput. Phys., 1977. 23: p. 327-341. 54. Schlitter, J., et al., Targeted molecular dynamics simulation of conformational change: application to the T-R transition in insulin. Mol. Sim 1993. 10 p. 291- 308. 55. Sclafani, R.A., R.J. Fletcher, and X.S. Chen, Two heads are better than one: regulation of DNA replication by hexameric helicases. Genes & Development, 2004. 18(17): p. 2039-2045. 56. Sham, Y.Y., et al., Examining methods for calculations of binding free energies: LRA, LIE, PDLD-LRA, and PDLD/S-LRA calculations of ligands binding to an HIV protease. Proteins, 2000. 39(4): p. 393-407. 57. Shan, Y., et al., How Does a Drug Molecule Find Its Target Binding Site? Journal of the American Chemical Society, 2011. 133(24): p. 9181-9183. 58. 
Shen, J., et al., The roles of the residues on the channel 尾-hairpin and loop structures of simian virus 40 hexameric helicase. Proceedings of the National Academy of Sciences of the United States of America, 2005. 102(32): p. 11248- 11253. 59. Shen, J., et al., The roles of the residues on the channel beta-hairpin and loop structures of simian virus 40 hexameric helicase. . PNAS., 2005. 102: p. 11248- 11253. 87 60. Shurki, A. and A. Warshel, Why does the Ras Switch “Break” By Oncogenic Mutations? PROTEINS: Structure, Function and Genetics, 2004. 55: p. 1-10 61. Singleton, M.R., M.S. Dillingham, and D.B. Wigley, Structure and Mechanism of Helicases and Nucleic Acid Translocases. Annual Review of Biochemistry, 2007. 76: p. 23-50. 62. Smelkova, N.V. and J.A. Borowiec, Synthetic DNA Replication Bubbles Bound and Unwound with Twofold Symmetry by a Simian Virus 40 T-Antigen Double Hexamer. Journal of Virology, 1998. 72(11): p. 8676-8681. 63. S ̆trajbl, M., A. Shurki, and A. Warshel, Converting conformational changes to electrostatic energy in molecular motors: The energetics of ATP synthase. Proc. Natl. Acad. Sci., 2003. 100(25): p. 14834-14839. 64. van der Vaart, A. and M. Karplus, Simulation of conformational transitions by the restricted perturbation-targeted molecular dynamics method. J Chem Phys. , 2005. 122(11): p. 114903. 65. Walker, J., et al., Distantly related sequences in the alpha- and beta-subunits of ATP synthase, myosin, kinases and other ATP-requiring enzymes and a common nucleotide binding fold. EMBO Journal, 1982. 1(8): p. 945–951. 66. Warshel, A. and S. Russell, Calculations of electrostatic interactions in biological systems and in solutions. . Q Rev Biophys., 1984. 17(3): p. 283-422. 67. Warshel, A., et al., Modeling electrostatic effects in proteins. Biochimica et Biophysica Acta (BBA) - Proteins & Proteomics, 2006. 1764(11): p. 1647- 1676. 68. Weber, J., et al., Mg2+ Coordination in Catalytic Sites of F1-ATPase. Biochemistry, 1998. 37(2): p. 
608 -614. 69. White, M.K., et al., Human polyomaviruses and brain tumors. Brain Research Reviews, 2005. 50(1): p. 69-85. 88 70. Xiang, Y., et al., Simulating the Effect of DNA Polymerase Mutations on Transition-State Energetics and Fidelity: Evaluating Amino Acid Group Contribution and Allosteric Coupling for Ionized Residues in Human Pol β†. Biochemistry, 2006. 45(23): p. 7036-7048. 71. Yang, L.-J., et al., Steered Molecular Dynamics Simulations Reveal the Likelier Dissociation Pathway of Imatinib from Its Targeting Kinases c-Kit and Abl. PLoS ONE, 2009. 4(12): p. e8470. 72. Yea, J., et al., RecA-like motor ATPases-lessons from structures Biochim Biophys Acta., 2004. 1659(1): p. 1-18. 73. Yu, J., T. Ha, and K. Schulten, Structure-Based Model of the Stepping Motor of PcrA Helicase. Biophysical Journal, 2006. 91(6): p. 2097-2114. 74. Yu, J., T. Ha, and K. Schulten, How Directional Translocation is Regulated in a DNA Helicase Motor. Biophysical Journal, 2007. 93(11): p. 3783-3797. 75. Zhang, L. and J. Hermans, Hydrophilicity of cavities in proteins. Proteins, 1996. 24: p. 433-438. 89 Appendix A: R 2 of the allosteric models Table A.1: R 2 value of six models in fitting helicase inactivation experiments R 2 Model 1 Model 2 Model 3 Model 4 Model 5 Model 6 ATPase 0.653 0.953 0.891 0.993 0.932 0.881 ATPase + ssDNA 0.707 0.980 0.831 0.975 0.968 0.928 ATPase + folk DNA 0.891 0.984 0.634 0.862 0.988 0.993 ATPase + orig. DNA 0.910 0.975 0.598 0.835 0.982 0.995 Helicase + fork DNA 0.939 0.960 0.486 0.782 0.969 0.988 Helicase + orig. DNA 0.994 0.861 0.452 0.692 0.873 0.912 Helicase + fork DNA* 0.912 0.965 0.498 0.792 0.976 0.994 Helicase + orig. DNA* 0.983 0.749 0.299 0.550 0.756 0.803 * indicate the K512/H513 mutation 90 Appendix B: Interface distance Figure B.1: The gap size variation after transforming the state of one monomer. 
(a) The gap size variation of Lock2 (Cα distance between ASP429 and ARG418) and (b) of Lock3 (Cα distance between ASP474 and ARG540) after one monomer is altered to the ATP-bound state. The average gap size is plotted in red, the trans-side gap in blue, and the cis-side gap in green.
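The gap size in Figure B.1 is defined as a Cα-Cα distance between two named residues. A minimal sketch of how such a distance could be measured from PDB-format coordinates follows. The helper names (`ca_coordinates`, `gap_size`, `atom_line`), the chain labels, and the toy coordinates are all hypothetical and are not taken from the thesis; the sketch also assumes the two residues sit on neighboring chains.

```python
import math

def ca_coordinates(pdb_text):
    """Map (chain, residue name, residue number) -> C-alpha (x, y, z)
    using the fixed-column layout of PDB ATOM records."""
    coords = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            key = (line[21], line[17:20].strip(), int(line[22:26]))
            coords[key] = (float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54]))
    return coords

def gap_size(coords, residue_a, residue_b):
    """Euclidean C-alpha to C-alpha distance, in the units of the input."""
    return math.dist(coords[residue_a], coords[residue_b])

def atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Build one ATOM record with the standard column positions (toy data only)."""
    return (f"ATOM  {serial:5d} {name:^4s} {resname:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

# Hypothetical coordinates: ASP429 on chain A, ARG418 on chain B.
toy_pdb = "\n".join([
    atom_line(1, "N", "ASP", "A", 429, 1.0, 1.0, 1.0),
    atom_line(2, "CA", "ASP", "A", 429, 0.0, 0.0, 0.0),
    atom_line(3, "CA", "ARG", "B", 418, 3.0, 4.0, 0.0),
])
coords = ca_coordinates(toy_pdb)
lock2_gap = gap_size(coords, ("A", "ASP", 429), ("B", "ARG", 418))
print(f"Lock2 gap: {lock2_gap:.3f}")
```

Applying `gap_size` to each frame of a trajectory and to each subunit interface would give curves of the kind described in the caption (average, trans side, cis side), under the assumptions stated above.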
Abstract
Helicases are motor proteins that use the energy derived from NTP binding and hydrolysis to translocate along and unwind DNA/RNA during replication. Understanding how the NTP hydrolysis cycle is coupled to DNA movement is key to understanding the mechanism of DNA replication in this molecular motor. The helicase domain of simian virus 40 large tumor antigen (SV40 LTag) is a ring-shaped AAA+ domain that participates in viral DNA replication and host cell growth control. Recent structural studies of SV40 LTag have provided a set of high-resolution structures in different nucleotide-binding states. Hence, in this thesis we use the LTag helicase as a model protein and present the first systematic simulation study of the mechanism of the LTag helicase motor. Our work includes three major sections. First, we model the LTag ATPase activity and helicase activity based on biochemical experimental results. This model indicates that the LTag helicase subunits work in highly cooperative patterns. When origin DNA is present, the helicase translocates DNA in a sequential pattern. When fork DNA is added, the helicase works in a semi-sequential pattern; otherwise, the subunit cooperativity is not significant. Second, we present the first simulation study of the ATP binding/hydrolysis process using a non-equilibrium molecular dynamics method; the results suggest a three-stage Locker-binding model. We evaluate the energy profile using the LRA version of the semi-microscopic Protein Dipoles-Langevin Dipoles method (PDLD/S). The energy profile matches the experimental results. Third, we investigate the electrostatic energy that guides the single-stranded DNA (ssDNA) translocation process and propose a unidirectional translocation model.
To accomplish this, an ssDNA/LTag complex model is built using structural information from the LTag helicase and the E1 protein-DNA complex. A two-dimensional effective electrostatic free-energy landscape is calculated from the ssDNA/LTag model, and the unidirectional model is proposed by evaluating this landscape. The time dependence of the coupled protein-DNA motion is explored by simulating the translocation process using a renormalized method. Altogether, our theoretical and simulation studies advance our understanding of the fundamental molecular mechanism underlying the directional movement of ring-shaped helicase motors.
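The record's tags mention a binomial model, and Table A.1 scores six models by R². The six models themselves are not described in this part of the record, so the following is only an illustrative sketch of how a subunit-mixing analysis of this kind could be set up. It assumes hexamers assemble randomly from wild-type and inactive subunits and compares two hypothetical cooperativity rules; the rule names, the function names, and the synthetic data are assumptions, not the thesis's models or measurements.

```python
from math import comb

N_SUBUNITS = 6  # LTag forms a hexameric ring

def hexamer_distribution(p_mutant):
    """P(k inactive subunits in a hexamer) under random (binomial) mixing."""
    return [comb(N_SUBUNITS, k) * p_mutant**k * (1 - p_mutant)**(N_SUBUNITS - k)
            for k in range(N_SUBUNITS + 1)]

def predicted_activity(p_mutant, activity_by_k):
    """Population-average activity for a rule giving the activity of a
    hexamer that contains k inactive subunits."""
    weights = hexamer_distribution(p_mutant)
    return sum(w * a for w, a in zip(weights, activity_by_k))

def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

# Two hypothetical rules (not the six models of Table A.1):
# each subunit works on its own, or one inactive subunit kills the ring.
INDEPENDENT = [(N_SUBUNITS - k) / N_SUBUNITS for k in range(N_SUBUNITS + 1)]
ALL_OR_NONE = [1.0] + [0.0] * N_SUBUNITS

# Synthetic "observations" generated from the all-or-none rule, for illustration.
fractions = [0.0, 0.1, 0.25, 0.5, 0.75, 1.0]
observed = [predicted_activity(p, ALL_OR_NONE) for p in fractions]

for name, rule in [("independent", INDEPENDENT), ("all-or-none", ALL_OR_NONE)]:
    fit = [predicted_activity(p, rule) for p in fractions]
    print(name, round(r_squared(observed, fit), 3))
```

Under these assumptions, a rule that matches the cooperativity of the ring gives an R² near 1, while a mismatched rule scores lower, which is the kind of comparison a table of per-model R² values summarizes.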
Asset Metadata
Creator
Shi, Yemin (author)
Core Title
Simulating the helicase motor of SV40 large tumor antigen
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Computational Biology and Bioinformatics
Publication Date
05/01/2012
Defense Date
05/01/2012
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
ATP binding, binomial model, computational simulation, DNA translocation, DNA unwinding, helicase motor, large tumor antigen, molecular dynamics, OAI-PMH Harvest, simian virus 40
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Chen, Xiaojiang S. (committee chair), Rohs, Remo (committee member), Warshel, Arieh (committee member)
Creator Email
biostanley@gmail.com,yeminshi@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-20225
Unique identifier
UC11288880
Identifier
usctheses-c3-20225 (legacy record id)
Legacy Identifier
etd-ShiYemin-698.pdf
Dmrecord
20225
Document Type
Dissertation
Rights
Shi, Yemin
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA