A novel reaction of mismatched cytosine-cytosine pairs associated with Fragile X
(USC Thesis Other)
A NOVEL REACTION OF MISMATCHED CYTOSINE-CYTOSINE PAIRS ASSOCIATED WITH FRAGILE X
by Rebecca Miranda Romero
A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, in Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (BIOCHEMISTRY)
December 1999
Copyright 1999 Rebecca Miranda Romero
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

UNIVERSITY OF SOUTHERN CALIFORNIA
THE GRADUATE SCHOOL
UNIVERSITY PARK
LOS ANGELES, CALIFORNIA 90007
This dissertation, written by Rebecca Miranda Romero under the direction of her Dissertation Committee, and approved by all its members, has been presented to and accepted by The Graduate School, in partial fulfillment of requirements for the degree of DOCTOR OF PHILOSOPHY
Dean of Graduate Studies
Date: November
DISSERTATION COMMITTEE
Chairperson

Acknowledgment
I would like to thank my advisor, Dr. Ian S. Haworth, and my committee members, Dr. Broek, Dr. Johnson, Dr. Reddy, and Dr. Stallcup, for their guidance, encouragement, and support. I am deeply and forever grateful that all of you believed me.

TABLE OF CONTENTS
Acknowledgment
List of Figures
List of Tables
Abstract
Chapter 1 Introduction
1.1 What is Fragile X?
1.1.1 Fragile X is Characterized By Three Tandemly Repeating Units of DNA
1.1.2 Fragile X is Caused by a Lack of Expression of the Mental Retardation Protein
1.2 Other Diseases Associated With Tandem Repeat Expansion
1.3 Current Questions
1.4 DNA Conformations
1.5 Proposed Methods of Expansion
1.5.1 Hairpins and Slippage Model
1.6 Proposed Method of Methylation
1.6.1 Active Methylation
1.6.2 Lack of Demethylation
1.7 Basis of Fragility
1.8 Significance of Mismatched Cytosines within the d(CCG)n Repeating Sequence of the C-Rich Strand of Fragile X
1.9 Limits of Analysis
1.10 Overview of Research
Chapter 2 Modeling of the Structure of the C-Rich Strand of Fragile-X: At Physiological pH, d(CCG)15 Forms a Hairpin and a Distorted Helix
2.1 Overview of Chapter
2.2 Introduction
2.2.1 Classification of Triplet Repeat Expansion Sequences
2.2.2 Hairpin Alignment
i (a) Alignment
ii (b) Alignment
2.2.3 Significance of the (b) Alignment In Longer Sequences
2.3 Method
2.3.1 Simulation (i): Tetraplex DNA Containing Protonated C-C Pairs
2.3.2 Simulation (ii): B-DNA Hairpin Including Intrahelical, Mismatched C-C Pairs
Simulation (iia): Nonprotonated Hairpins
Simulation (iib): B-DNA hairpin including intrahelical, protonated mismatched C-C pairs
2.3.3 Simulation (iii): Extended e-motif hairpin based on coordinates of original e-motif
2.3.4 Simulation (iv): Extended Idealized e-motif Hairpin Including C/C Stacking in the Minor Groove
2.4 Results
2.4.1 Experimental Background
i Experimental Evidence Of Secondary Structure
ii Experimental Evidence For a Hairpin
iii Experimental Evidence For the Ee-motif Hairpin
2.4.2 Computer Modeling
i Simulation (i): Tetraplex DNA Containing Protonated C-C Pairs
ii Simulation (ii): B-DNA hairpin including intrahelical, mismatched C-C pairs and intrahelical, protonated mismatched C-C pairs
iii Simulation (iii): Extended e-motif Hairpin Based on The Coordinates of The Original e-motif
iv Simulation (iv): Extended idealized e-motif hairpin including C/C stacking in the minor groove
2.5 Discussion
2.5.1 Significance and Relevance of a Hairpin Folded in a (b) Alignment to Fragile X
2.6 Summary
Chapter 3 Molecular Dynamics Simulations of DNA Molecules Containing d(GCC)n·d(GCC)n Fragments and C-C Mismatch Pairs
3.1 Overview of Chapter
3.2 Introduction
3.3 Methods
3.4 Results and Discussion
3.5 Summary
Chapter 4 Anomalous Crosslinking by Mechlorethamine of DNA Duplexes Containing C-C Mismatch Pairs
4.1 Overview of Chapter
4.2 Introduction
4.2.1 Mechlorethamine as a probe
4.3 Materials and Methods
4.4 Results
4.4.1 Anomalous crosslinking of a DNA duplex containing a C-C mismatch pair
4.4.2 Piperidine cleavage of the crosslinked C-C mismatch duplex gives fragments consistent with alkylation of the mismatched bases
4.4.3 N7-deazaguanine substitution does not influence formation of the crosslink in the C-C mismatched duplex
4.4.4 A mechlorethamine crosslink forms with any DNA duplex containing a single C-C mismatch pair
4.4.5 The formation of the C-C crosslink is pH dependent
4.4.6 The C-C crosslinked species is reactive with DMS at guanine bases adjacent to the C-C mismatch pair
4.4.7 The mechlorethamine C-C crosslink is inhibited by the DNA minor groove binder Hoechst 33258
4.5 Discussion
4.6 Summary
Chapter 5 Kinetics and Sequence Dependence of the DNA Crosslink formed by Mechlorethamine with Cytosine-Cytosine Mismatch Pairs
5.1 Overview of Chapter
5.2 Introduction
5.2.1 The Kinetics of Mechlorethamine
5.3 Materials and Methods
5.4 Results
5.4.1 The mechlorethamine C-C crosslink forms more rapidly than the 1,3 G-G crosslink, and reaches a higher final yield
5.4.2 The 1,3 G-G crosslink forms more slowly and reaches a lower final yield in a d(GCC)·d(GCC) sequence, compared to a d(GCC)·d(GGC) sequence
5.4.3 The mechlorethamine C-C crosslink is more stable than the 1,3 G-G crosslink
5.4.4 The amount of mechlorethamine C-C crosslink formed is dependent on the GC content of the base pairs flanking the C-C mismatch
5.4.5 Molecular dynamics simulations suggest the C-C mismatch pair is less mobile in a d(GCC)·d(GCC) sequence than in a d(ACT)·d(ACT) sequence
5.5 Discussion
5.5.1 Detectable amounts of 1,3 G-G crosslinks form after 10 hours in a d(GCC)·d(GCC) duplex fragment
5.5.2 The amount of the mechlorethamine C-C crosslink formed is reduced by a decreased GC:AT ratio in the bases flanking the C-C mismatch pair
5.6 Summary
Chapter 6 Mechlorethamine can crosslink multiple C-C mismatches in the d(GCC)n Trinucleotide repeat sequence
6.1 Overview and Introduction
6.2 Materials and Methods
6.3 Results
6.3.1 The mechlorethamine C-C Crosslink forms in multiple repeats with variable efficiency
6.3.2 The DPAGE mobility of the mechlorethamine C-C crosslinked species is dependent on the GC:AT/CC content of the base pairs flanking the C-C mismatch
6.3.3 Sequential replacement of A-T pairs by G-C pairs has a predictable effect on the amount of C-C crosslinked DNA and on the DPAGE mobility of the crosslink
6.4 Discussion
6.5 Summary
Chapter 7 Electrophoretic Mobility of Mechlorethamine Crosslinked DNA Duplexes Containing C-C Mismatches and 5'-End Labeled with 32P-Phosphate or Fluorescein Phosphoramidite
7.1 Overview of Chapter
7.2 Introduction
7.3 Material and Methods
7.4 Results
7.4.1 The electrophoretic mobility of a C-C crosslinked duplex is dependent on the position of the crosslink in the duplex, and on the location of the 32P-phosphate 5'-end label
7.4.2 Multiple DPAGE bands are observed for 32P-phosphate double-labeled duplexes
7.4.3 The influence of crosslink position on mobility is unchanged by inclusion of a 5'-fluorescein phosphoramidite label, but the influence of the label position on mobility within a duplex does change
7.4.4 Duplexes carrying both 5'-fluorescein phosphoramidite and 5'-32P-phosphate labels show an overall similar electrophoretic mobility
7.4.5 The relative mobility of two crosslinked duplexes with identical sequences flanking the C-C mismatch also depends on the position of the C-C mismatch
7.4.6 For 'symmetrical' crosslinked duplexes carrying 5'-fluorescein phosphoramidite labels the different mobility caused by labeling either end of the duplex occurs as a function of the duplex sequence
7.5 Discussion
7.6 Summary
7.7 Relevance to Fragile X
Chapter 8 Theoretical Computer Modeling of the C-C Crosslink
Overview and Introduction
8.1 Methods
8.2 Results
8.2.1 The mechlorethamine / DNA complex
8.2.2 The N3(Cyt)-mechlorethamine-N3(Cyt) crosslinked diadduct
8.2.3 DNA motion in response to mechlorethamine binding
8.3 Discussion
8.4 Summary and Conclusions
References
Abbreviations

List of Figures
Figure 1.1. DNA Conformations. (a) Structure of B-DNA. (b) Schematic of hairpin DNA. (c) Theoretical Triad DNA. Triangles represent base pairing. (d) Schematic of tetraplex/quadruplex.
Figure 1.2. Expansion Mechanisms. (A-D) Models for Slipped or S-DNA structure. (E) Expansion by hairpin formation within Okazaki Fragments.
Figure 1.3. Strand Breaking Mechanism of Expansion.
Figure 1.4. Theoretical Alignment of the Fragile X Strands During Slippage.
Figure 2.1. The (a) and (b) hairpin alignment of d(CCG)n.
Figure 2.2. Schematic of the structures used for the simulation of (i) tetraplex, and (ii) hairpin.
Figure 2.3. Schematic of the e-motif, extended (iii), and idealized e-motif (iv).
Figure 2.4. Single-stranded (CCG)15 forms two pH-dependent structures. (A) pH-dependent electrophoretic analysis. (B) Temperature-dependent electrophoretic analysis of ss(CCG)15. (C) UV absorbance melting profile of ss(CCG)15.
31 Figure 2.5 Possible arrangement of protonated cytosines, (a) Antiparallel and (b) parallel 33 Figure 2.6. Chemical modification with hydroxylamine at pH 8.5 reveals a hairpin in a (b) alignment...................................................................................................................................... 35 Figure 2.7. PI nuclease digestion of ss(CCG)|5 at pH 8.5................................................................. 38 Figure 2.8. PI nuclease digestion of ss(CCG)15 at pH 6.5. Results were similar at pH 7.5..........40 Figure 2.9. Possible conformations adopted by single-stranded (CCG)n.........................................41 Figure 2.9. continuedmotif..................................................................................................................42 Figure 2.10. Proposed strand and base arrangment for the (CCG)is tetraplex structure................45 Figure 2.11. Representative graphs of the various energy surfaces generated............................... 46 Figure 2.12. Construction of the Ee-motif......................................................................................... 49 Figure 2.13. Motion of the extrahelical cytosine bases during simulation (iii)Ee-motif............... 53 Figure 2.14 The Ee-motif and idealized Ee-motif. (A) The Ee-motif Ops starting structure and the 1-7 original interacting bases are highlighted. (B) Ee-motif lOOps structure and the new 1-4 interactions (C) The idealized Ee-motif.....................................................................................54 Figure 3.1. Possible hairpin alignments of d(CCG)n, for molecules where n, the number of CCG repeats, is odd or even................................................................................................................. 66 Figure 3.2. Hairpin alignment of d(CCG)l 1G used in the molecular dynamics simulation 71 Figure 3.3. 
Snapshots of (a) the d(G£C)*d(GCC) fragment of the 13mer duplex and (b) the d(GgCiuCn)»d(G:4C:5C26) fragment of the d(CCG)uG hairpin after Ops (canonical B-DNA conformation identical in each simulation), 60ps, I20ps and 180ps....................................... 72 Figure 3.4. Structures of (a) the d(GCC)*d(G£C) fragment of the 13mer duplex and (b) the d(Gu C inC. i )«d(G->jC-.<C ■ > , . ) fragment of the d(CCG)uG hairpin after I80ps of molecular dynamics, viewed looking into the major groove, and contrasted with a canonical B-DNA conformation (Ops).......................................................................................................................74 Figure 3.5. Guanine-guanine N7 to N7 distances in d(G£C)*d(G£C) fragments o f the 9mer and 13mer duplex and the d(CCG)l 1G hairpin................................................................................75 Figure 4.1. (a) Mechlorethamine (R=CH3 ) or chlorambucil (R=C6H4(CH2)3COOH) and (b) the mechlorethamine and chlorambucil 5'-GXC..5'-GYC 1,3 G-G crosslink.................................81 Figure 4.2. Mechlorethamine crosslinking of DNA duplexes of sequence d(CTCTCAGAGXCTCGTTCAG) d(CTGAACGAGYCTCTGAGAG) (Table 4.1, Series l).(A) Autoradiography of a 20% DPAGE gel of DNA duplexes where X-Y = C-G, C-A, C- C, T-G, T-A and T-C. (B) Autoradiography of a 20% DPAGE gel for duplexes X-Y = C-G, C-C and T-A in which either the top strand (lane T) or bottom strand (lane B) was labeled 8 8 vii Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure 4.2 continued. Mechlorethamine crosslinking of DNA duplexes of sequence d(CTCTCAGAGXCTCGTTCAG)d(CTGAACGAGYCTCTGAGAG) (C) Quantification of the mechlorethamine-crosslinked bands (D) Autoradiography of a 20% DPAGE gel for single strands or duplexes where X=C, Y =C............................................................................89 Figure 4.3. 
Piperidine cleavage of the crosslink bands from lanes T and B of Figure 4.2B..........91 for the crosslinked DNA duplexes dfCiTjCjT^sAftCTAnGeXioCn T^CuGuTu TibCnAiKGisJ^dCCioTiiGiiAijAi^isGjftAiTGigYwCjoTji C3 2 T3 3G 34A35G36A 37G 38) ( X-Y = C-G or C C, Table 4.1, Series 1). Lanes G and C are Maxam-Gilbert G and C reactions. Lanes T and B show the results of piperidine cleavage of the top strand (that containing X) and the bottom strand (that containing Y)..................................................................................91 Figure 4.4. Autoradiography of a 20% DPAGE gel following incubation with mechlorethamine of duplexes of sequence d(CTCTCAGAMXnTCGTTCAG)* d(CTGAACGANYmTCTGAGAG) (where X-Y = C-G or C-C, n and m = C, and M and/or N = G or D (N7-deazaguanine, Table 4.1, Series 2) and d(CTCTCACACCGTGGTTCAG)* d(CTGAACCACCGTGTGAGAG) (Series 2a)........................................................................92 Figure 4.5. Piperidine cleavage of the crosslink bands..................................................................... 93 Figure 4.6. Autoradiogram of a 20% DPAGE gel following incubation with mechlorethamine of duplexes d(CTCTCACAMCnTGGTTCAG)*d(CTGAACCANCmTGTGAGAG) (Table 4.1, Series 3) and d(CTCTCACGACTCGGTTCAG)*d(CTGAACCGACTCGTGAGAG) (Table 4 .1, Series 3a)...................................................................................................................96 Figure 4.7. Autoradiogram of 20% DPAGE gels following incubation with mechlorethamine or chlorambucil of duplexes of sequence d(CTCTCAGAGXCTCGTTCAG)* d(CTGAACGAGYCTCTGAGAG) (Table 4.1, Series 1), where X-Y = C-G, or C-C.........98 Figure 4.8. (A) Maxam-Gilbert sequencing gel of the piperidine cleavage products resulting from incubation of the chlorambucil and mechlorethamine-crosslinked C-C mismatch duplexes with DMS. 
(B) Autoradiography of a 20% DPAGE gel of the duplex of sequence d(CTCCCAATTCAATTCCCAG)*d(CTGGGAATTCAATTGGGAG) (Table 4.1, Series 4), following pre-incubation with Hoechst 33258, then incubation with lOOpM mechlorethamine....................................................................................................................... 1 0 0 Figure 5.1. The structure of mechlorethamine (a) and representations of the mechlorethamine crosslinks with (b) a d(GXC)«d(GYC) duplex fragment (a 1,3 G-G crosslink) and (c) a C-C mismatch pair............................................................................................................................. 106 Figure 5.2. The nitrogen mustard reaction.........................................................................................107 Figure 5.3. A. Autoradiogram of a 20% DPAGE gel following incubation for times of up to 120 minutes of 100pM mechlorethamine with duplexes la and lb (Table 5.1 ).B. Quantification of the autoradiogram showing the time course of total crosslink formation ...................... 113 Figure 5.4. A. Autoradiogram of a 20% DPAGE gel following incubation for times of up to 24 hours of lOOpM mechlorethamine with duplexes la and lb (Table 5.1).B. Quantification of the autoradiogram showing the time course............................................................................ 115 Figure 5.5. A.. Autoradiogram of a 20% DPAGE gel following incubation for 6 hours with lOOpM mechlorethamine of duplexes Ila to Ilk (Table 5.2). B Sequencing gel of the piperidine cleavage products excised from A............................................................................................ 119 Figure 5.6. Data from molecular dynamics simulations of duplexes Ila' and I l j '........................ 122 Figure 5.7. A. 
Plot of the standard deviations, Sigma, in A as a function of the mean distance C l’- C l’...............................................................................................................................................1 2 2 Figure 5.8. Snapshots taken from the molecular dynamic simultion.............................................. 123 Figure 6.1. Autoradiogram of a 20% DPAGE gel following incubation for 6 hours with lOOpM mechlorethamine of Series Iduplexes (Table 6.1). (A) Single isolated repeats. (B) Multiple and isolated repeats....................................................................................................................137 Figure 6.2. Autoradiogram of a 20% DPAGE sequencing gel........................................................ 138 viii Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure 6.3. Autoradiogram of a 20% DPAGE gel showing the variable mobility of the C-C crosslink, of duplexes from Table 6.1, Series 1.......................................................................140 Figure 6.4. Autoradiogram of a 20% DPAGE gel following incubation for 6 hours with 100pM mechlorethamine. (A) ccc at the bottom indicates the duplex with three repeats form Table 6.1, Series 1, a indiates the duplex from Table 6.2, Series 2, a. (B) Duplexes from Table 6.2, Series 2, a-e)............................................................................................................................... 142 Figure 7.1. Structure of the 5’-fluorescein label...............................................................................147 Figure 7.2. Autoradiography of a 20% DPAGE gel following incubation with lOOpM mechlorethamine of 3 2 P-labeled duplexes 10a, 7a and 3a (see Table 7.1 for sequence) 152 Figure 7.3. 
Autoradiography of a 20% DPAGE gel following incubation with lOOpM mechlorethamine of fluoroscein-labeled duplexes 10a, 7a and 3a (see Table 7.1 for sequence).................................................................................................................................... 154 Figure 7.4. Autoradiography of a 20% DPAGE gel following incubation with lOOpM mechlorethamine of duplexes 10a, 7a and 3a (see Table 7.1 for sequence) carrying simultaneous 3:P and fluorescein labels...................................................................................157 Figure 7.5. Side by side view of the double label and single 3 2 P label. T indicates that the top strand was labeled with 3 2 P, B indicates that the bottom strand was labeled with 3 2 P. Double indicates that both labels are present........................................................................................158 Figure7.6. Side by side view of the double label and single fluorescein-labeled duplexes. T indicates that the top strand was labeled with fluorescein, B indicates that the bottom strand was labeled with fluorescein. Double indicates that both labels are present........................ 158 Figure 7.7. Autoradiography of a 20% DPAGE gel following incubation with lOOgM mechlorethamine of 3 2 P-labeled duplexes 10b and 7b (see Table 7.1 for sequence)........... 160 Figure 7.8. Autoradiography of a 20% DPAGE gel following incubation with lOOpM mechlorethamine of fluoroscein-labeled duplexes 10c (see Table 7.1 for sequence).......... 161 Figure 7.9.Schematic representations of (a) 19 base pair DNA duplexes having a mechlorethamine C-C crosslink at base pair 10, base pair 7 and base pair 4 with a 3 2 P label.. .................................................................................................................................................... 163 Figure 7.10. 
Schematic representations of (a) 19 base pair DNA duplexes having a mechlorethamine C-C crosslink at base pair 10, base pair 7 and base pair 4, with a fluorescein label.
Figure 7.11. Schematic representations of (a) 19 base pair DNA duplexes having a mechlorethamine C-C crosslink at base pair 10, base pair 7 and base pair 4, with a fluorescein label and a 32P label.
Figure 8.1. (a) Dimethyl sulfate. (b) Hoechst 33258.
Figure 8.2. Mechlorethamine and cytosine numbering system. (a) Mechlorethamine. (b) Mechlorethamine monoadduct. (c) Mechlorethamine C-C crosslink diadduct.
Figure 8.3. Representative starting structure (0 ps) for the diadduct crosslink simulation. Shown is the top view and side view, respectively.
Figure 8.4. View of the motion of the cytosines and mechlorethamine after 100 ps.
Figure 8.5. View into the minor groove at 200 ps for both duplexes.
Figure 8.6. Comparison of the 0 ps and 200 ps structures for both duplexes.
Figure 8.7. Inter-base pair conformational parameters.
Figure 8.8. Base pair parameters for duplex ACT.
Figure 8.9. Base pair parameters for duplex GCC.

List of Tables
Table 1.1.
Diseases Associated with Tandem Repeat Expansions.
Table 1.2. Structural Features of A, Ideal B-DNA, Z-DNA, and Theoretical Triad DNA.
Table 2.1. Melting Temperatures of 15 Class I Triplet Repeat Sequences in ~1 mM Na+.
Table 4.1. DNA duplex sequences.
Table 5.1. Kinetic parameters and extent of reaction for mechlorethamine crosslinking of duplexes containing C-C mismatches and 1,3 G-G crosslink sites.
Table 5.2. Level of mechlorethamine crosslinking, electrophoretic mobility, and Tm of duplexes d(C TC TC A 4C 3M 2M lCn1 n2G 3G /rTCAG )*d(CTG AAC4C3N2NlCmI m 2 G 3T.,G A G A G ).
Table 6.1. Series 1: Level of mechlorethamine crosslinking, Tm, and electrophoretic mobility of duplexes d(CTCTC(GCC)nGTATC)·d(GATAC(GYC)nGAGAG).
Table 6.2. Series 2: Level of crosslinking and electrophoretic mobility of duplexes d(CTCCCM4M3M2M1Cn1n2n3n4CCCAG)·d(CTGGGN4N3N2N1Cm1m2m3m4GGGAG).
Table 7.1. DNA duplex sequences and shorthand notations.
Table 8.1. Atomic point charges (milli-electrons) obtained from the AM1-derived electrostatic potential for free mechlorethamine in the bis(chloroethyl) form, for the mechlorethamine-(N3)cytosine monoadduct, and for the cytosine(N3)-mechlorethamine-(N3)cytosine diadduct.

Abstract
Fragile X syndrome is the most common form of inherited mental retardation and is characterized by expansions of d(CGG)n·d(CCG)n trinucleotide repeat DNA sequences.
In the Fragile X sequence, strand separation of the duplex can occur, and the resultant d(CGG)n and d(CCG)n single strands can fold into hairpins. For d(CCG)n, the hairpin may form with a specific alignment containing repeating d(GCC)·d(GCC) 'duplex' fragments, giving a hairpin stem in which every third base pair is a C-C mismatch that may become extrahelical. Mismatched C-C base pairs have been associated with unusual secondary structures formed by DNA trinucleotide repeat sequences, with other fragile sites, and with reduced mismatch repair. The mechanism of trinucleotide repeat expansion, the cause of fragility, and the reason for reduced repair are unclear, but all may be due to unusual DNA conformations that can form during replication. To examine these conformations, computer modeling of the structure of the C-rich strand of Fragile X and probing of the structure of the GCC alignment with mechlorethamine were performed. This work has revealed a new reaction of mismatched cytosines with mechlorethamine. The reaction involves the formation of a new mechlorethamine crosslink species that crosslinks DNA preferentially between two mismatched cytosine bases, in one repeat fragment, regardless of the flanking base pairs. Kinetic and sequence dependence studies of the DNA crosslink formed by mechlorethamine with a cytosine-cytosine mismatch pair have determined that the efficiency of the crosslink depends on the bases flanking the mismatched cytosines. This observation implies that the conformation of the cytosines varies, because modeling of the crosslink shows that mechlorethamine requires a reactive distance of 5.1 angstroms for the crosslink to form. Mechlorethamine was also used as a probe on DNA fragments with increased numbers of mismatched cytosines found within the repeating sequence of Fragile X.
Mechlorethamine can crosslink multiple C-C mismatches in the d(GCC)n trinucleotide repeat sequence, with varying efficiency and varying electrophoretic mobility on a nondenaturing gel. These results suggest that as the number of repeats increases, the conformation of the mismatched cytosines varies.

Chapter 1 Introduction

1.1 What is Fragile X?
Fragile X syndrome is the single most common inherited cause of mental impairment. The prevalence of fragile X in the general population is estimated to be approximately one in 1,500 males and one in 2,500 females, and the syndrome is reported to affect all races and ethnic groups (Warren, 1994; Ashley and Warren, 1995; Kunst, 1996; Sutherland and Richards, 1995b; Timchenko and Caskey, 1996). Symptoms of fragile X syndrome include mental impairment, ranging from learning disabilities to mental retardation, attention deficit, hyperactivity, anxiety and unstable moods, and autistic-like behaviors. Boys are typically more severely affected than girls and usually have mental retardation. Only one-third to one-half of girls have significant intellectual impairment; the rest have either normal IQ or learning disabilities (Rousseau et al., 1991). Emotional and behavioral problems are common in both sexes.

1.1.1 Fragile X is Characterized By Three Tandemly Repeating Units of DNA
The Fragile X site, FRAXA, was first diagnosed based on the expression of a break or weakness on the bottom of the long arm of the X chromosome (Xq27.3), induced in cell cultures under conditions of folate deprivation. In 1991, the Fragile X Mental Retardation Gene 1, or FMR1, was identified and characterized by Verkerk et al. (1991). The mutation in the gene involves expansion of a tandemly repeating sequence of three nucleotides, CCG, in the 5'-untranslated region (Kremer et al., 1991; Verkerk et al., 1991; Yu et al., 1992).
This repeating sequence is polymorphic, with approximately 6-50 copies in unaffected individuals, interspersed with the sequence AGG (Eichler et al., 1994; Hirst et al., 1994; Kunst and Warren, 1994; Snow et al., 1994). The premutation range involves lengths of 50-230 uninterrupted repeat copies. In general, a person who carries the premutation does not have any characteristics of fragile X syndrome, but the stretch of DNA is prone to further expansion when it is passed from a woman to her children. Carrier men (transmitting males) pass the premutation to all their daughters but none of their sons. After passage through a female meiosis, the degree of instability is reported to be related to the length of the repeat (Weber, 1990), with the probability of a full mutation increasing with copy number (Fu et al., 1991; Yu et al., 1991). Upon transmission, affected individuals can have from 230 to 4000 copies (Rosenberg, 1996). In successive generations of affected families, the symptoms of Fragile X become progressively worse and the number of repeats progressively larger. This phenomenon is known as anticipation.

1.1.2 Fragile X is Caused by a Lack of Expression of the Mental Retardation Protein
In a full mutation, the CpG residues in the repeat and the adjacent CpG island within the promoter region of the FMR1 gene are hypermethylated (Oberle et al., 1991). Methylation of the promoter suppresses transcription and production of the Fragile X Mental Retardation Protein (FMRP). FMRP1 is thought to interact with and bind to two other uncharacterized gene products, FRX1 and FRX3 (Zhang et al., 1995). These proteins all bind to RNA (Ashley et al., 1993; Siomi et al., 1993; Siomi et al., 1995) and are found in the cytoplasm of the same type of cells. FMRP1 is expressed widely in the embryo, and after birth in the testis, uterus, and brain.
There is evidence that the protein's primary function is within the brain, and that it plays an important role in synaptic change and development (Weiler et al., 1994). Since expansion probably occurs in a multicellular embryo and the extent of expansion may vary from cell to cell, individuals often display somatic heterogeneity in allele size. Some affected individuals, termed mosaics, exhibit both a premutation and a full mutation in blood (Mila et al., 1996). These mosaics do not exhibit the same level of hypermethylation of FMR1 but still express reduced levels of mRNA, depending upon the repeat length. At the translational level, production of FMRP1 is suppressed by stalled ribosomes and their inability to traverse the CGG repeats (Feng et al., 1995, 1997).

1.2 Other Diseases Associated With Tandem Repeat Expansion
Table 1.1 lists diseases associated with repeat expansion. In addition to the disease-linked fragile sites FRAXA (Fu et al., 1991; Kremer et al., 1991), FRAXE (Knight et al., 1993), and FRA11B (Jacobsen syndrome) (Jones et al., 1995), expansions of CGG trinucleotide repeats have also been associated with the chromosomal fragile site FRAXF (Parish et al., 1994) and the autosomal fragile site FRA16A (Nancarrow et al., 1994). Expansions of 800-1000 repeats have been detected at each of these fragile sites.

Table 1.1.
Diseases Associated with Tandem Repeat Expansions
Disease | Chromosome | Inheritance | Repeat | Normal number | Disease number
Fragile X (A) (Verkerk et al., 1991; Kremer et al., 1991) | Xq27.3 | X-linked dominant | CGG | 6-54 | >200
Fragile X (E) (Verkerk et al., 1991; Knight et al., 1993) | Xq28 | X-linked | GCC | 6-25 | >200
Myotonic dystrophy (Brooks et al., 1992) | 19q13.3 | Autosomal dominant | CTG | 5-27 | >50
Spinal-bulbar muscular atrophy (LaSpada et al., 1991) | Xq | X-linked recessive | CAG | 13-30 | 39-60
Dentatorubral pallidoluysian atrophy (Koide et al., 1994) | 12 | Autosomal dominant | CAG | 8-25 | 54-68
Huntington's (The Huntington's Disease Collaborative Research Group, 1993) | 4p16.3 | Autosomal dominant | CAG | 11-39 | 36-121
Spinocerebellar ataxia type 1 (Orr et al., 1993; Chung et al., 1993) | 6p22-p23 | Autosomal dominant | CAG | 25-36 | 43-81
Colon cancer (Fearon et al., 1990; Aaltonen et al., 1993; Thibodeau et al., 1993) | 2p | | AT, CA | |

1.3 Current Questions
Some of the questions being actively researched on Fragile X are: What is the molecular basis of expansion? What causes hypermethylation, and how is this related to expansion? What causes the fragility within these sequences? Is there a relationship between the structure of these repeats and the questions above?

1.4 DNA Conformations
To relate structure to these questions, it is necessary to understand the conformational flexibility of DNA. The ideal Watson-Crick B-DNA helix is shown in Figure 1.1(a). Duplex DNA can also adopt other stable conformational states such as A-DNA (also adopted by RNA) and Z-DNA, and a theoretical structure, triad DNA. Table 1.2 lists some of the parameters of these conformations (Dickerson et al., 1982; Kuryavyi and Jovin, 1995). In addition to these structures, which involve two strands, DNA crystals have been grown that involve three and even four strands.
In the cell, it has been hypothesized that these DNA conformations may occur during recombination or expansion involving the folding of single strands. The folds that single strands can adopt have been the subject of intense study.

Table 1.2. Structural Features of A, Ideal B-DNA Duplex, Z-DNA, and Theoretical Triad DNA
Parameters | A | B | Z | Triad
Base Pairing Unit | 2 | 2 | 2 | 3
Units per Helical Turn | 11 | 10 | 12 | 6
Helix Rise Per Unit | 2.6 Å | 3.4 Å | 3.7 Å | 3.4 Å
Helical Twist Per Unit | 33° | 36° | 60° | 60°
Base Tilt Normal to the Helix Axis | 20° | 6U f 7

One single-stranded conformation, which is very "B-DNA like", is known as hairpin DNA (Figure 1.1(b)). The main difference between a hairpin and B-DNA is the loop region that is formed by the DNA folding over upon itself. One of the questions regarding the expansion diseases is the significance of the three repeating bases. To address this issue, a theoretical hairpin structure involving the sequence d(CGG)n has been proposed by Kuryavyi and Jovin (1995), involving the pairing of only two strands but requiring three bases to hydrogen bond in an alternating manner with the DNA backbone (Figure 1.1(c)). Single strands containing CGG have been shown to fold and form quadruplex/tetraplex structures (Fry and Loeb, 1994; Usdin and Woodford, 1995). A tetraplex involves base pairing of four strands of DNA. A schematic is shown in Figure 1.1(d). For the d(CGG)n strand, or G-rich strand, a tetraplex can be easily formed and is quite stable at physiological pH. To form a stable tetraplex on the C-rich strand, the strands of the sequence must fold over, twist, and hydrogen bond with protonated antiparallel cytosines. This structure is more probable at low pH and will be discussed in more detail in chapter 2. Kohwi et al. (1993) have described a Zn2+- and Co2+-dependent non-B-DNA conformation for CAG triplet repeats.
Other alternative (non-B) DNA structures such as cruciforms, left-handed Z-DNA, and intramolecular triplex structures can form in vivo in other defined ordered sequences known as DNA elements. These include inverted repeats, alternating purine-pyrimidine tracts [(GC) or (AC)], and homopurine-homopyrimidine tracts containing mirror repeat symmetry (e.g. (GA)n and (G)n), respectively (for a review, see Sinden, 1994). Spontaneous mutations in prokaryotic and eukaryotic (including human) cells are frequently associated with DNA sequence elements that can form alternative non-B-DNA structures (reviewed in Wells and Sinden, 1993).

Figure 1.1. DNA Conformations. (a) Structure of B-DNA. (b) Schematic of hairpin DNA showing strand arrangement. Arrows indicate the backbone of DNA in the 5' (tail) to 3' (head) direction. (c) Theoretical triad DNA. Triangles represent base pairing. (d) Schematic of tetraplex/quadruplex DNA that can form with a single strand. Squares represent base pairing.

1.5 Proposed Methods of Expansion
The mechanism of repeat expansion is completely unknown. Several hypotheses to explain this phenomenon have been proposed. One involves aberrant replication or recombination at slipped strand intermediates within the triplet repeats (Fu et al., 1991; Richards and Sutherland, 1992, 1994; Sinden and Wells, 1992; Wells and Sinden, 1993; Chong et al., 1994; Eichler et al., 1994; Jansen et al., 1994; Kunst and Warren, 1994; Snow et al., 1994; Kang et al., 1995). Slipped DNA, shown in Figure 1.2A-D, involves the loss of the original duplex by sliding and misalignment, which results in strand separation. The separated strands then have the potential to refold and create new base pairs upon themselves.
Although slipped strand DNA structures could theoretically form within long runs of triplet repeats, no conclusive evidence has ever been presented for such structures. Recently, Chen et al. (1998) may have provided some evidence by showing, using short oligonucleotides, that the Fragile X sequence can form a slippage structure that has a three-way junction consisting of two Watson-Crick d(GCC)n·d(GGC)n arms and a third ss(GCC)n hairpin arm. However, considerable evidence exists which suggests that the structure of CTG, CAG, CGG, and CCG single strands is an intrastrand duplex hairpin (Chen et al., 1995; Gacy et al., 1995; Mitas et al., 1995a,b, 1997; Smith et al., 1995; Gacy, 1998; Mariappan et al., 1998). This structure unites the sequences of the trinucleotide repeats in a mechanism of expansion.

1.5.1 Hairpins and Slippage Model
Replication of DNA requires the separation of the duplex at the replication fork. It is at this time that the strands have the potential to form stable single-stranded conformations such as hairpins. Large expansions may occur when the length of the triplet repeat increases beyond the size of an Okazaki fragment, such that the 5' and 3' ends of a nascent fragment, which may be prone to slippage, are within the repeat tract (Richards and Sutherland, 1994). For this model it has been proposed that during DNA synthesis of the lagging strand template, slippage of the nascent DNA strand, possibly caused by polymerase stalling, occurs in the repeat region (Figure 1.2E) (Wells, 1996). The slippage allows the formation of hairpin loops and extended DNA synthesis of the template. The template strand is later filled in, resulting in expansion of the trinucleotide repeat stretch. How the hairpins escape repair is not known. It could be due to the inability of the repair mechanism to synthesize enough protein, or it may be that the
hairpins are not recognized by the repair system. Alternatively, other proteins might bind to and mask the hairpins from the repair mechanism.

Another model proposes that the cause of expansion is double strand breaks. A stalled replication fork has been shown to cause double strand breaks (Michel et al., 1997). The expansion has been proposed to occur when the gaps created by these breaks are filled in, as depicted in Figure 1.3 (Petruska et al., 1998; Sakar et al., 1998). The structure-specific metallonuclease FEN-1 (5' exonuclease-1 or flap endonuclease-1) (David et al., 1998), which also acts as an endonuclease for 5' DNA flaps (Harrington and Lieber, 1994; Murray et al., 1994; Borges et al., 1996; Nolan et al., 1996), is critical for the efficient processing of Okazaki fragments during lagging strand DNA synthesis (Goulian et al., 1990; Lyamichev et al., 1993; Turchi and Bambara, 1993; Waga et al., 1994). Therefore, because FEN-1 removes 5' overhanging flaps during DNA repair and processes the 5' ends of Okazaki fragments in the lagging strand, it has been implicated in the formation of double strand breaks. In support of this, a yeast FEN-1 null mutant is a strong mutator, as unexcised flap strands in Okazaki fragments lead to double-stranded DNA (dsDNA) breaks that are repaired by homologous recombination or end joining to yield large sequence duplications throughout the genome (Tishkoff et al., 1997). Thus, FEN-1 is a key protein for maintaining genome integrity, and mutations in FEN-1 may give rise to a number of genetic diseases, such as myotonic dystrophy, Huntington's disease, several ataxias, fragile X syndrome, and cancer (Gordenin et al., 1997; Tishkoff et al., 1997; Freudenreich et al., 1998; Schweitzer and Livingston, 1998).
Figure 1.2. Expansion Mechanisms. (A-D) Models for slipped or S-DNA structure. (A) Due to the repetitive nature of the triplet sequences, a variety of slipped strand DNA (S-DNA) structures could form within a triplet repeat tract. A number of different structures are possible in which the position and length of the slipped out regions vary. In the structures shown, the lengths of the slipped out strands in both complementary strands are identical, although this may not be a requirement for alternative triplet repeat structures. Thin lines represent trinucleotide repeats, while thick solid or interrupted lines represent flanking, unique sequence DNA. (B) Possible S-DNA with multiple slipped out regions in both complementary strands. (C) Structure with multiple slipped out regions in one strand and a single hairpin arm in the opposite strand. (D) Possible alternative triplet repeat structures in which a hairpin has formed in only one strand. The opposite strand may be unpaired (or in a collapsed but non-base-paired structure) (Pearson and Sinden, 1996). (E) Expansion by hairpin formation within Okazaki fragments during DNA replication.

Figure 1.3. Strand Breaking Mechanism of Expansion. The strands slip and form hairpin structures that may be cleaved by nucleases. The hairpins are then free to slide and travel along the duplex and upon repair are expanded.

1.6 Proposed Method of Methylation
Two theories regarding hypermethylation of Fragile X repeats have been proposed by Mitas (1997): one based on active methylation of the d(CCG)n strand, and one based on a lack of demethylation.
1.6.1 Active Methylation
In the active methylation model, the human DNA methyltransferase (MTase) has increased affinity for cytosines at a CpG step in a C-rich Fragile X hairpin (Chen et al., 1995; Smith et al., 1994; Laayoun and Smith, 1995). This affinity leads to enhanced methylation and is probably due to the increased flexibility present in hairpins with mismatched base pairs. The rate of methylation has been shown to be higher if the cytosine in the CpG does not form a Watson-Crick hydrogen bond and instead is mispaired or missing (Smith et al., 1991, 1992, 1994; Laayoun and Smith, 1995). It has been proposed that, because of this flexibility, these structures more readily resemble the transition state of a flipped cytosine base necessary for methylation.

1.6.2 Lack of Demethylation
Demethylation occurs during gametogenesis, embryogenesis (Razin et al., 1984; Kafri et al., 1986, 1988, 1992; Frank et al., 1991; Choi and Chae, 1993), and tissue-specific cellular differentiation (Sullivan and Grainger, 1986; Saluz et al., 1986; Paroush et al., 1990). Cedar and Razin have proposed that proteins involved in demethylation are anchored to specific regions on the DNA. Because of the length of the repeats, the tethered protein would be unable to reach and remove all of the methylated repeats, due to spatial or kinetic constraints, therefore leaving regions with hypermethylation (Weiss et al., 1996).

1.7 Basis of Fragility
It has also been proposed that the expansion and methylation of these triplet repeats can alter the functional organization of chromatin and nucleosome assembly, which may contribute to alterations in the expression of the FMR1 gene or to the fragility of the chromosome (Wang and Griffith, 1996; Goode et al., 1996).
Despite the molecular characterization of five fragile sites, the relationship between the chemistry of fragile site induction and their DNA sequence composition is not yet clear. All fragile sites that have been described at the molecular level are of the folate-sensitive type, and all have the same DNA sequence composition, the trinucleotide d(CCG)n repeat (Kremer et al., 1991; Oberle et al., 1991; Verkerk et al., 1991; Yu et al., 1991; Knight et al., 1993; Nancarrow et al., 1994; Parrish et al., 1994; Jones et al., 1995; Sutherland and Richards, 1995a). These fragile sites arise by the unstable expansion of d(CCG)n repeats, which normally exhibit repeat copy number polymorphism, and by methylation of both the expanded d(CCG)n repeats and the adjacent CpG islands (Pieretti et al., 1991; Vincent et al., 1991).

1.8 Significance of Mismatched Cytosines within the d(CCG)n Repeating Sequence of the C-Rich Strand of Fragile X
A major structural component of the C-rich strand of Fragile X involves the potential to form mismatched cytosines with a d(GCC)n·d(GCC)n alignment. This issue will be discussed in more detail in chapter 2 and is the foundation of this research. A schematic of how mismatched cytosines may form within a slipped structure is shown in Figure 1.4. What is interesting about mismatched cytosines is that out of the eight potential mismatches, C-C mismatches are the least effectively repaired (Lu et al., 1983; Su et al., 1988; Fang and Modrich, 1993; Modrich, 1994). What is also unusual about these bases, and is probably related to their flexibility, is that they have never been crystallized in duplex DNA at neutral pH (a search only showed crystal structures of triplex structures crystallized at low pH).
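The d(GCC)n·d(GCC)n alignment can be illustrated with a short sketch. It pairs a (GCC)n strand with an antiparallel copy of itself and reports which positions are Watson-Crick pairs and which are mismatches. The function name `self_align` and the output format are illustrative assumptions, not part of the experimental work described in this dissertation.

```python
# Illustrative sketch (hypothetical helper, not from the dissertation):
# antiparallel self-alignment of a (GCC)n strand.
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "T"), ("T", "A")}

def self_align(strand):
    """Pair a 5'->3' strand with an antiparallel copy of itself.

    Returns a list of (top_base, bottom_base, is_watson_crick) tuples.
    """
    partner = strand[::-1]  # the same sequence read 3'->5'
    return [(a, b, (a, b) in WATSON_CRICK) for a, b in zip(strand, partner)]

pairs = self_align("GCC" * 3)
mismatch_positions = [i for i, (_, _, ok) in enumerate(pairs, start=1) if not ok]

for i, (top, bottom, ok) in enumerate(pairs, start=1):
    print(i, top, bottom, "pair" if ok else "mismatch")
```

In this idealized alignment every third position is a C-C mismatch flanked by G-C and C-G pairs. The sketch ignores the hairpin loop and any conformational effects.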
NMR data are also inconclusive, because mismatched cytosines have been proposed to have a single hydrogen bond between the C·C mispairs (Boulard et al., 1997; Mariappan et al., 1998) and have also been reported to be extrahelical (Gao et al., 1995, 1996). It has been suggested that C-C mispairs are more mobile than G-C base pairs and are in a dynamic exchange between an open (non-hydrogen-bonded) and a closed (hydrogen-bonded) conformation; thus, the cytosines can be "flipped out" of the helix more easily. Understanding the significance of this might help provide insight into the mechanism of expansion, methylation, or fragile sites.

Figure 1.4. Theoretical Alignment of the Fragile X Strands During Slippage. Note the formation of mismatched cytosines in the hairpin.

1.9 Limits of Analysis
The great length of trinucleotide repeat DNA makes the study of its structure an experimental challenge. Previous methods used to analyze these structures include NMR spectroscopy, UV, X-ray diffraction, computer simulations, and electrophoresis coupled with chemical and enzymatic probing. NMR can provide information about base pairing, base interactions, and arrangement in solution, but this method of analysis is limited by the need for high concentrations and short pieces of DNA. Long stretches of DNA complicate the signal and make it impossible to interpret. UV is useful for determining global structure regarding stacking and base pairing interactions but does not give detailed information at the atomic level.
X-ray diffraction is useful and gives detailed information about structure, but it has also been limited to short pieces of DNA, and the conditions used to solve X-ray structures can be harsh. Computer models are useful tools that provide valuable information regarding allowable conformations. Molecular dynamics simulations of DNA fragments are particularly useful in illustrating the conformational flexibility of the DNA helix. These simulations are, of course, heavily dependent on careful parameterization of the specific system being studied. A measure of the success of a particular simulation can be based on comparison with direct physical structural data (such as an X-ray or NMR structure) or, in the absence of such data, with more indirect experimental structural phenomena. Computer simulations of expansions of DNA triplet repeats in fragile X, using realistic relative probabilities of hairpin formation, replication, slippage, and repair, have produced results that correspond with the observed range of repeats and the transition probabilities from normal to affected individuals (Bat et al., 1997). However, these models do not give detailed structural information. Electrophoresis coupled with chemical and enzymatic probing has provided valuable information regarding global structure and detailed information about sequence, base pairing, and stacking arrangements. The problem with many of these methods is that they must be performed in vitro.

1.10 Overview of Research
The research presented in this dissertation examines the molecular structure of the d(GCC)n·d(GCC)n alignment in the C-rich strand of Fragile X using computer modeling and, specifically, its novel reaction with mechlorethamine, a well known antitumor agent.
Chapters 2 and 3 describe electrophoretic, chemical, and enzymatic probing, coupled with molecular dynamics simulations, that were used to elucidate and propose a new structure for the C-rich strand of Fragile X DNA. During the course of this work a new and uncharacterized reaction was discovered involving mismatched cytosines. This reaction appeared ideal for the analysis of the structure of the mismatched cytosine bases within Fragile X, but to simplify the analysis, we concentrated on only one repeating fragment. Therefore, the majority of the chapters (4, 5, 7 and 8) are devoted to understanding this novel reaction in the context of one isolated repeat fragment. Chapter 6 digresses slightly to explore the reaction with multiple repeats and even hairpins, and illustrates its potential as a molecular probe for mispaired cytosines within the Fragile X sequence. From this research it is proposed that the conformation of the mismatched cytosines is dependent on neighboring base pairs. The cytosines are more flexible and have more potential to move out of the helix as the stability surrounding the mismatches decreases. It is hoped that this research may provide more insight into the conformation and reaction of mismatched cytosine bases. This research does not attempt to determine the mechanism of expansion, methylation, or fragile sites, but provides new knowledge about mismatched cytosines. Also, since mechlorethamine is known to react with DNA in vivo, in the presence of nucleosomes (Millard et al., 1998), it may serve in the development of an assay to measure slipped DNA in vivo. Understanding the formation, structure, and stability of slipped and/or hairpin strand structures in triplet
repeat DNA should contribute to the understanding of the molecular mechanisms responsible for expansion and may even provide insight into the mechanism of methylation and/or fragile sites.

Chapter 2

Modeling of the Structure of the C-Rich Strand of Fragile X: At Physiological pH, d(CCG)15 Forms a Hairpin and a Distorted Helix

2.1 Overview of Chapter

To investigate potential structures of d(CGG)n·d(CCG)n that might relate to their biological function and association with triplet repeat expansion diseases (TREDs), the structure of a single-stranded (ss) oligonucleotide containing d(CCG)11-15 [ss(CCG)11-15] was modeled using energy minimization and molecular dynamics. A structure was proposed that involved extrahelical stacking of cytosines in the minor groove of a distorted helix, called the extended e-motif. This structure was based on experimental data generated in Dr. Michael Mitas' lab (Yu et al., 1997). The experimental data can be summarized as follows: (i) The data comprise the pH and temperature dependence of electrophoretic mobility, UV absorbance, circular dichroism, chemical modification, and P1 nuclease digestion. (ii) The pKa was unusually high (7.7 ± 0.2). (iii) At pH 8.5, the sequence formed a relatively unstable (Tm = 30°C in 1 mM Na+) hairpin containing mismatched cytosines surrounded by CpG base-pair steps. (iv) At pH 7.5, the sequence formed a hairpin that contained protonated cytosines but no detectable C·C+ base pairs (where a center dot designates H-bonds and a superscript plus designates a proton shared between the cytosines), with increased thermal stability (Tm = 37°C), increased stacking of the CpG base-pair steps, and a single cytosine that was flipped away from the central portion of the helix.
Further examination of longer strands, ss(CCG)18 and ss(CCG)20, which were designed to adopt hairpins containing alternative GpC base-pair steps, revealed hairpins containing CpG base-pair steps, pKa's of ~8.2 and ~8.4, respectively, and distorted helices. Together, these results suggested that DNA sequences containing (CCG)n≥15 adopt hairpin conformations in which CpG rather than GpC base-pair steps surround the mismatched cytosine bases. Using computer modeling, based on experimental results and the NMR coordinates of Gao et al. (1995), a hairpin structure was generated to account for the structural dependence on pH and temperature, the CpG alignment, the increased stacking, and the P1 nuclease data. This model is proposed to have extrahelical cytosines that interact and stack in the minor groove of a distorted helix.

2.2 Introduction

2.2.1 Classification of Triplet Repeat Expansion Sequences

To aid in the correlation of potential structures of triplet repeat nucleic acids with their function and their propensity to undergo expansion events, a sequence-based classification system of double-stranded triplet repeats, consisting of five classes (I-V), was developed and described by Mitas et al. (1995b). Class I triplets, d(CTG)n·d(CAG)n, d(CCG)n·d(CGG)n, and d(GTC)n·d(GAC)n, are GC rich with sequences that can align to form a GC or CG palindromic dinucleotide. Class II triplets, d(CAC)n·d(GTG)n and d(CTC)n·d(GAG)n, have no GC or CG palindromic dinucleotide but are GC rich. Class III triplets, d(ATC)n·d(GAT)n, d(TAC)n·d(GTA)n, and d(ATA)n·d(TAT)n, have AT and TA palindromic dinucleotide sequences and are AT rich. Class IV triplets, d(AGA)n·d(TCT)n and d(ACA)n·d(TGT)n, have no palindromic sequences and are AT rich. Class V triplet repeat sequences, d(AAA)n·d(TTT)n and d(CCC)n·d(GGG)n, contain homopolymers and are not associated with TREDs.
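The five classes listed above can be written down as a small lookup table. The following is only an illustrative sketch in Python, not part of the original classification work: the names TRIPLET_CLASSES, rotations, and repeat_class are hypothetical, and the sketch assumes that a repeating triplet is identified up to cyclic rotation (so that GCC and CCG name the same repeating strand, as the text itself uses both forms).

```python
# Triplet repeat classes as listed in the text (classification of Mitas et al., 1995b).
# Each class stores the triplets named for both complementary strands.
TRIPLET_CLASSES = {
    "I": ("CTG", "CAG", "CCG", "CGG", "GTC", "GAC"),
    "II": ("CAC", "GTG", "CTC", "GAG"),
    "III": ("ATC", "GAT", "TAC", "GTA", "ATA", "TAT"),
    "IV": ("AGA", "TCT", "ACA", "TGT"),
    "V": ("AAA", "TTT", "CCC", "GGG"),
}


def rotations(triplet):
    """Return the cyclic rotations of a triplet, e.g. CCG -> {CCG, CGC, GCC}."""
    return {triplet[i:] + triplet[:i] for i in range(3)}


def repeat_class(triplet):
    """Look up the class (I-V) of a repeating DNA triplet.

    Assumption (ours, for illustration): a repeat is identified up to
    cyclic rotation, so GCC and CCG are treated as the same repeat.
    """
    triplet = triplet.upper()
    if len(triplet) != 3 or set(triplet) - set("ACGT"):
        raise ValueError(f"not a DNA triplet: {triplet!r}")
    for name, members in TRIPLET_CLASSES.items():
        if rotations(triplet) & set(members):
            return name
    raise ValueError(f"triplet not in the classification table: {triplet!r}")


if __name__ == "__main__":
    for t in ("CCG", "GCC", "GAG", "TAT", "TTT"):
        print(t, "-> Class", repeat_class(t))
```

With the rotation assumption, every one of the 64 possible triplets falls into exactly one class, which is a quick way to check that the table as transcribed is complete and non-overlapping.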
Class I repeats, which are rich in G-C/C-G base pairs, exhibit the lowest rates of slippage synthesis (Schlotterer and Tautz, 1992), have the lowest base-stacking energies, and are associated with nine of the ten known triplet repeat expansion diseases (TREDs) (Mitas et al., 1995b). Members of the TRED family include fragile X syndrome and Huntington's disease [for a review of TREDs, see Ashley and Warren (1995)]. What is common to the Class I triplet repeats is the ability of each complementary strand to form hairpin structures in vitro. This has led to the hypothesis that the six complementary single strands of Class I triplet repeats could potentially form hairpin structures at the lagging strand of the replication fork (Mitas et al., 1995b; Gacy et al., 1995), where expansion events might be initiated. Various biophysical and biochemical studies have shown that sequences containing d(CTG)n, d(CGG)n, d(CAG)n, d(GTC)n, and d(GAC)n form stable hairpins (Chen et al., 1995; Gacy et al., 1995; Mitas et al., 1995a,b; Smith et al., 1995; Yu et al., 1995a,b; Mariappan et al., 1996a,b; Petruska et al., 1996).

2.2.2 Hairpin Alignment

Unlike d(CTG)n or d(CAG)n, which form hairpins containing only GpC base-pair steps, d(CCG)n sequences can potentially adopt hairpin alignments that contain either CpG or GpC base-pair steps. A hairpin (or duplex structure) containing d(CCG)n repeats can have GpC base-pair steps, forming an (a) alignment, or CpG base-pair steps, forming a (b) alignment (Mitas et al., 1995a) (Figure 2.1).

[Figure 2.1 schematic not reproduced: two d(CCG)n hairpins, (a) with GpC steps and (b) with CpG steps.]

Figure 2.1 The (a) and (b) hairpin alignment of d(CCG)n.
The boxes highlight the alignment, GpC or CpG, which refers to the sequence of the base pairs surrounding the C-C mismatch in the 5' to 3' direction.

i (a) Alignment

For d(GCC)n≤7, sequences with a terminal 5' guanine, hairpins have been reported to form in the (a) alignment (Chen et al., 1995; Mariappan et al., 1996a). In an (a) alignment hairpin containing CCG repeats, the methylatable cytosine is mispaired with another cytosine. Studies with the human MTase (Smith et al., 1987; Baker et al., 1991; Smith et al., 1991) and two bacterial cytosine MTases (Klimasauskas and Roberts, 1995; Yang et al., 1995) have shown that when a methylatable cytosine is paired with a base other than guanine, high rates of methylation are observed. Presumably, this is due to the ease with which the methylatable cytosine flips away from the helix, a requirement for methylation at the 5 position [for a review of other DNA reactions catalyzed by base flipping, see Roberts (1995)]. 1H NMR studies do not reveal any H-bonds within the C-C mismatches of (GCC)n≤7 hairpins (Chen et al., 1995; Mariappan et al., 1996a), suggesting that the methylatable cytosine in these hairpins might also be free to flip out of the helix and undergo rapid methylation. In support of this possibility, in vitro studies have shown that human MTase rapidly methylates d(GCC)n≤7 hairpin structures (Chen et al., 1995). It has been proposed that hypermethylation of the triplet repeat region in the FMR-1 gene is a direct result of formation of hairpins in the (a) alignment followed by the action of human MTase (Chen et al., 1995; Laayoun and Smith, 1995).

ii (b) Alignment

With chemical probing, Yu et al. have determined that (CCG)n≥15 adopts hairpin conformations that contain a (b) alignment, in contrast to previous studies on shorter sequences.
However, a completely unexpected and new DNA structure that has recently emerged from studies of triplet repeat sequences is the e-motif formed by the short duplex sequence d(CCG)2·d(CCG)2 (Gao et al., 1995). The e-motif duplex also has a (b) alignment with two CpG base-pair steps. This alignment is achieved by having two overhanging 5' cytosines (Figure 2.3). Surprisingly, the cytosines within the lone C-C mismatch are centrally located in the duplex and are extrahelical, pointed away from one another, and symmetrically located in the minor groove.

2.2.3 Significance of the (b) Alignment in Longer Sequences

To understand mechanisms of sequence amplification, gene hypermethylation, and folate-induced chromosomal fragile sites, it is important to determine structures of oligonucleotides containing relatively large numbers of CCG repeats. The structural characterization of Fragile X sequence conformations of seven repeats or fewer has been achieved using NMR spectroscopy (Chen et al., 1995; Mariappan et al., 1996a), but technical difficulties prevent the observation of larger structures with NMR. Because of this, we have chosen to examine the conformation of longer repeats using a molecular dynamics simulation approach, combining the experimental results of Yu et al. with the NMR structure of the e-motif (Gao et al., 1995). In this chapter the results of four simulations containing the sequence d(CCG)11G, in the (b) alignment with CpG base-pair steps, will be discussed. The essential result that emerged is that when the mismatched cytosine bases are placed in a fully extrahelical conformation, similar to the e-motif, at the start of the simulation, they have a tendency to stack with each other in the minor groove. These interactions occur between cytosine bases that are on 'opposite' strands of the helical stem of the hairpin.
Thus, the most interesting structure obtained from the modeling was the extended e-motif, which is predicted to have a distorted backbone with extrahelical cytosines and stacked guanines.

2.3 Method

Molecular dynamics simulations were performed using the DNA sequence d(CCG)11G. In all of these conformations the additional 3' guanine base was included as the complement of the 5' cytosine base, giving a (b) alignment. All calculations were performed using the AMBER 4.0.1 force field, implemented on a Silicon Graphics Indigo workstation. Standard AMBER parameters and charges were applied to the DNA bases. For calculations involving a protonated cytosine, charges were developed by fitting to the computed AM1 wavefunction for a protonated cytosine base (Ferenczy et al., 1990). Other parameters used were largely those of a normal cytosine base, with a few additional parameters to describe the protonated ring.

The four simulations are labeled (i) tetraplex, (ii) hairpin: (iia) nonprotonated and (iib) protonated, (iii) extended e-motif, and (iv) idealized extended e-motif. Further details of the construction of these conformations are given in the sections that follow. In all the calculations, except for (i), the DNA was solvated in a periodic box of TIP3P water molecules (Jorgensen et al., 1983), with a minimum distance of 8.0 Å from the DNA to the box edge. Counterions were added at the positions of most negative potential (van Gunsteren et al., 1986) to obtain two-thirds neutralization, and the whole DNA/counterion system was then resolvated in a larger periodic box. The molecular dynamics was performed in the NPT ensemble at 298 K, using a time step of 0.002 ps, an 8.0 Å nonbonded cutoff, a dielectric constant of 1, and SHAKE bond length constraints (van Gunsteren et al., 1977).
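For bookkeeping, the settings quoted above can be collected in one place and used for simple arithmetic, such as how many integration steps a given stretch of simulated time requires. The sketch below is illustrative only and is not an AMBER input file: the names MDProtocol, steps_for, and counterions_needed are hypothetical. The counterion estimate additionally assumes monovalent counterions, one negative charge per internucleotide phosphate, no terminal phosphates, and no protonated bases, none of which is stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MDProtocol:
    """Simulation settings as quoted in the Method section."""
    ensemble: str = "NPT"
    temperature_k: float = 298.0
    timestep_ps: float = 0.002
    nonbonded_cutoff_angstrom: float = 8.0
    dielectric_constant: float = 1.0
    water_model: str = "TIP3P"
    min_box_margin_angstrom: float = 8.0
    bond_constraints: str = "SHAKE"

    def steps_for(self, simulated_ps):
        """Number of integration steps needed to cover simulated_ps picoseconds."""
        return round(simulated_ps / self.timestep_ps)


def counterions_needed(n_nucleotides, fraction=2.0 / 3.0):
    """Estimate monovalent counterions for partial neutralization.

    Assumptions (ours): a single strand of n nucleotides carries n - 1
    phosphate charges, and no base is protonated.
    """
    n_phosphates = n_nucleotides - 1
    return round(n_phosphates * fraction)


if __name__ == "__main__":
    protocol = MDProtocol()
    # 100 ps is the simulated time named in the Figure 2.3 legend.
    print("steps for 100 ps:", protocol.steps_for(100))
    # d(CCG)11G is 34 nucleotides long.
    print("counterions for a 34-mer:", counterions_needed(34))
```

Under these assumptions, 100 ps at a 0.002 ps time step corresponds to 50,000 steps, and two-thirds neutralization of a 34-mer corresponds to 22 counterions; the protonated simulation (iib) would need a different count.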
2.3.1 Simulation (i): Tetraplex DNA Containing Protonated C-C Pairs

Using an in-house program, TETRA, developed in our lab, various arrangements of the ss(CCG)11-15G sequence were constructed. The rise and twist of the helix were varied, and the backbone of the DNA was minimized for each structure. The total energy was examined and plotted. A schematic of this structure is shown in Figure 2.2(i).

2.3.2 Simulation (ii): B-DNA Hairpin Including Intrahelical, Mismatched C-C Pairs

i Simulation (iia): Nonprotonated Hairpins

Hairpins of d(CCG)4-15 were constructed and minimized in various alignments. The longest hairpin that could be subjected to molecular dynamics simulation in a reasonable amount of computer time was d(CCG)11G. The sequence numbering of the stem of the d(CCG)11G hairpin is shown in Figure 2.2(ii) (bases C1 to G15 and C20 to G34). The stem was constructed in a regular B-DNA conformation using the Quanta 4.0.1 package. Cytosine bases forming C-C mismatch pairs were included in their 'normal' location in a C-G pair, and no attempt was made to induce hydrogen bonding between the two cytosines of the mismatch, since 1H NMR studies did not reveal any H-bonds within the C-C mismatches of hairpins (Chen et al., 1995; Mariappan et al., 1996a). A four-base fragment (C16 to C19) comprising the loop region was manually added to the stem helix, and the loop fragment was then relaxed using energy minimization with the stem frozen. Several different starting points were used for the loop conformation, and the lowest-energy structure following minimization was chosen as the starting structure for molecular dynamics. The complete results of this simulation will be discussed in more detail in chapter 3.

ii Simulation (iib): B-DNA Hairpin Including Intrahelical, Protonated Mismatched C-C Pairs
In this simulation, before minimization, the structure was first protonated at the N3 position of C4, C7, C10, and C13. Other than this modification, the molecular dynamics protocol was identical to that of simulation (iia) above.

[Figure 2.2 schematic not reproduced: (i) tetraplex and (ii) hairpin, with the mismatched cytosines labeled C4/C31, C7/C28, C10/C25, and C13/C22 and the loop spanning C16 to C19.]

Figure 2.2. Schematic of the structures used for the simulations of (i) the tetraplex and (ii) the hairpin. (i) Tetraplex showing the base-pair arrangement, where closed circles indicate Watson-Crick base pairing and open circles indicate parallel base pairing of cytosines. (ii) Hairpin and the numbering sequence used to describe the conformation. The starting conformation for the molecular dynamics simulations of the stem of d(CCG)11G was a duplex B-DNA conformation in the (b) alignment with intrahelical mismatched cytosine bases. The schematic exaggerates the starting position of the mismatched cytosines for clarity. The initial conformation of the cytosines in the stem is similar to that of a C-G base pair stacked within the helix. For the protonated hairpin, the intrahelical mismatched cytosine bases were protonated at the N3 position of bases C4, C7, C10, and C13.

2.3.3 Simulation (iii): Extended e-Motif Hairpin Based on Coordinates of Original e-Motif

A d(CCG)11G hairpin including extrahelical cytosine bases was constructed by combining a series of e-motif fragments based on the coordinates of the d(CCGCCG)·d(CCGCCG) DNA duplex e-motif (Gao et al., 1995). Using the numbering system shown in Figure 2.3(iii), the original duplex fragment would comprise, for example, C4 to G9 and C25 to G30 (the shaded region in Figure 2.3(iii)).
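The base numbering used for the d(CCG)11G hairpin can be checked with a few lines of code. The sketch below is a bookkeeping aid only, not the modeling software used in this work (TETRA, BTOZ, and Quanta are not reproduced here), and the function names ccg_hairpin and cc_mismatches are hypothetical. It assumes a 15-pair stem closed by a four-base loop, with base i paired to base 35 - i, as implied by the stem ranges C1 to G15 and C20 to G34.

```python
def ccg_hairpin(n_repeats=11, loop_len=4):
    """Fold d(CCG)nG into a (b)-alignment hairpin and list its stem pairs.

    Returns the sequence, the stem pairs as (i, base_i, j, base_j) with
    1-based positions, and the loop as (position, base) tuples.
    """
    seq = "CCG" * n_repeats + "G"      # extra 3' G pairs with the 5' C
    n = len(seq)
    stem_len = (n - loop_len) // 2
    pairs = []
    for i in range(1, stem_len + 1):
        j = n + 1 - i                  # partner position on the 3' arm
        pairs.append((i, seq[i - 1], j, seq[j - 1]))
    loop = [(k, seq[k - 1]) for k in range(stem_len + 1, stem_len + loop_len + 1)]
    return seq, pairs, loop


def cc_mismatches(pairs):
    """Return the C-C mismatch pairs as labels such as ('C4', 'C31')."""
    return [(f"{a}{i}", f"{b}{j}") for i, a, j, b in pairs if a == "C" and b == "C"]


if __name__ == "__main__":
    seq, pairs, loop = ccg_hairpin()
    print("length:", len(seq))
    print("loop:", "-".join(f"{b}{k}" for k, b in loop))
    print("C-C mismatches:", cc_mismatches(pairs))
```

Run with the defaults, the sketch gives a 34-base sequence, the loop C16-C17-G18-C19, and mismatches at C4/C31, C7/C28, C10/C25, and C13/C22, matching the positions protonated in simulation (iib); every other stem pair is a C-G or G-C pair.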
The C7 and C28 bases are extrahelical, and the C4 and C25 bases form overhanging ends in the original structure. Construction of the extended e-motif was then achieved by computationally overlaying a central C-G pair (for example, C5-G30 in Figure 2.3) of one e-motif onto the terminal C-G pair of another, thereby extending the motif (such that, for example, G9-C24 is an identical copy of G6-C27). The overhanging ends were deleted in each case. The C16-C17-G18-C19 loop was constructed as a two-base-pair duplex (a C-C and a C-G pair) using Quanta 4.0, manually positioned, and then subjected to 4000 cycles of gas-phase minimization to provide a starting point for molecular dynamics simulations.

[Figure 2.3 schematic not reproduced: panels show the original e-motif, the extended e-motif (iii) at 0 ps and at 100 ps, and the idealized extended e-motif (iv).]

Figure 2.3. Schematic of the e-motif, the extended e-motif (iii), and the idealized e-motif (iv). For the original e-motif, the arrows indicate the actual position of the cytosines in the minor groove of the helix. Not shown is the position of the guanines surrounding the extrahelical cytosines, which are 4.10 Å apart. The guanine distance is better represented by the schematic shown in the idealized e-motif (iv). (iii) Extended e-motif conformation constructed from the original e-motif coordinates for a trinucleotide fragment d(GCC)·d(GCC). The arrows indicate that the formally mismatched cytosine bases are extrahelical, with respect to the neighboring base pairs, in the identical location to that observed in the original e-motif structure. Arrows of the 0 ps structure indicate the initial 1-7 interaction of the extrahelical cytosines, and those of the 100 ps structure indicate the 1-4 interaction. (iv) "Idealized" extended e-motif conformation, based on the results obtained in simulation (iii).
Extrahelical cytosine bases are located in the minor groove of the 5'-CG step to the 5' side of the cytosine, and stacking occurs between two cytosines from 'opposite' strands (that is, between C7-C31, C10-C28, and C13-C25, giving a 1-4 interaction).

2.3.4 Simulation (iv): Extended Idealized e-Motif Hairpin Including C/C Stacking in the Minor Groove

Based on the observations made in simulation (iii), a structure was built that retained the essential elements of the (CCG)11G conformation that emerged from that simulation. Using in-house software, BTOZ, developed in our lab, coordinates for the base locations were generated to approximate those observed in simulation (iii). This geometry was initially derived from a CURVES 3.1 (Ravishanker et al., 1989) analysis of the trajectory from simulation (iii). The backbone was then fitted to the base locations, again using in-house software, and again based on backbone torsion angles derived from simulation (iii).

2.4 Results

2.4.1 Experimental Background

i Experimental Evidence of Secondary Structure

pH-Dependent Mobility: Protonation of ss(CCG)15 at a pH above Neutrality. At pH 8.5-7.9, the Mrel of intramolecular ss(CCG)15 varied between 1.02 and 1.00, values higher than that of random-coil ss(GAT)15 DNA but lower than that of a hairpin containing paired mismatched bases (Figure 2.4A). This result suggested that although ss(CCG)15 contained secondary structure at these pH values, the structure was not as stable as the ss(CTG)15 hairpin. The Mrel of ss(CmCG)15 was similar to that of nonmethylated ss(CCG)15 (Figure 2.4A), suggesting that methylation of the CpG dinucleotide did not result in a significant structural change. At pH 7.7, the Mrel of ss(CCG)15 and ss(CmCG)15 increased to 1.13 and 1.14, respectively, values similar to that of ss(CTG)15.
The changes in electrophoretic mobilities of the C-rich sequences as a function of pH provided strong evidence that at least some fraction of the cytosines were protonated at pH 7.7, a value well above the pKa of cytosine. The approximate midpoint in the electrophoretic transition of ss(CCG)15 and ss(CmCG)15 (i.e., the pKa) was 7.8. Lowering of the pH to 6.5 did not result in a further increase in the Mrel of ss(CCG)15 or ss(CmCG)15 (data not shown), suggesting that no additional cytosines were protonated between pH 7.7 and 6.5. At the five pH values tested, the relative electrophoretic mobility (Mrel) of ss(GAT)15 was slow (0.87 < Mrel < 0.89), while that of ss(CTG)15 was fast (1.15 < Mrel < 1.16) (Figure 2.4A). No electrophoretic transitions were detected with ss(GAT)15 or ss(CTG)15 (Figure 2.4A) or with ss(ATC)15, ss(CAG)15, ss(GAC)15, or ss(GTC)15 (data not shown). These results indicated that the bases within these sequences were not protonated at pH 7.5.

Of the six possible GC-rich single-stranded (ss) repeating triplet sequences, d(CCG)n contains the highest number of cytosines. Cytosine's pKa is the highest among all the bases (Saenger, 1984; for a monophosphate nucleotide the reported pKa of the N3 atom is 4.4, Dawson et al., 1987); therefore, it seemed reasonable to observe pH-dependent structural transitions for the C-rich strand, d(CCG)n. Based on the increased electrophoretic mobility (Mrel) with decreasing pH, it was proposed that ss(CCG)15 might undergo a structural transition from a less compact to a more compact structure.

Figure 2.4. Single-stranded (CCG)15 forms two pH-dependent structures. (A) pH-dependent electrophoretic analysis.
The data plotted are the relative electrophoretic mobilities (Mrel) of the various ssDNA sequences, which are listed to the right of the figure. Mrel of ssDNA = distance ssDNA migrated from origin/distance dsDNA migrated from origin. Data are the mean of two experiments. Except for minor variations due to "smiling" during electrophoresis, the rates of migration of all dsDNAs containing 15 triplet repeats were identical. (B) Temperature-dependent electrophoretic analysis of ss(CCG)15. The data plotted are the Mrel of ss(CCG)15 at the pH indicated to the right of the figure. (C) UV absorbance melting profile of ss(CCG)15 in a solution of 150 mM NaCl, 10 mM Tris-HCl, and 1 mM EDTA. Rate of heating was 0.5°C/min.

Thermal Stability of ss(CCG)15+ Is Modestly Higher Compared to That of ss(CCG)15. The nature of the compact structure might be related to protonation. To investigate the thermal stabilities of the nonprotonated and protonated [ss(CCG)15+] structures of ss(CCG)15, electrophoretic mobility melting profiles (EMMPs) were obtained at pH 7.5 and 8.5 ([Na+] 1 mM; Figure 2.4B). Using the gradual electrophoretic phase transition, which provides an estimate of the melting temperature (Tm) of the DNA (Wartell et al., 1990; Ke and Wartell, 1993; Mitas et al., 1995a; Yu et al., 1995a,b), it was determined that the Tm was 30°C at pH 8.5. This Tm value is the lowest among all Class I triplet repeat DNAs (Table 2.1), indicating that C-C mismatches are the least stable among the four homomismatches. At pH 7.5 the structure of ss(CCG)15+ is stabilized modestly by cytosine protonation and has an estimated Tm of 37°C (Table 2.1). Using UV absorbance at a higher concentration of salt, 150 mM NaCl, the Tm's were determined to be ~54°C at pH 8.5 (a broad peak between 50°C and 58°C was observed) and 57.5°C at pH 7.5.
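The relative mobility defined in the Figure 2.4 legend is a simple ratio of migration distances. As a minimal sketch, it can be written as the function below; the name relative_mobility is hypothetical, and the distances in the example are made-up illustrative values, not measurements from this study.

```python
def relative_mobility(ss_distance, ds_distance):
    """Mrel = distance ssDNA migrated from origin / distance dsDNA migrated from origin.

    Both distances must be in the same units and measured on the same gel.
    """
    if ds_distance <= 0:
        raise ValueError("dsDNA migration distance must be positive")
    if ss_distance < 0:
        raise ValueError("ssDNA migration distance cannot be negative")
    return ss_distance / ds_distance


if __name__ == "__main__":
    # Hypothetical distances (cm), chosen only to illustrate the ratio.
    mrel = relative_mobility(11.3, 10.0)
    print(f"Mrel = {mrel:.2f}")
```

A value above 1 means the single strand ran ahead of the duplex marker of the same repeat length, which is how the compact, hairpin-like species are distinguished from slower random-coil strands in the text.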
The increased hypochromicity of ss(CCG)15+ provided direct evidence that the bases of ss(CCG)15+ were better stacked compared to those in ss(CCG)15. These results suggested only a minor increase in thermal stability of ss(CCG)15+ compared to ss(CCG)15.

Table 2.1: Melting Temperatures of 15 Class I Triplet Repeat Sequences in ~1 mM Na+

(XXX)15 | pH | Tm (°C) | Reference | Disease associated with expansion of the triplet repeat
CGG | 8.5 | 75 | Mitas et al., 1995b | fragile X syndrome (Verkerk et al., 1991)
GAC | 8.5 | 49 | Yu et al., 1995b | none
CTG | 8.5 | 47 | Yu et al., 1995a,b | myotonic dystrophy (Brook et al., 1992; Mahadevan et al., 1992)
CAG | 8.5 | 38 | Yu et al., 1995b | Huntington's disease (Huntington's Disease Collaborative Research Group, 1993)
GTC | 8.5 | 38 | Yu et al., 1995a | none
CCG | 7.5 | 37 | this study | fragile X syndrome (Verkerk et al., 1991)
CCG | 8.5 | 30 | this study | fragile X syndrome (Verkerk et al., 1991)

a Oligonucleotides containing 15 triplet repeats also contained flanking sequences. The sequence of the oligonucleotide containing 15 pyrimidine-rich triplet repeats was GATCC(XXX)15GGTACCAAGCT, where XXX = CCG, CTG, and GTC.
b The sequence of the oligonucleotide containing 15 purine-rich triplet repeats was AGCTTGGTACC(XXX)15GGATC, where XXX = CGG, CAG, and GAC.

CD Spectra Failed To Detect C·C+ Base Pairs in (CCG)15+ at pH 7.5. Protonation of cytosine N3 can result in formation of C·C+ base pairs (where a center dot designates H-bonds and a superscript plus designates a proton shared between the cytosines), which can be arranged in an antiparallel (Brown et al., 1990) or parallel (Gehring et al., 1993) double helix (Figure 2.5). When analyzed by circular dichroism (CD), DNA structures that contain C·C+ base pairs arranged in parallel exhibit marked positive bands at 290 nm (Gray et al., 1988). At pH 6.5 there were no CD features associated with these pairs.
However, below pH 6.5, the long-wavelength CD and absorption bands underwent characteristic red shifts, which provided evidence that some fraction of the cytosines had the spectral characteristics of the hemiprotonated C·C+ base pairs (Gray et al., 1988; Antao and Gray, 1993). The significance of protonated C·C+ pairs will become evident in the molecular modeling section.

[Figure 2.5 chemical structures not reproduced.]

Figure 2.5 Possible arrangement of protonated cytosines: (a) antiparallel and (b) parallel.

ii Experimental Evidence for a Hairpin

The Structure of ss(CCG)15 at pH 8.5 Is a Hairpin in a (b) Alignment. To verify whether ss(CCG)15 could fold into a hairpin at neutral and mildly basic pH, chemical modifications were performed with hydroxylamine (HA) (Singer and Grunberger, 1983; Johnston and Rich, 1985; Johnston, 1992) and 2-hydroperoxytetrahydrofuran (THF-OOH) (Liang et al., 1994). These reagents preferentially react with cytosines that are not base-paired or hydrogen bonded. If ss(CCG)15 adopted a hairpin conformation, one of the cytosines within a given CCG triplet should be base-paired to a guanine (i.e., a C-G pair), while the other cytosine should pair with a cytosine. Since it is not known whether the mismatched cytosines in a (CCG)n-containing hairpin are H-bonded, the mismatched cytosines will be designated as a C-C pair. The cytosine of the C-G pair should react poorly with HA or THF-OOH. Therefore, incubation of ss(CCG)15 with HA or THF-OOH should reveal which of the two cytosines within a triplet is base-paired to guanine. If ss(CCG)15 were to adopt a hairpin in the (a) alignment, the cytosine 5' to the nearest guanine (designated as 5'C) should be much more reactive with HA or THF-OOH compared to the cytosine 3' to the nearest guanine.
In contrast, if ss(CCG)15 were to adopt a hairpin in the (b) alignment, the 3'Cs should be much more reactive with HA compared to the 5'Cs. Incubation of ss(CCG)15 with HA (Figure 2.6) or THF-OOH (data not shown) at pH 8.5 revealed high reactivity of the 3'Cs, especially those at the 3'-end of the sequence. These results provided evidence for a hairpin in the (b) alignment with CpG steps. Reactivity of the 5'Cs was not observed, with the exception of C28, a nucleotide located in the presumed loop region of ss(CCG)15. Also, both cytosines in triplet I were nonreactive with HA (Figure 2.6) or THF-OOH (data not shown), a result consistent with a hairpin in the (b) alignment.

[Figure 2.6 gel image and deduced hairpin schematic not reproduced.]

Figure 2.6. Chemical modification with hydroxylamine at pH 8.5 reveals a hairpin in a (b) alignment. Single-stranded (CCG)15 was incubated with hydroxylamine (HA) at 37°C, pH 8.5, in 50 mM Na+ and applied to a 20% polyacrylamide gel containing 8 M urea. The concentration of HA in the reaction mixture was 6.3 M. The deduced hairpin structure of ss(CCG)15 is shown on the left. The 32P labels in the oligonucleotide are 5' to A58 and C60. Roman numerals indicate triplet repeat numbers. Conventional Arabic numerals indicate the position of a given nucleotide with respect to the 5'-end.
The positions of the reactive Cs within the deduced hairpin structure are shown. DMS indicates the dimethyl sulfate (21 mM) reaction.

Hydroxylamine reacted strongly with C55 and C56, two cytosines not part of the triplet repeat region. C55 was opposite A2 in the hairpin structure of ss(CCG)15. Since C55 cannot form H-bonds to A2 at pH 8.5, the HA reactivity of C55 served as an internal control for an unpaired cytosine. The HA reactivity of C55 was comparable to that of C48, a cytosine within a C-C pair. The fact that the HA reactivities of C55 and C48 were similar provided evidence that there were no H-bonds in the C-C mismatches of ss(CCG)15 at pH 8.5.

The Structure of ss(CCG)15+ at pH 7.5 Is a Hairpin in a (b) Alignment. The results of electrophoretic mobility (Figure 2.4A), UV absorbance (Figure 2.4C), and CD studies indicated that the bases of ss(CCG)15 were more highly stacked at pH 6.5-7.5. To determine whether base pairing was different at pH 7.5, HA reactions at various temperatures (15-55°C) were performed in 50 mM Na+ at pH 7.5. Similar to results obtained at pH 8.5 (Figure 2.6), reactions performed at or below 45°C revealed high reactivity of the 3'Cs, indicating that ss(CCG)15 formed a hairpin in the (b) alignment.

iii Experimental Evidence for the Ee-motif Hairpin

The Sugar-Phosphate Backbone of ss(CCG)15 Is Distorted at pH 8.5. The 1H NMR e-motif structure of Gao et al. describes a duplex containing CCG repeats in a (b) alignment with a distortion of the CpG base-pair steps (Gao et al., 1995). To investigate the possibility that ss(CCG)15 formed a (b) alignment hairpin containing distorted CpG base-pair steps, P1 nuclease reactions were performed at various temperatures in 50 mM NaCl. At pH 8.5 and 25°C, the major sites of P1 nuclease cleavage were in the triplet repeat region where a fold would be present in the loop of a hairpin conformation, near triplet VIII (Figure 2.7). The 5' CpG 3' phosphodiester linkages in triplets X-XIV of ss(CCG)15 were also cleaved to a minor extent (Figure 2.7). This result was significant since minor cleavages of the phosphodiesters in the stem regions of oligonucleotides containing all other Class I triplet repeat sequences have not been observed at pH 7.5 and 37°C (Mitas et al., 1995a,b; Yu et al., 1995a,b). To determine whether experimental conditions, such as a high pH, might cause cleavage in the stem, another sequence, (GTC)15, was constructed as a control that shares the same alignment, a similar Tm, and mismatched bases that are pyrimidines. This sequence was not cleaved in the stem, which provided further evidence that ss(CCG)15 had a different conformation. Longer sequences, ss(CCG)18,20, also gave similar results (data not shown). What is significant about these results is that the cleaved phosphodiesters in the stem region were within the CpG base-pair steps and not the C-C mismatches. The ability of P1 nuclease to cleave the phosphodiester linkage within two C-G base pairs provides direct evidence that the sugar-phosphate backbone was distorted and might possibly form an e-motif structure.

[Figure 2.7 gel image and deduced hairpin schematic not reproduced; lanes are labeled 25°C, 35°C, 45°C, and 55°C.]
• m ^ - W W t * * • « M | i « Figure 2.7. PI nuclease digestion o f ss(CCG)is at pH 8.5. Single-stranded (CCG)is was incubated with PI nuclease for 5 min at the indicated temperatures. Buffer contained 50 mM Na+. Reaction products were applied to a 20% polyacrylamide gel containing 8 M urea. The amounts of P 1 nuclease added at the respective temperatures were as follows: 25-45°C, 0.116 (left lane) and 0.31 unit (right lane); 55°C, 0.035 (left lane) and 0.116 unit (right lane). Control lane was incubated at 25°C in the absence of PI nuclease. DMS indicates Dimethyl sulfate. Roman numerals represent triplet repeat numbers. Conventional Arabic numerals indicated the position o f a given nucleotide with respect to the 5’-end. Arrows indicate sites o f PI nuclease cleavage in the ss(CCG)is hairpin. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 38 At a slightly lower pH base stacking and base flipping at pH 7.5-6.5 was observed based on substantial cleavages o f the G41-C42 and C42-C43 phosphodiester linkages in the center of the helix stem at pH 6.5 and 7.5. These phosphodiesters flank a cytosine (C42) o f a C-C mismatch. Little or no cleavages o f the C40-G41 or C43-G44 phosphodiester linkages were observed (Figure 2.8), indicating that C42 and only C42 was extruded or flipped away from the 3 ' portion of the helix. Cleavages of the G41-C42 and C42-C43 phosphodiesters were also observed over a wide range o f salt concentrations (0-400 mM NaCl) and over a range o f temperatures (37-57°C) at pH 7.5 (data not shown), indicating that the flipped cytosine configuration was heat- and salt- stable. Little or no cleavage of the G14-C15 and C15-C16 phosphodiesters was observed, indicating that C l5, the base opposite that of C42 in the formal hairpin, was not flipped out of the helix. 
A longer sequence, ss(CCG)20, also showed a flipped-out cytosine; however, unlike C42 of ss(CCG)15, C62 of ss(CCG)20 was located not in the center of the helix but rather toward the 3' terminus. These results indicated that the cytosines can become extrahelical and distort the sugar-phosphate backbone. The structure responsible for this might be similar to that of the e-motif.

[Figure 2.8: gel image; lanes labeled by temperature of P1 reactions (°C): 1, 10, 20, 25.]

Figure 2.8. P1 nuclease digestion of ss(CCG)15 at pH 6.5. Single-stranded (CCG)15 was incubated with P1 nuclease at the indicated temperatures. Buffer contained 50 mM Na+, pH 6.5. The amount of P1 nuclease added per reaction was 3.5 × 10^-2 unit. Incubation times were as follows: control, 5 h; 1°C, 5 h; 10°C, 25 min; 20°C, 15 min; and 25°C, 9 min. The control lane was incubated at 25°C in the absence of P1 nuclease. DMS is the dimethyl sulfate reaction. Results were similar at pH 7.5.

2.4.2 Computer Modeling

Based on the increased electrophoretic mobility and the possibility that the structure was more compact and stable with some cytosine protonation, we reasoned that the ss(CCG)n sequence might fold into three possible structures: (i) a tetraplex, (ii) a hairpin, or (iii) an e-motif. Stereochemically possible structures of ss(CCG)n+ are shown in Figure 2.9.

Figure 2.9. Possible conformations adopted by single-stranded (CCG)n.
Possible conformations of d(CCG)n are shown schematically in the lower panels, with the stem regions of each structure shown in all-atom format in the upper panels. Each model structure was built using software developed to allow the flexible construction of DNA conformations (J. Shin, R. Romero, and I. S. Haworth, unpublished work). Structures 1(a) and 1(b) are quadruplex conformations, structures 2(a) and 2(b) are regular B-DNA hairpins, and structures 3(a) and 3(b) are extended e-motif hairpins. The (a) and (b) notation for each structure refers to the alignment of the interacting CCG repeat sequences. In alignment (a), a 5'-CCG repeat pairs in an antiparallel manner with a second 5'-CCG repeat, thus forming a GpC base-pair step. In the (b) alignment, the 5'-CCG repeat is paired with a 5'-GCC repeat, thus forming a CpG base-pair step.

Conformation 1 is a single-stranded quadruplex containing a repeat unit of one C4+ and two CGCG quartets, arranged as shown. The quadruplex can be formed by a "folding" of the appropriately aligned hairpin, with the quadruplex loops adopting a "crossover" orientation at the top of the figure. This creates the strand organization necessary to allow two parallel-stranded C•+C interactions. Cytosine protonation is necessary to stabilize this conformation.

Conformation 2 is a regular B-DNA hairpin conformation containing C-C mismatch pairs (either protonated or nonprotonated). These pairs are clearly unstable, and the result of a minimization from the standard B-DNA conformation is to separate the two cytosine bases, as shown in the center of the stem. We believe that this instability ultimately leads to the development of nonstandard hairpin conformations for this sequence.
Conformation 3 is referred to as an extended e-motif and is based on the original e-motif structure (Gao et al., 1995). The essential element in this structure is the presence of extrahelical cytosine bases formed from cytosines that were originally present as unstable C-C mismatch pairs. In the (b) alignment of the extended e-motif, expulsion of the cytosine bases from the body of the helix results in the formation of a pseudo-5'-GC step, which provides the structure with considerable stability via G-G stacking. Such stacking is not possible in the pseudo-5'-CG step formed in the (a) alignment, and consequently we believe the extended e-motif is only plausible for the (b) alignment. On the basis of a molecular dynamics simulation (starting from a conformation obtained from the original e-motif; R. Romero, M. Mitas, and I. S. Haworth, to be published), we believe that the extrahelical cytosines, located in the minor groove, fold back toward the 5'-end of the strand and stack with a neighboring extrahelical cytosine, providing further stability. To illustrate this process, two cytosine bases, C(x) and C(y), have been labeled in the regular hairpin conformations shown for structures 2(a) and 2(b). In the extended e-motif conformations, these cytosine bases fold back such that they occupy the minor-groove region of the neighboring 5'-CG step [in structure 3(b)] and ultimately form a stacked C-C pair. Examples of this stacking interaction can be seen in the all-atom structure in the upper panel of structure 3.

i Simulation (i): Tetraplex DNA Containing Protonated C-C Pairs.

In this theoretical structure, the (CCG)n sequence can fold into tetraplexes by virtue of the formation of C•+C pairs that are arranged in parallel (Figure 2.10(c)). The (CCG)n tetraplex is composed of four strands.
In each structure one 'side' of the tetraplex has a pair of parallel strands (strands 1-3 and 2-4, Figure 2.10(a)), and the other side has a second pair of parallel strands (strands 1-4 and 2-3), with the second pair being antiparallel to the first pair. Four bases can interact to form a tetrad. Parallel C•+C pairs can then form in every third tetrad between strands 1 and 3 and between strands 2 and 4, while G-C and C-G Watson-Crick pairs can bridge strands 1 and 4 and strands 2 and 3 in the two intervening tetrads. Various alignments were constructed.

The putative (CCG)15 tetraplex contains both parallel C•+C and antiparallel G-C Watson-Crick base pairs. Hydrogen bonding schemes for these two tetrads are shown in Figure 2.10. In addition to the intra-base-pair interactions, we also propose that both the GC-containing and the C4(2+) tetrads have inter-base-pair stabilization. The arrangement of the GCGC tetrad is opposite to that observed by Leonard et al. (Leonard et al., 1995) in the sense that the interacting functional groups of the base pairs are those from the major groove side of the pair. Modeling of a GCGC tetrad containing interactions between the minor groove functionality, as proposed by Leonard et al., required a large groove between the antiparallel strands (equivalent to the major groove of a duplex DNA). In a (CCG)15 tetraplex, this would compromise the formation of the C•+C pairs. Further modeling of the GCGC tetrad alignments suggested that the structure shown in Figure 2.10(a) fitted much more effectively into the tetraplex alignment present experimentally. This arrangement contains parallel and antiparallel strands and parallel-stranded base-base interactions.
The guanine base interactions proposed by this model suggested that chemical probing with dimethyl sulfate (DMS), which reacts with the N7 atom of guanines, could provide evidence of a tetraplex structure. If a tetraplex structure is present, then the guanine bases should be unreactive to DMS. This issue will be explored in the Discussion section.

Figure 2.10. Proposed strand and base arrangement for the (CCG)15 tetraplex structure. (a) Proposed strand arrangement. Arrows indicate the 5' to 3' (tail to arrowhead) strand direction. Arrows in the opposite direction indicate an antiparallel strand arrangement. Arrows in the same direction indicate a parallel arrangement. (b) Proposed hydrogen bonding arrangements for the GCGC (or CGCG) and (c) the C4(2+) tetrads in the (b) alignment. In the GCGC tetrad, the major groove functions (O6 and N7 of guanine and NH2-4 of cytosine) interact with each other in the center of the tetrad. The inclusion of bifurcated hydrogen bonds between certain atoms in each tetrad is speculative. The alternative arrangement of the GCGC tetrad (Leonard et al., 1995), formed by rotation of the base pairs such that their minor grooves interact, was considerably less stable when placed in the context of the overall tetraplex structure. Similarly, construction of a G2C2 tetrad (in which one major groove and one minor groove interact) gave a less favorable structure.

After determination of the tetraplex arrangement, various twist angles and rise distances between the bases were sampled and minimized to determine the best structure. A structure with a helical twist of 32° and a rise of 3 Å between bases gave the lowest energy.
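As an illustration only, the twist and rise sampling described above can be sketched as a simple grid search. The energy function below (`toy_energy`) is a made-up placeholder, not the force field or the software used in this work; its minimum is placed at 32° and 3.0 Å purely so that the scan has something to find. The grid ranges and step sizes are likewise assumptions.

```python
from itertools import product


def toy_energy(twist_deg, rise_angstrom):
    """Placeholder energy surface in arbitrary units (NOT a real force field).

    The well is centered at 32 degrees / 3.0 angstroms by construction.
    """
    return 0.05 * (twist_deg - 32.0) ** 2 + 4.0 * (rise_angstrom - 3.0) ** 2


def scan_twist_rise(energy_fn, twists, rises):
    """Evaluate energy_fn on every (twist, rise) pair; return the lowest one.

    Returns a tuple (energy, twist, rise).
    """
    best = None
    for twist, rise in product(twists, rises):
        energy = energy_fn(twist, rise)
        if best is None or energy < best[0]:
            best = (energy, twist, rise)
    return best


# Hypothetical sampling grid: twist 20-44 degrees, rise 2.6-3.5 angstroms.
twists = list(range(20, 46, 2))
rises = [r / 10 for r in range(26, 36)]

energy, twist, rise = scan_twist_rise(toy_energy, twists, rises)
print(f"lowest energy at twist={twist} deg, rise={rise} A")
```

In a real calculation each grid point would be energy-minimized before comparison, as described in the text; the sketch omits that step.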
The lowest-energy arrangement retained the proposed hydrogen bonding and maintained the integrity of the sugar-phosphate backbone. Figure 2.11 shows a sample of these calculations and the structure generated.

[Figure 2.11: energy plots (axes: Energy, Twist) and the lowest-energy quadruplex structure.]

Figure 2.11. Representative graphs of the various energy surfaces generated when the twist and rise of the helix were altered. Also shown is the quadruplex structure with the lowest energy, having a twist of 32° and a rise of 3 Å between bases.

The molecular modeling predicted that ss(CCG)15+ had the potential to fold into a tetraplex, but further experimental results led us to construct another model, since ss(CCG)15+ contained no detectable C•+C base pairs at physiological pH. Recall that, in the experimental section, only the structure below pH 6.5 might correspond to a tetraplex-like structure, because C•+C base pairs were detected by CD. Since this putative tetraplex structure forms only under nonphysiological acidic conditions, it will not be discussed in further detail.

ii Simulation (ii): B-DNA Hairpin Including Intrahelical, Mismatched C-C Pairs and Intrahelical, Protonated Mismatched C-C Pairs.

Based on the experimental data, a hairpin structure with mismatched intrahelical cytosines arranged in the (b) alignment (Figure 2.9, structure 2(b)) was constructed. The stem of this hairpin is identical to a duplex structure that contains CpG base-pair steps. Simulations on hairpin DNA were performed with and without protonated cytosines, since electrophoretic mobility indicated that some of the cytosines might be protonated. A complete discussion of the results for the non-protonated hairpin in a (b) alignment will be covered in chapter 3.
The main global effect observed in these simulations was an apparent bending of the hairpin towards the major groove. This motion was primarily caused by a 'collapse' of the guanine bases towards the major groove and a simultaneous motion of the mismatched cytosine bases towards the minor groove. Similar results were obtained with both non-protonated and protonated cytosines; however, the motion of the DNA upon protonation of the mismatched C-C pairs differed somewhat from that of the unprotonated helix. A similar 'wedge' formation also developed, but this was not as pronounced. The reason for this is that the non-protonated cytosine remained stacked between the Watson-Crick pairs throughout the simulation. Conversely, the protonated cytosine of each pair (C4, C7, C10, and C13) moved completely into the minor groove over the course of the simulation. The results of the hairpin modeling were interesting, but these structures did not explain all the experimental data.

iii Simulation (iii): Extended e-motif Hairpin Based on the Coordinates of the Original e-motif.

In the last theoretical structure, which is an extension of the e-motif recently described by Gao et al. (1995), pairs of extrahelical cytosines, which are formally separated by two C-G base pairs, stack together in the minor groove (Figure 2.12A). The simulations of the d(CCG)11G hairpin, which will be discussed in more detail in chapter 3, provided evidence that this structure may adopt a conformation that includes extrahelical cytosine bases, and that this conformation may be closely related to the e-motif structure, which has an isolated d(GCC)·d(GCC) trinucleotide repeat fragment sequence that surrounds the C-C mismatch.
To examine potential structures for the extended e-motif, we first needed to construct a starting point for the simulation, based on the original isolated e-motif. The starting conformation was constructed by combining a series of e-motif structures based on the coordinates of the [d(CCGCCG)]2 DNA duplex e-motif (see Methods for details). Construction of the extended e-motif (henceforth referred to as the Ee-motif, Figure 2.12B and C) was achieved by overlaying a central C-G base pair of one e-motif onto the terminal C-G base pair of another, thereby extending the motif. Figure 2.12 shows the original e-motif and the constructed Ee-motif.

Figure 2.12. Construction of the Ee-motif. (A) The original e-motif with the extrahelical cytosines shown in gray. The view is into the minor groove. (B) The constructed Ee-motif with the sequence that was overlaid from the original e-motif highlighted. (C) The same structure as in B but rotated 90°, with all the interacting cytosines highlighted.

Construction of the Ee-motif resulted in the arrangement of extrahelical cytosines shown in Figure 2.12B and C. An extrahelical cytosine in the original e-motif, C7 for example (using the numbering in Figure 2.3, also shown in Figure 2.13), is located in the minor groove at the C5-p-G6 step. Similarly, C28, the formal 'complement' of C7, is located in the minor groove at the C26-p-G27 step. In extending the e-motif, we noticed that the arrangement of the cytosines leads to C28 being located close to C13, and C31 being located close to C10.
This provided us with an initial indication that interactions between the extrahelical cytosines (but not between cytosines that are formally paired in the original hairpin) were possible and that these could provide a source of stability for the structure. These interactions reflect a 1,7 relationship based on the original C-C mismatch pairs; that is, the interacting bases span seven steps.

In the Ee-motif hairpin, the Watson-Crick pairs essentially form a CG-alternating helix. However, the helical twist is not the same at each base-pair step (as it is in typical B-DNA, with a twist of 36°). Instead, the twist alternates between approximately 80° for the CpG steps and close to 0° for the GpC steps (the GpC step that is formed when the formal C-C pairs are extrahelical). These twist angles persist throughout the simulation. Other than the alternating twist angles, the general appearance of the Ee-motif Watson-Crick paired helix is normal. The base pair planes are essentially coplanar and the rise between the Watson-Crick pairs is similar to that in B-DNA. This is illustrated by the 0 ps structure in Figure 2.11B, in which the extrahelical cytosines have been lightened for clarity.

Our initial intention in performing the molecular dynamics simulation of the Ee-motif d(CCG)11G hairpin was to examine whether the putative interactions C28-C13 and C31-C10 might be retained. As shown in Figure 2.13A, these pairs have N3-N3 distances of about 4 Å and 5 Å, respectively, in the starting model, and could potentially lead to protonated pairs. However, during the simulation any possible interaction between these pairs was lost (Figure 2.13A).
As the simulation progressed, the DNA in the central part of the stem deformed, while base pair planes towards the termini and the loop (C14-G21, G12-C23 and G3-C32, C5-G30, respectively) remained essentially coplanar. (The two terminal base pairs frayed to some extent, and the base pair closest to the loop, G15-C20, also became unpaired to an extent.) The motion was accompanied by an extension of the hairpin and an increased rise in the center of the stem. The plane of base pair G9-C26 was tilted some 45° with respect to the planes of base pairs C14-G21 and G3-C32. The planes of C14-G21 and G3-C32 remained essentially coplanar, suggesting that the deformation was localized to the center of the helix, in the region of the C-C extrahelical pairing. This motion was clearly correlated with the shift of the extrahelical cytosines, and our original hypothesis of interactions between extrahelical cytosines was strengthened by the development of three new cytosine pairs, C13-C25, C10-C28, and C7-C31 (Figure 2.13B), as described below.

Figure 2.13B graphs the development of the C13-C25 and C7-C31 pairs by monitoring their N3-N3 distances in the simulation. We stress that none of the cytosines were protonated in this simulation. We also note that these cytosine pairs are not those formally paired in the original hairpin. The N3-N3 distance decreased to below 5 Å in each case. Furthermore, the 2-carbonyl and 4-amino functions of the paired cytosines align themselves in the simulation such that the pairing cytosines are ideally positioned to form a protonated pair. For the C10-C28 pair, however, the N3-N3 distance initially decreases, but due to the helical strain imposed by the development of the new cytosine pairs, C10 and C28 are separated by C28 flipping out of the helix (Figure 2.14B).
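The N3-N3 distance monitoring described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis code used for Figure 2.13: the frame format (a dictionary from atom label to Cartesian coordinates in Å), the atom labels, and all coordinates below are hypothetical. Only the 5 Å cutoff comes from the text.

```python
import math


def n3_distance_series(frames, atom_a, atom_b):
    """Return the distance (in angstroms) between two atoms in every frame.

    Each frame is assumed to be a dict mapping an atom label to (x, y, z).
    """
    return [math.dist(frame[atom_a], frame[atom_b]) for frame in frames]


def first_frame_below(distances, cutoff):
    """Index of the first frame whose distance is below cutoff, or None."""
    for index, distance in enumerate(distances):
        if distance < cutoff:
            return index
    return None


# Made-up coordinates for illustration only (not simulation output).
frames = [
    {"C13:N3": (0.0, 0.0, 0.0), "C25:N3": (12.0, 0.0, 0.0)},
    {"C13:N3": (0.0, 0.0, 0.0), "C25:N3": (6.0, 8.0, 0.0)},
    {"C13:N3": (0.0, 0.0, 0.0), "C25:N3": (3.0, 4.0, 0.0)},
    {"C13:N3": (0.0, 0.0, 0.0), "C25:N3": (0.0, 0.0, 4.0)},
]

series = n3_distance_series(frames, "C13:N3", "C25:N3")
contact_frame = first_frame_below(series, cutoff=5.0)
```

With a real trajectory, the same two functions would be applied to each cytosine pair of interest (for example C13-C25, C10-C28, and C7-C31) to produce one distance trace per pair.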
The simulation results are consistent with the observed P1 nuclease cleavage in the stem of the hairpin (Figure 2.8).

[Figure 2.13: plots of N3-N3 distance versus time (ps). Panel A: average distance of original interacting cytosine 1-7 "pairs" (13&28, 10&31). Panel B: average distance of new interacting cytosine 1-4 "pairs" (7&31, 13&25, 10&28). A hairpin schematic marks the mismatched cytosines 16C/C19, 13C/C22, 10C/C25, 7C/C28, and 4C/C31.]

Figure 2.13. Motion of the extrahelical cytosine bases during simulation (iii) of the Ee-motif, as determined by their N3-N3 distances. (A) The original interacting cytosine base pairs, 1-7, spanning seven base pairs. These interactions formed from the initial structure of (iii) at the beginning of the simulation. (B) The interacting cytosine bases that developed during the simulation, 1-4 interactions spanning four base pairs. Except for bases C10 and C28, the new 1-4 interactions were stable.

Figure 2.14. The Ee-motif and idealized Ee-motif. (A) The Ee-motif 0 ps starting structure, with the original 1-7 interacting bases highlighted. (B) The Ee-motif 100 ps structure, with the new 1-4 interacting bases highlighted. Notice how they stack in the groove. Also shown is the flipped-out C28. (C) The idealized Ee-motif. Highlighted are the stacked cytosines, which were observed in the initial Ee-motif at 100 ps and used as a starting structure. Notice that in this idealized structure C10-C28, which are in the center of the helix, are stacked in the groove.

iv Simulation (iv): Extended Idealized e-motif Hairpin Including C/C Stacking in the Minor Groove.

Based on these results, the idealized e-motif was constructed. Figure 2.14C shows the starting structure of the idealized Ee-motif.
The cytosine bases C13-C25, C10-C28, and C7-C31 were arranged in a stacked arrangement similar to that observed in the previous simulation. The twist, rise, and backbone of the original e-motif were analyzed with CURVES 3.1 and used as input to the BTOZ program developed in our lab. The structure was identical to the original except for the stacked cytosines in the minor groove, a 1-4 interaction. For this structure the C10-C28 bases were placed in an idealized position, similar to that observed for the other cytosines, to see whether this conformation was stable. Various simulations were performed on this structure to monitor the stability of the stacked cytosines. The cytosines were constrained and positioned to allow hydrogen bonds to develop in a stacked and parallel position. The cytosines were also protonated on one side of the helix, and then on both sides, again to allow hydrogen bonds to form. In every simulation, C13-C25 and C7-C31 remained near each other, and to reduce the helical strain imposed by these interactions, C10 and C28 lost their interaction and C28 flipped out of the helix.

2.5 Discussion

Since C-G base pairs are not stabilized by cytosine protonation, we concluded that the protonation in ss(CCG)15 was limited to the C-C mismatches. We are aware of three potential types of cytosine-cytosine interactions that are stabilized by protonation. The first and classic example is the hemiprotonated C•+C pair arranged in parallel, which was first described in 1963 (Langridge and Rich, 1963). Since C•+C base pairs have been observed at pH values as high as neutrality (Lavelle and Fresco, 1995), we initially speculated that ss(CCG)15+ might contain C•+C pairs within an intramolecular tetraplex. However, several results rule out the possibility that ss(CCG)15+, ss(CCG)18+, or ss(CCG)20+ contained C•+C pairs.
The results from CD studies provided no indication of C•+C pairs in ss(CCG)15 until the pH was reduced below pH 6.0. DMS was able to alkylate the N7s of ss(CCG)15+ (see DMS lanes in Figures 2.6, 2.7, and 2.8), a result that was not consistent with a tetraplex containing CGCG tetrads. Also, the results of the UV absorbance melting profile performed at pH 7.5 indicated a single structural transition of ss(CCG)15+ (Figure 3C). This result was most consistent with a hairpin-to-random-coil transition, and not a tetraplex-to-hairpin-to-random-coil transition. The results provided no evidence for an intermediate structure.

The second type of protonated C-C pair is one arranged in an antiparallel orientation (Brown et al., 1990). Experimental evidence ruled out this orientation. The high HA reactivity of the cytosines 3' to the nearest guanine in ss(CCG)15+ (Figure 2.6) suggests a lack of H-bonds to N4 of these cytosines. Also, an antiparallel C•+C pair is similar in structure to the T·T pair observed in d(CTG)n hairpins (Smith et al., 1995; Mariappan et al., 1996b); both base pairs are pyrimidine-pyrimidine mismatches that involve H-bonds formed between N3 and O2. The results of P1 nuclease studies performed with ss(CCG)15+ (Figure 2.8) and ss(CCG)18,20+ (data not shown) are not consistent with a helix that contains pyrimidine-pyrimidine base pairs such as those observed in ss(CTG)15 or ss(GTC)15 (Mitas et al., 1995a; Yu et al., 1995a,b).

pH-dependent Stabilization of ss(CCG)15 Is Not Due to the Formation of Parallel or Antiparallel C•+C Pairs. Interestingly, although spectroscopic data indicated further protonation of ss(CCG)15 between pH 7.5 and 7.0, electrophoretic data did not (Figure 2.4A). We suspect that this result arose because the Mrel of ss(CCG)15 is governed by charge as well as structure.
Without any structural change, protonation of ss(CCG)15 alone would result in a decrease in its Mrel and not an increase. Therefore, the increase in the Mrel of ss(CCG)15 at pH 7.7 relative to pH 7.9 must have been due to a large structural change. Between pH 7.7 and 7.0, structural changes were probably offset by the addition of positive charges such that there was no net gain in the Mrel of ss(CCG)15.

C-C+ Pairs Stacked in the Minor Groove: A New DNA Structure. A third type of protonated C-C pair (C-C+) is theoretical in nature and results from extension of the e-motif structure (Gao et al., 1995). The C-C+ pairs in this extended e-motif are not base pairs per se, since they are not stabilized by H-bonds and do not lie in the same plane. Rather, the stability of the pairs arises from stacking interactions within the minor groove of a rather distorted helix (Figure 2.14C). The experimental data described in this chapter are completely consistent with this new DNA structure, in which the interacting cytosines are formally separated by two C-G base pairs. At pH > 7.9, the mismatched cytosines might be pushed toward the minor groove but do not fully develop into C-C+ stack pairs. The conformation between pH 7.9 and 7.0 will be explored further in the next chapter.

A striking feature of ss(CCG)15+ was the flipping of a single cytosine away from the central portion of the hairpin stem (Figures 2.8 and 2.14B).
In the absence of the other data provided in this report, it could appear that the flipped cytosine (that is, the P1 nuclease cleavage at this site) provided evidence for a tetraplex structure, which must contain a fold in the middle of the hairpin stem. However, several lines of evidence, including those described above, suggest that the flipped cytosine in ss(CCG)15+ was a manifestation of the stress imposed upon the sugar-phosphate backbone by the extrahelical C-C stack pairs, and not due to a tetraplex arrangement.

First, significant cleavage of the G41-C42 phosphodiester of ss(CCG)15 was achieved at pH 8.5 by lowering the temperature of the reaction to 1 or 10°C (data not shown). Minor cleavage of the G41-C42 phosphodiester was also observed at pH 8.5 at 25°C (Figure 2.7). This result suggested that the central portion of the helix was significantly deformed in the absence of cytosine protonation, a result that was not consistent with a tetraplex structure. One interpretation is that the reduction in temperature allowed the extrahelical cytosines within a stack pair to move closer toward one another, thereby adding strain to the sugar-phosphate backbone. We also note that the UV absorbance melting profile conducted at pH 8.5 (Figure 2.4C) indicated a minor structural perturbation at ~25°C, perhaps indicating reorientation of the mispaired cytosines at this temperature.

Second, the molecular dynamics simulations of the d(CCG)11G hairpin starting from a conformation of an extended e-motif predict a flipping of the central cytosine away from the core of the helix.
Third, although evidence for a flipped cytosine was not found in the central portion of the hairpin stem of ss(CCG)20+ (data not shown), a DNA sequence predicted to form a more stable tetraplex than ss(CCG)15+, there was significant cleavage of the phosphodiesters of a nucleotide involved in a C-C mismatch located away from the center of the helix stem at pH 7.5. The fact that cleavage of ss(CCG)20 was not located in the center of the helix stem suggests that nuclease cleavage results from a structural deformation (such as relief of helical strain imposed by the C-C+ mismatches) that was not a tetraplex. It is possible that the reason cleavages of the C62 phosphodiesters (in ss(CCG)20+) were not as pronounced as those of the C42 phosphodiesters (in ss(CCG)15+) was that the additional helical length of ss(CCG)20 allowed for greater dissipation of the backbone strain imposed by the C-C+ stack pairs. The amount of helical strain may be inversely related to the pKa of the sequence. In support of this possibility, the pKas of ss(CCG)18 and ss(CCG)20 were ~8.2 and ~8.4. To our knowledge, these are the highest values ever reported for the pKa of a cytosine.

2.5.1 Significance and Relevance of a Hairpin Folded in a (b) Alignment to Fragile X

The (b) alignment hairpin, being stable, may be more easily recognized by the human MTase. Previous in vitro studies have provided evidence that the human MTase is capable of methylating hairpins formed from (CCG)15 (Smith et al., 1994) and (GCC)5-7,11 (Chen et al., 1995), with the rates of methylation of these sequences increasing as the number of repeats increased. Previous 1H NMR studies have shown that (GCC)5-7 fold into (a) alignment hairpins (Chen et al., 1995), while the present study has shown that (CCG)15,18,20 fold into (b) alignment hairpins.
It is possible that the high rate of in vitro methylation of d(CCG)15 observed in the previous study might be due to one of two factors. First, d(CCG)15 may exist in solution in a number of conformations, of which the (b) alignment hairpin is the predominant form. The (b) alignment hairpin might be ignored by the human MTase, while a minor population of (a) alignment hairpins might be rapidly methylated as a consequence of the increased length of the hairpin stem. In support of this possibility, it has been shown that optimal rates of methylation by the human MTase are achieved when DNA fragment sizes are at least 20 bp (Laayoun and Smith, 1995), which is the approximate helix length of d(CCG)15. The second and more intriguing possibility is that, due to the distorted nature of the sugar-phosphate backbone, the (b) alignment hairpin may be an excellent substrate for the MTase. In support of this possibility, cytosines within extrahelical CpG dinucleotides (i.e., an extremely distorted helix) are rapidly methylated by the human MTase (Laayoun and Smith, 1995).

Potential Relationship Between (b) Alignment Hairpins, Slipped Structures, and Fragile Sites. Recently, Pearson and Sinden (1996) provided evidence that reduplexing of d(CCG)n·d(CGG)n repeats resulted in the formation of alternative secondary structures, which were best understood as slipped structures stabilized by single-stranded hairpins. These structures may play a role not only in expansion events but also in the genesis of chromosomal fragile sites, which appear as gaps in metaphase chromosomes and can be induced under a variety of culture conditions (Sutherland and Hecht, 1985). Those fragile sites induced by folic acid deprivation have been localized to d(CGG)n·d(CCG)n sequences (Ashley and Warren, 1995) methylated at CpG dinucleotides (Hornstra et al., 1993).
It is not known whether methylation plays a direct role by inducing the formation of unusual DNA structures or plays an indirect role by slowing DNA replication. Prior to the isolation of the FMR-1 gene, Sutherland et al. (1985) hypothesized that a folate-sensitive fragile site was an amplified polypurine/polypyrimidine tract which did not package during mitosis. Failure of the DNA to package could be due to the presence of single-strand DNA gaps (Ledbetter et al., 1986) or some type of DNA structure that prevented packaging, or both. In some cases, the chromosome physically breaks at the fragile site (Sutherland, 1983), suggesting that d(CGG)n·d(CCG)n sequences might be cleaved by an endonuclease. Again, it is possible that our proposed structure, due to its distorted backbone, might also be a substrate for endonucleases. Further studies are required to determine whether folate-sensitive fragile sites result from hairpin structures in the (b) alignment, result from tetraplex structures formed by d(CGG)n (Fry and Loeb, 1994; Mitas et al., 1995a; Usdin and Woodford, 1995; Kettani et al., 1995), or result from other factors such as nucleosome exclusion (Wang and Griffith, 1996).

2.6 Summary

In this chapter, the structural properties of oligonucleotides containing 11, 15, 18, or 20 CCG repeats, the lengths of which are not amenable to high-resolution NMR spectroscopy, were discussed. Experimental evidence by Yu et al. indicated that the hairpin structure formed by d(CCG)n≥15 is the only one among class I triplet repeat sequences that contains a helix which can be cleaved by an endonuclease (Figures 2.7 and 2.8). The results of pH-dependent electrophoretic studies (Figure 2.4A) provided strong evidence that the structures of ss(CCG)15 and ss(CmCG)15 were similar, suggesting that the helix of a methylated (b) alignment hairpin is also distorted.
Since it is apparent that the number of repeats can influence the conformation of a (CCG)n hairpin, we chose to examine the conformation of such longer hairpins using a molecular dynamics simulation approach. It was determined experimentally that all the sequences exclusively form hairpins in a (b) alignment; however, the hairpins exhibit unusual features: a population of the mismatched cytosines appears to be protonated at a relatively high pH, and the sugar-phosphate backbones are highly distorted.

In theory, ss(CCG)15,18,20 could have adopted the alternative (a) alignment, whereby the cytosines 5' to the nearest guanine were mismatched with one another. The (a) alignment contains GpC base-pair steps, whereas the (b) alignment contains CpG base-pair steps. The stacking energies of GpC and CpG base-pair steps have been calculated to be -14.59 and -9.69 kcal/mol, respectively (Saenger, 1984). Therefore, if one assumes that the interactions between the mismatched Cs in (a) or (b) alignments are identical, one would expect an (a) alignment to be more stable than a (b) alignment. The failure of Yu et al. to detect an (a) alignment of ss(CCG)15,18,20 suggests that when the number of repeats is ≥15, the interactions of the mismatched Cs in a (b) alignment are more stable than those of the mismatched Cs in an (a) alignment. It is also possible that the extrahelical cytosines create a pseudo GpC step which is energetically favored. When the number of repeats is small, the loop structure and/or end effects may favor an (a) alignment, as previously observed in 1H NMR studies for the sequences d(GCC)5-7 (Chen et al., 1995; Mariappan et al., 1996a).
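The stacking-energy argument above can be made concrete with a short calculation. This is a minimal sketch, not part of the original analysis: it uses only the two step energies quoted from Saenger (1984), and the number of steps passed to the function is a hypothetical input, since the text does not state how many such steps a given hairpin stem contains. It also ignores the mismatch interactions themselves, exactly as the "identical interactions" assumption in the text does.

```python
# Stacking energies quoted in the text (Saenger, 1984), in kcal/mol.
E_GPC = -14.59  # GpC base-pair step, present in the (a) alignment
E_CPG = -9.69   # CpG base-pair step, present in the (b) alignment


def stacking_preference(n_steps: int) -> float:
    """Energy of n GpC steps minus n CpG steps, in kcal/mol.

    A negative value means the GpC-containing (a) alignment is favored
    on stacking grounds alone. n_steps is an illustrative parameter.
    """
    return round(n_steps * (E_GPC - E_CPG), 2)


if __name__ == "__main__":
    for n in (1, 4):
        print(f"{n} step(s): {stacking_preference(n):.2f} kcal/mol")
```

On these numbers each step favors GpC by 4.90 kcal/mol, which is why the observed preference for the (b) alignment implies a compensating contribution from the mismatched cytosines.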
However, as the number of repeats increases, the preferred alignment becomes (b), because the cytosines have a greater tendency to become extrahelical, creating an e-motif-like structure with stable GpC steps. The results described in this chapter are entirely consistent with the experimental data and with a new DNA structure containing multiple pairs of extrahelical cytosines. However, there are still questions about the conformation of the mismatched cytosines. It is contradictory that a short duplex like the e-motif [d(CCG)2]2 has extrahelical cytosines while short hairpins of ss(CCG)5 have been reported to have none, and fold in an (a) alignment. Therefore, we reasoned that if short repeats could be arranged in a (b) alignment, perhaps the e-motif conformation could exist. The next chapter continues the computer modeling, while chapter 4 discusses the results of chemical probing on a d(GCC)1 e-motif-like fragment.

Chapter 3
Molecular Dynamics Simulations of DNA Molecules Containing d(GCC)n·d(GCC)n Fragments and C-C Mismatch Pairs

3.1 Overview of Chapter

Fragile X syndrome is characterized by expansion of genomic trinucleotide repeat DNA of the sequence d(CGG)n·d(CCG)n. Hairpins formed by both the d(CGG)n and d(CCG)n single strands (containing G-G and C-C mismatch pairs, respectively) have been implicated in the disease state. For the d(CCG)n strand, the hairpin can adopt two alignments in which either underlined cytosine (CC̲G) or (C̲CG) of the repeat is involved in a C-C anti-parallel mismatch pair. We have previously shown (Yu et al., Biochemistry, 1997, 36, 3687) that the latter of these alignments forms for d(CCG)15. In this molecule, the hairpin stem comprises a series of d(GC̲C)·d(GC̲C) trinucleotide repeat helical fragments.
In this chapter, we examine the conformational mobility of this fragment using molecular dynamics simulations. We contrast the conformation of the fragment when centrally placed in a random sequence DNA duplex (using simulations of 9mer and 13mer duplexes, d(AGAGC̲CTCG)·d(CGAGC̲CTCT) and d(TCAGAGC̲CTCGTT)·d(AACGAGC̲CTCTGA), respectively) with the mobility of the fragment in a hairpin conformation formed from the single strand d(CCG)11G. This hairpin has a stem containing four d(GC̲C)·d(GC̲C) helical fragments. In 220 ps simulations of the 9mer and 13mer duplexes, the cytosine bases of the mismatch pair remain stacked within the helix. In contrast, in the hairpin conformation the cytosine bases of the mismatch pairs of the central two d(GC̲C)·d(GC̲C) fragments move into the minor groove. The flanking G-C base pairs of each fragment retain three strong hydrogen bonds, but move towards the major groove. The guanine bases of these pairs closely approach each other, and a structure develops that has the characteristics of e-motif DNA (Gao et al., J. Am. Chem. Soc. 1995, 117, 8883). We interpret this structural motion in terms of the initial experimental data discussed in chapter 2, which also suggest that, at high pH, the mismatched cytosine bases of the d(CCG)n hairpin may be extrahelical.

3.2 Introduction

To continue to explore the conformation of the mismatched cytosines and to determine whether their position could be altered by neighboring sequences, computer simulations were performed to examine the conformation of the mismatched cytosines in a random sequence containing one d(GCC)1 repeat and in d(CCG)11G. The overall result from these simulations indicated that, with one repeat, the cytosines remained essentially within the helix, with less motion surrounding the mismatched region (a more normal B-DNA structure).
For the structure with more than one repeat, more motion was observed, with a tendency for the cytosines to move into the minor groove due to the collapse of the major groove. In the introduction it was noted that many single-stranded DNA molecules corresponding to trinucleotide repeat sequences implicated in human disease have been shown to form stable secondary structures and hairpin DNA conformations (Gacy et al., 1995; Pearson et al., 1996; Mitchell et al., 1995; Mitas et al., 1995a; Yu et al., 1995b; Mitas et al., 1995b; Mariappan et al., 1996b; Chen et al., 1995; Mariappan et al., 1996a; Yu et al., 1997). Of all the trinucleotide repeat sequences, the CCG repeat may be the most conformationally flexible. As we and others have previously described, the CCG repeat hairpin can exist in two alternative alignments, in which either of the underlined cytosines, (CC̲G) or (C̲CG), can form the C-C mismatch pair (Chen et al., 1995; Mariappan et al., 1996a; Yu et al., 1997). In chapter 2 we referred to these alignments as the (a) and (b) alignment, respectively (Yu et al., 1997). Figure 3.1 shows further schematics of these alignments: in the (a) alignment hairpin the stem comprises helical repeat fragments of sequence d(CC̲G)·d(CC̲G), while in the (b) alignment the stem contains d(GC̲C)·d(GC̲C) repeats. Both of these fragments contain a central C-C mismatch pair. Throughout this chapter the d(GC̲C)·d(GC̲C) nomenclature is used, with an underlined C denoting a cytosine in a mismatch pair.

[Figure 3.1: schematic hairpins in the (a) and (b) alignments, each drawn for odd and even n, with the repeat units r[ ] bracketed]

Figure 3.1.
Possible hairpin alignments of d(CCG)n, for molecules where n, the number of CCG repeats, is odd or even. The brackets r[ ] indicate a particular unit of repeat alignment, r, in which the mismatched cytosines are in the center. In the (a) alignment hairpins, the cytosine bases 5' to the guanine of each r[CC̲G] repeat are involved in formal C-C mismatch pairs, while in the (b) alignment the cytosine bases 3' to the guanine of each r[GC̲C] repeat form the mismatch pairs. The two hairpin stems (a) and (b) then contain d(CC̲G)·d(CC̲G) and d(GC̲C)·d(GC̲C) repeat fragments, respectively.

As stated in chapter 2, NMR spectroscopy has shown that both single-stranded ss(GCC)5 and ss(GCC)6 (alternatively written as ss[G(CCG)4CC] and ss[G(CCG)5CC]) form stable hairpins in the (a) alignment (Chen et al., 1995). In contrast, based on the hydroxylamine reactivity of specific cytosine bases discussed in chapter 2, we showed that a longer hairpin, ss(CCG)15, preferentially adopts a (b) alignment (Yu et al., 1997). This apparent switch in alignments caused by an increase in the trinucleotide repeat number is in contrast to the (CGG)n sequence, in which two alignments (defined by the specific guanine bases involved in the G-G mismatch pair) are also possible (Mitas et al., 1995a). For (CGG)n, all structures examined have exhibited the alignment having the 3' guanine (CGG̲) bases forming the mismatch pair, regardless of sequence length (Mitas et al., 1995a; Mariappan et al., 1996a). The characterization of a novel DNA conformation, observed for the [d(CCGCCG)]2 DNA duplex and referred to as an e-motif (Gao et al., 1995), discussed in chapter 2, provided a further possibility for the conformation of the (b) alignment d(CCG)n hairpin.
At first sight of the sequence, one might expect this unique duplex to have an (a) alignment with CCG repeats (four Watson-Crick C-G pairs and two C-C mismatch pairs). However, Gao et al. (1995) have shown, using NMR spectroscopy, that the favored conformation has the 5' cytosine of each strand overhanging the other strand, thereby causing a shift to the (b) alignment in which the C4 bases of the two strands would be formally mismatch paired with each other. This provides a central d(GC̲C)·d(GC̲C) fragment (where C̲ is C4 of each strand). Most interestingly of all, the mismatched cytosine bases actually adopt an extrahelical conformation in the minor groove, and the 'core' helix with the cytosines removed comprises a [d(CGCG)]2 structure, with the central base pair step being a pseudo 5'-GpC step. The stacking energy associated with such a step is probably the main energetic driving force for adoption of the conformation. Hence, the e-motif is much more likely to occur in the (b) alignment for the d(CCG)n sequence, since the equivalent displacement of the mismatched cytosine bases in an (a) alignment would result in the formation of a pseudo-5'-CpG step. 5'-CpG steps in normal Watson-Crick helices are less stable than 5'-GpC steps.

In chapter 2 the experimental data showed that the (CCG)15 hairpin exhibits a pH dependence in its conformational flexibility (Yu et al., 1997). This is manifested in an increase in electrophoretic mobility at pH < 7.9. Although not completely definitive, this and other behavior has been interpreted as a transition from an unstable, unprotonated hairpin structure at high pH to a more stable low pH structure in which some cytosine protonation (at N3) may have occurred (Yu et al., 1997).
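The pH-dependent transition described above can be illustrated with the Henderson-Hasselbalch relation. The sketch below is an idealization rather than the analysis used in this work: it treats a mismatched cytosine N3 as a single independent titratable site with an apparent pKa, and the pKa passed in is a free, hypothetical parameter, not a value taken from the electrophoretic data.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a monoprotic site that is protonated at a given pH.

    Henderson-Hasselbalch: [BH+] / ([B] + [BH+]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    apparent_pka = 8.0  # hypothetical value, for illustration only
    for ph in (7.0, 7.5, 8.0, 8.5, 9.0):
        frac = fraction_protonated(ph, apparent_pka)
        print(f"pH {ph:.1f}: fraction protonated = {frac:.2f}")
```

Under this simple model, half of the sites are protonated when the pH equals the apparent pKa, and the protonated fraction rises steeply as the pH falls below it, which is the qualitative behavior inferred for the hairpin.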
Based on NMR spectroscopy, it appears that the formally mismatched C-C pair in extended runs of CCG repeats interchanges between an extrahelical location of the cytosine bases and an intrahelical C-C pair that may be protonated (Zheng et al., 1996). Given the conformational flexibility apparent for d(CCG)n DNA molecules in general, and for the d(GC̲C)·d(GC̲C) fragment in particular, we have used molecular dynamics calculations to characterize this behavior further. Simulations of the d(GC̲C)·d(GC̲C) fragment in the context of an otherwise random sequence duplex suggest that the mismatched cytosine bases remain intrahelical and stacked with the flanking G-C and C-G pairs. In contrast, for a hairpin structure having contiguous d(GC̲C)·d(GC̲C) fragments, we find that the cytosine bases of the mismatch pairs tend to become extrahelical and move into the minor groove. At the same time, the flanking guanine bases approach each other, and a developing guanine-guanine stacking interaction is apparent.

3.3 Methods

Molecular dynamics simulations were performed on the DNA duplexes d(AGAGC̲CTCG)·d(CGAGC̲CTCT) and d(TCAGAGC̲CTCGTT)·d(AACGAGC̲CTCTGA), both of which contain a central d(GC̲C)·d(GC̲C) fragment, and on a (b) alignment hairpin conformation of the DNA molecule d(CCG)11G. The additional 3' guanine base was included in the hairpin as a complement of the 5' cytosine base to stabilize the terminus of the hairpin (see Figure 3.2). All calculations were performed using the AMBER4.0.1 force field (Pearlman et al., 1991; Weiner et al., 1986), implemented on a Silicon Graphics Indigo workstation. Standard AMBER parameters and charges were applied to the DNA bases. Simulations were performed using a canonical B-DNA starting conformation for the two duplexes.
The stem of the d(CCG)11G hairpin shown in Figure 3.2 (bases C1 to G15 and C20 to G34) was similarly constructed in a regular B-DNA conformation using the Quanta 4.0.1 package. For the duplexes and the hairpin stem, the cytosine bases forming C-C mismatch pairs were included in their 'normal' location in a C-G pair, and no attempt was made to form hydrogen bonds between the two cytosines of the mismatch. A four base fragment (C16 to C19) comprising the loop region of the hairpin was manually added to the stem helix, and the loop fragment was then relaxed using energy minimization with the stem frozen. Several different starting points were used for the loop location, and the lowest energy of these following minimization was chosen as the starting structure for molecular dynamics. In all the calculations the DNA was solvated in a periodic box of TIP3P water molecules, with a minimum distance of 8.0 Å from the DNA to the box edge. Sodium counterions equivalent to 2/3 neutralization of the DNA charge were added at the positions of most negative potential, and then the whole DNA/counterion system was resolvated in a larger periodic box. The molecular dynamics was performed in the nPT ensemble at 298 K, using a time step of 0.002 ps, an 8.0 Å non-bonded cut-off, a dielectric constant of 1, and SHAKE bond length constraints. Each simulation was performed for 220 ps, including 40 ps of equilibration of solvent only and 10 ps of linear heating (from 0 to 298 K, following the solvent equilibration). Coordinates were saved every 0.4 ps for subsequent analysis.
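The simulation protocol above implies some simple bookkeeping, sketched here for orientation. The times are those stated in the text; the assumption that coordinates were saved at the stated interval across the whole 220 ps run (rather than only after equilibration) is ours, and the function and key names are illustrative.

```python
# All times in femtoseconds so the arithmetic stays in exact integers.
TOTAL_FS = 220_000          # 220 ps total simulation length
TIME_STEP_FS = 2            # 0.002 ps integration time step
SAVE_INTERVAL_FS = 400      # coordinates saved every 0.4 ps
SOLVENT_EQUIL_FS = 40_000   # 40 ps of solvent-only equilibration
HEATING_FS = 10_000         # 10 ps of linear heating, 0 to 298 K


def md_bookkeeping() -> dict:
    """Derive step and frame counts from the stated protocol."""
    return {
        "total_steps": TOTAL_FS // TIME_STEP_FS,
        "steps_between_saves": SAVE_INTERVAL_FS // TIME_STEP_FS,
        # Assumes frames were written over the entire run.
        "saved_frames": TOTAL_FS // SAVE_INTERVAL_FS,
        "ps_after_solvent_equilibration": (TOTAL_FS - SOLVENT_EQUIL_FS) // 1000,
        "ps_after_heating": (TOTAL_FS - SOLVENT_EQUIL_FS - HEATING_FS) // 1000,
    }


if __name__ == "__main__":
    for key, value in md_bookkeeping().items():
        print(f"{key}: {value}")
```

The 180 ps that remain after the solvent equilibration match the 0 to 180 ps time range used for the snapshots and distance plots in this chapter.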
3.4 Results and Discussion

The essential result that emerged from the molecular dynamics simulations was that the conformational mobility of the d(GC̲C)·d(GC̲C) fragment when placed in a run of several of these fragments is much greater than that of the same fragment when isolated in a random sequence duplex DNA. To illustrate this, we show snapshots of the structures of d(GC̲C)·d(GC̲C) fragments generated in the simulations of the 13mer duplex (Figure 3.3(a)), and contrast these with similar snapshots of the d(G9C̲10C11)·d(G24C̲25C26) fragment (see Figure 3.2 for numbering) from the simulation of the d(CCG)11G hairpin (Figure 3.3(b)).

[Figure 3.2: schematic of the hairpin stem and loop, with the C-C mismatch pairs labeled C4·C31, C7·C28, C10·C25, and C13·C22]

Figure 3.2. Hairpin alignment of d(CCG)11G used in the molecular dynamics simulation. This alignment corresponds to a (b) alignment (see text and Figure 3.1).

[Figure 3.3: structure snapshots in panels (a) and (b) at 0 ps, 60 ps, 120 ps, and 180 ps]

Figure 3.3. Snapshots of (a) the d(GC̲C)·d(GC̲C) fragment of the 13mer duplex and (b) the d(G9C̲10C11)·d(G24C̲25C26) fragment of the d(CCG)11G hairpin after 0 ps (canonical B-DNA conformation, identical in each simulation), 60 ps, 120 ps, and 180 ps. The time is taken from the end of the 40 ps water equilibration. Each structure is viewed with the major groove on the left and the C-C mismatch pair centrally positioned. In (a) no substantial change from the B-DNA conformation occurs over the simulation, but in (b) the development of a wedge formed by the flanking G-C and C-G pairs is apparent, with the concomitant movement of the mismatched cytosine bases into the minor groove.
For the isolated d(GC̲C)·d(GC̲C) fragment in the 13mer duplex there is no significant motion away from the initial B-DNA starting conformation (Figure 3.3), although some widening of the major groove does occur (see below). Similar behavior was observed in the 9mer simulation. In contrast, for the d(G9C̲10C11)·d(G24C̲25C26) fragment of the hairpin, there is a large conformational change. The main global conformational effect is a local bending of the hairpin towards the major groove. This is primarily caused by a 'collapse' of the G-C pairs (G9·C26 and G24·C11) towards the major groove and a simultaneous motion of the mismatched cytosine bases (C̲10 and C̲25) towards the minor groove. Ultimately, a 'wedge' is formed from the planes of the G-C and C-G pairs flanking the cytosine mismatch, with the narrow end towards the major groove. This is exemplified in Figure 3.3 by the structure formed by the d(G9C̲10C11)·d(G24C̲25C26) trinucleotide fragment. This behavior also occurs for the other internal fragment in the hairpin stem, d(G6C̲7C8)·d(G27C̲28C29). We stress that, despite the considerable motion of the G-C and C-G pairs, the integrity of the hydrogen bonds in each pair is not compromised. Figure 3.4 shows the structures of the d(GC̲C)·d(GC̲C) fragment of the 13mer and of the d(G9C̲10C11)·d(G24C̲25C26) fragment of the hairpin after completion of the molecular dynamics simulation. In the latter it is apparent that the wedge conformation (Figure 3.3) formed by the G-C and C-G pairs flanking the C-C mismatch is accompanied by a tendency for the two guanine bases to develop a stacking interaction in the major groove (Figure 3.4).
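The residue numbering of Figure 3.2 used in the discussion above follows directly from folding d(CCG)11G so that residue i pairs with residue 35 - i, with four unpaired loop residues. The following sketch reproduces that numbering; the function names are illustrative and the code encodes only the pairing register, not any structural detail.

```python
def hairpin_stem_pairs(n_repeats: int = 11, loop_len: int = 4):
    """List stem pairs (i, base_i, j, base_j) for d(CCG)nG folded as a hairpin.

    Residues are numbered from 1 at the 5' end. Residue i is paired with
    residue (length + 1 - i); the central loop_len residues stay unpaired.
    """
    seq = "CCG" * n_repeats + "G"
    length = len(seq)
    stem_len = (length - loop_len) // 2
    return [
        (i, seq[i - 1], length + 1 - i, seq[length - i])
        for i in range(1, stem_len + 1)
    ]


def mismatches(pairs):
    """Return the (i, j) positions at which both residues are cytosine."""
    return [(i, j) for i, base_i, j, base_j in pairs if base_i == base_j == "C"]


if __name__ == "__main__":
    pairs = hairpin_stem_pairs()
    for i, base_i, j, base_j in pairs:
        tag = "  <- C-C mismatch" if base_i == base_j == "C" else ""
        print(f"{base_i}{i}-{base_j}{j}{tag}")
```

With the default arguments this gives a 15 base pair stem in which every third pair is a C-C mismatch, and the two internal fragments discussed in the text are those centered on C10·C25 and C7·C28.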
Although the guanine bases do not reach co-planarity over the period of the simulation, it is possible that a fully stacked guanine-guanine interaction (as in the e-motif (Gao et al., 1995)) could emerge from the conformations shown in Figures 3.3 and 3.4. The motion of the internal d(GC̲C)·d(GC̲C) fragments in the hairpin is in contrast to the isolated d(GC̲C)·d(GC̲C) repeat, in which the guanines remain essentially in their initial positions (Figure 3.4(a)).

[Figure 3.4: structures in panels (a) and (b), each shown alongside the 0 ps reference conformation]

Figure 3.4. Structures of (a) the d(GC̲C)·d(GC̲C) fragment of the 13mer duplex and (b) the d(G9C̲10C11)·d(G24C̲25C26) fragment of the d(CCG)11G hairpin after 180 ps of molecular dynamics, viewed looking into the major groove, and contrasted with a canonical B-DNA conformation (0 ps). Some widening of the major groove is apparent in (a), but a much greater conformational change has occurred in (b). Here the two guanine bases are developing a stacking interaction in the major groove. The mismatched cytosine bases are visible behind the guanine bases.

An analysis of the N7 to N7 distance for the two guanines in the two internal repeats of the d(CCG)11G hairpin is shown in Figure 3.5, and provides some quantification of the conformational changes described above. For the internal d(GC̲C)·d(GC̲C) fragments of the hairpin, there is a considerable reduction in the N7 to N7 distance (from about 8 Å in a canonical B-DNA conformation to about 5 Å). There is a corresponding increase (to about 10 Å) in this distance in the d(GC̲C)·d(GC̲C) fragment in both the 9mer and 13mer duplexes, which may be a consequence of the instability of the C-C mismatch causing a slight widening of the major groove. However, as can be seen in Figures 3.3 and 3.4, this does not cause a significant global conformational change for an isolated d(GC̲C)·d(GC̲C) fragment.
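The N7 to N7 analysis described above amounts to computing one interatomic distance per saved frame and then averaging traces from equivalent fragments. A minimal sketch of that calculation follows; it is not the analysis code used in this work, the input layout is an assumption, and the coordinates in the example are invented purely to exercise the functions.

```python
import math


def n7_n7_distances(frames):
    """Return the N7-N7 distance (in angstroms) for each trajectory frame.

    frames is a sequence of (n7_a, n7_b) pairs, where each element is an
    (x, y, z) tuple giving one guanine N7 position in angstroms.
    """
    return [math.dist(n7_a, n7_b) for n7_a, n7_b in frames]


def average_traces(trace_1, trace_2):
    """Average two distance traces frame by frame (equal lengths assumed)."""
    return [(d1 + d2) / 2.0 for d1, d2 in zip(trace_1, trace_2)]


if __name__ == "__main__":
    # Hypothetical coordinates for two fragments over two frames.
    fragment_1 = [((0.0, 0.0, 0.0), (8.0, 0.0, 0.0)),
                  ((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))]
    fragment_2 = [((1.0, 1.0, 1.0), (1.0, 9.0, 1.0)),
                  ((1.0, 1.0, 1.0), (1.0, 1.0, 6.0))]
    averaged = average_traces(n7_n7_distances(fragment_1),
                              n7_n7_distances(fragment_2))
    print(averaged)
```

Averaging over the two internal fragments, as in trace (b) of Figure 3.5, smooths frame-to-frame fluctuations while preserving the overall trend in the distance.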
[Figure 3.5: plot of the N7 to N7 distance (Å) against time (0 to 180 ps), with traces (a) and (b)]

Figure 3.5. Guanine-guanine N7 to N7 distances in d(GC̲C)·d(GC̲C) fragments of the 9mer and 13mer duplexes and the d(CCG)11G hairpin. (a) shows averaged data from the 9mer and 13mer simulations, and (b) shows averaged data from the two internal fragments of the d(CCG)11G hairpin [d(G9C̲10C11)·d(G24C̲25C26) and d(G6C̲7C8)·d(G27C̲28C29)].

The above simulations suggest that the d(CCG)11G hairpin may adopt a conformation which includes extrahelical cytosine bases, and that this is a consequence of having multiple d(GC̲C)·d(GC̲C) repeat fragments, because the same behavior is not observed for an isolated fragment. The motion of the guanine bases of the internal fragments of the hairpin is particularly interesting. The tendency for stacking in the major groove is very reminiscent of the e-motif conformation (Gao et al., 1995), and it is possible that the extended, unprotonated (b) alignment hairpin could form a structure with a pseudo-GC repeat core helix and exclusion of the formally mismatched cytosine bases into the minor groove. Simulations of this conformation suggest that some interaction could occur between extrahelical cytosine bases from different mismatch pairs (chapter 2). The observation that the (b) alignment d(CCG)n repeat hairpin could include extrahelical, unprotonated cytosine bases, and the recent suggestion by Zheng et al.
(Zheng et al., 1996) of a conformation of a d(CCG)3·d(CCG)3 repeat duplex containing intrahelical, protonated cytosine bases, may provide an explanation for the anomalous pH-dependent electrophoretic behavior of d(CCG)15 (Yu et al., 1997). It is possible that the hairpin containing extrahelical cytosine pairs may be less compact than the hairpin containing intrahelical pairs, and would therefore have reduced electrophoretic mobility. We showed in chapter 2 that the mobility of the d(CCG)15 hairpin at high pH is slower than that at lower pH. As discussed in the introduction, the tendency to form a (b) alignment hairpin may be a function of the number of CCG repeats, and the conformational properties of such a hairpin, including the pH effect, may also be subtly dependent on the number of repeats.

3.5 Summary

In this chapter we have shown that neighboring sequences may affect the conformation of the mismatched cytosines. In a d(CCG)11G hairpin in a (b) alignment, which has four repeating d(GCC)4 duplex segments in the stem, the guanines can wedge together and push the cytosines out into the minor groove. In contrast, one repeating duplex segment, d(GCC)1, within a random sequence does not have the same mobility, and the cytosines remain essentially intrahelical. In the chapters that follow, we continue with chemical probing in an attempt to clarify the conformation of the d(GCC)n fragment, and to determine whether this kind of DNA conformational mobility could exist in the Fragile X repeating sequence. However, as will be explained in chapter 4, the attempt to study these issues led to the discovery of a new reaction of the repeats of Fragile X DNA. As a note, the simulation on the isolated d(GC̲C)·d(GC̲C) repeat was not performed until the experiments in chapter 4 were initiated. The motivation for probing the repeat fragments, which will be discussed further in chapter 4, came from the simulations of the multiple repeats.
The simulations for the isolated and multiple repeats are presented together in this chapter to show the contrast in structure, which is dependent upon the neighboring base pairs. It is to be stressed that these simulations represent only a sample of the possible conformations that these structures can adopt, and provide us with models to test experimentally.

Chapter 4
Anomalous Crosslinking by Mechlorethamine of DNA Duplexes Containing C-C Mismatch Pairs

4.1 Overview of Chapter

Nitrogen mustards such as mechlorethamine have previously been shown to covalently crosslink DNA through the N7 position of the two guanine bases of a d(GXC)·d(GYC) duplex sequence, a so-called 1,3 G-G crosslink, when X-Y = C-G or T-A. Here, we report the formation of a new mechlorethamine crosslink with the d(GXC)·d(GYC) fragment when X-Y is a C-C mismatch pair. Mechlorethamine crosslinks this fragment preferentially between the two mismatched cytosine bases, rather than between the guanine bases. The crosslink also forms when one or both of the guanine bases of the d(GCC)·d(GCC) fragment are replaced by N7-deazaguanine and, more generally, forms with any C-C mismatch, regardless of the flanking base pairs. Piperidine cleavage of the crosslinked species containing the d(GCC)·d(GCC) sequence gives DNA fragments consistent with alkylation at the mismatched cytosine bases. We also provide evidence that the crosslink reaction occurs between the N3 atoms of the two cytosine bases by showing that the formation of the C-C crosslink is pH dependent for both mechlorethamine and chlorambucil. Dimethyl sulfate (DMS) probing of the crosslinked d(GCC)·d(GCC) fragment showed that the major groove of the guanine adjacent to the C-C mismatch is still accessible to DMS.
In contrast, the known minor groove binder Hoechst 33258 inhibits the crosslink formation with a C-C mismatch pair flanked by A-T base pairs. These results suggest that the C-C mismatch is crosslinked by mechlorethamine in the minor groove. Since C-C pairs may be involved in unusual secondary structures formed by the trinucleotide repeat sequence d(CCG)n, and associated with triplet repeat expansion diseases, mechlorethamine may serve as a useful probe for these structures.

4.2 Introduction

Mismatched base pairs in DNA can occur during replication or recombination, and have the potential to be mutagenic if not efficiently repaired. In Escherichia coli the repair efficiency has been reported to depend on the sequence of the mismatch (Kramer et al., 1984), the bases flanking the mismatch (Jones et al., 1987), and the affinity of the MutS protein for the mismatch (Su et al., 1988; Wagner et al., 1995), with pyrimidine-pyrimidine mismatches having the lowest efficiency for repair (Modrich, 1995; Thomas et al., 1991). Replication fidelity is dependent on mismatch recognition, which is linked to mismatch structure, and as a result, the structural features of mismatched base pairs have been explored using X-ray crystallography (Holbrook et al., 1991), circular dichroism (Gray et al., 1984), NMR spectroscopy (Boulard et al., 1997), and binding ligand studies (Chen, 1998). Mismatched base pairs have also been associated with secondary structures formed by DNA trinucleotide repeat sequences (Mitas, 1997; Chen et al., 1995; Gacy et al., 1995; Mariappan et al., 1996a,b; Petruska et al., 1996). The expansion and instability of such repeat sequences are associated with a number of human genetic disorders. One such disorder, Fragile-X Syndrome (Fu et al., 1991), is characterized by expansions of d(CGG)n·d(CCG)n repeat sequences.
The trinucleotide repeat expansion mechanism is unclear, but we and others have speculated that it might be due to unusual DNA conformations (Kuryavyi et al., 1995; Mitas et al., 1995a; Gacy et al., 1998; Yu et al., 1997) that can form during replication. In the Fragile-X sequence, strand separation of the duplex can occur, and the resultant d(CGG)n and d(CCG)n single strands can fold into hairpins (Mitas et al., 1995a; Gacy et al., 1998; Yu et al., 1997). For d(CCG)n, the hairpin forms with a specific alignment containing repeating d(GCC)·d(GCC) 'duplex' fragments, giving a hairpin stem in which every third base pair is a C-C mismatch (Yu et al., 1997).

4.2.1 Mechlorethamine as a probe

We chose to probe the structure of the d(GCC)·d(GCC) fragment with mechlorethamine, a common nitrogen mustard. Mechlorethamine (Figure 4.1(a), R = CH3) and related mustards, such as chlorambucil (Figure 4.1(a), R = C6H4(CH2)3COOH), react with nucleophilic centers on the DNA duplex via an aziridinium ion intermediate (Rutman et al., 1969). This reaction occurs favorably with the guanine N7 atom (Mattes et al., 1986; Kohn et al., 1987), but adducts with N3 of adenine have also been reported (Pieper et al., 1990; Wang et al., 1991; Wang et al., 1994). Since the nitrogen mustards are bifunctional, they can form interstrand crosslinks with suitable DNA sequences. Formation of guanine-guanine interstrand crosslinks was originally suggested to occur between the N7 atoms of guanine bases in neighboring base pairs (that is, at a d(GC)·d(GC) site; a 1,2 G-G crosslink) (Brookes and Lawley, 1961), because this site provides the shortest N7 to N7 distance in a regular B-DNA duplex (Arnott et al., 1976).
However, it is now accepted that a d(GGC)·d(GCC) duplex fragment is preferentially crosslinked in the major groove of DNA between the distal guanine bases (Rink et al., 1993; Rink et al., 1995; Hopkins et al., 1991; Millard et al., 1990; Millard et al., 1991), giving a 1,3 G-G crosslink product (Figure 4.1(b)). This selectivity for the d(GXC)·d(GYC) site over the d(GC)·d(GC) site is retained for X-Y = T-A (Ojwang et al., 1989).

[Figure 4.1: chemical structures of (a) the nitrogen mustard with substituent R and (b) the 1,3 G-G interstrand crosslink]

Figure 4.1. (a) Mechlorethamine (R = CH3) or chlorambucil (R = C6H4(CH2)3COOH) and (b) the mechlorethamine and chlorambucil 5'-GXC·5'-GYC 1,3 G-G crosslink.

The N7 to N7 distance in the d(GXC)·d(GYC) site is somewhat larger than the span of the bis(ethyl)amine crosslinking bridge of mechlorethamine. To accommodate the 1,3 G-G crosslink, a distortion of the DNA is necessary to reduce the interstrand guanine-guanine N7 to N7 distance from 8.9 Å in a 'B-DNA' conformation (Arnott et al., 1976) to approximately 5.1 Å in the crosslinked species (Remias et al., 1995). A bend of about 15° has been estimated for a mechlorethamine-crosslinked oligomer (Rink et al., 1993, 1995), which could be a result of this distortion. We have previously suggested that the local distortion of the DNA and a decrease in the N7 to N7 distance can be induced by the initial, non-covalent interaction of the mustard with the DNA (Remias et al., 1995). In performing the work described here, our original hypothesis was that a mismatched X-Y pair at the center of the d(GXC)·d(GYC) crosslink site would allow the DNA helix to distort more easily, and should therefore allow for more efficient crosslinking.
However, for X-Y = C-C we have observed a crosslinked species with properties different from those expected for a 1,3 G-G crosslink. We show that this species results from crosslinking of the C-C mismatch pair, and provide evidence that the crosslink forms between the cytosine N3 atoms in the minor groove of the DNA helix.

4.3 Materials and Methods

(i) Chemicals: Mechlorethamine (N,N-bis(2-chloroethyl)methylamine), chlorambucil (4-(p-(N,N-bis(2-chloroethyl)amino)-phenyl)butyric acid), Hoechst 33258 and T4 polynucleotide kinase were purchased from Sigma. [γ-32P]ATP was purchased from ICN. N7-deazaguanine was purchased from Glen Research. All synthetic oligonucleotides (Table 4.1) were synthesized on an Applied Biosystems Model 394 automated synthesizer, deprotected, and purified with a COP cartridge at the USC Norris Cancer Center. DNA used for sequencing reactions was further purified on a 20% denaturing polyacrylamide gel. All other reagents were analytical grade.

(ii) 32P-5'-end labeling of DNA: Approximately 10 µg of column-purified synthetic DNA was 5'-end labeled with [γ-32P]ATP (5 µL, 4500 Ci/mmol) by incubation in buffer (30 mM Tris (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol) with 30 units of T4 polynucleotide kinase for 1 hour at 37°C (Sambrook et al., 1989). The reaction was stopped by addition of 5.5 µL 3 M sodium acetate (pH 5.2) and 150 µL pre-chilled 95% ethanol. Unincorporated [γ-32P]ATP was removed by precipitating the DNA in 95% ethanol at -20°C overnight; the DNA was then lyophilized and resuspended in a 0.1 M NaCl solution.

(iii) Alkylation of DNA: An equal amount of the unlabeled complementary strand was added to a 0.1 M NaCl solution of the labeled oligonucleotide, heated to 65-70°C and then slowly cooled to room temperature.
Following annealing of the strands, a 1 µM duplex DNA solution containing 0.1 M NaCl and 10 mM Tris (pH 7.5) was incubated for 6 hours at 37°C with 100 µM mechlorethamine or chlorambucil in a total volume of 100 µL. For each experiment, a fresh solution of the nitrogen mustard (100 mM) was prepared in dimethyl sulfoxide (DMSO), rapidly diluted to 10 mM and immediately added to the DNA solution. Following incubation with the mustard, the reaction was terminated by addition of 5.5 µL of 3 M sodium acetate, 1 µL tRNA (5 mg/mL), and 150 µL pre-chilled 95% ethanol, and the DNA was precipitated in three times the volume of pre-chilled 95% ethanol at -20°C overnight, washed, and then lyophilized. The DNA was then dissolved in 2 µL distilled water and 8 µL tracking dye (80% formamide, 1 mM EDTA, 0.025% bromophenol blue and xylene cyanol). Controls were performed with omission of the complementary strand, omission of the alkylating agent, or omission of the annealing step.

(iv) Detection of alkylated DNA: The samples were loaded onto a 20% denaturing polyacrylamide gel (DPAGE; 29:1 acrylamide/bisacrylamide, 8 M urea, 89 mM Tris-borate (pH 8.5), 2 mM EDTA (TBE buffer), 0.4 mm thick, 42 x 33 cm, 2500 V, 45 W) and electrophoresed until the xylene cyanol marker had migrated 15 cm.

(v) Determination of cross-linking site: The band thought to be due to the mechlorethamine-crosslinked DNA was recovered from the gel using the crush-and-soak procedure (Sambrook et al., 1989). The DNA was then ethanol precipitated, washed, lyophilized, and resuspended in 10% aqueous piperidine in a total volume of 100 µL. To ensure complete cleavage of all alkylated bases, samples were heated for 1 hour at 90°C. To determine the cross-linking site, control Maxam-Gilbert G and C reactions were performed in parallel (Maxam and Gilbert, 1980).
Following this reaction, the control DNA was cleaved using 10% aqueous piperidine, in a total volume of 100 µL, for 30 minutes at 90°C (Maxam and Gilbert, 1980). All samples were lyophilized overnight, resuspended in 2 µL distilled water and 8 µL tracking dye (80% formamide, 1 mM EDTA, 0.025% bromophenol blue and xylene cyanol), heated at 90°C for 2 min, chilled on an ice bath, then loaded onto a 20% denaturing polyacrylamide gel (29:1 acrylamide/bisacrylamide, 8 M urea, 89 mM Tris-borate (pH 8.5), 2 mM EDTA (TBE buffer), 0.4 mm thick, 42 x 33 cm, 2900 V, 50 W) and electrophoresed until the xylene cyanol marker had migrated 10 cm. Bands were assigned by reference to the Maxam-Gilbert G and C lanes.

(vi) pH-dependent alkylation reaction: The DNA was alkylated with chlorambucil or mechlorethamine by the same method as in (iii), except that the buffer used was 10 mM potassium phosphate prepared at pH 4.0, 5.8, and 8.0. In these experiments incubation with the mustard was for 24 hours at 37°C, to allow sufficient reaction time for the slower-reacting chlorambucil.

(vii) Dimethyl sulfate (DMS) probing reaction: Mechlorethamine- and chlorambucil-crosslinked DNA duplexes were recovered from the gel, purified by precipitation, then incubated with 1% DMS, in a total volume of 50 µL, at 37°C for 30 minutes. The DNA was then ethanol precipitated, washed, lyophilized, and cleaved with 10% aqueous piperidine, in a total volume of 100 µL at 90°C for 30 minutes, to convert sites of alkylation into strand breaks.

(viii) Hoechst 33258 inhibition of crosslinking: Experiments were performed similarly to those described in steps (ii), (iii) and (iv), except that, prior to alkylation with mechlorethamine, the annealed DNA duplexes were preincubated at room temperature for 30 minutes with varying concentrations of Hoechst 33258. The incubation time for the alkylation reaction was 1 hour.
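The dilution scheme in step (iii) can be checked with a short calculation. The sketch below is illustrative only: the function name `volume_to_add_ul` is hypothetical, and it assumes that the 100 µL total volume and the 100 µM mustard concentration both refer to the final reaction mixture.

```python
def volume_to_add_ul(stock_mM: float, final_uM: float, total_ul: float) -> float:
    """Volume of stock (in µL) that gives final_uM in total_ul, from C1*V1 = C2*V2."""
    stock_uM = stock_mM * 1000.0  # convert mM to µM
    return final_uM * total_ul / stock_uM

# Values from step (iii): 10 mM working stock, 100 µM final, 100 µL total volume.
mustard_ul = volume_to_add_ul(stock_mM=10.0, final_uM=100.0, total_ul=100.0)

# Molar excess of mustard over duplex (100 µM mustard versus 1 µM duplex).
molar_excess = 100.0 / 1.0

print(f"add {mustard_ul:.1f} µL of 10 mM mustard ({molar_excess:.0f}-fold excess over duplex)")
```

Under these assumptions the mustard is present in large molar excess over the duplex, so its concentration changes little as the DNA reacts.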
4.4 Results

4.4.1 Anomalous crosslinking of a DNA duplex containing a C-C mismatch pair

DNA duplexes of the sequence shown in Table 4.1, Series 1, were incubated with mechlorethamine, and the products were electrophoresed on a 20% denaturing polyacrylamide gel. In Figure 4.2A, the bands with slower mobility (labeled X) are due to mechlorethamine-crosslinked duplexes (Rink et al., 1993; Millard et al., 1990; Hartley et al., 1991). These bands are observed for duplexes in which X-Y = C-G and T-A, Watson-Crick pairs (Figure 4.2A, lanes 2 and 10), and when X-Y = C-C, a mismatch pair (Figure 4.2A, lane 6). The duplexes containing mismatch pairs X-Y = C-A and T-C (Figure 4.2A, lanes 4 and 12) do not give observable crosslink bands. A very weak band was discernible for duplex X-Y = T-G, a mismatch pair which is relatively stable compared to other mismatch pairs. The identity of the slower-mobility bands for duplexes X-Y = C-G, C-C, and T-A was further confirmed in a separate experiment by the formation of species with similar mobility in which either the top or the bottom strand was radiolabeled (Figure 4.2B). The crosslinking efficiency is highest for the duplex containing the C-C mismatch in both Figures 4.2A and 4.2B. Quantification by densitometry of the slower-mobility bands in Figure 4.2B shows that the crosslink band for X-Y = C-C is about 25% of the total DNA, compared to an average of 6% for the Watson-Crick paired duplexes (Figure 4.2C). In addition, the crosslinked species for duplex X-Y = C-C appears to have a slightly slower mobility in the denaturing gel, compared to the crosslinked Watson-Crick duplexes. To show that the band labeled X (Figures 4.2A and 4.2B) for X-Y = C-C was due to a crosslink between the two complementary strands, control experiments were performed with only one strand.
Incubation of either the top or the bottom strand alone with mechlorethamine (Figure 4.2D, lanes 2 and 4) did not give a crosslink band. Further, to show that the crosslinked species for X-Y = C-C in Figures 4.2A and 4.2B did not result from non-duplex conformations created by the annealing conditions, experiments were performed with elimination of the annealing step. Hence, cold complementary strands were added to the reaction mixture at room temperature. As shown in Figure 4.2D (lanes 6 and 8), mechlorethamine was still able to crosslink under these conditions.

Table 4.1. DNA duplex sequences (a).

Series 1
  5'-CTCTCAGAGXCTCGTTCAG
     GAGAGTCTCYGAGCAAGTC-5'
  X-Y = C-G  C-A  C-C  T-G  T-A  T-C

Series 2
  5'-CTCTCAGAMXnTCGTTCAG
     GAGAGTCTmYNAGCAAGTC-5'
  n-N = C-G  C-G  C-G  C-G  C-D
  X-Y = C-G  C-G  C-C  C-C  C-C
  M-m = G-C  D-C  G-C  D-C  D-C

Series 2a
  5'-CTCTCACACCGTGGTTCAG
     GAGAGTGTGCCACCAAGTC-5'

Series 3
  5'-CTCTCACAMCnTGGTTCAG
     GAGAGTGTmCNACCAAGTC-5'
  n-N = C-G  T-A  G-C  C-G  G-C  T-A
  M-m = G-C  G-C  T-A  C-G  C-G  C-G
  n-N = C-G  T-A  A-T  T-A
  M-m = T-A  T-A  T-A  A-T

Series 3a
  5'-CTCTCACGACTCGGTTCAG
     GAGAGTGCTCAGCCAAGTC-5'

Series 4
  5'-CTCCCAATTCAATTCCCAG
     GAGGGTTAACTTAAGGGTC-5'

(a) D = N7-deazaguanine

Figure 4.2. Mechlorethamine crosslinking of DNA duplexes of sequence d(CTCTCAGAGXCTCGTTCAG)·d(CTGAACGAGYCTCTGAGAG) (Table 4.1, Series 1). Bands are identified as X (crosslink), M (monoadduct), and S (unreacted single strands) on the right side of the figures. (A) Autoradiography of a 20% DPAGE gel of DNA duplexes where X-Y = C-G, C-A, C-C, T-G, T-A and T-C, following incubation with 100 µM mechlorethamine, in lanes 2, 4, 6, 8, 10 and 12, respectively. Lanes 1, 3, 5, 7, 9 and 11 are controls (no mechlorethamine). In all experiments only the top strand (X = C or T) was labeled.
(B) Autoradiography of a 20% DPAGE gel for duplexes X-Y = C-G, C-C and T-A in which either the top strand (lane T) or bottom strand (lane B) was labeled, following incubation with 100 µM mechlorethamine. Four experiments were run for each duplex, two with the top strand labeled and two with the bottom strand labeled. The exposure time results in weak bands for the Watson-Crick duplexes, in order to avoid over-exposing the X-Y = C-C crosslink band.

Figure 4.2 continued. (C) Quantification of the mechlorethamine-crosslinked bands in Figure 4.2B (bands labeled X) using densitometry. The intensity of the crosslink band is expressed as a percentage of the total DNA in each lane of the gel. (D) Autoradiography of a 20% DPAGE gel for single strands or duplexes where X = C, Y = C, following incubation with 100 µM mechlorethamine. Lanes 2 and 4 contain only top strand and only bottom strand, respectively. Lanes 6 and 8 contain both strands, but the experiment was performed with omission of the annealing step, and with the top strand labeled (lane 6) and bottom strand labeled (lane 8). Lanes 10 and 12 show the results of experiments performed as in Figure 4.2B, with the top strand labeled (lane 10) and the bottom strand labeled (lane 12). Lanes 1, 3, 5, 7, 9 and 11 are controls (no mechlorethamine).
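The percentages in Figure 4.2C are crosslink band intensities expressed as a fraction of the total signal in each lane. A minimal sketch of that normalization follows. The intensity values are invented placeholders, not data from this work, the function name is hypothetical, and background subtraction is ignored.

```python
def crosslink_percent(bands: dict[str, float]) -> float:
    """Crosslink band (X) as a percentage of total lane intensity (X + M + S)."""
    total = sum(bands.values())
    if total <= 0:
        raise ValueError("lane has no signal")
    return 100.0 * bands["X"] / total

# Placeholder integrated intensities in arbitrary units, NOT measured values.
lane = {"X": 250.0, "M": 150.0, "S": 600.0}
print(f"crosslink = {crosslink_percent(lane):.1f}% of total DNA")
```

Normalizing within each lane makes the comparison between duplexes insensitive to differences in the amount of labeled DNA loaded.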
4.4.2 Piperidine cleavage of the crosslinked C-C mismatch duplex gives fragments consistent with alkylation of the mismatched bases

To determine the bases through which the mechlorethamine crosslink forms, the crosslink bands for X-Y = C-G and C-C (labeled X in Figure 4.2B) were subjected to piperidine cleavage. This reaction results in strand cleavage to the 5' side of the alkylated base (Mattes et al., 1986). The results are shown in Figure 4.3, and suggest that the crosslink in the two duplexes is different. For X-Y = C-G, piperidine cleavage gives bands corresponding to cleavage at G9 and G28, consistent with a 1,3 G-G crosslink. However, for X-Y = C-C, piperidine treatment produces bands with mobilities consistent with cleavage at bases X and Y. This suggests that the mechlorethamine crosslink forms at the mismatched C-C pair.

Figure 4.3. Piperidine cleavage of the crosslink bands from lanes T and B of Figure 4.2B for the crosslinked DNA duplexes d(C1T2C3T4C5A6G7A8G9X10C11T12C13G14T15T16C17A18G19)·d(C20T21G22A23A24C25G26A27G28Y29C30T31C32T33G34A35G36A37G38) (X-Y = C-G or C-C; Table 4.1, Series 1). Lanes G and C are Maxam-Gilbert G and C reactions. Lanes T and B show the results of piperidine cleavage of the top strand (that containing X) and the bottom strand (that containing Y).

4.4.3 N7-deazaguanine substitution does not influence formation of the crosslink in the C-C mismatched duplex

To confirm that the crosslink of the duplex containing a C-C mismatch pair does not form through guanine N7, mechlorethamine crosslinking of duplexes containing N7-deazaguanine (Table 4.1, Series 2, where D = N7-deazaguanine) was examined.
N7-deazaguanine substitution of one or both of the guanine bases of the d(GXC)·d(GYC) fragment prevents the formation of an N7-N7 1,3 G-G crosslink. Hence, no crosslink band is observed for the N7-deazaguanine-substituted duplex when X-Y = C-G (Figure 4.4, lane 4). The retention of a crosslink in N7-deazaguanine-substituted duplexes containing a C-C mismatch (Figure 4.4, lanes 8 and 10, bands labeled X) further supports the conclusion that the crosslinked species does not contain the normal 1,3 G-G crosslink. Further, a crosslink also forms for a duplex (Table 4.1, Series 2a) which has a central d(CCG)·d(CCG) sequence and no guanine-guanine 1,3 G-G crosslink site (Figure 4.4, lane 12). Piperidine cleavage of each of these crosslinks results in products of mobility consistent with alkylation at the C-C mismatch pair (Figure 4.5).

Figure 4.4. Autoradiography of a 20% DPAGE gel following incubation with mechlorethamine of duplexes of sequence d(CTCTCAGAMXnTCGTTCAG)·d(CTGAACGANYmTCTGAGAG) (where X-Y = C-G or C-C, n and m = C, and M and/or N = G or D (N7-deazaguanine); Table 4.1, Series 2) (lanes 1-10), and a duplex (Table 4.1, Series 2a) of sequence d(CTCTCACACCGTGGTTCAG)·d(CTGAACCACCGTGTGAGAG) (lanes 11 and 12). In all experiments only the top strand (that containing X) was labeled. Lanes 2, 4, 6, 8, 10 and 12 include 100 µM mechlorethamine and lanes 1, 3, 5, 7, 9 and 11 are controls (no mechlorethamine). Bands are identified as X (crosslink), M (monoadduct), and S (unreacted single strands).
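The base numbering used in the piperidine cleavage figures (top strand 1-19, bottom strand 20-38, both read 5' to 3') can be reproduced with a short script. This is an illustrative sketch rather than part of the original analysis. The helper names are hypothetical, and X-Y is set to the Watson-Crick C-G case of Series 1 so that the two strands can be checked for complementarity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

# Table 4.1, Series 1 with X-Y = C-G; both strands written 5'->3'.
top = "CTCTCAGAGCCTCGTTCAG"
bottom = "CTGAACGAGGCTCTGAGAG"
assert bottom == reverse_complement(top)

def label(pos: int) -> str:
    """Label such as 'G9' or 'G28' for a position numbered 1-38."""
    strand, offset = (top, 1) if pos <= len(top) else (bottom, len(top) + 1)
    return f"{strand[pos - offset]}{pos}"

def partner(pos: int) -> int:
    """Position paired with `pos` in the antiparallel duplex (i pairs with 39 - i)."""
    return 2 * len(top) + 1 - pos

# X10 pairs with Y29. The 1,3 G-G crosslink joins G9 and G28, whose base pairs
# (9-30 and 11-28) flank the central X-Y pair.
print(label(9), label(28), partner(10), partner(9), partner(28))
```

With this numbering, cleavage at G9 and G28 marks the two guanines of a 1,3 G-G crosslink, while cleavage at positions 10 and 29 marks the central X-Y pair.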
Figure 4.5. Piperidine cleavage of the crosslink bands from lanes 2, 6, 10 and 12 of Figure 4.4 for the mechlorethamine-crosslinked duplexes of sequence d(CTCTCAGAM9X10n11TCGTTCAG)·d(CTGAACGAN28Y29m30TCTGAGAG) (X-Y = C-G or C-C, n and m = C, and M and/or N = G or D (N7-deazaguanine); Table 4.1, Series 2), and a fourth duplex (Table 4.1, Series 2a) of sequence d(CTCTCACAC9C10G11TGGTTCAG)·d(CTGAACCAC28C29G30TGTGAGAG). In the figure, the duplexes are identified by their central three base pair sequence. The lanes marked with G are the Maxam-Gilbert G reactions. Lanes T and B show the results of piperidine cleavage of the top strand (that containing X) and the bottom strand (that containing Y).

4.4.4 A mechlorethamine crosslink forms with any DNA duplex containing a single C-C mismatch pair

To determine if mechlorethamine crosslinking of a C-C mismatch is sequence dependent, crosslinking experiments on the DNA duplexes of sequences shown in Table 4.1 (Series 3 and 3a) were performed. The results (Figure 4.6) show that the C-C mismatch pair was crosslinked by mechlorethamine in every sequence context tested. Piperidine cleavage of these crosslinks also resulted in products that were consistent with alkylation at the C-C mismatch pair (data not shown). We note that the crosslink forms with variable efficiency, and has slightly different electrophoretic mobility, with different duplexes (Figure 4.6, bands labeled X).
The crosslink efficiency may be related to the stability of the cytosines within a duplex, and the variable mobility may reflect slightly different conformations of the crosslinked species in the denaturing gel (Romero et al., 1999b). These issues will be further explored in chapters 5, 6 and 7.

4.4.5 The formation of the C-C crosslink is pH dependent

The reported pKa of cytosine N3 in a C-C mismatch pair is 6.95 (Boulard et al., 1997). Hence, if the C-C crosslink occurs through N3 of cytosine, then its formation should be blocked by protonation of N3 at low pH. However, for mechlorethamine, this is complicated by protonation of the mustard itself (mechlorethamine has a pKa of 6.45 (Cohen et al., 1948)), which would be expected to reduce the reactivity of the crosslinking agent. Because of this, we also carried out the pH-dependent crosslinking experiments using chlorambucil (Figure 4.1(a)), which, due to the aromatic ring, has an amine with a much lower pKa of 2.49 (Stewart et al., 1980), and is essentially unprotonated at the pH corresponding to the pKa of the C-C mismatch. These experiments were performed on the duplexes shown in Table 4.1 (Series 1) where X-Y = C-G and C-C. In Figure 4.7 (lanes 8, 10, and 12, bands labeled X) we show that formation of the C-C crosslink is pH dependent for chlorambucil. The chlorambucil crosslink band is intense at pH 8, but is barely visible at pH 5.8 or pH 4.0. The mechlorethamine crosslink also forms efficiently at pH 8, but essentially no reaction occurs at lower pH. For X-Y = C-G (lanes 2-7), pH does not strongly influence the formation of the conventional 1,3 G-G crosslink with chlorambucil or mechlorethamine, providing a good control for the X-Y = C-C results.
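The pKa argument can be made semi-quantitative with the Henderson-Hasselbalch relation. The sketch below estimates the unprotonated fraction of each species at the three experimental pH values, using the pKa values quoted above. It assumes simple single-site titration behavior and ignores any pKa shift on DNA binding, so the numbers are rough guides rather than measured quantities.

```python
def unprotonated_fraction(pka: float, ph: float) -> float:
    """Fraction of a base present in its neutral (unprotonated) form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

# pKa values as quoted in the text.
PKA = {
    "cytosine N3 (C-C mismatch)": 6.95,
    "mechlorethamine amine": 6.45,
    "chlorambucil amine": 2.49,
}

for name, pka in PKA.items():
    fractions = ", ".join(
        f"pH {ph}: {unprotonated_fraction(pka, ph):.3f}" for ph in (4.0, 5.8, 8.0)
    )
    print(f"{name:28s} {fractions}")
```

Under these assumptions cytosine N3 is roughly 90% unprotonated at pH 8.0 but less than 10% unprotonated at pH 5.8, whereas the chlorambucil amine stays more than 95% unprotonated at all three pH values. This is why chlorambucil separates the effect of base protonation from protonation of the mustard.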
We note that protonation of cytosine N3 could influence alkylation at other atoms, either directly, or through a protonation-induced DNA conformational change, and that the pH dependence does not, therefore, prove that the crosslink forms through N3, although it is suggestive of this. We also note that these experiments were conducted with longer incubation periods than those in Figures 4.2, 4.4, and 4.6 to allow sufficient time for the slower crosslinking reaction of chlorambucil. This results in additional crosslinks forming for the C-C duplex (lanes 12 and 13) which have similar mobility to the bands observed for X-Y = C-G. This suggests that the C-C crosslink forms more rapidly than the 1,3 G-G crosslink. The rate of crosslink formation will be covered in the next chapter.

Figure 4.6. Autoradiogram of a 20% DPAGE gel following incubation with mechlorethamine of duplexes d(CTCTCACAMCnTGGTTCAG)·d(CTGAACCANCmTGTGAGAG) (Table 4.1, Series 3) and d(CTCTCACGACTCGGTTCAG)·d(CTGAACCGACTCGTGAGAG) (Table 4.1, Series 3a). The top strands of each duplex have the central sequences shown in the figure. In each sequence, the central C is part of a mismatch pair. In lanes 1-14, the odd-numbered lanes are controls (no mechlorethamine) and the even-numbered lanes include 100 µM mechlorethamine. Lanes 15, 18, 21 and 24 are also controls. Lanes 16, 17, 19, 20, 22, 23, 25 and 26 include 100 µM mechlorethamine. Lanes 16, 19, 22 and 25 are from experiments at 25°C, and lanes 17, 20, 23 and 26 are from experiments at 37°C. Bands are identified as X (crosslink), M (monoadduct), and S (unreacted single strands).
4.4.6 The C-C crosslinked species is reactive with DMS at guanine bases adjacent to the C-C mismatch pair

To establish the groove in which the C-C crosslink forms, we probed the chlorambucil- and mechlorethamine-crosslinked DNA using dimethyl sulfate (DMS), which reacts with guanine N7 atoms that are solvent accessible. In these experiments, only crosslinked DNA species (excised from Figure 4.7, lanes 12 and 13, bands labeled X) were incubated with DMS. If the crosslink is formed in the major groove, one would expect the guanine bases in the d(GCC)·d(GCC) fragment to be unreactive with DMS, particularly for chlorambucil, where the additional bulk of the crosslinking agent might be expected to block the approach of DMS. In Figure 4.8A we show that these guanines are still reactive with DMS in the chlorambucil- and mechlorethamine-crosslinked duplexes, which suggests that the crosslink is occurring in the minor groove. We note that guanines in the crosslinked species to the 3' side of the crosslink probably react with DMS, but do not appear to do so (Figure 4.8A), because the conditions used for piperidine cleavage of the DMS adducts also result in cleavage of the crosslink itself (leading to the appearance of bands corresponding to fragments cleaved at a cytosine of the C-C mismatch). Hence, bands resulting from cleavage at guanine bases 3' to the mismatch are not observable.

Figure 4.7. Autoradiogram of 20% DPAGE gels following incubation with mechlorethamine or chlorambucil of duplexes of sequence d(CTCTCAGAGXCTCGTTCAG)·d(CTGAACGAGYCTCTGAGAG) (Table 4.1, Series 1), where X-Y = C-G or C-C. Lane 1 is a control (no nitrogen mustard).
Lanes 2-7 show the products of the reaction of duplex X-Y = C-G with 100 µM chlorambucil (lanes 2, 4 and 6) or 100 µM mechlorethamine (lanes 3, 5 and 7) at the indicated pH. Lanes 8-13 show the products of the reaction of duplex X-Y = C-C with 100 µM chlorambucil (lanes 8, 10 and 12) or 100 µM mechlorethamine (lanes 9, 11 and 13) at the indicated pH. Bands are identified as X (crosslink), M (monoadduct), and S (unreacted single strands).

4.4.7 The mechlorethamine C-C crosslink is inhibited by the DNA minor groove binder Hoechst 33258

To define further the groove in which the C-C crosslink forms, we carried out mechlorethamine crosslinking experiments on the duplex shown in Table 4.1 (Series 4), following pre-incubation with Hoechst 33258. The duplex sequence was designed so that two favorable Hoechst binding sites, d(AATT)·d(AATT) (Abu-Daya et al., 1995), flank the C-C mismatch pair. Hoechst 33258 has been shown to be a DNA minor groove binder in two x-ray structures (Pjura et al., 1987; Teng et al., 1988) of its complex with a duplex containing a d(AATTC)·d(GAATT) sequence. The location of the ligand is somewhat different in these two structures (associated with the minor groove of either the d(AATT)·d(AATT) (Teng et al., 1988) or the d(ATTC)·d(GAAT) (Pjura et al., 1987) fragment). Hence, for the duplex (Table 4.1, Series 4), Hoechst 33258 should bind in the d(AATT)·d(AATT) sequences flanking the C-C mismatch, or perhaps may interact directly with the C-C mismatch minor groove, in a manner analogous to that seen for the d(ATTC)·d(GAAT) sequence. In either case, Hoechst 33258 binding should be effective in blocking crosslinking of the C-C mismatch pair, if this crosslinking occurs through the minor groove.
In Figure 4.8B (bands labeled X) we show that the C-C crosslink is sensitive to Hoechst 33258, being significantly inhibited by 1 µM ligand, and eliminated at 2 µM ligand. We note that this effect could be due to indirect conformational changes induced by the Hoechst 33258 binding, rather than a direct blocking effect, and, indeed, we do observe a second Hoechst 33258-dependent band (labeled H in Figure 4.8B), the origin of which we have not yet determined. However, the combined DMS and Hoechst 33258 results do suggest that the mechlorethamine C-C crosslink forms through the minor groove.

Figure 4.8. (A) Maxam-Gilbert sequencing gel of the piperidine cleavage products resulting from incubation of the chlorambucil- and mechlorethamine-crosslinked C-C mismatch duplexes with DMS. The crosslinked DNA was excised from Figure 4.7, lanes 12 and 13, respectively (bands labeled X). Lane 1 is the standard Maxam-Gilbert guanine reaction with sequence d(CTCTCAG7AG9XCTCG14TTCAG19) (Table 4.1, Series 1, X = C). Lane 2 shows the products of DMS probing of the chlorambucil-crosslinked duplex from lane 12 of Figure 4.7. Lane 3 shows the products of DMS probing of the mechlorethamine-crosslinked duplex from lane 13 of Figure 4.7. (B) Autoradiography of a 20% DPAGE gel of the duplex of sequence d(CTCCCAATTCAATTCCCAG)·d(CTGGGAATTCAATTGGGAG) (Table 4.1, Series 4), following pre-incubation with Hoechst 33258, then incubation with 100 µM mechlorethamine. Lane 1 is a control (no Hoechst 33258 and no mechlorethamine), and lanes 2-8 contain 0, 1, 2, 5, 10, 25 and 50 µM Hoechst 33258. Bands are identified as X (crosslink), H (Hoechst 33258 concentration-dependent band), M (monoadduct), and S (unreacted single strands).
4.5 Discussion

In this chapter we have presented evidence for a new nitrogen mustard crosslink between the cytosine bases of a C-C mismatch pair. For the crosslinked d(GCC)·d(GCC) duplex fragment, piperidine cleavage data are consistent with alkylation at the mismatched cytosine base, and the retention of the crosslink in N7-deazaguanine-substituted duplexes excludes the possibility of the expected N7 to N7 1,3 G-G crosslink. We have also ruled out the possibility of crosslink formation at an alternative site (perhaps purine N3), because a crosslink still occurs with different base pairs adjacent to the C-C mismatch. It seems likely that the crosslink is formed between the N3 atoms of the cytosine bases, since these are the most nucleophilic atoms of the mismatch pair. This is supported by the observation that C-C crosslink formation is pH dependent, and crosslinking is much more efficient at a pH above the pKa of cytosine N3. However, as explained in the previous section, alternative explanations of the pH dependence are possible. In particular, N3 protonation could induce a DNA conformational change that could prevent crosslink formation at another atom of cytosine. N3 and O2 are the only likely sites for alkylation, and in a fully Watson-Crick paired duplex (in which N3 and O2 are involved in base pair formation) no adduct was observed between chlorambucil and a cytosine base (Bank, 1992). Methylation of cytosine N3 by dimethyl sulfate is favored over other nucleophilic centers (Brookes and Lawley, 1962), although reactions at O2 can also occur for other alkylating agents (Ford et al., 1993). We also note that N3-alkylcytosines are heat labile (Liang et al., 1994), consistent with the heat-induced cleavage of the C-C crosslink.
Based on the reactivity with DMS of the N7 position of guanines adjacent to the crosslinked cytosines and on the inhibition of the crosslink by Hoechst 33258, a known DNA minor groove binder, we believe that the crosslinking reaction occurs through the minor groove, in contrast to the major groove guanine-guanine 1,3 G-G crosslink. The minor groove is also the most likely site because the 4-amino group of the cytosine bases effectively blocks approach from the major groove, although this might be less of a problem in a distorted duplex.

4.6 Summary

Various geometries have been reported for C-C mismatch base pairs in different sequence contexts. The anti-parallel C-C mismatch is intrahelical in several duplex sequences (Boulard et al., 1997; Brown et al., 1990). In contrast, Gao et al. (1995) have shown that the duplex d(CCGCCG)2 has a central d(GCC)·d(GCC) sequence in which the cytosines of the C-C mismatch adopt an extrahelical location in the minor groove, and the two G-C pairs stack within the helix. In chapter 3 we discussed molecular dynamics simulations (Romero et al., 1998) of a hairpin conformation of the single-strand d(GCC)n trinucleotide repeat sequence, in which the hairpin stem has four d(GCC)·d(GCC) repeats and four C-C mismatches. This simulation revealed motion of the mismatched cytosines towards the minor groove. This is in contrast to the simulation of the duplex containing a single C-C mismatch (Romero et al., 1998), in which the C-C pair remained essentially intrahelical.
The motion of the mismatched cytosine bases in the repeating hairpin could be a prelude to the formation of an e-motif-like structure, or some other non-standard hairpin conformation (Yu et al., 1997), consistent with the anomalous electrophoretic and pH-dependent behavior we and others have previously observed for d(CCG)n hairpins (Yu et al., 1997; Gacy et al., 1998). Because of the limit of the reactive distance of mechlorethamine (5.1 angstroms), we reasoned that the distance between the mismatched cytosines in our isolated repeat duplex fragments probably reflected what was observed in our simulation for the one-repeat fragment in chapter 3. If our experimental duplexes contained an e-motif-like structure, then it would seem reasonable to believe that mechlorethamine would have a difficult time reacting with the second cytosine. In an e-motif structure, which has only one pair of extrahelical cytosines, the cytosine N3 distance is ~15 angstroms. It seemed illogical that mechlorethamine would prefer to react with two cytosines ~15 angstroms apart over two guanines, which are the preferred bases for reaction. In this chapter, the initial duplex sequences that were probed all had the preferred GNC sequence that mechlorethamine has been characterized to favor. Our initial reason for using mechlorethamine was to probe the distance between the guanines in repeat fragments, because the simulations in chapter 3 predicted a wedging of the guanines. The e-motif has guanines 4.10 angstroms apart; closer guanines might therefore increase the reaction if an e-motif-like structure exists. Under the conditions of our experiments, for a 19-mer duplex with one isolated repeat, the e-motif structure does not exist.
The realization that mechlorethamine was instead reacting with the mismatched cytosines in preference to guanines led us to speculate that the probing of repeat structures with nitrogen mustards may reveal more details of the conformations of single-stranded d(CCG)n, given the concentration of potential 1,3 G-G and C-C mismatch crosslink sites in these structures. It has also led us to speculate, based on the results for the isolated repeat in chapter 3, that the cytosines in our experimental duplexes are probably intrahelical. In the next chapter we continue to characterize the reaction and investigate the effect of neighboring base pairs.

Chapter 5

Kinetics and Sequence Dependence of the DNA Crosslink formed by Mechlorethamine with Cytosine-Cytosine Mismatch Pairs

5.1 Overview of Chapter

The conformation and reactivity of mismatched cytosines is relevant to repair and to the d(GCC)n·d(GCC)n sequence associated with trinucleotide repeat diseases such as Fragile X and other fragile sites. We have shown previously that a cytosine-cytosine mismatch pair can be crosslinked by mechlorethamine. One of the issues raised by this work was why there was an undetectable amount of 1,3 G-G crosslinks in a d(G₁XC)·d(G₃YC) sequence when X-Y is a mismatched base pair. To answer this question, we have examined the kinetics and sequence dependence of crosslink formation in duplex fragments containing a single mismatch, d(M₂M₁C̲n₁n₂)·d(N₁N₂C̲m₁m₂), where C̲ is a mismatched cytosine, and M-m and n-N are complementary base pairs. The C-C crosslink forms with a rate constant of 0.05 min⁻¹ in the duplex where M₁-m₁ = G-C and n₁-N₁ = C-G, and reaches a final yield of about 27% of the total DNA after 2 hours. A 1,3 G-G crosslink can form in the same duplex, but forms very slowly and comprises only 6% of the total DNA after 20 hours.
We have also determined that the efficiency of the mechlorethamine C-C reaction decreases when the G-C content of the base pairs neighboring the C-C mismatch is reduced. We are able to explain this behavior, in part, based on structural data from molecular dynamics simulations of the solvated duplexes. These results suggest that the conformation of mismatched cytosines can vary, depending upon the flanking sequences.

5.2 Introduction

Mechlorethamine (Figure 5.1(a)) is a nitrogen mustard that alkylates nucleophilic sites on DNA via an aziridinium ion intermediate (Rutman et al., 1969). Such reactions have been shown to occur with N7 of guanine (Mattes et al., 1986; Kohn et al., 1987) and with N3 of adenine (Pieper et al., 1989; Pieper and Erickson, 1990; Wang et al., 1991, 1994). Since the nitrogen mustards are bifunctional, they can form interstrand crosslinks with suitable DNA sequences, via reaction through a second aziridinium intermediate (for a review, see Povirk and Shuker, 1994). In chapter 4 we discussed the previously characterized crosslinking reaction of mechlorethamine, which occurs with the guanine N7 atom in DNA duplexes that contain the sequence d(G₁XC)·d(G₃YC) (Ojwang et al., 1989; Millard et al., 1990; Hopkins et al., 1991; Rink et al., 1993; Rink and Hopkins, 1995), where X-Y is a Watson-Crick pair, resulting in the crosslinking of the G₁-G₃ guanine bases in the major groove of the DNA duplex. The 1,3 G-G crosslink product (Figure 5.1(b)) is formed preferentially over the apparently geometrically more favorable 1,2 G-G crosslink (Millard et al., 1990). In chapter 4 we also showed that mechlorethamine can crosslink DNA through reaction with the two cytosine bases of a C-C mismatch pair (Figure 5.1(c)).
As discussed in the previous chapter, this crosslink probably forms through the N3 atoms of the cytosines in the DNA minor groove, and its formation occurs regardless of the base sequence flanking the C-C mismatch (Romero et al., 1999a).

Figure 5.1. The structure of mechlorethamine (a) and representations of the mechlorethamine crosslinks with (b) a d(GXC)·d(GYC) duplex fragment (a 1,3 G-G crosslink) and (c) a C-C mismatch pair.

5.2.1 The Kinetics of Mechlorethamine

Mechlorethamine 1,3 G-G crosslink formation is thought to proceed as two pseudo-first order reactions (Rutman et al., 1969). The rate-determining step of the reaction involves initial loss of chloride from the free mechlorethamine, and subsequent formation of a cyclic aziridinium ion (Figure 5.2). This reaction, at 37°C, has a reported rate of 0.04 min⁻¹ (Rutman et al., 1969) and of 0.02 to 0.05 min⁻¹ (Price, 1958). Reaction of the aziridinium ion with DNA results in the formation of a monoadduct which, with the appropriate DNA duplex sequence, can go on to form an interstrand crosslink by reaction of the 'second arm' of the mustard. For mechlorethamine with calf thymus DNA, the rate of this second arm reaction has been reported as 0.08 min⁻¹ (Rutman et al., 1969). Also, both Bauer and Povirk (1997) and Hartley et al. (1991) have reported that the second arm reaction for mechlorethamine crosslink formation is rapid compared to the equivalent reaction for other nitrogen mustards such as melphalan (L-phenylalanine mustard) and uracil mustard.

Figure 5.2. The nitrogen mustard reaction.
The dark triangle represents the aziridinium ion intermediate. Not shown is the formation of the second aziridinium ion.

This chapter addresses two questions raised in chapter 4. The first is why there was an undetectable amount of 1,3 G-G crosslink in duplexes containing the fragment sequence d(GXC)·d(GYC), where X and Y are mismatched bases. The second is whether the efficiency of the mechlorethamine crosslink varies with G-C content. During the completion of the work presented in chapter 4, we noticed a difference in crosslinking efficiency. Could the efficiency of the C-C crosslink be related to a 1,3 G-G crosslink present in the C-C mismatch duplex, even though sequencing did not detect a 1,3 G-G crosslink? If not, was the efficiency a function of the conformation of the cytosines? As shown in the previous chapter, the bands of a 1,3 G-G crosslink and the C-C crosslink have almost the same mobility on a gel. However, as will be shown in this chapter, the mobilities do differ and the bands can be separated. To answer these questions, a time course of mechlorethamine reaction was monitored in a duplex fragment containing both a 1,3 G-G and a C-C crosslink site, d(G₁C̲C)·d(G₃C̲C), and compared to one with the fragment sequence d(G₁CC)·d(G₃GC), which contains just a 1,3 G-G crosslink site. The kinetics of the reaction were quantified, the crosslinked duplexes sequenced, their Tm determined, and the efficiency of the mechlorethamine reaction measured by densitometry in variable sequences. We propose that the rate of formation of the C-C crosslink is limited only by the initial rate of aziridinium ion formation through loss of chloride from mechlorethamine. We also show that the efficiency of C-C crosslink formation is a function of the base pairs flanking the C-C mismatch, and decreases with decreasing G-C content.
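The idea that crosslink appearance can be limited by the first step alone can be illustrated with a minimal kinetic sketch. It assumes a simplified scheme of two consecutive, irreversible first-order steps (mustard to reactive intermediate to crosslink) and ignores hydrolysis and other side reactions. The function names and the "fast" second-step constant of 1.0 min⁻¹ are illustrative assumptions, not values from this work; the literature constants quoted in this section (0.04 min⁻¹ and 0.08 min⁻¹) are used only as example inputs.

```python
import math


def two_step_product(t, k1, k2):
    """Fraction converted to final product at time t for A -> B -> C.

    Both steps are treated as irreversible first-order reactions
    (a simplifying assumption; side reactions are ignored).
    """
    if math.isclose(k1, k2):
        # Limiting form when the two rate constants coincide.
        return 1.0 - math.exp(-k1 * t) * (1.0 + k1 * t)
    return 1.0 - (k2 * math.exp(-k1 * t) - k1 * math.exp(-k2 * t)) / (k2 - k1)


def one_step_product(t, k1):
    """Fraction converted if only the first step limits the rate."""
    return 1.0 - math.exp(-k1 * t)


if __name__ == "__main__":
    k1 = 0.04  # min^-1, example value for aziridinium ion formation
    for k2 in (0.08, 1.0):  # min^-1; 1.0 is a hypothetical fast second arm
        for t in (10, 30, 60, 120):
            print(k2, t,
                  round(two_step_product(t, k1, k2), 3),
                  round(one_step_product(t, k1), 3))
```

In this sketch, the faster the second step is relative to the first, the closer the two-step curve lies to the single-exponential curve governed by k1 alone, which is the behavior expected when the first step is rate-limiting.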
5.3 Materials and Methods

Chemicals: Mechlorethamine [bis(2-chloroethyl)methylamine, nitrogen mustard] and T4 polynucleotide kinase were purchased from Sigma. [γ-³²P]ATP was purchased from ICN. All synthetic oligonucleotides were synthesized on an Applied Biosystems Model 394 automated synthesizer, deprotected, and purified with a COP cartridge at the USC Norris Cancer Center at the University of Southern California. All other reagents were at least analytical grade.

³²P-5'-end labeling of DNA: Approximately 10 µg of column-purified synthetic DNA was 5'-end labeled with [γ-³²P]ATP (5 µL, 4500 Ci/mmol) by incubation in buffer (30 mM Tris (pH 7.8), 10 mM MgCl₂, 5 mM dithiothreitol) and 30 units of T4 polynucleotide kinase for 1 hour at 37°C. The reaction was stopped by addition of 3 M sodium acetate (5.5 µL, pH 5.2) and pre-chilled 95% ethanol (150 µL). The unincorporated [γ-³²P]ATP was removed by precipitation in 95% ethanol at -20°C overnight, and the DNA was lyophilized and resuspended in a 0.1 M NaCl solution.

Alkylation of DNA: An equal amount of the unlabeled complementary strand was added to a 0.1 M NaCl solution of the labeled oligonucleotide, heated to 65-70°C and then slowly cooled to room temperature. Following annealing of the strands, a 1 µM duplex DNA solution containing 0.1 M NaCl and 10 mM Tris (pH 7.5) was incubated for various times at 37°C with 100 µM mechlorethamine in a total volume of 100 µL. For each experiment, a fresh solution of 100 mM mechlorethamine was prepared in dimethyl sulfoxide (DMSO), rapidly diluted to 10 mM and immediately added to the DNA solution. Following incubation, the reaction was terminated by addition of 3 M sodium acetate (5.5 µL), tRNA (5 mg/mL, 5 µL), and pre-chilled 95% ethanol (150 µL), and the DNA was precipitated in three times the volume of pre-chilled 95% ethanol at -20°C overnight, washed, and then lyophilized.
The DNA was then dissolved in distilled water (2 µL) and tracking dye (8 µL; 80% formamide, 1 mM EDTA, 0.025% bromophenol blue and xylene cyanol).

Detection of crosslinked DNA: The samples were loaded onto a 20% denaturing polyacrylamide gel (29:1 acrylamide/bisacrylamide, 8 M urea, 89 mM Tris-borate (pH 8.5), 2 mM EDTA (TBE buffer), 0.4 mm thick, 38 x 31 cm, 2500 V, 45 W) and run until the xylene cyanol marker had migrated 15-18 cm. The band due to the mechlorethamine-crosslinked DNA was recovered from the gel and sequenced as previously described (14) to determine the crosslinking site.

Quantitation and Kinetic Analysis of DNA Crosslink Formation: After the gel was exposed to X-ray film, the intensity of the crosslink band was analyzed and expressed as a percentage of the total DNA in each lane of the gel. Averaged band intensities from three experiments were plotted against incubation time, and the data fitted to the linearized first-order equation $-\ln[1 - (y_t/y_{max})] = k_t t$, where $y_t$ is the percentage of crosslink at time $t$ and $y_{max}$ is the total final crosslink yield, to obtain the rate constant $k_t$. The time course of total crosslink formation was then fitted to the first-order rate equation $y_t = y_{max}(1 - e^{-k_t t})$.

Tm of DNA duplexes: All absorption measurements were made with a Shimadzu UV-visible spectrophotometer model UV160U and a 1 cm cuvette. Duplexes at 10 µg/ml or 20 µg/ml were annealed by heating to 90°C for 1 minute in Tris buffer (pH 7.5, 0.1 M NaCl, 0.01 M Tris) and slow cooling back to ambient temperature. Thermal denaturation profiles were then measured by monitoring absorbance at 260 nm at various temperature intervals. The solutions were allowed to equilibrate at each temperature for 15 minutes before measuring the absorbance.
The absorbance was corrected for volume expansion (Mandel and Marmur, 1968) and the melting temperature profile was determined by plotting absorbance against temperature. The peak of the first derivative plot of the melting temperature profile was defined as the Tm.

Molecular Dynamics Simulations: Simulations were performed using the AMBER 4.0 force field (Pearlman et al., 1991; Weiner et al., 1986) on the 13-mer helices d(TCACAGC̲CTGGTT)·d(AACCAGC̲CTGTGA) and d(TCACAAC̲TTGGTT)·d(AACCAAC̲TTGTGA), where C̲ indicates a mismatched cytosine base. For controls, two additional simulations were performed with the identical sequences except that the C̲-C̲ mismatch was replaced with a C-G base pair. For all simulations, canonical B-DNA helices were first relaxed in a 4000-step conjugate gradient energy minimization, prior to solvation in a box of TIP3P water molecules (Jorgensen et al., 1983). The water box had dimensions 59 Å x 40 Å x 40 Å and a minimum depth of 8 Å from the solute to the edge of the box. Sodium counterions were then added by evaluating the electrostatic interaction energy of the DNA with a +1 point charge located at the coordinates of the oxygen of each water molecule, and replacing the water molecule at the point of most negative electrostatic potential with a counterion (van Gunsteren et al., 1986). This process was then repeated (with inclusion of the interactions of previously placed sodium ions) until the required number of counterions (sixteen) had been added to achieve two-thirds electrical neutrality. Following brief minimizations of the water and counterions, the entire system was heated for 2 ps from 0 to 25 K, and pre-equilibration of the solvent was then performed for 40 ps.
A 200 ps simulation of the entire system was then performed in the nPT ensemble, using a time step of 0.002 ps and a non-bonded cutoff of 8 Å, at a temperature of 298 K. The temperature was increased linearly from 0 K to 298 K in the first 10 ps of the simulation. The coordinates of structures generated in the trajectory were saved every 0.4 ps.

5.4 Results

5.4.1 The mechlorethamine C-C crosslink forms more rapidly than the 1,3 G-G crosslink, and reaches a higher final yield

To examine the rate of mechlorethamine crosslinking of a C-C mismatch pair, and to compare this with the rate of formation of the mechlorethamine 1,3 G-G crosslink, the time course of mechlorethamine crosslinking of DNA duplexes Ia and Ib (Table 5.1) was measured. These duplexes have identical sequences, except for the central base pair, which is a C-C mismatch pair in duplex Ia and a C-G base pair in duplex Ib. Duplex Ia has both a C-C mismatch crosslinking site and a 1,3 G-G site, while duplex Ib contains only a 1,3 G-G crosslinking site. The results of incubations from zero to 120 minutes of duplexes Ia and Ib with mechlorethamine are shown in Figure 5.3A, and a kinetic analysis of these data is shown in Figures 5.3B and 5.3C. This analysis gave a rate constant of 0.05 min⁻¹ for the C-C mismatch crosslink formation in duplex Ia, and 0.02 min⁻¹ for the 1,3 G-G crosslink formation in duplex Ib (Table 5.1). We stress that these rate constants are characteristic of the rate of appearance of crosslinked DNA, and not the rates of the 'second arm' reaction. It is also apparent from Figure 5.3B that the C-C crosslink reaches a final yield of about 27% of the total DNA after about 2 hours, compared to a final yield of about 10% for the G-G crosslink.

Table 5.1.
Kinetic parameters and extent of reaction for mechlorethamine crosslinking of duplexes containing C-C mismatches and 1,3 G-G crosslink sites (a).

#    Duplex sequence               Nature of    Rate constant, kt         Maximum yield
                                   crosslink    (% crosslinks min⁻¹)      (% crosslink)
Ia   5'-CTCTCACAGCCTGGTTCAG        C-C          0.05 ± 0.010              27.5
        GAGAGTGTCCGACCAAGTC-5'     1,3 G-G      0.0015 ± 0.00015          6.0
Ib   5'-CTCTCACAGCCTGGTTCAG        1,3 G-G      0.02 ± 0.008              10.0
        GAGAGTGTCGGACCAAGTC-5'

(a) The results are average values (± S.D. of the mean) obtained from three separate experiments.

Figure 5.3. A. Autoradiogram of a 20% DPAGE gel following incubation for times of up to 120 minutes of 100 µM mechlorethamine with duplexes Ia and Ib (Table 5.1). For each duplex, lane 0 is a control and all the other lanes show the products of incubation for the time (in minutes) indicated above the lane. For duplex Ia a C-C crosslink (band X (C-C)) is observed, and for duplex Ib a 1,3 G-G crosslink (band X (1,3 G-G)) is observed. Bands due to monoadducts and unreacted single strands are identified as M and S, respectively. B. Quantification of the autoradiogram showing the time course of total crosslink formation following incubation with 100 µM mechlorethamine of duplexes Ia (△, C-C crosslink) and Ib (□, 1,3 G-G crosslink) for incubation times up to 120 minutes. Error bars (standard deviations from the mean) are smaller than the symbols. C. Linearization of the time course data for crosslink formation, showing a plot of $-\ln[1 - (y_t/y_{max})]$ vs.
$k_t t$, where $t$ = time, $y_{max}$ is the percentage of total yield of crosslinking, $y_t$ is the percentage crosslink at time $t$ and $k_t$ is the first order rate constant for crosslink formation.

5.4.2 The 1,3 G-G crosslink forms more slowly and reaches a lower final yield in a d(GCC)·d(GCC) sequence, compared to a d(GCC)·d(GGC) sequence

A longer incubation of duplex Ia with mechlorethamine showed that the 1,3 G-G crosslink can form with the sequence d(GCC)·d(GCC), but is present in undetectable amounts at 6 hours, and requires about 20 hours to reach a final yield of about 6% of the total DNA (Figure 5.4). The rate constant for formation of the 1,3 G-G crosslink in duplex Ia was estimated to be 0.0015 min⁻¹, based on a similar kinetic analysis to that shown in Figure 5.3C. We note that after 24 hours the total amount of 1,3 G-G crosslinked DNA in duplexes Ia and Ib is about the same (Figure 5.4B).

5.4.3 The mechlorethamine C-C crosslink is more stable than the 1,3 G-G crosslink.

The products of incubation of mechlorethamine for up to 24 hours with duplexes Ia and Ib are shown in Figure 5.4A, and the band intensities are plotted against time in Figure 5.4B. For the C-C crosslink in duplex Ia, a final yield of 27% of the total DNA attained after 2 hours is still maintained at 24 hours (Figure 5.4B). In contrast, the maximum level of the 1,3 G-G crosslink in duplex Ib (about 10% of the total DNA after 2 hours) decreased to only 6% after 24 hours. We have also examined the long term stability of the C-C crosslink in duplex Ia. The amount of crosslink is about 20% of the total DNA following a week of incubation at 37°C (during which time the mechlorethamine was not removed). For a similar incubation at 25°C, 26.6% of the DNA remained crosslinked after one week.
The long term chemical stability of the mechlorethamine C-C crosslink, compared to the 1,3 G-G crosslink, provides for the possibility of crystallization of a crosslinked duplex, and we are currently pursuing this.

Figure 5.4. A. Autoradiogram of a 20% DPAGE gel following incubation for times of up to 24 hours of 100 µM mechlorethamine with duplexes Ia and Ib (Table 5.1). For each duplex, lane 0 is a control and all other lanes show the products of incubation for the time (in hours) indicated above the lane. For duplex Ia both a C-C crosslink (band X (C-C)) and a 1,3 G-G crosslink (band X (1,3 G-G), on the left of the figure) are observed, and for duplex Ib a 1,3 G-G crosslink (band X (1,3 G-G), on the right of the figure) is observed. Bands due to monoadducts and unreacted single strands are identified as M and S, respectively. B. Quantification of the autoradiogram showing the time course of total crosslink formation following incubation with 100 µM mechlorethamine of duplexes Ia (triangles; C-C crosslink and 1,3 G-G crosslink) and Ib (□, 1,3 G-G crosslink) for incubation times up to 24 hours. For duplex Ia no 1,3 G-G crosslink could be detected for incubation times at or below 6 hours.

5.4.4 The amount of mechlorethamine C-C crosslink formed is dependent on the GC content of the base pairs flanking the C-C mismatch.
To examine the mechlorethamine C-C crosslink formation as a function of the base pairs flanking the C-C mismatch pair, DNA duplexes of the sequences shown in Table 5.2 (note that duplex IIa is the same as duplex Ia) were incubated with mechlorethamine to determine the final yield of crosslinked duplexes. These duplexes are all crosslinked by mechlorethamine at the C-C mismatch, as shown by piperidine cleavage and subsequent sequencing of the crosslinked product (Figure 5.5A and B). The amount of crosslinked DNA as a percentage of the total DNA (Table 5.2) was found to decrease with decreased GC:AT ratio of the base pairs (M₂-m₂, M₁-m₁, n₁-N₁ and n₂-N₂) flanking the C-C mismatch. Hence, for duplexes IIa, IIb and IIc about 27% of the total DNA is crosslinked, whereas < 20% of the total DNA was crosslinked in duplexes IIi and IIj (Table 5.2). The duplexes having an intermediate GC:AT ratio in the base pairs flanking the C-C mismatch (duplexes IId, IIe, IIf, and IIg) also have an intermediate level of crosslinking (Table 5.2). It is interesting to note that there is a slight drop in efficiency between duplexes IIa-c and IIk, which have an identical GC:AT ratio but reversed sequences in the flanking base pairs (Table 5.2). Based on this comparison, it appears that the proximity of G-C base pairs in the sequence neighboring the C-C pair, in addition to the GC:AT ratio, determines the level of crosslinking. Specifically, having the G-C pairs distal to the C-C mismatch in duplex IIk reduces the level of crosslinking from that seen in duplex IIa (Table 5.2). Also of note is the mobility of the bands, which differs and also appears to be sequence dependent. This issue will be discussed further in the next chapter.

Table 5.2.
Level of mechlorethamine crosslinking, electrophoretic mobility, and Tm of d(CTCTCA₄C₃M₂M₁C̲n₁n₂G₃G₄TTCAG)·d(CTGAAC₄C₃N₂N₁C̲m₁m₂G₃T₄GAGAG) (a).

Sequence positions: 5'-A₄C₃M₂M₁C̲n₁n₂G₃G₄ (top strand) and T₄G₃m₂m₁C̲N₁N₂C₃C₄-5' (bottom strand). The five central bases of each strand are listed.

#     Top strand     Bottom strand   % G-C (b)   % A-T (b)   % Crosslink (c)   Mobility (d)   Tm (e)
      (5' to 3')     (3' to 5')                                                (cm)           (°C)
IIa   AGCCT          TCCGA           62.5        37.5        27.5              14.6           54.5
IIb   ACCGT          TGCCA           62.5        37.5        27.1              14.6           *
IIc   ACCCT          TGCGA           62.5        37.5        27.0              14.6           54.5
IId   ATCCT          TACGA           50.0        50.0        26.7              14.4           *
IIe   ACCTT          TGCAA           50.0        50.0        26.5              14.4           *
IIf   AGCTT          TCCAA           50.0        50.0        26.0              14.4           53.8
IIg   ATCGT          TACCA           50.0        50.0        25.9              14.4           *
IIh   ATCTT          TACAA           37.5        62.5        25.0              14.2           52.9
IIi   ATCAT          TACTA           37.5        62.5        20.9              14.2           51.2
IIj   AACTT          TTCAA           37.5        62.5        18.2              14.2           **
IIk   GACTC          CTCAG           62.5        37.5        25.4              14.4           52.3

(a) Data from Figure 5.5A.
(b) GC and AT content of the four base pairs flanking the C-C mismatch on each side.
(c) Values are averaged based on three scans.
(d) Determined as the distance from the loading well to the center of the crosslink band.
(e) Based on an average of three experiments (mean of S.D. ± 1.00).
* Tm was not measured.
** Tm could not be accurately measured due to the flatness of the absorbance curve.

The autoradiograph of the sequenced crosslinked bands is shown in Figure 5.5B (lanes labeled X). Also shown are the monoadducts formed during this reaction (Figure 5.5B, lanes labeled M). The expected monoalkylation at guanine bases is evident in every sequence. However, there is also some monoadduct formation at the mismatched cytosines and, in particular, an increased monoalkylation at the 5'-adenine neighboring the C-C pair in duplex IIj, which is not observed in duplex IIk. This duplex also has an adenine 5' to the C-C pair. This suggests that some of the cytosines and the 5'-adenine in duplex IIj might be more solvent accessible.
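The % G-C values in Table 5.2 (footnote b) can be recomputed from the sequences with a short sketch. It assumes the 19-mer top strand template given in the table title, with the mismatched cytosine at the tenth position, and counts G and C among the four top-strand bases on each side of the mismatch (the flanking pairs are Watson-Crick, so one strand is enough). The function names are hypothetical helpers introduced for illustration.

```python
LEFT_CONSTANT = "CTCTCAC"   # 5' end of the top strand, ending in A4 C3
RIGHT_CONSTANT = "GGTTCAG"  # 3' end of the top strand, starting with G3 G4


def top_strand(central):
    """Build the 19-mer top strand from its five central bases (M2 M1 C n1 n2)."""
    if len(central) != 5 or central[2] != "C":
        raise ValueError("expected five bases with the mismatched C in the middle")
    return LEFT_CONSTANT + central + RIGHT_CONSTANT


def flanking_gc_percent(strand, mismatch_index=9, n_flank=4):
    """Percent G-C among the n_flank base pairs on each side of the mismatch."""
    left = strand[mismatch_index - n_flank:mismatch_index]
    right = strand[mismatch_index + 1:mismatch_index + 1 + n_flank]
    flank = left + right
    gc = sum(base in "GC" for base in flank)
    return 100.0 * gc / len(flank)


if __name__ == "__main__":
    for name, central in (("IIa", "AGCCT"), ("IId", "ATCCT"),
                          ("IIj", "AACTT"), ("IIk", "GACTC")):
        print(name, flanking_gc_percent(top_strand(central)))
```

Duplexes IIa and IIk give the same percentage under this count even though their G-C pairs sit at different distances from the mismatch, which is why the text treats proximity as a separate factor from the GC:AT ratio.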
Figure 5.5. A. Autoradiogram of a 20% DPAGE gel following incubation for 6 hours with 100 µM mechlorethamine of duplexes IIa to IIk (Table 5.2). In lanes 1-22, the odd numbered lanes are controls (no mechlorethamine) and the even numbered lanes show the products resulting from incubation of the indicated duplex with mechlorethamine. Bands are identified as X (C-C crosslink), M (monoadduct), and S (unreacted single strands). B. Sequencing gel of the piperidine cleavage products excised from A. In the figure, the duplexes are identified by their central four base pair sequence: IIa (AGC̲CT), IIf (AGC̲TT), IIh (ATC̲TT), IIj (AAC̲TT) and IIk (GAC̲TC). The lanes marked P and C are the Maxam-Gilbert reactions for purines and cytosines. Lanes X and M show the results of piperidine cleavage of bands excised from A: band X, the crosslink, and band M, the monoadducts.

5.4.5 Molecular dynamics simulations suggest the C-C mismatch pair is less mobile in a d(GCC)·d(GCC) sequence than in a d(ACT)·d(ACT) sequence.

To understand the structural basis for the sequence-dependent properties of mechlorethamine C-C crosslink formation, molecular dynamics simulations (in a water box, see Methods) were performed on duplexes comprising the central 13 base pairs of duplexes IIa and IIj (Table 5.2), d(TCACAGC̲CTGGTT)·d(AACCAGC̲CTGTGA) and d(TCACAAC̲TTGGTT)·d(AACCAAC̲TTGTGA), referred to as duplexes IIa' and IIj', respectively. As a control, additional duplexes with the same sequences, but with the mismatched cytosines C̲-C̲ replaced with a C-G pair, were also constructed and subjected to the same conditions.
The fluctuation over each simulation trajectory of hydrogen bond distances in the base pairs neighboring the C-C mismatch was used as a measure of duplex thermodynamic stability, and these data are shown in Figures 5.6A and 5.6B for duplexes IIa' and IIj', respectively. For duplex IIa', the G-C base pairs flanking the C-C mismatch remain strongly hydrogen bonded and the N3-N3 distance in the C-C mismatch pair is essentially unchanged over the simulation, suggesting that duplex IIa' is thermodynamically stable in the region surrounding the C-C mismatch. In contrast, for duplex IIj', the hydrogen bonding of the A-T base pairs flanking the C-C mismatch is unstable and disrupted, and the C-C mismatch itself is unpaired. This suggests a local distortion at the center of duplex IIj'. To determine whether the observed local distortion of the helix was an artifact of the simulation, two additional duplexes of the same sequence, except that the C-C mismatches were replaced with a C-G base pair, were examined. A plot of the level of fluctuation of the mismatched cytosines compared to a Watson-Crick base pair shows that the mismatched cytosines surrounded by A-T base pairs have a larger level of fluctuation than the mismatched cytosines surrounded by G-C base pairs (Figure 5.7A). A Watson-Crick C-G base pair within a similar sequence shows much less movement. We also examined the number of conformations generated for the mismatched cytosines (the population during the 200 ps simulation) in an open (a cytosine N3-N3 distance greater than 5.1 Å) versus a closed (an N3-N3 distance less than or equal to 5.1 Å) arrangement (Figure 5.7B). In an open conformation the cytosines are essentially extrahelical; in a closed conformation they are intrahelical.
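The open/closed bookkeeping described here can be sketched in a few lines. This is a minimal illustration that assumes the per-frame N3-N3 distances have already been extracted from the trajectory into a plain list; the function name and the example distances are hypothetical and are not data from these simulations. The 5.1 Å cutoff is the one stated in the text.

```python
from statistics import mean, pstdev

OPEN_CUTOFF = 5.1  # angstroms; N3-N3 distances above this count as open


def summarize_n3_distances(distances, cutoff=OPEN_CUTOFF):
    """Count open and closed frames and report the mean and spread."""
    if not distances:
        raise ValueError("need at least one frame")
    n_closed = sum(1 for d in distances if d <= cutoff)
    n_open = len(distances) - n_closed
    return {
        "open": n_open,
        "closed": n_closed,
        "fraction_open": n_open / len(distances),
        "mean": mean(distances),
        "sigma": pstdev(distances),  # population standard deviation
    }


if __name__ == "__main__":
    # Made-up distances (angstroms) standing in for saved trajectory frames.
    example = [3.9, 4.2, 5.1, 5.2, 7.8]
    print(summarize_n3_distances(example))
```

The same mean and standard deviation calculation applies to any per-frame distance series, so a comparable summary could be made for other interatomic distances if they were extracted in the same way.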
This analysis showed that there was a large population of open conformations for cytosines surrounded by A-T base pairs during the simulation. Figure 5.8 shows snapshots at 0 ps and 200 ps for both mismatched duplexes. These structures show that for a C-C pair surrounded by G-C base pairs (Figure 5.8c) the helix remains fairly stable throughout the simulation and there is little deformation surrounding the mismatch pair. However, for the C-C mismatch surrounded by A-T base pairs (Figure 5.8b), the 200 ps structure shows the deformation and opening of the center of the helix, with the extrahelical cytosines and the solvent-exposed 5'-adenine nearest the C-C pair, suggesting less thermodynamic stability in the center of the helix. Note that although the helix is opened up in the center, the bases at the ends still remained hydrogen bonded (Figure 5.8b, side view). This suggests that the mismatched cytosine distance is dependent upon the thermodynamic stability of the base pairs surrounding it.

Figure 5.6. Data from molecular dynamics simulations of duplexes IIa' and IIj' (the central 13 base pairs of duplexes IIa and IIj, Table 5.2). Shown for duplex IIa' (A) and duplex IIj' (B), as a function of time (ps), are (a) the N3 to N3 distance of the mismatched cytosine pair and (b) the averaged distance of all the Watson-Crick hydrogen bonds of the two base pairs neighboring the C-C mismatch.

Figure 5.7. A.
Plot of the standard deviations, sigma, in Å as a function of the mean C1'-C1' distance (Å) for the C-C mismatches within duplex IIa', with the fragment sequence d(GCC)·d(GCC) (○), and IIj', with the fragment sequence d(ACT)·d(ACT) (□), and for Watson-Crick C-G base pairs within the sequence fragments d(AGT)·d(ACT) (■) and d(GGC)·d(GCC) (●). This plot shows the level of fluctuation (sigma) of the bases during the simulation. The C1' atoms are common to all these sequences. B. Plot of the population of cytosines with an open conformation (N3-N3 distance greater than 5.1 Å, white bars) or a closed conformation (N3-N3 distance less than or equal to 5.1 Å, black bars) for duplexes IIa' and IIj' during the simulation.

Figure 5.8. Snapshots taken from the molecular dynamics simulations. Highlighted are the mismatched cytosines. (a) Starting structure for both duplexes at 0 ps. (b) Structure of duplex IIj' after 200 picoseconds (ps) of simulation. Shown are two views, one into the minor groove and a side view revealing the stability and base pairing of the bases further away from the C-C pair. Note the instability at the center of the helix and the adenine base 5' to the C-C pair, which is solvent exposed. (c) Structure of duplex IIa' after 200 ps of simulation. The side view shows that the C-C pair is essentially intrahelical.

5.5 Discussion

The above results show that the mechlorethamine C-C crosslink forms more rapidly than the 1,3 G-G crosslink. The rate constant for C-C crosslink formation, 0.05 min⁻¹, is similar to that previously reported for the formation of the aziridinium ion from mechlorethamine, which is necessary as an initial step in the DNA alkylation reaction.
Hence, we conclude that the C-C crosslink formation is rate-limited by this initial step, and that once a mechlorethamine monoadduct is formed it rapidly completes a 'second arm' reaction and forms a C-C crosslink. Again, we stress that the rate constant measured in this work is that for the overall formation of the crosslinked species, and not that for the explicit second arm reaction. We determined the overall rate constant for 1,3 G-G crosslink formation to be 0.02 min⁻¹ for a d(GCC)·d(GGC) site, and about 0.0015 min⁻¹ for a d(GCC)·d(GCC) site. A value of 0.009 min⁻¹ has previously been reported for this reaction in a plasmid DNA (Ulanov et al., 1992). The rate for the second arm reaction for mechlorethamine crosslinking of calf thymus DNA (presumably for 1,3 G-G crosslink formation, although the nature of the crosslink was not understood at the time this work was performed) has been reported as 0.08 min⁻¹ (Rutman et al., 1969), and subsequently the rapid rate of this reaction, relative to the initial aziridinium ion formation, has been confirmed in a HindIII pBR322 restriction fragment (Hartley et al., 1991) and in shorter duplexes containing 1,3 G-G crosslink sites (Bauer and Povirk, 1997). Although the second arm reaction may occur rapidly, there are other competing reactions (particularly hydrolysis; Salvati et al., 1992) that will produce terminal monoadducts. The lower rate constant for the 1,3 G-G mechlorethamine crosslink compared to the C-C crosslink suggests that these competing reactions are more significant for the former.
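To put the overall rate constants quoted above on a common time scale, the following minimal sketch converts each one into a half-time. It assumes that the garbled units in the transcript read min⁻¹ and that each crosslink approaches its final yield by simple first-order kinetics; both are assumptions made for illustration, and the function and dictionary names are hypothetical rather than taken from the original work.

```python
import math

# Overall rate constants as quoted in the text, ASSUMED to be in min^-1.
OVERALL_RATE_CONSTANTS = {
    "C-C crosslink": 0.05,
    "1,3 G-G crosslink, d(GCC).d(GGC) site": 0.02,
    "1,3 G-G crosslink, d(GCC).d(GCC) site": 0.0015,
}


def half_life_min(k_per_min: float) -> float:
    """Half-time (min) of a first-order process with rate constant k."""
    if k_per_min <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k_per_min


def fraction_of_final_yield(k_per_min: float, t_min: float) -> float:
    """Fraction of the eventual crosslink yield reached after t minutes.

    Assumes a simple first-order approach to a plateau; competing
    reactions such as hydrolysis only lower the plateau in this picture.
    """
    return 1.0 - math.exp(-k_per_min * t_min)


if __name__ == "__main__":
    for label, k in OVERALL_RATE_CONSTANTS.items():
        print(
            f"{label}: half-time = {half_life_min(k):.0f} min, "
            f"fraction of final yield at 6 h = "
            f"{fraction_of_final_yield(k, 360):.2f}"
        )
```

Under these assumptions the C-C crosslink has a half-time of roughly 14 minutes, whereas the 1,3 G-G crosslink at the mismatch-containing site has a half-time of several hours, which is in line with that crosslink becoming detectable only after long incubations.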
5.5.1 Detectable amounts of 1,3 G-G crosslinks form after 10 hours in a d(G1CC)·d(G1CC) duplex fragment

The very slow rate of formation of the 1,3 G-G crosslink at a site that also contains a C-C mismatch (the d(G1CC)·d(G1CC) sequence fragment) is consistent with the observation discussed in chapter 4, namely that no 1,3 G-G crosslinks were detected after a 6 hour incubation of mechlorethamine with duplexes having a 1,3 G-G crosslink site containing other mismatched bases. We have previously suggested (Remias et al., 1995) that for a 1,3 G-G crosslink to form, the nitrogen mustard must distort the DNA in the non-covalent complex such that the N7 atoms of the guanine bases are appropriately positioned, not more than approximately 6.5 Å apart (compared to 8.5 Å in a canonical duplex). It is possible that the motion of a mismatch pair, which is more mobile than a Watson-Crick base pair, prevents the formation and/or retention of this arrangement of the guanine bases and leads to the reduced rate of 1,3 G-G crosslink formation.

5.5.2 The amount of the mechlorethamine C-C crosslink formed is reduced by a decreased GC:AT ratio in the bases flanking the C-C mismatch pair

This reduction was not due to the loss of a 1,3 G-G crosslinking site, which in theory could increase the amount of crosslink observed. The sequencing of the crosslink bands showed no observable 1,3 G-G crosslinks. However, sequencing of the monoadducts, M, revealed that the cytosines surrounded by fewer G-C pairs have either more or just as many cytosine monoadducts as crosslinks. This suggests that these cytosines are more solvent exposed or spend more time away from each other. What is also interesting is that not
only does the GC:AT ratio alter the efficiency of the crosslink, but so does the proximity of the G-C base pairs to the C-C pair (compare duplexes IIa-c with IIk, which have the same GC:AT ratio). This suggests that changes in base content and sequence spanning at least five bases around the C-C mismatch can have an effect. This issue will be explored further in chapter 6.

The reduced reaction may be a consequence of several factors. The first might be a decreased duplex stability (as indicated by the duplex melting temperature, Tm), so that less duplex is available for reaction. The second could be a local opening in the center of the helix near the C-C mismatch, caused by having increased numbers of A-T pairs in the flanking region (as suggested by computer simulation). We note that the series of duplexes in Table 5.2 shows the expected relationship of GC:AT ratio to Tm, but that the duplex melting temperatures do not vary by more than 3.5°C. This perhaps suggests that the amount of crosslink formation may be more dependent on local instability and conformational fluctuation around the C-C mismatch than on the overall position of the duplex to single-strand equilibrium.

The second element that could alter the efficiency of the crosslink reaction is related to the local instability of mismatch base pairs. A local thermodynamic instability associated with a decreased GC:AT ratio could lead to a wider minor groove. A wider minor groove might be populated with more water molecules. In support of this, Boulard et al. (1997) have suggested that a water molecule can hydrogen bond to an intrahelical C-C pair within the fragment sequence d(CCA)·d(TCG). To further support this hypothesis,
it has recently been reported by Becker et al. (1999) that the hydrolysis rates of acridinium ester (AE), a highly chemiluminescent reporter molecule that binds to the minor groove of DNA, are dependent upon the sequence of the DNA, and that mismatches enhanced hydrolysis of AE by inducing hydration of the double helix that spread approximately five base pairs on either side of the mismatch. Therefore, decreased reactivity could be a consequence of more hydration in the minor groove, dependent upon sequence.

A third reason, also related to local instability and conformational fluctuation around the C-C mismatch, involves extrahelical cytosines. Mechlorethamine quickly forms monoadducts with the more solvent-exposed N7 atoms of guanines in the major groove and the N3 atom of adenine in the minor groove. A decreased amount of crosslink at a C-C mismatch could be explained by a hydrated minor groove. However, it is also possible that the reactivity of mismatched cytosines with mechlorethamine is limited by the dynamic equilibrium of cytosine conformations. As suggested by the dynamics simulation, the equilibrium is shifted toward a closed, intrahelical position when the C-C pair is flanked by G-C pairs, and can be shifted toward an extrahelical position when it is flanked by A-T pairs. An extrahelical conformation would also lead to more cytosine monoadduct formation at the C-C pair. This hypothesis is partially supported by the sequencing gel. What is also interesting is the increased reactivity of the adenine 5' to the C-C pair in duplex IIj but not in duplex IIk, which also has a 5'-adenine. This result is consistent with the computer simulation, which predicts instability in the center of the helix, a solvent-exposed 5'-adenine, and extrahelical cytosines (Figure 5.8b).
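The open and closed cytosine populations invoked above (Figure 5.7B) follow from a simple distance cutoff applied to each simulation frame. The sketch below illustrates that classification using the 5.1 Å N3-N3 threshold stated in the figure caption. The function name and the example distances are hypothetical and are not data from the simulations; how the original analysis sampled frames is not specified in the text.

```python
from typing import Sequence, Tuple

# Cutoff from the Figure 5.7B caption: open if N3-N3 > 5.1 A, else closed.
OPEN_THRESHOLD_ANGSTROM = 5.1


def open_closed_populations(
    n3_n3_distances: Sequence[float],
    threshold: float = OPEN_THRESHOLD_ANGSTROM,
) -> Tuple[float, float]:
    """Return (open_fraction, closed_fraction) for a distance time series.

    A frame counts as open when its N3-N3 distance is strictly greater
    than the threshold, and as closed otherwise.
    """
    if len(n3_n3_distances) == 0:
        raise ValueError("at least one frame is required")
    n_open = sum(1 for d in n3_n3_distances if d > threshold)
    total = len(n3_n3_distances)
    return n_open / total, (total - n_open) / total


if __name__ == "__main__":
    # Made-up distances (in Angstrom) standing in for simulation frames.
    example_trace = [4.2, 4.8, 5.1, 5.3, 6.0, 7.4, 4.9, 5.0]
    open_frac, closed_frac = open_closed_populations(example_trace)
    print(f"open: {open_frac:.3f}, closed: {closed_frac:.3f}")
```

A frame sitting exactly at 5.1 Å is counted as closed, matching the "less than or equal to" wording of the caption.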
5.6 Summary

In recent years there have been a number of papers speculating on the likely structure of the C-C mismatch pair (Brown et al., 1990; Gao et al., 1995; Boulard et al., 1997), on the structure of DNA hairpins containing such pairs (Chen et al., 1995; Zheng et al., 1996; Yu et al., 1997; Darlow and Leach, 1998a, b; Gacy and McMurray, 1998), and, in particular, on the possibility that the cytosine bases of the C-C pair may sometimes adopt an extrahelical position (Gao et al., 1995; Yu et al., 1997; Zheng et al., 1996; Darlow and Leach, 1998a, b). Such a conformation may be facilitated by the instability of the C-C pair, which depends upon the flanking sequence (Peyret et al., 1999), and it is possible that C-C pairs may be in dynamic equilibrium between intrahelical and extrahelical locations (Zheng et al., 1996). Previous X-ray measurements of other mismatches have shown only small changes in the double helix that are largely confined to the mismatch and its immediate nearest neighbors (Kennard, 1993). However, as stated in chapter 1, there are no X-ray crystal structures of duplexes containing mismatched cytosines, which supports the dynamic nature of this mismatch.

The above observations suggest that mechlorethamine can efficiently crosslink a duplex that has its equilibrium shifted towards an intrahelical C-C conformation. We draw this conclusion based on two observations. First, a comparison of the rate constant for C-C crosslink formation with that for 1,3 G-G crosslink formation suggests that there is a lack of competing hydrolysis reactions during C-C crosslink formation. This could be due to the proximity of the second nucleophilic cytosine, which is available for reaction with the second arm of mechlorethamine. However, this availability must happen often
and must have a distance of at least 5.1 Å to accommodate the reactive arms of mechlorethamine. Although we do not have definite proof as to which atoms are reacting, the distance requirement suggests that mismatched cytosines flanked by G-C pairs must spend more time within the helix. Second, having G-C pairs proximal to the C-C mismatch results in a greater amount of crosslink being formed and also, as shown by computer simulation, allows for a more thermodynamically stable intrahelical C-C pair.

This chapter discussed the reactivity of a single mismatched cytosine pair. The next chapter explores the use of this new reaction to probe multiple mismatched cytosines within the Fragile X sequence.

Chapter 6

Mechlorethamine can crosslink multiple C-C mismatches in the d(GCC)n trinucleotide repeat sequence

6.1 Overview and Introduction

Two molecular dynamics simulations were performed on duplexes with a single d(GCC)·d(GCC) fragment placed within two different random sequences. As described in chapter 3 and chapter 5, both simulations showed no significant motion of the mismatched C-C pair. The cytosine bases remained essentially intrahelical and stacked with the neighboring G-C pairs. In contrast, the simulation in chapter 3 of a d(CCG)n hairpin (containing four d(GCC)·d(GCC) repeat units) revealed that the C-C pairs of the central fragments had the potential to move into the minor groove, excluded by the motion of the neighboring guanine bases towards each other. We noted that this motion is consistent with a conformational change towards a structure similar to an e-motif DNA, characterized by Gao et al. (1995) and described in chapter 2. This structure contains a central d(GCC)·d(GCC) sequence in which the central cytosine bases are displaced into the minor groove, with stacked guanine bases approximately 4.10 Å apart.
To provide an experimental correlate for the dynamics results, in this chapter we examine the reaction of 'multiple' mismatched cytosines within the d(GCC)·d(GCC) fragment with mechlorethamine. As stated, the original reason for choosing mechlorethamine as a probe was the difference in the G(N7)-G(N7) distance that was predicted to be present in the multiple d(GCC)4·d(GCC)4 repeat fragments. Based on this prediction, we reasoned that mechlorethamine might react more with this sequence, since the reactivity of mechlorethamine is sensitive to the N7-N7 distance (Remias et al., 1995). Recall that the simulation in chapter 3 showed that the distance between the guanine N7 atoms changed from a starting distance of 8 Å to about 5 Å. Due to the 'wedging' motion of the guanines, the mismatched cytosines were pushed into the minor groove. This is important because it suggests that the cytosines may be more thermodynamically stable in an extrahelical position due to the creation of a pseudo GpC step. Since we now know that mechlorethamine can form a crosslink between a single pair of mismatched cytosines, we wondered if reactivity would be different, due to a conformational shift of the mismatched cytosines present in multiple repeats, as predicted by the simulation.

In the previous chapter we showed that the reactivity of mechlorethamine with an isolated mismatched cytosine pair within the repeat fragment d(GCC)·d(GCC) was dependent upon the sequence and stability of the flanking base pairs. We proposed that this reactivity suggested different conformations for the mismatched cytosines. In this chapter we show a similar dependence of the mechlorethamine reaction for multiple mismatched cytosines in duplex sequences, and also show that mechlorethamine can crosslink a hairpin with an isolated mismatched cytosine.
Furthermore, we show that this dependence is related to at least four base pairs on both sides of the C-C pair. The efficiency of the crosslink reaction for 'isolated' cytosines in a d(GCC)·d(GCC) fragment is greater than that of the reaction in multiple d(GCC)·d(GCC) repeat fragments. The difference in reactivity is consistent with the simulation data and with the experimental data presented in chapter 2, which suggest that the mismatched C bases in d(GCC)·d(GCC) multiple repeats may be extrahelical. In addition, in this chapter we show that the mobility of the C-C crosslink on a denaturing gel is also dependent upon base stability.

6.2 Materials and Methods

Chemicals: Mechlorethamine (N,N-bis[2-chloroethyl]methylamine) and T4 polynucleotide kinase were purchased from Sigma. [γ-32P]ATP was purchased from ICN. All synthetic oligonucleotides (Tables 6.1 and 6.2) were synthesized on an Applied Biosystems Model 394 automated synthesizer, deprotected, and purified with a COP cartridge at the USC Norris Cancer Center. DNA used for hairpin and sequencing reactions was further purified on a 20% denaturing polyacrylamide gel. All other reagents were analytical grade.

32P-5'-end labeling of DNA: Approximately 10 µg of column-purified synthetic DNA was 5'-end labeled with [γ-32P]ATP (5 µL, 4500 Ci/mmol) by incubation in buffer (30 mM Tris (pH 7.5), 10 mM MgCl2, 5 mM dithiothreitol) with 30 units of T4 polynucleotide kinase for 1 hour at 37°C (Sambrook et al., 1989). The reaction was stopped by addition of 5.5 µL 3 M sodium acetate (pH 5.2) and 150 µL pre-chilled 95% ethanol. The unincorporated [γ-32P]ATP was removed by precipitation in 95% ethanol at -20°C overnight, and the DNA was lyophilized and resuspended in a 0.1 M NaCl solution.
Alkylation of DNA: An equal amount of the unlabeled complementary strand was added to a 0.1 M NaCl solution of the labeled oligonucleotide, heated to 65-70°C and then slowly cooled to room temperature. Annealed duplexes were refrigerated for at least 1 hour at 4°C before reaction to ensure integrity of the duplex. Following annealing of the strands, a 1 µM duplex DNA solution containing 0.1 M NaCl and 10 mM Tris (pH 7.5) was incubated for 3-6 hours at 37°C with 100 µM mechlorethamine in a total volume of 100 µL. For each experiment, a fresh solution of the nitrogen mustard (100 mM) was prepared in dimethyl sulfoxide (DMSO), rapidly diluted to 10 mM and immediately added to the DNA solution. Following incubation with the mustard, the reaction was terminated by addition of 5.5 µL of 3 M sodium acetate, 3 µL tRNA (5 mg/mL), and 150 µL pre-chilled 95% ethanol, and the DNA was precipitated in three times the volume of pre-chilled 95% ethanol at -20°C overnight, washed, and then lyophilized. The DNA was then dissolved in 2 µL distilled water and 8 µL tracking dye (80% formamide, 1 mM EDTA, 0.025% bromophenol blue and xylene cyanol).

Detection of crosslinked DNA: The samples were loaded onto a 20% denaturing polyacrylamide gel (29:1 acrylamide/bisacrylamide, 8 M urea, 89 mM Tris-borate (pH 8.5), 2 mM EDTA (TBE buffer), 0.4 mm thick, 38 x 31 cm, 2500 V, 45 W) and run until the xylene cyanol marker had migrated 15-22 cm. The temperature of the gel was measured and determined to be 50°C. The band due to the mechlorethamine-crosslinked DNA was recovered from the gel and sequenced as previously described (Romero et al., 1999a) to determine the crosslinking site.

Quantitation and Analysis of DNA Crosslink Formation: After the gel was exposed to X-ray film, the intensity of the crosslink band was analyzed and expressed as a percentage of the total DNA in each lane of the gel.
Averaged band intensities from three experiments were calculated and the standard deviation determined.

Tm of DNA duplexes: All absorption measurements were made with a Shimadzu UV-visible spectrophotometer model UV160U and a 1 cm cuvette. Duplexes at 10 µg/mL or 20 µg/mL were annealed by heating to 90°C for 1 minute in Tris buffer (pH 7.5, 0.1 M NaCl, 0.01 M Tris) and slow cooling back to ambient temperature. Thermal denaturation profiles were then measured by monitoring absorbance at 260 nm at various temperature intervals. The solutions were allowed to equilibrate at each temperature for 15 minutes before measuring the absorbance. The absorbance was corrected for volume expansion (Mandel and Marmur, 1968) and the melting temperature profile was determined by plotting absorbance against temperature. The peak of the first derivative plot of the melting temperature profile was defined as the Tm.

6.3 Results

6.3.1 The mechlorethamine C-C crosslink forms in multiple repeats with variable efficiency

DNA duplexes of the sequences shown in Table 6.1, Series 1, were incubated with mechlorethamine, and the products were electrophoresed on a 20% polyacrylamide denaturing gel to determine the final yield of crosslinked duplexes. In Figure 6.1, all the bands with slower mobility (bands labeled X) are due to mechlorethamine C-C crosslinked duplexes (Romero et al., 1999a), except that in lane 8, which is a 1,3 G-G crosslink. The amount of crosslinked DNA as a percentage of the total DNA was found to decrease with decreased GC:AT/CC ratio of the four base pairs on each side of a C-C mismatch. The crosslinking efficiency is highest for the duplexes containing the isolated C-C mismatch in Figure 6.1.
Hence, for duplexes with isolated repeats at least 27% of the total DNA is crosslinked, whereas <21% of the total DNA was crosslinked in duplexes with more than one repeat (Table 6.1). A completely Watson-Crick paired duplex, Y1Y2Y3 = GGG (Figure 6.1B, lane 8), with only a 1,3 G-G crosslink site gave a low yield of 3%. It is apparent that less cytosine to cytosine crosslink is observed for the duplexes in which the d(GCC)·d(GCC) repeats are present in tandem. However, interruption of the repeat by a single d(GCC)·d(GGC) unit restores crosslinking of the (now isolated) flanking d(GCC)·d(GCC) fragments (compare lane 10 with lane 14 in Figure 6.1B). It is interesting to note that there is quite a difference in mobility for all these sequences. Therefore, sequencing reactions were performed to prove the identity of the bands, and their mobility was quantified (Table 6.1 and Figure 6.2).

Table 6.1. Series 1: Level of mechlorethamine crosslinking, electrophoretic mobility, and Tm of duplexes d(CTCTC(GCC)nGTATC)·d(GATAC(GYC)nGAGAG).(a) For each duplex the central top strand 5'-(GCC)n is shown over the bottom strand (CYG)n-5'.

Site      % C-C(b)   % G-C(b)   % A-T(b)   % Crosslink   Mobility (cm)(c)   Tm (°C)(d)

5'-GC1CGC2CGC3C / CG1GCG2GCG3G-5' (Y = G1G2G3)
G1G2G3    0.0        100        0.0        3.0           14.8               71

Multiple repeats

5'-GC1CGC2CGC3CGC4C / CC1GCC2GCC3GCC4G-5' (Y = C1C2C3C4; Tm 47)
C1**      12.5       75.0       12.5       5.5           11.5(e)
C2**      25.0       75.0       0.0        5.0           11.3
C3**      25.0       75.0       0.0        4.5           11.2
C4**      12.5       62.5       25.0       6.0           11.4
Total yield of crosslink: 21.0

5'-GC1CGC2CGC3C / CC1GCC2GCC3G-5' (Y = C1C2C3; Tm 42.0)
C1        12.5       75.0       12.5       6.2           11.5
C2        25.0       75.0       0.0        6.0           11.4
C3        12.5       62.5       25.0       7.5           11.6
Total yield of crosslink: 19.7

5'-GC1CGC2CGC3C / CC1GCC2GCG3G-5' (Y = C1C2G3; Tm 46.1)
***       ?          ?          ?          7.1?          14.8
C1        12.5       75.0       12.5       6.9           13.0
C2        12.5       87.5       0.0        6.9           14.4
Total yield of crosslink: 20.9

Isolated repeats

5'-GC1CGC2CGC3C / CC1GCG2GCC3G-5' (Y = C1G2C3; Tm 47.0)
C1        0.0        87.5       12.5       15.0          13.5
C3        0.0        75.0       25.0       16.4          13.0               *
Total yield of crosslink: 31.4

5'-GC1CGC2CGC3C / CC1GCG2GCG3G-5'
C1        0.0        87.5       12.5       27.9          14.4               *

5'-GC1CGC2CGC3C / CG1GCC2GCG3G-5'
C2        0.0        100        0          [illegible]   14.6               52.1

5'-GC1CGC2CGC3C / CG1GCG2GCC3G-5'
C3        0.0        75.0       25.0       [illegible]   14.2               *

Hairpin

5'-CTCTCAGAGCCTCGTTCAGTT / 3'-GAGAGTCTCCGAGCAAGTCTT
C         0.0        62.5       37.5       26.5          14.7               *

(a) Data from Figures 6.1 and 6.3. (b) G-C, A-T and/or C-C content of the four base pairs flanking the C-C mismatch on each side. (c) Determined as the distance from the loading well to the center of the crosslink band. (d) Based on an average of three experiments. (e) Mobility corrected for molecular weight. * Tm was not measured. ** Best estimate, based on one sequencing gel for the top and bottom strands. *** This band splits into two uncleaved bands on the sequencing gel, with very light bands at both C1 and C2.

Figure 6.1. Autoradiogram of a 20% DPAGE gel following incubation of Series 1 duplexes (Table 6.1) for 6 hours with 100 µM mechlorethamine. (A) Single isolated repeats. Lane 1 has no mechlorethamine. (B) Multiple and isolated repeats. In lanes 5-14, the odd numbered lanes are controls (no mechlorethamine) and the even numbered lanes show the products resulting from incubation of the indicated duplex with mechlorethamine. Bands have been numbered according to the way they were excised from the gel and sequenced. Bands are identified as X (C-C crosslink), M (monoadduct), and S (unreacted single strands). The duplex sequence is indicated by the 3'-Y1Y2Y3-5' bottom strand.
All duplexes had the same top strand and were labeled with 32P. Both gels were run until the xylene cyanol marker had migrated 15 cm.

Figure 6.2. Autoradiogram of a 20% DPAGE sequencing gel. The lane labeled G indicates the Maxam-Gilbert guanine sequencing reaction. The lane numbers indicate the number of the band that was excised from Figure 6.1. Sequences are indicated at the bottom of the gel. (A) Sequencing of the single isolated repeats. The bands were excised from the gel in Figure 6.1A. (B1) Sequencing of multiple repeats. The bands were excised from a gel similar to that in Figure 6.1B, except that multiple lanes (4) were needed to obtain a high enough yield for sequencing and the gel was run for a longer time to allow the bands to separate. (B2) Sequencing of isolated repeats. The bands were excised from the gel in Figure 6.1B.

6.3.2 The DPAGE mobility of the mechlorethamine C-C crosslinked species is dependent on the GC:AT/CC content of the base pairs flanking the C-C mismatch

In addition to the variable crosslinking efficiency, the crosslinked species for duplexes with an increasing number of repeats have slower mobility in a denaturing gel, compared to an isolated mismatch (compare the mobility in Figure 6.1A, lanes 2 and 4, with that in Figure 6.1B, lane 10). These duplexes have the mismatched cytosine in the same position. The only difference is that the duplex in lane 10 has two isolated mismatches. This suggests that mobility is more than a function of position within the sequence. The electrophoretic mobility of the crosslinked duplexes (Figure 6.1 and Table 6.1, Series 1) on a denaturing polyacrylamide gel (DPAGE) is quantified in Table 6.1.
It is apparent from these data that the GC:AT/CC ratio of the flanking base pairs influences this mobility, and that higher AT and/or CC content leads to reduced electrophoretic mobility. To show that the band labeled X was due to a crosslink between two complementary strands, the identity was further confirmed in a separate experiment in which either the top or the bottom strand was radiolabeled (Figure 6.3B). These results were surprising because not all the top and bottom labeled duplexes had similar mobility. Only the duplex with the isolated, central repeat (Figure 6.3B, lane 11) was consistent with what had been observed previously. Recall that in chapter 4 we identified the crosslink band by the formation of species with similar mobility. A duplex that is identical in every way, except for the labeled end, should have the same mobility. We will explore this issue further in chapter 7. Also shown, in Figure 6.3A, is the ability of mechlorethamine to crosslink a hairpin with one mismatched cytosine in an isolated repeat.

Figure 6.3. Autoradiogram of a 20% DPAGE gel showing the variable mobility of the C-C crosslink of duplexes from Table 6.1, Series 1. (A) Crosslinking of a hairpin with a single isolated repeat. The lane labeled R0 is a control and indicates a random sequence with the same molecular weight as the hairpin that is unable to form any secondary structure. Lane H0 indicates the hairpin sequence incubated with 0 µM mechlorethamine, lane H100 indicates the hairpin incubated with 100 µM mechlorethamine, and lane D100 is the marker lane and indicates a duplex sequence identical to the hairpin stem incubated with 100 µM mechlorethamine. All incubations were for 3 hours (Table 6.1, Series 1). (B) Crosslinking of the top and bottom strand of 1-4 multiple repeats in tandem.
The lane labels at the top indicate the number of tandem repeats. 00 is the control; 11 indicates the top and bottom strand labeled duplexes, respectively, with a single isolated repeat in the center. All incubations were for 4 hours. To allow the bands to separate, the gel was run until the xylene cyanol marker had migrated 24 cm. Bands are identified as X (C-C crosslink), M (monoadduct), S (unreacted single strands), and SS (unreacted hairpin and random single strands).

6.3.3 Sequential replacement of A-T pairs by G-C pairs has a predictable effect on the amount of C-C crosslinked DNA and on the DPAGE mobility of the crosslink

To compare further the effect of the GC:AT/CC ratio, and the effect of the proximity of the G-C pairs to the C-C mismatch, on the formation and DPAGE mobility of the crosslink, we used the duplex sequences shown in Table 6.2, Series 2. The autoradiogram of the denaturing gel of the products of incubation of these duplexes with mechlorethamine for 6 hours is shown in Figure 6.4B, and the mobilities and crosslink efficiencies are shown in Table 6.2. Figure 6.4B shows that increasing the GC:AT ratio up to four base pairs on each side of the C-C mismatch (duplexes b, c, d and e, compared to duplex a; Table 6.2, Series 2) increases both the amount of crosslink formed and the DPAGE mobility of the crosslinked species. For the four duplexes (b, c, d and e) having identical GC:AT content in the flanking base pairs, as the G-C pair becomes more proximal to the C-C mismatch the amount of crosslink and the electrophoretic mobility of the crosslinked duplex increase. Figure 6.4A shows the similar mobility of C-C crosslinked duplexes that have flanking sequences that are A rich or C rich.
The A rich duplex (Table 6.2, Series 2, a) has just one C-C crosslink site, while the C rich duplex (Table 6.1, Series 1) has three mismatched C-C sites in tandem.

Table 6.2. Series 2: Level of crosslinking and electrophoretic mobility of duplexes d(CTCCC M4M3M2M1 C n1n2n3n4 CCCAG)·d(CTGGG N4N3N2N1 C m1m2m3m4 GGGAG).(a)

#    Top 5'-M4M3M2M1 C n1n2n3n4   Bottom m4m3m2m1 C N1N2N3N4-5'   % G-C(b)   % A-T(b)   % Crosslink   Mobility (cm)(c)
a    AATT C AATT                  TTAA C TTAA                     0          100        8.7           11.5
b    GATT C AATC                  CTAA C TTAG                     25         75         9.2           12.8
c    AGTT C AACT                  TCAA C TTGA                     25         75         12.5          13.2
d    AACT C AGTT                  TTGA C TCAA                     25         75         12.9          13.4
e    AATC C GATT                  TTAG C CTAA                     25         75         16.9          13.6

(a) Data from Figure 6.4. (b) G-C and A-T content of the four base pairs flanking the C-C mismatch on each side. (c) Determined as the distance from the loading well to the center of the crosslink band.

Figure 6.4. Autoradiogram of a 20% DPAGE gel following incubation for 6 hours with 100 µM mechlorethamine. (A) The label a at the bottom indicates the duplex from Table 6.2, Series 2, a. CCC indicates the duplex with three repeats from Table 6.1, Series 1. (B) Duplexes from Table 6.2, Series 2, a-e. In lanes 1-10, the odd numbered lanes are controls (no mechlorethamine) and the even numbered lanes show the products resulting from incubation of the indicated duplex with mechlorethamine. Bands are identified as X (C-C crosslink), M (monoadduct), and S (unreacted single strands). Both gels were run until the xylene cyanol marker reached 18 cm.

6.4 Discussion

The amount of the mechlorethamine C-C crosslink formed is reduced by a decreased GC:AT/CC ratio in the bases flanking the C-C mismatch pair, and also, for the same GC:AT ratio, by having G-C pairs distal to the C-C mismatch.
Therefore, the efficiency is altered not only by the GC:AT/CC ratio but also by the proximity of a G-C pair to the C-C mismatch. Changes in base content up to four bases from the C-C mismatch can have an effect. Again, this may be a consequence of decreased duplex stability (as indicated by the duplex melting temperature, Tm) or, as proposed in the previous chapter, of a different conformation of the cytosines dependent upon the flanking sequence. It is curious to note that the efficiency for three multiple repeats is very similar to that for four, even though the duplex with four repeats is slightly longer and therefore, in theory, more stable.

In chapter 4, the duplex stability did not appear to be an issue. Recall that in chapter 4, single strands of the duplex were added to the reaction mixture of mechlorethamine. After the addition of the single strands, the complementary strand was added to the reaction mixture and allowed to incubate for 6 hours. Surprisingly, the amount of crosslink was not reduced in these experiments (see chapter 4, Figure 4.2D; compare lanes 6 and 8 with lanes 10 and 12). Therefore, mechlorethamine does not disturb or alter the amount of preformed duplex; it might even help duplex formation. In addition, the duplexes are kept at 4°C before mechlorethamine is added (see Methods), to assure duplex integrity. It is also interesting to note that the efficiency of the 1,3 G-G crosslink in a perfectly stable Watson-Crick paired DNA, with many sites for reaction, was not increased, in contrast to that of a C-C crosslink (Table 6.1, Series 1, Y1Y2Y3 = GGG, Tm = 71°C). This result was also observed in chapter 4. This result indicates that duplex stability is not the only issue for this reaction.
It is also interesting to note that the Watson-Crick paired duplex used for these experiments contains the sequence for both strands of the Fragile X repeat, that is d(GCC)3·d(GGC)3. The lowered reactivity probably indicates that the duplex was not folding into a slipped structure with mispaired cytosines.

6.5 Summary

The most interesting result obtained in this chapter is the mobility of the crosslinked species. The decrease in mobility is probably not due to the formation of more than one crosslink on the duplex. The mobility of duplexes with 3 and 4 repeats is essentially the same (when mobility is corrected for molecular weight), and the mobility of 3 repeats is similar to that of an AT-rich sequence (Series 2, Figure 6.4A). Therefore, the reduced mobility is not just a consequence of multiple crosslinks forming on a more C-C rich sequence. That the mobility can be increased in an AT-rich duplex with only one crosslink site by sequentially replacing A-T pairs with G-C pairs and moving the G-C pair closer to the crosslink site indicates that the mobility is dependent on duplex stability. The nature of this mobility will be explored further in chapter 7.

Chapter 7

Electrophoretic Mobility of Mechlorethamine Crosslinked DNA Duplexes Containing C-C Mismatches and 5'-End Labeled with 32P-Phosphate or Fluorescein Phosphoramidite

7.1 Overview of Chapter

We have shown recently that a DNA C-C mismatch pair can be crosslinked by mechlorethamine. Here we report some unusual DPAGE mobility of C-C crosslinked duplexes d(CCGN4AGN7AGN10ATTCATCTG)·d(GGCCTCCTCCTAAGTAGAC), where N4=C, N7=G, N10=G (duplex 4a), N4=G, N7=C, N10=G (duplex 7a), and N4=G, N7=G, N10=C (duplex 10a), when these duplexes are 5' end-labeled with either fluorescein phosphoramidite or 32P-phosphate.
First, we find that the mobility of the crosslinked duplexes increases from 10a to 7a to 4a, regardless of the nature and the position of the label, and despite the identical molecular weights of the three duplexes. For the 32P-phosphate labeled duplexes, the mobility of crosslinked duplexes 7a and 4a is greater when the label is positioned distal to the crosslink, compared to when the 32P-phosphate label is proximal to the crosslink in the equivalent duplex. 32P-phosphate labeled duplex 10a did not show this behavior. For duplexes 10a, 7a and 4a carrying fluorescein labels, the mobility of each duplex was enhanced by having the label on the top strand (proximal to the crosslink for 7a and 4a), relative to the equivalent duplex carrying a label on the bottom strand. This is the opposite of the result for the 32P-phosphate-labeled duplexes. Furthermore, the difference in mobility was greatest for duplex 10a, which contains a centrally placed crosslink. Double labeling of the duplexes with a 32P-phosphate and fluorescein phosphoramidite produced mobilities that were inconsistent with a simple combination of the single label results. We provide an explanation of the results based (i) on the local sequence around the crosslinked C-C mismatch pair, (ii) on the location of the mismatch relative to the duplex end, and the resultant 'pseudo-single stranded' nature of the denatured crosslinked species, and (iii) on the influence of the label in ‘orienting’ the DNA for entry into the gel.

7.2 Introduction

Gel electrophoresis is a technique for separation of macromolecules by the differential migration of the molecules in an electric field. The rate of migration is influenced by the size and shape of the molecule, the charge it carries, the magnitude of the applied current, and the resistance of the medium the molecules travel through.
For DNA molecules, agarose gels are used to analyze large DNA samples ranging from 70 base pairs (bp) (3% agarose gels) to 800 000 bp (0.1% agarose gels), while polyacrylamide gels are used to analyze DNA sequences that are between 6 bp (20% acrylamide) and 1000 bp (3% acrylamide) long (Sealey and Southern, 1990).

Chemical probing combined with gel electrophoresis is a powerful tool for elucidating the structure of DNA (Lilley, 1992). Potentially interesting regions of DNA can be probed with base-specific reagents that form adducts (DNA or RNA molecules with covalent modifications). Once formed, these adducts can be separated and purified from unreacted DNA molecules by gel electrophoresis. In particular, gel electrophoresis is applicable to the separation of covalently crosslinked duplexes (duplexes that have reacted with a bifunctional alkylating agent, such that one reaction occurs on each strand) from a reaction mixture containing unreacted and monoalkylated duplexes (Hartley et al., 1993). Bifunctional alkylating agents are an effective group of chemical probes that have been used to show the presence of alternate DNA structures such as cruciforms and supercoiling, Z-DNA, RNA, and protein-DNA interactions (Ussery et al., 1992).

In this chapter we continue to characterize the potential use of the C-C mechlorethamine crosslink by investigating the electrophoretic mobilities of several mechlorethamine C-C crosslinked duplexes 5' end-labeled with either 32P-phosphate or fluorescein phosphoramidite, and report a number of unusual results. We show that the mobility of crosslinked duplexes of the same length is a function of the location of the crosslink, relative to the duplex end. For the identical crosslinked duplex, the electrophoretic mobility also depends on the position of the label (which may be on either the top or bottom strand).
We also show that the differential mobility of identical duplexes carrying either top or bottom strand labels is a function of the label itself (32P-phosphate or fluorescein phosphoramidite). We explain these results by considering a combination of base content, local duplex stability, the influence of the label in the gel, and the ‘pseudo-single stranded’ conformations adopted by the partially denatured crosslinked duplexes.

Figure 7.1. Structure of the 5'-fluorescein label.

7.3 Materials and Methods

Chemicals: Mechlorethamine (bis[2-chloroethyl]methylamine) was purchased from Sigma and 5'-fluorescein phosphoramidite ([(3',6'-dipivaloylfluoresceinyl)-6-carboxamidohexyl]-1-(2-cyanoethyl)-(N,N-diisopropyl)-phosphoramidite) was purchased from Glen Research. The oligodeoxynucleotides were obtained from the USC Norris Cancer Center Microchemical Core Facility at the University of Southern California. They were synthesized on an Applied Biosystems Model 394 automated synthesizer, deprotected, purified and, when required, labeled with the 5'-fluorescein phosphoramidite. All other reagents used were analytical grade.

32P-5'-end labeling of DNA: Approximately 10 µg of column purified synthetic DNA was 5'-end labeled with [γ-32P]ATP (5 µL, 4500 Ci/mmol) by incubation in buffer (30 mM Tris (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol) with 30 units of T4 polynucleotide kinase for 1 hour at 37°C. The reaction was stopped by addition of 3 M sodium acetate (5.5 µL, pH 5.2) and pre-chilled 95% ethanol (150 µL). The unincorporated [γ-32P]ATP was removed by precipitation of the DNA in 95% ethanol at -20°C overnight; the DNA was then lyophilized and resuspended in a 0.1 M NaCl solution.
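As a quick consistency check on the mechlorethamine concentrations quoted for the crosslinking reaction in this section (0.02 g in 1000 µL of DMSO, described as a 100 mM stock, diluted to 10 mM, with 2 µL used per incubation), the arithmetic can be worked through numerically. The sketch below is illustrative only and rests on two assumptions that are not stated in the text: that the reagent was weighed as the hydrochloride salt (molar mass about 192.5 g/mol), and that 100 µM is the final concentration in the reaction. The function names are hypothetical.

```python
# Illustrative arithmetic only; not part of the original protocol.
# ASSUMPTION: mechlorethamine was weighed as the hydrochloride salt.
MW_MECHLORETHAMINE_HCL = 192.5  # g/mol (assumed salt form)


def stock_concentration_mM(mass_g: float, volume_uL: float, mw_g_per_mol: float) -> float:
    """Concentration (mM) of a stock made by dissolving mass_g in volume_uL."""
    moles = mass_g / mw_g_per_mol
    litres = volume_uL * 1e-6
    return moles / litres * 1e3


def implied_reaction_volume_uL(aliquot_uL: float, aliquot_mM: float, final_uM: float) -> float:
    """Total volume (uL) at which the aliquot reaches the stated final concentration."""
    return aliquot_uL * (aliquot_mM * 1e3) / final_uM


if __name__ == "__main__":
    # 0.02 g dissolved in 1000 uL DMSO, quoted in the text as a 100 mM stock
    stock = stock_concentration_mM(0.02, 1000.0, MW_MECHLORETHAMINE_HCL)
    print(f"stock concentration: {stock:.1f} mM")

    # 2 uL of the 10 mM dilution, ASSUMING 100 uM is the final concentration
    volume = implied_reaction_volume_uL(2.0, 10.0, 100.0)
    print(f"implied reaction volume: {volume:.0f} uL")
```

Under the hydrochloride assumption the stock comes out at about 104 mM, consistent with the nominal 100 mM; the reaction volume of about 200 µL is only what the quoted numbers would imply and is not given in the text.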
Mechlorethamine crosslinking reaction: Following annealing of the strands, a 1 µM duplex DNA solution containing 0.1 M NaCl and 10 mM Tris (pH 7.5) was incubated for 6 hours at 37°C in a water bath with a freshly prepared 100 µM mechlorethamine solution in DMSO (0.02 g of mechlorethamine was dissolved in 1000 µL of DMSO to give a 100 mM solution that was further diluted to 10 mM, and 2 µL of this solution was used in the incubation). Following incubation with the mustard, the reaction was terminated by addition of 5.5 µL of 3 M sodium acetate, 3 µL tRNA (5 mg/mL), and 150 µL pre-chilled 95% ethanol. The DNA was precipitated in three times the volume of pre-chilled 95% ethanol at -20°C overnight, washed, and then lyophilized. The DNA was then dissolved in 2 µL distilled water and 8 µL tracking dye (80% formamide, 1 mM EDTA, 0.025% bromophenol blue and xylene cyanol).

Detection of crosslinked DNA: The samples were loaded onto a 20% denaturing polyacrylamide gel (29:1 acrylamide/bisacrylamide, 8 M urea, 89 mM Tris-borate (pH 8.5), 2 mM EDTA (TBE buffer), 0.4 mm thick, 38 x 31 cm, 2700 V, 50 W) and electrophoresed until the xylene cyanol marker had migrated 15-22 cm. The temperature of the gel was measured and determined to be 50°C. Prior to loading of the DNA, the gel was pre-run with TBE buffer at 2500 V, 50 W for 60 minutes. Bands due to 5'-fluorescein phosphoramidite labeled DNA molecules were detected using UV light and photographed. 5'-32P-phosphate labeled DNA molecules were detected by autoradiography.

7.4 Results

7.4.1 The electrophoretic mobility of a C-C crosslinked duplex is dependent on the position of the crosslink in the duplex, and on the location of the 32P-phosphate 5'-end label.
In the first series of experiments we examined the electrophoretic mobility of mechlorethamine crosslinked duplexes 10a, 7a and 4a (Table 7.1). These duplexes have an identical ‘bottom’ strand, but vary in their ‘top’ strands at positions 10, 7 and 4, which are a cytosine or a guanine. Hence, duplex 10a has a C-C mismatch pair located at the center of the duplex sequence, while in duplexes 7a and 4a the C-C pair is shifted progressively towards the duplex end. The same base pairs neighbor the C-C mismatch pair in each duplex and the duplexes also have identical molecular weights.

The electrophoretic mobilities of the mechlorethamine C-C crosslinked duplexes 10a, 7a and 4a carrying a 32P-phosphate 5'-end label are shown in Figure 7.2. Two main results are apparent. First, the electrophoretic mobility of the crosslinked duplexes increases from 10a to 7a to 4a. Second, for duplexes 7a and 4a, the location of the 32P-phosphate label also has an influence on the mobility. In each, the species carrying a 32P-phosphate 5'-end label proximal to the crosslink (that is, labeled on the top strand) has a slower mobility than the identical crosslinked duplex carrying a 32P-phosphate 5'-end label distal to the crosslink (that is, labeled on the bottom strand) (compare lanes 6 and 8 (duplex 7a) or lanes 10 and 12 (duplex 4a) in Figure 7.2). In accord with this trend, no discernible difference can be seen in the mobilities of top and bottom strand labeled duplex 10a, because the crosslink is centrally located in the duplex.

Table 7.1.
DNA duplex sequences and shorthand notations.a

Duplex     Duplex sequence                  % GC     % AT     #GC b
10a        5'-CCGGAGGAGCATTCATCTG           50.0     50.0     7/3
              GGCCTCCTCCTAAGTAGAC-5'
7a         5'-CCGGAGCAGGATTCATCTG           62.5     37.5     5/5
              GGCCTCCTCCTAAGTAGAC-5'
4a         5'-CCGCAGGAGGATTCATCTG           83.0 c   17.0     3/7
              GGCCTCCTCCTAAGTAGAC-5'
10b        5'-CCTATACTCCGAGTATACC           50.0     50.0     4/4
              GGATATGAGCCTCATATGG-5'
7b         5'-ATACTCCGAGTATACCCCT           50.0     50.0     2/6
              TATGAGCCTCATATGGGGA-5'
10c        5'-ATTCATCTGCCCGGAGGAG           75.0     25.0     3/7
              TAAGTAGACCGGCCTCCTC-5'

a The numerical part of the notation refers to the position of the C-C mismatch (involving either base 4, 7 or 10 of the top strand). The letter designation refers to related duplexes. b Number of G-C pairs on each side of the duplex. c Only three bases on each side were counted for this duplex.

7.4.2 Multiple DPAGE bands are observed for 32P-phosphate double labeled duplexes.

To probe further the differential mobility caused by the location of the 32P-phosphate label, we electrophoresed the duplexes 10a, 7a and 4a (Table 7.1) carrying simultaneously a 32P-phosphate label on both strands. The results of these experiments are also shown in Figure 7.2 (lanes 13-15). It is apparent that for the double-labeled duplexes 7a and 4a bands are present in the gel with mobilities that correspond to the individual bands observed for the equivalent single-labeled duplexes. Hence, we conclude that distinct populations of crosslinked duplex, having different electrophoretic mobilities, are present for the double-labeled species, and that, for the single-labeled duplexes, one of these populations is preferentially ‘selected’ by the presence of the label either proximal or distal to the crosslink site. We will return to this conclusion in the section that follows.

[Gel image: lanes 1-15. Lanes 1-12 are marked T (top strand label) or B (bottom strand label) for duplexes 10a, 7a and 4a; lanes 13-15 are marked D (double label) for duplexes 10a, 7a and 4a.]

Figure 7.2.
Autoradiography of a 20% DPAGE gel following incubation with 100 µM mechlorethamine of 32P-labeled duplexes 10a, 7a and 4a (see Table 7.1 for sequence). Even numbered lanes from 2 to 12 show the results of mechlorethamine incubation with duplex 10a 32P-labeled on the top strand (lane 2), duplex 10a 32P-labeled on the bottom strand (lane 4), duplex 7a 32P-labeled on the top strand (lane 6), duplex 7a 32P-labeled on the bottom strand (lane 8), duplex 4a 32P-labeled on the top strand (lane 10) and duplex 4a 32P-labeled on the bottom strand (lane 12). Odd numbered lanes from 1 to 11 are controls (no mechlorethamine) for duplex 10a 32P-labeled on the top strand (lane 1) and on the bottom strand (lane 3), duplex 7a 32P-labeled on the top strand (lane 5) and on the bottom strand (lane 7) and duplex 4a 32P-labeled on the top strand (lane 9) and on the bottom strand (lane 11). Lanes 13-15 show the results of mechlorethamine incubation with 32P-double-labeled duplexes (labels on both strands) for duplex 10a (lane 13), duplex 7a (lane 14) and duplex 4a (lane 15). Bands are identified as X (crosslink), M (monoadduct), and S (unreacted single strands).

7.4.3 The influence of crosslink position on mobility is unchanged by inclusion of a 5'-fluorescein phosphoramidite label, but the influence of the label position on mobility within a duplex does change.

To determine if the variable electrophoretic mobility of the 32P-phosphate labeled crosslinked duplexes 10a, 7a and 4a was a function of the label, we repeated the experiments using the same duplexes with 5'-fluorescein phosphoramidite labels. The relative electrophoretic mobilities of the duplexes are unchanged with the new label (that is, the mobility increases from 10a to 7a to 4a, Figure 7.3).
However, in every other respect the results obtained for the 5'-fluorescein phosphoramidite labeled duplexes are the opposite of those obtained with the 5'-32P-phosphate labeled duplexes (contrast Figure 7.2 with Figure 7.3). Hence, the differential mobility of the same crosslinked duplex as a function of the location of the label (proximal or distal to the crosslink) is now most significant for duplex 10a, despite the symmetry of this duplex. The difference in mobility is also apparent for duplexes 7a and 4a, but becomes progressively smaller from 10a to 7a to 4a. This is the reverse of the order seen for the 5'-32P-phosphate labeled duplexes. Furthermore, the mobility of crosslinked duplexes 10a and 7a is greater when the duplex carries a 5'-fluorescein phosphoramidite label proximal to the crosslink (for 7a; for 10a the central location of the crosslink does not allow for this terminology), that is, when the label is on the top strand (contrast lanes 2 and 4 (10a), lanes 6 and 8 (7a), or lanes 10 and 12 (4a) in Figure 7.3). Again, this is in contrast to the 5'-32P-phosphate labeled duplexes, in which duplexes carrying a label proximal to the crosslink had a slower mobility, compared to that of the equivalent duplex carrying a label distal to the crosslink (contrast lanes 6 and 8 (7a), or lanes 10 and 12 (4a), in Figure 7.2).

[Gel image: lanes marked T (top strand label) or B (bottom strand label) for duplexes 10a, 7a and 4a.]

Figure 7.3. Autoradiography of a 20% DPAGE gel following incubation with 100 µM mechlorethamine of fluorescein-labeled duplexes 10a, 7a and 4a (see Table 7.1 for sequence).
Even numbered lanes from 2 to 12 show the results of mechlorethamine incubation with duplex 10a fluorescein-labeled on the top strand (lane 2), duplex 10a fluorescein-labeled on the bottom strand (lane 4), duplex 7a fluorescein-labeled on the top strand (lane 6), duplex 7a fluorescein-labeled on the bottom strand (lane 8), duplex 4a fluorescein-labeled on the top strand (lane 10) and duplex 4a fluorescein-labeled on the bottom strand (lane 12). Odd numbered lanes from 1 to 11 are controls (no mechlorethamine) for duplex 10a fluorescein-labeled on the top strand (lane 1) and on the bottom strand (lane 3), duplex 7a fluorescein-labeled on the top strand (lane 5) and on the bottom strand (lane 7) and duplex 4a fluorescein-labeled on the top strand (lane 9) and on the bottom strand (lane 11). Bands are identified as X (crosslink), M (monoadduct), and S (unreacted single strands).

7.4.4 Duplexes carrying both 5'-fluorescein phosphoramidite and 5'-32P-phosphate labels show an overall similar electrophoretic mobility.

In Figure 7.4 we show the DPAGE mobility of duplexes 10a, 7a and 4a carrying simultaneously both 5'-fluorescein phosphoramidite and 5'-32P-phosphate labels. In Figure 7.4A the bands are visualized using the 32P radioactivity, and in Figure 7.4B the same gel is visualized using the fluorescence of the fluorescein label. These duplexes also have increased electrophoretic mobility as the crosslink position moves closer to the duplex end (that is, from 10a to 7a to 4a). However, when the 32P single label duplexes are viewed side by side with the double label duplexes, some differences are evident (Figure 7.5). The difference in mobility for duplex 4a between top and bottom within the double label (Figure 7.5A, lanes 10 and 12) is reduced compared to the single 32P label (Figure 7.5B, lanes 10 and 12).
When the 5'-fluorescein phosphoramidite labeled duplexes are viewed with the double labeled duplexes, the differential mobility within each duplex in the double labeled set is not as pronounced as that observed for the singly labeled 5'-fluorescein phosphoramidite duplexes (Figure 7.6). The reduction in the difference between band migration within each duplex could be due to a slight variation in gel temperature or total migration from the well; however, the difference within each duplex is essentially the same overall. That is, the difference in mobility within a duplex is greatest for duplex 10a and least for 4a, as observed before with the 5'-fluorescein phosphoramidite label. Hence, the biggest change occurred for the 32P label, and the mobility with the 5'-fluorescein phosphoramidite label remained essentially unchanged. Consequently, when the duplexes have double labels, the effect of the 5'-fluorescein phosphoramidite label dominates. We note that the method of visualization itself can slightly influence the exact positions of the bands, which are slightly different in Figures 7.4A and 7.4B, despite the figures showing the identical gel.

[Gel images: panels A and B, lanes 1-12, marked T (top strand label) or B (bottom strand label) for duplexes 10a, 7a and 4a.]

Figure 7.4. Autoradiography of a 20% DPAGE gel following incubation with 100 µM mechlorethamine of duplexes 10a, 7a and 4a (see Table 7.1 for sequence) carrying simultaneous 32P and fluorescein labels. The gels in A and B are identical, but the bands in A are visualized via the 32P label, whilst those in B are visualized via the fluorescein label. A.
Even numbered lanes from 2 to 12 show the results of mechlorethamine incubation with duplex 10a 32P-labeled on the top strand (and fluorescein-labeled on the bottom strand) (lane 2), duplex 10a 32P-labeled on the bottom strand (and fluorescein-labeled on the top strand) (lane 4), duplex 7a 32P-labeled on the top strand (and fluorescein-labeled on the bottom strand) (lane 6), duplex 7a 32P-labeled on the bottom strand (and fluorescein-labeled on the top strand) (lane 8), duplex 4a 32P-labeled on the top strand (and fluorescein-labeled on the bottom strand) (lane 10) and duplex 4a 32P-labeled on the bottom strand (and fluorescein-labeled on the top strand) (lane 12). Odd numbered lanes from 1 to 11 are controls (no mechlorethamine) for the equivalent duplexes. B. Lanes 1 to 12 are as in A, but the gel is visualized via the fluorescein label, and the label location indicated (T or B) below the lanes refers to the strand that was fluorescein-labeled. Bands are identified as X (crosslink), M (monoadduct), and S (unreacted single strands). Note the reversal in the mobility of the single-stranded, unreacted DNA in B, compared to the equivalent lane in A, which occurs because the opposite strand is being detected in B, compared to A.

[Gel images: (A) Double Label and (B) Single P Label; lanes 1-12 in each panel, marked T or B for duplexes 10a, 7a and 4a.]

Figure 7.5. Side by side view of the double label and single 32P label. T indicates that the top strand was labeled with 32P; B indicates that the bottom strand was labeled with 32P. Double indicates that both labels are present.

[Gel images: (A) Double Label and (B) Single F Label; lanes marked T or B for duplexes 10a, 7a and 4a.]

Figure 7.6. Side by side view of the double label and single fluorescein-labeled duplexes.
T indicates that the top strand was labeled with fluorescein; B indicates that the bottom strand was labeled with fluorescein. Double indicates that both labels are present.

7.4.5 The relative mobility of two crosslinked duplexes with identical sequences flanking the C-C mismatch also depends on the position of the C-C mismatch.

We showed in the previous chapter that the electrophoretic mobility of mechlorethamine C-C crosslinked duplexes is also a function of the sequence flanking the C-C mismatch pair, and that a higher GC content in the flanking regions leads to a greater (faster) mobility. In duplexes 10a, 7a and 4a the GC content of the eight base pairs flanking the C-C mismatch (four pairs on each side, except for 4a) is 50% (4/8), 62.5% (5/8) and 83% (5/6), respectively. Hence, the different mobilities of these duplexes could be caused by this different GC content. To determine if this was the explanation for the different mobilities, experiments were performed on 32P-phosphate labeled duplexes 10b and 7b (Table 7.1). These duplexes have identical sequences flanking the C-C mismatch pair up to six base pairs from the mismatch in each direction. However, as shown in Figure 7.7, duplex 7b still has a greater electrophoretic mobility than duplex 10b, suggesting that the principal origin of this effect is the crosslink position.

[Gel image: bands marked X, M and S; lanes marked T (top strand label) or B (bottom strand label) for duplexes 10b and 7b.]

Figure 7.7. Autoradiography of a 20% DPAGE gel following incubation with 100 µM mechlorethamine of 32P-labeled duplexes 10b and 7b (see Table 7.1 for sequence).
Even numbered lanes show the results of mechlorethamine incubation with duplex 10b 32P-labeled on the top strand (lane 2), duplex 10b 32P-labeled on the bottom strand (lane 4), duplex 7b 32P-labeled on the top strand (lane 6), and duplex 7b 32P-labeled on the bottom strand (lane 8). Odd numbered lanes are controls (no mechlorethamine) for duplex 10b 32P-labeled on the top strand (lane 1) and on the bottom strand (lane 3) and duplex 7b 32P-labeled on the top strand (lane 5) and on the bottom strand (lane 7). Bands are identified as X (crosslink), M (monoadduct), and S (unreacted single strands).

7.4.6 For ‘symmetrical’ crosslinked duplexes carrying 5'-fluorescein phosphoramidite labels the different mobility caused by labeling either end of the duplex occurs as a function of the duplex sequence.

To understand further the different mobility for duplex 10a caused by having a 5'-fluorescein phosphoramidite label on the top or bottom strand, we performed similar experiments on duplex 10c. In duplex 10c the flanking sequences to the 'right' and 'left' of the C-C mismatch (Table 7.1) are the same as those to the 'left' and 'right' of the C-C mismatch in duplex 10a (Table 7.1). The electrophoretic mobility of duplex 10c carrying a 5'-fluorescein phosphoramidite label on either the top or bottom strand is shown in Figure 7.8. The results are opposite to those for duplex 10a (Figure 7.3, lanes 1-4).

[Gel image: duplex 10c.]

Figure 7.8. Autoradiography of a 20% DPAGE gel following incubation with 100 µM mechlorethamine of fluorescein-labeled duplex 10c (see Table 7.1 for sequence). Lanes 2 and 4 show the results of mechlorethamine incubation with duplex 10c fluorescein-labeled on the top strand (lane 2) and on the bottom strand (lane 4). Odd numbered lanes are controls. Bands are identified as X (crosslink), M (monoadduct), and S (unreacted single strands).
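The flanking GC percentages quoted in section 7.4.5 and the per-side G-C counts listed in Table 7.1 can be reproduced directly from the top-strand sequences. The following sketch is an illustration only, not part of the original analysis. The function names are hypothetical, the sequences are the top strands as read from Table 7.1, and the code assumes that every position other than the C-C mismatch is Watson-Crick paired, so that counting G and C on the top strand is equivalent to counting G-C pairs.

```python
# Illustrative sketch; names are hypothetical and not from the original work.
# Top strands (5'->3') and 1-based C-C mismatch positions, as read from Table 7.1.
DUPLEXES = {
    "10a": ("CCGGAGGAGCATTCATCTG", 10),
    "7a": ("CCGGAGCAGGATTCATCTG", 7),
    "4a": ("CCGCAGGAGGATTCATCTG", 4),
    "10b": ("CCTATACTCCGAGTATACC", 10),
    "7b": ("ATACTCCGAGTATACCCCT", 7),
    "10c": ("ATTCATCTGCCCGGAGGAG", 10),
}


def count_gc(bases: str) -> int:
    """Number of G or C bases (each stands for one G-C pair, by assumption)."""
    return sum(base in "GC" for base in bases)


def flanking_gc(top: str, mismatch_pos: int, window: int = 4) -> tuple:
    """(G-C pairs, pairs counted) within `window` pairs on each side of the mismatch."""
    i = mismatch_pos - 1  # convert to 0-based index
    flank = top[max(0, i - window):i] + top[i + 1:i + 1 + window]
    return count_gc(flank), len(flank)


def side_gc(top: str, mismatch_pos: int) -> tuple:
    """(G-C pairs on the 5' side, G-C pairs on the 3' side) of the mismatch."""
    i = mismatch_pos - 1
    return count_gc(top[:i]), count_gc(top[i + 1:])


if __name__ == "__main__":
    for name, (top, pos) in DUPLEXES.items():
        # Table 7.1 counts only three pairs on each side for duplex 4a.
        window = 3 if name == "4a" else 4
        gc, total = flanking_gc(top, pos, window)
        left, right = side_gc(top, pos)
        print(f"{name}: flanking GC {gc}/{total} = {100 * gc / total:.1f}%, "
              f"G-C pairs per side {left}/{right}")
```

The same per-side counts underlie the asymmetry of duplex 10a that is invoked in the Discussion (7 G-C pairs on one side of the crosslink and 3 on the other).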
7.5 Discussion

Our interpretation of the results in this chapter is based on two previous observations. These are as follows: (i) in two dimensional gels in which there is a denaturing gradient orthogonal to the direction of movement of the DNA duplex, those duplexes that are easily denatured run with greater mobility (Fischer and Lerman, 1979, 1983; Myers et al., 1987); (ii) under identical nondenaturing conditions, duplexes with a higher GC content are stable and run with greater mobility compared to duplexes with unstable, unpaired or ‘bubbled’ mismatched sequences (Fischer and Lerman, 1979, 1983; Myers et al., 1987). Although these observations were made for large DNA molecules (200 base pairs), they can be used to rationalize the differential mobility of the mechlorethamine C-C crosslinked 19mer duplexes. In Figure 7.9 we show schematic representations of the DNA conformations we anticipate will occur in the gel for 32P-labeled duplexes, and much of the following discussion refers to this figure.

[Schematic: columns for duplexes 10, 7 and 4; rows (a), (b) and (c); structures marked PT (top strand 32P label) or PB (bottom strand 32P label).]

Figure 7.9. Schematic representations of (a) 19 base pair DNA duplexes having a mechlorethamine C-C crosslink at base pair 10, base pair 7 and base pair 4 with a 32P label, (b) final possible ‘pseudo-single strand’ conformations of the same duplexes that could occur in a denaturing polyacrylamide gel, and (c) the possible way in which the initial denaturation of each duplex might occur as it enters the gel. For each duplex (10, 7 and 4) two structures are shown carrying 32P-phosphate labels (P) on either the top (PT) or bottom (PB) strands.
In this representation it is assumed that the labeled end of the duplex enters the gel first, and the lower structures show the initial denaturation for each orientation of each duplex a short time after entry into the gel. The initial structure as the duplex enters the gel for top versus bottom determines which band within the same duplex will travel faster. The creation of a longer single strand on the leading side, with a ‘tight’ end, enhances movement through the pores. Notice that this creates similar conformations for the two orientations of duplex 10a, but makes a bigger difference in 4a.

The consistent result obtained for all the duplex sequences examined in this chapter, regardless of the label position, the label type, or the number of labels on the duplex, was that duplexes with a crosslink proximal to a duplex end have greater electrophoretic mobility than those with a more centrally positioned crosslink. However, in chapter 6 we showed that, for a centrally crosslinked duplex, the mobility varied greatly as a function of the local sequence around the crosslink. Nonetheless, the different mobilities of duplexes 10b and 7b (Figure 7.7), which have identical sequences six base pairs to each side of the crosslink, suggest that the differential mobility observed is also due to the crosslink position. We believe the reason for this is that, on denaturation, duplexes with a crosslink near a duplex end are able, by rotation around the crosslink site, to adopt conformations that are somewhat ‘single-stranded’ (Figure 7.9(b)). Such conformers would be expected to have a higher electrophoretic mobility than those formed from centrally crosslinked duplexes because they would be able to wind their way through the pores of the polyacrylamide maze more easily.
A more compact and tight single stranded structure (or a tight double stranded duplex, which will be discussed later) would work its way through the pores faster (Figure 7.9(b)).

More difficult to explain is the differential mobility observed for identical crosslinked duplexes that differ only in carrying a label proximal or distal to the crosslink site, that is, for duplexes labeled on the top or bottom strand. Top and bottom labeled crosslinked duplexes run differently, as was also observed in the previous chapter when the top and bottom strands were labeled for the multiple repeats (Chapter 6, Figure 6.3B). For the 32P-phosphate labeled duplexes 7a and 4a (Figure 7.2) and for duplex 7b (Figure 7.7), the duplex carrying a label on the bottom strand (that is, distal to the crosslink) has a greater electrophoretic mobility than the equivalent duplex with a label on the top strand. We hypothesize that this difference may be due to the initial rate of denaturation of each duplex in the gel, as represented in Figure 7.9(c). Since the label position is the only difference between a given pair of duplexes, we suggest that the label can have an effect on how the DNA duplex enters the gel (the duplex is not initially denatured, since, to preserve the integrity of the crosslink, none of the duplexes is heated before loading onto the gel). We assume in Figure 7.9(c) that the additional phosphate group from the 32P label ‘leads’ the duplex into the gel. Hence, for a crosslink proximal to the label, initial denaturation may be hindered by the crosslink, while for the duplex having a crosslink distal to the label denaturation can occur for a greater length of the duplex without hindrance.
This, in turn, would influence the rate at which a fully denatured conformation could be attained, and hence account for the difference in the overall electrophoretic mobility caused by the label position. Such an effect would be more pronounced for a duplex with a crosslink close to the end of the duplex (for example, duplex 4a) and would not produce a different mobility for top and bottom strand labeled duplexes containing a central crosslink (for example, duplex 10a or duplex 10b). We note that, in the 32P-phosphate double-labeled duplexes, we observe multiple bands that correspond to the band mobilities for the single-labeled duplexes. The appearance of these multiple bands is consistent with the above hypothesis.

For the fluorescein-labeled duplexes, the observation that the duplex of greater mobility is now that having the label proximal to the crosslink is not straightforward to explain. This is particularly so because the largest effect is seen for the duplex having a central crosslink (duplex 10a), and hence a simple reversal of the hypothesis described above for 32P-phosphate labeled duplexes (that is, the fluorescein label preferentially enters the gel ‘following’ the duplex) is not entirely satisfactory. Therefore, we hypothesize that, although the mode of entry is opposite to that of the 32P-phosphate label, there is an additional effect of the bulky fluorescein label. This effect will be described in the next section. We also believe that the results for the 32P-phosphate, fluorescein double-labeled duplexes support this conclusion. For these duplexes, we observed that the pattern of mobility was most similar to that of the single fluorescein label, and we conclude that the effect of the fluorescein label is dominant, but the 32P label still leads the entry into the gel.
For duplex 10a carrying a single fluorescein label, the only difference between the duplexes labeled on the top and bottom strands is the actual duplex sequence, which contains 7 out of 9 G-C pairs to the ‘left’ of the C-C crosslink, and only 3 out of 9 G-C pairs to the ‘right’ (Table 7.1). Hence, it is possible that if the fluorescein label ‘followed’ the duplex into the gel, there could be a different equilibrium position between the single strand and duplex conformations for each half of the duplex. However, this effect apparently does not occur for 32P-phosphate labeled duplex 10a. One possible difference that could account for this is that, for the fluorescein-labeled duplex 10a, the bulky label itself might destabilize the less G-C rich side and increase the rate of denaturation on the less stable side. This would happen when the fluorescein label is on the bottom strand for duplex 10a and on the top strand for duplex 10c, therefore reversing the results. For duplexes 7a and 4a a similar phenomenon might occur, again leading to the enhanced mobility of the top strand labeled duplex, but with the difference in mobilities being reduced by the occurrence of an opposing effect similar to that described for the 32P-phosphate labeled duplexes. In Figure 7.10 we show a schematic representation of the DNA conformations we anticipate will occur in the gel for a fluorescein label.
Figure 7.10. Schematic representations of (a) 19 base pair DNA duplexes having a mechlorethamine C-C crosslink at base pair 10, base pair 7 and base pair 4, with a fluorescein label. The ‘bubbled’ end represents the destabilized side with the bulky label. Note that this destabilization can occur for both ends but is exaggerated in the end of the duplex with less GC content.
FT indicates a label on the top strand; FB, a label on the bottom strand. (b) Final possible ‘pseudo-single strand’ conformations of the same duplexes that could occur in a denaturing polyacrylamide gel, and (c) the possible way in which the initial denaturation of each duplex might occur as it enters the gel. For each duplex (10, 7 and 4) two structures are shown carrying fluorescein labels on either the top (FT) or bottom (FB) strands. In this representation it is assumed that the labeled end of the duplex enters the gel last, and the lower structures show the initial denaturation for each orientation of each duplex a short time after entry into the gel. The initial structure as the duplex enters the gel for top versus bottom determines which band within the same duplex will travel faster. The creation of a longer single strand in the leading side, with a ‘tight’ end, enhances movement through the pores. If a ‘wide’ or open end is present (as when the label causes destabilization) this will slow the mobility down. Notice that this creates a bigger difference in conformation for duplex 10a than in duplex 4a.
In Figure 7.11 we show the schematic of DNA labeled on both ends and show that essentially the observed effect is due to the bulky fluorescein label.
Figure 7.11. Schematic representations of (a) 19 base pair DNA duplexes having a mechlorethamine C-C crosslink at base pair 10, base pair 7 and base pair 4, with a fluorescein label and a 32P label. The ‘bubbled’ end represents the destabilized side with the bulky label. Note that this destabilization can occur for both ends but is exaggerated in the end of the duplex with less GC content.
FT indicates a fluorescein label on the top strand; PT, a 32P label on the top strand; FB, a fluorescein label on the bottom strand; PB, a 32P label on the bottom strand. (b) Final possible ‘pseudo-single strand’ conformations of the same duplexes that could occur in a denaturing polyacrylamide gel, and (c) the possible way in which the initial denaturation of each duplex might occur as it enters the gel. For each duplex (10, 7 and 4) two structures are shown carrying labels on either the top or bottom strands. In this representation it is assumed that the 32P labeled end of the duplex enters the gel first, and the lower structures show the initial denaturation for each orientation of each duplex a short time after entry into the gel. The initial structure as the duplex enters the gel for top versus bottom determines which band within the same duplex will travel faster. The creation of a longer single strand in the leading side, with a ‘tight’ end, enhances movement through the pores. If a ‘wide’ or open end is present (as when the label causes destabilization) this will slow the mobility down. Notice that this creates a bigger difference in conformation for duplex 10a than in duplex 4a.
7.6 Summary
The overall mobility of the mechlorethamine C-C crosslink depends on the position of the crosslink and on how tight the conformation is as it travels through the pores of the gel. This tightness could be single ‘strandedness,’ as observed for the duplexes in this chapter, or a tight and stable duplex (as observed in chapter 6). Recall that in chapter 6, the duplex with the central crosslink had the fastest mobility. This is in complete contrast to the duplexes in this chapter. However, we know that a central crosslink can have great variation in its mobility depending on the proximity of a G-C pair.
We propose that when a C-C crosslink is surrounded by increasing G-C content it tends to be more stable and to preserve its duplex structure as it travels in the gel. For a psoralen crosslink positioned in the center of the helix, it has been hypothesized that the crosslinked duplex does not denature when subjected to denaturing conditions (Kandallu et al., 1992). However, if the duplex is destabilized by adding mismatches or increasing the A-T content, the duplex mobility will decrease in accordance with duplex stability. This is consistent with our results and suggests that a ‘bubble’ or local region of instability will decrease mobility.
7.7 Relevance to Fragile X
With the knowledge that the reaction can occur in multiple repeats and hairpins, and that it is possible to understand the mobility of the product on a nondenaturing gel, we begin to imagine how this information can be useful. It is unknown whether slipped structures actually form or what they really look like in the expansion mechanism. If a slipped structure could be engineered by crosslinking a hairpin, then annealing the hairpin in a particular region to a duplex, the migration of this species could be monitored and quantified. The reaction could then be used to probe cells for these structures.
Chapter 8
Theoretical Computer Modeling of the C-C Crosslink
Overview and Introduction
This chapter models the structure of the crosslinked duplex based on the research in chapter 4. In that chapter we proposed that the crosslink forms in the minor groove of a DNA helix. Two experiments were performed to determine this. The first involved reaction with dimethyl sulfate (DMS) (Figure 8.1(a)). This chemical probe can react with the N7 atom of guanine in the major groove.
After the crosslink was isolated and purified, it was probed with DMS to determine whether the N7-guanines were still reactive. The reasoning was that if the crosslink occurred in the major groove, then DMS would be sterically hindered and would be unable to react. The second experiment performed to determine the reactive groove involved the use of the ligand Hoechst 33258 (Figure 8.1(b)), which binds noncovalently to the minor groove of DNA. It was shown to inhibit the formation of the crosslink.
Figure 8.1. (a) Dimethyl sulfate, (b) Hoechst 33258.
We also proposed that the covalently linked atoms were the N3 atoms of the C-C pair, based on the pH dependence of the reaction. We note that protonation of cytosine N3 could influence alkylation at other atoms, either directly or through a protonation-induced DNA conformational change, and that the pH dependence does not, therefore, prove that the crosslink forms through N3, although it is suggestive of this. We also note that N3-alkylcytosines are heat labile (Liang et al., 1994), and this is consistent with the heat-induced cleavage of the C-C crosslink. Based on the experimental information, we have modeled the mechlorethamine C-C crosslink. This model is conceptual, and requires more definite experimental evidence such as NMR or an X-ray crystal structure. Nonetheless, molecular modeling can provide valuable information about atomic spatial requirements.
8.1 Methods
Molecular Dynamics Simulations: Simulations were performed using the AMBER 4.0 force field (Pearlman et al., 1991; Weiner et al., 1986) on the 13-mer helices d(TCACAAC7TTGGTT)·d(AACCAAC20TTGTGA), which will be referred to as duplex ACT, and d(TCACAGC7CTGGTT)·d(AACCAGC20CTGTGA), duplex GCC, where C7 and C20 indicate the mismatched cytosine bases.
Standard AMBER 4.0 parameters and charges were applied to the DNA, except for the cytosines that were alkylated by mechlorethamine. The parameters of Chandrasekhar et al. (1985) were used to describe the chlorine atom. The force field parameters and charges for the alkylated cytosines and mechlorethamine were derived using AM1 calculations (Weiner et al., 1986) in the MOPAC 6.0 package. The molecular electrostatic potential (MEP) of mechlorethamine in the free form, the monoadduct (mechlorethamine covalently attached to one cytosine base), and the diadduct (mechlorethamine covalently attached to two cytosines) was derived from the AM1 wave function (Ferenczy et al., 1990) using the program RATTLER (Oxford Molecular). The atomic point charge distributions were evaluated from the MEP (Singh et al., 1984) using the same program. The atomic point charges are shown in Table 8.1. The crosslinked DNA was formed by manually docking the free form of mechlorethamine (Figure 8.2(a)) between C7 and C20 on the DNA. After several simulations and minimization of the ligand for 2000 steps, with the DNA constrained, the whole system was subjected to a subsequent minimization of 4000 steps. The system with the most favorable energy minimum was used to position mechlorethamine in the starting structure of the monoadduct. The procedure was repeated until a favorable starting energy for the diadduct structure was achieved. Canonical B-DNA helices were constructed using the QUANTA 4.0 package (Molecular Simulations). For all simulations, canonical B-DNA helices with the mechlorethamine crosslink were first relaxed in a 4000-step conjugate gradient energy minimization, prior to solvation in a box of TIP3P water molecules (Jorgensen et al., 1983). The water box had dimensions 59 Å × 40 Å × 40 Å and a minimum depth of 8 Å from the solute to the edge of the box.
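The reported box dimensions are consistent with padding the extent of the solute by the 8 Å minimum depth on every side. The following minimal Python sketch illustrates that arithmetic. The function name is hypothetical, and the solute extent of about 43 Å × 24 Å × 24 Å is an assumed, illustrative size for a 13 base pair B-DNA helix, chosen here so that the result matches the reported box; it is not a value taken from this work.

```python
def water_box_dimensions(coords, padding=8.0):
    """Return (x, y, z) box edge lengths in angstroms.

    Each edge is the extent of the solute along that axis plus
    `padding` angstroms of solvent on both sides.
    """
    dims = []
    for axis in range(3):
        values = [atom[axis] for atom in coords]
        dims.append(max(values) - min(values) + 2.0 * padding)
    return tuple(dims)


# Hypothetical solute: only the two extreme corners of its bounding box
# are needed to illustrate the calculation (assumed 43 x 24 x 24 angstroms).
solute_corners = [(0.0, 0.0, 0.0), (43.0, 24.0, 24.0)]

box = water_box_dimensions(solute_corners, padding=8.0)
print(box)  # (59.0, 40.0, 40.0)
```

With real coordinates, the same function would be applied to every solute atom rather than to two corner points.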
Sodium counterions were then added by evaluating the electrostatic interaction energy of the DNA with a +1 point charge located at the coordinates of the oxygen of each water molecule, and replacing the water molecule at the point of most negative electrostatic potential with a counterion (van Gunsteren et al., 1986). This process was then repeated (with inclusion of the interactions of previously placed sodium ions) until the required number of counterions (fourteen for the diadduct) had been added to achieve two thirds electrical neutrality. Following brief minimizations of the water and counterions, the entire system was heated for 2 ps from 0 to 25 K, and then pre-equilibration of the solvent was performed for 40 ps. A 200 ps simulation of the entire system was then performed in the NPT ensemble, using a time step of 0.002 ps and a nonbonded cutoff of 8 Å, at a temperature of 298 K. The temperature was increased linearly from 0 K to 298 K in the first 10 ps of the simulation. The coordinates of structures generated in the trajectory were saved every 0.4 ps.
Table 8.1. Atomic point charges (milli-electrons) obtained from the AM1-derived electrostatic potential for free mechlorethamine in the bis(chloroethyl) form, for the mechlorethamine (N3)cytosine monoadduct and for the cytosine(N3)-mechlorethamine-(N3)cytosine diadduct.
atom a        free drug       monoadduct       diadduct
CM (HM)       -206 (121)      -281 (136)       -170 (109)
N             -565            -553             -561
C1a (H1a)     116 (43)        383 (65)         44 (131)
C1b (H1b)     162 (20)        -269 (131) c     -32 (104) c
C2a (H2a)     116 (43)        -5 (53)          44 (131)
C2b (H2b)     162 (20)        315 (-20)        -32 (104) c
N3 (Cyt)      -860 b          -367             -383
C4 (Cyt)      935 b           728              677
N4 (Cyt)      -834 b          -551             -541
C5 (Cyt)      -576 b          -731             -657
C6 (Cyt)      185 b           516              435
N1 (Cyt)      -187 b          -504             -391
C2 (Cyt)      -859 b          846              773
O2 (Cyt)      [illegible]     -548             -523
a See figure 1 for atom nomenclature. b Standard AMBER 4.0 charges.
c Methylene group bonded to cytosine N3.
Figure 8.2. Mechlorethamine and cytosine numbering system. (a) Mechlorethamine. (b) Mechlorethamine monoadduct. (c) Mechlorethamine C-C crosslink diadduct.
8.2 Results
8.2.1 The mechlorethamine/DNA complex
To determine a reasonable mechlorethamine starting location for the diadduct complex simulation, we performed a series of preliminary gas phase energy minimizations using conditions similar to those described elsewhere (Yuki and Haworth, 1993). Mechlorethamine was manually positioned such that the aziridinyl ring (atoms -N-C1a-C1b, see Figure 8.2) and the free chloroethyl group (-N-C2a-C2b-Cl) were located close to cytosines 7 and 20, respectively. Various starting positions were sampled and the best energy structure was used to form the monoadduct with C7. After minimization of the monoadduct, the diadduct was formed between C7 and C20.
8.2.2 The N3(Cyt)-mechlorethamine-N3(Cyt) crosslinked diadduct
The experimental evidence suggested that the N3 atoms were crosslinked. To examine the stability of such a conformation, we conducted a simulation on a crosslinked diadduct, using the lowest energy conformation generated in the previous simulation as a starting point. The crosslinked diadduct formed a bond between atom C1b of mechlorethamine and the N3 atom of cytosine 7, in addition to that formed between atom C2b of the drug and N3 of cytosine 20. Hence the overall system had a six-bond N3(C7)-C1b-C1a-N-C2a-C2b-N3(C20) bridge between the two DNA strands. At the start of the simulation mechlorethamine fits easily between the cytosine bases (C7-C20). It is stacked between the bases, causing no deformation of the B-DNA structure (Figure 8.3).
There is no increase in the helical twist angle or the rise between the bases, nor is this required to form the crosslink. The base pairs are within the same plane, hydrogen bonded, and appear normal.
Figure 8.3. Representative starting structure (0 ps) for the diadduct crosslink simulation. Shown are the top view and side view, respectively. The cytosines and mechlorethamine have been highlighted for clarity.
Following 100 picoseconds of the simulation for both duplexes (Figure 8.4), mechlorethamine has moved slightly out of the groove. The cytosines have shifted and are no longer in the same plane. For duplex GCC, the cytosines are almost stacked upon each other, and mechlorethamine has aligned itself with the groove. This translational and rotational motion of the cytosines is probably necessary to accommodate mechlorethamine in the DNA minor groove.
Figure 8.4. View of the motion of the cytosines and mechlorethamine after 100 ps.
At 200 picoseconds the difference in structure is more evident. For duplex GCC, C7 is positioned over C20, while duplex ACT has the opposite configuration, that is, C20 is positioned over C7 (Figure 8.5). A side view of the same structure at 200 picoseconds illustrates how the cytosines have moved out of the helix (Figure 8.6).
Figure 8.5. View into the minor groove at 200 ps for both duplexes. Also shown are the cytosine numbers to help with orientation. Note that duplex GCC has a tighter conformation and that C7 is stacked over C20. For duplex ACT the opposite has happened: C20 is essentially stacked over C7, but because of the lower thermodynamic stability inherent in a crosslink surrounded by A-T pairs, the conformation is not as tight, causing the helix to open up slightly.
Figure 8.6. Comparison of the 0 ps and 200 ps structures for both duplexes. A side view of the starting structure is shown in the center and the 200 ps structure for duplex ACT is shown on the left. On the right is shown the side view for duplex GCC at 200 ps.
For both duplexes, the helix is fairly stable and regular except for the expelled cytosines and mechlorethamine. This causes a slight bend and unwinding in the center of duplex GCC, because the G-C pairs are still hydrogen bonded. For duplex ACT, there is slightly more opening and stretching at the crosslink site, because some of the A-T base pairs have lost their hydrogen bonds. Despite this, the rest of the helix is intact.
8.2.3 DNA motion in response to mechlorethamine binding
A CURVES analysis of the DNA/mechlorethamine simulations was performed. For simplicity, only the parameters (Figure 8.7) for the starting and ending structures are shown. The key aspects of the conformational motion are shown in Figures 8.8 (duplex ACT) and 8.9 (duplex GCC), which summarize the data in the 'Dials and Windows' format. A detailed discussion of this representation of DNA conformational data obtained from a molecular dynamics simulation can be found in Ravishanker et al. (1989). The range of each base pair parameter is listed at the top. Each row of boxes represents a base pair with a measure of a particular parameter. The relationship is between base pairs. What can be understood from these charts, when the lines for the starting structure (0 ps, on the left) are compared with the end (200 ps, on the right), is that most of the change is occurring within the central bases.
Figure 8.7. Inter-base pair conformational parameters (shift, slide, rise, tilt, roll, twist). Each rectangle represents a base pair.
[Figure 8.8 plot: shift, slide, rise, tilt, roll and twist for base pairs T1-A26, C2-G25, A3-T24, C4-G23, A5-T22, A6-T21, C7-C20, T8-A19, T9-A18, G10-C17, G11-C16, T12-A15 and T13-A14.]
Figure 8.8. Base pair parameters for duplex ACT. At the top are the various parameters that are measured. A box within a column identifies one parameter between the base pairs indicated on the side. The left set of parameters is derived from the 0 ps structure. The parameters on the right are derived from the 200 ps structure. Note that the ranges are different between sets.
[Figure 8.9 plot: shift, slide, rise, tilt, roll and twist for base pairs T1-A26, C2-G25, A3-T24, C4-G23, A5-T22, G6-C21, C7-C20, C8-G19, T9-A18, G10-C17, G11-C16, T12-A15 and T13-A14.]
Figure 8.9. Base pair parameters for duplex GCC. At the top are the various parameters that are measured. A box within a column identifies one parameter between the base pairs indicated on the side. The left set of parameters is derived from the 0 ps structure. The parameters on the right are derived from the 200 ps structure. Note that the ranges are different between sets.
8.3 Discussion
Overall, we hypothesize that to effect a crosslink, the drug must encounter the N3 atom of cytosine. This encounter could be just outside of the helix, since the space between the mismatched cytosines (their van der Waals contacts) would appear to be severely compromised within the helix (note in the top view of Figure 8.3 how close the NH2 hydrogens are in the major groove). Even though initial modeling suggests that mechlorethamine could sit between the base pair steps, this conformation was not retained. Therefore, based on the modeling, which is speculative, a mobile cytosine could initiate the reaction by swinging its N3 atom in and out of the solvent, such that the N3 atom could become alkylated by the first arm of mechlorethamine and then move slightly back into the helix. Once the first reaction had occurred, the second cytosine (which is tucked within the helix) could ‘swing’ near the first (which it readily does) to react with the second arm of mechlorethamine. This mechanism suggests that the initial orientation does not influence the reaction, and that it does not matter which cytosine is alkylated first, as long as the two are near each other.
This could be why the reaction is more efficient than a 1,3 G-G crosslink, a reaction that is thought to impose strict steric requirements on the initial reaction with the first guanine in a d(GNC)·d(GNC) sequence (Remias et al., 1995). We have also postulated that for a much more potent guanine-guanine major groove crosslinking agent, diaziridinylquinone (DZQ; Lee et al., 1992; Berardini et al., 1993), the increase in reaction over mechlorethamine is not due to the initial orientation of the drug (Haworth et al., 1993). The first reaction (monoadduct formation) is unimportant in determining the potential for 1,2 crosslinking. The modeling also provides clues that help to explain the variable efficiency observed in the crosslinking reaction. As shown by the modeling, distance is critical in the reaction. Although we have not absolutely proven that the crosslink occurs between the N3 atoms of cytosine in the minor groove, we have provided convincing evidence that it occurs between two mismatched cytosines. The distance constraints, together with the fact that the reaction occurs at the mismatched cytosines, suggest that cytosines spaced farther apart would react less, that is, form fewer crosslinks. And, as we have shown for both a free helix (chapter 5) and a crosslinked helix, A-T base pairs surrounding mismatched cytosines make the helix locally thermodynamically unstable. It is interesting to note that the modeling suggests that the conformations of the two crosslinked duplexes can be different, perhaps a clue regarding the differential mobility observed on a nondenaturing gel with variable GC content. The conformations observed in the simulation are the consequence of thermodynamic stability within neighboring base pairs and reflect which bases could be denatured in solution or in a nondenaturing gel.
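The role of distance can be illustrated with a rough geometric estimate of how far apart the two N3 atoms can be while still being joined by the six-bond N3-C1b-C1a-N-C2a-C2b-N3 bridge described in section 8.2.2. The Python sketch below assumes generic single-bond lengths (about 1.47 Å for C-N and 1.53 Å for C-C) and an ideal tetrahedral bond angle of 109.5°. These are textbook approximations, not parameters from the force field used in this work, and the function name is hypothetical.

```python
import math

# Assumed textbook single-bond lengths in angstroms (not force-field values).
C_N = 1.47
C_C = 1.53

# The six bonds of the N3-C1b-C1a-N-C2a-C2b-N3 bridge, in order.
BRIDGE_BONDS = [C_N, C_C, C_N, C_N, C_C, C_N]


def max_bridge_span(bonds, angle_deg=109.5):
    """Approximate end-to-end distance of a fully extended chain.

    In a planar, all-trans zigzag every bond contributes its projection
    onto the chain axis, which is length * sin(bond_angle / 2).
    """
    half_angle = math.radians(angle_deg) / 2.0
    return sum(bonds) * math.sin(half_angle)


contour = sum(BRIDGE_BONDS)
span = max_bridge_span(BRIDGE_BONDS)
print(f"contour length: {contour:.2f} A")
print(f"extended span:  {span:.2f} A")
```

Under these assumptions the bridge spans a little over 7 Å when fully extended, so two N3 atoms held farther apart than that could not be linked without distorting the helix. This is in line with the argument above that cytosines spaced farther apart would form fewer crosslinks.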
8.4 Summary and Conclusions
In chapter 2 we described a new and unusual structure for the C-rich strand of Fragile X. This structure prefers to form a (b) alignment with mismatched cytosines. This work showed that mismatched cytosines have the potential to become extrahelical. Chapter 3 continued to model the same alignment, as a single unit within a random sequence and as multiple units. The conclusion was that cytosines do have the potential to become extrahelical, but that their ability to do so depends on the thermodynamic stability of the neighboring sequences. In chapter 4 chemical probing was used to understand the structure better. Surprisingly, during the course of this work, a new and uncharacterized reaction was discovered for the well-known antitumor drug mechlorethamine. This reaction involved unpaired cytosines. Chapter 5 attempted to characterize this new reaction by studying the kinetics and sequence dependence of the C-C mismatch reaction. It was determined that the reaction varied and was also dependent upon the local stability of the neighboring base pairs. We suggested that this might indicate a different conformation for the cytosines. Next, in chapter 6 we tried the new reaction on single and multiple repeating runs of mismatched cytosines, and even on a hairpin with one repeat. The conclusion from this chapter was that the reaction works when multiple trinucleotide repeats of the Fragile X sequence are present, but also varies with sequence or number of repeats. The interesting result from this chapter was the variable mobility of the crosslinked duplexes in a nondenaturing gel. In chapter 7 we examined this result in more detail and hypothesized that the mobility of the crosslink was also dependent on local duplex stability and the position of the crosslink.
Since the mobility of a C-C crosslink can now be predicted, slipped structures could be preformed, and the mobility determined by electrophoresis. Once the in vitro structures have been determined, the verification of slipped structures could then be explored. Finally, in chapter 8 we modeled the crosslink to understand the spatial relationships necessary to form a C-C crosslink. What this work has confirmed is the flexibility of mismatched cytosines within the Fragile X sequence. The reactivity of multiple mismatched cytosines in multiple repeats implied that these cytosines are more mobile. It is this mobility, which is dependent upon sequence length, that may attract proteins such as methyltransferases and nucleases, and/or contribute to incomplete packaging, leading to fragile sites. Taken together, this work presents new information about the chemistry and conformation of mismatched cytosines that is relevant to Fragile X, fragile sites, and mismatch repair. It also presents new insight into electrophoretic mobility and lays the foundation for an assay to measure slipped structures.
References
Abu-Daya, A., Brown, P. M., and Fox, K. R. (1995) DNA sequence preferences of several AT-selective minor groove binding ligands. Nucleic Acids Res. 23, 3385-3392.
Antao, V. P., and Gray, D. M. (1993) CD spectral comparisons of the acid-induced structures of poly[d(A)], poly[r(A)], poly[d(C)], and poly[r(C)]. J. Biomol. Struct. Dyn. 10, 819-839.
Arnott, S., Campbell-Smith, P., and Chandreskharan, P. (1976) in the CRC Handbook of Biochemistry, Vol. 2, pp 411-422, CRC Press, Inc., Boca Raton, FL.
Ashley, C. T., and Warren, S. T. (1995) Trinucleotide repeat expansion and human disease. Annu. Rev. Genet. 29, 703-728.
Ashley, C.T., Wilkinson, K.D., Reines, D., and Warren, S.T. (1993) FMR1 protein: conserved RNP family domains and selective RNA binding. Science 262, 563-566.
Ashley, C.T., Sutcliffe, J.S., Kunst, C.B., Leiner, H.A., Eichler, E.E., Nelson, D.L., and Warren, S.T. (1993a) Human and murine FMR-1: alternative splicing and translational initiation downstream of the CGG-repeat. Nature Genetics 4, 244-251.
Ashley, C.T., Wilkinson, K.D., Reines, D., and Warren, S.T. (1993b) FMR1 protein: Conserved RNP family domains and selective RNA binding. Science 262, 563-566.
Baker, D. J., Kan, J. L. C., and Smith, S. S. (1991) Recognition of structural perturbations in DNA by human DNA(cytosine-5)methyltransferase. Gene 74, 207-210.
Bank, B. B. (1992) Studies of Chlorambucil-DNA Adducts. Biochem. Pharmacol. 44, 571-575.
Bauer, G. B., and Povirk, L. F. (1997) Specificity and kinetics of interstrand and intrastrand bifunctional alkylation by nitrogen mustards at a G-GC sequence. Nucleic Acids Res. 25, 1211-1218.
Bat, O., Kimmel, M., and Axelrod, D. E. (1997) Computer simulation of expansions of DNA triplet repeats in the fragile X syndrome and Huntington's disease. Journal of Theoretical Biology 188, 53-67.
Becker, M., Lerum, V., Dickson, S., Nelson, N.C., and Matsuda, E. (1999) The Double Helix Is Dehydrated: Evidence from the Hydrolysis of Acridinium Ester-Labeled Probes. Biochemistry 38, 5603-5611.
Berardini, M.D., Lee, C.-S., Gibson, N.W., and Hartley, J.A. (1993) Two Structurally Related Diaziridinylbenzoquinones Preferentially Cross-link DNA at Different Sites Upon Reduction with DT-Diaphorase. Biochemistry 32, 330-3312.
Borges, K.M., Brummet, S.R., Bogert, A., Davis, M.C., Hujer, K.M., Domke, S.T., Szasz, J., Ravel, J., DiRuggiero, J., Fuller, C., Chase, J.W., and Robb, F.T. (1996) A survey of the genome of the hyperthermophilic archaeon, Pyrococcus furiosus. Genome Sci. Technol. 1, 37-46.
Brook, J. D., McCurrach, M. E., Harley, H. G., Buckler, A. J., Church, D., Aburatani, H., Hunter, K., Stanton, V. P., Thirion, J.-P., Hudson, T., Sohn, R., Zemelman, B., Snell, R. G., Rundle, S. A., Crow, S., Davies, J., Shelbourne, P., Buxton, J., Jones, C., Juvonen, V., Johnson, K., Harper, P. S., Shaw, D. J., and Housman, D. E. (1992) Molecular basis of myotonic dystrophy: expansion of a trinucleotide (CTG) repeat at the 3' end of a transcript encoding a protein kinase family member. Cell 78, 799-808.
Brookes, P., and Lawley, P. D. (1961) The Alkylation of Guanosine and Guanylic Acid. J. Chem. Soc., 3923-3928.
Brookes, P., and Lawley, P. D. (1962) Methylation of Cytosine and Cytidine. J. Chem. Soc., 1348-1351.
Boulard, Y., Cognet, J. A. H., and Fazakerley, G. V. (1997) Solution Structure as a Function of pH of Two Central Mismatches, C·T and C·C, in the 29 to 39 K-ras Gene Sequence, by Nuclear Magnetic Resonance and Molecular Dynamics. J. Mol. Biol. 268, 331-347.
Brown, T., Leonard, G. A., Booth, E. D., and Kneale, G. (1990) Influence of pH on the Conformation and Stability of Mismatch Base-pairs in DNA. J. Mol. Biol. 212, 437-440.
Chandrasekhar, J., Smith, S.F., and Jorgensen, W.L. (1985) Theoretical Examination of the SN2 Reaction Involving Chloride Ion and Methyl Chloride in the Gas Phase and Aqueous Solution. J. Am. Chem. Soc. 107, 154-163.
Chen, F.-M. (1998) Binding of Actinomycin D to DNA Oligomers of CXG Trinucleotide. Biochemistry 37, 3955-3964.
Chen, X., Mariappan, S. V. S., Catasti, P., Ratliff, R., Moyzis, R. K., Laayoun, A., Smith, S. S., Bradbury, E. M., and Gupta, G. (1995) Hairpins are formed by the single DNA strands of the fragile X triplets: Structure and biological implications. Proc. Natl. Acad. Sci. U.S.A. 92, 5199-5203.
Chen, X., Mariappan, S.V., Moyzis, R.K., Bradbury, E.M., and Gupta, G. (1998) Hairpin induced slippage and hyper-methylation of the fragile X DNA triplets. Journal of Biomolecular Structure & Dynamics 15, 745-756.
Choi, Y.-C., and Chae, C.-B. (1993) Demethylation of somatic and testis-specific histone H2A and H2B genes in F9 embryonal carcinoma cells. Mol. Cell. Bio. 13, 5538-5548.
Chong, S. S., Eichler, E. E., Nelson, D. L., and Hughes, M. R. (1994) Robust amplification and ethidium-visible detection of the fragile X syndrome CGG repeat using Pfu polymerase. Am. J. Med. Genet. 51, 522-526.
Cohen, B., Van Artsdalen, E.R., and Harris, J. (1948) Reaction Kinetics of Aliphatic Tertiary β-Chloroethylamines in Dilute Aqueous Solution I. Cyclization Process. J. Am. Chem. Soc. 70, 281-285.
Darlow, J. M., and Leach, D. R. F. (1998a) Secondary structures in d(CGG)·d(CCG) repeat tracts. J. Mol. Biol. 275, 3-16.
Darlow, J. M., and Leach, D. R. F. (1998b) Evidence for Two Preferred Hairpin Folding Patterns in d(CGG)·d(CCG) Repeat Tracts in vivo. J. Mol. Biol. 275, 17-23.
Hosfield, D. J., Mol, C. D., Shen, B., and Tainer, J. A. (1998) Structure of the DNA Repair and Replication Endonuclease and Exonuclease FEN-1: Coupling DNA and PCNA Binding to FEN-1 Activity. Cell 95, 135-146.
Dawson, M.C., Elliott, D.C., Elliott, W.H., and Jones, K.M. (1987) Data for Biochemical Research, 3rd ed. Clarendon, Oxford.
Dickerson, R.E., Drew, H.R., Conner, B.N., Wing, R.M., Fratini, A.V., and Kopka, M.L. (1982) The anatomy of A-, B-, and Z-DNA. Science 216, 475-485.
Eichler, E.E., Richards, S., Gibbs, R.A., and Nelson, D.L. (1993) Fine structure of the human FMR1 gene [published erratum appears in Hum Mol Genet 1994 Apr;3(4):684-5]. Human Mol. Gen. 2, 1147-1153.
Eichler, E.E., Holden, J.J.A., Popovich, B.W., Reiss, A.L., Snow, K., Thibodeau, S.N., Richards, C.S., Ward, P.A., and Nelson, D.L. (1994) Length of uninterrupted CGG repeats determines instability in the FMR1 gene. Nature Genetics 8, 88-94.
Fang, W.-h., and Modrich, P. (1993) Human strand-specific mismatch repair occurs by a bidirectional mechanism similar to that of the bacterial reaction. J. Biol. Chem. 268, 11838-11844.
Feng, Y., Absher, D., Eberhart, D.E., Brown, V., Malter, H.E., and Warren, S.T. (1997) FMRP associates with polyribosomes as an mRNP and the I304N mutation of severe fragile X syndrome abolishes this association. Molecular Cell 1, 109-118.
Feng, Y., Zhang, F., Lokey, L.K., Chastain, J.L., Lakkis, L., Eberhart, D., and Warren, S.T. (1995) Translational suppression by trinucleotide repeat expansion at FMR1. Science 268, 731-734.
Ferenczy, G.G., Reynolds, C.A., and Richards, W.G. (1990) Semi-Empirical AM1 Electrostatic Potential-Derived Charges. J. Comp. Chem. 11, 159-170.
Fischer, S.G., and Lerman, L.S. (1979) Length-independent separation of DNA restriction fragments in two-dimensional gel electrophoresis. Cell 16, 191-200.
Fischer, S.G., and Lerman, L.S. (1983) DNA fragments differing by single base-pair substitutions are separated in denaturing gradient gels: correspondence with melting theory. Proc. Natl. Acad. Sci. U.S.A. 80, 1579-1583.
Ford, G. P., and Wang, B. (1993) The updated electrostatic potential for cytosine completes the qualitative explanations of base alkylation regiochemistry. Carcinogenesis 14, 1465-1467.
Frank, D., Kesher, I., Shani, M., Levine, A., Razin, A., and Cedar, H. (1991) Demethylation of CpG islands in embryonic cells. Nature 351, 239-241.
Freudenreich, C.H., Kantrow, S.M., and Zakian, V.A. (1998) Expansion and length-dependent fragility of CTG repeats in yeast. Science 279, 853-856.
Fry, M., and Loeb, L. A. (1994) The Fragile X syndrome d(CGG)n nucleotide repeats form a stable tetrahelical structure. Proc. Natl. Acad. Sci. U.S.A. 91, 4950-4954.
Fu, Y.H., Kuhl, D.P.A., Pizzuti, A., Pieretti, M., Sutcliffe, J. S., Richards, S., Verkerk, A. J. M. H., Holden, J. J. A., Fenwick, J. R. G., Warren, S. T., Oostra, B. A., Nelson, D. L., and Caskey, C. T. (1991) Variation o f the CGG repeat o f the Fragile X site results in genetic instability: resolution o f the Sherman paradox. Cell 67, 1047-1058. Gacy, A. M., Goeliner, G., Juranic, N., Macura, S., and McMurray, C. T. (1995) Trinucleotide repeats that expand in human disease form hairpin structures in vitro. Cell 81,533-540. Gacy, A. M., and McMurray, C. T. (1998) Influence o f hairpins on template reannealing at trinucleotide repeat duplexes: a model for slipped DNA. Biochemistry 37,9426-9434. Gao, L., Huang, X., Smith, K. G., Zheng, M., and Liu, H. (1995) A new antiparallel duplex m otif of DNA CCG repeats that is stabilized by extrahelical bases symmetrically located in the minor groove. J. Am. Chem. Soc. 114, 8883-8884. Gehring, K., Leroy, J.-L., and Gueron, M. (1993) A tetrameric DNA structure with protonated cytosine.cytosine base pairs. Nature 363, 561-565* Godde JS. Kass SU. Hirst MC. Wolffe AP (1996) Nucleosome assembly on methylated CGG triplet repeats in the fragile X mental retardation gene 1 promoter. Journal of Biological Chemistry. 271, 24325-24328. Gordenin, D.A., Kunkel, T.A., and Resnick, M.A. (1997). Repeat expansion-all in a flap? Nat. Genet. 16, 116-118. Goulian, M., Richards, S.H., Heard, C.J., and Bigsby, B.M. (1990). Discontinuous DNA synthesis by purified mammalian proteins. J. Biol. Chem. 265, 18461-18471 Gray, D. M., Cui, T., and Ratliff, R. L. (1984) Circular dichroism measurements show that C«C+ base pairs can coexist with A.T base pairs between antiparallel strands o f an oligodeoxynucleotide double helix. Nucleic Acids Res. 12, 7565-7580 Gray, D. M., Ratliff, R. L., Antao, V. P., and Gray, C. W. (1988) in Structure and Expression, Vol. 2: DNA and Its Drug complexes (Sarma, M. H., and Sarma, R. H., Eds.) 
pp 147-166, Adenine Press, Guilderland, NY. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Harrington, J.J. and Lieber, M.R. (1994). Functional domains within FEN-1 and RAD2 define a family o f structure-specific endonucleases: implications for nucleotide excision repair. Genes Dev 8, 1344-1355. Hartley, J. A., Berardini, M. D., and Souhami, R. L. (1991) An agarose gel method for the determination o f DNA interstrand crosslinking applicable to the measurment o f the rate o f the second arm. Analyt. Biochem. 193, 131-134. Hartley, J. A., Souhami, R. L. and Berardini, M. D. (1993) Electrophoretic and Chromatographic Separation Methods used to Reveal Interstrand Crosslinking o f Nucleic Acids. J. Chromatography 618, 277-288. Haworth, I. S., Lee, C.-S., Yuki, M. and Gibson, N. W. (1993) Molecular Dynamics Simulations can Provide a Basis for the Observed Preferences o f DNA Alkylation by Aziridinylbenzoquinones. Biochemistry 32, 12857- 12863. Hirst, M.C., Grewal, P.K., and Davies, K.E. (1994) Precursor arrays for triplet repeat expansion at the fragile X locus. Hum. Mol. Genet. 3, 1553-1560. Holbrook, S. R., Cheong, C., Tinoco, I., and Kim, S. H. (1991) Crystal Structure o f an RNA double helix incorporating a track o f non-W atson-Crick base pairs. Nature 353, 579-581. Hopkins, P.B., Millard, J.T., Woo, J., Weidner, M. F., Kirchner, J. J., Sigurdsson, S. T., and Raucher, S. (1991) Sequence Preferences o f DNA Interstrand Cross-Linking Agents: Importance o f Minimal DNA Structural Reorganization in the Cross-Linking Reactions o f Mechlorethamine, Cisplatin and Mitomycin C. Tetrahedron. 47, 2475-2489. Homstra, I. K., Nelson, D. L., Warren, S. T., and Yang, T. P. (1993) High resolution analysis o f th FM Rl gene trinucleotide repeat region in fragile X syndrome. Human. Mol. Genet. 2, 1659-1665. Huntingon's Disease Collaborative Research Group. 
(1993) A novel gene containing a trinucleotide that is expanded and unstable on Huntington's disease chromosomes. Cell 72, 971-983. Jansen, G., Willems, P., Coerwinkel, M., Nillesen, W., Smeets, H., Vits, L., Howeler, C., Brunner, H., and Wieringa, B. (1994) Gonosomal mosaicism in myotonic dystrophy patients: involvement o f mitotic events in (CTG)n repeat variation and selection against extreme expansion in sperm. Am. J. Hum. Genet. 54, 575-585. 192 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Johnston, B. H. (1992) Probing DNA structures in vitro. Methods in Enzymology 212, 180-194, Johnston, B. H., and Rich, A. (1985) Chemical probes o f DNA conformation: detection of Z-DNA at nucleotide resolution. Cell 42, 713-724. Jones, C., Penny, L., Mattina, T., Yu, S., Baker, E., Voullaire, L., Langdon, W. Y., Sutherland, G. R., Richards, R. I., and Tunna-cliffe, A. (1995) A fragile site within the proto oncogene CBL2 associated with a chromosome deletion syndrome. Nature 376. 145-149. Jones, M., Wagner, R., and Radman M. (1987) Repair o f a mismatch is influenced by the base composition o f the surrounding nucleotide sequence. Genetics 115, 605-610. Jorgensen, W.L., Chandrasekhar, J., Madura, J., Impey, R.W., Klein, M.L. (1983) Comparison o f Simple Potential Functions for Simulating Liquid Water. J.Chem.Phys. 79, 926-937. Kafri, T., Ariel, M., Brandeis, M., Shemer, R., Urven, L., McCarrey, J., Cedar, H. and Razin, A. (1992) Developmental pattern of gene-specific DNA methylation in the mouse embryo and germ line. Genes Dev. 6, 705-714. Kang, S., Jaworski, A., Ohshima, K., and Wells, R. D. (1995) Expansion and deletion of CTG triplet repeats from human disease genes are determined by direction of replication. Nat. Genet. 10,213-218. Kang, S., Ohshima, K., Jaworski, A., and Wells, R.D. 
(1996) CTG triplet repeats from the myotonic dystrophy gene are expanded in Escherichia coli distal to the replication origin as a single large event. J. Mol. Biol. 258, 543-547. Ke, S. H., and Wartell, R. M. (1993) Influence o f nearest neighbor sequence on the stability o f base pair mismatches in long DNA; determination by temperature-gradient gel electrophoresis. Nucleic Acids Res. 21, 5137-5143. Kennard, O, Salisbury, S.A. (1993) Oligonucleotide X-ray structures in the study of conformation and interactions o f nucleic acids.J. Biol. Chem. 268, 10701-10704. Kettani, A., Kumar, R. A., and Patel, D. J. (1995) Solution structure o f a DNA quadruplex containing the fragile X syndrome triplet repeat. J. Mol. Biol. 254,638-656. 193 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Klimasauskas, S. and Roberts, R. J. (1995) M .Hhal binds tightly to substrates containing mismatches at the target base. Nucleic Acids Res. 23, 1388-1395. Knight, S. J. L., Flannery, A. V., Hirst, M. C., Campbell, L., Christodoulou, Z., Phelps, S. R., Pointon, J., Middleton-Price, H. R., Oostra, B. A., and Davies, K. E. (1993) Trinucleotide repeat amplification and hypermethylation o f a CpG island in FRAXE mental retardation. Cell 74, 127-134. Kohn, K. W., Hartley. J. A., and Mattes. W. B. (1987) Mechanisms o f selective alkylation o f guanine N-7 positions by nitrogen mustards. Nucleic Acids Res. 15, 10531- 10549. Kohwi, Y., Wang, H., and Kohwi-Shigematsu, T. (1993) A single trinucleotide, 5'AGC375'GCT3', o f the triplet-repeat disease genes confers metal ion-induced non-B DNA structure. Nucleic Acids Res. 21, 5651-5655. Kramer, B., Kramer, W., and Fritz, H.J. (1984) Different base/base mismatches are corrected with different efficiencies by the methyl-directed DNA mismatch-repair system. Cell 38, 879-887. Kremer, E. J., Pritchard, M., Lynch, M., Yu, S., Holman, K., Baker, E., Warren, S. T., Schlessinger, D., Sutherland, G. 
R., and Richards, R. I. (1991) Mapping o f DNA instability at the fragile X to a trinucleotide repeat sequence p(CCG)n. Science 252, 1711-1714. Kunst, C.B. and Warren, S.T. (1994) Cryptic and polar variation o f the fragile X repeat could result in predisposing normal alleles. Cell 77, 853-861. Kunst, C.B., Zerylnick, C., Karickhoff, L., Eichler, E., Bullard, J., Chalifoux, M., Holden, J. J., Torroni, A., Nelson, D.L., Warren, S.T. (1996) FM Rl in global populations American Journal o f Human Genetics. 58, 513-522. Kumaresan, K.R., Ramaswamy, M., Yeung, A.T. (1992) Structure o f the DNA Interstrand Cross-Link o f 4,5’8-Trimethylpsoralen. Biochemistry 31, 6774-6783. Kuryavyi, V. V., and Jovin, T. M. (1995) Triad-DNA: a model for trinucleotide repeats. Nature Genetics 9, 339-341. 194 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Laayoun, A., and Smith, S. S. (1995) Methylation o f slipped duplexes, snapbacks and cruciforms by hum an DNA (cytosine-5) methyltransferase. Nucleic Acids Res. 23, 1584- 1589. Langridge, R., and Rich, A. (1963) Physical and enzymatic studies on poly d(I-C)-poly d(I-C), an unusual double-helical DNA. Nature 198, 726-728. Lavelle, L., and Fresco, J. (1995) UV spectroscopic identification and thermodynamic analysis of protonated third strand deoxycytidine residues at neutrality in the triplex d(C(+)-T)6:[d(A-G)6.d(C-T)6]; evidence for a proton switch. Nucleic Acids Res. 23, 2692-2705. Ledbetter, D. H., Ledbetter, S. A., and Nussbaum, R. (1986) Implications o f fragile X expression in normal males for the nature o f the mutation. Nature 324, 161-163. Lee, C-S. and Gibson, N. W.(1993) Nucleotide preferences for DNA interstrand cross- linking induced by the cyclopropylpyrroloindole analogue U-77,779.Biochemistry 32, 2592-2600. Lee, C.-S., Hartley, J. A., Berardini, M., Butler, J., Siegel, D., Ross, D. and Gibson, N. 
W (1992) Alteration in DNA Cross-Linking and Sequence Selectivity o f a Series o f Aziridinylbenzoquinones After Enzymatic Reaction by DT-Diaphorase. Biochemistry 31, 3019-3025. Leonard, G.A., Zhang, S., Peterson, M.R., Harrop, S.J., Helliwell, J.R., Cruse, W.B., d'Estaintot, B.L., Kennard, O., Brown, T., Hunter, W.N. (1995) Self-association o f a DNA loop creates a quadruplex: crystal structure o f d(GCATGCT) at 1.8 A resolution. Structure 3,335-340. Liang, G., Gannett, P., Shi, X., Zhang, Y., Chen, F.-X., and Gold, B. (1994) DNA Sequencing with the Hydroperoxide o f Tetrahydrofuran. J. Am. Chem. Soc. 116, 1131- 1132. Liang, G. Gannett, P. and Gold, B. (1995) The use o f 2-hydroperoxytetrahydrouran as a reagent to sequence cytosine and to probe non-W atson-Crick DNA structures. Nucleic Acids Res. 23,713-719. Lilley, D.M.J., (1992) Probes o f DNA Structure. Methods in Enzymology 212, 133-220. 195 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Lohman, T. M., and Ferrari, M. E. (1994) Escherichia coli single-stranded DNA-binding protein: multiple DNA-binding modes and cooperativities. Annu. Rev. Biochem. 63, 527- 570. Lu., A.-L., Clark,S., and Modrich, P. (1988) Methyl-directed repair o f DNA base-pair mismatches in vitro. Proc. Natl Acad. Sci. USA 80,4639-4643. Lyamichev, V., Brow, M.A.V., and Dahlberg, J.E. (1993) Structure-specific endonucleolytic cleavage o f nucleic acids by eubacterial DNA polymerases. Science 260, 778-783. Mahadevan, M., Tsilfidis, C., Sabourin, L„ Shutler, G., Amemiya, C., Jansen, G., Neville, C., Narang, M., Barcelo, J., O'Hoy, K., Leblond, S., Earle-Macdonald, J., De Jong, P. J., Wieringa, B., and Korneluk, R. G. (1992) Myotonic dystrophy mutation: an unstable CTG repeat in the 3' untranslated region o f the gene. Science 255, 1253-1255. Mahadevan, M.S., Amemiya, C., Jansen, G., Sabourin, L., Baird, S., Neville, C.E., Wormskamp, N „ Segers, B., Batzer, M., Lamerdin, J.. 
de Jong, P., Wieringa, B., and Korneluk, R.G. (1993) Structure and genomic sequence of the myotonic dystrophy (DM kinase) gene. Human Molecular Genetics. Human Mol. Gen. 2, 299-304. Mandel, M., and Marmur, J. (1968) Methods in Enzymology, Vol. Xllb, 195-206. Mariappan, S. V. S., Catasti, P., Chen, X., Ratcliff, R., Moyzis, R. K., Bradbury, E. M., and Gupta, G. (1996a) Solution structures o f the individual single strands o f the fragile X DNA triplets (GCC)n.(GGC)n. Nucleic Acids Res. 24, 784-792. Mariappan, S. V. S., Garcia, A., and Gupta, G. (1996b) Structure and dynamics o f the DNA hairpins formed by tandemly repeated CTG triplets associated with myotonic dystrophy. Nucleic Acids Res. 24, 775-783. Mariappan S.V., Silks, L.A., Bradbury, E.M.,. Gupta, G. (1998) Fragile X DNA triplet repeats, (GCC)n, form hairpins with single hydrogen-bonded cytosine.cytosine mispairs at the CpG sites: isotope-edited nuclear magnetic resonance spectroscopy on (GCC)n with selective 15N4-labeled cytosine bases. Journal o f Molecular Biology. 283, 111-120. Mattes, W. B., Hartley, J. A., and Kohn, K. W. (1986) DNA Sequence Selectivity of Guanine-N7 Alkylation by Nitrogen Mustards. Nucleic Acids Res. 14, 2972-2987. 196 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Maxam, A. M., and Gilbert, W. (1980) Sequencing end-labeled DNA with base-specific chemical cleavage. Methods in Enzymology. 65,499-560. Michel, B., Ehrlich, S.D. and Uzest, M., (1997) DNA double-strand breaks caused by replication arrest. EMBO J. 16,430-438. Mila M. Castellvi-Bel S. Sanchez A. Lazaro C. Villa M. Estivill X (1996) Mosaicism for the fragile X syndrome full mutation and deletions within the CGG repeat o f the FMRl gene. Journal o f Medical Genetics. 33, 338-40. Millard JT. Spencer RJ. Hopkins PB. (1998) Effect of nucleosome structure on DNA interstrand cross-linking reactions. Biochemistry. 37, 5211-52119. Millard, J.T., Raucher, S., and Hopkins, P.B. 
(1990) Mechlorethamine cross-links deoxyguanosine residues at 5’-GNC sequences in duplex DNA fragments. J. Am. Chem. Soc. 112, 2459-2460. Millard, J. T. and White, M. M. (1993) Diepoxybutane Cross-Links DNA at 5 -GNC Sequences. Biochemistry 32, 2120-2124. Millard, J. T., Weidner, M. F., Kirchner, J. J., Ribeiro, S., and Hopkins, P. B. (1991) Sequence Preferences o f DNA Interstrand Crosslinking Agents: Quantitation of Interstrand Crosslink Locations in DNA Duplex Fragments Containing Multiple Crosslinkable sites. Nucleic Acids Res. 19, 1885-1891. Mitas, M. (1997) Trinucleotide repeats associated with human diseases. Nucleic Acids Res. 25, 2245-2253. Mitas, M., Yu, A., Dill, J., and Haworth, I. S. (1995a) The trinucleotide repeat sequence d(CGG)i5 forms a heat-stable hairpin containing G(syn)«G(anti) base pairs. Biochemistry 34,12803-12811. Mitas, M., Yu, A., Dill, J., Kamp, T. J., Chambers, E. J., and Haworth, I. S. (1995b) Hairpin properties o f single-stranded DNA containing a GC-rich triplet repeat: (CTG) 15 Nucleic Acids Res. 23, 1050-1059. Mitchell, M. A., Kelly, R. C., Wicnienski, N. A., Hatzenbuhler, N. T., Williams, M. G., Petzold, G. L., Slightom, J. L. and Siemieniak, D. R. (1991) Synthesis and DNA Cross- linking o f a Rigid CPI Dimer. J. Am. Chem. Soc. 113, 8994-8995. 197 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Mitchell, J.E., Newbury S.F., and McClellan, J.A. (1995) Compact structures o f d(CNG)n oligonucleotides in solution and their possible relevance to fragile X and related human genetic diseases. Nucleic Acids Res. 23, 1876-1881. Modrich, P. (1991) Mechanisms and biological effects o f mismatch repair. Annu. Rev. Genet. 25, 229-253. Modrich, P. (1994) Mismatch repair, genetic stability, and cancer. Science 266, 1959- 1960. Murray, J.M., Tavassoli, M., al Harithy, R., Sheldrick, K.S., Lehmann, A.R., Carr, A.M., and Watts, F.Z. 
(1994) Structural and functional conservation o f the human homolog o f the Schizosaccharomyces pombe rad2 gene, which is required for chromosome segregation and recovery from DNA damage. Mol. Cell. Biol. 14,4878-4888. Myers, R.M., Maniatis, T., and Lerman., L.S. (1987) Detection and localization o f single base changes by denaturing gradient gel electrophoresis. Methods in Enzymology 155, 501-527. Nancarrow, J. K., Kremer, E., Holman, K., Eyre, H., Doggett, N. A., LePasiler, D., Callen, D. F., Sutherland, G. R., and Richards, R. I. (1994) Implications o f FRA16A structure for the mechanism o f chromosomal fragile site genesis. Science 264, 1938- 1941. Nolan, J.P., Shen, B., Park, M.S., and Sklar, L.A. (1996) Kinetic analysis o f human flap endonuclease-1 by flow cytometry. Biochemistry 35, 11668-11676. Oberle I., Rousseau, F., Heitz, D., Kretz, C., Devys, D., Hanauer, A., Boue, J., Bertheas, M.F., and Mandel, J-L. (1991). Instability of a 550-base pair DNA segment and abnormal methylation in fragile X syndrome. Science 252, 1097-1102. Okazaki, R., Okazaki, T., Sakabe, K., Sugimoto, K., and Sugino, A. (1968) Mechanism of DNA chain growth. Possible discontinuity and unusual secondary structure o f newly synthesized chains. Proc. Natl. Acad. Sci. USA 59, 598-605. Ojwang, J. O., Grueneberg, D. A., and Loechler, E. L. (1989) Synthesis o f a Duplex Oligonucleotide Containing a Nitrogen Mustard Interstrand DNA-DNA Cross-Link. Cancer Res. 49, 6529-6537. 198 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Parish, J. E., Oostra, B. A., Berkerk, A. J. M. H., Richards, C. S., Reynolds, J., Spikes, A. S., Shaffer, L. G., and Nelson, D. L. (1994) Isolation o f a GCC repeat showing expansion in FRAXF, a fragile site distal to FRAXA and FRAXE. Nature Genet. 8, 229-235. Paroush, Z., Keshet, I., Yisraeli, J, and Cedar, H. (1990) Dynamics o f demethylation and activation o f the alpha-actin gene in myoblasts. Cell 63, 1229-1237. 
Pearson CE. Sinden RR (1996) Alternative structures in duplex DNA formed within the trinucleotide repeats o f the myotonic dystrophy and fragile X loci. Biochemistry. 35, 5041-5053. Pearlman, D.A., Case, D.A., Caldwell, J.C., Seibel, G.L., Singh, U.C., Weiner, P., and Kollman, P.A. (1991) AMBER4.0, University o f California, San Francisco. Pearson, C.E., and Sinden, R.R. (1996) Alternative Structures in Duplex DNA Formed within the Trinucleotide Repeats of the Myotonic Dystrophy and Fragile X Loci. Biochemistry 35, 5041-5053 Petruska, J., Amheim, N., and Goodman, M. F. (1996) Stability o f intrastrand hairpin structures formed by the CAG/CTG class o f DNA triplet repeats associated with neurological diseases. Nucleic Acids Res. 24, 1992-1998. Petruska, J., Hartenstine, M .J., and Goodman, M.F. (1998) Ananlysis o f Strand Slippage in DNA Polymerase Expansion o f CAG/CTG Triplet Repeats Associated with Neurodegenerative Disease. Journal o f Biological Chemistry. 273, 5204-5210. Peyret, N., Seneviratne, A., Allawi, H. T. and SantaLucia J. Jr. (1999) Nearest-Neighbor Thermodynamics and NMR o f DNA Sequences with Internal A«A, C«C, G«G, and T«T Mismatches. Biochemistry, 38, 3468-3477. Pieper, R. O., and Erickson, L. C. (1990) DNA Adenine Adducts Induced by Nitrogen Mustards and Their Role in Transcription Termination in vivo. Carcinogenesis 11, 1739- 1746 Pieper, R.O., Futscher, B.W. and Erickson, L.C. (1989) Transcription-Terminating Lesions Induced by Bifunctional Alkylating Agents in vitro. Carcinogenesis, 10, 1307- 1314. 199 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Pieretti, M., Zhang, F., Fu, Y-H, Warren, S.T., Oostra, B.A., Caskey, C.T., and Nelson, D.L. (1991) Absence o f expression o f the FMR-1 gene in fragile X syndrome. Cell 66, 817-822. Pjura, P. E., Grzeskowiak, K., and Dickerson, R. E. (1987) Binding o f Hoechst 33258 to the minor groove ofB-D NA . J. Mol. Biol. 197, 257-271. Povirk, L. F. 
and Shuker, DE (1994) DNA damage and mutagenesis induced by nitrogen mustards. Mutat. Res., 318, 205-226. Price, C.C. (1958) Fundamental Mechanisms of Alkylation. Ann. N. Y. Acad. Sci., 68, 663-668. RATTLER, Oxford Molecular.Ltd., Oxford Science Park, Oxford ,U.K. Ravishanker, G.; Swaminathan, S.; Beveridge, D.L.; Lavery, R.; Sklenar, H. (1989) Conformational and Helicoidal Analysis o f 30ps o f Molecular Dynamics on the d(CGCGAATTCGCG) Double Helix: 'Curves', Dials and Windows. J.Biomol.Struct.Dyn. 6, 669-699. Razin, A., Levine, A., Kafri, T., Agostini, S., Gomi, T. and Cantoni G.L. (1988) Relationship between transient DNA hypomethylation and erythroid differentiation of murine erythroleukemia cells. Proc. Natl. Acad. Sci. USA 85, 9003-9006. Razin, A., Szyf, M., Kafri, T., Roll, M., Giloh, H., Scarpa, S., Carrotti, D. and Cantoni, G.L. (1986) Replacement o f 5-methylcytosine by cytosine: a possible mechanism for transient DNA demethylation during differentiation. Proc. Natl. Acad. Sci. USA 83, 2827-2831. Razin, A., Webb, C., Szyf, M., Yisraeli, J., Rosenthal, A., Naveh-many, T., Sciaky- Gallili, N. and Cedar, H. (1984) Variations in DNA methylation during mouse cell differentiation in vivo and in vitro. Proc. Natl. Acad. Sci. USA 81, 2275-2279. Remias, M. G., Lee, C-S., and Haworth, I. S. (1995) Molecular Dynamics Simulations of Chlorambucil / DNA Adducts. A Structural Basis for the 5’-GNC Interstrand DNA Crosslink Formed by Nitrogen Mustards. J. Biomol. Struct. Dyn. 12, 911-936. Richards, R. I., and Sutherland, G. R. (1992) Heritable unstable DNA sequences. Nature Genet. 1, 7-9. 200 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Richards, R. I., and Sutherland, G. R. (1994) Simple repeat DNA is not replicated simply. Nature. Genet. 6, 114-116. Rink, S. M., and Hopkins, P. B. (1995) A mechlorethamine-induced DNA interstrand cross-link bends duplex DNA. Biochemistry 34, 1439-1445 Rink, S. M., Solomon, M. 
S., Taylor, M. J., Raja, S. B., Mclaughlin, L. W., and Hopkins, P. B. (1993) Covalent Structure o f a Nitrogen Mustard-Induced DNA Interstrand Cross- Link: An N7-to-N7 Linkage o f deoxyguanosine Residues at the Duplex Sequence 5’- d(GNC). J. Am. Chem. Soc. 115, 2551-2557. Roberts, R. J. (1995) On base flipping. Cell 82, 9-12. Romero, R. M., Cheng, H. Y., Mitas, M„ and Haworth, I. S. (1998) Structure, Motion, Interaction and Expression of Biological Macromolecules, Volume 2. Proceedings o f the 10th Conversation in Biomolecular Stereodynamics, Albany, N.Y. Eds. R.H. Sarma & M.H. Sarma, Adenine Press, 215-220. Romero, R. M., Mitas, M. and Haworth, I. S. (1999a) Anomalous Cross-linking by Mechlorethamine o f DNA Duplexes Containing C-C Mismatch Pairs. Biochemistry, 38, 3641-3648. Romero, R. M., Rodesittisuk, P. and Haworth, I. S. (1999b) Kinetics and Sequence Dependence o f the Cytosine-Cytosine Mismatch Crosslink formed by Mechlorethamine to be published Rosenberg, R.N.,(1996) DNA-Triplet Repeats and Neurologic Disease. N. Engl. J. Med 335, 1222-1224. Rousseau, F., Heitz, D., Biancalana, V., Blumenfeld, S., Kretz, C., Boue, J., Tommerup, N,. Van Der Hagen, C., DeLozier-Blanchet, C., Croquette, M.-F., Gilgenkrantz, S., Jalbert, P., Voelekel, M.-A., Obeite, L, and Mandel, J.-L. (1991) Direct diagnosis by DNA analysis of the fragile X syndrome o f mental retardation. N. Engl. J. Med. 325, 1673-1681. Rubin, C. M., and Schmid, C. W. (1980) Pyrimidine-specific chemical reactions useful for DNA sequencing. Nucleic Acids Res. 8,4613-4619. 201 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Rutman, R. J., Chun, E. H. L., and Jones, J. (1969) Observations on the mechanism of the alkylation reaction between nitrogen mustard and DNA. Biochim. Biophys. Acta 174, 663-673. Saenger, W. (1984) Principles o f Nucleic Acid Structure, Springer-Verlag, New York. Saluz, H.P., Jiricny, J. and Jost, J.P. 
(1986) Genomic sequencing reveals a positive correlation between the kinetics o f strand-specific DNA demethylation o f the overlapping estradiol/glucocorticoid-reccptor binding sites and the rate o f avian vitellogenin mRNA synthesis. Proc. Natl. Acad. Sci. USA 83, 7167-7171. Salvati, M.E., Moran, E.J., and Armstrong, R.W. (1992) Rates o f nitrogen mustard hydrolysis. Tetrahedron Lett., 33, 3711. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) in Molecular Cloning: A Laboratory Manual, 2nd edn., Cold Spring Harbor Laboratory, Cold Spring Harbor, New York. Sarkar, P.S., Chang, H-C, Boudi, F.B. and Reddy, S. (1998) CTG Repeats Show Bimodal Amplification in E. coli. Cell 95, 531-540. Schlotterer, C., and Tautz, D. (1992) Slippage synthesis o f simple sequence DNA. Nucleic Acids Res. 20, 211-215. Schweitzer, J.K., and Livingston, D.M. (1998) Expansions o f CAG repeat tracts are frequent in a yeast mutant defective in Okazaki fragment maturation. Hum. Mol. Genet. 7, 69-74. Sealey, P. G., Southern, E. M. (1990) Gel Electrophoresis o f Nucleic Acids. Oxford University Press. Oxford, U. K., 51-100. Singer, B., and Grunberger, D. (1983) Molecular biology o f mutagens and carcinogens, Plenum Press, New York. Sinden, R. R. (1994) DNA Structure and Function, Academic Press, San Diego. Sinden, R. R., and Wells, R. D. (1992) DNA structure, mutations and human genetic disease. Curr. Opin. Biotechnol. 3, 612-622. 202 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Siomi, H., Siomi, M.C., Nussbaum, R.L., and Dreyfuss, G. (1993) The protein product o f the fragile X gene, FM R l, has characteristics o f an RNA-binding protein. Cell 74, 291 - 298. Siomi M., Siomi H., Sauer W., Srinivasan, S. Nussbaum R., Dreyfuss G. (1995) FXR1, an autosomal homolog o f the fragile X mental retardation Gene. EMBO 142,401-2408. Smith, G. K., Jie, J., Fox, G. E„ and Gao, X. 
(1995) DNA CTG triplet repeats involved in dynamic mutations o f neurological related gene sequences form stable duplexes. Nucleic Acids Res. 23,4303-4311. Smith, S.S. (1994) Biological implications o f the mechanism o f action o f human DNA(cytosine-5)methyltransferases. In Progress in. Nucleic Acids Research and Molecular Biology (Cohn W.E. and Moldave, K., eds),. 49, 65-111. Academic Press. New York. Smith, S.S.,Hardy, T. A., and Baker, D. J. (1987) Human DNA(cytosine- 5)methyltransferase selectively methylates duplex DNA containing mispairs. Nucleic Acids Res. 15, 6899-6916. Smith, S.S., Kan, J. L. C., Baker, D. J., Kaplan, B. E., and Dembek, P. (1991) Recognition o f unusual DNA structures by human DNA(cytosine-5)methyltransferase. J. Mol. Biol. 217, 39-51. Smith, S.S., Layoun, A., Lingeman, R. G., Baker, D. J., and Riley, J. (1994) Hypermethylation o f telomere-like foldbacks at codon 12 o f the human c-Ha-ras gene and the trinucleotide repeat o f the FMR-1 gene o f fragile X. J. Mol. Biol. 243, 143-151. Smith, S.S., Lingeman, R. G., and Kaplan, B. E. (1992) Recognition o f foldback DNA by the human DNA(cytosine-5)methyltransferase. Biochemistry 31, 850-854. Snow, K., Tester, D.J., Kruckeberg, K.E., Schaid, D.J., and Thibodeau, S.N. (1994) Sequence analysis o f the fragile X trinucleotide repeat: implications for the origin o f the fragile X mutation. Hum. Mol. Genet. 3, 1543-1551. Stewart, P. J., and Owen, W. R. (1980) A study o f the pKa o f chlorambucil. Aust. J. Pharm. Sci. 9, 15-18. 203 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Su, S. S., Lahue, R. S., Au, K. G., and Modrich, P. (1988) M ispair specificity o f methyl- directed DNA mismatch correction J. Biol. Chem. 263, 6829-6835. Sullivan, C. H. and Grainger, R.M. (1986) Delta-crystallin genes become hypomethylated in postmitotic lens cells during chicken development. Proc. Natl. Acad. Sci. USA 83, 329-333. Sutherland, G. 
R., Baker, E., and Fratini, A. (1985) Excess thymidine induces folate sensitive fragile sites. J. Med.. Genet. 22, 433-443. Sutherland, G. R., and Hecht, F. (1985) Fragile sites on human chromosomes, Vol. 13, Oxford University Press, New York. Sutherland, G.R., and Richards, R.I. (1995a) The molecular basis o f fragile sites on human chromosomes. Curr. Opin. Genet. Dev. 5, 323-327. Sutherland, G.R., and Richards, R.I. (1995b) Simple tandem repeats and human genetic disease. Proc. Natl. Acad. Sci. USA 92, 3636-3641 Teng, M.K..., Usman, N., Frederick, C. A., and Wang, A. H.J. (1988) The molecular structure o f the complex o f Hoechst 33258 and the DNA dodecamer d(CGCGAATTCGCG). Nucleic Acids Res. 16, 2671-2690. Thomas, D. C., Roberts, J. D., and Kunkel, T. A. (1991) Heteroduplex repair in extracts of human HeLa cells. J. Biol. Chem. 266, 3744-3751. Timchenko, L.T. and Caskey, C.T. (1996) Trinucleotide repeat disorders in humans: discussions o f mechanisms and medical issues.FASEB J. 10, 1589-1597. Tishkoff, D.X., Filosi, N., Gaida, G.M., and Kolodner, R.D. (1997). A novel mutation avoidance mechanism dependent on S. cerevisiae RAD27 is distinct from DNA mismatch repair. Cell 88, 253-263. Turchi, J.J., and Bambara, R.A. . (1993). Completion o f mammalian lagging strand DNA replication using purified proteins. J. Biol. Chem. 268, 15136-15141. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Ulanov B. P. , Matorina T. I. and Pozdeev P. P. (1992) [Nitrogen mustard fixes the Z- cont'ormation in poly[d(G-C)]poly[d(G-C)] and DNA]. Molekuliamaia Biologiia, 26, 583-590. Usdin, K., and Woodford, K. J. (1995) CGG repeats associated with DNA instability and chromosome fragility form structures that block DNA synthesis in vitro. Nucleic Acids Res. 23,4202-4209. Ussery. D.W., Hoepfner R.W.. and Sinden. R. R (1992) Probing DNA Structure with Psoralen in Vitro. Methods in Enzymology 212, 242-256. 
van Gunsteren, W.F., Berendsen, H.J., Guersten, R.G., and Zwinderman, H.R. (1986) Molecular Dynamics Simulation of an Eight Base Pair DNA Fragment in Aqueous Solution. Ann. N.Y. Acad. Sci. 482, 287-303.
van Gunsteren, W.F., and Berendsen, H.J.C. (1977) Algorithms for Macromolecular Dynamics and Constraint Dynamics. Mol. Phys. 34, 1311-1327.
Verkerk, A.J.M.H., Pieretti, M., Sutcliffe, J.S., Fu, Y.-H., Kuhl, D.P.A., Pizzuti, A., Reiner, O., Richards, S., Victoria, M.F., Zhang, F., Eussen, B.E., van Ommen, G.-J.B., Blonden, L.A.J., Riggins, G.J., Chastain, J.L., Kunst, C.B., Galjaard, H., Caskey, C.T., Nelson, D.L., Oostra, B.A., and Warren, S.T. (1991) Identification of a gene (FMR-1) containing a CGG repeat coincident with a breakpoint cluster region exhibiting length variation in fragile X syndrome. Cell 65, 905-914.
Vincent, A., Heitz, D., Petit, C., Kretz, C., Oberle, I., and Mandel, J-L. (1991) Abnormal pattern detected in fragile X patients by pulse-field gel electrophoresis. Nature 349, 624-626.
Waga, S., Bauer, G., and Stillman, B. (1994) Reconstitution of complete SV40 DNA replication with purified replication factors. J. Biol. Chem. 269, 10923-10934.
Wagner, R., Debbie, P., and Radman, M. (1995) Mutation detection using immobilized mismatch binding protein (MutS). Nucleic Acids Res. 23, 3944-3948.
Wang, P., Bauer, G.B., Bennett, R.A.O., and Povirk, L.F. (1991) Thermolabile Adenine Adducts and A.T Base Pair Substitutions Induced by Nitrogen Mustard Analogues in an SV40-Based Shuttle Plasmid. Biochemistry 30, 11515-11521.
Wang, P., Bauer, G.B., Kellogg, G.E., Abraham, D.J., and Povirk, L.F. (1994) Effect of distamycin on chlorambucil-induced mutagenesis in pZ189: evidence of a role for minor groove alkylation at adenine N3. Mutagenesis 9, 133-139.
Wang, Y.-H., and Griffith, J. (1996) Methylation of expanded CCG triplet repeat DNA from fragile X syndrome patients enhances nucleosome exclusion. J. Biol. Chem. 271, 22937-22940.
Warren, S.T., and Nelson, D.L. (1994) Advances in molecular analysis of fragile X syndrome. Journal of the American Medical Association 271, 536-542.
Warren, S.T. (1994) Advances in molecular analysis of fragile X syndrome. JAMA 271, 536-539.
Wartell, R.M., Hosseini, S.H., and Moran, C.P.J. (1990) Detecting base pair substitutions in DNA fragments by temperature-gradient gel electrophoresis. Nucleic Acids Res. 18, 2699-2705.
Weber, J.L. (1990) Informativeness of human (dC-dA)n.(dG-dT)n polymorphisms. Genomics 7, 524-530.
Weiler, I.J., Wang, X., and Greenough, W.T. (1994) Synapse-activated protein synthesis as a possible mechanism of plastic neural change. In Neuroscience: From the Molecular to the Cognitive, Progress in Brain Research, Elsevier Science.
Weiner, S.J., Kollman, P.A., Nguyen, D.T., and Case, D.A. (1986) An All Atom Force Field for Simulation of Proteins and Nucleic Acids. J. Comp. Chem. 7, 230-252.
Weiss, A., Keshet, I., Razin, A., and Cedar, H. (1996) DNA demethylation in vitro: involvement of RNA. Cell 86, 709-718.
Wells, R.D. (1996) Molecular Basis of Genetic Instability of Triplet Repeats. J. Biol. Chem. 271, 2875-2878.
Wells, R.D., and Sinden, R.R. (1993) in Genome Analysis Volume 7: Genome Rearrangement and Stability (Davies, K., and Warren, S., Eds.) pp 107-138, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Wohlrab, F. (1992) Enzyme Probes in Vitro. Methods in Enzymology 212B, 294-301.
Yu, A., Dill, J., and Mitas, M. (1995a) The purine-rich trinucleotide repeat sequences d(CAG)15 and d(GAC)15 form hairpins. Nucleic Acids Res. 23, 4055-4057.
Yu, A., Dill, J., Wirth, S.S., Huang, G., Lee, V.H., Haworth, I.S., and Mitas, M. (1995b) The trinucleotide repeat sequence d(GTC)15 adopts a hairpin conformation. Nucleic Acids Res. 23, 2706-2714.
Yu, A., Romero, R.M., Barron, M.D., Dill, J., Christy, M., Gold, B., Gray, D.M., Haworth, I.S., and Mitas, M. (1997) At physiological pH d(CCG)15 forms a hairpin containing protonated cytosines and a distorted helix. Biochemistry 36, 3687-3699.
Yu, S., Mulley, J., Loesch, D., Turner, G., Donnelly, A., Gedeon, A., Hillen, D., Kremer, E., Lynch, M., Pritchard, M., Sutherland, G.R., and Richards, R.I. (1992) Fragile-X syndrome: unique genetics of the heritable unstable element. Am. J. Hum. Genet. 50, 968-980.
Yu, S., Pritchard, M., Kremer, E., Lynch, M., Nancarrow, J., Baker, E., Holman, K., Mulley, J.C., Warren, S.T., Schlessinger, D., Sutherland, G.R., and Richards, R.I. (1991) Fragile X genotype characterized by an unstable region of DNA. Science 252, 1179-1181.
Yuki, M., and Haworth, I.S. (1993) The DNA Complexes of Aziridinylbenzoquinones. A Molecular Mechanics Study. Anti-Cancer Drug Design 8, 269-287.
Zhang, Y., O'Connor, J.P., Siomi, M., Srinivasan, S., Dutra, A., Nussbaum, R.L., and Dreyfuss, G. (1995) The Fragile X Mental Retardation Syndrome protein interacts with novel homologs FXR1 and FXR2. EMBO J. 14, 5358-5366.
Zheng, M., Huang, X., Smith, G.K., Yang, X., and Gao, X. (1996) Genetically Unstable CXG Repeats are Structurally Dynamic and Have a High Propensity for Folding. An NMR and UV Spectroscopic Study. J. Mol. Biol. 264, 323-336.

Abbreviations:
C⁺·C base pairs: The superscript plus designates a proton shared between the cytosines; the center dot designates H-bonds.
CpG base-pair step: Duplex base pair alignment d(CG£CG)n·d(CG£CG)n, where £ is a C-C mismatch pair and CpG is the dinucleotide surrounding the mismatched cytosines.
GpC base-pair step: Duplex base pair alignment d(GC£GC)n·d(GC£GC)n, where £ is a C-C mismatch pair and GpC is the dinucleotide surrounding the mismatched cytosines.
CD: circular dichroism
DPAGE: denaturing polyacrylamide gel electrophoresis
DMS: dimethyl sulfate
ds: double-stranded
EMMP: electrophoretic mobility melting profiles
HA: hydroxylamine
Mrel: relative electrophoretic mobility
MTase: methyltransferase
ss: single-stranded
THF-OOH: 2-hydroperoxytetrahydrofuran
Tm: melting temperature
TREDs: triplet repeat expansion diseases
UMI Number: 9987624
UMI Microform 9987624. Copyright 2001 by Bell & Howell Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.
Bell & Howell Information and Learning Company, 300 North Zeeb Road, P.O. Box 1346, Ann Arbor, MI 48106-1346
Asset Metadata
Creator
Romero, Rebecca Miranda (author)
Core Title
A novel reaction of mismatched cytosine-cytosine pairs associated with Fragile X
Contributor
Digitized by ProQuest (provenance)
School
Graduate School
Degree
Doctor of Philosophy
Degree Program
Biochemistry
Degree Conferral Date
1999-12
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
biology, molecular; chemistry, biochemistry; chemistry, pharmaceutical; oai:digitallibrary.usc.edu:usctheses; OAI-PMH Harvest
Language
English
Advisor
Haworth, Ian S. (committee chair), Broek, D. (committee member), Johnson, M. (committee member), Reddy, Sita (committee member), Stallcup, Michael R. (committee member)
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c17-578989
Unique identifier
UC11354440
Identifier
9987624.pdf (filename), usctheses-c17-578989 (legacy record id)
Legacy Identifier
9987624.pdf
Dmrecord
578989
Document Type
Dissertation
Rights
Romero, Rebecca Miranda
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the au...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus, Los Angeles, California 90089, USA
Tags
biology, molecular
chemistry, biochemistry
chemistry, pharmaceutical