Identification of DNA methylation markers in diffuse large B-cell lymphoma
IDENTIFICATION OF CANDIDATE DNA METHYLATION MARKERS IN
DIFFUSE LARGE B-CELL LYMPHOMA
by
Brian Lee Pike
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(BIOCHEMISTRY AND MOLECULAR BIOLOGY)
December 2006
Copyright 2006 Brian Lee Pike
Dedication
To My Parents and Sean, in Loving Memory of Jeremy
Acknowledgements
This work would not have been possible without the support of some very
important individuals, and I would like to take this opportunity to acknowledge
their contributions to my life and education. First, I would like to thank my advisor
and friend, Dr. Joseph Hacia, whose guidance and support have been invaluable
throughout the course of my graduate school career. His friendship, as well as his
contributions to my education, not only made this dissertation possible, but also
made my tenure at USC an enjoyable one.
In selecting faculty members for my dissertation committee, I sought out
individuals whom I respected for their scholarly rigor and looked up to as
contributing members of the scientific community. For these reasons, I asked Dr.
Baruch Frenkel, Dr. Peter Laird, Dr. Juergen Reichardt, and Dr. Robert Stellwagen
to serve on my dissertation committee. I am thankful that each accepted my
invitation. Without exception, my committee members were genuinely interested
in my graduate school career. I am honored by their willingness to participate in
my education and I have benefited greatly from their thoughtful discussions and
learned advice.
The nature of my research topic required that I be involved in several
collaborations. As a result, many people have donated their time and, in some
instances, their resources to aid me in my research. In particular, I would like to
thank Dr. Timothy Greiner at the University of Nebraska Medical Center, for
taking an interest in my research. If not for his generosity in providing the clinical
samples, this work would not have been possible. It is also necessary that I thank
Dr. Daniel Weisenberger, Dr. Myungjin Kim, Mrs. Tiffany Long and Dr. Mihaela
Campan of Dr. Peter Laird’s laboratory. The successful completion of my
dissertation would not have been possible without their help in generating much of
the data in this work. Additionally, I need to thank both Dr. Bernard Futscher and
Mr. Nicholas Holtan of the University of Arizona for donating the human CpG
island library used in this study and for their assistance in transferring the DNA
methylation microarray technique to USC. I would also like to acknowledge Ms.
Ya-Hsuan Hsu and Dr. Ruty Mehrian Shai at the Spotted Microarray Core Facility
at the Institute for Genetic Medicine for their assistance. It is also important that I
recognize Dr. Susan Groshen, Dr. Kimberly Siegmund and Dr. Wei Ye at the
University of Southern California, as well as Dr. Tyra Wolfsberg at the National
Human Genome Research Institute, for their willingness to donate their time and
technical expertise to my project. Beyond the technical support I have received, I
would also like to thank Dr. Michael Stallcup and the organizers of the NIH-
sponsored Predoctoral Research Training Program in Genetic, Molecular and
Cellular Biology, for their financial support and training opportunities.
During my five years at USC, members of the Hacia laboratory, both past
and present, have had a significant impact on my life. In particular, I would
like to acknowledge Dr. Mazen Karaman, whose camaraderie I could not
have done without, and Dr. Krishna Ramaswamy for her friendship in the waning
months of my graduate school career. I would also like to acknowledge my former
mentor, Dr. Lawrence Brody, for his advice and encouragement. Without his
support, I might never have taken up the challenge of graduate school.
The debt that I owe to my family is, unquestionably, the greatest. I am truly
humbled by the sacrifices my parents and grandparents have made on my behalf.
Their unfailing belief in me continues to be a motivating force in my life. Beyond
having the unconditional support of my parents and grandparents, I was also
blessed with the companionship that comes with brotherhood. I have my two
brothers to thank for a continuous source of inspiration.
Finally, my tenure at USC marked a period of tremendous personal growth
in my life. I credit one wonderful woman, Alexandra, with much of that personal
growth. I am inspired by the way she sees the world, and humbled by her faith in
me. I only hope that I enrich her life as much as she continues to enrich mine.
Table of Contents
Dedication ii
Acknowledgements iii
List of Tables vii
List of Figures viii
Abstract x
Chapter 1: Introduction 1
Chapter 2: CpG Island Methylation Microarray Analysis 10
Chapter 3: MethyLight Analysis 47
Chapter 4: Confirmation Analysis of DNA Methylation 78
Chapter 5: Comparison of PCR-Based Genome Amplification Systems 94
Chapter 6: Summary and Discussion 126
References 133
Appendices
Appendix A: CpG Island Clone Information 144
Appendix B: DNA Methylation Microarray Plots 145
Appendix C: Previously Published MethyLight Reactions 153
Appendix D: Unpublished MethyLight Assay Information 155
Appendix E: Unpublished MethyLight Assay Sequences 157
Appendix F: Methylation vs. Expression Plots 161
Appendix G: Sample Set of 26 PMR Values 172
Appendix H: Replicate MethyLight Reactions 174
Appendix I: CpG Island Microarray vs. MethyLight 176
Appendix J: Bisulfite Sequencing Results 180
Appendix K: Reproduction Permission 185
List of Tables
Table 2.1: CpG Island Methylation Microarray Results 30
Table 2.2: Microarray Data Quality Filters 32
Table 2.3: Analysis of CpG Island Microarray Data 39
Table 3.1: Heavily Methylated MethyLight Markers 65
Table 3.2: MethyLight Reactions Summary Statistics 69
Table 4.1: List of DLBCL and MethyLight Reactions 81
Table 4.2: Expanded Sample Set Summary Statistics 90
Table 4.3: Summary of Replicate MethyLight Reactions 91
Table 5.1: Quartile Analysis of Enhancement Factors 121
Table 5.2: Comparison of Enhancement Factors 125
List of Figures
Figure 2.1: DNA Methylation Microarray Sample Preparation 13
Figure 2.2: PCR Product Visualization 18
Figure 2.3: Double-Stranded DNA Linker 20
Figure 2.4: Data Quality Filtration and Normalization 28
Figure 2.5: Average Ratio (Cy3/Cy5) ABC vs. GCB 35
Figure 2.6: Distribution of Average Cy3/Cy5 Ratios 36
Figure 2.7: ABC and GCB Clone Overlap 37
Figure 2.8: ABC Sample 6328 Cy3/Cy5 Plot 40
Figure 2.9: GCB Sample 9323 Cy3/Cy5 Plot 41
Figure 2.10: ABC and GCB Composite Plot 42
Figure 2.11: ONECUT2 in ABC vs. GCB 46
Figure 3.1: Outline of PCR-Based Bisulfite Assays 50
Figure 3.2: Calculation of PMR Value 57
Figure 3.3: Frequency and Distribution of PMR Values 64
Figure 3.4: MethyLight and CpG Island Microarray Overlap 68
Figure 3.5: ONECUT2: Microarray vs. MethyLight 74
Figure 3.6: Expression vs. Methylation 77
Figure 4.1: ZNF615: MethyLight vs. CpG Island Microarray 92
Figure 4.2: Position of HB-442 Sequencing Primers 93
Figure 5.1: Sample Preparation for Hybridization Analysis 101
Figure 5.2: Flowchart of Data Filtration 108
Figure 5.3: Taq vs. Taq Plot 109
Figure 5.4: AccuPrime vs. Taq Plot 110
Figure 5.5: GC RICH vs. Taq Plot 111
Figure 5.6: ThermalAce vs. Taq Plot 112
Figure 5.7: Quartile 1 Enhancement Factor Values 117
Figure 5.8: Quartile 2 Enhancement Factor Values 118
Figure 5.9: Quartile 3 Enhancement Factor Values 119
Figure 5.10: Quartile 4 Enhancement Factor Values 120
Figure 6.1: ONECUT2 and HNF6 Protein Alignment 132
Abstract
Clinically distinct subtypes of Diffuse Large B Cell Lymphoma (DLBCL)
have gene expression profiles that reflect their origins from specific stages of B-cell
maturation. We conducted epigenetic analyses to evaluate the DNA methylation
status of CpG islands in germinal center B-cell-like (GCB) and activated B-cell-
like (ABC) DLBCL subtypes. Using two different platforms, we uncovered gene-
associated CpG islands whose DNA methylation levels varied among DLBCL. Of
these, the methylation levels of CpG islands proximal to ONECUT2 and FLJ21062
(HIC3) correlated with subtype identity. Interestingly, ONECUT2 is involved in
regulating TGF-beta signaling pathways crucial for B cell maturation. In contrast
to expectations based on the two-hit hypothesis, ONECUT2 resides on a frequently
amplified, instead of deleted, genomic segment in DLBCL. This novel observation
may reflect a mechanism for silencing potential tumor suppressor genes present in
large, amplified genomic regions. Overall, these results suggest that DNA
methylation may prove to be valuable for the identification and early detection of
cancers derived from closely related cell lineages.
CHAPTER 1: Introduction
Diffuse Large B cell Lymphoma
Non-Hodgkin’s lymphomas (NHL) are a collection of heterogeneous
malignancies of the lymphoid cells that together represent the second fastest
growing cancer in the United States and the sixth leading cause of cancer death
(Fisher 2003). In
fact, the incidence of NHL increased at a rate of 3-4% each year from 1973 to the
mid-1990s (Fisher and Fisher 2004). Some of the increase seen over the past
quarter century may be explained in part by improvements in diagnosis and
reporting, changes in lymphoma classification, and an increase in
AIDS-associated lymphoma. However, the source of approximately 50% of this
increase is still unknown.
The majority of NHL originate from B-cells in various stages of
differentiation (Harris, Stein et al. 2001). Diffuse large B-cell lymphoma (DLBCL)
represent the largest fraction of NHL, accounting for approximately one-third of all
cases (Chan 2001; Evans and Hancock 2003; Lossos and Morgensztern 2006).
DLBCL is an aggressive disease of the mature B-lymphocyte (Alizadeh, Eisen et
al. 2000), and is believed to derive from B-cells of the germinal center or at a
subsequent developmental stage (Alizadeh, Eisen et al. 2000). Evidence for this
lies in the fact that DLBCL often harbor genetic rearrangements of
immunoglobulin gene loci and somatic hypermutations characteristic of the
immunoglobulin diversification that takes place in the germinal center of secondary
lymph tissue (Alizadeh, Eisen et al. 2000; Lossos 2005; Chen, Houldsworth et al.
2006). While advances have been made in the treatment of DLBCL patients, 50-
70% of these individuals ultimately die of the disease (Chan 2001; Evans and
Hancock 2003; Fisher 2003; Lossos 2005; Chen, Houldsworth et al. 2006). The
highly variable overall survival rates of DLBCL patients, as well as their
unpredictable clinical responses to treatment, underscore the clinical heterogeneity
of DLBCL (Alizadeh, Eisen et al. 2000; Lossos and Levy 2003; Moller, Pedersen
et al. 2003).
It is well established that DLBCL vary widely in their
histological, immunophenotypic, cytogenetic and molecular genetic features
(Alizadeh, Eisen et al. 2000; Lossos and Levy 2003). The most widely referenced
lymphoma classification system, the Revised European-American Lymphoma
(REAL) classification scheme, acknowledged this heterogeneity within DLBCL and
suggested that this class of lymphoma was likely to include more than one disease
entity (Harris, Jaffe et al. 1994). However, DLBCL were ultimately grouped into a
single category, owing to limitations in the classification process (Harris, Jaffe et
al. 1994; Alizadeh, Eisen et al. 2000). In the decade since the REAL classification
scheme was introduced, much progress has been made in dissecting the distinct
entities within DLBCL (Alizadeh, Eisen et al. 2000; Rosenwald, Wright et al.
2002; Shipp, Ross et al. 2002; Lossos 2005). In a particularly noteworthy effort, on
which much of this dissertation is based, Alizadeh and colleagues were able to
successfully sub-classify DLBCL into distinct subtypes on the basis of differential
gene expression profiles (Alizadeh, Eisen et al. 2000). Using custom cDNA
microarrays designed to assay the expression of genes that were either known or
suspected of being important in immunology or cancer, they analyzed 42 DLBCL
and compared their expression profiles with those of normal lymphocyte
subpopulations as well as with leukemia and lymphoma cell cultures (Alizadeh,
Eisen et al. 2000). The investigators demonstrated that, unlike other B-cell
malignancies examined (follicular lymphoma and chronic lymphocytic leukemia),
DLBCL tended to express to a greater degree genes involved in cellular
proliferation, such as cell-cycle control and checkpoint genes (e.g. CDKN2A,
CCNB1, and CDC20), as well as those important in DNA synthesis and replication.
Beyond this, using hierarchical clustering methods, they determined that DLBCL
could be sub-divided into at least two distinct subtypes. The subtypes could be
distinguished from one another based on gene expression signatures they had in
common with either normal B-cells of the germinal center or B-cells that had been
in vitro-activated using anti-IgM antibody, IL-4 and/or CD40 ligand-containing
membranes. The commonalities in expression led the authors to define the
identified DLBCL subtypes in this study as either activated B-cell-like
(ABC-DLBCL) or germinal center B-cell-like (GCB-DLBCL). (A third sub-group,
Subtype 3, was also identified by Alizadeh et al.; however, the discussion will
be limited to the ABC-DLBCL and GCB-DLBCL subtypes.) In addition, they
suggested that the gene expression patterns shared between these two DLBCL
subtypes and either normal B-cells of the germinal center or in vitro-activated
peripheral blood B-cells indicate that each subtype derives from a
particular stage in the development of a normal B-cell. Importantly, the authors
also demonstrated that these two DLBCL subtypes may prove useful in predicting a
patient’s response to treatment. For example, patients with the ABC-DLBCL and
GCB-DLBCL subtypes were shown to have statistically significant differences in
overall survival, with GCB-DLBCL patients having a better overall survival than
ABC-DLBCL patients. In total, 76% of GCB-DLBCL patients were alive five
years after treatment. This is in contrast to those patients with ABC-DLBCL, of
whom only 16% survived more than five years.
In the years since the ABC-DLBCL and GCB-DLBCL subtypes were
identified, other studies have added to the list of their distinguishing characteristics
(Alizadeh, Eisen et al. 2000; Lossos, Alizadeh et al. 2000; Huang, Sanger et al.
2002; Rosenwald, Wright et al. 2002; Barrans, Evans et al. 2003; Hans,
Weisenburger et al. 2004; Iqbal, Sanger et al. 2004; Bea, Zettl et al. 2005; Tagawa,
Suguro et al. 2005; Chen, Houldsworth et al. 2006). Noteworthy examples include
the constitutive activation of NF-κB in the ABC-DLBCL subtype (Davis, Brown et
al. 2001; Lossos 2005), the occurrence of BCL2 translocations in the GCB-DLBCL
subtype (Huang, Sanger et al. 2002), REL amplification in the GCB-DLBCL
subtype (Houldsworth, Olshen et al. 2004), and the contrasting response that each
subtype displays to IL-4 stimulation (Lossos 2005; Lu, Nechushtan et al. 2005).
Taken together, these studies, and others, strongly indicate that DLBCL may be
categorized into distinct, clinically relevant, molecular entities. However, while
efforts to subclassify DLBCL have met with considerable success, clinical outcome
remains quite variable among patients with DLBCL (Alizadeh, Eisen et al. 2000;
Rosenwald, Wright et al. 2002; Lossos 2005). This suggests that new
approaches may be needed to fully elucidate the diversity that lies within DLBCL.
DNA Methylation
For much of the past decade, the sequencing of the human genome has been
a major focal point in the biomedical research community. Now, with the genome
sequence in hand (Antequera and Bird 1993; Lander, Linton et al. 2001), the
community has begun, in earnest, to describe the variation within the genome.
Evidence of this process exists in the announcement of the completion of the first
map of common human haplotypes (HapMap) (Altshuler, Brooks et al. 2005), and
the concerted initiative of the Cancer Genome Project (Kaiser 2005). In keeping
with this trend, many laboratories have set out to document heritable modifications
that are not coded for in the nucleotide sequence, a field referred to as epigenetics
(Issa and Baylin 1996; Laird and Jaenisch 1996; Jones and Laird 1999; Laird
2005). In fact, our increasing awareness of the importance of epigenetics has led to
calls for an “Epigenome Project” (Eckhardt, Beck et al. 2004; Jones and
Martienssen 2005; Rauscher 2005; Esteller 2006; Garber 2006).
To date, the most widely studied epigenetic event is DNA methylation. The
enzymatic addition of a methyl group to DNA plays a vital role in a diverse
collection of biological processes that range from host defense mechanisms in
prokaryotes to transcriptional control in eukaryotes (Laird and Jaenisch 1996). In
humans, as well as in other mammals, DNA methylation primarily occurs at the 5-
position of cytosine in the context of the CpG dinucleotide, whereby cytosine is 5’-
adjacent to a guanine nucleotide (5’- CpG - 3’). DNA methylation is a dynamic
process and the methylation of any particular cytosine within a CpG dinucleotide
pair is dependent on a number of factors, including: developmental stage (Ehrlich
2003), tissue (Futscher, Oshiro et al. 2002) and sequence context (Feltus, Lee et al.
2006).
CpG dinucleotides are under-represented in the mammalian genome, having
an observed frequency that is approximately one-fifth of what would be expected
(Antequera and Bird 1993; Lander, Linton et al. 2001). This under-representation
is believed to be due to the loss of methyl-cytosines over time through chemical
processes (Lander, Linton et al. 2001). Briefly, 5-methyl-cytosines that undergo
deamination, when left un-repaired, result in a cytosine to thymine transition
(Lander, Linton et al. 2001). Despite the global loss of CpG dinucleotides in the
human genome, large tracts of CpG-rich sequences, referred to as CpG islands
(commonly defined as a >500-bp DNA segment with a G+C content ≥ 55% and
observed CpG/expected CpG ≥ 0.65 (Takai and Jones 2002)), still remain.
Approximately one-half of all the genes in the human genome have associated CpG
islands (Antequera and Bird 1993; Jones and Baylin 2002). Furthermore, the
methylation of a CpG island is associated with the transcriptional silencing of
neighboring genes (Jones 1999; Robertson and Jones 2000; Jones and Baylin 2002;
Feinberg and Tycko 2004; Laird 2005; Robertson 2005; Klose and Bird 2006).
While the mechanism has yet to be fully elucidated, evidence suggests that this
silencing is the product of a number of complex interactions that involve changes in
chromatin structure, as well as modulating the access that transcriptional activators
and repressors have to their target DNA sequences (Robertson 2005; Klose and
Bird 2006).
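The CpG island definition quoted above (length >500 bp, G+C content ≥ 55%, observed/expected CpG ≥ 0.65) can be illustrated with a minimal sketch. The function names are hypothetical, and the expected CpG count is assumed here to be (number of C × number of G) / sequence length, a common convention that the text itself does not spell out.

```python
def cpg_stats(seq):
    """Return (length, G+C fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    observed_cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed convention: expected CpG = (#C * #G) / length
    expected_cpg = (c * g) / n if n else 0.0
    obs_exp = observed_cpg / expected_cpg if expected_cpg else 0.0
    return n, gc_fraction, obs_exp


def is_cpg_island(seq, min_len=500, min_gc=0.55, min_obs_exp=0.65):
    """Apply the three thresholds quoted in the text to a single sequence."""
    n, gc_fraction, obs_exp = cpg_stats(seq)
    return n > min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp


if __name__ == "__main__":
    toy_island = "CG" * 300      # 600 bp, artificially CpG-rich toy sequence
    toy_at_rich = "AT" * 300     # 600 bp, no G or C at all
    print(cpg_stats(toy_island), is_cpg_island(toy_island))
    print(cpg_stats(toy_at_rich), is_cpg_island(toy_at_rich))
```

In practice these criteria are applied with a sliding window across genomic sequence; the sketch only evaluates one candidate segment at a time.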
DNA Methylation and Cancer
DNA methylation is known to play an important role in the development
and progression of many different types of cancer (Laird and Jaenisch 1996; Jones
and Laird 1999; Robertson and Jones 2000; Jones and Baylin 2002; Laird 2003;
Laird 2005; Robertson 2005). In fact, the loss of tumor-suppressor gene activity as a
result of DNA methylation is at least as common as that resulting from mutation or
deletion (Baylin 2005). Since Costello and colleagues first described the aberrant
DNA methylation of cancer as being non-random and tumor-type specific
(Costello, Fruhwald et al. 2000), many efforts have focused on utilizing DNA
methylation signatures to further classify cancer. Importantly, early evidence
suggests that tumor classification schemes based on DNA methylation may prove
to be at least as useful as those established on the basis of gene expression
(Adorjan, Distler et al. 2002; Virmani, Tsou et al. 2002; Wei, Chen et al. 2002;
Laird 2003). In fact, DNA methylation-based classification schemes may offer
distinct advantages over similar schemes grounded in gene expression profiles. For
example, it is generally easier to isolate intact DNA than RNA from a variety of
different tumor preparations (Laird 2003). Aberrant hypermethylation events also
make for an attractive target in disease classification because the increase in
methylation is easily detectable in a background of unmethylated sequences using
PCR-based methods (Laird 2003). This is desirable when faced with screening
complex mixtures of normal cells and rare cancerous ones.
While efforts to sub-classify DLBCL on the basis of DNA methylation
patterns have not been reported, attempts to define subtypes within other tumors
using DNA methylation have shown some promise and demonstrated the feasibility
of such an endeavor (Uhlmann, Rohde et al. 2003; Widschwendter, Siegmund et al.
2004; Tsou, Shen et al. 2005; Woodson, Weisenberger et al. 2005). In one such
effort, Wei and colleagues were able to sub-categorize a cohort of late-stage (stage
III and IV) ovarian carcinomas into two distinct subtypes on the basis of DNA
methylation profiles (Wei, Chen et al. 2002). In another study, Virmani and
colleagues were able to distinguish between different histological subtypes of lung
cancer based on differences in DNA methylation (Virmani, Tsou et al. 2002).
Given the fact that subtypes of DLBCL have clinically distinct gene expression
profiles, as well as other molecular characteristics, it stands to reason that DLBCL
may also differ with respect to DNA methylation. Furthermore, noting the recent
successes in using DNA methylation profiles to identify subtypes of cancer, it is
plausible that DNA methylation could be used to differentiate between DLBCL
subtypes.
CHAPTER 2: CpG Island Methylation Microarray Analysis
As discussed previously, Diffuse Large B Cell Lymphoma (DLBCL) have
been classified into at least two distinct groups on the basis of gene expression
analysis: germinal center B-cell-like (GCB) and activated B-cell-like (ABC)
DLBCL (Alizadeh, Eisen et al. 2000). GCB-DLBCL have gene expression
signatures that resemble those of normal B-cells of the germinal center. Likewise,
ABC-DLBCL display expression patterns similar to in vitro-activated B-cells
(Alizadeh, Eisen et al. 2000). Because GCB-DLBCL and ABC-DLBCL have
distinct genetic characteristics that appear to be related to their cellular origins
(Alizadeh, Eisen et al. 2000; Lossos 2005), we hypothesized that there may also
exist DNA methylation events that are unique to each of these two subtypes.
To begin to test our hypothesis, we obtained DNA from a sub-set of those
lymphoma samples that were originally analyzed by gene expression and used to
define the GCB-DLBCL and ABC-DLBCL subgroups. The blinded samples were
kindly provided by our collaborator, Dr. Timothy Greiner at the University of
Nebraska Medical Center. As an initial survey of DNA methylation in these
DLBCL, we adapted a protocol developed at the University of Arizona in the
laboratory of Dr. Bernard Futscher. This procedure employs the use of CpG island
microarrays to analyze the methylation dependent cleavage of genomic DNA by
the enzyme McrBC (Figure 2.1). Briefly, sample DNA is digested to completion
with the restriction endonuclease MseI, which cleaves the genomic DNA into
fragments that are approximately 500-700 bp long. Because its recognition site
(5’-TTAA-3’) is devoid of cytosine and guanine, CpG islands are generally left intact.
Double-stranded DNA linkers complementary to the ends left by the MseI digest
are next ligated to the fragments, tagging every MseI fragment with a defined
segment of DNA. The linker-ligated sample is then divided into two equal
aliquots. One half (test) is digested with the methylation-dependent McrBC
endonuclease (5’ … Pu^mC [N 400-3,000] Pu^mC … 3’, where ^mC denotes a
methylated cytosine) while the other half (reference)
is left untreated. Any MseI fragments in the test fraction that are cut by the McrBC
endonuclease contained at least two methylated CpG dinucleotides in the original
sample. Afterwards, the test and reference fractions are separately subjected to
PCR amplification using linker-specific primers. Only those fragments in the test
fraction with the unmethylated recognition sequence for McrBC will remain intact
after the digestion and thus amplify by PCR. After the amplification, the resulting
test and reference fractions are random-prime labeled with Cy3- or Cy5-dUTP,
respectively. The labeled test (Cy3) and reference (Cy5) fractions are then
subjected to hybridization analysis on CpG island microarrays, which were composed
of more than 5,000 PCR-amplified clone inserts from the human CpG island library
constructed by Cross et al. (Cross, Charlton et al. 1994). DNA methylation in the
original DNA sample results in a loss of hybridization signal at a given location on
the microarray in the test fraction relative to the reference fraction. Therefore, a
(Test (Cy3) / Reference (Cy5)) hybridization signal ratio equal to one would be
expected for an unmethylated CpG island on the microarray. Conversely, a
hybridization ratio (Test/Reference) of zero would be indicative of a fully
methylated CpG island.
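The interpretation of the Test/Reference ratio described above can be sketched in a few lines. The text fixes only the two endpoints (a ratio of one for an unmethylated CpG island, zero for a fully methylated one); the linear mapping between them, the clipping of ratios above one, and the function names are illustrative assumptions rather than the analysis actually used in this work.

```python
def hybridization_ratio(cy3_test, cy5_reference):
    """Test (Cy3, McrBC-digested) over Reference (Cy5, mock-treated) signal."""
    if cy5_reference <= 0:
        raise ValueError("reference signal must be positive")
    return cy3_test / cy5_reference


def methylation_estimate(ratio):
    """Rough methylation fraction: ratio 1 -> 0.0, ratio 0 -> 1.0.

    Assumption: linear interpolation between the two endpoints given in the
    text, with ratios outside [0, 1] clipped to that range.
    """
    clipped = min(max(ratio, 0.0), 1.0)
    return 1.0 - clipped


if __name__ == "__main__":
    # Hypothetical spot intensities, for illustration only
    for cy3, cy5 in [(1000.0, 1000.0), (250.0, 1000.0), (0.0, 1000.0)]:
        r = hybridization_ratio(cy3, cy5)
        print(f"Cy3/Cy5 = {r:.2f}, estimated methylation = {methylation_estimate(r):.2f}")
```

Real microarray data would first need background correction and normalization before ratios of this kind are meaningful.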
Using this protocol, we were able to analyze nearly 592 non-repetitive,
gene-associated CpG islands in 15 different DLBCL samples. Of these CpG
islands, we identified candidate regions in the genome that showed evidence of
differential DNA methylation between the two subtypes of DLBCL. Genes
proximal to these sites of differential methylation included the transcription factor
ONECUT2 (Clotman, Jacquemin et al. 2005) and the molecular chaperone PFDN5,
which is also a candidate tumor suppressor in lymphoma (Fujioka, Taira et al.
2001). In addition, GNMT, a methyltransferase that is involved in the
detoxification pathway in liver and, interestingly, is important in the regulation of
S-adenosylmethionine, the methyl donor for DNA methylation (Chen, Lin et al. 2004),
was also differentially methylated.
[Figure 2.1 flowchart: Genomic DNA; MseI Digest; Linker Ligation; McrBC (Test)
or Mock (Reference); PCR and Label; Hybridize (Cy3 and Cy5)]
Figure 2.1: DNA Methylation Microarray Sample Preparation. As described above,
each sample was processed and labeled for hybridization. Briefly, genomic DNA
was digested with MseI and double-stranded linkers were ligated to the ends of
each fragment. The fragment pool was then divided equally into Test and
Reference fractions. The Test fraction was digested with McrBC and the
Reference fraction was left untreated (Mock). After treatment, the two
fractions were independently amplified by PCR using linker specific primers and
the resultant PCR products were labeled with Cy3 (McrBC treated) and Cy5
(McrBC untreated). The Cy3 and Cy5 products were then co-hybridized to a CpG
island microarray. The Cy3/Cy5 ratios were then calculated as a measure of DNA
methylation.
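The first step of this workflow, the MseI digest, can be mimicked in silico to see why G+C-rich stretches tend to survive intact. The sketch below is illustrative only: the function name is hypothetical, the toy sequence is invented, and the cut position (after the first T of 5’-TTAA-3’) is taken from the enzyme's commonly documented cleavage pattern rather than from this text, which gives only the recognition site.

```python
def mse1_fragments(seq, site="TTAA", cut_offset=1):
    """Split a DNA string at every MseI site.

    Assumption: the enzyme cuts after the first base of the site (T^TAA),
    hence cut_offset=1. Only the top strand is modeled.
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return [fragment for fragment in fragments if fragment]


if __name__ == "__main__":
    toy = "GCGCTTAACGCGCGTTAAGG"  # invented sequence with two TTAA sites
    for fragment in mse1_fragments(toy):
        print(len(fragment), fragment)
```

Because the recognition site contains no C or G, a sequence made up only of C and G is returned as a single uncut fragment.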
MATERIALS AND METHODS:
Clinical DLBCL Samples
DNA from frozen nodal and extra-nodal tumor biopsies, acquired from
DLBCL patients prior to treatment, was obtained from the University of
Nebraska Medical Center. Clinical information, such as clinical presentation,
treatment records, and follow-up, is also available for each of these samples. All
patients were treated with similar chemotherapeutic regimens. In all cases, HIPAA
and NIH guidelines were followed.
CpG island Microarrays:
An aliquot of the human CpG island library described by Cross et al.
(Cross, Charlton et al. 1994) was purchased from Geneservice Ltd. (Cambridge,
UK) by Dr. Bernard Futscher at the University of Arizona. From this aliquot the
Futscher laboratory arrayed and sequenced more than 5,000 clones. Dr. Futscher
kindly provided a copy of the arrayed and sequenced library to our laboratory. The
clones were stored at –80 °C in LB ampicillin (50 ug/ml) media plus 10% glycerol
in sealed 96-well trays. To make a copy of this clone library, each bacterial plate
was thawed at room temperature for 30-45 minutes, and the bacterial clones were
collected at the bottom of each well by centrifugation at 2,000 rpm for 2 minutes.
Using a 96-pin replicating tool (V & P Scientific, Inc.), 1-2 ul of bacterial culture
was transferred from the thawed and pelleted bacterial stock trays to a new 96-well,
U-bottom tray containing 100 ul of LB ampicillin (50 ug/ml) media. After
inoculating each new tray, the replicating tool was thoroughly cleaned by
submerging the pins of the tool in 10% bleach, water, and ethanol prior to flaming.
The newly inoculated trays were then sealed with gas-permeable seals and grown at
37 °C overnight on a tabletop rotating shaker at 225 rpm. After the overnight
incubation, 100 ul of LB ampicillin (50 ug/ml) plus 20% glycerol was added to
each well, and the tray was sealed with a foil plate sealer (Abgene, Inc.) and frozen
at –80°C.
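As a quick check on the freezing step, mixing 100 ul of overnight culture with 100 ul of medium containing 20% glycerol should give a final glycerol concentration of 10%, matching the storage conditions of the original library. The helper below is a hypothetical sketch that assumes the overnight culture itself contains no glycerol and ignores the 1-2 ul inoculum and any evaporation.

```python
def mixed_percent(volume_a_ul, percent_a, volume_b_ul, percent_b):
    """Final percentage of a component after combining two solutions."""
    total_volume = volume_a_ul + volume_b_ul
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return (volume_a_ul * percent_a + volume_b_ul * percent_b) / total_volume


if __name__ == "__main__":
    # 100 ul culture (assumed 0% glycerol) + 100 ul LB ampicillin with 20% glycerol
    final_glycerol = mixed_percent(100.0, 0.0, 100.0, 20.0)
    print(f"final glycerol: {final_glycerol:.1f}%")
```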
In preparation for the printing of the CpG island microarrays, the insert of
each CpG island clone in the library copied above was first amplified by PCR. To
do this, a total of 50 ul of PCR master mix, containing final concentrations of 1X
PCR Buffer (Applied Biosystems Inc.), 4 mM MgCl2, 0.5 mM of each dNTP, 0.8
mM of each vector specific primer (Forward: 5’-
CGGCCGCCTGCAGGTCGACCTTAA-3’ and Reverse: 5’-
AACGCGTTGGGAGCTCTCCCTTAA-3’), and 2 units of AmpliTaq Gold
Polymerase (Applied Biosystems, Inc.), was added to each well of a 96-well,
skirted PCR tray (Abgene, Inc.). After the master mix was added, ~2 ul bacterial
stock from each library plate was transferred to the 96-well PCR tray containing the
master mix using a 96-pin tool (V & P Scientific, Inc). Between each transfer of
bacterial template to the PCR master mix, the 96-pin tool was cleaned in 10%
bleach. Next, the 96-pin tool was washed in water with vigorous
agitation. The tool was then transferred to a second water
bath and the water wash was repeated. Following the second water
wash, the tool was transferred to a fourth and final wash in a 100% ethanol bath.
After the ethanol bath, the tool was flamed. Once the flame was completely
extinguished, the ethanol dip and flaming were repeated.
Following the addition of the bacterial template to the PCR master mix, an
adhesive PCR Foil Seal (Abgene, Inc.) was applied to each PCR tray and the PCR
reaction was carried out as follows: 10 minutes at 98 ºC, 45 cycles of 95 ºC for 1
minute, 55 ºC for 1 minute, and 72 ºC for 3 minutes, and a final extension step at
72 ºC for 5 minutes.
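The cycling program above can be written down as a small data structure, which also makes it easy to estimate how long a plate occupies the thermal cycler. This is a sketch with hypothetical names; it sums programmed hold times only and ignores temperature ramping, which depends on the instrument.

```python
# (temperature in degrees C, hold time in minutes), as stated in the text
INITIAL_STEP = (98, 10)
CYCLE_STEPS = [(95, 1), (55, 1), (72, 3)]
CYCLES = 45
FINAL_STEP = (72, 5)


def total_hold_minutes(initial, cycle_steps, cycles, final):
    """Sum of programmed hold times, excluding ramp time between temperatures."""
    per_cycle = sum(minutes for _temperature, minutes in cycle_steps)
    return initial[1] + cycles * per_cycle + final[1]


if __name__ == "__main__":
    minutes = total_hold_minutes(INITIAL_STEP, CYCLE_STEPS, CYCLES, FINAL_STEP)
    print(f"programmed hold time: {minutes} min ({minutes / 60:.1f} h)")
```

With the values given, the hold times alone add up to 240 minutes, so each plate takes at least four hours to amplify.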
After cycling, 3 ul of each PCR product was mixed with 17 ul of water and
visualized on a 1% agarose 96-well E-Gel (Invitrogen, Inc.) (Fig. 2.2). Each PCR
product was manually scored as being positive, negative, or having multiple
products. The remaining PCR product was cleaned using the MinElute 96 UF-PCR
Purification Kit (Qiagen, Inc.) as follows: PCR product was transferred to each
well of the PCR purification tray. Once all of the PCR products were transferred to
the purification tray, it was placed on a tray-specific vacuum manifold and a
vacuum was applied for 10 minutes to draw the PCR volume through the tray to
bind the PCR product to the purification filter. After the 10 minute vacuum step,
45 ul of water was applied to each well and the purification tray was vortexed
gently for 2 minutes. Afterwards, the product in each well was resuspended by
pipetting the 45 ul volume up and down 20 times and transferring it to a new,
skirted PCR tray (Abgene Inc.). Each tray containing the cleaned PCR products
was sealed with a new foil seal and frozen at –80ºC. Once all of the PCR products
were purified, each plate was thawed at room temperature and centrifuged briefly
to collect the sample at the bottom of the tube before removing the foil seal. The
thawed plates were then placed in the tissue culture hood where the samples were
allowed to dry completely. This process took approximately 24 hours. Each
lyophilized PCR product was resuspended in 15 ul of 3X SSC, sealed with an
aluminum foil plate sealer and frozen at –80 ºC until the night before it was to be
printed on the microarray. The day before a plate was to be printed it was placed at
4 ºC where it thawed overnight. Each PCR product was printed on Epoxy-coated
microscope slides (Corning, Inc.) in the Spotted Microarray Facility at the Institute
for Genetic Medicine at the University of Southern California using a RoboArrayer
(RoboDesign, Inc.). Following printing, the microarrays were incubated at 42 ºC in
a humidified chamber (45% humidity) for 8 hours. Following incubation, the
microarrays were washed in 0.2% SDS with vigorous agitation for 2 minutes at room
temperature, and twice in water with vigorous agitation at room temperature for 1
minute. Finally, the arrays were incubated for 20 minutes in water at 50 ºC prior to
being dried by centrifugation.
Figure 2.2: PCR Product Visualization
Figure 2.2: A 3 ul volume of each 50 ul PCR reaction was subjected
to gel electrophoresis. From this, the quality of each PCR product was
scored as being positive, negative or having more than one product.
Sample Preparation:
Samples were prepared for hybridization to the printed CpG island
microarrays by first digesting 1 ug of genomic DNA with the methylation-insensitive
restriction endonuclease MseI. To do this, a reaction mixture was made
containing 1 ug of genomic DNA with 5 ul of 10X Buffer #2 (New England
Biolabs), 0.5 ul of 100X BSA (New England Biolabs), and 10 units of MseI
enzyme (New England Biolabs) in a total volume of 50 ul. This restriction
digest was then incubated at 37 ºC for 6 hours, and heat inactivated at 65 ºC for 20
minutes. Following the heat inactivation, each MseI digested sample was cleaned
on a MinElute PCR Purification column (Qiagen Inc.). Briefly, 250 ul of PB
Buffer (Qiagen, Inc.) was added to each 50 ul reaction and passed through the
column by centrifugation. The column was washed once with 750 ul of PE Buffer
(Qiagen Inc.). Following the wash, the column was spun dry and the product was
eluted from the column twice with 15 ul of EB Buffer (Qiagen Inc.) and the
concentration was quantified using a Nanodrop spectrophotometer (Nanodrop
Technologies).
Double-stranded DNA linkers were next ligated to the ends created by the
MseI digest described above. First, the double stranded DNA linkers were made by
annealing the following two oligonucleotides: Linker Oligo #1
5’-TAAGTACTGCACCAGCAAATCC-3’ and Linker Oligo #2
5’-GGATTTGCTGGTGCAGTACT-3’ (Fig. 2.3). Each oligonucleotide was
resuspended in water to a concentration of 50 pmol/ul. Next, 50 ul aliquots of
the two oligonucleotides were mixed together and lyophilized to dryness. The
lyophilized oligonucleotides were then resuspended in a total volume of 100 ul of
TE buffer (pH 8.0) with 33 mM NaCl. After resuspension, the oligonucleotide mix
was heated to 95 ºC for 2 minutes and transferred to a 70 ºC heat block. The
heat block was then turned off and the oligonucleotide
mixture was allowed to cool overnight. The resulting annealed linker was at a
concentration of 25 pmol/ul and stored in 20 ul aliquots at –70 ºC.
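The linker design and the stated concentration can be checked with a short sketch. The two sequences and the volumes are taken from the text above; the function names are illustrative only. MseI cuts T^TAA and leaves a 5'-TA overhang, so the annealed linker should expose a matching two-base 5' overhang on Oligo #1.

```python
# Linker oligonucleotides as given in the text (written 5' to 3').
OLIGO_1 = "TAAGTACTGCACCAGCAAATCC"
OLIGO_2 = "GGATTTGCTGGTGCAGTACT"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def five_prime_overhang(long_oligo, short_oligo):
    """Return the unpaired 5' bases of long_oligo when short_oligo anneals
    flush with its 3' end, or None if the oligos do not pair that way."""
    paired = reverse_complement(short_oligo)
    if not long_oligo.endswith(paired):
        return None
    return long_oligo[: len(long_oligo) - len(paired)]


overhang = five_prime_overhang(OLIGO_1, OLIGO_2)
print("5' overhang on Oligo #1:", overhang)

# Annealed linker concentration: 50 ul of each oligo at 50 pmol/ul,
# dried down together and resuspended in 100 ul.
linker_pmol_per_ul = (50 * 50) / 100
print("Annealed linker (pmol/ul):", linker_pmol_per_ul)
```

The overhang returned is the dinucleotide that pairs with the MseI-generated end, and the concentration agrees with the 25 pmol/ul stated for the annealed linker.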
To ligate the double-stranded DNA linker to the ends created by the original
MseI digest, the following ligation was performed: 150 ng of MseI-digested
genomic DNA, 25 pmol of annealed linker, 1X ligase buffer (New England
Biolabs), and 0.5 ul of T4 DNA Ligase (400 u/ul) (New England Biolabs)
in a total volume of 10 ul. The mixture was next incubated at 16 ºC for 12 hours.
This results in a linker-ligated genomic product that is at a concentration of
approximately 15 ng/ul.
Figure 2.3: Double-Stranded DNA Linker
Figure 2.3: Oligos #1 and #2 (above) are annealed to form a
double stranded linker product. The resultant linker has a
two-nucleotide overhang complementary to the end left by
the restriction endonuclease MseI.
To differentiate between methylated and unmethylated DNA fragments in
the original sample, samples were digested with the methylation-dependent
restriction endonuclease McrBC (5’-Pu^mC(N_{40-3000})Pu^mC-3’) as follows: 4 ul
(15 ng/ul) of linker-ligated genomic DNA was combined with 1X NEB Buffer #2
(New England Biolabs), 1X BSA (New England Biolabs), 0.4 ul of 100X
GTP (New England Biolabs), and 0.24 ul of McrBC (10 units/ul) (New
England Biolabs), in a volume of 10 ul. In a separate reaction, a mock digest
was set up on the same sample as a control. The control reaction was identical to
the McrBC digest above, minus the GTP. Because McrBC is a GTP-dependent
enzyme, the methylated fragments remain intact in the mock reaction. Both the
mock and the McrBC reactions were incubated at 37 ºC for 6 hours and the enzyme
was heat inactivated at 65 ºC for 20 minutes. The final concentration of genomic
DNA in each of the reactions (McrBC and Mock) was approximately 6 ng/ul. The
completion of the McrBC digest may be monitored by setting up separate McrBC
digest and mock reactions on methylated plasmid and visualizing the reaction
products by gel electrophoresis.
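The DNA concentrations quoted through the ligation and digestion steps follow from simple dilution arithmetic. The sketch below traces them using only the masses and volumes stated in the text; the helper name is illustrative.

```python
def concentration(mass_ng, volume_ul):
    """Concentration in ng/ul for a given mass and reaction volume."""
    return mass_ng / volume_ul


# Ligation: 150 ng of MseI-digested DNA in a 10 ul reaction.
ligation_ng_per_ul = concentration(150, 10)

# McrBC or mock digest: 4 ul of the ligation carried into a 10 ul reaction.
digest_input_ng = 4 * ligation_ng_per_ul
digest_ng_per_ul = concentration(digest_input_ng, 10)

# Linker-mediated PCR: 4 ul of the digest used as template.
pcr_template_ng = 4 * digest_ng_per_ul

print(ligation_ng_per_ul, digest_ng_per_ul, pcr_template_ng)
```

These values correspond to the approximately 15 ng/ul and 6 ng/ul concentrations given above and to the 24 ng of template used in the subsequent PCR.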
Following the McrBC and mock digests, the resultant products were
amplified by PCR using an oligonucleotide primer complementary to the linkers
previously added to each genomic fragment. An investigation into the
amplification efficiencies of GC-rich sequences by various polymerases indicated
that the GC-Rich PCR system by Roche (Roche Inc.) improved the amplification of
inefficiently amplified sequences when compared to other enzymes (See Chapter
5). Given these results, each of the linker-mediated PCR reactions was
conducted using the GC-Rich PCR system. To do this, 24 ng of template (4 ul of
McrBC-cut or mock-digested product) was combined with 100 pmol of
linker-specific primer (Oligo #1 in Fig. 2.3), a 200 uM concentration of each
dNTP, 5 ul of GC-Rich Resolution Solution, 10 ul of 5X
GC Rich Buffer, and 1.5 ul of GC-Rich PCR Enzyme Mix in a total volume of 50
ul.
To minimize PCR bias, the product was amplified using a minimum
number of PCR cycles (an initial cycle of 72 ºC for 15 minutes and 95 ºC for 3
minutes followed by 22 cycles of 95 ºC for 2 minutes, 55 ºC for 1 minute and 72 ºC
for 3 minutes). Following PCR amplification, each reaction was cleaned on a
Qiagen MinElute PCR Purification column. Briefly, 250 ul of PB Buffer (Qiagen,
Inc.) was added to each reaction and passed through the column. The column was
washed once with 750 ul of PE Buffer (Qiagen Inc.). Following the wash, the
column was spun dry and the PCR product was eluted from the column with 15 ul
of EB Buffer (Qiagen, Inc.).
Following the PCR cleanup outlined above, the PCR product was quantified
by a spectrophotometer. A total of 1 ug of PCR product was used in an
aminoallyl-dUTP incorporation reaction using the BioPrime Array CGH Kit (Invitrogen).
First, the PCR product was combined with 20 ul of 2.5X Random Primer Mix in a
41 ul volume. The PCR Product/Random Primer Mix was then denatured for 5
minutes at 95 ºC and chilled on ice for 2 minutes. The denatured mix was
increased to a volume of 50 ul with 5 ul of dUTP Mix, 1.8 ul of 10 mM 5-(3-
aminoallyl)-dUTP (Ambion), 1 ul of Klenow Mix from the BioPrime Array CGH
Kit, and water. The aminoallyl-dUTP incorporation reaction was incubated at 37 ºC
for 2 hours and inactivated at 65 ºC for 20 minutes. Following inactivation, each
reaction was cleaned on a Qiagen MinElute PCR Purification column as before,
and eluted from the MinElute column with 15 ul of Dye Conjugation Buffer (0.5 M
sodium bicarbonate, pH 9.0). The aminoallyl-labeled product was conjugated with
dye immediately (see below) or frozen at –20 °C for later use.
For dye conjugation, the entire aminoallyl-labeled product was combined
with 150 ug of lyophilized Cy3 or Cy5-NHS Ester dye (GE Healthcare) and
incubated in the dark for 2 hours at room temperature. Following incubation, each
dye-conjugated product was cleaned on a Qiagen MinElute PCR Purification
column as outlined below. Briefly, 250 ul of PB Buffer was added to each reaction
and passed through the column. The column was washed once with 750 ul of PE
Buffer and twice with 750 ul of 75% EtOH. Following the wash steps, the column
was spun dry and the product was eluted twice with 15 ul of EB Buffer. After
the clean up, each dye-labeled product was analyzed by spectrophotometer to
determine the efficiency of the dye conjugation.
Hybridization:
Each microarray hybridization contained 2.5 ug of Cy3-labeled McrBC
digested and 2.5 ug of Cy5-labeled mock-digested products. These two dye-labeled
products were combined with 50 ug of Cot-1 DNA (Invitrogen) and 100 ug of
Yeast tRNA (Invitrogen) in a volume of 200 ul. Finally, 200 ul of 2X Agilent
Hybridization Buffer (Agilent Technologies) was added to each hybridization
cocktail. The hybridization mixture was denatured at 95 ºC for 5 minutes and snap
cooled on ice for 2 minutes prior to being applied to the CpG island microarray.
The hybridizations took place in an Agilent SureHyb hybridization chamber
(Agilent Technologies) at 65 ºC for 14-16 hours.
Microarray Processing and Data Acquisition:
The hybridization chamber was disassembled and the slides were washed in
Wash Buffer I (6X SSPE, 0.005% N-Lauroylsarcosine) for 5 minutes and in
Wash Buffer II (0.06X SSPE, 0.005% N-Lauroylsarcosine) for 5 minutes. After
the wash steps above, the microarrays were dried by centrifugation for 2 minutes at
3000 rpm, and scanned using the multi-laser ScanArray 5000 scanner (GSI
Lumonics, Inc.) with the following conditions: Cy5 laser power 90, PMT 75 and
Cy3 laser power 90, PMT 88 at a 10 um resolution. Array images were captured
with the Perkin Elmer software ScanArray Express and the Cy3 and Cy5
hybridization signal intensities were calculated using the ImaGene microarray
software by BioDiscovery, Inc.
Data Normalization and Filtering:
Hybridization signal intensities for each spot on the microarray were
exported into Microsoft Excel for further analysis. Here, background-subtracted
intensities were log2-transformed for both Cy3 (test) and Cy5 (reference) signals.
Each spot on the array was subject to a number of filters to ensure that our final
analysis was limited to clones that yielded the most robust data (Fig. 2.4). First, we
eliminated from our analysis those clones on the microarray that either failed to
amplify or gave multiple PCR products, as judged by gel electrophoresis prior to
manufacturing the microarray (Fig 2.2). We also disregarded hybridization data
from clones on the microarray that were scored as being of poor quality by the
software used to quantify the signal intensities (ImaGene, Biodiscovery Inc.).
Furthermore, we instituted two signal intensity criteria that had to be met before a
clone was selected for analysis. We removed from further analysis data from
clones that had a reference (Cy5) signal intensity value less than two times that of
the local background signal. Additionally, we anticipated that some fraction of the
total number of clones might have low sequence complexity and be susceptible to
cross-hybridization. Therefore, to guard against analyzing data from clones that are
prone to cross-hybridization, we empirically set an upper signal threshold and
analyzed only those clones with a reference (Cy5) signal intensity of less than
30,000 units (Fig. 2.4). After these filtering processes, the Cy3 and Cy5 signals
were normalized to account for differences in labeling efficiencies. To accomplish
this, we used a modified interactive linear regression approach based on the
background subtracted signal intensities of mitochondrial clones on the microarray
(Nouzova, Holtan et al. 2004) (Fig. 2.4). This is possible because mitochondrial
DNA is unmethylated (Groot and Kroon 1979). Therefore, the signal intensities for
both Cy3 and Cy5 are expected to be equal. Thus, we plotted a trend line through
those mitochondrial clones on the array that passed the quality filters outlined
above. We next normalized the reference signal intensity for each clone using the
equation of the line for the plotted trend line. Following normalization, the ratio of
log-transformed, background-subtracted signals (Test/Normalized Reference) was
calculated for every clone. Clones with ratios that were greater than one reflect
noise in the experimental system and were automatically set to equal one.
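The filtering and normalization steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually run in Excel: the intensity filters are applied to the raw Cy5 signal, a single ordinary least-squares fit through the mitochondrial clones stands in for the modified regression approach cited in the text, and all function names, clone names, and intensity values are hypothetical.

```python
import math


def passes_filters(cy5, cy5_background, upper=30000.0):
    """Keep a clone if its reference (Cy5) signal is at least twice the
    local background and below the upper threshold (raw signal assumed)."""
    return cy5 >= 2.0 * cy5_background and cy5 < upper


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def methylation_ratios(clones, mito_ids):
    """clones maps clone id -> (cy3, cy5, cy3_background, cy5_background).
    Returns clone id -> Test/Normalized Reference ratio, capped at one."""
    logs = {}
    for cid, (cy3, cy5, cy3_bg, cy5_bg) in clones.items():
        if not passes_filters(cy5, cy5_bg):
            continue
        test, ref = cy3 - cy3_bg, cy5 - cy5_bg
        if test <= 0 or ref <= 0:
            continue  # log2 undefined; treated as a failed clone (assumption)
        logs[cid] = (math.log2(test), math.log2(ref))
    # Trend line through the mitochondrial clones that passed the filters.
    mito = [logs[c] for c in mito_ids if c in logs]
    slope, intercept = fit_line([r for _, r in mito], [t for t, _ in mito])
    ratios = {}
    for cid, (log_test, log_ref) in logs.items():
        norm_ref = slope * log_ref + intercept
        ratios[cid] = min(log_test / norm_ref, 1.0)  # ratios above one are noise
    return ratios


# Hypothetical intensities, chosen only to exercise each branch.
example = {
    "mito_1": (2200.0, 4200.0, 200.0, 200.0),
    "mito_2": (8200.0, 16200.0, 200.0, 200.0),
    "clone_a": (1200.0, 8200.0, 200.0, 200.0),  # test well below reference
    "clone_b": (300.0, 350.0, 200.0, 200.0),    # reference under 2x background
    "clone_c": (9200.0, 4200.0, 200.0, 200.0),  # test above reference
}
ratios = methylation_ratios(example, ["mito_1", "mito_2"])
print(ratios)
```

In this toy data set the mitochondrial clones define the trend line, `clone_b` is omitted by the background filter, and `clone_c` is capped at one.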
Statistical Analysis:
Dr. Kimberly Siegmund at the University of Southern California calculated
a T-statistic (assuming unequal variance between the two groups) for each of the
592 clones analyzed. Because of the relatively small sample size, she
also generated p-values using permutation tests (Wilcoxon) where the variable
denoting group membership (ABC-DLBCL vs. GCB-DLBCL) was permuted
10,000 times. She then compared each of these statistics to a Benjamini-Hochberg
(BH) multiple test correction threshold using a 10% False Discovery
Rate (FDR). She also calculated a Receiver Operating Characteristic AUC value
for each of the clones on the array to determine the ability for a given clone to
distinguish between the ABC-DLBCL and GCB-DLBCL subtypes.
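A permutation test of this kind can be sketched in a few lines. The version below is an assumption about the details, not the analysis code used for this work: it uses a two-sided rank-sum statistic, shuffles the group labels 10,000 times with a fixed seed, and reports (hits + 1)/(permutations + 1). The ratio values are hypothetical.

```python
import random


def rank_sum(values, labels, group):
    """Sum of the ranks (averaged over ties) of observations in `group`."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based rank shared by tied values
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return sum(r for r, lab in zip(ranks, labels) if lab == group)


def permutation_p_value(values, labels, group, n_perm=10000, seed=0):
    """Two-sided permutation p-value for the rank sum of `group`."""
    rng = random.Random(seed)
    n_group = sum(1 for lab in labels if lab == group)
    expected = n_group * (len(values) + 1) / 2
    observed = abs(rank_sum(values, labels, group) - expected)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute group membership
        if abs(rank_sum(values, shuffled, group) - expected) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical Test/Reference ratios for one clone (not data from this study).
abc = [0.80, 0.82, 0.84, 0.86]
gcb = [0.93, 0.95, 0.96, 0.97, 0.98, 0.99]
values = abc + gcb
labels = ["ABC"] * len(abc) + ["GCB"] * len(gcb)
p_value = permutation_p_value(values, labels, "ABC")
print(p_value)
```

For a perfectly separated split of four versus six samples, the exact two-sided value is 2/210, or about 0.0095, so the estimate from 10,000 permutations should fall close to that figure.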
[Figure 2.4 plots: six scatter plots of LOG2 Cy3 (y-axis, 0 to 20) against NORM LOG2 Cy5 (x-axis, 0 to 20) in rows A, B, and C; the left column shows CLONES and the right column shows MITOCHONDRIA, each with FILTERED and OMITTED panels]
Figure 2.4: Data Quality Filtration and Normalization
Figure 2.4: Example Filtering of DLBCL Sample 6328. The
analysis began with more than 5,000 clones (Left Panel A). Only
those clones that passed each of the filters were included in the final
analysis (Left Panel B), those that did not pass the filtering process
were simply omitted from further analysis (Left Panel C). Cy3 and
Cy5 signal intensities were normalized based on those
mitochondrial clones on the array that passed each of the applied
filters (Panels on Right).
RESULTS:
From the larger set of DLBCL samples originally analyzed by Rosenwald
and colleagues (Rosenwald, Wright et al. 2002), we screened 15 different DLBCL
samples for DNA methylation. These samples consisted of both ABC-DLBCL and
GCB-DLBCL subtypes as defined previously by expression analysis. However, their
subtype status was blinded upon receipt and remained unknown until after the
experimental techniques had been completed. As an initial survey of DNA
methylation in these samples, we used CpG island microarrays.
Each sample was prepared as outlined above (Materials and Methods) and
hybridized to a single CpG island microarray. After hybridization, quality filters
were applied to each array clone, and the signal intensity ratios between Cy3-
labeled test material (digested with the methylation dependent endonuclease,
McrBC) and the normalized Cy5-labeled reference (not digested with McrBC) were
calculated (Table 2.1). A signal intensity ratio (Test/Reference) equal to one for a
given clone on the array would indicate the absence of DNA methylation. On the
other hand, a Test/Reference ratio equal to zero would indicate complete
methylation at the CpG island of interest. More simply, DNA methylation in the
original DNA sample results in a loss of hybridization signal at a given clone on the
array in the test fraction relative to the reference fraction.
Table 2.1: Listed by clone ID, this Table presents the CpG island clones that were best able to discriminate
between the ABC-DLBCL and GCB-DLBCL subtypes (p-value <0.05, see Table 2.4). The Cy3/normalized Cy5
ratio is given for each sample tested. The samples are segregated by subtype. The “AVG” column represents the
average Cy3/normalized Cy5 ratio value for those samples given. Omitted values represent clones that failed to
pass the quality filtering process in a given sample. Clone information is provided in Appendix A. Complete data
sets will be made available in a forthcoming publication (Pike Manuscript in Preparation).
Table 2.1: DNA Methylation Microarray Results
Once the data collection was completed, the subtype status (ABC-DLBCL
or GCB-DLBCL) was assigned to each of the 15 samples (eight GCB-DLBCL and
seven ABC-DLBCL) analyzed. Beyond those individual CpG island microarray
clones eliminated in the filtering process outlined in the methods section above, we
also discarded all non-gene associated clones and clones that had significant
similarity to more than one sequence in the genome. Furthermore, due to
experimental variation, few clones on the array passed the filtering process in all 15
of the samples analyzed. To add an additional layer of stringency, we ultimately
limited our analysis to the 592 gene-associated clones on the array that passed the
filtering process in at least four samples for each subtype. In other words, at least
four of the GCB-DLBCL samples and four of the ABC-DLBCL samples had to
have passed the filtering process for a given clone for it to be considered when
making comparisons between the two DLBCL subtypes (Table 2.2).
Table 2.2: Microarray Data Quality Filters
Table 2.2: The 5376 clones spotted on the array were subjected to a number of
quality filters that reduced the total number of clones analyzed to 592. Filter I
eliminated clones on the array that either failed to amplify or gave multiple PCR
products. Filter II reduced the number to 3636 by removing those clones that were
flagged by at least one of the quality steps (See Methods) in all 15 samples. Filter
III subtracted all repetitive and non-gene associated clones. Filter IV eliminated
those clones on the array that failed to pass the quality filtration in at least four
samples in each of the DLBCL subgroups analyzed (ABC-DLBCL and GCB-
DLBCL).
Filter Subtracted Remaining
I. 981 4395
II. 759 3636
III. 2358 1278
IV. 686 592
Total Analyzed 592
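The running totals in Table 2.2 can be reproduced from the number of clones subtracted at each step. The sketch below uses only the counts given in the table; the function name is illustrative.

```python
TOTAL_SPOTTED = 5376
SUBTRACTED = [("I", 981), ("II", 759), ("III", 2358), ("IV", 686)]


def remaining_after_each_filter(total, subtracted):
    """Return (filter, subtracted, remaining) rows for a filter cascade."""
    rows = []
    remaining = total
    for name, removed in subtracted:
        remaining -= removed
        rows.append((name, removed, remaining))
    return rows


rows = remaining_after_each_filter(TOTAL_SPOTTED, SUBTRACTED)
for name, removed, remaining in rows:
    print(f"Filter {name}: subtracted {removed}, remaining {remaining}")
```

The final row leaves the 592 clones carried forward into the subtype comparison.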
In an initial analysis, we first asked whether the two subtypes differed with
respect to the overall levels of DNA methylation. To do this, we calculated the
average Test/Reference ratio, the primary measure of DNA methylation in this
system, across all of the samples in the ABC-DLBCL group as well as for those
samples in the GCB-DLBCL group (Table 2.1). In the GCB-DLBCL group of
samples, this average ratio ranged from 0.714 to 1.0 (data not shown). These
values are very similar to those in the ABC-DLBCL subtype, which ranged from
0.737 to 1.0 (data not shown). As mentioned in the methods section above, ratios
greater than one represented noise in our system and were thus set to one. As a
result, 1.0 represented the upper bound of the Test/Reference ratio.
On the whole, we found that the overall amount of DNA methylation was
not statistically different between the ABC-DLBCL and GCB-DLBCL subtypes.
This is perhaps best illustrated by Figures 2.5 and 2.6. For each of the 592 clones
assayed, we calculated the average Test/Reference ratio across the samples in the
GCB-DLBCL group, as well as in the ABC-DLBCL group. Comparing the two
groups, it is evident that the majority of clones are largely unmethylated, as most
have a Cy3/Cy5 ratio of >0.95 in the ABC-DLBCL and GCB-DLBCL subtypes (Fig. 2.5).
A more in-depth analysis highlights how little methylation exists in the two
groups. To do this, we determined the frequency with which each of the Cy3/Cy5
ratios appeared (Fig. 2.6). More than one-third of those 592 clones analyzed
displayed virtually no methylation in either the ABC-DLBCL or GCB-DLBCL
subtypes, with average Test/Reference ratios between 0.975 and 1.000 (Fig. 2.6).
The majority of those clones analyzed had average Test/Reference ratios greater
than 0.95 in both the GCB-DLBCL (56%) and ABC-DLBCL (60%) groups,
indicating that most of the CpG island microarray clones assayed displayed
little to no DNA methylation (Fig. 2.5 and 2.6).
Furthermore, the unmethylated clones were generally the same in the two
subgroups. Of the 356 clones that displayed an average Cy3/Cy5 ratio of > 0.95 in
the ABC-DLBCL subtype, 311 of them were among the 331 unmethylated (ratio
>0.95) clones in the GCB subtype (Fig. 2.7). Not only were the unmethylated
clones likely to be shared between the ABC-DLBCL and GCB-DLBCL subtypes,
but this was also true of those clones that were judged to be methylated. For
example, there were 54 clones that had an average Cy3/Cy5 ratio that was less than
0.85 in the ABC-DLBCL subtype, while the GCB-DLBCL subtype had 73 clones
with an average ratio of <0.85. The majority of these methylated clones (45 total)
were in common between the two groups.
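The overlap figures quoted above can be summarized numerically. The sketch below uses only the clone counts given in the text; the Jaccard index is an added summary statistic that was not part of the original analysis, and the function name is illustrative.

```python
TOTAL_CLONES = 592


def overlap_summary(n_abc, n_gcb, n_shared, total=TOTAL_CLONES):
    """Summarize how two clone sets of known sizes overlap."""
    return {
        "abc_fraction": n_abc / total,
        "gcb_fraction": n_gcb / total,
        "shared_of_abc": n_shared / n_abc,
        "shared_of_gcb": n_shared / n_gcb,
        "jaccard": n_shared / (n_abc + n_gcb - n_shared),
    }


unmethylated = overlap_summary(n_abc=356, n_gcb=331, n_shared=311)
methylated = overlap_summary(n_abc=54, n_gcb=73, n_shared=45)

for label, summary in (("unmethylated", unmethylated), ("methylated", methylated)):
    print(label, {key: round(value, 3) for key, value in summary.items()})
```

The unmethylated fractions round to the 60% (ABC-DLBCL) and 56% (GCB-DLBCL) reported above, and in both categories the shared clones make up more than half of each subtype's set.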
[Figure 2.5 plot: scatter plot of GCB Average Ratio (y-axis) against ABC Average Ratio (x-axis) for the Cy3/Cy5 ratio, both axes 0.70 to 1.00]
Figure 2.5: Average Ratio (Cy3/Cy5) ABC vs. GCB
Figure 2.5: The average Cy3/normalized Cy5 ratio of each CpG island
microarray clone was calculated across the ABC-DLBCL subtype samples
and plotted along the x-axis. This same ratio was calculated across the
GCB-DLBCL samples and plotted along the y-axis. The majority of those
clones analyzed had an average Cy3/normalized Cy5 ratio of >0.95 in both
the subtypes (highlighted by red box), indicating the majority of the clones
showed low levels of methylation across both subgroups.
[Figure 2.6 plot: bar chart of the Percentage of Clones (y-axis, 0 to 40) in each Average Cy3/Cy5 Ratio Range (x-axis, bins of width 0.025 from 1.000 - 0.975 down to 0.725 - 0.700) for the ABC and GCB subtypes]
Figure 2.6: Distribution of Average Cy3/Cy5 Ratios
Figure 2.6: The average Cy3/normalized Cy5 ratio for every microarray
clone analyzed was calculated for both the ABC-DLBCL and the GCB-
DLBCL subtypes. Those ratios were then sorted into the ratio bins along
the x-axis. The y-axis shows the percentage clones that had Cy3/Cy5
ratios of a given range.
Figure 2.7: ABC and GCB Clone Overlap
Figure 2.7: The ABC-DLBCL and GCB-DLBCL subtypes
were very similar with respect to those clones that were
predicted to be primarily unmethylated and those that were
heavily methylated. Panel A demonstrates that the majority
of those clones displaying little to no methylation were
shared between the two subtype groups. This was also true
of those clones that were heavily methylated (Panel B).
[Figure 2.7 panel labels: A, Unmethylated (Average Ratio > 0.95); B, Methylated (Average Ratio < 0.85)]
Importantly, a number of CpG island microarray clones demonstrated
differential methylation between the ABC-DLBCL and GCB-DLBCL subtypes
(Table 2.3). This may best be illustrated by comparing individual samples from the
ABC-DLBCL and GCB-DLBCL subtypes (Fig. 2.8 and 2.9) (See also Appendix
B). In the two examples given, each of the analyzed clones is plotted according to
its individual Test and Reference background-subtracted signal intensities
(log2). In both of these plots, the first five genes from Table 2.3 have been
highlighted to demonstrate that differences between individual samples are
clone-specific. For example, the ONECUT2 clone in ABC-DLBCL sample 6328 (Fig.
2.8) had a reference hybridization signal intensity that was considerably greater
than that of the test. As a result, the point on the plot representing that clone falls
below the X=Y trend-line in the direction of the X-axis, indicating methylation.
This stands in marked contrast to the results gained from GCB-DLBCL sample
9323 for that same clone (Fig. 2.9). In sample 9323, the Test and Reference
hybridization signals for ONECUT2 were approximately equal, as the point on the
plot lies near the X=Y line, an indication that ONECUT2 is unmethylated in this
sample (Fig. 2.9). To summarize this point, Figure 2.10 shows the differences in
the five highlighted genes between the two examples given (ABC-DLBCL sample
6328 and GCB-DLBCL sample 9323). Here, as indicated by the direction of the
arrow, the array clone for ONECUT2 goes from being unmethylated in GCB-
DLBCL sample 9323 to methylated in ABC-DLBCL sample 6328.
CLONE GENE ABC AVG GCB AVG TTEST WILCOXON AUC THRESHOLD
13.F3 ONECUT2 0.83 0.96 0.00009 0.0095 1.00 0.00017
48.C7 PFDN5 0.91 0.83 0.00058 0.0047 0.95 0.00034
53.H6 GNMT 0.92 0.80 0.00065 0.0025 1.00 0.00051
56.E4 WDR33 0.97 0.99 0.00190 0.0038 0.97 0.00068
19.C9 GTF2A2 0.99 0.96 0.00410 0.0203 0.89 0.00084
25.C6 VPS52 0.96 0.99 0.00493 0.0365 0.95 0.00101
14.A8 C9orf64 0.99 0.95 0.00531 0.0115 0.96 0.00118
1.F4 CENPH 0.95 0.89 0.00756 0.0066 0.96 0.00135
52.F2 CPVL 0.94 0.86 0.00768 0.0140 0.90 0.00152
52.D7 ZNF615 0.93 0.82 0.01056 0.0111 0.90 0.00169
40.A4 ARPP-19 0.98 0.92 0.01135 0.0666 0.85 0.00186
46.H7 HOXC9 0.96 0.90 0.01297 0.0720 0.86 0.00203
3.D11 CPVL 0.94 0.87 0.01527 0.0350 0.86 0.00220
35.F10 RSPO3 0.97 0.93 0.01766 0.0173 0.93 0.00236
22.D3 EFNA5 1.00 0.96 0.01895 0.0234 0.96 0.00253
19.C7 SLC38A4 0.83 0.77 0.02062 0.0531 0.83 0.00270
15.B12 RGS6 0.99 1.00 0.02401 0.0341 0.90 0.00287
32.F2 FLJ32312 0.98 1.00 0.02712 0.0314 0.83 0.00304
28.H7 FLJ21062 0.96 0.90 0.03115 0.0519 0.87 0.00321
44.E11 TP53I11 0.86 0.80 0.03237 0.0480 0.86 0.00338
24.A2 PRIMA1 0.88 0.81 0.03270 0.0704 0.80 0.00355
48.H6 LARP5 0.82 0.73 0.03419 0.0813 0.79 0.00372
38.F2 ZNF675 0.98 0.99 0.04257 0.0716 0.79 0.00389
56.D3 ATP6V1G1 0.99 1.00 0.04809 0.0229 0.85 0.00405
Table 2.3: Analysis of CpG Island Microarray Data
Table 2.3: There were 24 clones whose methylation status could be used to
distinguish between the ABC and GCB subtypes with an uncorrected p-value of
<0.05. The average Cy3/Cy5 ratio of each clone was calculated across all of the
samples in the ABC subgroup (ABC AVG) and the GCB subgroup (GCB AVG).
They are listed above and ranked by p-value (“TTEST” Column). Beyond this
we also generated p-values using permutation tests (WILCOXON) where the
variable denoting group membership (ABC vs. GCB) was permuted 10,000
times. The area under the curve was also calculated for each clone (AUC) as
was a Benjamini-Hochberg (THRESHOLD) multiple test correction threshold
using a 10% False Discovery Rate.
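The THRESHOLD column in Table 2.3 is consistent with the usual Benjamini-Hochberg rule, rank x FDR / number of tests, with 592 tests and a 10% FDR. The sketch below recomputes it for the first five rows and compares each T-test p-value with its threshold. Only the tabulated p-values are used, so this is a per-rank comparison; the full step-up procedure would need all 592 p-values. The function name is illustrative.

```python
N_TESTS = 592
FDR = 0.10

# First five rows of Table 2.3: (gene, T-test p-value), already ranked.
RANKED = [
    ("ONECUT2", 0.00009),
    ("PFDN5", 0.00058),
    ("GNMT", 0.00065),
    ("WDR33", 0.00190),
    ("GTF2A2", 0.00410),
]


def bh_threshold(rank, n_tests=N_TESTS, fdr=FDR):
    """Benjamini-Hochberg threshold for the p-value at a given 1-based rank."""
    return rank * fdr / n_tests


thresholds = [bh_threshold(rank) for rank in range(1, len(RANKED) + 1)]
meets_threshold = [
    gene
    for (gene, p_value), threshold in zip(RANKED, thresholds)
    if p_value <= threshold
]

for (gene, p_value), threshold in zip(RANKED, thresholds):
    print(f"{gene}: p = {p_value:.5f}, threshold = {threshold:.5f}")
print("Meets threshold:", meets_threshold)
```

Among the rows shown, only the first p-value falls at or below its threshold.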
[Figure 2.8 plot: Test (Cy3) against Reference (Cy5) log2 signal intensity for ABC DLBCL (6328), both axes 9 to 16; trend line label y = x - 5E-05, R^2 = 0.9642; highlighted clones: 1. ONECUT2, 2. PFDN5, 3. GNMT, 4. WDR33, 5. GTF2A2]
Figure 2.8: ABC Sample 6328 Cy3/Cy5 Plot
Figure 2.8: The log2 Cy3 and Cy5 signal intensities are plotted for each
clone on the microarray (grey), together with the mitochondrial clones
(red). The idealized x = y trend line is also highlighted along with the R^2
value of the plotted mitochondrial clones. The first five clones listed in
Table 2.3 are also highlighted in blue.
[Figure 2.9 plot: Test (Cy3) against Reference (Cy5) log2 signal intensity for GCB DLBCL (9323), both axes 9 to 16; trend line label y = x + 2E-06, R^2 = 0.9607; highlighted clones: 1. ONECUT2, 2. PFDN5, 3. GNMT, 4. WDR33, 5. GTF2A2]
Figure 2.9: GCB Sample 9323 Cy3/Cy5 Plot
Figure 2.9: The log2 Cy3 and Cy5 signal intensities are plotted for each
clone on the microarray (grey), together with the mitochondrial clones
(red). The idealized x = y trend line is also highlighted along with the R^2
value of the plotted mitochondrial clones. The first five clones listed in
Table 2.3 are also highlighted in orange.
[Figure 2.10 plot: Test (Cy3) against Reference (Cy5) log2 signal intensity for ABC-DLBCL (6328) vs. GCB-DLBCL (9323), both axes 9 to 16; highlighted clones: 1. ONECUT2, 2. PFDN5, 3. GNMT, 4. WDR33, 5. GTF2A2]
Figure 2.10: ABC and GCB Composite Plot
Figure 2.10: The highlighted clones from the ABC (blue) and GCB
(orange) plots in Figures 2.8 and 2.9 are shown to demonstrate their
relative positions. The further from the idealized x = y trend line, the
greater the methylation. The arrows point in the direction of greater
methylation (Clones 1 and 3 above).
Next, we sought to determine if any of the 592 individual clones on the
CpG island microarray have the ability to statistically distinguish between the two
subtypes analyzed. To do this, we calculated a T-statistic (assuming unequal
variance between the two groups) for each of the 592 clones analyzed and, because
of the relatively small sample size, we also generated p-values using permutation
tests (Wilcoxon) where the variable denoting group membership (ABC-DLBCL vs.
GCB-DLBCL) was permuted 10,000 times (Table 2.3). In addition, we also
calculated a Receiver Operating Characteristic AUC value for each of the clones on
the array to determine the ability for a given clone to distinguish between the ABC-
DLBCL and GCB-DLBCL subtypes (Table 2.3).
Clones proximal to the genes GNMT and ONECUT2 were both perfect
discriminators between the ABC-DLBCL and GCB-DLBCL subtypes with an
AUC value of 1.0 for each (Table 2.3). The GCB-DLBCL samples analyzed had
an average Test/Reference ratio of 0.962 for ONECUT2, indicating that this gene was
practically unmethylated in these samples. In contrast, the average Test/Reference
ratio for those ABC-DLBCL samples analyzed was 0.826 (Table 2.3). Figure 2.11
summarizes the differences in the Test/Reference ratio of ONECUT2 for each of
the ABC-DLBCL and GCB-DLBCL subtype samples analyzed.
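An AUC of 1.0 means that every sample in one subtype has a higher ratio than every sample in the other. A minimal sketch of the calculation is given below. It assumes the AUC is reported independently of direction, which is suggested by the fact that clones methylated in either subtype have high AUC values in Table 2.3, and the ratio values are hypothetical rather than data from this study.

```python
def auc(group_a, group_b):
    """Probability that a value drawn from group_b exceeds one drawn from
    group_a, counting ties as one half (the Mann-Whitney form of the AUC)."""
    wins = 0.0
    for a in group_a:
        for b in group_b:
            if b > a:
                wins += 1.0
            elif b == a:
                wins += 0.5
    return wins / (len(group_a) * len(group_b))


def discriminative_auc(group_a, group_b):
    """AUC reported without regard to which group has the higher values."""
    raw = auc(group_a, group_b)
    return max(raw, 1.0 - raw)


# Hypothetical Test/Reference ratios for a single clone.
abc_ratios = [0.80, 0.82, 0.84, 0.86]
gcb_ratios = [0.93, 0.95, 0.96, 0.97, 0.98, 0.99]

print(discriminative_auc(abc_ratios, gcb_ratios))  # complete separation
print(discriminative_auc(gcb_ratios, abc_ratios))  # same value with groups swapped
print(discriminative_auc(abc_ratios, abc_ratios))  # no separation
```

Complete separation gives an AUC of 1.0 in either direction, while identical distributions give 0.5.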
The DNA methylation pattern for GNMT was exactly the opposite of that
found with ONECUT2. GNMT had a lower average Test/Reference ratio in the
GCB-DLBCL samples (0.800) compared to those ABC-DLBCL samples analyzed
(0.924), indicating that the extent of methylation was greater in the GCB-DLBCL
samples than in the ABC-DLBCL samples (Table 2.3).
Other genes also had the ability to strongly differentiate between the ABC-
DLBCL and GCB-DLBCL subtypes. To name a few, PFDN5, WDR33, GTF2A2,
CENPH, CPVL, and ZNF615 were strong predictors of subtype with AUC values
of 0.95, 0.97, 0.89, 0.96, 0.90 and 0.90, respectively (Table 2.3). However, of
those, WDR33 and GTF2A2 had relatively modest changes in DNA methylation
between the ABC-DLBCL and GCB-DLBCL subtypes, while PFDN5, CENPH,
CPVL and ZNF615 showed more substantial differences in their average
Test/Reference ratio. In total, 24 clones on the CpG island microarray gave p-
values < 0.05 using either of the two statistical approaches outlined above.
However, it should be noted that only the T-statistic p-value calculated for
ONECUT2 (p-value = 0.00009) met the Benjamini-Hochberg (BH) multiple test
correction threshold using a 10% False Discovery Rate (FDR) (Table 2.3).
ONECUT2 is a member of the ONECUT class of homeodomain
transcription factors whose function, at least partially, overlaps with that of
hepatocyte nuclear factor-6 (Jacquemin, Lannoy et al. 1999), which appears to be
important in the normal development of the liver (Jacquemin, Lannoy et al. 2001;
Jacquemin, Pierreux et al. 2003; Rousseau 2003; Briancon, Bailly et al. 2004;
Clotman, Jacquemin et al. 2005). The clone on the CpG island microarray that
interrogates the DNA methylation status of ONECUT2 is 538-bp in length and is
located in the first intron, approximately 3.5-kb downstream of the gene’s
transcriptional start site (Karolchik, Baertsch et al. 2003; Hinrichs, Karolchik et al.
2006). Furthermore, the clone is well-situated in the middle of a large CpG island
(5.7-kb) that spans most of the gene’s first exon (Karolchik, Baertsch et al. 2003;
Hinrichs, Karolchik et al. 2006).
In summary, we analyzed nearly 600 gene-associated CpG islands using a
hybridization-based platform to identify sites of DNA methylation in eight GCB-
DLBCL and seven ABC-DLBCL. Of the CpG islands analyzed, we identified 24
clones on the CpG island microarray that had potential to discriminate between the
GCB-DLBCL and ABC-DLBCL subtypes (p<0.05). Of these, the clone
interrogating the DNA methylation status of the transcription factor gene
ONECUT2 showed the most promise in differentiating between the two
subtypes. In this screen, ONECUT2 was a perfect discriminator between the ABC-DLBCL
and GCB-DLBCL subtypes. These results demonstrate that the previously
defined DLBCL subtypes may differ with respect to DNA methylation patterns.
Furthermore, beyond their discriminatory power, these DNA methylation patterns
may shed new light on the molecular etiology of the disease and provide new
targets of therapeutic interest.
[Figure 2.11 plot: "ONECUT2 ABC vs. GCB". Axes: Reference (Cy5) and Test (Cy3), each scaled from 8 to 16. Series: ABC, GCB. Plotted samples: 1. LR6328; 2. LR9251; 3. LR12967; 4. LR13112; 5. LR4427; 6. LR5063; 7. LR8578; 8. LR9284; 9. LR9323; 10. LR10618]
Figure 2.11: ONECUT2 in ABC vs. GCB
Figure 2.11: The Cy3 and Cy5 signal intensities for ONECUT2 have been
plotted for each of the samples in which it was assayed. The ABC samples
are highlighted in blue, while the GCB samples are indicated in orange.
CHAPTER 3: MethyLight Analysis
A number of different techniques exist that assay for the presence of DNA
methylation. Generally, these techniques may be divided into those that measure
DNA methylation based on the digestion of DNA by methylation-sensitive (e.g.
HpaII, BstUI) or methylation-dependent (e.g. McrBC) enzymes and those that rely upon sodium
bisulfite chemistry.² In conjunction with the DNA methylation analysis described
in Chapter 2, which was dependent on the digestion of DNA with the enzyme
McrBC, we conducted another series of experiments using an alternative method
based on sodium bisulfite conversion chemistry. These additional experiments
were performed in an effort to maximize the likelihood of identifying DNA
methylation events that correlate with the previously identified DLBCL subtypes.
To do this, we benefited from a collaboration with Dr. Peter Laird, at the
University of Southern California. Using a technique developed in his laboratory
(MethyLight (Eads, Danenberg et al. 2000)), we analyzed DLBCL samples for
DNA methylation at more than 250 sites in the genome. The MethyLight analysis
was conducted in two phases. In the first, an initial survey of 262 MethyLight
markers (Appendices C, D and E) was made across five DLBCL samples to pre-screen
for markers that show a moderate degree of differential methylation in
DLBCL (Coefficient of Variation >5%). Of those MethyLight markers tested, 89
were identified as being of potential interest. Using these 89 markers in a second
phase of analysis, we expanded our survey to include the 15 DLBCL samples
previously examined by CpG island microarray analysis in the second chapter.
² Recent attempts have also been made to assay for DNA methylation using methylation-specific
antibodies directed at detecting 5-methylcytidine (Weber, M., J. J. Davies, et al. (2005).
"Chromosome-wide and promoter-specific analyses identify sites of differential DNA methylation
in normal and transformed human cells." Nat Genet 37(8): 853-62).
MethyLight (Eads, Danenberg et al. 2000) is a fluorescence-based, real-time
PCR (TaqMan®) assay performed on sodium bisulfite treated DNA that allows for
the rapid quantification of DNA methylation at a particular CpG island of interest.
Sodium bisulfite conversion is a commonly used chemistry in the identification of
DNA methylation (Clark, Harrison et al. 1994; Laird 2003). When treated with
sodium bisulfite, cytosines in DNA are deaminated to yield uracil (Fig. 3.1).
However, the deamination of 5-methylcytosine takes place at a much slower rate
relative to cytosine in its unmethylated form (Clark, Harrison et al. 1994). A
number of techniques have evolved that exploit the differential rate of deamination
between methylated and unmethylated cytosine. MethyLight is one such
technique. When combined with PCR amplification, the sodium bisulfite-
deaminated cytosines (uracil) are ultimately replaced with thymine, while 5-
methylcytosine remains as cytosine (Fig. 3.1). MethyLight assays consist of three
different oligonucleotides: a forward PCR primer, a reverse PCR primer and an
internal fluorescent-labeled probe. All three oligonucleotides are designed such that
each may bind to a targeted region in the genome. Moreover, each MethyLight
primer/probe set is designed to anneal specifically to a sodium bisulfite converted
template. Generally, the primers and probes are designed to anneal to a sodium
bisulfite converted template that was originally fully methylated (Trinh, Long et al.
2001). MethyLight reactions are entirely dependent on the proper annealing of
primers and probes to their cognate, sodium bisulfite converted target sequence
(Fig. 3.1). Thus, the annealing of the primer to a sodium bisulfite converted
template is dependent on whether or not cytosines in the target sequence were
originally methylated. As in traditional PCR, oligonucleotides bound to their target
sequences act as priming sites for DNA polymerase. The bound, internal probe has
both a fluorescent dye attached to its 5’-end and a quencher molecule on its 3’-end.
As the polymerase moves along the DNA strand, it encounters the probe that is
internal to the two flanking PCR primers. The exonuclease activity of the
polymerase degrades the internal probe. As it does so, the 5’- fluorescent dye is
released from the probe and is separated from the quencher molecule allowing for
the emission of a fluorescent signal. Based on these fluorescent signals, one can
calculate the amount of methylation in a given sample relative to a fully methylated
reference. The accepted unit of methylation, in this case, is expressed as a
percentage of methylated reference (PMR) (Eads, Lord et al. 2001; Weisenberger,
Campan et al. 2005).
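The conversion chemistry described above can be illustrated with a short in silico sketch. It assumes complete conversion of every unmethylated cytosine, full protection of every methylated cytosine, and a single strand read 5' to 3'. The function names and the eight-base template are hypothetical and are not taken from any MethyLight assay.

```python
def cpg_positions(seq):
    """Return the 0-based indices of cytosines that sit in a CpG dinucleotide."""
    seq = seq.upper()
    return {i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"}


def bisulfite_convert(seq, methylated_positions):
    """Deaminate unmethylated C to U; cytosines listed as methylated stay C."""
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


def after_pcr(converted):
    """PCR amplification replaces uracil with thymine."""
    return converted.replace("U", "T")


# Hypothetical template with two CpG sites and one non-CpG cytosine.
template = "ACGTCCGA"
methylated_product = after_pcr(bisulfite_convert(template, cpg_positions(template)))
unmethylated_product = after_pcr(bisulfite_convert(template, set()))

# Positions at which an oligonucleotide designed against the methylated
# product would mismatch the unmethylated product.
mismatches = [i for i, (a, b) in enumerate(zip(methylated_product, unmethylated_product)) if a != b]
```

The two products differ only at the former CpG cytosines, which is the sequence difference that methylation-specific primers and probes rely on.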
Figure 3.1: (Adapted from Trinh et al. (Trinh, Long et al. 2001))
Methylated and unmethylated templates are shown in Panel A. Sodium
bisulfite treatment converts unmethylated cytosines (open circle) to uracil
(Panel B). In Panel C, PCR amplification of the top strand replaces
uracil with thymine (template strands are shown with a solid line; the
complementary strand is marked with a dashed line). PCR-based
techniques aimed at DNA methylation often rely upon the differential
annealing (mismatch in red) of oligonucleotides (bolded line) to
bisulfite products (Panel D).
[Figure 3.1 diagram: columns "Methylated" and "Unmethylated"; Panel A, starting templates; Panel B, strands after bisulfite treatment; Panel C, products after PCR; Panel D, oligonucleotide annealing, with the mismatch marked X]
Figure 3.1: Outline of PCR-Based Bisulfite Assays
Initially, we examined more than 250 individual MethyLight reactions for
differential methylation in DLBCL. Within this larger set of markers, we
successfully identified a limited number of candidate MethyLight reactions whose
DNA methylation status demonstrated the potential to distinguish between the
previously described ABC-DLBCL and GCB-DLBCL subtypes. Among the
candidate markers identified, reactions proximal to the transcription factor
ONECUT2 showed promise in their ability to discriminate between the lymphoma
subtypes. Specifically, one of the ONECUT2 reactions (HB-243) was more
methylated in the ABC-DLBCL (average PMR = 61.5) compared to the GCB-
DLBCL (average PMR = 30.6) (p < 0.005). These results are in accord with the prior
CpG island microarray experiments outlined in chapter two, which also defined
ONECUT2 as being more methylated in ABC-DLBCL versus GCB-DLBCL.
MATERIALS AND METHODS:
Bisulfite Conversion:
Sodium bisulfite treatment is a well-established protocol used in the
examination of DNA methylation. This protocol has been optimized by the
laboratory of Dr. Peter Laird. First, a 5 M solution of sodium bisulfite was made
by adding 1.9 grams of sodium metabisulfite to 3.2 ml of 0.44 M NaOH, and
incubating the solution at 50 ºC until the sodium metabisulfite dissolved.
Separately, a 1 M solution of hydroquinone was made by adding 0.11 grams of
hydroquinone to 1 ml of water. To dissolve the hydroquinone, the solution was
incubated at 65 ºC for approximately five minutes.³ Once in solution, 0.5 ml of the
1 M hydroquinone was added to the previously made sodium bisulfite solution.
After the bisulfite solution was prepared, 2 µg of each DNA sample⁴ was diluted to
a volume of 18 µl with water, denatured at 100 ºC for 10 minutes in a 1.5 ml screw-
cap microcentrifuge tube, and centrifuged briefly before being chilled on ice. Once
denatured, 2 µl of 3 M NaOH was added to each sample. Following addition of the
NaOH, the sample (now 20 µl) was vortexed to mix, centrifuged briefly and
incubated at 42 ºC for 20 minutes. At the end of the 42 ºC incubation, 120 µl of the
previously prepared sodium bisulfite solution was added to each sample and the
samples were incubated in the absence of light at 50 ºC for 12-15 hours.
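As a rough check on the reagent quantities quoted above, the masses can be converted to molar amounts. This is a minimal sketch, assuming standard molar masses (about 190.11 g/mol for sodium metabisulfite, Na2S2O5, and 110.11 g/mol for hydroquinone) and assuming that each metabisulfite unit yields two bisulfite ions in water. The final volume of the bisulfite solution is not stated in the protocol, so only an upper bound on its concentration is computed, using the liquid volumes alone.

```python
NA2S2O5_G_PER_MOL = 190.11        # sodium metabisulfite (assumed standard value)
HYDROQUINONE_G_PER_MOL = 110.11   # hydroquinone (assumed standard value)


def millimoles(mass_g, molar_mass_g_per_mol):
    """Convert a mass in grams to millimoles."""
    return mass_g / molar_mass_g_per_mol * 1000.0


def molarity(amount_mmol, volume_ml):
    """mmol per ml is numerically equal to mol per litre."""
    return amount_mmol / volume_ml


# 1.9 g of metabisulfite; each Na2S2O5 gives two bisulfite ions.
bisulfite_mmol = 2.0 * millimoles(1.9, NA2S2O5_G_PER_MOL)

# 0.11 g of hydroquinone in 1 ml of water (volume change on dissolving ignored).
hydroquinone_molar = molarity(millimoles(0.11, HYDROQUINONE_G_PER_MOL), 1.0)

# Upper bound: 3.2 ml NaOH plus 0.5 ml hydroquinone, ignoring the volume of the solid.
bisulfite_molar_upper_bound = molarity(bisulfite_mmol, 3.2 + 0.5)
```

Under these assumptions the hydroquinone works out to roughly 1 M and the bisulfite to about 20 mmol in a little under 4 ml, which is in line with the nominal 5 M figure once the volume contributed by the dissolved solid is taken into account.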
A modified protocol using the QIAamp Viral RNA Mini Kit (Qiagen) was
used to desulfonate each sample and recover the DNA after the bisulfite
conversion. First, as specified in the QIAamp Viral RNA Mini Kit, the carrier
RNA was added to the AVL buffer and the AVL buffer was mixed thoroughly before
adding 560 µl of it to each bisulfite reaction. After the addition of the AVL buffer
to each sample, the samples were mixed and incubated at room temperature for 10
minutes.
³ Hydroquinone is light sensitive. Therefore, it is necessary to protect the solution from light during
its preparation. Should the solution turn brown, it should be discarded and prepared again.
⁴ The bisulfite conversion is sensitive to the amount of DNA converted. This protocol outlines the
procedure to convert up to 2 µg of DNA; using more may lead to incomplete conversion of the
sample.
To collect the sample into the bottom of the tube, each was briefly
centrifuged prior to the addition of 560 µl of ethanol to each, and the samples were
vortexed for 15 seconds to ensure adequate mixing. Following mixing,
approximately half of each sample mixture (approximately 630 µl) was loaded onto
a QIAamp spin column with a collection tube to collect the flow-through during
centrifugation. Once loaded, the spin columns were centrifuged at 16,000 g for one
minute. After centrifugation, the flow-through was collected and passed over the
column a second time via centrifugation as before. This process was repeated with
the second half of the sample. Loading the column twice with each sample ensures
that a maximum amount of bisulfite converted DNA will be retained on the spin
column prior to desulfonation and elution. Subsequent to binding the DNA to the
column, the column was inserted into a new collection tube and a 200 µl mixture of
0.08 M NaOH and AW1 Buffer (196 µl AW1 + 4 µl 4 M NaOH) was added to
each column and incubated for 15 minutes at room temperature. After incubation,
a 200 µl mixture of 0.08 M HCl and AW1 Buffer (196 µl AW1 + 4 µl 4 M HCl)
was added to each column and incubated for 5 minutes to neutralize the
desulfonation reaction above. Following the neutralization, the columns were
centrifuged for 1 minute at 16,000 g, the filtrate was discarded and the column was
washed a second time with 500 µl AW1. After washing the column with AW1, the
filtrate was discarded and the spin column was added to a new collection tube.
Next, 500 µl of AW2 Buffer was added to the spin column and the column was
centrifuged at 16,000 g for 3 minutes. To dry the spin column and eliminate
possible AW2 and ethanol carryover, the column was then transferred to another
clean collection tube and centrifuged for another minute at 16,000 g.
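The dilutions and loading volumes quoted in this protocol can be checked with the usual C1V1 = C2V2 relation. The sketch below uses only volumes and concentrations stated in the text; the helper name is hypothetical, and volumes are assumed to be additive.

```python
def diluted_molarity(stock_molar, stock_ul, total_ul):
    """C2 = C1 * V1 / V2 for a stock diluted into a known final volume."""
    return stock_molar * stock_ul / total_ul


# Denaturation step: 2 µl of 3 M NaOH in a 20 µl sample.
naoh_in_sample = diluted_molarity(3.0, 2.0, 20.0)

# Desulfonation and neutralization mixes: 4 µl of 4 M stock plus 196 µl AW1.
desulfonation_naoh = diluted_molarity(4.0, 4.0, 196.0 + 4.0)
neutralization_hcl = diluted_molarity(4.0, 4.0, 196.0 + 4.0)

# Column loading: sample + bisulfite solution + AVL buffer + ethanol, split in two.
total_load_ul = 20.0 + 120.0 + 560.0 + 560.0
half_load_ul = total_load_ul / 2.0
```

The results agree with the figures given in the text: 0.08 M for the NaOH and HCl mixes and about 630 µl for each half of the sample mixture. The sample itself contains roughly 0.3 M NaOH before the bisulfite solution is added.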
To elute the final bisulfite converted DNA from the column, the spin
column was placed into a clean 1.5 ml microcentrifuge tube, and 60 µl of buffer
AVE was dispensed onto the membrane of the spin column, and incubated at room
temperature for 1 minute prior to being centrifuged at 7600 g for 1 minute. This
elution step was done a total of two times, and resulted in a purified bisulfite
converted product volume between 110-120 µl. To further guard against the
carryover of ethanol from the purification process, which may inhibit PCR, the
purified samples were placed inside of an oven that was pre-heated to 80ºC. Once
the uncapped samples were placed inside and the door to the oven was closed, the
oven was turned off and the samples were allowed to remain inside for 20 minutes.
This reduces the volume of the sample to approximately 100 µl and ensures that
any residual ethanol that may have remained from the purification is evaporated.
Following the volume reduction, the samples were stored at –80 ºC until use.
MethyLight:
MethyLight assays were performed as previously described (Eads,
Danenberg et al. 2000; Trinh, Long et al. 2001; Ehrlich, Jiang et al. 2002;
Weisenberger, Campan et al. 2005) with assistance from the laboratory of Dr. Peter
Laird. Briefly, each MethyLight reaction contained the following components
from the Applied Biosystems’ TaqMan® Gold Pack with Buffer A: 1X Buffer, 3.5
mM MgCl₂ and 0.1 µl of Taq Gold Polymerase. The reaction also contained a 1X
volume of 10X stabilizer solution (0.1% Tween-20, 0.5% gelatin), 200 µM of each
dNTP, 0.3 µM of each primer (forward and reverse), as well as 0.1 µM of
fluorescent-labeled probe with either a black hole quencher (BHQ-1, Biosearch
Technologies), or a minor groove binder non-fluorescent quencher (MGBNFQ,
Applied Biosystems). These reagents were mixed with approximately 5-10 ng of
bisulfite converted DNA in a final volume of 30 µl. Finally, each reaction was
subjected to the following PCR cycling conditions: 95 ºC for 10 minutes, followed by 50
cycles of 95 ºC for 15 seconds and 60 ºC for 1 minute.
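The final concentrations above translate into per-reaction amounts by simple arithmetic, and the cycling program gives a minimum run time. The following is a small sketch that uses only the values stated in the text; stock concentrations are not given in the source, so no pipetting volumes are derived, and instrument ramp times are ignored.

```python
REACTION_VOLUME_UL = 30.0


def picomoles(concentration_um, volume_ul):
    """µM multiplied by µl gives pmol, because 1 µM equals 1 pmol per µl."""
    return concentration_um * volume_ul


per_reaction_pmol = {
    "each primer": picomoles(0.3, REACTION_VOLUME_UL),
    "probe": picomoles(0.1, REACTION_VOLUME_UL),
    "each dNTP": picomoles(200.0, REACTION_VOLUME_UL),
    "MgCl2": picomoles(3500.0, REACTION_VOLUME_UL),  # 3.5 mM expressed in µM
}


def cycling_minutes(initial_hold_s=600, cycles=50, step_95c_s=15, step_60c_s=60):
    """Programmed hold time only; temperature ramping is not included."""
    return (initial_hold_s + cycles * (step_95c_s + step_60c_s)) / 60.0


minimum_run_minutes = cycling_minutes()
```

Each 30 µl reaction therefore holds about 9 pmol of each primer and 3 pmol of probe, and the program accounts for roughly 72.5 minutes of hold time.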
Quantitative Analysis Using MethyLight:
As each MethyLight reaction completed a PCR cycle, the amount of
fluorescence released from the reaction was measured on an MJ Research Opticon
Continuous Fluorescence Detection System (BioRad), and expressed as a Ct
(Threshold cycle) value. The Ct value corresponds to the cycle number at which
the fluorescent signal from a given reaction statistically exceeds that of
background. An absolute measurement of DNA methylation is calculated by
comparing the Ct value for an experimental sample with a standard curve of known
template concentrations (Trinh, Long et al. 2001). From this, the percentage of
fully methylated molecules (PMR (Eads, Lord et al. 2001; Trinh, Long et al. 2001;
Weisenberger, Campan et al. 2005)) is calculated for each reaction. The
calculation of a PMR is dependent upon two separate MethyLight reactions that are
designed specifically for bisulfite converted DNA. One reaction is designed to
interrogate the methylation status of a CpG island of interest, while the second, a
control reaction, is used to normalize for the amount of input DNA. The
percentage of fully methylated molecules in a sample at the CpG island of interest
was calculated by dividing the amount of DNA methylation calculated for the CpG
island specific reaction by that of the control reaction and, then, dividing this ratio
by the corresponding ratio that was calculated when the same reactions were
carried out on an in vitro-methylated reference and multiplying by 100 (Fig. 3.2)
(Eads, Lord et al. 2001; Trinh, Long et al. 2001; Weisenberger, Campan et al.
2005).
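The PMR calculation described above can be sketched in two steps: reading a quantity off a standard curve and then forming the ratio of ratios. This is an illustrative sketch only. It assumes a log-linear standard curve of the form Ct = slope × log10(quantity) + intercept, and the slope, intercept, and Ct values below are hypothetical, chosen for the example rather than taken from the thesis. In practice each reaction would have its own standard curve; a single curve is shared here for brevity.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert an assumed log-linear standard curve to get a template quantity."""
    return 10.0 ** ((ct - intercept) / slope)


def pmr(locus_sample, control_sample, locus_reference, control_reference):
    """Percentage of methylated reference: ratio in sample over ratio in reference."""
    sample_ratio = locus_sample / control_sample
    reference_ratio = locus_reference / control_reference
    return 100.0 * sample_ratio / reference_ratio


# Hypothetical standard-curve parameters and Ct values (not from the thesis).
SLOPE, INTERCEPT = -3.3, 40.0
example_pmr = pmr(
    locus_sample=quantity_from_ct(31.0, SLOPE, INTERCEPT),
    control_sample=quantity_from_ct(20.0, SLOPE, INTERCEPT),
    locus_reference=quantity_from_ct(29.0, SLOPE, INTERCEPT),
    control_reference=quantity_from_ct(19.0, SLOPE, INTERCEPT),
)
```

Because the control reaction appears in both the numerator and the denominator, differences in the amount of input DNA between the sample and the reference cancel out of the final value.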
Statistical Analysis:
As in the microarray experiments of Chapter Two, Dr. Siegmund calculated
a T-statistic (assuming unequal variance between the two groups) for each of the 89
MethyLight assays in the second phase of analysis. Furthermore, because of the
relatively small sample size, she also generated p-values using permutation tests
(Wilcoxon) where the variable denoting group membership (ABC-DLBCL vs.
GCB-DLBCL) was permuted 10,000 times. In addition, she compared each of
these statistics to a Benjamini-Hochberg (BH) multiple test correction threshold
using a 10% False Discovery Rate (FDR). Dr. Siegmund also calculated a
Receiver Operating Characteristic AUC value for each of the 89 MethyLight
reactions in the second phase screen to determine the ability of a given
MethyLight reaction to distinguish between the ABC-DLBCL and GCB-DLBCL
subtypes.
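The permutation test and AUC calculation can be sketched with the standard library alone. This is not the analysis code used for the thesis: it assumes a two-sided test on the Wilcoxon rank sum, average ranks for ties, and an AUC reported independently of direction (the rows of Table 3.2 shown here all have AUC values of at least 0.5). The PMR values at the end are invented for illustration.

```python
import random


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def auc(group_a, group_b):
    """ROC AUC from the Mann-Whitney U statistic, folded to be at least 0.5."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2.0
    a = u / (n_a * n_b)
    return max(a, 1.0 - a)


def permutation_p_value(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided p-value from shuffling group membership n_perm times."""
    rng = random.Random(seed)
    n_a = len(group_a)
    ranks = average_ranks(list(group_a) + list(group_b))
    expected = n_a * (len(ranks) + 1) / 2.0
    observed = abs(sum(ranks[:n_a]) - expected)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(ranks)  # equivalent to permuting the group labels
        if abs(sum(ranks[:n_a]) - expected) >= observed - 1e-12:
            hits += 1
    return hits / n_perm


# Invented PMR values for two perfectly separated groups.
abc_pmr = [60.0, 70.0, 65.0, 80.0]
gcb_pmr = [20.0, 30.0, 25.0, 35.0, 10.0]
example_auc = auc(abc_pmr, gcb_pmr)
example_p = permutation_p_value(abc_pmr, gcb_pmr)
```

With groups this small the attainable p-values are coarse, which is one reason a permutation approach is a natural companion to the T-statistic here.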
\[ \mathrm{PMR} = \frac{\text{Locus of Interest (Sample)} \,/\, \text{Control (Sample)}}{\text{Locus of Interest (Reference)} \,/\, \text{Control (Reference)}} \times 100 \]
Figure 3.2: Calculation of PMR Value
Figure 3.2: The unit of measure in MethyLight analysis is expressed as a
percentage of a methylated reference (PMR). MethyLight reactions are
conducted in tandem with a control reaction (Alu-specific) in both the
sample of interest and a methylated reference. PMR values are calculated
by dividing the ratio of CpG island of interest reaction to control reaction
in the sample by this same ratio in the methylated reference.
RESULTS:
As outlined in Chapter Two, a preliminary survey of methylation in
DLBCL using CpG island microarrays identified a number of sites that have the
potential to differentiate between two previously identified lymphoma subtypes.
Focusing our efforts on those DLBCL analyzed in the previous chapter, we
examined additional CpG islands for DNA methylation using MethyLight. Given
that we were limited in the quantity of sample DNA available and the fact that the
requisite amount of starting DNA increases with the number of MethyLight
reactions performed, we chose to conduct the MethyLight analysis in two phases.
The first phase was designed to screen a large number of MethyLight reactions in a
limited number of samples to identify reactions that displayed a moderate amount
of differential DNA methylation in DLBCL (i.e. coefficient of variation (CoV)
>5%). The second phase involved testing those differentially methylated reactions,
identified in phase one, in an expanded sample set that included those 15 DLBCL
samples examined previously in the second chapter. An additional GCB-DLBCL
sample (#3442) was also included in this expanded sample set.
Initially, five DLBCL samples were selected for the first phase of analysis.
Our collaborator, Dr. Timothy Greiner, specified these five samples with the only
prerequisite being that the five must include at least two samples from the
previously defined GCB-DLBCL subgroup and two from the ABC-DLBCL
subgroup. At this stage, as before in the microarray experiments of Chapter Two,
the grouping of the individual samples into either the GCB-DLBCL group or the
ABC-DLBCL group was blinded to our laboratory. A total of 262 different
MethyLight reactions (Appendices C, D and E) were completed on each of the five
DLBCL samples (3735, 5063, 6328, 9284, and 10618), as well as on two control
samples; peripheral blood lymphocyte (PBL) DNA, and SssI-treated PBL DNA
(Pike, Manuscript in Preparation). The SssI-treated PBL should be completely
methylated and was used as a control to gauge the efficiency of the reaction. The
untreated PBL DNA, on the other hand, was used to measure the amount of
methylation in normal PBL DNA.
A Ct value, the raw measure of DNA methylation in this system, was
calculated for each MethyLight reaction (Materials and Methods). Generally, the
lower the Ct value, the higher the degree of methylation in a given sample. Using
these Ct values, a rudimentary analysis was performed to identify MethyLight
reactions to be used in the second phase of analysis.⁵ We used three criteria to
exclude reactions from the secondary analysis. First, we eliminated MethyLight
reactions having a Ct of 35 or greater across all five DLBCL samples and those
having a Ct of less than 30 in each of the five DLBCL samples. This, in effect,
removed from further consideration MethyLight markers that were either largely
methylated (Ct <30) or unmethylated (Ct ≥ 35) across all of the samples tested.
⁵ It should be noted that by using the Ct values alone in this first phase of analysis, no corrections
were made for differences in the amount of sodium bisulfite converted DNA that each reaction may
have had as a template. Here, the Ct value is only an approximation of the amount of methylation in
each sample and these values are not quantitative.
Additionally, because our primary interest was in identifying differentially
methylated CpG islands between subtypes of lymphoma, we also eliminated
MethyLight markers based on the amount of variation the markers had in
methylation across the five samples examined. This was accomplished by
calculating the CoV across the batch of five samples. MethyLight reactions having
a CoV of less than 5% were excluded from the second phase of analysis. Using
this three-tier selection process, we reduced the 262 MethyLight reactions
examined in the first round of analysis by approximately two thirds, to just 89
reactions.⁶
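The three exclusion criteria can be written as a single filter over the five Ct values of a reaction. This is a minimal sketch, assuming the coefficient of variation is the sample standard deviation divided by the mean, expressed as a percentage; the text does not say which form of the standard deviation was used. The function names and example Ct values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean (assumed definition)."""
    return 100.0 * stdev(values) / mean(values)


def passes_phase_one(ct_values, low=30.0, high=35.0, min_cov=5.0):
    """Apply the three exclusion criteria to one reaction's Ct values."""
    if all(ct >= high for ct in ct_values):
        return False  # largely unmethylated in every sample
    if all(ct < low for ct in ct_values):
        return False  # largely methylated in every sample
    if coefficient_of_variation(ct_values) < min_cov:
        return False  # too little variation across samples
    return True


# Hypothetical Ct profiles for four reactions across five samples.
examples = {
    "unmethylated everywhere": [36.0, 37.0, 38.0, 36.0, 40.0],
    "methylated everywhere": [25.0, 26.0, 27.0, 28.0, 29.0],
    "little variation": [31.0, 31.5, 32.0, 31.0, 31.5],
    "variable": [28.0, 33.0, 36.0, 30.0, 38.0],
}
retained = [name for name, cts in examples.items() if passes_phase_one(cts)]
```

Only the last profile survives all three tests, mirroring the way the 262 reactions were reduced to the subset carried into the second phase.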
With the number of MethyLight reactions reduced, we embarked on the
second phase of analysis which involved testing each of the 89 MethyLight
reactions, identified above, in an expanded sample set of DLBCL that included
samples examined previously in the microarray experiments of Chapter Two. In
addition to those 89 reactions selected in the first phase of the analysis, we also
included the control MethyLight reaction HB-313 in the panel of MethyLight
reactions. The HB-313 reaction was performed in triplicate and results of the
individual HB-313 reactions were averaged prior to analysis.
⁶ Complete data sets will be made available in a forthcoming publication (Pike, B. L., Manuscript in
Preparation).
As before, the Ct value for each MethyLight reaction was determined in
each sample. Beyond this, we calculated the PMR for each reaction as previously
described (Materials and Methods). Briefly, the PMR value is a quantitative
measure of the amount of DNA methylation in a sample, compared to an in vitro
methylated reference. As stated above, the control reaction HB-313 was
completed in each of the samples analyzed as well as on an in vitro methylated
reference sample. HB-313 is a MethyLight reaction that is uniquely designed to
amplify a region of the Alu-repeat region found in humans (Weisenberger, Campan
et al. 2005). In this instance, the HB-313 reaction was done in triplicate on every
sample tested, as well as on an artificially methylated reference sample and the
results were averaged prior to analysis. Each of the 89 MethyLight reactions tested
was also completed in every sample analyzed as well as in the in vitro methylated
reference sample. Afterwards, the percentage of methylated reference molecules
(PMR) was calculated by taking the ratio for each MethyLight reaction to the
control reaction HB-313 (average of triplicate) in a sample and dividing this ratio
by that same ratio in the methylated reference and multiplying by 100 (Fig 3.2).
Because the PMR value is, as defined, a value that is calculated in reference to a
fully methylated control, one would not expect PMR values that are greater than
100, an indication that a sample is more methylated than the fully methylated
control. However, occasionally, PMR values greater than 100 do result from the
calculation. This may be due to experimental variation or to the fact that the “fully
methylated” control may not be completely methylated at some loci, despite efforts
to exhaustively methylate the reference (Trinh, Long et al. 2001). However,
because these measurements are not absolute in terms of the number of methylated
molecules and are, instead, simply values calculated in relation to a reference, PMR
values greater than 100 do not confound our analysis (Trinh, Long et al. 2001).
As in Chapter Two, each sample’s identity with respect to its membership in
one of the two subgroups (ABC-DLBCL or GCB-DLBCL) was blinded to our
laboratory prior to the final statistical analysis. After the PMR values were
calculated, the subtype classification (ABC-DLBCL or GCB-DLBCL) was
revealed for subsequent analysis. As before, in the microarray experiments of
Chapter Two, we first sought to determine if the overall amount of DNA
methylation was different between the ABC-DLBCL and GCB-DLBCL subtypes.
To do this, we calculated the average PMR value for each MethyLight reaction
across all of the samples in the ABC-DLBCL group as well as for those samples in
the GCB-DLBCL group. On the whole, the amount of methylation between the
two subtypes was not statistically different (data not shown). This is evident when
comparing the distribution of average PMR values across the two subgroups (Fig.
3.3). To do this, we calculated the frequency with which a given PMR value (average PMR)
occurred in each of the two subgroups (Fig. 3.3). Roughly two thirds of those 89
MethyLight reactions tested (excluding control replicates HB-313) had an average
PMR value of <30% in each of the two subtypes, 59 total markers in each group.
While these lymphomas generally displayed low to moderate amounts of
methylation at the sites examined, a relatively small fraction of those MethyLight
reactions examined showed substantial methylation, displaying an average PMR
value of >50% in the GCB-DLBCL and ABC-DLBCL lymphoma subtypes (Table
3.1). There were five MethyLight reactions that had an average PMR value of
>50% in the GCB-DLBCL samples. These MethyLight reactions were proximal to
the genes: AR, IGF2, CYP27B1, GATA4, and CDKN1C. The reactions proximal to
genes AR and IGF2 were also found to have an average PMR >50% in the ABC-
DLBCL subgroup. However, it is worth mentioning that AR resides on the X-
chromosome and displays a normal pattern of DNA methylation as a result of X-
inactivation in women (Yang, Chen et al. 2003). In addition, there were five
MethyLight reactions that demonstrated an average PMR >50% in the ABC-
DLBCL that did not meet the >50% threshold in the GCB-DLBCL samples.
These five reactions were proximal to four genes: SCGB3A1, DLC1, GDNF, and
ONECUT2, which was interrogated by two reactions (Table 3.1).
Figure 3.3: Frequency and Distribution of PMR Values
Figure 3.3: The average PMR value was calculated for both the ABC and the GCB-
DLBCL subtypes. Those PMR values were then sorted into the bins along the x-axis. The
y-axis shows the number of clones that had PMR values within a given range.
Table 3.1: A total of five MethyLight reactions (Listed by ID number) displayed an average PMR value (AVG) in
the ABC-DLBCL samples of >50% (Top Panel). Whereas, three MethyLight reactions displayed this same level of
methylation in the GCB-DLBCL samples (Bottom Panel). Two MethyLight reactions (HB-249 and HB-318) gave
an average PMR value of >50% in both subgroups (Shaded Panel).
Table 3.1: Heavily Methylated MethyLight Markers
As in Chapter Two, we calculated a T-statistic (assuming unequal variance
between the two groups) for each of the MethyLight assays, as well as a p-value
using permutation tests (Wilcoxon) where the variable denoting group membership
(ABC-DLBCL vs. GCB-DLBCL) was permuted 10,000 times (Table 3.2). This
was done in an effort to identify individual MethyLight reactions that displayed
statistically significant differences in methylation between the two subtypes. In
addition, we also calculated a Receiver Operating Characteristic AUC value for
each of the reactions to determine the ability of a given MethyLight reaction to
distinguish between the DLBCL subtypes (Table 3.2).
Remarkably, the MethyLight reaction that was most able to distinguish
between the two DLBCL subgroups was proximal to ONECUT2, the same gene
that most strongly predicted subgroup affiliation (ABC-DLBCL or GCB-DLBCL)
in the CpG island microarray experiments of Chapter Two. This result is that much
more surprising considering that the two platforms (MethyLight and CpG island
microarray) had very little overlap with respect to the genomic regions each
analyzed. In fact, fewer than 3% of the gene-associated CpG islands on the
microarray were independently examined by MethyLight (Fig 3.4).
There were two different MethyLight reactions in the panel of 89 that
assayed for the methylation of ONECUT2 (HB-242 and HB-243) (Table 3.2). Both
reactions consistently showed that ABC-DLBCL have a higher degree of
methylation than GCB-DLBCL (HB-243 p = 0.0045 and HB-242 p = 0.0553). For
HB-243, the average PMR value for GCB-DLBCL was 31% and double that
amount for ABC-DLBCL (62%). This same trend was also seen for the other
ONECUT2 MethyLight reaction (HB-242). The average PMR value for HB-242
was 34% in GCB-DLBCL and 59% in ABC-DLBCL. In addition, the ability of
both HB-243 and HB-242 to correctly identify each member of the subtype was
quite good with Receiver Operating Characteristic AUC values of 0.89 and 0.71,
respectively (Table 3.2). While both ONECUT2 assays (HB-243 and HB-242)
yielded relatively strong, uncorrected p-values (0.005 and 0.055, respectively) with
respect to their ability to discriminate between the DLBCL subgroups, it should be
noted that neither of the MethyLight reactions produced p-values that were
sufficient to meet the Benjamini-Hochberg multiple test correction threshold
(thresholds of 0.001 for HB-243 and 0.008 for HB-242). However, taking into consideration that
the microarray experiments of Chapter Two independently identified ONECUT2 as
being differentially methylated, these results strongly indicate that ONECUT2
could prove to be a predictor of DLBCL subgroup affiliation. Thus, these results
warrant further investigation into the possibility that ONECUT2 may be
differentially methylated between the ABC-DLBCL and GCB-DLBCL subtypes.
Other MethyLight reactions also showed some promise at having the ability to
distinguish between the two DLBCL subtypes. In order of increasing p-value, they
are: CYP27B1 (p=0.0066), MINT31 (p=0.0357), MTHFR (p=0.0387) and KL
(p=0.0434) (Table 3.2).
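The Benjamini-Hochberg comparison can be made concrete with a short sketch. It assumes the per-rank threshold is FDR × rank / number of tests, with 89 tests and a 10% FDR; under that assumption the computed values match the BH column for the rows of Table 3.2 shown here (0.001 at rank 1, 0.008 at rank 7). The function names are hypothetical, and the four p-values in the last example are invented.

```python
def bh_thresholds(n_tests, fdr=0.10):
    """Threshold for the p-value of rank k (1-based) is fdr * k / n_tests."""
    return [fdr * rank / n_tests for rank in range(1, n_tests + 1)]


def bh_significant(p_values, fdr=0.10):
    """Step-up rule: find the largest rank whose p-value meets its threshold
    and return the indices of every test ranked at or below it."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= fdr * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


thresholds = bh_thresholds(89)

# HB-243 (rank 1, p = 0.0045) and HB-242 (rank 7, p = 0.0553) against their thresholds.
hb243_meets_threshold = 0.0045 <= thresholds[0]
hb242_meets_threshold = 0.0553 <= thresholds[6]

# Invented p-values showing the step-up rule on a small set of four tests.
toy_significant = bh_significant([0.001, 0.04, 0.03, 0.5])
```

Both ONECUT2 p-values exceed their thresholds under this calculation, consistent with the statement that neither reaction met the multiple test correction.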
Figure 3.4: MethyLight and CpG Island Microarray Overlap
Figure 3.4: The 592 CpG island clones in the microarray
experiments of Chapter Two were proximal to 473 unique genes.
Likewise, the 262 MethyLight assays examined in this chapter
interrogated CpG islands proximal to 216 different genes. Only 12 of
these genes were in common between the MethyLight and CpG Island
Microarray platforms.
ID GENE ABC GCB TTEST WTEST AUC BH
HB-243 ONECUT2 61.5 30.6 0.005 0.011 0.89 0.001
HB-223 CYP27B1 16.5 62.0 0.007 0.005 0.93 0.002
HB-162 MINT31 6.1 24.9 0.036 0.048 0.80 0.003
HB-058 MTHFR 28.8 44.3 0.039 0.072 0.78 0.004
HB-175 KL 11.2 25.2 0.043 0.043 0.81 0.006
HB-215 HLA-G 42.5 31.5 0.055 0.184 0.71 0.007
HB-242 ONECUT2 59.3 34.4 0.055 0.185 0.71 0.008
HB-160 MGMT 23.8 4.9 0.062 0.062 0.76 0.009
HB-261 NEUROG1 31.8 17.5 0.063 0.072 0.78 0.010
HB-195 CDX1 24.8 39.8 0.073 0.090 0.76 0.011
HB-255 GAD1 0.0 13.6 0.091 0.120 0.67 0.012
HB-252 DRD1 42.6 28.2 0.114 0.124 0.74 0.013
HB-327 GATA3 47.7 28.9 0.143 0.123 0.74 0.015
HB-219 LDLR 32.3 16.0 0.148 0.223 0.69 0.016
HB-222 GDNF 58.7 42.9 0.154 0.633 0.58 0.017
HB-310 CLDN7 17.8 31.4 0.158 0.185 0.71 0.018
HB-206 MT2A 0.3 11.2 0.159 0.095 0.74 0.019
HB-254 GABRA2 36.7 26.2 0.168 0.203 0.70 0.020
HB-146 CCND1 0.0 4.1 0.181 0.232 0.61 0.021
HB-253 DRD2 30.8 45.3 0.198 0.266 0.67 0.022
HB-324 GATA4 39.7 55.4 0.227 0.243 0.68 0.024
HB-173 CDKN2B 3.1 9.1 0.227 0.319 0.65 0.025
HB-184 SEZ6L 7.9 16.1 0.229 0.392 0.63 0.026
HB-260 NEUROD2 36.0 48.9 0.238 0.244 0.68 0.027
HB-204 MT1G 13.1 21.7 0.239 0.138 0.73 0.028
HB-166 CALCA 20.8 12.7 0.241 0.204 0.70 0.029
HB-361 TFPI2 28.2 13.3 0.247 0.390 0.63 0.030
HB-244 TFF1 20.5 28.5 0.253 0.594 0.59 0.031
HB-050 CDH1 17.8 28.3 0.255 0.244 0.68 0.033
Table 3.2: MethyLight Reactions Summary Statistics
HB-171 CDH1 23.2 36.6 0.264 0.264 0.67 0.034
HB-169 PGR 42.1 32.9 0.264 0.185 0.71 0.035
HB-044 RASSF1 0.1 2.8 0.285 0.232 0.61 0.036
HB-318 IGF2 51.5 72.5 0.291 0.185 0.71 0.037
HB-187 MINT2 27.1 12.6 0.295 0.276 0.67 0.038
HB-266 APP 22.8 14.1 0.302 0.310 0.66 0.039
HB-079 SLC6A20 25.6 9.6 0.303 0.447 0.62 0.040
HB-154 MYOD1 36.5 30.6 0.306 0.672 0.57 0.042
HB-168 HIC1 4.8 16.0 0.313 0.265 0.67 0.043
HB-239 CYP1B1 13.6 23.1 0.318 0.672 0.57 0.044
HB-145 TFF1 34.5 45.9 0.319 0.536 0.60 0.045
HB-068 VDR 2.7 7.6 0.321 0.080 0.75 0.046
HB-064 SCAM-1 0.0 5.5 0.326 0.232 0.61 0.047
HB-185 RBP1 31.8 20.9 0.329 0.525 0.60 0.048
HB-232 ERBB2 24.3 36.7 0.344 0.314 0.66 0.049
HB-203 JUP 14.9 23.2 0.346 0.220 0.69 0.051
HB-231 PSAT1 15.1 6.1 0.357 0.670 0.56 0.052
HB-149 PGR 25.5 19.1 0.362 0.396 0.63 0.053
HB-357 GDF15 10.9 14.4 0.370 0.708 0.56 0.054
HB-225 DLEC1 0.9 5.7 0.373 0.303 0.63 0.055
HB-268 HOXA1 0.3 1.5 0.378 0.486 0.59 0.056
HB-194 SCGB3A1 55.0 42.5 0.393 0.458 0.62 0.057
HB-281 SFRP4 19.1 27.8 0.393 0.524 0.60 0.058
HB-348 PTPN6 2.8 3.6 0.401 0.627 0.58 0.060
HB-047 TWIST1 30.9 23.0 0.422 0.669 0.57 0.061
HB-323 GATA4 20.0 28.7 0.425 0.525 0.60 0.062
HB-387 SAT 18.0 10.3 0.452 0.658 0.57 0.063
HB-085 SASH1 26.6 21.8 0.466 0.289 0.67 0.064
HB-218 DLC1 56.8 49.1 0.488 0.368 0.64 0.065
HB-320 NKD2 36.6 23.6 0.556 1.000 0.51 0.066
HB-164 ESR1 13.6 17.5 0.557 0.455 0.62 0.067
HB-258 BDNF 20.2 13.7 0.570 0.828 0.54 0.069
HB-230 CDKN1A 2.3 0.9 0.572 0.578 0.57 0.070
HB-197 CRABP1 26.5 31.7 0.582 0.750 0.56 0.071
HB-075 CDH13 23.2 18.7 0.599 0.596 0.59 0.072
HB-069 IGSF4 24.4 30.6 0.603 0.747 0.56 0.073
HB-054 CXADR 1.8 4.2 0.618 1.000 0.51 0.074
HB-193 COL1A2 21.2 25.6 0.635 0.832 0.54 0.075
HB-040 CCND2 2.8 5.4 0.635 0.303 0.63 0.076
HB-259 NEUROD1 38.0 33.4 0.645 0.832 0.54 0.078
HB-048 DNAJC15 2.4 1.7 0.672 0.373 0.63 0.079
HB-236 PITX2 37.3 43.8 0.693 0.710 0.56 0.080
HB-328 CDKN1C 45.9 54.2 0.695 0.671 0.57 0.081
HB-176 RARB 21.3 17.8 0.697 0.633 0.58 0.082
HB-247 THBS1 6.4 9.1 0.720 0.683 0.56 0.083
HB-060 PPARG 14.0 17.2 0.731 0.912 0.52 0.084
HB-315 SMAD9 14.5 17.7 0.747 0.667 0.57 0.085
HB-363 BNIP3 24.2 19.9 0.765 0.868 0.53 0.087
HB-159 MGMT 11.8 10.7 0.791 1.000 0.51 0.088
HB-364 ZFP64 41.0 38.6 0.794 0.791 0.55 0.089
HB-282 SFRP5 24.0 28.0 0.811 0.364 0.64 0.090
HB-198 CXX1 32.9 37.5 0.814 1.000 0.51 0.091
HB-249 AR 50.3 52.8 0.827 0.791 0.55 0.092
HB-207 MT3 17.8 20.0 0.835 0.957 0.52 0.093
HB-322 RARRES1 30.2 31.8 0.917 0.873 0.53 0.094
HB-329 CDKN1C 37.3 38.1 0.942 0.918 0.52 0.096
HB-250 GRIN2B 40.0 40.7 0.948 0.874 0.53 0.097
HB-070 LTB4R 5.9 5.7 0.950 0.957 0.52 0.098
HB-081 CDKN2A 10.1 10.6 0.956 0.658 0.57 0.099
HB-078 CYP1B1 7.3 7.3 0.992 1.000 0.51 0.100
Table 3.2: Summary statistics of 89 MethyLight reactions ranked by t-test
p-value. ID = MethyLight reaction ID. GENE = gene proximal to the
MethyLight reaction. ABC and GCB = average PMR values of the
respective subtypes. TTEST = uncorrected Student’s t-test p-value.
WTEST = Wilcoxon permutation test p-value. AUC = area under the ROC
curve. BH = Benjamini-Hochberg threshold.
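The BH column of Table 3.2 depends only on the rank of each reaction. The following is a minimal sketch, assuming the standard Benjamini-Hochberg step-up procedure and a false discovery rate of q = 0.10. The value of q is an inference from the table (rank 89 of 89 has a threshold of 0.100), not something stated in the text, and the function names are illustrative.

```python
def bh_thresholds(m, q):
    """Return the Benjamini-Hochberg threshold (rank / m) * q for ranks 1..m."""
    return [(rank / m) * q for rank in range(1, m + 1)]


def bh_significant(pvalues, q):
    """Step-up BH procedure: indices (into pvalues) that are declared significant."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Remember the largest rank whose p-value is at or below its threshold.
        if pvalues[idx] <= (rank / m) * q:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Thresholds for 89 reactions at the assumed q = 0.10, next to the five
# smallest TTEST p-values transcribed from Table 3.2.
thresholds = bh_thresholds(89, 0.10)
top_five = [0.005, 0.007, 0.036, 0.039, 0.043]
for rank, (p, t) in enumerate(zip(top_five, thresholds), start=1):
    print(rank, p, round(t, 3), p <= t)
```

Under these assumptions the threshold for the top-ranked reaction is roughly 0.001, which is far stricter than the uncorrected 0.05 cutoff used in the text.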
There is an inverse relationship between the Cy3/Cy5 ratios of the CpG
island microarray experiments of Chapter Two and the PMR values of the
MethyLight assays presented here. In each CpG island microarray experiment, the
smaller the Cy3/Cy5 ratio (value < 1.0) the greater the methylation of the sample.
In contrast, the PMR values of MethyLight increase with methylation. There were
ten samples (four ABC-DLBCL and six GCB-DLBCL) for which there was both
MethyLight data and CpG island microarray data (Chapter 2) for the ONECUT2
gene. For each sample there were three data values: a Cy3/Cy5 ratio from the CpG
island microarray experiment and two PMR values, one for each MethyLight assay
(HB-242 and HB-243). To determine the relationship between the MethyLight
PMR values and the Cy3/Cy5 ratios of the CpG island microarray data, we
compared the Cy3/Cy5 ratio of ONECUT2 (clone 13.F3), with the PMR values of
both HB-242 and HB-243 (Fig. 3.5). While the concordance between the
MethyLight data and the data from the CpG island microarray is imperfect, the
comparison demonstrates the general inverse relationship between the platforms
(Fig. 3.5). There are, however, obvious differences in the data generated by
the two platforms. For example, the CpG island microarray data suggest that
sample 9323 is practically unmethylated, with a Cy3/Cy5 value approaching
1.0 (Fig. 3.5). In contrast, each of the two MethyLight assays (HB-242 and HB-243)
scores this sample as substantially methylated, with PMR values of 44 (HB-242)
and 52 (HB-243). These discrepancies may be explained, at least in part,
by differences in the assays themselves. For example, McrBC
(5’-Pu^mC(N_{40-3000})Pu^mC-3’) requires that there be just two 5-methylcytosines located
between 40 and 3,000 nucleotides away from each other in order for it to cleave the
DNA. However, this enzyme is unable to discriminate between DNA with
extensive methylation versus DNA with relatively little methylation. While this
broad substrate specificity makes this enzyme an attractive tool in global methods
of analysis, it also hampers its ability to tease out fine differences in DNA
methylation. In contrast, MethyLight assays are highly dependent on the
methylation pattern of a particular sequence.
Another possible explanation for the apparent discrepancies between the
CpG island microarray and MethyLight data sets may have to do with the
sequences being evaluated. The clone on the CpG island microarray that
interrogates the methylation status of ONECUT2 lies within the first intron of the
gene, nearly 3.5-kb downstream from the transcriptional start site. The MethyLight
assay HB-242, on the other hand, is upstream of the transcriptional start site of
ONECUT2 by less than 500-bp, while the HB-243 assay interrogates a region in
between the clone from the microarray and HB-242. Therefore, a simple
explanation for the apparent discrepancies seen between the MethyLight data and
the CpG island microarray data may be that the assays are merely assaying for
DNA methylation at different sites in the genome.
Figure 3.5: ONECUT2: Microarray vs. MethyLight
[Scatter plot: x-axis = Microarray Cy3/Cy5 Ratios (0.75 to 1.00); y-axis = MethyLight PMR Values (0 to 100)]
Linear (HB-243): y = -221.98x + 240.46, R² = 0.3525
Linear (HB-242): y = -160.05x + 181.9, R² = 0.2506
Figure 3.5: Plotted above is the relationship between the MethyLight PMR values
and the CpG island microarray Cy3/Cy5 ratios for the gene ONECUT2. There
were 10 samples that had both microarray and MethyLight data for ONECUT2
(see table below). The trend line for each data set (HB-242 and HB-243) is also
highlighted along with the equation of the line and the R² value.
SAMPLE SUBTYPE Cy3/Cy5 HB-243 HB-242
13112 ABC 0.84 82 55
6328 ABC 0.81 56 40
12967 ABC 0.85 77 71
9251 ABC 0.80 39 49
10618 GCB 0.97 0 18
8578 GCB 0.91 29 1
9284 GCB 1.00 42 2
4427 GCB 0.96 21 23
9323 GCB 0.97 44 52
5063 GCB 0.95 0 54
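The trend lines in Figure 3.5 are ordinary least-squares fits of PMR against the Cy3/Cy5 ratio. The sketch below recomputes them from the ten tabulated samples. It assumes the tabulated values are the plotted ones, and the helper name `linear_fit` is illustrative rather than anything used in the original analysis.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept; also returns R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Values transcribed from the Figure 3.5 data table (ratios rounded to two decimals).
cy3_cy5 = [0.84, 0.81, 0.85, 0.80, 0.97, 0.91, 1.00, 0.96, 0.97, 0.95]
pmr_hb243 = [82, 56, 77, 39, 0, 29, 42, 21, 44, 0]
pmr_hb242 = [55, 40, 71, 49, 18, 1, 2, 23, 52, 54]

for name, pmr in (("HB-243", pmr_hb243), ("HB-242", pmr_hb242)):
    slope, intercept, r2 = linear_fit(cy3_cy5, pmr)
    print(f"{name}: y = {slope:.2f}x + {intercept:.2f}, R^2 = {r2:.4f}")
```

Because the tabulated ratios are rounded, the fitted coefficients should be close to, but not identical with, those printed on the figure. Both slopes are negative, in line with the inverse relationship described in the text.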
As mentioned, the samples used in our methylation analyses were
previously examined by gene expression analysis (Rosenwald, Wright et al. 2002).
Because the hypermethylation of CpG islands is associated with the transcriptional
silencing of neighboring genes, we were interested in determining whether or not
the DNA methylation events we identified are related to the expression levels of
proximal genes. To make these comparisons, we relied upon the previously
published expression data for those samples analyzed (Rosenwald, Wright et al.
2002). Of the 89 MethyLight markers examined in the second phase of analysis,
we identified 23 MethyLight reactions for which there was also expression data.
Generally, we found no correlation between DNA methylation status and the
expression of the genes proximal to those reactions. For example, of the samples
analyzed, there were 14 for which there was both expression data (cDNA clone
ID=26576 (Rosenwald, Wright et al. 2002)) and MethyLight data (HB-160) for the
gene MGMT (Fig. 3.6). The PMR values of the MethyLight reaction HB-160 in
these samples ranged from 0 to 54 and the Cy3/Cy5 expression ratio in these
samples ranged from 0.78 to 2.75. Again, the PMR value increases with
methylation. Likewise, increasing Cy3/Cy5 expression ratios indicate higher
expression in a given sample relative to a control. If methylation is correlated with
lower levels of expression, one would expect the PMR value to increase as the
Cy3/Cy5 ratio decreases. In contrast, as shown in Figure 3.6, samples with limited
methylation (PMR values <10) differ considerably in expression with Cy3/Cy5
ratios that range between 1.0 and 2.5. In addition, the sample with the greatest
level of methylation (PMR >50) also has one of the highest levels of expression,
which is counter to expectations if DNA methylation is causally related to gene
silencing. The remaining 22 comparisons show a similar lack of correlation
between DNA methylation status and gene expression (Appendix F).
In summary, a screen of more than 250 MethyLight reactions identified a
number of reactions that demonstrated a moderate degree of methylation in the
DLBCL tested. Five of the MethyLight reactions examined (HB-243, HB-223,
HB-162, HB-058, and HB-175) showed differential methylation between the
ABC-DLBCL and GCB-DLBCL subtypes (p-value < 0.05). Surprisingly, the
MethyLight reaction that was best able to distinguish between the DLBCL subtypes
was HB-243 (p = 0.005), a reaction proximal to the gene ONECUT2, the same gene
identified as being differentially methylated in the CpG island microarray
experiments in Chapter Two. Furthermore, we found that the CpG island
microarray experiments generally agreed with the MethyLight data generated for
the gene ONECUT2. Interestingly, however, we found that DNA methylation
levels did not correlate with gene expression.
Figure 3.6: Expression vs. Methylation
[Scatter plot: MGMT, MethyLight HB-160 vs. Expression Clone 26576. x-axis = PMR Value (0 to 60); y-axis = Expression Cy3/Cy5 Ratio (0.00 to 3.00)]
Trend line: y = 0.0039x + 1.6005, R² = 0.0099
Figure 3.6: The PMR values from the MethyLight analysis are plotted on
the x-axis and the Cy3/Cy5 ratio from published expression data is plotted
on the y-axis. The equation of the line and the R² value of the trend line are
also given.
CHAPTER 4: Confirmation Analysis of DNA Methylation
As previously outlined in Chapters Two and Three, we utilized two
different technologies to screen for DNA methylation differences in ABC-DLBCL
and GCB-DLBCL. The first employed the use of CpG island microarrays to
characterize DNA methylation specific libraries for each sample analyzed. This
permitted us to examine nearly 600 non-repetitive, gene-associated CpG islands,
and identified candidate CpG islands that were able to distinguish between the two
lymphoma subtypes mentioned, namely ABC-DLBCL and GCB-DLBCL.
Secondly, using a quantitative, fluorescence-based PCR technique, MethyLight, we
analyzed another 262 sites for DNA methylation in the two lymphoma subtypes.
Together, the two methods identified a number of methylated regions in the
genome, and suggested that a fraction of those CpG islands tested have the ability
to discriminate between the ABC-DLBCL and GCB-DLBCL subtypes. Here,
building upon the work in Chapters Two and Three, we provide a more in-depth
analysis of candidate differentially methylated CpG islands and attempt to confirm
our prior results.
In the previous two chapters, our attempts to identify DNA methylation
differences in lymphoma focused exclusively on a relatively small sample set of
DLBCL. In this expanded analysis, we analyzed 26 samples, which include all
but one of the samples previously analyzed in Chapter Three [7]. Because
MethyLight is a proven, highly-sensitive technique in the analysis of DNA
methylation, we elected to use this method as the primary means to confirm our
previous results in the expanded sample set. In particular, the power of MethyLight
lies in its ability to rapidly assay for DNA methylation at a moderate number of
sites in a relatively large sample set.
First, we focused our attention on those CpG island clones from the
microarray that displayed a moderate difference in methylation between the ABC-
DLBCL and GCB-DLBCL subtypes. This required that an individual MethyLight
assay be designed for each CpG island clone of interest. Ultimately, we designed
new assays for clones on the CpG island microarray that had promising p-values
and measurable overall differences in DNA methylation between the two
lymphoma subtypes analyzed. In total, the results from the microarray led to the
successful design of 17 new MethyLight assays (Table 4.1). Beyond this, a brief
survey of the literature led us to design two additional MethyLight assays. After a
re-analysis of the gene expression data published by Shipp et al. (Shipp, Ross et al.
2002) we identified genes that we hypothesized may be silenced by DNA
methylation (data not shown). To test this hypothesis, we designed assays for
FAM3C and SPINK2 (Table 4.1). In each case, the gene has an associated CpG
island and displayed a virtually undetectable level of expression in ABC-DLBCL
subtype lymphomas compared to samples of the GCB-DLBCL subtype.
Furthermore, these two genes had been among those highlighted by Wright et al. as
being strong predictors of DLBCL subtype based on gene expression (Wright, Tan
et al. 2003). In addition to these newly designed assays, we also had an interest in
re-examining those MethyLight assays that were best able to distinguish between
the ABC-DLBCL and GCB-DLBCL subtypes. Of the 262 assays originally
surveyed, we chose to repeat the ten MethyLight assays that yielded the most
robust p-values (Table 3.2) on the expanded set of 26 samples. The most
promising result to come out of our previous analyses involved the differential
methylation of ONECUT2. The CpG island microarray and MethyLight analyses
both point to this gene as having the potential to discriminate between the DLBCL
subtypes. However, in the course of our analysis we discovered that one of the
MethyLight assays for this gene (HB-243) has a primer that contains mismatches
with the bisulfite-converted template. To remedy this, a replacement assay was
designed (HB-446) and included in all subsequent analyses.
[7] Due to a technical problem, GCB-DLBCL sample number 4427 was not analyzed in this stage of
analysis.
Between the newly designed MethyLight assays and those selected for re-
analysis based on the earlier MethyLight survey, we examined a total of 30
MethyLight markers in 26 DLBCL (Table 4.1). Importantly, CpG islands proximal
to the genes ONECUT2 and FLJ21062 continued to demonstrate differential
methylation between the ABC-DLBCL and GCB-DLBCL subtypes.
Table 4.1: List of DLBCL and MethyLight Reactions
DLBCL samples (SAMPLE):
ABC-DLBCL: 6328, 3161, 3735, 4605, 8597, 9251, 11578, 11912, 12967, 13112
GCB-DLBCL: 3442, 5063, 5366, 8578, 8773, 9284, 9323, 11251, 10618, 11066, 11224, 12243, 13247, 13621, 13822, 15843
MethyLight reactions:
MethyLight ID GENE
HB-058 MTHFR
HB-160 MGMT
HB-162 MINT31
HB-175 KL
HB-195 CDX1
HB-215 HLA-G
HB-223 CYP27B1
HB-242 ONECUT2
HB-243 ONECUT2
HB-261 NEUROG1
HB-424 PFDN5
HB-425 PFDN5
HB-426 GNMT
HB-427 CPVL
HB-428 CENPH
HB-429 SLC38A4
HB-430 SLC38A4
HB-431 ZNF615
HB-433 GTF2A2
HB-435 WDR33
HB-437 ARPP-19
HB-438 VPS52
HB-439 C9orf64
HB-440 HOXC9
HB-441 AK131409
HB-442 FLJ21062
HB-443 TP53I11
HB-444 FAM3C
HB-445 SPINK2
HB-446 ONECUT2
Table 4.1: In this final stage of analysis, 26 DLBCL were examined
for DNA methylation at 30 sites in the genome. The lymphoma
samples (SAMPLE) are separated by subtype (ABC-DLBCL or GCB-DLBCL) and the
individual MethyLight reactions are sorted numerically by reaction (ID). The
proximal GENE for each reaction is also given. Reactions HB-058 through
HB-261 were previously examined in Chapter 3. Reactions HB-424 through
HB-443 were designed based on the CpG island microarray results in Chapter
2. HB-444 and HB-445 were designed based on available expression data,
and HB-446 is a re-design of HB-243. See Appendices C, D and E for detailed
assay information.
MATERIALS AND METHODS:
MethyLight Assay Design
New MethyLight assays were designed based on in silico bisulfite-converted
DNA sequence using the TaqMan primer/probe parameters in the Primer
Express program from Applied Biosystems Incorporated. Attempts were made to
ensure that both the primers and the internal probe contained CpG dinucleotides in
the 3’ end. Stretches of mononucleotide repeats were also avoided to add
specificity to each primer/probe set. A number of rules governed the design of each
probe as well. For one, probes were designed to be less than 31 nucleotides in
length. Additionally, each probe contains fewer guanines than
cytosines, and the 5’ end must not be a guanine. Furthermore, to add
specificity, attempts were made to include, in each probe and primer, a number of
bisulfite conversion-specific events (C ⇒ U) at non-CpG cytosines.
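The design steps above can be sketched in a few lines. This is an illustrative sketch only, not the Primer Express workflow: it assumes every CpG is methylated (so CpG cytosines survive conversion), treats a single strand, and checks only the three probe rules stated in the text. The function names and the example probe are hypothetical.

```python
def bisulfite_convert(seq):
    """In silico bisulfite conversion of one strand, assuming every CpG is methylated.

    Cytosines followed by G are kept as C; all other cytosines are read as T
    (C => U in the converted DNA, which is amplified as T).
    """
    seq = seq.upper()
    out = []
    for i, base in enumerate(seq):
        is_cpg = base == "C" and i + 1 < len(seq) and seq[i + 1] == "G"
        out.append("T" if base == "C" and not is_cpg else base)
    return "".join(out)


def probe_passes_rules(probe):
    """Check the probe rules from the text: <31 nt, fewer G than C, no 5' G."""
    probe = probe.upper()
    return (
        len(probe) < 31
        and probe.count("G") < probe.count("C")
        and not probe.startswith("G")
    )


# First 22 bases of the unmodified sequence shown in Figure 4.2.
print(bisulfite_convert("caccactcccCGctacctacCG"))  # -> TATTATTTTTCGTTATTTATCG
# Hypothetical probe sequence, used only to exercise the rule check.
print(probe_passes_rules("ACCGAACCTACG"))
```

The converted output for the Figure 4.2 fragment matches the start of the bisulfite-converted sequence printed in that figure.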
Bisulfite Conversion and MethyLight Analysis
Each sample was subjected to sodium bisulfite conversion and MethyLight
essentially as described in Chapter Three. Briefly, samples were converted by
bisulfite chemistry at 50 °C for 12-15 hours and purified on a
QIAamp Viral RNA Column (Qiagen) as described. Following purification, MethyLight assays
were performed as previously described (Chapter 3) (Eads, Danenberg et al. 2000;
Trinh, Long et al. 2001; Ehrlich, Jiang et al. 2002; Weisenberger, Campan et al.
2005). Data from each MethyLight reaction were captured on an MJ Research
Opticon Continuous Fluorescence Detection System (BioRad), and expressed as a
Ct (Cycle Threshold) value. From these Ct values, the percentage of fully
methylated molecules (PMR (Eads, Lord et al. 2001; Trinh, Long et al. 2001;
Weisenberger, Campan et al. 2005)) was calculated as described. As in Chapter
Three, the control reaction HB-313 was used to control for DNA input.
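This section does not spell out the PMR formula, so the following is only a hedged sketch. It assumes the ratio-of-ratios form commonly used with MethyLight: the target-to-control ratio in the sample, divided by the same ratio in a fully methylated reference DNA, times 100. It also assumes perfect amplification efficiency (a doubling per cycle) in place of a standard curve. The function name, its parameters, and the Ct values are hypothetical.

```python
def pmr_from_ct(ct_target_sample, ct_control_sample,
                ct_target_ref, ct_control_ref, efficiency=2.0):
    """Approximate PMR from Ct values (assumption: ideal, equal efficiencies).

    'control' is the input-normalization reaction (HB-313 in the text);
    'ref' is a fully methylated reference DNA, which is assumed here.
    """
    # A lower Ct means more template, so quantity scales as efficiency ** (-Ct).
    sample_ratio = efficiency ** (ct_control_sample - ct_target_sample)
    ref_ratio = efficiency ** (ct_control_ref - ct_target_ref)
    return 100.0 * sample_ratio / ref_ratio


# Hypothetical Ct values, for illustration only.
print(pmr_from_ct(30, 25, 28, 24))  # -> 50.0
```

Because PMR is a ratio against a reference rather than a bounded fraction, values slightly above 100 can arise under this definition.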
Bisulfite Sequencing:
Bisulfite sequencing primers were designed using the internet-based application
MethPrimer (Li and Dahiya 2002). PCR primers were designed to amplify a region
of the genome that overlaps with the MethyLight assay for FLJ21062 (HB-442).
Deliberate attempts were made to locate each primer in regions that were devoid of
CpG dinucleotides, thus ensuring amplification after bisulfite treatment regardless
of methylation status. The genomic region of interest was amplified in each sample
using standard PCR conditions. Following amplification, PCR products were
cloned into the TOPO-TA 2.1 plasmid (Invitrogen), according to the
manufacturer’s protocol. Individual colonies were then sequenced using traditional
dideoxy-chain termination chemistry and vector specific M13 forward and reverse
primers.
RESULTS:
The analyses conducted in the previous two chapters were designed to
identify DNA methylation differences between the ABC-DLBCL and GCB-
DLBCL subtypes. Importantly, in a relatively small number of samples, the data
suggested that the ABC-DLBCL and GCB-DLBCL subtypes do differ with respect
to DNA methylation at a limited number of sites in the genome. Here, in an
expanded sample set of 26 lymphoma, we conducted a more in-depth analysis of
the candidate methylation events identified in the preceding chapters.
Based on our earlier MethyLight surveys in Chapter Three, we selected 10
previously screened MethyLight assays for validation in a larger set of samples.
However, in the analysis phase of the original MethyLight data set (Chapter 3), we
discovered that one of the MethyLight assays (HB-243) was incorrectly designed.
Thus, we re-designed the assay to correct the mistake. This corrected assay (HB-
446) was also included in our expanded survey of 26 samples. Beyond this, we
successfully designed 17 new MethyLight assays to follow up on sites that
demonstrated a moderate degree of differential methylation in the CpG island
microarray results of Chapter Two. Two additional MethyLight assays were also
designed for the genes FAM3C and SPINK2. Based on the re-analysis of
previously published expression data, we identified these two genes as being
candidates for differential methylation between the ABC-DLBCL and GCB-
DLBCL subtypes. Each had an expression pattern suggesting that it may be
differentially methylated in the ABC-DLBCL and GCB-DLBCL subtypes, since
each displayed moderate expression levels in one subtype and little to no expression
in the other subtype.
Many of the MethyLight assays tested showed substantial DNA methylation
across those 26 samples surveyed (Table 4.2 and Appendix G). Importantly, at
least two of the DNA methylation sites identified in the initial surveys highlighted
in Chapters Two and Three continued to demonstrate differential methylation
between the ABC-DLBCL and GCB-DLBCL subtypes. The CpG island proximal
to the hypothetical protein FLJ21062 (MethyLight assay HB-442) was completely
unmethylated in all of the ABC-DLBCL samples. This stands in marked contrast
to those GCB-DLBCL samples analyzed, which displayed an average PMR value
of 33. Likewise, the CpG island proximal to ONECUT2 (HB-242, HB-243, and
HB-446 [8]) also continued to show differential methylation between the lymphoma
subtypes. However, the trend was the opposite of that seen for the gene FLJ21062.
For example, the ONECUT2 assay HB-446 had an average PMR value of 101 in
the ABC-DLBCL subtype versus 44 in the GCB-DLBCL subtype. FLJ21062 was
originally identified as being differentially methylated in the CpG island
microarray experiments of Chapter Two. ONECUT2, on the other hand, was
independently identified by both the CpG island microarray experiments of Chapter
Two and the initial MethyLight survey outlined in Chapter Three. The present results
appear to confirm those earlier findings.
[8] MethyLight assays HB-243 and HB-446 interrogate the same sequence. However, HB-243 was
flawed in its original design. HB-446 is the corrected version of the original, flawed assay.
Encouraged by the results generated using the 30 MethyLight assays in the
expanded sample set (Table 4.2), we elected to repeat five of the most promising
MethyLight assays in the 26 samples (HB-195, HB-223, HB-242, HB-442, and
HB-446). By averaging the two replicates of these five MethyLight reactions in the
26 samples (Table 4.3), we aimed to account for experimental variation that may
exist in this system. Overall, the two technical replicates were quite similar
(Appendix H). After averaging the data for each replicate in every sample, we
calculated a p-value (uncorrected student’s t-test) to determine the significance of
the differential methylation between the ABC-DLBCL and GCB-DLBCL subtypes
(Table 4.3). Again, the methylation status of the genes FLJ21062 and ONECUT2
proved to be a strong discriminator between the ABC-DLBCL and GCB-DLBCL
subtypes. The single FLJ21062 assay (HB-442) had the most robust p-value (p =
0.0009).
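The replicate-averaging and t-test steps can be sketched as follows. This is a minimal sketch under stated assumptions: the text does not say which t-test variant was used, so the equal-variance (pooled) Student's form is assumed, and the PMR values are hypothetical because the per-sample data are in Appendix H. Turning the statistic into a p-value requires the t distribution, which is left to a statistics library.

```python
import math
from statistics import mean, variance


def average_replicates(rep1, rep2):
    """Average two technical replicates sample by sample."""
    return [(a + b) / 2 for a, b in zip(rep1, rep2)]


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance; returns (t, df)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / se, df


# Hypothetical PMR replicates for three ABC and three GCB samples.
abc = average_replicates([0, 2, 1], [2, 2, 5])
gcb = average_replicates([30, 41, 28], [34, 39, 20])
t_stat, dof = student_t(abc, gcb)
print(f"t = {t_stat:.3f}, df = {dof}")
```

Averaging before testing keeps one value per sample, so the technical replicates do not inflate the degrees of freedom.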
In this replicate analysis, there were two individual MethyLight assays that
examined the methylation status of ONECUT2. The ONECUT2 assay HB-242,
which lies upstream of the ONECUT2 start site, had an average PMR value of 87 in
the ABC-DLBCL subtype and 48 in the GCB-DLBCL subtype. While still
significant (p = 0.029), the p-value was more modest than the one for FLJ21062.
The second ONECUT2 assay, HB-446, is designed to interrogate a CpG dense
region located in the one and only intron of the gene. Again, the methylation of
HB-446 was higher in the ABC-DLBCL samples (Average PMR = 95) compared
to those GCB-DLBCL samples analyzed (Average PMR = 44). Of the two
ONECUT2 assays, HB-446 was best able to distinguish between the two DLBCL
subtypes with a p-value < 0.004 (Table 4.3).
The MethyLight reactions HB-195 and HB-223 were also completed in
duplicate on each of the 26 samples. HB-223 is proximal to the gene CYP27B1, a
member of the cytochrome P450 superfamily (Inouye and Sakaki 2001). While
CYP27B1 did display a greater degree of methylation in the GCB-DLBCL
(Average PMR = 51) relative to the ABC-DLBCL (Average PMR = 27), the
methylation difference between the two subgroups was not statistically significant
(p = 0.096) (Table 4.3). On the other hand, the MethyLight reaction HB-195
produced a marginally significant difference (p = 0.049) in methylation between
the ABC-DLBCL and GCB-DLBCL subtypes (Table 4.3). HB-195 is designed to
examine the methylation status of the CpG island proximal to the homeobox
containing transcription factor, CDX1 (Davidson and Zon 2006). Overall, the
methylation level of the CpG island proximal to CDX1 in GCB-DLBCL (Average
PMR = 67) was nearly double that seen in ABC-DLBCL (Average PMR = 37)
(Table 4.3).
CpG islands proximal to ONECUT2 and FLJ21062 were both originally
identified as being differentially methylated based on the results of the CpG island
microarray experiments of Chapter Two. These results appear to confirm those data.
However, a number of the MethyLight assays that were designed based on
promising results of the CpG island microarray yielded data that contradicted
those of the microarray. For instance, the microarray data suggested that the
CpG island clone proximal to ZNF615 (clone 52.D7) had a moderate degree of
DNA methylation in some of the samples (Table 2.3). However, the MethyLight
assay designed for this clone failed to demonstrate that the CpG island was
methylated (Figure 4.1). In fact, only one of the samples had a PMR value > 10.
Of the MethyLight assays considered, there were nine that overlapped with the
genomic region encompassed by the clone on the CpG island microarray
(Appendix I). Overall, the degree of concordance varied widely. This may
highlight the differences between the two techniques, as previously described in
Chapter Three.
To further confirm the MethyLight data, we conducted bisulfite sequencing
on the genomic region that overlaps the MethyLight assay HB-442, which is
proximal to the gene FLJ21062. This region was PCR-amplified in a limited
number of DLBCL (four ABC-DLBCL and four GCB-DLBCL) using bisulfite-
specific primers (Forward: 5’-ATTAGAGGTTTGGGTTAATTGGG-3’ and
Reverse: 5’- AAAAAAAACAACCTAAAAATC-3’) (Fig. 4.2). The individual
PCR products were cloned into sequencing vectors and sequenced using the vector-
specific primers M13 forward and M13 reverse. In total, we sequenced between 10
and 20 clones in each sample. Every clone was sequenced using both the M13
forward and reverse sequencing primers. Following sequencing, the sequences
were aligned and the C to T conversions (an indication that the cytosine was
unmethylated) were manually scored (Appendix J).
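The scoring described above was done manually. The sketch below shows one way the same logic could be expressed in code. It assumes each clone read is already aligned to the unconverted reference on the same strand, with equal length and no gaps. The function names and the toy sequences are hypothetical.

```python
def cpg_positions(reference):
    """0-based indices of the C in every CpG of the unconverted reference."""
    ref = reference.upper()
    return [i for i in range(len(ref) - 1) if ref[i] == "C" and ref[i + 1] == "G"]


def score_clone(reference, read):
    """Score one aligned clone: True = methylated (C kept), False = unmethylated (T).

    Any other base at a CpG position is scored None (ambiguous).
    """
    if len(reference) != len(read):
        raise ValueError("read must be aligned to the reference (same length, no gaps)")
    read = read.upper()
    return [{"C": True, "T": False}.get(read[i]) for i in cpg_positions(reference)]


def percent_methylated(reference, reads):
    """Percentage of unambiguous CpG calls that are methylated across all clones."""
    calls = [c for read in reads for c in score_clone(reference, read) if c is not None]
    return 100.0 * sum(calls) / len(calls) if calls else 0.0


# Toy reference with two CpG sites and two hypothetical clone reads.
reference = "ACGTCGA"
clones = ["ATGTCGA", "ACGTCGA"]
print(score_clone(reference, clones[0]))      # -> [False, True]
print(percent_methylated(reference, clones))  # -> 75.0
```

Summarizing per sample as a percentage across clones gives a number that can be set beside the PMR value for the same sample.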
The bisulfite PCR amplicon for sequencing was designed to encompass the
majority of those CpG dinucleotides interrogated by the MethyLight assay HB-442
(Fig. 4.2). Together, the three oligonucleotides (two primers and one internal probe)
that make up the MethyLight assay span nine CpG dinucleotides. The
sequenced amplicon overlaps with seven of the nine MethyLight CpG
dinucleotides and includes two additional CpG sites (Figure 4.2 and Appendix J).
Overall, the bisulfite sequencing results were in excellent agreement with
MethyLight data. For example, six of the samples sequenced had PMR values <
10. The bisulfite sequencing confirmed that these samples had little to no
methylation (Appendix J). Likewise, MethyLight analysis scored samples GCB-
DLBCL 12243 and GCB-DLBCL 8773 as having a high degree of DNA
methylation with PMR values equal to 72 and 104 respectively. The bisulfite
sequencing also confirmed the level of methylation in these samples (Appendix J).
ID GENE ABC_AVG GCB_AVG TTEST
HB-442 FLJ21062 0 33 0.00028
HB-446 ONECUT2 101 44 0.00249
HB-243 ONECUT2 90 44 0.00419
HB-242 ONECUT2 82 45 0.02578
HB-195 CDX1 35 69 0.03375
HB-223 CYP27B1 28 52 0.08882
HB-439 C9orf64 0 10 0.09917
HB-443 TP53I11 0 11 0.10932
HB-427 CPVL 41 67 0.13194
HB-160 MGMT 37 10 0.13551
HB-058 MTHFR 68 88 0.18094
HB-175 KL 32 51 0.22767
HB-215 HLA-G 64 48 0.23800
HB-426 GNMT 30 51 0.26797
HB-444 FAM3C 0 0 0.31843
HB-430 SLC38A4 67 56 0.32637
HB-438 VPS52 0 0 0.33317
HB-433 GTF2A2 0 0 0.33317
HB-440 HOXC9 80 67 0.34232
HB-445 SPINK2 6 0 0.35061
HB-425 PFDN5 0 0 0.39840
HB-261 NEUROG1 29 38 0.46542
HB-431 ZNF615 5 7 0.64224
HB-162 MINT31 39 33 0.70081
HB-429 SLC38A4 78 76 0.88033
HB-424 PFDN5 0 0 0.89154
HB-441 AK131409 59 61 0.92499
HB-428 CENPH 0 0 -
HB-435 WDR33 0 0 -
HB-437 ARPP-19 0 0 -
Table 4.2: Expanded Sample Set Summary Statistics
Table 4.2: Listed above are the average (AVG) PMR values for
each of the MethyLight reactions (ID) analyzed in the 26
DLBCL. An uncorrected student’s T-test was also calculated
between the ABC and GCB-DLBCL samples.
ID GENE AVG_ABC AVG_GCB TTEST
HB-442 FLJ21062 0 30 0.00090
HB-446 ONECUT2 95 44 0.00367
HB-242 ONECUT2 87 48 0.02908
HB-195 CDX1 37 67 0.04857
HB-223 CYP27B1 27 51 0.09594
Table 4.3: Summary of Replicate MethyLight Reactions
Table 4.3: The MethyLight reactions (ID) above were completed in
duplicate on each of the 26 samples and the PMR values from each
replicate were averaged prior to analysis. From that averaged data set,
we calculated the average PMR value (AVG) for ABC-DLBCL
samples as well as the GCB-DLBCL samples. An uncorrected
student’s T-test (TTEST) was also calculated. (See also Appendix H)
Figure 4.1: ZNF615: MethyLight vs. CpG Island Microarray
[Scatter plot: ZNF615 (HB-431 vs. Clone 52.D7). x-axis = MethyLight PMR Value (0 to 100); y-axis = CpG Island Microarray Cy3/Cy5 (0.70 to 1.00)]
Trend line: y = -0.0039x + 0.892, R² = 0.1658
Figure 4.1: The PMR value for MethyLight reaction HB-431 is plotted
along the x-axis for each sample analyzed. The Cy3/Cy5 ratio from the CpG
island microarray experiment is plotted on the y-axis. The greater the PMR
value, the greater the methylation of the sample. In contrast, the lower the
Cy3/Cy5 ratio, the greater the predicted level of methylation. The equation of
the line and the R² value of the trend line are also given.
Figure 4.2: Position of HB-442 Sequencing Primers
Unmodified Sequence:
caccactcccCGctacctacCGaggtgcatcctgcaggctccttacCGcaagcctgcaaactCGccctgcC
GggCGCGgagtgcaattaggctttggggtggtttggctctcCGgctttcCGtagcctctggcccCGccccc
tagcaaCGCGctggcttgtgttaacaacCGgccCGggatcagaggtctgggtcaactggggggCGgcagCG
gCGctaagCGgactgtatggCGgtggcctaggcccctggCGgaattttgggacctttCGCGactctagCGa
ctctcaggctgccttcccttctCGgtggCGgggcctctttgggcccagCGgctgCGggCGcactgtaggac
aggaagatccccccactctccacccCGcCGccacCGgccatgtgg
Bisulfite Converted Sequence:
TattatttttCGttatttatCGaggtgtattttgtaggttttttatCGtaagtttgtaaattCGttttgtC
GggCGCGgagtgtaattaggttttggggtggtttggtttttCGgtttttCGtagtttttggtttCGttttt
tagtaaCGCGttggtttgtgttaataatCGgttCGggattagaggtttgggttaattggggggCGgtagCG
gCGttaagCGgattgtatggCGgtggtttaggtttttggCGgaattttgggatttttCGCGattttagCGa
tttttaggttgtttttttttttCGgtggCGgggtttttttgggtttagCGgttgCGggCGtattgtaggat
aggaagatttttttattttttatttCGtCGttatCGgttatgtgg
Figure 4.2: Both the unmodified and sodium bisulfite-converted sequences are
given above. The CpG dinucleotides are underlined in both the unmodified
and converted sequences. All non-CpG cytidines have been converted to
thymidine in the bisulfite-converted sequence. Yellow indicates the
position of the bisulfite sequencing primer set. Green marks the position of
the MethyLight primer/probe set (HB-442). Blue highlights where the
MethyLight primers and sequencing primers overlap.
CHAPTER 5: Comparisons of PCR-Based Amplification Systems
The global analysis of complex libraries of nucleic acids is playing an
increasingly important role in the identification and scoring of polymorphisms and
mutations in the human genome (Hughes, Arneson et al. 2005). In order to
construct these libraries, it is often necessary to amplify the starting genomic DNA,
or a specific fraction thereof, with minimal bias. The recent development of
multiple displacement amplification (MDA) protocols has enabled accurate whole
genome amplification (WGA) from limited numbers of cells (Dean, Hosono et al.
2002; Hosono, Faruqi et al. 2003; Hughes, Arneson et al. 2005). Despite the major
impact such protocols have had on genotyping and mutational analyses, they are
less well-suited for applications necessitating the amplification of specific fractions
of the human genome. For example, oligonucleotide microarray-based genotyping
and mutation detection assays perform best with human genomic libraries of
greatly reduced complexity (i.e. less than 5% of the genome
represented) (Hacia 1999; Dong, Wang et al. 2001; Patil, Berno et al. 2001; Fang,
Greiner et al. 2003; Kennedy, Matsuzaki et al. 2003; John, Shephard et al. 2004;
Matsuzaki, Loi et al. 2004; Hinds, Stuve et al. 2005).
PCR-based techniques provide a robust means of amplifying specific
fractions of the human genome (Hughes, Arneson et al. 2005). In fact, they have
been employed in numerous studies involving DNA microarray-based genotyping
and mutational analyses. For example, widely used microarray-based protocols for
genotyping over 10,000 single nucleotide polymorphisms (SNPs) across the human
genome require the use of PCR-based strategies to generate reduced complexity
libraries for analysis (Kennedy, Matsuzaki et al. 2003; John, Shephard et al. 2004;
Matsuzaki, Loi et al. 2004). However, PCR-based protocols of this nature are
hindered by the presence of sequence stretches that are intractable to amplification
by conventional means, such as CpG island sequences, which are proximal to nearly
half of all human genes (Antequera and Bird 1993; Lander, Linton et al. 2001;
Venter, Adams et al. 2001; Waterston, Lindblad-Toh et al. 2002). PCR efficiency
is often compromised at such sites due to their sequence content and potential for
secondary structures (Varadaraj and Skinner 1994; Baskaran, Kandpal et al. 1996;
McDowell, Burns et al. 1998; Benita, Oosting et al. 2003). This can lower the
sensitivity and specificity of resequencing analyses and skew experimental results.
Furthermore, poor amplification efficiency increases the difficulty of
mutational analyses of individual promoter regions using
conventional PCR-based strategies.
Prior to the execution of the experiments described in Chapter Two, we
evaluated the relative abilities of four commercially available PCR systems to
amplify complex genomic templates with localized regions of high GC-content that
are refractory to conventional reaction and cycling conditions. This is especially
important for the McrBC-based methylation analysis outlined previously in the
second chapter. In particular, the McrBC assay is largely dependent upon the
amplification of sequences that are GC-rich.
To rapidly and efficiently make these comparisons, human genomic
libraries amplified using each of these PCR systems were subjected to hybridization
analysis on the CpG island microarrays highlighted in the previous chapter. This
allowed us to evaluate the amplification efficiencies of more than a thousand
different DNA segments with high-GC content in parallel. Here, we demonstrate
that DNA microarrays provide a robust platform for evaluating the relative abilities
of different systems to faithfully amplify complex genomic libraries containing
problematic sequence tracts. These results were previously published (Pike,
Groshen et al. 2006) and are reproduced, here, with permission from the publisher
(Appendix K).
MATERIALS AND METHODS
CpG Microarray Manufacture:
An arrayed and sequenced copy of the CpG Island library was generated as
previously described (Nouzova, Holtan et al. 2004). The clones for this library
were originally described in Cross et al. (1994) and purchased from
Geneservice Ltd. The inserts of these clones were PCR-amplified in 96-well
format by transferring approximately 2 ul of each bacterial culture to a PCR master
mix volume of 48 ul containing: 1X GeneAmp PCR Buffer II (Applied Biosystems
Inc.), 2.5 mM MgCl2, 0.5 mM of each dNTP, 0.8 uM of vector-specific primer and
2 units of AmpliTaq Gold DNA Polymerase (Applied Biosystems Inc.). The
resulting PCR products were cleaned using a MinElute 96 UF PCR Purification Kit
(Qiagen, Inc.), according to the manufacturer’s protocol. To confirm insert
amplification, a fraction of each resultant PCR product was visualized by agarose
gel electrophoresis using 1% precast 96 well E-gels (Invitrogen Inc.). The 96-well
trays containing the purified PCR products were lyophilized dry overnight at room
temperature, and resuspended in 3X SSC. These products were spotted onto Epoxy
coated microscope slides (Corning Inc.) using a RoboArrayer (RoboDesign
International, Inc.) at the USC Spotted Microarray Core Facility. After arraying the
clones, the slides were processed according to the manufacturer’s protocol. The
overall quality and amount of DNA in each spotted clone was assessed by staining
unused microarrays with SYBR green II, essentially according to published
protocols (Battaglia, Salani et al. 2000).
Target Preparation and Hybridization:
Target was prepared for hybridization to the CpG island microarrays as
previously described, with minor modifications (Fig. 5.1). First, 15 ug of total
genomic DNA from a human mesothelioma cell line was digested to completion
with the restriction endonuclease MseI (New England BioLabs). Afterwards, the
sample was cleaned by phenol/chloroform extraction, precipitated and brought to a
concentration of ~100 ng/ul with water. Double stranded DNA linkers, created by
annealing two complementary oligonucleotides (See Chapter 2), were then ligated
to the digested MseI fragments. Each reaction contained 1 ug of MseI-digested
genomic DNA, 40 pmol of annealed linker, 1X Ligase Buffer and 200 units of T4
DNA Ligase (New England BioLabs), in a 25 ul volume. Following incubation at
16 °C for 12 hours, up to four separate ligation reactions were pooled and diluted
with water to a concentration of ~10 ng/ul digested DNA.
PCR was carried out on 20 ng of linker-ligated genomic DNA in 50 ul
volumes using four different commercially available PCR systems. For each
amplification system tested, PCR reactions were conducted three independent
times, all on different days. The concentrations of dNTP (0.2 mM each) and
primer (100 pmol of linker-specific primer) were kept constant for all PCR
reactions. The remaining components of each PCR reaction varied for each
polymerase being evaluated. For Taq DNA Polymerase (Invitrogen, Inc.), 1X PCR
buffer (Invitrogen, Inc.), 1.5 mM MgCl2, and three units of Taq DNA Polymerase
were combined with the primer and dNTP concentrations above. For the AccuPrime
DNA Polymerase (Invitrogen, Inc.), 1X AccuPrime Buffer A (Invitrogen, Inc.) and
three units of AccuPrime DNA Polymerase were used. For the ThermalAce DNA
Polymerase (Invitrogen, Inc.), 1X ThermalAce Buffer and three units of
ThermalAce DNA Polymerase were used. In the case of the GC-RICH PCR
system (Roche), 1X GC-RICH Resolution Solution, 1X GC-RICH Reaction Buffer
and three units of GC-RICH PCR Enzyme were used. To minimize PCR bias, the
product was amplified using a minimum number of PCR cycles (an initial cycle of
72 ºC for 15 minutes and 95 ºC for 3 minutes followed by 22 cycles of 95 ºC for 2
minutes, 55 ºC for 1 minute and 72 ºC for 3 minutes). Following PCR
amplification, each reaction was cleaned on a Qiagen MinElute PCR Purification
column (Qiagen Inc.). We modified the manufacturer’s protocol by adding two
additional 75% EtOH wash steps prior to the elution of PCR products from the
column with 15 ul of water.
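The cycling program described above can be written down as data in order to check the total hold time and the theoretical amplification ceiling. This is a minimal sketch only: the class and variable names are hypothetical, ramp times between steps are ignored, and perfect doubling per cycle is an idealization rather than a measured efficiency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in degrees Celsius
    minutes: float  # hold time in minutes


# Values taken from the text above.
INITIAL_STEPS = [Step(72, 15), Step(95, 3)]
CYCLE_STEPS = [Step(95, 2), Step(55, 1), Step(72, 3)]
N_CYCLES = 22


def hold_time_minutes(initial, cycle, n_cycles):
    """Sum of programmed hold times; ramping between steps is not included."""
    return sum(s.minutes for s in initial) + n_cycles * sum(s.minutes for s in cycle)


total_minutes = hold_time_minutes(INITIAL_STEPS, CYCLE_STEPS, N_CYCLES)

# Upper bound on fold amplification if every cycle doubled every fragment.
max_fold_amplification = 2 ** N_CYCLES
```

Under these assumptions the program holds for 150 minutes in total, and 22 cycles cap amplification at about four million-fold.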
The purified PCR product was random prime-labeled with aminoallyl-
dUTP using the BioPrime® Array CGH Kit (Invitrogen, Inc.). Briefly, 1 ug of
PCR product was combined with 20 ul of 2.5X Random Primer Mix in a 41 ul
volume and incubated at 95 ºC for five minutes and chilled on ice for two minutes.
Next, 5 ul of dUTP Mix was added along with 1.5 ul of 10 mM aminoallyl-dUTP
(Ambion) and 1 ul of Klenow enzyme. Each labeling reaction was incubated at 37
ºC for two hours and cleaned on a Qiagen MinElute PCR Purification column as
described above. The products were eluted from the column with 13 ul of 50 mM
sodium bicarbonate buffer (pH 9.0).
To conjugate these products with the appropriate dye, each eluted product
was added to a 150 ug packet of lyophilized Cy3 or Cy5 mono-reactive NHS ester
(GE Healthcare). The conjugation was carried out at room temperature for two
hours and the Cy3- or Cy5-labeled products were purified on a Qiagen MinElute
PCR Purification column and eluted twice with 15 ul of EB buffer. Afterwards, the
DNA and dye content of the targets were quantified using an ND-1000
Spectrophotometer (NanoDrop Technologies).
For each experiment, 2.5 ug of Cy3-labeled test target, made using one of
the four DNA polymerases described above, was combined with 2.5 ug of the
pooled Cy5-labeled reference target, made using Taq DNA polymerase, in 400 ul
of Hybridization Buffer (MWG Biotech). The hybridization mix was denatured at
95 ºC for five minutes and cooled on ice for an additional five minutes prior to its
hybridization to the surface of the microarray. Hybridizations were carried out in a
SureHyb Hybridization Chamber (Agilent Technologies) at 42 ºC for 16 hours.
Afterwards, the microarrays were washed for five minutes in 2X SSC with 0.1% SDS,
five minutes in 1X SSC and five minutes in 0.5X SSC before being centrifuged at
1,500 RPM for two minutes to dryness.
Figure 5.1: Sample Preparation for Hybridization Analysis
[Flow diagram: Genomic DNA → MseI Digest → Linker Ligation → Test / Reference fractions → PCR and Label → Hybridize]
Figure 5.1: Targets were prepared for hybridization as follows. Genomic DNA was
digested with MseI and double-stranded linkers were ligated to the ends of each
fragment. The fragment pool was then divided equally into Test and Reference
fractions. Using linker-specific primers, the Test fraction was PCR amplified by
one of four polymerases while the Reference fraction was PCR amplified by Taq
polymerase. Following amplification, the resultant PCR products were labeled
with Cy3 (Test) and Cy5 (Reference). The Cy3 and Cy5 products were then
co-hybridized to a CpG island microarray. The Cy3/Cy5 ratio for each clone on
the array was then calculated as a relative measure of amplification efficiency.
Data Acquisition and Analysis:
Each processed microarray was scanned using the multi-laser ScanArray
5000 scanner (GSI Lumonics, Inc.) and the resulting images were processed using
ScanArray Express (PerkinElmer) software. The mean Cy3 test and Cy5 reference
target signal intensities and local background were calculated for each clone in
the microarray using the ImaGene (BioDiscovery, Inc., El Segundo, CA) software
package. These intensities were imported into Microsoft Excel, where
background-subtracted signals were log-transformed for further analysis. In
total, four comparisons (one for each enzyme evaluated) were made in triplicate.
Following data extraction, the log2 background-subtracted signal intensities
were averaged for the independent Cy3 test replicates, as were the Cy5 reference
data across all the experiments.
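The processing steps described above (background subtraction, log2 transformation, and averaging across replicates) can be sketched in Python as follows. The original work was carried out in Microsoft Excel, so this is only an illustration; the function names and the example intensities are hypothetical, and returning `None` for spots at or below background is an assumption of the sketch.

```python
import math
from statistics import fmean


def log2_signal(mean_intensity, local_background):
    """Background-subtract a spot signal and log2-transform it.

    Returns None when the net signal is not positive (log undefined).
    """
    net = mean_intensity - local_background
    if net <= 0:
        return None
    return math.log2(net)


def mean_log2(replicates):
    """Average log2 net signal over replicate (intensity, background) pairs.

    Averaging log2 values is equivalent to taking the log2 of the
    geometric mean of the net signals.
    """
    values = [log2_signal(intensity, background) for intensity, background in replicates]
    if any(v is None for v in values):
        return None
    return fmean(values)


# Hypothetical Cy3 readings for one clone across three replicate hybridizations.
cy3_replicates = [(4296, 200), (2248, 200), (8392, 200)]
cy3_mean_log2 = mean_log2(cy3_replicates)
```

The same function would be applied to the Cy5 reference readings, pooled across all twelve hybridizations, to obtain the composite reference value for each clone.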
RESULTS:
The details concerning the construction of the CpG island microarrays used
in this study have previously been described (Nouzova, Holtan et al. 2004).
Briefly, they consist of 5,376 PCR-amplified clone inserts, averaging
approximately 500 bp in length, from a human CpG island library previously
described by Cross et al. (1994). The library consists largely
of unique sequences, which account for approximately 75% of the clones. The
remaining clones represent repetitive elements such as Alu repeats, satellite
DNA, rDNA, and mitochondrial elements. Empirically, we found that ~80% of the clones
yielded a single PCR product (Materials and Methods). After amplification, the
PCR products were printed onto Epoxy-coated slides to create the CpG island
microarrays (Materials and Methods).
A preliminary survey of commercially available PCR amplification systems
yielded six candidates (AccuPrime (Invitrogen Inc), GC-RICH PCR system
(Roche), ThermalAce (Invitrogen Inc.), Therminator (New England BioLabs),
Deep Vent (exo-) (New England BioLabs), and Herculase (Stratagene Inc.)) that
have been optimized to more efficiently amplify templates with high GC-content.
Empirically, we identified three systems (AccuPrime, GC-RICH PCR system, and
ThermalAce) that generated robust PCR products from the linker-ligated
human genomic DNA templates (Materials and Methods). AccuPrime is a mixture
of Taq DNA polymerase with other accessory proteins, GC-RICH PCR system is a
blend of Taq polymerase with Tgo polymerase having 3’-5’ exonuclease
proofreading activity, and ThermalAce contains a DNA polymerase from the
archaebacterium Pyrolobus fumarius with 3’-5’ exonuclease proofreading activity.
The other amplification systems, which failed to consistently yield robust PCR
products from the same linker-ligated templates using our conditions, were not
subjected to further analysis (data not shown).
We conducted a series of twelve experiments that compared the properties
of four PCR amplification systems (AccuPrime, GC-RICH PCR, ThermalAce, and
Taq) in triplicate using targets generated from independent amplifications.
Individual Cy3-labeled test targets were co-hybridized to the CpG island
microarray with a Cy5-labeled reference sample made with Taq DNA
polymerase (Fig. 5.1), a commonly used enzyme for whole genome PCR
amplification. The co-hybridization of Cy3-labeled Taq test and Cy5-labeled
reference targets provides an empirical measurement of the effects of the Cy3 and
Cy5 dyes on target hybridization. Furthermore, it should be emphasized that equal
amounts of test and reference samples were hybridized to each microarray. This
ensures that we focused on measuring the relative abundance of fragments with
high GC-content in a given labeled target population rather than simply the amount
of material applied to the microarray.
We employed a series of electronic filters in order to identify which of the
5,376 printed clones yielded robust hybridization data across all twelve
experiments (Fig. 5.2). We used only those clones that passed these filters in all
our subsequent analyses. As an initial filter, we excluded clones with poor
amplification yields or non-specific amplification, as visualized by gel
electrophoresis (Materials and Methods). Second, all clones
judged by the ImaGene software as showing negative or poor quality hybridization
data, based on spot signal or morphology, were eliminated. Third, those
remaining clones still having a mean signal intensity (Cy3 or Cy5) less than
two-fold above local background or with a mean signal (Cy3 or Cy5) greater than
55,000 were removed from the analysis. This allowed us to focus our analyses on
clones that consistently produce hybridization signals within the linear range of our
imaging system. Furthermore, it partially eliminates those clones that show
excessive signal that is likely due to cross-hybridization. Additionally, we focused
our analysis on those clones that passed the filtering criteria described above in all
twelve experiments (four polymerases analyzed in triplicate) conducted. While
these filtering steps limited the number of clones in our final analysis to a fraction
of those printed on the array (approximately 26%), it allowed us to maximize the
consistency and lessen bias in our comparisons. We used log-transformed
hybridization signal intensities from each of these clones in all subsequent
analyses.
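The filtering logic described above can be sketched in Python as follows. This is an illustrative reconstruction rather than the procedure actually run: the `Spot` fields, the `flagged` attribute standing in for the ImaGene quality call, and the reading of "less than two-fold above local background" as signal < 2 x background are all assumptions.

```python
from dataclasses import dataclass

SATURATION_LIMIT = 55_000        # upper signal bound stated in the text
MIN_FOLD_OVER_BACKGROUND = 2.0   # lower bound relative to local background


@dataclass
class Spot:
    cy3: float
    cy5: float
    cy3_background: float
    cy5_background: float
    flagged: bool  # hypothetical stand-in for an ImaGene poor-quality flag


def spot_passes(spot):
    """A spot passes only if it is unflagged and both channels are in range."""
    if spot.flagged:
        return False
    channels = ((spot.cy3, spot.cy3_background), (spot.cy5, spot.cy5_background))
    for signal, background in channels:
        if signal < MIN_FOLD_OVER_BACKGROUND * background:
            return False
        if signal > SATURATION_LIMIT:
            return False
    return True


def clone_passes(pcr_ok, spots):
    """A clone is retained only if its PCR product was acceptable and its
    spot passes the filters in every hybridization supplied."""
    return pcr_ok and all(spot_passes(s) for s in spots)
```

Requiring a pass in all twelve hybridizations is what makes the filter strict: a single saturated or dim spot in any one experiment removes the clone from every comparison.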
To minimize experimental variation in our system, the Cy3-labeled test
target hybridization signals for each clone in all three replicates were averaged
prior to further analysis. Similarly, the hybridization signals from the Cy5-labeled
reference target in all twelve experiments were averaged to produce a composite
baseline against which the average Cy3-labeled test target signals were compared.
The data from each of the twelve experiments provided similar results (Pike,
Groshen et al. 2006).
To assess the global hybridization properties of each DNA amplification
system, we plotted the mean hybridization signals from test targets (AccuPrime,
GC-RICH PCR system, ThermalAce, and Taq) against the mean
hybridization signals from the reference target (Taq) (Fig. 5.3 - 5.6). As
expected, the test and reference targets, each made using Taq DNA polymerase,
displayed similar hybridization properties (Fig. 5.3). The slope (0.985) and
intercept (0.3126) of the trend-line approached the ideal values of one and zero,
respectively, with an R² value of 0.9855. This indicates that Cy3- and
Cy5-labeling does not confer substantially different hybridization properties to
targets. This stands in marked contrast to the comparisons made between test
targets generated using the other three amplification systems and the Taq-amplified
reference target (Fig. 5.4 - 5.6). In each of these three comparisons, the test
signal intensity is greater than the reference data set for the majority of clones surveyed.
This suggests that, on average, all three systems (AccuPrime, GC-RICH PCR, and
ThermalAce), outperform Taq polymerase in amplifying the genomic segments
complementary to the clones in the CpG island microarray.
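The trend-line statistics quoted above (slope, intercept, and R²) can be obtained from paired mean log2 signals by ordinary least squares. The sketch below assumes that the trend lines in Fig. 5.3 - 5.6 are ordinary least-squares fits, which the text does not state explicitly; the function name and the example data are hypothetical.

```python
def fit_trend_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical mean log2 signals: reference (Cy5) on x, test (Cy3) on y.
reference = [9.0, 10.5, 12.0, 13.5, 15.0]
test = [0.9 * value + 1.5 for value in reference]
slope, intercept, r_squared = fit_trend_line(reference, test)
```

A slope near one with an intercept near zero indicates that test and reference targets behave alike, whereas a slope below one with a positive intercept indicates that the test signal exceeds the reference mainly at low reference intensities.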
Upon closer inspection, the targets generated using the AccuPrime, GC-
RICH PCR, and ThermalAce systems especially outperformed the Taq reference
target for those clones showing low reference hybridization signals. This is evident
when examining the slope of the trend-line for each comparison. While the slope
of the line is nearing 0.99 in the Taq comparison (Fig. 5.3), the slopes are 0.89,
0.86, and 0.92 for the AccuPrime, GC-RICH PCR system, and ThermalAce
comparisons, respectively (Fig. 5.4 - 5.6). This skewing could be expected since
the modified PCR systems are designed to improve the amplification of
problematic templates with high GC-content relative to Taq. The clones producing
low reference hybridization signals could represent genomic regions that are
inefficiently amplified by Taq and thus are most likely to be rescued using the other
amplification systems. In contrast, clones producing robust reference hybridization
signals arguably represent genomic regions that are already efficiently amplified by
Taq and thus cannot be rescued using the other DNA polymerases. Since a
substantial number of clones with almost equivalent test and reference target
hybridization signals exist (as discussed below), our observations are most likely
due to differences in the amplification of specific library members, but not the
entire library as a whole.
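To make the effect of a slope below one concrete, the trend-line equations reported in Fig. 5.3 - 5.6 can be used to predict the test-to-reference fold difference at a given reference intensity. This is a rough illustration only: it treats each trend line as an exact model, the function names are hypothetical, and because the reference is Cy5-labeled the predicted ratio also absorbs any residual dye effect.

```python
# (slope, intercept) of each trend line as reported in Fig. 5.3 - 5.6.
TREND_LINES = {
    "Taq": (0.9846, 0.3126),
    "AccuPrime": (0.8897, 1.5923),
    "GC-RICH": (0.86, 2.2005),
    "ThermalAce": (0.9228, 1.3642),
}


def predicted_fold(system, reference_log2):
    """Predicted test/reference signal ratio at a given log2 reference signal."""
    slope, intercept = TREND_LINES[system]
    test_log2 = slope * reference_log2 + intercept
    return 2 ** (test_log2 - reference_log2)


def crossover_log2(system):
    """Log2 reference signal at which the trend line meets the y = x line."""
    slope, intercept = TREND_LINES[system]
    return intercept / (1 - slope)
```

For example, the AccuPrime trend line predicts roughly a 1.4-fold higher test signal at a log2 reference signal of 10 but a ratio slightly below one at 15, with the crossover near a log2 reference signal of 14.4. This is consistent with the enhancement being concentrated among clones with low reference signals.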
Figure 5.2: Flowchart of Data Filtration
[Flowchart, in order of filtering:
Total Number Arrayed: 5,376 Clones
Pass PCR Examination: 4,395 Clones
Pass ImaGene Software Quality Control: 1,497 Clones
2X Background < Mean Signal < 55,000 Units: 1,428 Clones]
Figure 5.2: Flowchart of in silico data filtration. The
data from all 12 experiments were electronically
filtered, as shown. The process is outlined with each
filtering step and the number of clones remaining after
the filtering step is provided.
Figure 5.3: Taq vs. Taq Plot
[Scatter plot: Mean Log2 Taq Cy3 Signal (y-axis) vs. Mean Log2 Cy5 Taq Signal (x-axis), both axes spanning 8 to 16. Trend line: y = 0.9846x + 0.3126, R² = 0.9855]
Figure 5.3: Comparison of hybridization signals from two Taq-generated
targets. The mean log2 hybridization signals from the Cy3-labeled test targets
are plotted on the y-axis. The mean log2 hybridization signals from Cy5-labeled
Taq reference targets are plotted along the x-axis. An idealized line (y = x) is
shown in red along with a calculated trend line in black. The equation of the
trend line and its R² value are also shown.
Figure 5.4: AccuPrime vs. Taq Plot
[Scatter plot: Mean Log2 AccuPrime Cy3 Signal (y-axis) vs. Mean Log2 Cy5 Taq Signal (x-axis), both axes spanning 8 to 16. Trend line: y = 0.8897x + 1.5923, R² = 0.9378]
Figure 5.4: Comparison of hybridization signals from AccuPrime-generated
targets relative to Taq. The mean log2 hybridization signals from the
Cy3-labeled test targets are plotted on the y-axis. The mean log2 hybridization
signals from Cy5-labeled Taq reference targets are plotted along the x-axis. An
idealized line (y = x) is shown in red along with a calculated trend line in
black. The equation of the trend line and its R² value are also shown.
Figure 5.5: GC-RICH vs. Taq Plot
[Scatter plot: Mean Log2 GC-RICH Cy3 Signal (y-axis) vs. Mean Log2 Cy5 Taq Signal (x-axis), both axes spanning 8 to 16. Trend line: y = 0.86x + 2.2005, R² = 0.9506]
Figure 5.5: Comparison of hybridization signals from GC-RICH PCR System
generated targets relative to Taq. The mean log2 hybridization signals from the
Cy3-labeled test targets are plotted on the y-axis. The mean log2 hybridization
signals from Cy5-labeled Taq reference targets are plotted along the x-axis. An
idealized line (y = x) is shown in red along with a calculated trend line in
black. The equation of the trend line and its R² value are also shown.
Figure 5.6: ThermalAce vs. Taq Plot
[Scatter plot: Mean Log2 ThermalAce Cy3 Signal (y-axis) vs. Mean Log2 Cy5 Taq Signal (x-axis), both axes spanning 8 to 16. Trend line: y = 0.9228x + 1.3642, R² = 0.9571]
Figure 5.6: Comparison of hybridization signals from ThermalAce-generated
targets relative to Taq. The mean log2 hybridization signals from the
Cy3-labeled test targets are plotted on the y-axis. The mean log2 hybridization
signals from Cy5-labeled Taq reference targets are plotted along the x-axis. An
idealized line (y = x) is shown in red along with a calculated trend line in
black. The equation of the trend line and its R² value are also shown.
For a more quantitative analysis of this phenomenon, hybridization data were
further divided into four reference signal intensity ranges, or quartiles, based
on the mean of the log-transformed reference target hybridization signals
(Fig. 5.7 - 5.10 and Table 5.1). Each quartile contains 357 (25%) of the 1,428
clones surveyed (Fig. 5.2). This provides us with an objective means of comparing the relative
properties of clones with low (Fig. 5.7: Quartile 1), low-medium (Fig. 5.8: Quartile
2), medium-high (Fig. 5.9: Quartile 3), and high (Fig. 5.10: Quartile 4) reference
target hybridization signals. As stated above, a likely explanation for these
differences in hybridization signal intensities is that these quartiles represent a
gradation from inefficient to efficient reference target amplification using Taq
polymerase. However, a more trivial explanation is that the differences in
reference target hybridization signals among quartiles are primarily due to different
amounts of spotted DNA in the clones comprising each quartile. We sought to
minimize potential microarray manufacturing effects by only analyzing data from
spotted clones derived from robust and specific PCR products (Fig. 5.2).
Furthermore, we assessed the amount of spotted DNA in the clones present in each
quartile by staining three unused CpG island microarrays with SYBR green II, a
fluorescent nucleic acid binding dye (Battaglia, Salani et al. 2000). Overall, the
clones in each quartile yielded nearly identical SYBR green II fluorescent signals
(i.e. on average, a 1.4% pair-wise difference in average fluorescent signals between
quartiles) (see Supplementary Table S2 (Pike, Groshen et al. 2006)). This strongly
supports the assertion that differences in reference target hybridization signals in
each quartile are primarily influenced by amplification efficiency of target nucleic
acid rather than the quality of the CpG island microarray.
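The quartile assignment described above amounts to ranking clones by their mean log2 reference signal and cutting the ranked list into four equal parts. The following sketch shows one way to do this; the function name and the synthetic signal values are hypothetical, and how ties or unequal group sizes were handled in the original analysis is not stated.

```python
def split_into_quartiles(reference_log2):
    """Split clone IDs into four groups ordered from lowest to highest
    mean log2 reference signal.

    reference_log2 maps clone ID -> mean log2 reference signal.
    """
    ranked = sorted(reference_log2, key=reference_log2.get)
    n = len(ranked)
    bounds = [round(i * n / 4) for i in range(5)]
    return [ranked[bounds[i]:bounds[i + 1]] for i in range(4)]


# Synthetic example with the same number of clones as the filtered data set.
signals = {f"clone_{i}": 8.0 + 8.0 * i / 1427 for i in range(1428)}
quartiles = split_into_quartiles(signals)
```

With 1,428 clones, each of the four groups contains 357 clones, matching the quartile size reported in the text.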
Overall, we found that Quartiles 1 and 2 showed the greatest skewing of
hybridization signals among the various amplification systems (Fig. 5.7 and 5.8,
respectively). In Quartile 1, test targets generated using the AccuPrime (1.22-fold),
GC-RICH PCR (1.48-fold), and ThermalAce (1.31-fold) systems all demonstrate a
significant improvement (i.e. enhancement factor) in hybridization signals relative
to Taq-generated test targets (Table 5.1). Similar enhancements in the
hybridization signals from test targets made using the AccuPrime, GC-RICH PCR,
and ThermalAce amplification systems were also observed for Quartile 2.
Strikingly, over 45% of the clones surveyed showed at least a 1.5-fold increase in
hybridization signal from GC-RICH PCR system test targets relative to Taq test
targets in both Quartiles 1 and 2 (Fig. 5.7 and 5.8). Thus, the level of hybridization
signal enhancement in both Quartiles 1 and 2 is in the following order: GC-RICH
PCR system > ThermalAce ≈ AccuPrime > Taq.
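The enhancement factor used in this comparison is defined in the figure captions as the ratio of the geometric means of the Cy3-labeled test target and Cy3-labeled Taq-generated target hybridization signals. A minimal Python sketch of that calculation and of the binning used in Fig. 5.7 - 5.10 is given below; the function names and example signals are hypothetical, and assigning a value that falls exactly on a bin boundary to the upper bin is an assumption.

```python
from collections import Counter
from statistics import geometric_mean

BIN_WIDTH = 0.25
BIN_LABELS = [
    "<0.25", "0.25-0.50", "0.50-0.75", "0.75-1.00", "1.00-1.25",
    "1.25-1.50", "1.50-1.75", "1.75-2.00", "2.00-2.25", ">2.25",
]


def enhancement_factor(test_signals, taq_signals):
    """Ratio of geometric means of replicate Cy3 test and Cy3 Taq signals."""
    return geometric_mean(test_signals) / geometric_mean(taq_signals)


def bin_label(factor):
    """Label of the enhancement factor bin that a value falls into."""
    index = min(int(factor // BIN_WIDTH), len(BIN_LABELS) - 1)
    return BIN_LABELS[index]


def bin_fractions(factors):
    """Fraction of clones in each bin, in the x-axis order of the figures."""
    counts = Counter(bin_label(f) for f in factors)
    return {label: counts[label] / len(factors) for label in BIN_LABELS}


# Hypothetical background-subtracted signals for one clone (three replicates).
example_factor = enhancement_factor([2000, 4000, 8000], [1000, 2000, 4000])
```

An enhancement factor of one means the test system and Taq gave equal signals for that clone; values above one indicate that the fragment is better represented in the test library.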
Quartile 3 (Fig. 5.9) displayed less pronounced enhancements in
hybridization signals from AccuPrime, GC-RICH PCR system, and ThermalAce
test targets relative to those from Taq test targets. For example, AccuPrime targets
showed a marginal 1.13-fold enhancement relative to Taq test targets. However,
targets generated with the GC-RICH PCR system (1.38-fold) and ThermalAce
(1.28-fold) still showed substantial hybridization signal enhancements relative
to Taq test targets (Table 5.1). The largest skewing of hybridization signals
appeared in the GC-RICH PCR test targets, with >30% of the clones showing at
least a 1.50-fold increase in hybridization signal relative to Taq test targets
(Fig. 5.9). Based on these observations, the level of hybridization signal
enhancement in Quartile 3 was:
GC-RICH PCR system ≥ ThermalAce > AccuPrime ≈ Taq.
Quartile 4 (Fig. 5.10) displayed negligible to marginal enhancements in
mean hybridization signals in the AccuPrime, GC-RICH PCR system and
ThermalAce targets relative to Taq test targets (Table 5.1). The analysis of binned
hybridization enhancement factors also showed no substantial skewing across the
amplification systems analyzed (Fig. 5.10). These observations should not be
affected by the robust hybridization signals observed in this quartile since they are
still within the linear range of the imaging system used to acquire hybridization
data (Materials and Methods).
Overall, it is highly desirable that the three modified DNA amplification
systems (AccuPrime, GC-RICH PCR, and ThermalAce systems) all display the
greatest enhancement in hybridization signals for those clones that yield the lowest
hybridization signals using Taq reference targets (i.e. Quartiles 1 and 2). The
decrease in these modified PCR system-based enhancements in Quartiles 3 and
(especially) 4 is in keeping with the hypothesis that these genomic segments are
efficiently amplified with Taq DNA polymerase. The fact that targets generated
using all four amplification systems (Taq included) show almost equivalent
hybridization signals in Quartile 4 suggests that they all have approximately the
same hybridization specificity when present at comparable concentrations. This
supports our hypothesis that the enhancements in AccuPrime, GC-RICH PCR
system, and ThermalAce target hybridization signals in Quartiles 1 and 2 are
primarily due to an increased representation in these libraries of DNA fragments
that are inefficiently amplified using Taq DNA polymerase. Finally, it is important
to emphasize that we did not consider all the microarray data when coming to our
conclusions concerning the properties of the four amplification systems, but
focused on a subset (approximately one-quarter) of the clones that provided the
most robust data and thus met our filtration criteria.
Figure 5.7: Quartile 1 Enhancement Factor Values
[Line plot, Quartile 1: Fraction of Clones (y-axis, 0.0 to 0.6) vs. Enhancement Factor Bins (x-axis: <0.25, 0.25-0.50, 0.50-0.75, 0.75-1.00, 1.00-1.25, 1.25-1.50, 1.50-1.75, 1.75-2.00, 2.00-2.25, >2.25)]
Figure 5.7: Distribution of enhancement factor values in Quartile 1.
Enhancement factors (i.e., the ratio of the geometric means of Cy3-labeled
test target and Cy3-labeled Taq-generated target hybridization signals) were
calculated for each clone in each quartile for the AccuPrime, GC-RICH, and
ThermalAce amplification systems. These enhancement factors were placed
into bins along the x-axis in the appropriate quartile. The y-axis indicates the
fraction of clones present in each enhancement factor bin. The enhancement
factors of AccuPrime (blue), GC-RICH PCR system (red), and ThermalAce
(green) test targets are depicted by different colored lines. The dashed
vertical line depicts an enhancement ratio of one.
Figure 5.8: Quartile 2 Enhancement Factor Values
[Line plot, Quartile 2: Fraction of Clones (y-axis, 0.0 to 0.6) vs. Enhancement Factor Bins (x-axis: <0.25, 0.25-0.50, 0.50-0.75, 0.75-1.00, 1.00-1.25, 1.25-1.50, 1.50-1.75, 1.75-2.00, 2.00-2.25, >2.25)]
Figure 5.8: Distribution of enhancement factor values in Quartile 2.
Enhancement factors (i.e., the ratio of the geometric means of Cy3-labeled
test target and Cy3-labeled Taq-generated target hybridization signals) were
calculated for each clone in each quartile for the AccuPrime, GC-RICH, and
ThermalAce amplification systems. These enhancement factors were placed
into bins along the x-axis in the appropriate quartile. The y-axis indicates the
fraction of clones present in each enhancement factor bin. The enhancement
factors of AccuPrime (blue), GC-RICH PCR system (red), and ThermalAce
(green) test targets are depicted by different colored lines. The dashed
vertical line depicts an enhancement ratio of one.
Figure 5.9: Quartile 3 Enhancement Factor Values
[Line plot, Quartile 3: Fraction of Clones (y-axis, 0.0 to 0.6) vs. Enhancement Factor Bins (x-axis: <0.25, 0.25-0.50, 0.50-0.75, 0.75-1.00, 1.00-1.25, 1.25-1.50, 1.50-1.75, 1.75-2.00, 2.00-2.25, >2.25)]
Figure 5.9: Distribution of enhancement factor values in Quartile 3.
Enhancement factors (i.e., the ratio of the geometric means of Cy3-labeled
test target and Cy3-labeled Taq-generated target hybridization signals) were
calculated for each clone in each quartile for the AccuPrime, GC-RICH, and
ThermalAce amplification systems. These enhancement factors were placed
into bins along the x-axis in the appropriate quartile. The y-axis indicates the
fraction of clones present in each enhancement factor bin. The enhancement
factors of AccuPrime (blue), GC-RICH PCR system (red), and ThermalAce
(green) test targets are depicted by different colored lines. The dashed
vertical line depicts an enhancement ratio of one.
Figure 5.10: Quartile 4 Enhancement Factor Values
[Line plot, Quartile 4: Fraction of Clones (y-axis, 0.0 to 0.6) vs. Enhancement Factor Bins (x-axis: <0.25, 0.25-0.50, 0.50-0.75, 0.75-1.00, 1.00-1.25, 1.25-1.50, 1.50-1.75, 1.75-2.00, 2.00-2.25, >2.25). Legend: Accu, GC, TAce]
Figure 5.10: Distribution of enhancement factor values in Quartile 4.
Enhancement factors (i.e., the ratio of the geometric means of Cy3-labeled
test target and Cy3-labeled Taq-generated target hybridization signals) were
calculated for each clone in each quartile for the AccuPrime, GC-RICH, and
ThermalAce amplification systems. These enhancement factors were placed
into bins along the x-axis in the appropriate quartile. The y-axis indicates the
fraction of clones present in each enhancement factor bin. The enhancement
factors of AccuPrime (blue), GC-RICH PCR system (red), and ThermalAce
(green) test targets are depicted by different colored lines. The dashed
vertical line depicts an enhancement ratio of one.
121
Quartile     Test Target   Signal (a)   Enhancement (b)   95% CI (c)
Quartile 1   AccuPrime         1710          1.22         1.19 - 1.25
Quartile 1   GC-RICH           2106          1.48         1.45 - 1.51
Quartile 1   ThermalAce        1859          1.31         1.28 - 1.34
Quartile 2   AccuPrime         2759          1.26         1.23 - 1.29
Quartile 2   GC-RICH           3327          1.51         1.48 - 1.54
Quartile 2   ThermalAce        2998          1.35         1.33 - 1.38
Quartile 3   AccuPrime         4939          1.13         1.10 - 1.16
Quartile 3   GC-RICH           6081          1.38         1.35 - 1.42
Quartile 3   ThermalAce        5634          1.28         1.25 - 1.31
Quartile 4   AccuPrime        17199          0.99         0.96 - 1.02
Quartile 4   GC-RICH          19349          1.11         1.09 - 1.13
Quartile 4   ThermalAce       20171          1.15         1.13 - 1.17
(a) Geometric mean of test target hybridization signals.
(b) Mean enhancement factor in test target hybridization signals for all
clones in a given quartile. The enhancement factor for each clone is
defined as the ratio of the geometric means of Cy3 test and Cy3 Taq
target hybridization signals.
(c) The 95% confidence interval of the mean enhancement factor.
Table 5.1: Quartile Analysis of Enhancement Factors
Table 5.1: “Signal” is the geometric mean of test target
hybridization signals. “Enhancement” represents the mean
enhancement factor in test target hybridization signals for all
clones in a given quartile. The enhancement factor for each
clone is defined as the ratio of the geometric means of Cy3 test
and Cy5 Taq target hybridization signals. The “95% CI”
column is the 95% confidence interval of the mean
enhancement factor.
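The "Enhancement" and "95% CI" columns of Table 5.1 summarize the per-clone enhancement factors within a quartile. The following is a minimal sketch of one way to obtain such a summary. It assumes a normal approximation (mean plus or minus 1.96 standard errors), which is reasonable for the large clone counts involved but is an assumption of this sketch and not necessarily the method used to produce the table; the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_with_ci95(factors):
    """Mean enhancement factor with an approximate 95% confidence interval.

    Assumption: normal approximation, i.e. mean +/- 1.96 * standard error.
    Requires at least two values.
    """
    m = mean(factors)
    half_width = 1.96 * stdev(factors) / sqrt(len(factors))
    return m, m - half_width, m + half_width
```

Under this approximation, an interval that excludes one suggests that the mean signal for that quartile differs from that of the Taq-generated target.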
Thus far, we have used global approaches to describe overall trends in
amplification efficiencies in each of the PCR systems surveyed. Next, we sought to
determine if the enhancements seen for individual clones in one amplification
system were also found in the other systems. In other words, were the individual
enhancements seen for a given clone specific to a PCR system or shared across
multiple systems?
In order to do this, we first identified those clones with a robust (>2-fold)
enhancement in hybridization signal in each amplification system (Table 5.2).
Next, we queried whether those clones also showed comparable enhancements
(>1.5-fold) in each of the other systems examined (Table 5.2). This provides us
with an objective means to determine how consistent these robust enhancements
are across the multiple systems. For example, there were 41 clones showing a
greater than two-fold enhancement in target hybridization signal using the GC-
RICH system. Of these clones, 54% and 80% also showed comparable
enhancements in target hybridization signals using the AccuPrime and ThermalAce
systems, respectively. We found similar results in the other cross-comparisons.
Overall, this demonstrates that the majority of those robust enhancements seen for a
given clone were shared amongst multiple systems. However, there remain
enhancements in individual clones that are specific to targets made with a particular
amplification system. This indicates that while these three (AccuPrime, GC-RICH
PCR, and ThermalAce) systems have very similar global amplification properties,
enough heterogeneity exists that it is not possible to accurately predict which
system is best suited to amplify a given genomic region.
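The cross-comparison summarized in Table 5.2 can be sketched as follows. The dictionary-based input (clone identifier mapped to mean enhancement factor) and the function name are hypothetical and chosen only for illustration; the thresholds are those stated in the text (>2-fold in amplification system I, >1.5-fold in amplification system II).

```python
def shared_fraction(factors_i, factors_ii, robust=2.0, comparable=1.5):
    """Fraction of clones with a robust enhancement in system I that also
    show a comparable enhancement in system II.

    factors_i, factors_ii: dicts mapping clone id -> mean enhancement factor.
    Returns None when no clone exceeds the robust threshold in system I.
    """
    robust_clones = [c for c, f in factors_i.items() if f > robust]
    if not robust_clones:
        return None
    # A clone missing from system II is counted as not shared.
    shared = sum(1 for c in robust_clones
                 if factors_ii.get(c, 0.0) > comparable)
    return shared / len(robust_clones)
```

Because the set of robustly enhanced clones differs for each amplification system I, the resulting matrix is not expected to be symmetric, which is consistent with the values reported in Table 5.2.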
The hybridization-based protocols described in this study are especially
appealing for characterizing the ability of newly developed whole (Lage, Leamon
et al. 2003; Cardoso, Molenaar et al. 2004; Paez, Lin et al. 2004) or partial genome
amplification systems (Dong, Wang et al. 2001; Matsuzaki, Loi et al. 2004) to use
templates with high GC-content and/or other properties (e.g. secondary structures)
that can reduce their PCR efficiency. However, there are some caveats that can be
readily addressed in future experimental designs. For example, the printed PCR
products were generated using a hot-start Taq DNA polymerase that is not
especially well suited for efficiently amplifying all templates with high GC-content.
Therefore, clones that do not amplify using this Taq-based system are excluded
from this study even though, in theory, they may be efficiently amplified using the
systems we examined here. We sought to estimate the magnitude of this effect by
conducting PCR using the GC-RICH PCR system and Taq DNA polymerase
(Supplementary Figure S1 and Supplementary Table S3 (Pike, Groshen et al.
2006)) on one hundred randomly selected clones from the CpG island library that
failed to amplify during the manufacture of the microarray. Based on
electrophoretic analyses on agarose gels, 78% of these clones yielded amplicons
with the GC-RICH PCR system, while Taq polymerase showed only a 19% success
rate (Supplementary Figure S1 and Supplementary Table S3 (Pike, Groshen et al.
2006)). This strongly supports our initial conclusions concerning the relative
efficiencies of these PCR systems and also provides an estimate of the percentage
of clones that would have been rescued if the microarrays were constructed using
the GC-RICH PCR system.
In the near future, microarrays consisting of oligonucleotides or amplicons
generated with other PCR systems (e.g. GC-RICH PCR system) will provide even
more comprehensive coverage of CpG islands and other difficult-to-amplify
sequences. This will further expand the utility of this method for the analysis
of genome amplification biases as well as other genomic applications. These
include a wide range of chromatin immunoprecipitation (ChIP) assays
(Rodriguez-Pinto 2005) that rely upon the PCR-based generation of complex
genomic libraries.
                          Amplification System II
Amplification System I    AccuPrime   GC-RICH   ThermalAce
AccuPrime                     -         0.75       0.54
GC-RICH                     0.54         -         0.80
ThermalAce                  0.75        0.92        -
Table 5.2: Comparison of Enhancement Factors
Table 5.2: Amplification System I: Test targets generated using the
AccuPrime, GC-RICH, and ThermalAce systems produced 38, 41, and 12
clones, respectively, with a more than two-fold mean enhancement in
hybridization signal relative to test targets generated with Taq.
Amplification System II: Fraction of clones showing a more than two-fold mean
enhancement in hybridization signal for targets generated using amplification
system I that also show a >1.5-fold mean enhancement in hybridization signals
for targets generated using amplification system II.
CHAPTER 6: Summary and Discussion
Diffuse large B-cell lymphoma (DLBCL) is an aggressive malignancy
of the mature B lymphocyte that accounts for approximately one-third of all cases
of non-Hodgkin's lymphoma (Lossos 2005; Lossos and Morgensztern 2006).
Extensive molecular analyses of gene expression profiles (Alizadeh, Eisen et al.
2000; Rosenwald, Wright et al. 2002; Wright, Tan et al. 2003), copy number
changes (Bea, Zettl et al. 2005; Tagawa, Suguro et al. 2005; Chen, Houldsworth et
al. 2006), and protein biomarkers (Hans, Weisenburger et al. 2004) have begun to
elucidate the molecular bases for their extensive clinical heterogeneity and shed
additional light on the etiology of these diseases. For example, gene expression
analyses have identified two distinct subtypes, germinal center B cell-like (GCB-
DLBCL) and activated B cell-like (ABC-DLBCL), that likely originate from
different stages of normal B cell development (Lossos 2005). These subtypes have
significant differences in overall survival (OS), with GCB-DLBCL patients having
a longer median OS than ABC-DLBCL patients (Alizadeh, Eisen et al. 2000;
Rosenwald, Wright et al. 2002).
We undertook a multi-staged approach using two different screening
technologies (CpG island microarrays (Nouzova, Holtan et al. 2004; Pike, Groshen
et al. 2006) and MethyLight (Eads, Danenberg et al. 2000; Weisenberger, Campan
et al. 2005)) to identify gene-associated CpG islands whose methylation levels
varied among DLBCL. In particular, we were interested in CpG islands with
methylation levels that can distinguish between the closely related GCB-DLBCL
and ABC-DLBCL subtypes. In turn, this information could be useful for predicting
clinical outcomes and examining relationships with gene expression and genomic
copy number profiles.
In our initial surveys, we employed CpG island microarrays and
MethyLight to screen for CpG islands with variation among DLBCL. The CpG
island microarray assays evaluated the methylation-dependent cleavage of 592
gene-associated CpG islands by McrBC endonuclease. Overall, we found no global
differences in the levels of DNA methylation between the two subtypes. However,
24 candidate CpG islands demonstrated differences in methylation levels (p<0.05,
uncorrected Student’s t-test) between these subtypes (Table 2.3). Included among
the 24 were CpG islands proximal to the genes ONECUT2 (Jacquemin, Lannoy et al.
2001), PFDN5 (Mori, Maeda et al. 1998; Fujioka, Taira et al. 2001), GNMT (Chen,
Chen et al. 2000; Chen, Lin et al. 2004) and GTF2A2 (Han, Zhou et al. 2001).
In conjunction with the microarray experiments mentioned above, we
analyzed 262 gene-associated CpG islands in DLBCL using MethyLight, a
quantitative bisulfite PCR-based assay (Eads, Danenberg et al. 2000;
Weisenberger, Campan et al. 2005). The MethyLight analysis was originally
conducted in two phases. The first phase involved testing all 262 MethyLight
markers on a limited sample set that included both ABC-DLBCL and GCB-
DLBCL. In this initial phase, we identified 89 CpG islands with moderate
differences (i.e. coefficient of variation > 5%) in methylation levels among the
DLBCL examined. These included three CpG islands (MGMT (Esteller 2003), AR
(Yang, Chen et al. 2003), and CDKN2A (Garcia, Martinez-Delgado et al. 2002))
previously reported to be methylated in lymphoma. Next, we analyzed these 89
CpG islands in an expanded set of DLBCL that included those previously analyzed
using the CpG island microarrays (footnote 9). Five CpG islands showed significant
differences in DNA methylation levels between the two subtypes (p < 0.05,
uncorrected Student’s t-test) (Table 4.2). Remarkably, ONECUT2 was
independently identified by both CpG island microarrays and MethyLight as
showing the most robust differential methylation between the subtypes.
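The two screening criteria used above, a coefficient of variation above 5% in the first phase and a Student's t-test between subtypes in the second, can be illustrated with a minimal sketch. The function names and the input format (one list of methylation values per group) are hypothetical. The sketch computes only the t statistic; converting it to a p-value requires the cumulative distribution function of the t distribution, which is available in standard statistics packages and is omitted here.

```python
from math import sqrt
from statistics import mean, stdev, variance


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (mean must be nonzero)."""
    return stdev(values) / mean(values)


def is_variable(values, threshold=0.05):
    """First-phase filter: keep markers whose CV exceeds the threshold."""
    return coefficient_of_variation(values) > threshold


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance.

    Assumption: equal variances in the two groups (classical Student's
    test); degrees of freedom are len(group_a) + len(group_b) - 2.
    """
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a)
              + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
```

Because the reported p-values are uncorrected, a screen of this kind over several hundred CpG islands is expected to yield some candidates by chance alone, which is why the candidates were carried forward for independent confirmation.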
In the analyses outlined in Chapter 5, we analyzed 30 genomic sites for
DNA methylation by MethyLight in 26 DLBCL (footnote 10) (16 GCB-DLBCL, 10
ABC-DLBCL). These included 28 CpG islands found by microarray (N = 17) and
MethyLight (N = 11) analyses. In addition, we included two CpG islands
(proximal to FAM3C and SPINK2) with expression profiles (Wright, Tan et al.
2003) consistent with DNA methylation being causally involved in differential
expression (i.e. weak expression in one subtype and moderate to high
expression in the other). Two gene-associated CpG islands (near ONECUT2 and
FLJ21062) continued to show differential methylation and gave exceptional
p-values for their ability to discriminate between the ABC-DLBCL and GCB-DLBCL
subtypes (Table 4.3). A third gene, CDX1, also appeared to distinguish
between the ABC-DLBCL and GCB-DLBCL subtypes, but had a more modest p-value.
As an independent means of confirming the MethyLight data, we also
performed bisulfite sequencing analysis of individual FLJ21062 subclones and
found that the results were in excellent agreement with those obtained by
MethyLight.
Footnote 9: We initially examined 15 DLBCL in the microarray studies of Chapter 2 (seven ABC-DLBCL
and eight GCB-DLBCL). An additional GCB-DLBCL was combined with the original 15 DLBCL
in the MethyLight analysis. Therefore, 16 DLBCL were analyzed in the MethyLight study.
Footnote 10: The 26 DLBCL include those samples previously analyzed in the MethyLight studies outlined in
Chapter Two, with the exception of sample number 4427.
Given the association between DLBCL subtype and the differential DNA
methylation of ONECUT2, CDX1, and FLJ21062 CpG islands, it is tempting to
speculate about their involvement in the molecular etiology of these diseases.
However, it is not possible to determine whether these are acquired changes during
disease progression or lineage-specific epigenetic markers of normal B-cell
development retained in the tumor since the cellular origin of ABC-DLBCL is not
fully defined. Nevertheless, the ability of the homeobox-containing transcription
factor ONECUT2 to inhibit activin/TGF-beta signaling in liver parenchyma
(Clotman, Jacquemin et al. 2005) is in keeping with the importance of these
pathways in normal B-cell development (Zipori and Barda-Saad 2001).
Additionally, the function of ONECUT2 overlaps, at least partially, with that of another
member of the ONECUT domain protein family, HNF6 (Jacquemin, Lannoy et al.
1999), which has been shown to play an important role in B lymphopoiesis
(Bouzin, Clotman et al. 2003) in fetal liver. Given the relatedness of HNF6 and
ONECUT2 (70% identity at the amino acid level) (Fig. 6.1), and the reported
involvement of HNF6 in B-cell development, it is tempting to
speculate that ONECUT2 may have an overlapping role in lymphopoiesis and, by
association, in lymphoma development and progression.
CDX1 is a homeobox-containing transcription factor that plays a major role
in the normal development of mammalian intestinal epithelium (Guo, Suh et al.
2004) and in hematopoiesis in zebrafish (Davidson and Zon 2006). Demethylation
of CDX1 is associated with both gastric intestinal (Pilozzi, Onelli et al. 2004) and
Barrett's (Wong, Wilding et al. 2005) metaplasia. Unlike ONECUT2 and CDX1,
FLJ21062 encodes a 101-kDa hypothetical protein without annotated functional
motifs or ascribed function. Given its striking methylation profile in DLBCL, and
provided that this gene is unmethylated in normal tissue, we suggest renaming it
HIC3 (Hypermethylated In Cancer 3).
Based on Knudson’s two-hit hypothesis (Knudson 1971), we expected that
hypermethylation of potential tumor suppressor genes (e.g. ONECUT2, CDX1, and
FLJ21062) might correlate with the occurrence of genomic deletions. However,
ONECUT2 is present in the chromosome 18q21-22 region that is preferentially
amplified in ABC-DLBCL relative to GCB-DLBCL (Bea, Zettl et al. 2005;
Tagawa, Suguro et al. 2005). One possible explanation is that imprecise
amplification events selected for by the presence of a proto-oncogene (e.g. BCL2 in
the 18q21-22 genomic region) result in the amplification of nearby tumor
suppressor genes. If methylation precedes amplification, the silencing of tumor
suppressor genes immediately confers a maximal growth and/or survival advantage
to the cell. If methylation occurs after amplification, epigenetic silencing could
serve as a means of fine-tuning the selective advantage of the amplification event.
Overall, our studies have uncovered significant epigenetic differences
among cancers derived from closely related cells of origin. These results suggest
that DNA methylation markers may prove useful for classifying
lymphoma into distinct subtypes, thus improving diagnosis and treatment.
Furthermore, beyond the discriminatory power of such markers, DNA methylation
may provide clues to the etiology of the individual diseases that make up DLBCL.
Figure 6.1: ONECUT2 and HNF6 Protein Alignment
Figure 6.1: The amino acid sequences of ONECUT2 (Query) and HNF6 (Sbjct)
were aligned using BLAST (Altschul, Gish et al. 1990), and found to have 70%
identity.
References
Adorjan, P., J. Distler, et al. (2002). "Tumour class prediction and discovery by
microarray-based DNA methylation analysis." Nucleic Acids Res 30(5):
e21.
Alizadeh, A. A., M. B. Eisen, et al. (2000). "Distinct types of diffuse large B-cell
lymphoma identified by gene expression profiling." Nature 403(6769): 503-
11.
Altschul, S. F., W. Gish, et al. (1990). "Basic local alignment search tool." J Mol
Biol 215(3): 403-10.
Altshuler, D., L. D. Brooks, et al. (2005). "A haplotype map of the human
genome." Nature 437(7063): 1299-320.
Antequera, F. and A. Bird (1993). "Number of CpG islands and genes in human
and mouse." Proc Natl Acad Sci U S A 90(24): 11995-9.
Barrans, S. L., P. A. Evans, et al. (2003). "The t(14;18) is associated with germinal
center-derived diffuse large B-cell lymphoma and is a strong predictor of
outcome." Clin Cancer Res 9(6): 2133-9.
Baskaran, N., R. P. Kandpal, et al. (1996). "Uniform amplification of a mixture of
deoxyribonucleic acids with varying GC content." Genome Res 6(7): 633-8.
Battaglia, C., G. Salani, et al. (2000). "Analysis of DNA microarrays by non-
destructive fluorescent staining using SYBR green II." Biotechniques 29(1):
78-81.
Baylin, S. B. (2005). "DNA methylation and gene silencing in cancer." Nat Clin
Pract Oncol 2 Suppl 1: S4-11.
Bea, S., A. Zettl, et al. (2005). "Diffuse large B-cell lymphoma subgroups have
distinct genetic profiles that influence tumor biology and improve gene-
expression-based survival prediction." Blood 106(9): 3183-90.
Benita, Y., R. S. Oosting, et al. (2003). "Regionalized GC content of template DNA
as a predictor of PCR success." Nucleic Acids Res 31(16): e99.
Bouzin, C., F. Clotman, et al. (2003). "The onecut transcription factor hepatocyte
nuclear factor-6 controls B lymphopoiesis in fetal liver." J Immunol 171(3):
1297-303.
Briancon, N., A. Bailly, et al. (2004). "Expression of the alpha7 isoform of
hepatocyte nuclear factor (HNF) 4 is activated by HNF6/OC-2 and HNF1
and repressed by HNF4alpha1 in the liver." J Biol Chem 279(32): 33398-
408.
Cardoso, J., L. Molenaar, et al. (2004). "Genomic profiling by DNA amplification
of laser capture microdissected tissues and array CGH." Nucleic Acids Res
32(19): e146.
Chan, J. K. (2001). "The new World Health Organization classification of
lymphomas: the past, the present and the future." Hematol Oncol 19(4):
129-50.
Chen, S. Y., J. R. Lin, et al. (2004). "Glycine N-methyltransferase tumor
susceptibility gene in the benzo(a)pyrene-detoxification pathway." Cancer
Res 64(10): 3617-23.
Chen, W., J. Houldsworth, et al. (2006). "Array comparative genomic hybridization
reveals genomic copy number changes associated with outcome in diffuse
large B-cell lymphomas." Blood 107(6): 2477-85.
Chen, Y. M., L. Y. Chen, et al. (2000). "Genomic structure, expression, and
chromosomal localization of the human glycine N-methyltransferase gene."
Genomics 66(1): 43-7.
Clark, S. J., J. Harrison, et al. (1994). "High sensitivity mapping of methylated
cytosines." Nucleic Acids Res 22(15): 2990-7.
Clotman, F., P. Jacquemin, et al. (2005). "Control of liver cell fate decision by a
gradient of TGF beta signaling modulated by Onecut transcription factors."
Genes Dev 19(16): 1849-54.
Costello, J. F., M. C. Fruhwald, et al. (2000). "Aberrant CpG-island methylation
has non-random and tumour-type-specific patterns." Nat Genet 24(2): 132-
8.
Cross, S. H., J. A. Charlton, et al. (1994). "Purification of CpG islands using a
methylated DNA binding column." Nat Genet 6(3): 236-44.
Davidson, A. J. and L. I. Zon (2006). "The caudal-related homeobox genes cdx1a
and cdx4 act redundantly to regulate hox gene expression and the formation
of putative hematopoietic stem cells during zebrafish embryogenesis." Dev
Biol 292(2): 506-18.
Davis, R. E., K. D. Brown, et al. (2001). "Constitutive nuclear factor kappaB
activity is required for survival of activated B cell-like diffuse large B cell
lymphoma cells." J Exp Med 194(12): 1861-74.
Dean, F. B., S. Hosono, et al. (2002). "Comprehensive human genome
amplification using multiple displacement amplification." Proc Natl Acad
Sci U S A 99(8): 5261-6.
Dong, S., E. Wang, et al. (2001). "Flexible use of high-density oligonucleotide
arrays for single-nucleotide polymorphism discovery and validation."
Genome Res 11(8): 1418-24.
Eads, C. A., K. D. Danenberg, et al. (2000). "MethyLight: a high-throughput assay
to measure DNA methylation." Nucleic Acids Res 28(8): E32.
Eads, C. A., R. V. Lord, et al. (2001). "Epigenetic patterns in the progression of
esophageal adenocarcinoma." Cancer Res 61(8): 3410-8.
Eckhardt, F., S. Beck, et al. (2004). "Future potential of the Human Epigenome
Project." Expert Rev Mol Diagn 4(5): 609-18.
Ehrlich, M. (2003). "Expression of various genes is controlled by DNA methylation
during mammalian development." J Cell Biochem 88(5): 899-910.
Ehrlich, M., G. Jiang, et al. (2002). "Hypomethylation and hypermethylation of
DNA in Wilms tumors." Oncogene 21(43): 6694-702.
Esteller, M. (2003). "Profiling aberrant DNA methylation in hematologic
neoplasms: a view from the tip of the iceberg." Clin Immunol 109(1): 80-8.
Esteller, M. (2006). "The necessity of a human epigenome project." Carcinogenesis
27(6): 1121-1125.
Evans, L. S. and B. W. Hancock (2003). "Non-Hodgkin lymphoma." Lancet
362(9378): 139-46.
Fang, N. Y., T. C. Greiner, et al. (2003). "Oligonucleotide microarrays demonstrate
the highest frequency of ATM mutations in the mantle cell subtype of
lymphoma." Proc Natl Acad Sci U S A 100(9): 5372-7.
Feinberg, A. P. and B. Tycko (2004). "The history of cancer epigenetics." Nat Rev
Cancer 4(2): 143-53.
Feltus, F. A., E. K. Lee, et al. (2006). "DNA motifs associated with aberrant CpG
island methylation." Genomics 87(5): 572-9.
Fisher, R. I. (2003). "Overview of non-Hodgkin's lymphoma: biology, staging, and
treatment." Semin Oncol 30(2 Suppl 4): 3-9.
Fisher, S. G. and R. I. Fisher (2004). "The epidemiology of non-Hodgkin's
lymphoma." Oncogene 23(38): 6524-34.
Fujioka, Y., T. Taira, et al. (2001). "MM-1, a c-Myc-binding protein, is a candidate
for a tumor suppressor in leukemia/lymphoma and tongue cancer." J Biol
Chem 276(48): 45137-44.
Futscher, B. W., M. M. Oshiro, et al. (2002). "Role for DNA methylation in the
control of cell type specific maspin expression." Nat Genet 31(2): 175-9.
Garber, K. (2006). "Momentum building for human epigenome project." J Natl
Cancer Inst 98(2): 84-6.
Garcia, M. J., B. Martinez-Delgado, et al. (2002). "Different incidence and pattern
of p15INK4b and p16INK4a promoter region hypermethylation in
Hodgkin's and CD30-Positive non-Hodgkin's lymphomas." Am J Pathol
161(3): 1007-13.
Groot, G. S. and A. M. Kroon (1979). "Mitochondrial DNA from various
organisms does not contain internally methylated cytosine in -CCGG-
sequences." Biochim Biophys Acta 564(2): 355-7.
Guo, R. J., E. R. Suh, et al. (2004). "The role of Cdx proteins in intestinal
development and cancer." Cancer Biol Ther 3(7): 593-601.
Hacia, J. G. (1999). "Resequencing and mutational analysis using oligonucleotide
microarrays." Nat Genet 21(1 Suppl): 42-7.
Han, S. Y., L. Zhou, et al. (2001). "TFIIAalpha/beta-like factor is encoded by a
germ cell-specific gene whose expression is up-regulated with other general
transcription factors during spermatogenesis in the mouse." Biol Reprod
64(2): 507-17.
Hans, C. P., D. D. Weisenburger, et al. (2004). "Confirmation of the molecular
classification of diffuse large B-cell lymphoma by immunohistochemistry
using a tissue microarray." Blood 103(1): 275-82.
Harris, N. L., E. S. Jaffe, et al. (1994). "A revised European-American
classification of lymphoid neoplasms: a proposal from the International
Lymphoma Study Group." Blood 84(5): 1361-92.
Harris, N. L., H. Stein, et al. (2001). "New approaches to lymphoma diagnosis."
Hematology (Am Soc Hematol Educ Program): 194-220.
Hinds, D. A., L. L. Stuve, et al. (2005). "Whole-genome patterns of common DNA
variation in three human populations." Science 307(5712): 1072-9.
Hinrichs, A. S., D. Karolchik, et al. (2006). "The UCSC Genome Browser
Database: update 2006." Nucleic Acids Res 34(Database issue): D590-8.
Hosono, S., A. F. Faruqi, et al. (2003). "Unbiased whole-genome amplification
directly from clinical samples." Genome Res 13(5): 954-64.
Houldsworth, J., A. B. Olshen, et al. (2004). "Relationship between REL
amplification, REL function, and clinical and biologic features in diffuse
large B-cell lymphomas." Blood 103(5): 1862-8.
Huang, J. Z., W. G. Sanger, et al. (2002). "The t(14;18) defines a unique subset of
diffuse large B-cell lymphoma with a germinal center B-cell gene
expression profile." Blood 99(7): 2285-90.
Hughes, S., N. Arneson, et al. (2005). "The use of whole genome amplification in
the study of human disease." Prog Biophys Mol Biol 88(1): 173-89.
Inouye, K. and T. Sakaki (2001). "Enzymatic studies on the key enzymes of
vitamin D metabolism; 1 alpha-hydroxylase (CYP27B1) and 24-
hydroxylase (CYP24)." Biotechnol Annu Rev 7: 179-94.
Iqbal, J., W. G. Sanger, et al. (2004). "BCL2 translocation defines a unique tumor
subset within the germinal center B-cell-like diffuse large B-cell
lymphoma." Am J Pathol 165(1): 159-66.
Issa, J. P. and S. B. Baylin (1996). "Epigenetics and human disease." Nat Med 2(3):
281-2.
Jacquemin, P., V. J. Lannoy, et al. (2001). "The transcription factor onecut-2
controls the microphthalmia-associated transcription factor gene." Biochem
Biophys Res Commun 285(5): 1200-5.
Jacquemin, P., V. J. Lannoy, et al. (1999). "OC-2, a novel mammalian member of
the ONECUT class of homeodomain transcription factors whose function in
liver partially overlaps with that of hepatocyte nuclear factor-6." J Biol
Chem 274(5): 2665-71.
Jacquemin, P., C. E. Pierreux, et al. (2003). "Cloning and embryonic expression
pattern of the mouse Onecut transcription factor OC-2." Gene Expr Patterns
3(5): 639-44.
John, S., N. Shephard, et al. (2004). "Whole-genome scan, in a complex disease,
using 11,245 single-nucleotide polymorphisms: comparison with
microsatellites." Am J Hum Genet 75(1): 54-64.
Jones, P. A. (1999). "The DNA methylation paradox." Trends Genet 15(1): 34-7.
Jones, P. A. and S. B. Baylin (2002). "The fundamental role of epigenetic events in
cancer." Nat Rev Genet 3(6): 415-28.
Jones, P. A. and P. W. Laird (1999). "Cancer epigenetics comes of age." Nat Genet
21(2): 163-7.
Jones, P. A. and R. Martienssen (2005). "A blueprint for a Human Epigenome
Project: the AACR Human Epigenome Workshop." Cancer Res 65(24):
11241-6.
Kaiser, J. (2005). "National Institutes of Health. NCI gears up for cancer genome
project." Science 307(5713): 1182.
Karolchik, D., R. Baertsch, et al. (2003). "The UCSC Genome Browser Database."
Nucleic Acids Res 31(1): 51-4.
Kennedy, G. C., H. Matsuzaki, et al. (2003). "Large-scale genotyping of complex
DNA." Nat Biotechnol 21(10): 1233-7.
Kent, W. J., C. W. Sugnet, et al. (2002). "The human genome browser at UCSC."
Genome Res 12(6): 996-1006.
Klose, R. J. and A. P. Bird (2006). "Genomic DNA methylation: the mark and its
mediators." Trends Biochem Sci 31(2): 89-97.
Knudson, A. G., Jr. (1971). "Mutation and cancer: statistical study of
retinoblastoma." Proc Natl Acad Sci U S A 68(4): 820-3.
Lage, J. M., J. H. Leamon, et al. (2003). "Whole genome analysis of genetic
alterations in small DNA samples using hyperbranched strand displacement
amplification and array-CGH." Genome Res 13(2): 294-307.
Laird, P. W. (2003). "The power and the promise of DNA methylation markers."
Nat Rev Cancer 3(4): 253-66.
Laird, P. W. (2005). "Cancer epigenetics." Hum Mol Genet 14 Spec No 1: R65-76.
Laird, P. W. and R. Jaenisch (1996). "The role of DNA methylation in cancer
genetic and epigenetics." Annu Rev Genet 30: 441-64.
Lander, E. S., L. M. Linton, et al. (2001). "Initial sequencing and analysis of the
human genome." Nature 409(6822): 860-921.
Li, L. C. and R. Dahiya (2002). "MethPrimer: designing primers for methylation
PCRs." Bioinformatics 18(11): 1427-31.
Lossos, I. S. (2005). "Molecular pathogenesis of diffuse large B-cell lymphoma." J
Clin Oncol 23(26): 6351-7.
Lossos, I. S., A. A. Alizadeh, et al. (2000). "Ongoing immunoglobulin somatic
mutation in germinal center B cell-like but not in activated B cell-like
diffuse large cell lymphomas." Proc Natl Acad Sci U S A 97(18): 10209-13.
Lossos, I. S. and R. Levy (2003). "Diffuse large B-cell lymphoma: insights gained
from gene expression profiling." Int J Hematol 77(4): 321-9.
Lossos, I. S. and D. Morgensztern (2006). "Prognostic biomarkers in diffuse large
B-cell lymphoma." J Clin Oncol 24(6): 995-1007.
Lu, X., H. Nechushtan, et al. (2005). "Distinct IL-4-induced gene expression,
proliferation, and intracellular signaling in germinal center B-cell-like and
activated B-cell-like diffuse large-cell lymphomas." Blood 105(7): 2924-32.
Matsuzaki, H., H. Loi, et al. (2004). "Parallel genotyping of over 10,000 SNPs
using a one-primer assay on a high-density oligonucleotide array." Genome
Res 14(3): 414-25.
McDowell, D. G., N. A. Burns, et al. (1998). "Localised sequence regions
possessing high melting temperatures prevent the amplification of a DNA
mimic in competitive PCR." Nucleic Acids Res 26(14): 3340-7.
Moller, M. B., N. T. Pedersen, et al. (2003). "Factors predicting long-term survival
in low-risk diffuse large B-cell lymphoma." Am J Hematol 74(2): 94-8.
Mori, K., Y. Maeda, et al. (1998). "MM-1, a novel c-Myc-associating protein that
represses transcriptional activity of c-Myc." J Biol Chem 273(45): 29794-
800.
Nouzova, M., N. Holtan, et al. (2004). "Epigenomic changes during leukemia cell
differentiation: analysis of histone acetylation and cytosine methylation
using CpG island microarrays." J Pharmacol Exp Ther 311(3): 968-81.
Paez, J. G., M. Lin, et al. (2004). "Genome coverage and sequence fidelity of phi29
polymerase-based multiple strand displacement whole genome
amplification." Nucleic Acids Res 32(9): e71.
Patil, N., A. J. Berno, et al. (2001). "Blocks of limited haplotype diversity revealed
by high-resolution scanning of human chromosome 21." Science 294(5547):
1719-23.
Pike, B. L. (Manuscript in Preparation).
Pike, B. L., S. Groshen, et al. (2006). "Comparisons of PCR-based genome
amplification systems using CpG island microarrays." Hum Mutat 27(6):
589-96.
Pilozzi, E., M. R. Onelli, et al. (2004). "CDX1 expression is reduced in colorectal
carcinoma and is associated with promoter hypermethylation." J Pathol
204(3): 289-95.
Rauscher, F. J., 3rd (2005). "It is time for a Human Epigenome Project." Cancer
Res 65(24): 11229.
Robertson, K. D. (2005). "DNA methylation and human disease." Nat Rev Genet
6(8): 597-610.
Robertson, K. D. and P. A. Jones (2000). "DNA methylation: past, present and
future directions." Carcinogenesis 21(3): 461-7.
Rodriguez-Pinto, D. (2005). "B cells as antigen presenting cells." Cell Immunol
238(2): 67-75.
Rosenwald, A., G. Wright, et al. (2002). "The use of molecular profiling to predict
survival after chemotherapy for diffuse large-B-cell lymphoma." N Engl J
Med 346(25): 1937-47.
Rousseau, G. G. (2003). "[Onecut transcription factors: role in the development of
the pancreas and the liver]." Bull Mem Acad R Med Belg 158(3-4): 207-12;
discussion 212-4.
Shipp, M. A., K. N. Ross, et al. (2002). "Diffuse large B-cell lymphoma outcome
prediction by gene-expression profiling and supervised machine learning."
Nat Med 8(1): 68-74.
Tagawa, H., M. Suguro, et al. (2005). "Comparison of genome profiles for
identification of distinct subgroups of diffuse large B-cell lymphoma."
Blood 106(5): 1770-7.
Takai, D. and P. A. Jones (2002). "Comprehensive analysis of CpG islands in
human chromosomes 21 and 22." Proc Natl Acad Sci U S A 99(6): 3740-5.
Trinh, B. N., T. I. Long, et al. (2001). "DNA methylation analysis by MethyLight
technology." Methods 25(4): 456-62.
Tsou, J. A., L. Y. Shen, et al. (2005). "Distinct DNA methylation profiles in
malignant mesothelioma, lung adenocarcinoma, and non-tumor lung." Lung
Cancer 47(2): 193-204.
Uhlmann, K., K. Rohde, et al. (2003). "Distinct methylation profiles of glioma
subtypes." Int J Cancer 106(1): 52-9.
Varadaraj, K. and D. M. Skinner (1994). "Denaturants or cosolvents improve the
specificity of PCR amplification of a G + C-rich DNA using genetically
engineered DNA polymerases." Gene 140(1): 1-5.
Venter, J. C., M. D. Adams, et al. (2001). "The sequence of the human genome."
Science 291(5507): 1304-51.
Virmani, A. K., J. A. Tsou, et al. (2002). "Hierarchical clustering of lung cancer
cell lines using DNA methylation markers." Cancer Epidemiol Biomarkers
Prev 11(3): 291-7.
Waterston, R. H., K. Lindblad-Toh, et al. (2002). "Initial sequencing and
comparative analysis of the mouse genome." Nature 420(6915): 520-62.
Weber, M., J. J. Davies, et al. (2005). "Chromosome-wide and promoter-specific
analyses identify sites of differential DNA methylation in normal and
transformed human cells." Nat Genet 37(8): 853-62.
Wei, S. H., C. M. Chen, et al. (2002). "Methylation microarray analysis of late-
stage ovarian carcinomas distinguishes progression-free survival in patients
and identifies candidate epigenetic markers." Clin Cancer Res 8(7): 2246-
52.
Weisenberger, D. J., M. Campan, et al. (2005). "Analysis of repetitive element
DNA methylation by MethyLight." Nucleic Acids Res 33(21): 6823-36.
Widschwendter, M., K. D. Siegmund, et al. (2004). "Association of breast cancer
DNA methylation profiles with hormone receptor status and response to
tamoxifen." Cancer Res 64(11): 3807-13.
Wong, N. A., J. Wilding, et al. (2005). "CDX1 is an important molecular mediator
of Barrett's metaplasia." Proc Natl Acad Sci U S A 102(21): 7565-70.
Woodson, K., D. J. Weisenberger, et al. (2005). "Gene-specific methylation and
subsequent risk of colorectal adenomas among participants of the polyp
prevention trial." Cancer Epidemiol Biomarkers Prev 14(5): 1219-23.
Wright, G., B. Tan, et al. (2003). "A gene expression-based method to diagnose
clinically distinct subgroups of diffuse large B cell lymphoma." Proc Natl
Acad Sci U S A 100(17): 9991-6.
Yang, H., C. M. Chen, et al. (2003). "The androgen receptor gene is preferentially
hypermethylated in follicular non-Hodgkin's lymphomas." Clin Cancer Res
9(11): 4034-42.
Zipori, D. and M. Barda-Saad (2001). "Role of activin A in negative regulation of
normal and tumor B lymphocytes." J Leukoc Biol 69(6): 867-73.
Appendix A: CpG Island Clone Information
Appendix A: Clones are sorted according to their p-value in Table 2.1. Each is listed with the name of the proximal “GENE” along with
the “LENGTH” of the clone (bp), the chromosome number (CHR), the clone’s nucleotide start number (Kent, Sugnet et al. 2002), and the
DNA strand (+/-) on which the clone is located. Also given are the Reference Sequence IDs (REFSEQ) of the associated genes, together with
the strand information for each gene (GENE +/-), the length of the gene (GENE LENGTH) in base pairs, and the number of “EXONS”. In
addition, the relative “POSITION” of the clone to each gene and its distance from the transcriptional start site (PROXIMITY) are
provided.
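To illustrate how a strand-aware distance such as PROXIMITY might be derived from the listed columns, the following is a minimal sketch only. The function names, the 1-based inclusive coordinates, and the sign convention (negative = upstream) are assumptions made for illustration; the transcript does not state how POSITION or PROXIMITY were actually computed.

```python
def tss_position(gene_start, gene_end, gene_strand):
    """Return the transcriptional start site, i.e. the 5' end of the gene."""
    if gene_strand == "+":
        return gene_start
    if gene_strand == "-":
        return gene_end
    raise ValueError("strand must be '+' or '-'")


def clone_proximity(clone_start, clone_length, gene_start, gene_end, gene_strand):
    """Signed distance (bp) from the TSS to the nearest base of the clone.

    Assumed convention: negative values lie upstream of the TSS, positive
    values lie downstream, and 0 means the clone overlaps the TSS.
    Coordinates are assumed to be 1-based and inclusive.
    """
    clone_end = clone_start + clone_length - 1
    tss = tss_position(gene_start, gene_end, gene_strand)
    if clone_start <= tss <= clone_end:
        return 0
    # Genomic offset of the clone edge that lies nearest to the TSS.
    offset = clone_start - tss if clone_start > tss else clone_end - tss
    # On the minus strand, increasing genomic coordinates run upstream.
    return offset if gene_strand == "+" else -offset
```

With these assumptions, a clone ending 401 bp before the start of a plus-strand gene yields -401, whereas the same geometry on a minus-strand gene would be measured from the gene's end coordinate.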
Appendix B: DNA Methylation Microarray Plots
[Plots not reproduced in this transcript. Each plot shows Cy3 and Cy5 signal intensities for one sample, with both axes running from 8 to 16. The trend line equation and R² value recovered from each plot are listed below.]
DLBCL 4427: y = x + 1E-06, R² = 0.9777
DLBCL 3735: y = x - 4E-05, R² = 0.9589
DLBCL 5063: y = x - 3E-05, R² = 0.9837
DLBCL 6328: y = x - 5E-05, R² = 0.9642
DLBCL 8578: y = x + 5E-06, R² = 0.8831
DLBCL 8597: y = x + 3E-05, R² = 0.9362
DLBCL 9284: y = x - 2E-05, R² = 0.9495
DLBCL 9251: y = x + 9E-05, R² = 0.9462
DLBCL 9323: y = x + 2E-06, R² = 0.9607
DLBCL 10618: y = x + 7E-05, R² = 0.9254
DLBCL 11578: y = x - 2E-06, R² = 0.9397
DLBCL 11224: y = x + 8E-05, R² = 0.9675
DLBCL 12967: y = x - 5E-05, R² = 0.9234
DLBCL 12243: y = x + 4E-05, R² = 0.9144
DLBCL 13112: y = x + 1E-05, R² = 0.9536
Appendix B: Individual microarray plots are given for each
sample analyzed. For each sample, the background-subtracted
log2 signal intensities are plotted for both Cy3 and Cy5. The
mitochondrial controls are in red. The trend line, the equation of
the trend line, and the R² values are given for the mitochondrial
controls.
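As an illustration of the quantities reported for the mitochondrial controls, the sketch below computes a background-subtracted log2 signal and an ordinary least-squares trend line with its R² value. The transcript does not state how the trend lines were fitted or whether the data were normalized first, so the least-squares fit and the function names `log2_signal` and `fit_trend_line` are assumptions made for illustration.

```python
import math


def log2_signal(raw_intensity, background):
    """Background-subtracted signal on a log2 scale."""
    corrected = raw_intensity - background
    if corrected <= 0:
        raise ValueError("signal must exceed background to take a logarithm")
    return math.log2(corrected)


def fit_trend_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus R^2."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear fit, R^2 equals the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

In this sketch, the control spots' log2 Cy5 and Cy3 values would be passed as `x` and `y`; an R² close to 1 indicates that the two channels track each other closely for the controls.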
Appendix C: Previously Published MethyLight Reactions
Appendix C: Previously published reactions listed by MethyLight Reaction ID.
ID GENE ID GENE ID GENE
HB-051 ABCB1 (iv) HB-326 GATA5 (xv) HB-264 PSEN2 (xv)
HB-153 APC (ii) HB-221 GDNF (xv) HB-157 PTEN (iii)
HB-090 APEX1 (xv) HB-250 GRIN2B (xv) HB-065 PTGS2 (v)
HB-266 APP (xv) HB-172 GSTP1 (ii) HB-052 PTTG1 (xv)
HB-196 ARF (ii) HB-168 HIC1 (ii) HB-228 PYCARD (xv)
HB-186 ARPC1B (xv) HB-215 HLA-G (vii) HB-101 RAD23A (xv)
HB-179 ATM (xv) HB-268 HOXA1 (xv) HB-176 RARB (iii)
HB-180 ATR (xv) HB-270 HOXA10 (xv) HB-322 RARRES1 (xv)
HB-227 AXIN1 (xv) HB-144 HRAS (xvii) HB-044 RASSF1A (xiii)
HB-140 BCL2 (xvii) HB-066 HSD17B4 (viii) HB-245 RB1 (ii)
HB-258 BDNF (xv) HB-076 ICAM1B (iv) HB-185 RBP1 (iii)
HB-045 BRCA1 (v) HB-311 IFNG (xv) HB-103 RPA2 (xv)
HB-126 BRCA2 (xv) HB-319 IGF2 (xv) HB-104 RPA3 (xv)
HB-158 CACNA1G (iii) HB-069 IGSF4 (xvi) HB-181 RUNX3 (xv)
HB-166 CALCA (iii) HB-321 ITGA4 (xv) HB-061 S100A2 (viii)
HB-146 CCND1 (xv) HB-203 JUP (xv) HB-085 SASH1 (iii)
HB-040 CCND2 (iv) HB-175 KL (xv) HB-220 SASH1 (xv)
HB-050 CDH1 (iii) HB-219 LDLR (xv) HB-064 SCAM-1 (xv)
HB-171 CDH1 (iii) HB-091 LIG3 (xv) HB-194 SCGB3A1 (iii)
HB-075 CDH13 (iii) HB-202 LPHN2 (xv) HB-208 SERPINB5 (xv)
HB-226 CDK2AP1 (xv) HB-070 LTB4R (iii) HB-184 SEZ6L (iii)
HB-230 CDKN1A (xv) HB-200 LZTS1 (xv) HB-174 SFN (xv)
HB-329 CDKN1C (xv) HB-142 MBD2 (iii) HB-201 SFRP1 (xv)
HB-081 CDKN2A (ii) HB-083 MBD4 (xv) HB-280 SFRP2 (xv)
HB-173 CDKN2B (ii) HB-159 MGMT (iii) HB-281 SFRP4 (xv)
HB-195 CDX1 (xv) HB-160 MGMT (xiv) HB-282 SFRP5 (xv)
HB-237 CGA (xv) HB-161 MINT1 (iii) HB-079 SLC6A20 (xv)
HB-190 CHFR (xv) HB-187 MINT2 (iii) HB-275 SMAD2 (xv)
HB-059 CLDN1 (xv) HB-162 MINT31 (iii) HB-053 SMAD3 (xi)
HB-062 CLIC4 (xv) HB-150 MLH1 (v) HB-277 SMAD4 (xv)
HB-193 COL1A2 (xv) HB-099 MLH3 (xv) HB-278 SMAD6 (xv)
HB-197 CRABP1 (xv) HB-117 MMS19L (xv) HB-315 SMAD9 (xv)
HB-170 CTNNB1 (ii) HB-095 MSH2 (xv) HB-042 SOCS1 (v)
HB-147 CTSD (iii) HB-096 MSH4 (xv) HB-063 STAT1 (xv)
HB-148 CTSD (iii) HB-097 MSH5 (xv) HB-182 STK11 (iii)
HB-054 CXADR (iv) HB-084 MSH6 (xv) HB-183 STK11 (iii)
HB-223 CYB27B1 (xv) HB-205 MT1A (xv) HB-073 SYK (iii)
HB-078 CYP1B1 (v) HB-204 MT1G (xv) HB-241 SYK M2B (xv)
HB-046 DAPK1 (viii) HB-206 MT2A (xv) HB-074 TERT (v)
HB-178 DCC (xv) HB-207 MT3 (xv) HB-314 TFAP2A (xv)
HB-133 DCLRE1C (xv) HB-058 MTHFR (ii) HB-145 TFF1 (v)
HB-116 DDB1 (xv) HB-088 MUTYM1B (xv) HB-192 TGFBR1 (xv)
HB-043 DIRAS3 (x) HB-154 MYOD1 (ii) HB-246 TGFBR2 (ii)
HB-218 DLC1 (xv) HB-077 NCL (xv) HB-247 THBS1 (xvii)
HB-225 DLEC (xv) HB-259 NEUROD1 (xv) HB-216 THRB (xv)
HB-048 DNAJC15 (xii) HB-260 NEUROD2 (xv) HB-167 TIMP3 (ii)
HB-049 DPH2L1 (xv) HB-261 NEUROG1 (xv) HB-213 TITF1 (v)
HB-252 DRD1 (xv) HB-067 NR3C1 (iii) HB-274 TMEFF2 (xv)
HB-253 DRD2 (xv) HB-251 NTF3 (xv) HB-306 TNFRSF10A (xv)
HB-229 EBF3 (xv) HB-089 NTHL1 (xv) HB-307 TNFRSF10B (xv)
HB-152 EPM2AIP1 (ix) HB-087 OGG1 (xv) HB-308 TNFRSF10C (xv)
HB-233 ERBB2 (xv) HB-242 ONECUT2 (xv) HB-309 TNFRSF10D (xv)
HB-110 ERCC1 (xv) HB-209 OPCML (xv) HB-080 TNFRSF25 (vi)
HB-105 ERCC2 (xv) HB-093 PARP1 (xv) HB-217 TP53 (xv)
HB-111 ERCC4 (xv) HB-094 PARP2 (xv) HB-177 TP73 (xv)
HB-109 ERCC5 (xv) HB-211 PAX8 (xv) HB-141 TSHR (xv)
HB-114 ERCC6 (xv) HB-163 PENK (xv) HB-047 TWIST1 (viii)
HB-113 ERCC8 (xv) HB-169 PGR (iii) HB-248 TYMS (ii)
HB-164 ESR1 (i) HB-149 PGR (xviii) HB-082 UNG (xv)
HB-165 ESR2 (v) HB-235 PITX2 (xv) HB-224 UQCRM1B (xv)
HB-304 FAF1 (xv) HB-199 PLAGL1 (xv) HB-068 VDR (viii)
HB-151 FBXW7 (iii) HB-098 PMS2 (xv) HB-191 VHL (xv)
HB-041 FHIT (iii) HB-139 POLD1 (xv) HB-115 XAB2 (xv)
HB-254 GABRA2 (xv) HB-060 PPARG (xv) HB-102 XPA (xv)
HB-256 GAD1 (xv) HB-214 PRKAR1A (xv) HB-100 XPC (xv)
HB-327 GATA3 (xv) HB-231 PSAT1 (xv) HB-092 XRCC1 (xv)
HB-323 GATA4 (xv) HB-262 PSEN1 (xv)
(i) Eads, C.A. et al. Cancer Res 60, 5021-5026 (2000)
(ii) Eads, C.A. et al. Cancer Res 61, 3410-3418 (2001)
(iii) Ehrlich, M. et al. Oncogene 25, 2636-2645 (2006)
(iv) Ehrlich, M. et al. Oncogene 21, 6694-6702 (2002)
(v) Fiegl, H. et al. Cancer Epidemiol Biomarkers Prev 13, 882-888 (2004)
(vi) Formerly described as TNFRSF12 in Ehrlich, M. et al. Oncogene 21, 6694-6702 (2002)
(vii) Muller, H.M. et al. Ann NY Acad Sci 1022, 44-49 (2004)
(viii) Muller, H.M. et al. Cancer Lett 209, 231-236 (2004)
(ix) Originally described as MLH1-M1 in Eads, C.A. et al. Cancer Res 61, 3410-3418 (2001)
(x) Previously described as ARHI in Fiegl, H. et al. Cancer Epidemiol Biomarkers Prev 13, 882-888 (2004)
(xi) Previously described as MADH3 in Ehrlich, M. et al. Oncogene 21, 6694-6702 (2002)
(xii) Previously described as MCJ in Ehrlich, M. et al. Oncogene 21, 6694-6702 (2002)
(xiii) Previously described as RASSF1A in Ehrlich, M. et al. Oncogene 21, 6694-6702 (2002)
(xiv) Virmani, A.K. et al. Cancer Epidemiol Biomarkers Prev 11, 291-297 (2002)
(xv) Weisenberger et al. Nature Genetics (in press)
(xvi) Widschwendter, M. et al. Cancer Res 64, 4472-4480 (2004)
(xvii) Widschwendter, M. et al. Cancer Res 64, 3807-3813 (2004)
(xviii) Woodson, K. et al. Cancer Epidemiol Biomarkers Prev 14, 1219-1223 (2005)
Appendix D: Unpublished MethyLight Assay Information
Appendix D: MethyLight Amplicon Information: GENE = proximal gene, CHR = chromosome, START =
amplicon’s UCSC nucleotide starting number, END = amplicon’s UCSC nucleotide ending number, USCS
Appendix E: Unpublished MethyLight Assay Sequences
Appendix E: Sequences of previously unpublished MethyLight assays. All probes had
a 6FAM molecule attached to the 5’ end and a Black Hole Quencher (BHQ-1,
Biosearch Technologies) on the 3’ end, except for reaction HB-312, which
had a minor groove binder non-fluorescent quencher (MGBNFQ, Applied
Biosystems, Inc.) on the 3’ end (denoted with *). The reverse primer for
assay HB-243 was mistakenly designed with a three-nucleotide deletion. The
three deleted bases are denoted with “XXX”.
Appendix F: Methylation vs. Expression Plots
[Plots not reproduced in this transcript. Each plot shows the MethyLight PMR Value against the Expression Cy3/Cy5 Ratio for one gene. The panel titles, trend line equations, and R² values recovered from the plots are listed below.]
CDKN1C, MethyLight HB-329 vs. Expression Clone 17821: y = 0.0115x - 1.26, R² = 0.1954
CDKN1C, MethyLight HB-328 vs. Expression Clone 17821: y = 0.0037x - 0.9999, R² = 0.0664
HLA-G, MethyLight HB-215 vs. Average of Expression Clones 17449 & 27947: y = 9E-05x + 0.4835, R² = 4E-06
GATA3, MethyLight HB-327 vs. Expression Clone 28964: y = -0.008x - 0.467, R² = 0.0931
HOXA1, MethyLight HB-268 vs. Expression Clone 26142: y = -0.0097x + 0.3866, R² = 0.0002
ERBB2, MethyLight HB-232 vs. Average of Expression Clones 15974, 28548 & 33569: y = 0.0009x + 0.2023, R² = 0.0015
CDKN1A, MethyLight HB-230 vs. Expression Clone 19374: y = -0.0293x - 0.7252, R² = 0.0963
CCND1, MethyLight HB-146 vs. Expression Clone 16129: no trend line given
PTPN6, MethyLight HB-348 vs. Expression Clone 19260: y = -0.0829x + 0.7528, R² = 0.0259
CCND2, MethyLight HB-040 vs. Expression Clone 24787: y = -0.081x - 1.4068, R² = 0.0128
MT2A, MethyLight HB-206 vs. Expression Clone 27429: y = -0.0045x - 0.1364, R² = 0.0712
VDR, MethyLight HB-068 vs. Expression Clone 28148: y = -0.001x + 0.1476, R² = 0.0006
THBS1, MethyLight HB-247 vs. Average of Expression Clones 33546 & 17410: y = 0.029x + 2.1609, R² = 0.1246
CYP1B1, MethyLight HB-078 vs. Average of Expression Clones 28674 & 16127: y = 0.0347x + 1.9351, R² = 0.1585
ESR1, MethyLight HB-164 vs. Expression Clone 17240: y = 0.0072x + 0.6063, R² = 0.0269
MGMT, MethyLight HB-159 vs. Expression Clone 26576: y = -0.0056x + 1.7356, R² = 0.0054
RARB, MethyLight HB-176 vs. Expression Clone 28413: y = 0.017x - 0.7699, R² = 0.2108
CYP1B1, MethyLight HB-239 vs. Average of Expression Clones 28674 & 16127: y = 0.0111x + 1.9246, R² = 0.052
CDH13, MethyLight HB-075 vs. Average of Expression Clones 26552 & 28476: y = 0.0062x + 0.1238, R² = 0.0254
JUP, MethyLight HB-203 vs. Average of Expression Clones 27424 & 24402: y = 0.0253x + 0.9915, R² = 0.0914
CDH1, MethyLight HB-171 vs. Expression Clone 28504: y = -0.0009x + 0.8158, R² = 0.0006
CDH1, MethyLight HB-050 vs. Expression Clone 28504: y = -0.0008x + 0.8047, R² = 0.0002
Appendix G: Due to spacing considerations, the ABC and GCB samples are
presented separately along with the average (AVG) PMR value across each of
the subtypes. The MethyLight reactions are sorted according to the p-values
listed in Table 5.2. Both the MethyLight reaction number (ID) and proximal
gene are also given.
Appendix H: Replicate MethyLight Reactions
Each MethyLight reaction (ID) was completed in duplicate and the replicates
were averaged (shaded). Due to space considerations, the ABC and GCB samples
are presented separately. The average (AVG) PMR value was also calculated
across all of the samples in each
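The averaging described in this caption can be sketched as follows: duplicate PMR values are first averaged per sample, and the per-sample means are then averaged within each subtype. This is an illustrative sketch only; the function names, the dictionary layout, and the sample IDs and PMR values in the usage note are hypothetical and are not taken from the thesis data.

```python
from statistics import mean


def average_replicates(pmr_replicates):
    """Map each sample ID to the mean of its replicate PMR values."""
    return {sample: mean(values) for sample, values in pmr_replicates.items()}


def subtype_averages(sample_means, subtype_of):
    """Average the per-sample mean PMR values within each subtype."""
    grouped = {}
    for sample, value in sample_means.items():
        grouped.setdefault(subtype_of[sample], []).append(value)
    return {subtype: mean(values) for subtype, values in grouped.items()}
```

For example, with hypothetical duplicates `{"S1": [10.0, 20.0], "S2": [0.0, 4.0], "S3": [50.0, 70.0]}` and subtype labels `{"S1": "ABC", "S2": "ABC", "S3": "GCB"}`, the per-sample means are 15.0, 2.0, and 60.0, and the subtype averages are 8.5 for ABC and 60.0 for GCB.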
Appendix I: CpG Island Microarray vs. MethyLight
[Plots not reproduced in this transcript. Each plot shows the MethyLight PMR Value (0 to 100) against the CpG Island Microarray Cy3/Cy5 value (0.70 to 1.00). The panel titles, trend line equations, and R² values recovered from the plots are listed below.]
ARPP-19 (HB-437 vs. Clone 40.A4): no trend line given
C9orf64 (HB-439 vs. Clone 14.A8): y = -0.0003x + 0.9741, R² = 0.0712
GNMT (HB-426 vs. Clone 53.H6): y = -0.0014x + 0.8956, R² = 0.515
CPVL (HB-427 vs. 52.F2): y = -0.0007x + 0.9314, R² = 0.1369
GTF2A2 (HB-433 vs. Clone 19.C9): no trend line given
SLC38A4 (HB-429 vs. Clone 19.C7): y = -0.0001x + 0.8156, R² = 0.0032
SLC38A4 (HB-430 vs. Clone 19.C7): y = 0.0007x + 0.7687, R² = 0.1495
WDR33 (HB-435 vs. Clone 56.E4): no trend line given
Appendix J: Bisulfite Sequencing Results
Appendix J: Shown above are the bisulfite sequencing results for each sample
analyzed. Each sequenced clone is represented by a horizontal line, and the
CpG dinucleotides are depicted by circles. The three oligonucleotides that
make up the MethyLight assay HB-442 are also shown. The sequenced
amplicon begins at +15 relative to the transcriptional start site of
FLJ21062 and ends at +139. Methylated cytosines are represented by closed
circles and unmethylated cytosines by open circles. The
sample ID and original MethyLight PMR value are also given.
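The closed and open circles described in this caption can be derived from sequence data as sketched below. After bisulfite conversion, an unmethylated cytosine reads as T, whereas a methylated cytosine remains C, so each CpG in a sequenced clone can be called by comparing it with the unconverted reference. This is a minimal sketch that assumes the clone read is already aligned to the top-strand reference with no gaps; the function names are hypothetical and do not describe the analysis pipeline used in the thesis.

```python
def call_cpg_methylation(reference, bisulfite_read):
    """Call each CpG in `reference` as methylated or unmethylated.

    Both sequences must be top-strand, aligned, and equal in length.
    Returns a list of (position, state) pairs, where position is the
    0-based index of the CpG cytosine and state is 'M' (read shows C),
    'U' (read shows T), or '?' (any other base).
    """
    reference = reference.upper()
    bisulfite_read = bisulfite_read.upper()
    if len(reference) != len(bisulfite_read):
        raise ValueError("sequences must be aligned and equal in length")
    calls = []
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            state = {"C": "M", "T": "U"}.get(bisulfite_read[i], "?")
            calls.append((i, state))
    return calls


def percent_methylated(calls):
    """Percentage of informative CpG calls that are methylated."""
    informative = [state for _, state in calls if state in ("M", "U")]
    if not informative:
        return None
    return 100.0 * informative.count("M") / len(informative)
```

Each 'M' call corresponds to a closed circle and each 'U' call to an open circle for that clone; summarizing the calls across clones gives a per-sample methylation percentage that can be set beside the MethyLight PMR value.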
Appendix K: Reproduction Permission
Abstract
Clinically distinct subtypes of Diffuse Large B Cell Lymphoma (DLBCL) have gene expression profiles that reflect their origins from specific stages of B-cell maturation. We conducted epigenetic analyses to evaluate the DNA methylation status of CpG islands in germinal center B-cell-like (GCB) and activated B-cell-like (ABC) DLBCL subtypes. Using two different platforms, we uncovered gene-associated CpG islands whose DNA methylation levels varied among DLBCL. Of these, the methylation levels of CpG islands proximal to ONECUT2 and FLJ21062 (HIC3) correlated with subtype identity. Interestingly, ONECUT2 is involved in regulating TGF-beta signaling pathways crucial for B cell maturation. In contrast to expectations based on the two-hit hypothesis, ONECUT2 resides on a frequently amplified, instead of deleted, genomic segment in DLBCL. This novel observation may reflect a mechanism for silencing potential tumor suppressor genes present in large, amplified genomic regions. Overall, these results suggest that DNA methylation may prove to be valuable for the identification and early detection of cancers derived from closely related cell lineages.
Asset Metadata
Creator: Pike, Brian Lee (author)
Core Title: Identification of DNA methylation markers in diffuse large B-cell lymphoma
School: Keck School of Medicine
Degree: Doctor of Philosophy
Degree Program: Biochemistry and Molecular Biology
Degree Conferral Date: 2006-12
Publication Date: 09/29/2006
Defense Date: 01/26/2006
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: biomarker, cancer, DNA methylation, germinal center, lymphoma, OAI-PMH Harvest
Language: English
Advisor: Hacia, Joseph G. (committee chair), Frenkel, Baruch (committee member), Laird, Peter W. (committee member)
Creator Email: bpike@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-m57
Unique identifier: UC184045
Identifier: etd-Pike-20060929 (filename), usctheses-m40 (legacy collection record id), usctheses-c127-18695 (legacy record id), usctheses-m57 (legacy record id)
Legacy Identifier: etd-Pike-20060929.pdf
Dmrecord: 18695
Document Type: Dissertation
Rights: Pike, Brian Lee
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name: Libraries, University of Southern California
Repository Location: Los Angeles, California
Repository Email: cisadmin@lib.usc.edu
Tags
biomarker
cancer
DNA methylation
germinal center
lymphoma